[TRAFODION-2645] First draft of a rewrite of the MDAM costing code #1246

DaveBirdsall · 2017-09-27T21:59:22Z

This set of changes is a first draft of a rewrite of the MDAM costing code.

The rewritten code uses a model much more closely aligned with how the MDAM run-time works. It estimates the number of MDAM probes and fetches directly. I/O cost is estimated differently: it is not additive across disjuncts, because the more parts of a file are touched, the more the access pattern resembles sequential I/O. On the other hand, the cost of an HBase scan (that is, a begin-key/end-key subset in executor terms) is significant, and its contribution to cost is additive. A knob, MDAM_SUBSET_FACTOR, has been added to tune that cost.
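
The split between additive and non-additive cost terms can be illustrated with a small Python sketch. This is not the optimizer code from this change. The class, the function name, the unit weights and the rule of capping I/O at the file size are all assumptions made for illustration; only the idea that probe, fetch and subset costs add up per disjunct while I/O does not comes from the description above.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class DisjunctEstimate:
    """Hypothetical per-disjunct estimates (names are illustrative only)."""
    probes: float          # estimated MDAM probes for this disjunct
    fetches: float         # estimated MDAM fetches for this disjunct
    subsets: float         # estimated begin-key/end-key subsets (HBase scans)
    blocks_touched: float  # blocks this disjunct would read if costed alone


def mdam_scan_cost(disjuncts: List[DisjunctEstimate],
                   total_blocks: float,
                   subset_factor: float = 1.0) -> float:
    """Toy cost: additive probe/fetch/subset terms plus non-additive I/O."""
    # Additive part: every disjunct contributes its own probes, fetches
    # and subsets. subset_factor plays the role of MDAM_SUBSET_FACTOR.
    additive = sum(d.probes + d.fetches + subset_factor * d.subsets
                   for d in disjuncts)
    # Non-additive part (assumed model): once the disjuncts together touch
    # the whole file, the I/O cost cannot exceed one sequential pass.
    io = min(total_blocks, sum(d.blocks_touched for d in disjuncts))
    return additive + io
```

With this toy model, two disjuncts that each touch 80 blocks of a 100-block file are charged 100 blocks of I/O in total rather than 160, while their subset costs are still summed.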

The cost formulas used to determine the optimal disjunct prefix are as close as possible to the cost formula used for the MDAM scan as a whole. The only thing left out of the former is the I/O cost, as that is not additive. In contrast, in the old code, the costing formulas used for the optimal disjunct prefix are quite different from the one used for the scan as a whole, and it is hard to see how they relate.
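
A minimal sketch of prefix selection under that rule follows. The tuple layout, the function names and the unit weights are hypothetical and are not taken from the actual code; the sketch only shows a prefix being chosen by a formula that omits the I/O term.

```python
from typing import List, Tuple

# (probes, fetches, subsets) estimated for one candidate prefix length.
# This layout is an assumption made for the example.
PrefixEstimate = Tuple[float, float, float]


def non_io_cost(est: PrefixEstimate, subset_factor: float) -> float:
    """Additive terms only; the I/O term is deliberately left out."""
    probes, fetches, subsets = est
    return probes + fetches + subset_factor * subsets


def best_prefix_length(estimates: List[PrefixEstimate],
                       subset_factor: float = 1.0) -> int:
    """Return the 1-based prefix length with the lowest non-I/O cost."""
    if not estimates:
        raise ValueError("need at least one candidate prefix")
    costs = [non_io_cost(e, subset_factor) for e in estimates]
    # On ties, index() returns the first (shortest) prefix.
    return costs.index(min(costs)) + 1
```

In this sketch, raising `subset_factor` makes deeper prefixes, which generate more subsets, less attractive, which is the kind of effect a knob such as MDAM_SUBSET_FACTOR would be expected to have.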

I have run a performance test on the test bed in JIRA TRAFODION-1641, using the old and new costing code, forcing both serial and parallel MDAM plans of various depths as well as simple scan plans. With the new code, aggregate execution time over that test bed is about 6% better than with the old, so the new code seems to be at least as good as the old. The new code picks the optimal plan more frequently than the old. There are about eight queries (out of 92) where the old code picks a better plan than the new code.
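
For readers who want to reproduce this kind of comparison, the two summary figures (aggregate improvement, and the queries where the old code's plan wins) could be computed along these lines. This is a sketch only, not testMdam.py; the function name and the input layout are assumptions.

```python
from typing import Dict, List, Tuple


def compare_costing(times: Dict[str, Tuple[float, float]]
                    ) -> Tuple[float, List[str]]:
    """Summarize old-vs-new plan choices.

    times maps a query name to (elapsed time of the plan picked by the old
    costing code, elapsed time of the plan picked by the new costing code).
    Returns the aggregate improvement in percent and the sorted names of
    the queries where the old code picked the faster plan.
    """
    old_total = sum(old for old, _ in times.values())
    new_total = sum(new for _, new in times.values())
    improvement_pct = 100.0 * (old_total - new_total) / old_total
    old_better = sorted(q for q, (old, new) in times.items() if old < new)
    return improvement_pct, old_better
```

A positive first value means the plans picked by the new code ran faster in aggregate; the second value lists the regressions to investigate.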

There is still some testing work to be done on this code. Costing of the inner table of a nested join has not been fully explored yet.

In this check-in, the new costing code is turned off by default. Use CQD MDAM_COSTING_REWRITE 'ON' to turn on the new costing code.

Also included in this set of changes is a fix to logsort: if a missing-statistics warning was present, logsort did not sort the result rows.

Also included in this set of changes is a test script, testMdam.py, which can be used to test the performance of various MDAM plans and determine whether the old or new costing code is picking the better plan.

Traf-Jenkins · 2017-09-27T22:03:11Z

Check Test Started: https://jenkins.esgyn.com/job/Check-PR-master/2084/

Traf-Jenkins · 2017-09-27T22:21:32Z

Test Failed. https://jenkins.esgyn.com/job/Check-PR-master/2084/

Traf-Jenkins · 2017-09-27T22:21:50Z

Previous Test Aborted. New Check Test Started: https://jenkins.esgyn.com/job/Check-PR-master/2086/

Traf-Jenkins · 2017-09-28T01:09:09Z

Test Passed. https://jenkins.esgyn.com/job/Check-PR-master/2086/

zellerh

+1
The general idea looks good to me. Sorry, I did not look into the details of the new cost formula, given that it is still experimental and guarded by a CQD.

[TRAFODION-2645] First draft of a rewrite of the MDAM costing code

fe2a6f6

Merge branch 'master' into MDAMCostingRewrite

200735a

zellerh approved these changes Oct 2, 2017

asfgit merged commit 200735a into apache:master Oct 3, 2017
