Move index costing into join planning phase #2191

max-hoffman · 2023-12-07T00:17:34Z

Put index costing inside join planning, so that in the future join planning will have better cardinalities (statistics) for join ordering. Most of the changes will look like refactoring the way we express index lookups in the memo. I attempted to do this in a way that makes as few changes as possible to join planning; the goal here is to set me up for rewriting cardinality checks with stats objects. It didn't go as cleanly as I wanted: I ended up shifting a lot of join plans back to lookup plans because HASH_JOIN was beating LOOKUP_JOIN in several key places.

One downside of the current PR is that it converts a sysbench MERGE_JOIN into a LOOKUP_JOIN. I would prefer to fix this in the next PR, when I do a bigger costing overhaul.

This PR also includes a variety of fixes for join hinting, correctness, etc.

At some point we appeared to fix this:
#1893

nicktobey

It looks like memo.Lookup and memo.BuildLookup are now unused and can be deleted.

nicktobey · 2023-12-11T20:49:38Z

sql/memo/memo.go

+	return grp
+}
+
+func (m *Memo) MemoizeConcatLookupJoin(grp, left, right *ExprGroup, op plan.JoinType, filter []sql.Expression, lookups []*IndexScan) *ExprGroup {


I'm not clear on what a "Concat Lookup Join" is and why this isn't just "MemoizeLookupJoin". I suspect it's a single node that does multiple lookups on multiple indexes, but I'm not 100% sure. Can you leave a docstring?

nicktobey · 2023-12-11T20:53:03Z

sql/memo/memo.go

@@ -213,6 +257,21 @@ func (m *Memo) MemoizeProject(grp, child *ExprGroup, projections []sql.Expressio
 	return grp
 }

+func (m *Memo) MemoizeIta(grp *ExprGroup, ita *plan.IndexedTableAccess, alias string, index *Index) *ExprGroup {


What does ITA stand for? Can you add a docstring?

nicktobey · 2023-12-11T20:53:34Z

sql/memo/memo.go

@@ -297,6 +356,7 @@ func (m *Memo) optimizeMemoGroup(grp *ExprGroup) error {
 		n = n.Next()
 	}

+	grp.fixEnforcers()


Why is this call needed? A comment would help.

nicktobey · 2023-12-11T21:08:48Z

sql/memo/expr_group.go

+func (e *ExprGroup) fixEnforcers() {
+	switch n := e.Best.(type) {
+	case *MergeJoin:
+		// todo: no ITA children that aren't the same index as sorting index


These comments are unclear. Can you elaborate?

nicktobey · 2023-12-11T21:09:13Z

sql/memo/expr_group.go

 	}
 	return result, nil
 }

+// fixEnforcers edits the children of a new best plan to account


This docstring is confusing. Can you maybe give an example?

nicktobey · 2023-12-11T21:11:57Z

sql/memo/expr_group.go

+}
+
+// Update best to a DFS path to a tablescan
+func (e *ExprGroup) fixItaConflict() {


I understand why you may want this to be a separate method from findTableScanPath, but I think it needs a docstring that explains how this fixes ita conflicts (and maybe what an ita conflict is.)

nicktobey · 2023-12-11T21:38:31Z

sql/analyzer/costed_index_scan.go

+	}
+
+	// create ranges, lookup, ITA for best indexScan
+	// TODO pass up FALSE filter information


This TODO is vague.

nicktobey · 2023-12-11T21:38:55Z

sql/analyzer/costed_index_scan.go

+	var retFilters []sql.Expression
+	if !iat.PreciseMatch() {
+		// cannot drop any filters
+		//itaGrp = m.MemoizeIta(nil, ret, aliasName, idx)


Remove this commented-out code?

nicktobey · 2023-12-11T21:41:12Z

sql/memo/coster.go

+		return l*(cpuCostFactor+randIOCostFactor) - r*seqIOCostFactor - l*seqIOCostFactor, nil
+	}
+	if l*r*sel < l {
+		// 1 - (total rows - covered rows / total rows)


Make this comment more clear?

nicktobey · 2023-12-11T21:47:37Z

sql/memo/coster.go

+	if isInjectiveLookup(lookup.Index, n.JoinBase, lookup.Table.Expressions(), lookup.Table.NullMask()) {
+		sel = 0
+	} else {
+		sel = lookupJoinSelectivity(lookup) * optimisticJoinSel


We could probably pull the isInjectiveLookup check (and multiplying by optimisticJoinSel) into lookupJoinSelectivity. This pattern repeats everywhere that lookupJoinSelectivity is called.

max-hoffman marked this pull request as ready for review December 8, 2023 22:51

max-hoffman changed the title ~~Index cost refactor~~ Move index costing into join planning phase Dec 8, 2023

Move index costing into join planning

f4ef143

max-hoffman force-pushed the max/index-cost-refactor branch from d3f4abd to f4ef143 Compare December 8, 2023 23:06

max-hoffman requested a review from nicktobey December 8, 2023 23:08

max-hoffman added 2 commits December 11, 2023 12:06

skip index plan tests in server context, which can't access ctx data

660a2fd

skip more server context plans

9bdc11b

nicktobey requested changes Dec 11, 2023


nicktobey approved these changes Dec 11, 2023


max-hoffman added 4 commits December 11, 2023 15:44

nick's comments

33db422

missed lookupSelectivity

89ced5b

delete lookup

451e3b2

merge main

bd15eab

max-hoffman merged commit 5b03152 into main Dec 14, 2023
7 checks passed

max-hoffman deleted the max/index-cost-refactor branch December 14, 2023 19:25

BrewTestBot mentioned this pull request Dec 18, 2023

dolt 1.29.6 Homebrew/homebrew-core#157667

Merged
