Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

sql/{plan,analyzer}: Add IndexedInSubqueryFilter for selecting the subquery results first and then making indexed lookups into the child. #299

Merged
merged 2 commits into from Feb 15, 2021

Conversation

reltuk
Copy link
Contributor

@reltuk reltuk commented Feb 12, 2021

No description provided.

…bquery results first and then making indexed lookups into the child.
Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks pretty good, just a couple comments

sql/analyzer/apply_indexes_for_subquery_comparisons.go Outdated Show resolved Hide resolved
if !isGetField || !isSubquery || !isResolved {
return nil
}
referencesChildRow := nodeHasGetFieldReferenceBetween(subq.Query, len(scope.Schema()), len(scope.Schema())+len(node.Schema()))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You want node.Child.Schema here (should be the same since it's a Filter, but in principle this can vary)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it true? It seems like I want to know if it references this node's schema...if this node had more columns than node.Child, for example, I would will want to know if it referenced those columns...right?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

A GetField expression always applies to rows coming from the child of the current node. Think about Project nodes. They can have a totally different schema from the node underneath them.

// range [low, high).
func nodeHasGetFieldReferenceBetween(n sql.Node, low, high int) bool {
var found bool
plan.Inspect(n, func(n sql.Node) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You can simplifiy this with plan.InspectExpressions (no need to do plan.Inspect and then sql.Inspect separately)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think InspectExpressions or InspectExpressionWithNode lets me do the shenanigans with excluding the walk into IndexedInSubqueryFilter nodes.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's true

func (i *IndexedInSubqueryFilter) RowIter(ctx *sql.Context, row sql.Row) (sql.RowIter, error) {
padded := make(sql.Row, len(row)+i.Padding)
copy(padded[:], row[:])
var res []interface{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

sql.Row?

sql/plan/indexed_in_subquery_filter.go Outdated Show resolved Hide resolved

type indexedInSubqueryIter struct {
Ctx *sql.Context
Rows []interface{}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

[]sql.Row

if i.I >= len(i.Rows) {
return nil, io.EOF
}
iter, err := i.Child.RowIter(i.Ctx, sql.NewRow(i.Rows[i.I]))
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should be no need for sql.NewRow here after the above change

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't know...this is a little janky. EvalMultiple isn't returning []sql.Row, it's returning []interface{}, where each entry in the slice is a sql engine value. But the call to Child.RowIter(...) is expecting a sql.Row with a single column which is that value. I think it's typed appropriately as is?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, that's a good point

}

func (i *IndexedInSubqueryFilter) RowIter(ctx *sql.Context, row sql.Row) (sql.RowIter, error) {
padded := make(sql.Row, len(row)+i.Padding)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You sure this padding is necessary? A subquery expects to get a row for the outer scope. Why does it need nils on the end?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you have a getField expression that reaches into the nil part of the row given, it's not going to give you a correct result

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's necessary because I didn't rewrite the subquery. It's going to append its rows onto the end of the row that comes in, and it's going to be allowed to do field accesses into the prefix and the suffix of that row. Based on the inspection, it shouldn't access these nil values though.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

OK, I get it now. The field indexes in the subquery are based off analysis of the subquery expression using the scope node, and you're not giving it this scope node. You're giving it a fake one you construct that has the same length. And these null vals won't get used because this node won't be constructed in that case.

Some time I'm going to have to clean up the analyzer to make scope management more principled / consistent.

@reltuk
Copy link
Contributor Author

reltuk commented Feb 14, 2021

@zachmu Took a pass but didn't address everything. TAL if you want :).

Copy link
Member

@zachmu zachmu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

// range [low, high).
func nodeHasGetFieldReferenceBetween(n sql.Node, low, high int) bool {
var found bool
plan.Inspect(n, func(n sql.Node) bool {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Oh that's true

@reltuk reltuk merged commit 620d95d into master Feb 15, 2021
@Hydrocharged Hydrocharged deleted the aaron/indexed-in-subquery-filter branch February 16, 2021 10:32
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants