WIP Enable converting filter -> join when an index is available. #2164

Open: wangandi wants to merge 2 commits into base: master from wangandi:filterjoin

wangandi (Member) commented Mar 2, 2020

Resolves #1962

Since join order was redone when delta queries appeared, I've just uncommented the filter to join transform to see what the new plans look like.

Do the new plans look good? What other kinds of interaction should I be testing for besides what I have in test/sqllogictest/index_planning.slt?

@wangandi wangandi requested a review from frankmcsherry Mar 2, 2020
@wangandi wangandi force-pushed the wangandi:filterjoin branch 2 times, most recently from d845e43 to 72fdaea Mar 3, 2020
Get { materialize.public.bar (u3) },
ArrangeBy { keys: [[]], Constant [["this"]] },
Comment on lines +121 to +127

frankmcsherry (Member) commented Mar 4, 2020

This is maybe an example of why we might want ScalarExpr in the join constraints. We want to introduce the constant into the expression, but doing so with a cross join will result in rough behavior.

frankmcsherry (Member) left a comment

This looks sane!

I think there are still some things to sort out in join planning, where we should be able to communicate that forming an arrangement of a constant collection is cheap. I don't think we do that at the moment, and I don't know what the negative implications are (possibly that we never create a delta query for such a join plan, because we can never find the collection arranged). From the examples, it looks like we do build delta queries, but probably only because the linear implementation forces an arrangement which the delta query picks up.
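
One way to picture the missing signal is a predicate that the planner consults before deciding whether an input already counts as arranged. The sketch below is only an illustration under stated assumptions: `Input`, `SMALL_CONSTANT_ROWS`, `arrangement_is_cheap`, and `delta_query_eligible` are hypothetical names invented here, not Materialize's planner API, and the row threshold is arbitrary.

```rust
// Hypothetical, simplified model of join inputs; not Materialize's planner types.
#[derive(Debug, Clone, PartialEq)]
enum Input {
    /// A constant collection with a known number of rows.
    Constant { rows: usize },
    /// A collection with an existing index on `keys`.
    Indexed { name: String, keys: Vec<usize> },
    /// A collection with no usable index.
    Unindexed { name: String },
}

// Assumed cutoff below which arranging a constant is treated as free.
const SMALL_CONSTANT_ROWS: usize = 1_000;

/// Returns true when arranging `input` by `keys` should be considered cheap:
/// either an index on exactly those keys exists, or the input is a small constant.
fn arrangement_is_cheap(input: &Input, keys: &[usize]) -> bool {
    match input {
        Input::Constant { rows } => *rows <= SMALL_CONSTANT_ROWS,
        Input::Indexed { keys: indexed, .. } => indexed.as_slice() == keys,
        Input::Unindexed { .. } => false,
    }
}

/// A delta query is only considered when every input can be arranged cheaply.
fn delta_query_eligible(inputs: &[(Input, Vec<usize>)]) -> bool {
    inputs
        .iter()
        .all(|(input, keys)| arrangement_is_cheap(input, keys))
}
```

With a rule like this, a join of `Constant [["this"]]` against an indexed `foo` would not be excluded from delta-query planning just because the constant has no pre-existing arrangement.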

Also, per my inline comment, at least one of the examples highlights that if we need cross joins to introduce constant values that are subsequently used in indexes, we might be paying a high price up front in order to get the improved efficiency later in the join.

@wangandi wangandi force-pushed the wangandi:filterjoin branch from 72fdaea to 94cdd47 Mar 4, 2020
#indexes on bar(a), foo(a), and foo(b) exist
query T multiline
explain plan for select foo.a, b, c, d, e from foo, bar where foo.a = bar.a and b = 'this'
----
Project {
outputs: [0 .. 2, 5, 6],
Join {
variables: [[(0, 0), (2, 0)], [(1, 0), (2, 1)]],
implementation: DifferentialLinear,
Get { materialize.public.bar (u3) },
ArrangeBy { keys: [[]], Constant [["this"]] },
ArrangeBy {
keys: [[#0, #1]],
Get { materialize.public.foo (u1) }
}
}
}
Comment on lines +147 to +163

wangandi (Author, Member) commented Mar 4, 2020

As discussed with Frank, this is not a good plan, and we should figure out some way of remedying it.

This was how the plan was produced:

  • First, the filter-to-join step noticed that there is a filter on equality of b to 'this', so it changed the filter into a join between the foo(b) index (i.e. ArrangeBy { keys: [[#1]], Get {...(u1)} }) and the constant 'this'.
  • At some point the two joins got fused.
  • Then the join implementation determination step replaced the ArrangeBy { keys: [[#1]], Get {...(u1)} } with an ArrangeBy on keys [[#0, #1]], which is not an existing index.
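
The first step above can be sketched as a small rewrite rule. This is a minimal illustration, not the transform in this PR: the `Expr` enum is a hypothetical, heavily simplified stand-in for the real relation expression type, the `indexes` argument is an assumed way of passing in the known (relation, column) indexes, and only single-column equality against a string literal is handled.

```rust
// Hypothetical, simplified stand-in for a relation expression; not Materialize's type.
#[derive(Debug, Clone, PartialEq)]
enum Expr {
    Get(String),
    /// A single-row, single-column constant collection.
    Constant(Vec<String>),
    ArrangeBy { keys: Vec<usize>, input: Box<Expr> },
    /// `column = literal` applied to `input`.
    Filter { column: usize, literal: String, input: Box<Expr> },
    /// `variables` lists groups of (input index, column) pairs that must be equal.
    Join { inputs: Vec<Expr>, variables: Vec<Vec<(usize, usize)>> },
}

/// Rewrites `Filter(column = literal, Get(name))` into a join between an
/// arrangement of `name` on `column` and a one-row constant, but only when
/// `indexes` contains an index on exactly that relation and column.
/// Any other expression is returned unchanged.
fn filter_to_join(expr: Expr, indexes: &[(String, usize)]) -> Expr {
    match expr {
        Expr::Filter { column, literal, input } => match *input {
            Expr::Get(name)
                if indexes.iter().any(|(n, c)| *n == name && *c == column) =>
            {
                Expr::Join {
                    inputs: vec![
                        Expr::ArrangeBy {
                            keys: vec![column],
                            input: Box::new(Expr::Get(name)),
                        },
                        Expr::Constant(vec![literal]),
                    ],
                    // Column `column` of input 0 must equal column 0 of the constant.
                    variables: vec![vec![(0, column), (1, 0)]],
                }
            }
            // No matching index: keep the filter as it was.
            other => Expr::Filter { column, literal, input: Box::new(other) },
        },
        other => other,
    }
}
```

Calling `filter_to_join` on a filter `b = 'this'` over `Get("foo")` with `indexes` containing `("foo", 1)` yields a `Join` whose first input is `ArrangeBy { keys: [1], .. }`. The problem described in the last bullet arises later, when a separate planning step swaps that arrangement for one on different keys.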