Enable rewriting certain inner joins as filters. #11068

Merged
9 commits merged on Apr 14, 2021

Commits on Apr 5, 2021

  1. Enable rewriting certain inner joins as filters.

    The main logic for doing the rewrite is in JoinableFactoryWrapper's
    segmentMapFn method. The requirements are:
    
    - It must be an inner equi-join.
    - The right-hand columns referenced by the condition must not contain any
      duplicate values. (If they did, the inner join would not be guaranteed
      to return at most one row for each left-hand-side row.)
    - No columns from the right-hand side can be used by anything other than
      the join condition itself.
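
The three requirements above are what make the rewrite safe. Below is a minimal, self-contained sketch of why, not the code from this change. The class and helper names (`JoinAsFilterSketch`, `innerJoinLeftOnly`, `filterByKeySet`) are hypothetical, and each left-hand row is reduced to the value of its join column. When the right-hand keys are unique and only left-hand columns are read, the join and an "in"-style filter return the same rows. With duplicate keys they do not.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Set;

public class JoinAsFilterSketch
{
  // Inner equi-join that emits only the left-hand value for every matching pair.
  static List<String> innerJoinLeftOnly(List<String> leftKeys, List<String> rightKeys)
  {
    List<String> out = new ArrayList<>();
    for (String l : leftKeys) {
      for (String r : rightKeys) {
        if (l != null && l.equals(r)) {
          out.add(l);
        }
      }
    }
    return out;
  }

  // The filter form: keep a left-hand row if its key is in the right-hand key set.
  static List<String> filterByKeySet(List<String> leftKeys, Set<String> keys)
  {
    List<String> out = new ArrayList<>();
    for (String l : leftKeys) {
      if (l != null && keys.contains(l)) {
        out.add(l);
      }
    }
    return out;
  }

  public static void main(String[] args)
  {
    List<String> left = Arrays.asList("US", "FR", "US", "DE");

    // Unique right-hand keys: the join and the filter agree.
    List<String> uniqueRight = Arrays.asList("US", "DE");
    System.out.println(
        innerJoinLeftOnly(left, uniqueRight).equals(filterByKeySet(left, new HashSet<>(uniqueRight)))
    );

    // Duplicate right-hand keys: the join repeats left-hand rows, the filter cannot.
    List<String> duplicateRight = Arrays.asList("US", "US");
    System.out.println(
        innerJoinLeftOnly(left, duplicateRight).equals(filterByKeySet(left, new HashSet<>(duplicateRight)))
    );
  }
}
```

The first comparison prints `true` and the second prints `false`, which is the reason duplicate right-hand values rule out the rewrite.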
    
    HashJoinSegmentStorageAdapter is also modified to pass through to
    the base adapter (even allowing vectorization!) in the case where 100%
    of join clauses could be rewritten as filters.
    
    In support of this goal:
    
    - Add Query getRequiredColumns() method to help us figure out whether
      the right-hand side of a join datasource is being used or not.
    - Add JoinConditionAnalysis getRequiredColumns() method to help us
      figure out if the right-hand side of a join is being used by later
      join clauses acting on the same base.
    - Add Joinable getNonNullColumnValuesIfAllUnique method to enable
      retrieving the set of values that will form the "in" filter.
    - Add LookupExtractor canGetKeySet() and keySet() methods to support
      LookupJoinable in its efforts to implement the new Joinable method.
    - Add "enableRewriteJoinToFilter" feature flag to
      JoinFilterRewriteConfig. The default is disabled.
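
As an illustration of the contract the name getNonNullColumnValuesIfAllUnique suggests, here is a standalone sketch. It is an assumption-laden stand-in, not the Joinable method added by this change: the class `UniqueValuesSketch`, the method signature, the use of `Optional`, and the choice to skip nulls before checking for duplicates are all hypothetical.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;

public class UniqueValuesSketch
{
  // Returns the distinct non-null values of a column, or empty if any
  // non-null value occurs more than once (assumed semantics, see lead-in).
  static Optional<Set<String>> nonNullValuesIfAllUnique(List<String> columnValues)
  {
    Set<String> seen = new HashSet<>();
    for (String value : columnValues) {
      if (value == null) {
        continue; // assumption: nulls are ignored rather than treated as duplicates
      }
      if (!seen.add(value)) {
        return Optional.empty(); // duplicate found, so the join cannot become a filter
      }
    }
    return Optional.of(seen);
  }

  public static void main(String[] args)
  {
    // Unique keys: the returned set could back an "in" filter on the left-hand column.
    System.out.println(nonNullValuesIfAllUnique(Arrays.asList("US", null, "DE")).isPresent());

    // Duplicate key: no set is returned, so the caller would keep the join.
    System.out.println(nonNullValuesIfAllUnique(Arrays.asList("US", "DE", "US")).isPresent());
  }
}
```

A caller in this sketch would build the "in" filter only when the result is present, and otherwise leave the join clause untouched.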
    gianm committed Apr 5, 2021
    156878e
  2. Test improvements.

    gianm committed Apr 5, 2021
    64508f3

Commits on Apr 6, 2021

  1. Test fixes.

    gianm committed Apr 6, 2021
    c7710b1
  2. Avoid slow size() call.

    gianm committed Apr 6, 2021
    620129a
  3. Remove invalid test.

    gianm committed Apr 6, 2021
    62f88b7
  4. Fix style.

    gianm committed Apr 6, 2021
    32ea480
  5. Fix mistaken default.

    gianm committed Apr 6, 2021
    c4feee8
  6. Small fixes.

    gianm committed Apr 6, 2021
    c262a24
  7. Fix logic error.

    gianm committed Apr 6, 2021
    e7c06bf