Optimize JoinCondition matching #9200

Merged
fjy merged 6 commits into apache:master from suneet-s:join-perf
Jan 21, 2020

Contributor

@suneet-s suneet-s commented Jan 16, 2020

Description

The LookupJoinMatcher needs to check multiple times whether a condition is always true or always false. These checks can be pre-computed to speed up match checking.

This change reduces the time it takes to join on a long key from ~36 ms/op to 23 ms/op.
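
For readers unfamiliar with the code, here is a minimal sketch of the pre-computation idea. It is not the actual Druid code: `SketchExpr` and `SketchJoinCondition` are hypothetical stand-ins invented for illustration, and `evalAsBoolean()` stands in for the `expr.eval(ExprUtils.nilBindings()).asBoolean()` call. The two checks are evaluated once in the constructor and the getters only return the cached result.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical stand-in for an expression; NOT the real Druid Expr API.
interface SketchExpr {
  boolean isLiteral();

  boolean evalAsBoolean();

  // Convenience factory for a literal true/false expression.
  static SketchExpr literal(final boolean value) {
    return new SketchExpr() {
      @Override
      public boolean isLiteral() {
        return true;
      }

      @Override
      public boolean evalAsBoolean() {
        return value;
      }
    };
  }
}

// Hypothetical stand-in for the join condition; names are assumptions.
final class SketchJoinCondition {
  private final List<String> equiConditions;
  private final List<SketchExpr> nonEquiConditions;

  // Computed once at construction instead of on every match check.
  private final boolean isAlwaysTrue;
  private final boolean isAlwaysFalse;

  SketchJoinCondition(List<String> equiConditions, List<SketchExpr> nonEquiConditions) {
    this.equiConditions = new ArrayList<>(equiConditions);
    this.nonEquiConditions = new ArrayList<>(nonEquiConditions);

    // Always true: no equi-conditions, and every other condition is a true literal.
    this.isAlwaysTrue = this.equiConditions.isEmpty()
        && this.nonEquiConditions.stream()
                                 .allMatch(expr -> expr.isLiteral() && expr.evalAsBoolean());

    // Always false: at least one condition is a false literal.
    this.isAlwaysFalse = this.nonEquiConditions.stream()
                                               .anyMatch(expr -> expr.isLiteral() && !expr.evalAsBoolean());
  }

  boolean isAlwaysTrue() {
    return isAlwaysTrue;
  }

  boolean isAlwaysFalse() {
    return isAlwaysFalse;
  }
}
```

The trade-off is a small amount of work at construction time in exchange for constant-time checks on the hot path. This only works if the condition lists do not change after construction, which the sketch ensures by copying them.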

This PR has:

  • been self-reviewed.
  • added documentation for new or modified features or behaviors.
  • added Javadocs for most classes and all non-trivial methods. Linked related entities via Javadoc links.
  • added or updated version, license, or notice information in licenses.yaml
  • added comments explaining the "why" and the intent of the code wherever would not be obvious for an unfamiliar reader.
  • added unit tests or modified existing tests to cover new code paths.
  • added integration tests.
  • been tested in a test Druid cluster.

Contributor

@jihoonson jihoonson left a comment

Cool, thank you for the PR. It seems like the benchmark class is not in master yet. Would you please link the benchmark class here?

-  return equiConditions.isEmpty() &&
-         nonEquiConditions.stream()
-                          .allMatch(expr -> expr.isLiteral() && expr.eval(ExprUtils.nilBindings()).asBoolean());
+  return equiConditions.isEmpty() && allTrueLiteralNonEquiConditions;
Contributor

It seems like allTrueLiteralNonEquiConditions is only used here; how about caching isAlwaysTrue directly?

Contributor Author

Done

 {
-  return nonEquiConditions.stream()
-                          .anyMatch(expr -> expr.isLiteral() && !expr.eval(ExprUtils.nilBindings()).asBoolean());
+  return anyFalseLiteralNonEquiConditions;
Contributor

Why not call this isAlwaysFalse? (It looks like it isn't used anywhere else, and it seems to me to be easier to understand the meaning of the field if it's named after what we want it to mean.)

Contributor Author

yeah. I wasn't sure if isAlwaysFalse could be more complex going forward. This is clearer - it took me a while to wrap my head around what isAlwaysFalse was trying to check. Added comments, and used isAlways...

@suneet-s
Contributor Author

> Cool, thank you for the PR. It seems like the benchmark class is not in master yet. Would you please link the benchmark class here?

@jihoonson here's the benchmark I was using. I will push it up to master once I clean it up a little more and add more tests
https://github.com/gianm/druid/blob/joins/benchmarks/src/main/java/org/apache/druid/benchmark/JoinAndLookupBenchmark.java

Contributor

@gianm gianm left a comment

LGTM, thanks.

@fjy fjy merged commit a2939bb into apache:master Jan 21, 2020
@jihoonson jihoonson added this to the 0.18.0 milestone Mar 26, 2020
@suneet-s suneet-s deleted the join-perf branch April 24, 2021 22:36
4 participants