-
Notifications
You must be signed in to change notification settings - Fork 5.4k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Prefilter join build side when it's too large #22667
Conversation
0eea294
to
6b56668
Compare
Awesome ! Is this strictly opt in or can it be hbo or cbo'd ? Is there also a way to fail the select distinct early if its cardinality is too big ? Finally, I did not follow: can it be applied to multiple equi join keys too ? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Great idea!
presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueries.java
Show resolved
Hide resolved
presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueries.java
Show resolved
Hide resolved
presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueries.java
Show resolved
Hide resolved
presto-tests/src/main/java/com/facebook/presto/tests/AbstractTestQueries.java
Show resolved
Hide resolved
0c67ec3
to
8cf9cdc
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Can you share some performance numbers that you see in your workloads? Maybe even add a SqlBenchmark that showcases this optimization?
IIUC - this optimization reduces the size of the hash table that is built out of T2. In order to do this it adds a second table scan on T1 and then builds a second hash table to compute the distinct join key from T1. I'm curious where the benefit comes from? Is it just the improved performance of the semijoin?
Any thoughts on how this can be used in practice? I ask because this is not a cost based decision and can degrade performance in many usecases
presto-main/src/main/java/com/facebook/presto/sql/planner/optimizations/JoinPrefilter.java
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/sql/planner/optimizations/JoinPrefilter.java
Show resolved
Hide resolved
presto-main/src/main/java/com/facebook/presto/sql/planner/optimizations/JoinPrefilter.java
Show resolved
Hide resolved
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm curious, if we already know the distinct keys in T1, why not just make it as the build side? No need to calculate the hash values, just use the distinct values as the hash values. This way there is no need to scan T1 twice.
presto-main/src/main/java/com/facebook/presto/SystemSessionProperties.java
Outdated
Show resolved
Hide resolved
suggest minor revision of the release notes entry
|
Two potential cases - a) build side is very big and only a few keys actually match so we shuffle a lot less right side, b) after the semijoin, the build side becomes small enough to broadcast which can eliminate shuffling the full left side which could have a lot of payload. |
You need to get the rest of the fields! |
2db1b6e
to
74cf802
Compare
Added benchmark with: select count(1) from part join lineitem using (partkey) where part.name like '%x%' |
Original: join_prefilter_build_side :: 2158.693 cpu ms :: 4.17MB peak memory :: in 75.2K, 0B, 34.8K/s, 0B/s :: out 381, 26.7KB, 176/s, 12.4KB/s With optimization: join_prefilter_build_side :: 2189.438 cpu ms :: 2.02MB peak memory :: in 90.2K, 0B, 41.2K/s, 0B/s :: out 381, 26.7KB, 174/s, 12.2KB/s See the memory reduction
Add a task for making it cost based - needs tracking some new stats in hbo: |
presto-main/src/main/java/com/facebook/presto/SystemSessionProperties.java
Show resolved
Hide resolved
@ClarenceThreepwood - can you take a look when you get a chance? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please update the release note with the new name of the session property
presto-main/src/main/java/com/facebook/presto/sql/planner/optimizations/JoinPrefilter.java
Show resolved
Hide resolved
This is meta internal only? |
Oops. Sorry. here correct link: |
Done |
addressed comments |
It still has the old name here |
OK for real this time lol - damn scrollbar! |
OK all comments addressed (again) |
@kaikalur Could you explain a bit more? If we just make T1 the build side, we can still get the rest of the fields, can't we? For example, |
No but that's a join reordering problem? |
Yes, that's what I was asking. Can't we just reorder the join? Or does the SQL standard say the table on the right side of the JOIN keyword has to be the build side? It seems to me the join order is a decision of the engine, isn't it? |
My understanding is that it works when the probe side has large payload to output. For example: |
@feilong-liu Thanks for your message. It makes sense. Maybe we can keep the pointers instead of copying all these fields to the hash table in that case. |
Keep pointers sounds an interesting idea. Can you elaborate more? One challenge I can think of is that, even we keep pointers, we still need to read these payload and attach to the result later, which still need to load the payload from memory. |
The payload is already in memory from the incoming pages. But you're right, if both sides are too big to be held in memory, then the side that produces smaller hash table (including the output payload) should be the build side. |
Description
Optimize the build side of join using the distinct keys from left when the right (and left too) are large.
Motivation and Context
SELECT .. FROM T1 JOIN T2 USING(x)
can be very slow/memory intense when T2 is big (and T1 is also big). So the idea is to do something like dynamic filter except on the build side! So the above query becomes:
SELECT ... FROM T1 LEFT JOIN (SELECT * FROM T2 WHERE x IN (SELECT DISTINCT x FROM T1)) T2 USING(x)
This has helped us tremendously in some of our production workloads. So making it an optimization.
Impact
Test Plan
Added tests
Contributor checklist
Release Notes
Please follow release notes guidelines and fill in the release notes below.