Issue #7969: Prefer Range Join #8092

hawkfish · 2023-06-27T20:14:28Z

Make sure IEJoin is ready for right side projections.
Add PRAGMA for preferring range joins.
Disable PRAGMA in benchmark because it doesn't help when the code is correct(!)

Mytherin · 2023-06-28T13:47:20Z

Thanks for the PR! Looks good to me in principle - but I wonder if this will not regress other queries. Which join is better to use (hash join vs range join) is likely heavily dependent on the selectivity of the join predicates. If the range predicate is non-selective then this will cause large regressions.

Could we run some more benchmarks testing the various scenarios?

Could we also add some more tests that trigger the various projection map scenarios with the IE Join?
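
For illustration only (not part of the PR): a rough Python sketch of why selectivity matters here. It assumes a simplified cost proxy where a hash join must examine every pair that shares a key, and a range join must examine every pair that satisfies the range predicate. The row shapes, scenario data, and the `candidate_pairs` helper are all hypothetical and do not model DuckDB's actual operators.

```python
from itertools import product


def candidate_pairs(left, right):
    """Count pairs that survive each predicate on its own.

    left rows are (key, point); right rows are (key, lo, hi).
    Returns (equality_pairs, range_pairs, matches).
    """
    equality_pairs = range_pairs = matches = 0
    for (lkey, point), (rkey, lo, hi) in product(left, right):
        same_key = lkey == rkey
        in_range = lo <= point < hi
        equality_pairs += same_key          # work proxy for a hash join
        range_pairs += in_range             # work proxy for a range join
        matches += same_key and in_range    # rows in the final result
    return equality_pairs, range_pairs, matches


# Scenario A (hypothetical): one shared key, narrow disjoint intervals.
left_a = [(0, i) for i in range(100)]
right_a = [(0, j, j + 1) for j in range(100)]

# Scenario B (hypothetical): unique keys, every interval covers every point.
left_b = [(i, i) for i in range(100)]
right_b = [(j, 0, 100) for j in range(100)]

print(candidate_pairs(left_a, right_a))  # (10000, 100, 100)
print(candidate_pairs(left_b, right_b))  # (100, 10000, 100)
```

Both scenarios return 100 rows, but the predicate that does the filtering differs: in A the equality predicate lets every pair through, in B the range predicate does. Under this proxy, preferring the range join helps A and hurts B by the same factor.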

hawkfish · 2023-06-29T15:04:59Z

> Thanks for the PR! Looks good to me in principle - but I wonder if this will not regress other queries. Which join is better to use (hash join vs range join) is likely heavily dependent on the selectivity of the join predicates. If the range predicate is non-selective then this will cause large regressions.
>
> Could we run some more benchmarks testing the various scenarios?

Right now this is behind a pragma that is off by default. It's really just for users to force the issue if perf is terrible.

My thinking on how to make this smarter is to check the selectivity of the equality predicates and switch to the range join if they are obviously horrible. We could throw in some estimates from here, but they specifically don't work in the common case (intervals), so it would be another vague heuristic. Ideally we would have something sort of smart, but also extend the pragma so it can force either join if we guess wrong.

> Could we also add some more tests that trigger the various projection map scenarios with the IE Join?

AFAICT there is only one case here where we remove unused RHS columns, so I have added a test for that (just a really small version of the benchmark). All the other cases seemed to be for indexed loop joins and the like.
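
A minimal sketch of the heuristic floated above, assuming the planner has a distinct-count estimate for the equality keys. Everything here is hypothetical: the `choose_join` function, the `JoinChoice` enum, the `force` override, and the 1% threshold are illustrative names and numbers, not DuckDB's planner code or its pragma.

```python
from enum import Enum
from typing import Optional


class JoinChoice(Enum):
    HASH = "hash"
    RANGE = "range"


def choose_join(
    distinct_keys: int,
    row_count: int,
    force: Optional[JoinChoice] = None,
    min_distinct_ratio: float = 0.01,  # arbitrary illustrative threshold
) -> JoinChoice:
    """Pick a join type for mixed equality + range predicates.

    A user-supplied `force` always wins, so a wrong guess can be overridden
    in either direction. Otherwise, fall back to the range join only when
    the equality keys look obviously non-selective.
    """
    if force is not None:
        return force
    if row_count == 0:
        return JoinChoice.HASH
    ratio = distinct_keys / row_count
    return JoinChoice.RANGE if ratio < min_distinct_ratio else JoinChoice.HASH


print(choose_join(2, 1000))                         # JoinChoice.RANGE
print(choose_join(900, 1000))                       # JoinChoice.HASH
print(choose_join(2, 1000, force=JoinChoice.HASH))  # JoinChoice.HASH
```

The sketch ignores the selectivity of the range predicate entirely, which matches the caveat above that interval estimates are unreliable; the override is what covers the cases where the guess is wrong.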

hawkfish · 2023-07-07T22:52:57Z

Think the only thing that failed was Node download.

Mytherin · 2023-07-08T11:12:33Z

Thanks!

Richard Wesley added 12 commits June 15, 2023 16:22

IEJoin Planning

59a2740

Make sure IEJoin is ready for right side projections.

Merge branch 'feature' into iejoin-projection

afb2b59

Issue duckdb#7969: IEJoin Selectivity

7e602ec

Finish IEJoin projection map support. Add benchmark.

Issue duckdb#7969: Prefer Range Join

ae759cd

Add PRAGMA for preferring range joins.

Merge branch 'feature' into iejoin-projection

252e047

Issue duckdb#7969: IEJoin Selectivity

0910c30

Add projection support. Disable pragma in benchmark because it doesn't help when the code is correct.

Merge branch 'feature' into iejoin-projection

2e0c2ae

Merge branch 'feature' into iejoin-projection

3677550

Issue duckdb#7969: IEJoin Selectivity

e21e51e

Fix smart pointer tidy madness.

Merge branch 'feature' into iejoin-projection

86edc42

Merge branch 'feature' into iejoin-projection

6c9648c

Merge branch 'iejoin-projection' of https://github.com/hawkfish/duckdb …

2145fbf

…into iejoin-projection

hawkfish requested a review from Mytherin June 27, 2023 20:14

Richard Wesley added 2 commits June 28, 2023 11:56

Merge branch 'feature' into iejoin-projection

40aefa1

Issue duckdb#7969: IEJoin Projection

b6ef28f

Review feedback: Add test.

Merge branch 'feature' into iejoin-projection

9cd751b

github-actions bot marked this pull request as draft June 30, 2023 20:48

hawkfish marked this pull request as ready for review June 30, 2023 20:54

Issue duckdb#7969: IEJoin Projection

b92bb84

Attempt to stabilise random number generation.

github-actions bot marked this pull request as draft July 1, 2023 03:28

Issue duckdb#7969: IEJoin Projection Test

0913104

Another attempt to generate stable random data on Linux...

hawkfish marked this pull request as ready for review July 1, 2023 16:36

Issue duckdb#7969: IEJoin Projection Test

2d9d53f

Try casting to DECIMAL to fix test...

hawkfish added the enhancement label Jul 2, 2023

github-actions bot marked this pull request as draft July 2, 2023 12:52

Richard Wesley added 3 commits July 2, 2023 08:53

Issue duckdb#7969: IEJoin Projection Test

032a805

Match cast precision to ROUND.

Issue duckdb#7969: IEJoin Projection Test

0907dee

Switch to exact aggregate.

Issue duckdb#7969: IEJoin Projection Test

ea80946

Add magic skip_reload requirement.

Mytherin changed the base branch from feature to master July 4, 2023 13:18

Mytherin marked this pull request as ready for review July 4, 2023 15:29

Richard Wesley added 2 commits July 4, 2023 14:33

Merge branch 'master' into iejoin-projection

d1ee96e

Merge branch 'master' into iejoin-projection

553256f

github-actions bot marked this pull request as draft July 5, 2023 15:40

Merge branch 'master' into iejoin-projection

9569452

Mytherin marked this pull request as ready for review July 7, 2023 07:10

Mytherin merged commit b4c1529 into duckdb:master Jul 8, 2023
53 checks passed

hawkfish deleted the iejoin-projection branch July 26, 2023 21:30
