Hash join with direct indexing

**Is your feature request related to a problem or challenge? Please describe what you are trying to do.**
When the range of the join keys is small and we have integer keys, instead of using a hashtable we can use a array/vec which starts on the minimum value of the values in the build-side key range.

Idea from:
https://github.com/duckdb/duckdb/pull/1959

**Describe the solution you'd like**
Filter on
* join constraint should be on single integer (boolean/byte also possible, as long as it is limited in domain) keys.
* no duplicates (?) - not sure whether its required
* If no duplicates - > we can see how much unique values we have -> should be less than max limit, e.g. less than 100K values.
* Get statistics on min/max values. It should be a small enough build-side (e.g. max 100K key values)

If this is all true we can copy the offsets to somethig like a `Vec<Option<u64>>` (or arrow equivalent) which contains the offsets starting at the minimum offset at index 0. Each element is indexed by `x - MIN(x)`.

Instead of hashing the right-side values and probing, we can compute `y - MIN(x)` for the right side and index directly into the array. Checking hash collisions is not necessary for this approach.  

**Describe alternatives you've considered**

**Additional context**


Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Uh oh!

Hash join with direct indexing #816

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Hash join with direct indexing #816

Description

Metadata

Metadata

Assignees

Labels

Type

Projects

Milestone

Relationships

Development

Issue actions