After repeated issues with the index join, we decided that removing it is the best choice for DuckDB in its current state. Currently, the index join is slow, and its results are often incorrect. Making it both performant and correct requires significant effort, and given the current state of our indexes and all the past changes to them, it might be faster to reimplement the index join than to attempt to fix it.
I did more benchmarking, as suggested by @pdet. The profiler primarily shows a lot of idle time spent waiting to acquire locks. From this, I concluded that indexes would need both read and write locks to support fast index joins. If we only use indexes for constraints, we always need to acquire a write lock, as we cannot allow concurrent changes to the index while validating constraints. In this scenario, we do not need to distinguish between read and write locks.
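To illustrate the read/write lock distinction discussed above, here is a minimal sketch using `std::shared_mutex` from the C++17 standard library. It is not DuckDB's index code: `ToyIndex`, `Insert`, `Lookup`, and `CountMatchesConcurrently` are hypothetical names made up for this example, and the index is just a `std::map`. The point is only that probes taking a shared lock do not block each other, while writers still get exclusive access.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <map>
#include <mutex>
#include <optional>
#include <shared_mutex>
#include <thread>
#include <vector>

// Hypothetical toy index (not DuckDB's ART): a sorted map guarded by a reader/writer lock.
class ToyIndex {
public:
    // Writers take the lock exclusively, as a constraint check followed by an insert would.
    bool Insert(int64_t key, int64_t row_id) {
        std::unique_lock<std::shared_mutex> guard(lock_);
        return entries_.emplace(key, row_id).second; // false if the key already exists
    }

    // Readers (join probes) take the lock in shared mode, so they do not block each other.
    std::optional<int64_t> Lookup(int64_t key) const {
        std::shared_lock<std::shared_mutex> guard(lock_);
        auto it = entries_.find(key);
        if (it == entries_.end()) {
            return std::nullopt;
        }
        return it->second;
    }

private:
    mutable std::shared_mutex lock_;
    std::map<int64_t, int64_t> entries_;
};

// Probe the index from several threads at once and count the matching keys.
int64_t CountMatchesConcurrently(const ToyIndex &index, const std::vector<int64_t> &keys, int thread_count) {
    assert(thread_count > 0);
    std::vector<int64_t> partial(static_cast<std::size_t>(thread_count), 0);
    std::vector<std::thread> workers;
    for (int t = 0; t < thread_count; t++) {
        workers.emplace_back([&index, &keys, &partial, t, thread_count] {
            // Each thread probes a strided slice of the keys and writes only its own counter.
            for (std::size_t i = static_cast<std::size_t>(t); i < keys.size();
                 i += static_cast<std::size_t>(thread_count)) {
                if (index.Lookup(keys[i]).has_value()) {
                    partial[static_cast<std::size_t>(t)]++;
                }
            }
        });
    }
    for (auto &worker : workers) {
        worker.join();
    }
    int64_t total = 0;
    for (auto count : partial) {
        total += count;
    }
    return total;
}
```

If every access took an exclusive lock instead, the probing threads above would serialize on `lock_`, which is the kind of lock-wait idle time the profiler showed. When the index is only used for constraint checks, every caller is a writer anyway, so a plain exclusive mutex would behave the same.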
Benchmarks
Here is an example of a scenario where an index join should beat a hash join. In practice, however, DuckDB performs significantly worse when it uses the index join.
Index Join
Hash Join