#1791: Nested hash table entries #1863

Merged
merged 32 commits into from Jun 15, 2021

hawkfish
Contributor

Add support for nested types as aggregate and join keys.

Richard Wesley added 25 commits June 7, 2021 10:31
Refactor RowOperations::Gather so that it does not require a RowLayout: the only thing it uses is the col_offset, and most callers have easy access to that.
Test grouping by nested types.
Implement joining on nested types. This required some questionable template specialisation for Values.
Fix Vector passing to actually work.
Replace Value code for STRUCT comparisons with recursive templates.
Delegate nested row matching to the column matching logic.
Fix incorrect short-circuiting for IS (NOT) DISTINCT tests (the values may not be valid if they are NULL).
Implement PositionComparator variants.
Convert STRUCT hashing from Values to recursive scalar evaluation.
Convert LIST hashing from Values to recursive scalar evaluation.
Factor out common STRUCT code for reuse with LIST.
Compile LIST variant, but don't deploy.
Switch over to recursive vectorised LIST comparisons.
Fix bug in Is(Not)Distinct code where matches were not being selected.
Back out unnecessary Value changes.
Remove pointless STRUCT child munging.
Implement nested Is (Not) Distinct comparators.
Add tests to verify grouping by nested types.
Refactor to avoid name collisions.
Fix confusing cancelling coding errors.
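
For readers skimming the commit list above: using a nested value as a grouping or join key needs a hash and an equality test that both recurse into the value's children. The following is a minimal sketch under stated assumptions, not the code from this PR. `Value`, `Kind`, `HashValue`, `KeyEquals`, `Mix`, `Combine` and the constants are hypothetical names chosen for illustration, and the commits describe vectorised code rather than one tree per value as used here.

```cpp
#include <cstddef>
#include <cstdint>
#include <utility>
#include <vector>

// Hypothetical value model for illustration only (not DuckDB's classes).
enum class Kind { Null, Scalar, Struct, List };

struct Value {
    Kind kind = Kind::Null;
    int64_t scalar = 0;          // payload when kind == Scalar
    std::vector<Value> children; // fields (Struct) or elements (List)
};

inline Value MakeNull() { return Value{}; }

inline Value MakeScalar(int64_t v) {
    Value r;
    r.kind = Kind::Scalar;
    r.scalar = v;
    return r;
}

inline Value MakeNested(Kind kind, std::vector<Value> children) {
    Value r;
    r.kind = kind;
    r.children = std::move(children);
    return r;
}

// Illustrative 64-bit mixing step; the constants are an assumption.
inline uint64_t Mix(uint64_t x) {
    x ^= x >> 30;
    x *= 0xbf58476d1ce4e5b9ULL;
    x ^= x >> 27;
    x *= 0x94d049bb133111ebULL;
    x ^= x >> 31;
    return x;
}

inline uint64_t Combine(uint64_t seed, uint64_t h) {
    return Mix(seed ^ (h + 0x9e3779b97f4a7c15ULL + (seed << 6) + (seed >> 2)));
}

// Fixed hash for NULL so that NULL keys land in the same bucket.
constexpr uint64_t kNullHash = 0x9e3779b97f4a7c15ULL;

// Hash a value by recursing into its children and folding their hashes.
inline uint64_t HashValue(const Value &v) {
    if (v.kind == Kind::Null) {
        return kNullHash; // never read the payload of a NULL
    }
    if (v.kind == Kind::Scalar) {
        return Mix(static_cast<uint64_t>(v.scalar));
    }
    uint64_t h = Mix(v.children.size()); // the child count is part of the key
    for (const Value &child : v.children) {
        h = Combine(h, HashValue(child));
    }
    return h;
}

// Grouping-style equality: two NULLs compare equal, as GROUP BY treats them.
inline bool KeyEquals(const Value &a, const Value &b) {
    if (a.kind != b.kind) {
        return false;
    }
    if (a.kind == Kind::Null) {
        return true;
    }
    if (a.kind == Kind::Scalar) {
        return a.scalar == b.scalar;
    }
    if (a.children.size() != b.children.size()) {
        return false;
    }
    for (std::size_t i = 0; i < a.children.size(); i++) {
        if (!KeyEquals(a.children[i], b.children[i])) {
            return false;
        }
    }
    return true;
}
```

The property a hash table relies on is that any two keys for which `KeyEquals` returns true also produce the same `HashValue`; the reverse does not have to hold, which is why the equality check is still needed after a hash match.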
Collaborator

@Mytherin Mytherin left a comment

Thanks for the PR! Looks fantastic, and a very exciting feature! Some comments:

Review thread on src/common/vector_operations/vector_hash.cpp (outdated, resolved)
Richard Wesley added 2 commits June 14, 2021 12:16
Implement (in)equality boolean comparison functions for nested types
Expose hash table selection logic as comparison/distinction predicates.
@hawkfish hawkfish requested a review from Mytherin June 14, 2021 22:56
Replace duplicate code with calls to the original.
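
As a rough illustration of what a selection-style distinction predicate looks like, the sketch below compares two flat columns row by row and returns the indices of the rows that match. It is a sketch under stated assumptions, not the code merged here: `Column`, `NotDistinctAt`, `SelectNotDistinct` and `SelectDistinct` are hypothetical names, and the columns hold plain integers rather than nested types. It checks validity before touching the data, which is the ordering the earlier short-circuiting fix is about, since the bytes under a NULL may not hold a meaningful value.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical flat column: one value per row plus a validity flag
// (true means the row is not NULL).
struct Column {
    std::vector<int64_t> data;
    std::vector<bool> valid;
};

// Row-level test for "left IS NOT DISTINCT FROM right".
inline bool NotDistinctAt(const Column &left, const Column &right, std::size_t i) {
    const bool lvalid = left.valid[i];
    const bool rvalid = right.valid[i];
    if (!lvalid || !rvalid) {
        // At least one side is NULL: decide on validity alone and
        // do not read the data, which may be uninitialised.
        return lvalid == rvalid;
    }
    return left.data[i] == right.data[i];
}

// Returns the indices of the rows where left IS NOT DISTINCT FROM right.
// Assumes both columns have the same number of rows.
inline std::vector<std::size_t> SelectNotDistinct(const Column &left, const Column &right) {
    std::vector<std::size_t> sel;
    for (std::size_t i = 0; i < left.data.size(); i++) {
        if (NotDistinctAt(left, right, i)) {
            sel.push_back(i);
        }
    }
    return sel;
}

// Returns the indices of the rows where left IS DISTINCT FROM right.
inline std::vector<std::size_t> SelectDistinct(const Column &left, const Column &right) {
    std::vector<std::size_t> sel;
    for (std::size_t i = 0; i < left.data.size(); i++) {
        if (!NotDistinctAt(left, right, i)) {
            sel.push_back(i);
        }
    }
    return sel;
}
```

Unlike ordinary equality, neither predicate ever yields NULL: two NULL rows are selected by `SelectNotDistinct`, and a NULL paired with a non-NULL row is selected by `SelectDistinct`.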
@Mytherin Mytherin merged commit 82f9bf1 into duckdb:master Jun 15, 2021
@Mytherin
Collaborator

Thanks for the fixes! Looks great.
