Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Rework row matching #8979

Merged
merged 16 commits into from
Sep 20, 2023
Merged

Rework row matching #8979

merged 16 commits into from
Sep 20, 2023

Conversation

lnkuiper
Copy link
Contributor

Fixes #8328, which triggered this PR.

I've reworked row matching. I initially wanted to do a small patch instead of a rework, but @taniabogatsch and I could not figure out how the old code worked in the first place, as it seemed to handle cases like COMPARE_EQUAL and COMPARE_NOT_DISTINCT_FROM the same even though they are not. This is probably why the bug appeared in the first case.

Instead of a static function with multiple switches every time it is called, we now initialize a RowMatcher object that does the switches exactly once, obtaining the correct function pointers. So, other than correctly matching rows, this PR should also give a slight performance boost.

@github-actions github-actions bot marked this pull request as draft September 18, 2023 14:51
@lnkuiper lnkuiper marked this pull request as ready for review September 19, 2023 08:35
Copy link
Contributor

@taniabogatsch taniabogatsch left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hey @lnkuiper! Thanks for the PR and for picking this up to properly rework it! 🎉 Looks neat now, and seeing a lot of const there is very nice.

My comments are minor notes on readability. I tried to follow the logic as best as possible, and it looks good to me!

src/common/row_operations/row_matcher.cpp Outdated Show resolved Hide resolved
src/common/row_operations/row_matcher.cpp Outdated Show resolved Hide resolved
src/common/row_operations/row_matcher.cpp Outdated Show resolved Hide resolved
src/common/row_operations/row_matcher.cpp Show resolved Hide resolved
src/common/row_operations/row_matcher.cpp Outdated Show resolved Hide resolved
@github-actions github-actions bot marked this pull request as draft September 19, 2023 13:42
@lnkuiper lnkuiper marked this pull request as ready for review September 19, 2023 16:12
@lnkuiper
Copy link
Contributor Author

Thanks for the review! I think this is good to go. Somehow I'm getting a coverage failure in the csv reader, but that shouldn't be because of my changes, and is probably a spurious failure

Copy link
Collaborator

@Mytherin Mytherin left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for the PR! Looks great - some minor comments:

expr->type <= ExpressionType::COMPARE_GREATERTHANOREQUALTO) ||
expr->type == ExpressionType::COMPARE_DISTINCT_FROM ||
expr->type == ExpressionType::COMPARE_NOT_DISTINCT_FROM) {
} else if (expr->type == ExpressionType::COMPARE_EQUAL || expr->type == ExpressionType::COMPARE_NOTEQUAL ||
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unless I'm mistaken this doesn't appear to functionally change anything, it just lists out the enums instead of using a range comparison?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes, @taniabogatsch and I changed this because we found it more readable this way

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It was a leftover from the initial debugging, but we decided to keep it for readability, yes.

@Mytherin Mytherin merged commit eb1039d into duckdb:main Sep 20, 2023
48 of 50 checks passed
@Mytherin
Copy link
Collaborator

Thanks!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Join bug using DISTINCT FROM condition
3 participants