Fix bugs in MergeJoin when 'not_processed' is not null #40335
Thank you for your contribution!
Could you please describe how the bug affects the result? Does it lead to duplicated rows in the result? Do you have a reproducible example (if yes, it's good to add a test case)?
Probably related to #31009
@@ -778,7 +779,10 @@ void MergeJoin::joinSortedBlock(Block & block, ExtraBlockPtr & not_processed)
    if (intersection < 0)
        break; /// (left) ... (right)
    if (intersection > 0)
    {
        skip_right = 0;
        continue; /// (right) ... (left)
If skip_right is not reset here, the right cursor may start at the wrong position in the following leftJoin<is_all>(...) call.
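The effect of a stale offset can be illustrated with a small standalone model. This is only a sketch under simplifying assumptions, not the MergeJoin code: `Block`, `scanOffsets`, and the `reset_on_skip` flag are hypothetical names, blocks are reduced to sorted vectors of integer keys, and the intersection test is reduced to comparing first and last keys.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical simplification: a block is just its sorted join keys.
using Block = std::vector<int>;

// For every right block that is actually scanned, record the row offset at
// which scanning starts. `skip_right` models a resume offset carried over
// from an earlier, interrupted call; it only makes sense for the block in
// which that call stopped.
std::vector<std::size_t> scanOffsets(const Block & left,
                                     const std::vector<Block> & right_blocks,
                                     std::size_t skip_right,
                                     bool reset_on_skip)
{
    std::vector<std::size_t> offsets;
    if (left.empty())
        return offsets;

    for (const Block & right : right_blocks)
    {
        if (right.empty())
            continue;
        if (right.front() > left.back())
            break;                  // (left) ... (right): nothing further can match
        if (right.back() < left.front())
        {
            if (reset_on_skip)
                skip_right = 0;     // the offset belonged to the block being skipped
            continue;               // (right) ... (left)
        }
        offsets.push_back(skip_right);
        skip_right = 0;             // later blocks are scanned from their first row
    }
    return offsets;
}
```

In this model, with a left block `{3}`, right blocks `{1, 2, 2}` and `{3, 3, 4}`, and a resume offset of 3, the first right block is skipped. With the reset the second block is scanned from row 0; without it the scan would start at row 3, past all of that block's rows.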
@@ -884,7 +891,7 @@ bool MergeJoin::leftJoin(MergeJoinCursor & left_cursor, const Block & left_block
    {
        right_cursor.nextN(range.right_length);
        right_block_info.skip = right_cursor.position();
        left_cursor.nextN(range.left_length);
        left_key_tail = range.left_length;
We cannot advance left_cursor yet, because the right side may still contain rows with the same key. Advancing left_cursor now would drop those rows from the result.
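A reduced model of this situation, assuming inner-join semantics and integer keys, might look like the sketch below. The names `Keys`, `joinOneRightBlock`, `countJoinedPairs`, and the `advance_left_eagerly` flag are hypothetical and exist only for illustration; the real code works with cursors and blocks rather than plain vectors.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical simplification: sorted join keys stand in for a block.
using Keys = std::vector<int>;

// Joins `left` (starting at `left_pos`) against one right block and returns
// the number of matched (left, right) pairs. `left_pos` is updated so the
// next right block can continue from it.
std::size_t joinOneRightBlock(const Keys & left, std::size_t & left_pos,
                              const Keys & right, bool advance_left_eagerly)
{
    std::size_t pairs = 0;
    std::size_t i = left_pos;
    std::size_t j = 0;
    while (i < left.size() && j < right.size())
    {
        if (left[i] < right[j]) { ++i; continue; }
        if (left[i] > right[j]) { ++j; continue; }

        // Lengths of the runs of equal keys on both sides.
        std::size_t left_len = 1;
        while (i + left_len < left.size() && left[i + left_len] == left[i])
            ++left_len;
        std::size_t right_len = 1;
        while (j + right_len < right.size() && right[j + right_len] == right[j])
            ++right_len;

        pairs += left_len * right_len;
        j += right_len;

        // The right block ended inside (or exactly at the end of) a run of
        // equal keys, so the next right block may continue the same key.
        // Keep the left position at the start of its run in that case.
        if (j == right.size() && !advance_left_eagerly)
            break;
        i += left_len;
    }
    left_pos = i;
    return pairs;
}

// Joins one left block against all right blocks in order.
std::size_t countJoinedPairs(const Keys & left,
                             const std::vector<Keys> & right_blocks,
                             bool advance_left_eagerly)
{
    std::size_t total = 0;
    std::size_t left_pos = 0;
    for (const Keys & right : right_blocks)
        total += joinOneRightBlock(left, left_pos, right, advance_left_eagerly);
    return total;
}
```

With left keys `{1, 2, 3}` and right blocks `{1, 2}` and `{2, 3}`, deferring the left advance yields 4 pairs in this model, while advancing eagerly yields 3, because the second right row with key 2 no longer finds its left partner.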
Hi @vdimir, I cannot see the details of the failing check. Could you please help me confirm the cause of the failure? Thank you very much! :-)
Looks like an internal CI failure for this task.
Overall looks reasonable. It's still quite tricky to understand without an example. Is it difficult to reproduce with some example (maybe by changing max_block_size)? Existing tests are green, though.
Set left block: two blocks at right side: ┌id┬value┐
Step 1: left column 1 matches with right column 1
Step 3: start next round merge join according to the
Should this script reproduce the error?
SET join_algorithm='partial_merge';
SET max_block_size=3;
SET max_joined_block_size_rows = 2;
DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
DROP TABLE IF EXISTS t3;
CREATE TABLE t1 (x UInt64) ENGINE = TinyLog;
INSERT INTO t1 VALUES (1), (2), (3);
CREATE TABLE t2 (x UInt64, value String) ENGINE = TinyLog;
INSERT INTO t2 VALUES (1, 'a'), (2, 'b'), (2, 'c');
INSERT INTO t2 VALUES (3, 'd'), (3, 'e'), (4, 'f');
SELECT * FROM t1 JOIN t2 ON t1.x = t2.x;
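For comparison, the rows a correct inner join should return for this data can be computed independently of any join algorithm. The sketch below is a plain nested-loop reference, not ClickHouse code; `RightRow`, `JoinedRow`, `referenceInnerJoin`, and `expectedScriptResult` are hypothetical names, and the data is copied from the INSERT statements above.

```cpp
#include <cstdint>
#include <string>
#include <vector>

struct RightRow
{
    std::uint64_t x;
    std::string value;
};

struct JoinedRow
{
    std::uint64_t x;      // t1.x
    std::uint64_t t2_x;   // t2.x
    std::string value;    // t2.value
};

// Nested-loop inner join on t1.x = t2.x; slow but easy to verify by eye.
std::vector<JoinedRow> referenceInnerJoin(const std::vector<std::uint64_t> & t1,
                                          const std::vector<RightRow> & t2)
{
    std::vector<JoinedRow> out;
    for (std::uint64_t left_x : t1)
        for (const RightRow & right : t2)
            if (left_x == right.x)
                out.push_back({left_x, right.x, right.value});
    return out;
}

// Same data as the script's INSERT statements.
std::vector<JoinedRow> expectedScriptResult()
{
    const std::vector<std::uint64_t> t1{1, 2, 3};
    const std::vector<RightRow> t2{{1, "a"}, {2, "b"}, {2, "c"},
                                   {3, "d"}, {3, "e"}, {4, "f"}};
    return referenceInnerJoin(t1, t2);
}
```

The reference yields five rows: (1, a), (2, b), (2, c), (3, d), (3, e). A result from the script with fewer or more rows, in any order, would indicate lost or duplicated rows.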
In This bug only exists in At last, thanks for your review again!
Changelog category:
Changelog entry:
Fix bugs in MergeJoin when 'not_processed' is not null.