Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix bugs in MergeJoin when 'not_processed' is not null #40335

Merged
merged 1 commit into from Aug 23, 2022

Conversation

liql2007
Copy link
Contributor

Changelog category:

  • Bug Fix

Changelog entry:

Fix bugs in MergeJoin when 'not_processed' is not null.

Information about CI checks: https://clickhouse.com/docs/en/development/continuous-integration/

@CLAassistant
Copy link

CLAassistant commented Aug 18, 2022

CLA assistant check
All committers have signed the CLA.

@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-bugfix Pull request with bugfix, not backported by default label Aug 18, 2022
@vdimir vdimir self-assigned this Aug 18, 2022
Copy link
Member

@vdimir vdimir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you for your contribution!

Could you please describe how the bug affects the result? Does it lead to duplicated rows in the result? Do you have a reproducible example (if yes, it's good to add a test case)?

Probably related to #31009

@liql2007
Copy link
Contributor Author

@vdimir Thanks for your review!
This bug is not same as #31009.
It will cause loss of partial results, not duplicated results.

@@ -778,7 +779,10 @@ void MergeJoin::joinSortedBlock(Block & block, ExtraBlockPtr & not_processed)
if (intersection < 0)
break; /// (left) ... (right)
if (intersection > 0)
{
skip_right = 0;
continue; /// (right) ... (left)
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If not reset skip_right, may be cause wrong right cursor position in the following leftJoin<is_all>(...)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@@ -884,7 +891,7 @@ bool MergeJoin::leftJoin(MergeJoinCursor & left_cursor, const Block & left_block
{
right_cursor.nextN(range.right_length);
right_block_info.skip = right_cursor.position();
left_cursor.nextN(range.left_length);
left_key_tail = range.left_length;
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Could not update the left_cursor yet, because, maybe there are still equal keys in the right side. Forwarding the left_cursor right now, will cause the loss of result.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@vdimir vdimir added the can be tested Allows running workflows for external contributors label Aug 18, 2022
@liql2007
Copy link
Contributor Author

Hi @vdimir, I can not see the details of the failing check. Could you please help me confirm the cause of the failure? Thank you very much! :-)

@vdimir
Copy link
Member

vdimir commented Aug 22, 2022

Performance Comparison Aarch64 [1/4] — Failed to parse the report
PullRequestCI / PerformanceComparisonAarch-0 (pull_request) Successful in 17s

Looks like internal CI failure for this task

Copy link
Member

@vdimir vdimir left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Overall looks reasonable. It's still quite tricky to understand without an example. Is it difficult to reproduce (maybe changing max_block_size) in any example? Existing tests are green, though.

@liql2007
Copy link
Contributor Author

Set MergeJoin.max_joined_block_rows = 2, and in LEFT JOIN scenario, we have:

left block:
┌id┐
│ 1 │
│ 2 │
│ 3 │
└─┘

two blocks at right side:
┌id┬value┐
│ 1 │ 1 │
│ 2 │ 2 │
│ 2 │ 3 │
└─┴─┘

┌id┬value┐
│ 3 │ 4 │
│ 3 │ 5 │
│ 4 │ 6 │
└─┴─┘

Step 1: left column 1 matches with right column 1
Step 2: left column 2 matches with right column [2, 3], but the joined (result) block size can not exceed the threshold MergeJoin.max_joined_block_rows (= 2), then:

  • generate the joined block:
    ┌id┬id┬value┐
    │ 1 │ 1 │ 1 │
    │ 2 │ 2 │ 2 │
    └─┴─┴─┘
  • skip right cursor to the third row
  • set skip_right = 2
  • skip left cursor to the third row (this is a bug ) @vdimir
  • create not_processed

Step 3: start next round merge join according to the not_processed. Because of the wrong left cursor position, and the wrong updating of skip_right, left block do not match anything with the second right block.

@vdimir
Copy link
Member

vdimir commented Aug 23, 2022

Should this script reproduce the error?

SET join_algorithm='partial_merge';

SET max_block_size=3;
SET max_joined_block_size_rows = 2;

DROP TABLE IF EXISTS t1;
DROP TABLE IF EXISTS t2;
DROP TABLE IF EXISTS t3;

CREATE TABLE t1 (x UInt64) ENGINE = TinyLog;
INSERT INTO t1 VALUES (1), (2), (3);

CREATE TABLE t2 (x UInt64, value String) ENGINE = TinyLog;
INSERT INTO t2 VALUES (1, 'a'), (2, 'b'), (2, 'c');
INSERT INTO t2 VALUES (3, 'd'), (3, 'e'), (4, 'f');

SELECT * FROM t1 JOIN t2 ON t1.x = t2.x;
┌─x─┬─t2.x─┬─value─┐
│ 1 │    1 │ a     │
│ 2 │    2 │ b     │
│ 2 │    2 │ c     │
│ 3 │    3 │ d     │
│ 3 │    3 │ e     │
└───┴──────┴───────┘

@vdimir vdimir merged commit 3ee52eb into ClickHouse:master Aug 23, 2022
@liql2007
Copy link
Contributor Author

In INNER JOIN, after the join interrupt because of the too large joined block, we did not skip the left table cursor (code), so, INNER JOIN is okay.

This bug only exists in LEFT JOIN scenario.

At last, thanks for your review again !

vdimir added a commit that referenced this pull request Aug 24, 2022
@vdimir vdimir mentioned this pull request Aug 24, 2022
vdimir added a commit that referenced this pull request Aug 24, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
can be tested Allows running workflows for external contributors pr-bugfix Pull request with bugfix, not backported by default
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants