Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix max granules size in MergeTreeDataWriter #19123

Merged
merged 2 commits into from Jan 16, 2021

Conversation

alesapin
Copy link
Member

@alesapin alesapin commented Jan 15, 2021

I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix bug in merge tree data writer which can lead to marks with bigger size than fixed granularity size. Fixes #18913.

@robot-clickhouse robot-clickhouse added the pr-not-for-changelog This PR should not be mentioned in the changelog label Jan 15, 2021
@alesapin
Copy link
Member Author

alesapin commented Jan 15, 2021

Ok, I know how to fix it now, but the error caused by the code complexity.... It's sad.

@robot-clickhouse robot-clickhouse added pr-bugfix Pull request with bugfix, not backported by default and removed pr-not-for-changelog This PR should not be mentioned in the changelog labels Jan 15, 2021

client.query("SYSTEM STOP MERGES t")

# These blocks size taken from the real table which reproduces the error
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Bug description here.

@alesapin alesapin changed the title Additional check for huge granules in MergeTreeDataWriter Fix max granules size in MergeTreeDataWriter Jan 15, 2021
@@ -607,6 +614,11 @@ void MergeTreeDataPartWriterWide::fillIndexGranularity(size_t index_granularity_

void MergeTreeDataPartWriterWide::adjustLastMarkIfNeedAndFlushToDisk(size_t new_rows_in_last_mark)
{
/// We don't want to split already written granules to smaller
if (rows_written_in_last_mark > new_rows_in_last_mark)
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Additional self-check. We have quite a lot of them, but I think it's better than possible data corruption.

else /// We still can write some rows from new block into previous granule.
adjustLastMarkIfNeedAndFlushToDisk(index_granularity_for_block - rows_written_in_last_mark);
else /// We still can write some rows from new block into previous granule. So the granule size will be block granularity size.
adjustLastMarkIfNeedAndFlushToDisk(index_granularity_for_block);
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The whole fix is here.

@alesapin
Copy link
Member Author

I think it's better to merge it to avoid strange failures in debug tests.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-bugfix Pull request with bugfix, not backported by default
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Incorrect mark is found while running debug tests
2 participants