Fix SYSTEM UNFREEZE for ordinary database #38262
src/Storages/Freeze.cpp
Outdated
if (!disk->exists(store_path))
continue;
for (auto prefix_it = disk->iterateDirectory(store_path); prefix_it->isValid(); prefix_it->next())
for (auto store_path: store_paths)
Only a for loop was added here.
I looked through the changes in the previous PR (just to understand the changes in this one) and left some comments in #36424, please take a look. You can address the comments in this PR.
639b84f to 3296ba2
src/Storages/Freeze.cpp
Outdated
if (version == 1) {
/// is_replicated and is_remote are not used
bool is_replicated = true;
writeBoolText(is_replicated, buffer);
buffer.write("\n", 1);
bool is_remote = true;
writeBoolText(is_remote, buffer);
buffer.write("\n", 1);
I would prefer to completely remove these flags without changing the version. It will break compatibility with 22.6.1, but I think it's not a problem, because 22.6 was released only recently and I doubt that anyone has already started to use 22.6.1 with this freeze-metadata-stuff in production. And the previous serialization format was invalid anyway (I mean the issues with escaping).
The thing is, the flags were introduced before this PR. See the history of the removed code for details.
writeBoolText(is_remote, buffer);
buffer.write("\n", 1);
}
writeString(escapeForFileName(replica_name), buffer);
Please also check that other strings cannot contain unexpected chars
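To illustrate why unescaped strings are a problem in a newline-delimited metadata file, here is a minimal sketch. `percentEscape` is a hypothetical stand-in written for this example; it is not the real `escapeForFileName`, whose exact rules are not reproduced here. The only point is that an escaped value can never contain the `\n` that separates fields.

```cpp
#include <cassert>
#include <cctype>
#include <string>

// Hypothetical helper (an assumption for illustration, not ClickHouse code):
// keeps [A-Za-z0-9_] and percent-encodes every other byte, so the result
// never contains the '\n' used as a field separator.
inline std::string percentEscape(const std::string & s)
{
    static const char * hex = "0123456789ABCDEF";
    std::string out;
    for (unsigned char c : s)
    {
        if (std::isalnum(c) || c == '_')
        {
            out += static_cast<char>(c);
        }
        else
        {
            out += '%';
            out += hex[c >> 4];
            out += hex[c & 0x0F];
        }
    }
    return out;
}

// Usage example: a raw newline inside a value would be read back as two
// fields, while the escaped form stays on a single line.
inline void checkEscaping()
{
    assert(percentEscape("replica\n1") == "replica%0A1");
    assert(percentEscape("replica_1") == "replica_1");
    assert(percentEscape("a/b").find('/') == std::string::npos);
}
```

Any field written without such a step (not only `replica_name`) can break parsing if it contains the separator.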
}
}
if (disk->exists(backup_path))
{
/// After unfreezing we need to clear revision.txt file and empty directories
Still not clear why we cannot just remove everything recursively. Why do we need complex "unfreeze" logic?
/// After unfreezing we need to clear the revision.txt file and empty directories.
/// The revision.txt file shouldn't be unfrozen, it should just be deleted
Is that clear?
No. I mean that FREEZE PARTITION only creates hardlinks. So UNFREEZE should only remove hardlinks. It's not clear why we need some complex logic for "unfreezing", it was supposed to do just rm -rf. I understand that all this stuff is needed for "s3 zero copy replication", but I do not understand how "s3 zero copy replication" (and especially "unfreezing" with "s3 zero copy replication") works because of the lack of comments. A comment that describes how all this stuff works is required.
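The "freeze only creates hardlinks, unfreeze only removes them" model for a local disk can be sketched with `std::filesystem`. This is a minimal illustration under stated assumptions: the function names and directory layout are hypothetical, and it does not model remote disks or "s3 zero copy replication", which is exactly the part the comment above says needs documenting.

```cpp
#include <cassert>
#include <cstdint>
#include <filesystem>
#include <fstream>

namespace fs = std::filesystem;

// "Freeze" (hypothetical sketch): hard-link every regular file of a part
// directory into a backup directory. No data is copied.
inline void freezeSketch(const fs::path & part_dir, const fs::path & backup_dir)
{
    fs::create_directories(backup_dir);
    for (const auto & entry : fs::directory_iterator(part_dir))
        if (entry.is_regular_file())
            fs::create_hard_link(entry.path(), backup_dir / entry.path().filename());
}

// "Unfreeze" (hypothetical sketch): the equivalent of rm -rf on the backup
// directory. Returns the number of removed files and directories.
inline std::uintmax_t unfreezeSketch(const fs::path & backup_dir)
{
    return fs::remove_all(backup_dir);
}

// Usage example: the original file survives the round trip and only its
// hard link count changes.
inline void hardlinkRoundTrip(const fs::path & root)
{
    fs::remove_all(root);
    fs::create_directories(root / "part");
    std::ofstream(root / "part" / "data.bin") << "x";

    freezeSketch(root / "part", root / "shadow" / "1");
    assert(fs::hard_link_count(root / "part" / "data.bin") == 2);

    unfreezeSketch(root / "shadow");
    assert(fs::exists(root / "part" / "data.bin"));
    assert(fs::hard_link_count(root / "part" / "data.bin") == 1);

    fs::remove_all(root);
}
```

On a plain local filesystem nothing more than the recursive removal is needed, which is why the extra logic in the PR needs an explanatory comment.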
See also #36424 (comment)
The issue with the zk session should be fixed as well.
Also: it does not work correctly if the table was created with the old syntax. It's ok to drop support for UNFREEZE in this case, but we should at least throw an exception. Freeze.cpp: 178
Ordinary databases are deprecated in 22.7.
Yes, because it also addresses review comments from #36424. The previous PR was merged by mistake and it has to be fixed. We can drop support for Ordinary, but we need the rest of the fixes.
@tavplubix please check zookeeper and partition matching. Do you mean something like that? 85bf022
src/Storages/Freeze.cpp
Outdated
@@ -179,8 +184,13 @@ PartitionCommandsResultInfo Unfreezer::unfreezePartitionsFromTableDirectory(Merg
for (auto it = disk->iterateDirectory(table_directory); it->isValid(); it->next())
{
const auto & partition_directory = it->name();

int count_underscores = std::count_if(partition_directory.begin(), partition_directory.end(), []( char c ){return c =='_';});
if ((format_version.has_value() && format_version == 0) || count_underscores == 4) {
Number of underscores can be 4 with new syntax as well (if mutation version is not 0)
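A small sketch of why the underscore count cannot tell the two syntaxes apart. The part names below are made-up examples, and the layouts in the comment reflect my understanding of MergeTree part naming rather than anything quoted in this PR.

```cpp
#include <algorithm>
#include <cassert>
#include <string>

// Counts '_' characters in a part directory name.
inline auto countUnderscores(const std::string & name)
{
    return std::count(name.begin(), name.end(), '_');
}

// Assumed layouts (illustrative names only):
//   old syntax: <min_date>_<max_date>_<min_block>_<max_block>_<level>
//   new syntax: <partition_id>_<min_block>_<max_block>_<level>[_<mutation>]
inline void checkUnderscoreAmbiguity()
{
    assert(countUnderscores("20220601_20220630_1_5_2") == 4); // old syntax
    assert(countUnderscores("202206_1_5_2") == 3);            // new syntax, no mutation
    assert(countUnderscores("202206_1_5_2_7") == 4);          // new syntax, mutated part
}
```

The first and third names both contain four underscores, so `count_underscores == 4` would misclassify a mutated new-syntax part as old syntax.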
Also we don't need to parse partition_id at all if matcher always returns true (when it is called from Unfreezer::unfreeze or a similar place). Currently it can parse some garbage, but simply does not throw an exception in this case.
I don't understand. Parsing partition_id is needed in order to output it. If I remove the check with underscores,
- ALTER TABLE UNFREEZE will delete data only if format_version is greater than zero.
- SYSTEM UNFREEZE will delete any data regardless of format_version
So not checking this leads to different behaviour of these queries. Am I right?
Sorry, I missed that SYSTEM UNFREEZE prints partition ids in the query result, I thought ids were needed only for filtering. So currently it prints invalid ids if the table was created with the old syntax. It's not clear how to fix it, I see the following options:
- Do not return list of partitions for SYSTEM UNFREEZE
- Autodetect format_version somehow for SYSTEM UNFREEZE (not sure if it's possible, but maybe checking some files inside frozen data part can help)
Ok, I think not returning list is easier and for sure possible. But still what about deleting data? Different data will be deleted for alter and system queries if not doing autodetect
Because if format_version is 0, there will be an exception, so the partition will not be removed by the ALTER query. But since there is no exception when called from Unfreezer::unfreeze, the partition will be removed by the SYSTEM query
I think it's ok, SYSTEM UNFREEZE will simply remove everything, because it's impossible to distinguish partitions in this case.
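The difference between the two call sites discussed in this thread can be sketched with a matcher callback. Everything here is a simplified assumption for illustration: `Matcher`, `partitionIdPrefix` and `selectForUnfreeze` are hypothetical names, the id is naively taken as the text before the first underscore, and the exception the real ALTER path throws for format_version 0 is not modelled.

```cpp
#include <cassert>
#include <functional>
#include <string>
#include <vector>

using Matcher = std::function<bool(const std::string & partition_id)>;

// Hypothetical, deliberately naive: treats everything before the first '_'
// as the partition id. For old-syntax names this yields a meaningless value.
inline std::string partitionIdPrefix(const std::string & dir)
{
    return dir.substr(0, dir.find('_'));
}

// Returns the directories whose (possibly garbage) partition id is accepted.
inline std::vector<std::string> selectForUnfreeze(const std::vector<std::string> & dirs, const Matcher & matcher)
{
    std::vector<std::string> selected;
    for (const auto & dir : dirs)
        if (matcher(partitionIdPrefix(dir)))
            selected.push_back(dir);
    return selected;
}

inline void checkSelection()
{
    const std::vector<std::string> dirs = {"202206_1_5_2", "20220601_20220630_1_5_2"};

    // SYSTEM UNFREEZE style: the matcher accepts everything, so the parsed
    // id does not influence what gets removed.
    assert(selectForUnfreeze(dirs, [](const std::string &) { return true; }).size() == 2);

    // ALTER ... UNFREEZE PARTITION style: only a matching id is selected.
    const Matcher only_june = [](const std::string & id) { return id == "202206"; };
    const auto selected = selectForUnfreeze(dirs, only_june);
    assert(selected.size() == 1 && selected[0] == "202206_1_5_2");
}
```

With an always-true matcher the parsed id matters only for reporting, which is why dropping the returned partition list was suggested for SYSTEM UNFREEZE.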
Hm, it seems like throwing an exception may break some user queries since this logic was introduced for alter unfreeze. May it? Is it critical?
if (!matcher(partition_id))
I would rather mark it as a TODO and/or create a new issue about this. If I touch this code, the only solution I see that doesn't break any query is to autodetect the format version. Otherwise, it may break queries and behave differently for ALTER and SYSTEM queries. I would create a separate issue for autodetection, since it is not just about the SYSTEM UNFREEZE feature. It is about fixing a small bug for both ALTER UNFREEZE and SYSTEM UNFREEZE, and the fix may introduce a bigger bug
…to enable system unfreeze
LGTM, but a few things left:
- fix tests (see tests/config/install.sh)
- add a comment describing how all this stuff with hardlinks and unfreezing works
@tavplubix Can you take a look now? TestsBugfixCheck failed with "nothing to run" for integration tests. Does this mean that an integration test is required too?
No, it means that TestsBugfixCheck works incorrectly, cc: @vdimir
It expects that the added test (either functional or integration) would fail on the previous stable release, but it passed. In some cases, if the bug is not stably reproduced, that can be fine. But "error: Nothing to run" looks confusing, I'll fix the message.
Also it should post the check status with a link to the report (as other checks do), and the actions task should not fail
Integration tests failures look suspicious
Yes, we will fix it.
Integration tests still fail
Integration tests have failed.
…mist/ClickHouse into fix-ordinary-system-unfreeze
…-system-unfreeze Fix SYSTEM UNFREEZE for ordinary database
…tem-unfreeze-for-ordinary-database Merge pull request #38262 from PolyProgrammist/fix-ordinary-system-un…
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix SYSTEM UNFREEZE query for Ordinary database. Fix for #36424