New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Don't mark a part as broken on Poco::TimeoutException
#50811
Conversation
This is an automated comment for commit 5db3b39 with description of existing statuses. It's updated for the latest CI running
|
@@ -1252,6 +1252,10 @@ MergeTreeData::LoadPartResult MergeTreeData::loadDataPart( | |||
mark_broken(); | |||
return res; | |||
} | |||
catch (const Poco::TimeoutException &) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok. What about NetException?
catch (const Poco::TimeoutException &) | ||
{ | ||
throw; | ||
} | ||
catch (...) | ||
{ | ||
mark_broken(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks dangerous - why don't explicitly list the exceptions leading to broken parts?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For example, what about the "memory limit exceeded" exception?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This looks dangerous
Yes, but it always worked this way. It's not that dangerous because it's easy to attach broken parts back
why don't explicitly list the exceptions leading to broken parts?
I guess because we don't really know what exceptions lead to broken parts. Just like we don't know what exceptions are retriable.
what about the "memory limit exceeded" exception?
ClickHouse/src/Storages/MergeTree/MergeTreeData.cpp
Lines 1245 to 1252 in eb698a7
catch (const Exception & e) | |
{ | |
/// Don't count the part as broken if there was a retryalbe error | |
/// during loading, such as "not enough memory" or network error. | |
if (isRetryableException(e)) | |
throw; | |
mark_broken(); |
ClickHouse/src/Storages/MergeTree/MergeTreeData.cpp
Lines 1172 to 1175 in eb698a7
static bool isRetryableException(const Exception & e) | |
{ | |
if (isNotEnoughMemoryErrorCode(e.code())) | |
return true; |
Tests are broken. |
Tests are broken in master. |
That's the problem. We need to fix the master. |
Backport #50811 to 23.5: Don't mark a part as broken on `Poco::TimeoutException`
Backport #50811 to 23.4: Don't mark a part as broken on `Poco::TimeoutException`
Backport #50811 to 23.3: Don't mark a part as broken on `Poco::TimeoutException`
Changelog category (leave one):
https://pastila.nl/?00195371/ea6ef20d9ba9ee9f0dbd5ca034008515