Turn off parallel parsing when memory limit is small. #16721
@@ -166,6 +166,9 @@ BlockInputStreamPtr FormatFactory::getInput(
// (segmentator + two parsers + reader).
bool parallel_parsing = settings.input_format_parallel_parsing && file_segmentation_engine && settings.max_threads >= 4;

if (settings.min_chunk_bytes_for_parallel_parsing * settings.max_threads * 2 > settings.max_memory_usage)
Looks like max_memory_usage_for_user also should be checked.
BTW can't this be done by lowering the number of threads dynamically instead?
BTW can't this be done by lowering the number of threads dynamically instead?
I don't know. I think the logic would be complex because of corner cases. The chosen number of threads must be > 4, otherwise it will be ineffective, and if settings.max_memory_usage < settings.min_chunk_bytes_for_parallel_parsing, we have to turn it off...
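For illustration only, the "lower the number of threads dynamically" idea and the corner cases from the reply above could be sketched roughly like this. `chooseParsingThreads` is a hypothetical helper, not something from this patch. It assumes the same estimate as the diff (two chunks per thread), takes the threshold of 4 threads from the `max_threads >= 4` condition in the diff, and assumes that a memory limit of 0 means "no limit".

```cpp
#include <algorithm>
#include <cstdint>

// Hypothetical helper, not part of the patch.
// Returns the number of threads to use for parallel parsing,
// or 0 if parallel parsing should be turned off.
uint64_t chooseParsingThreads(uint64_t min_chunk_bytes, uint64_t max_threads, uint64_t memory_limit)
{
    // Taken from the `max_threads >= 4` condition in the diff:
    // with fewer threads parallel parsing is considered ineffective.
    const uint64_t min_useful_threads = 4;

    uint64_t threads = max_threads;

    // Assumption: memory_limit == 0 means "no limit".
    if (memory_limit != 0 && min_chunk_bytes != 0)
    {
        // Same estimate as the diff (min_chunk_bytes * threads * 2),
        // rearranged as a division so that it cannot overflow.
        // If memory_limit < min_chunk_bytes this is 0, i.e. turn it off.
        uint64_t affordable = memory_limit / min_chunk_bytes / 2;
        threads = std::min(threads, affordable);
    }

    return threads >= min_useful_threads ? threads : 0;
}
```

With `min_chunk_bytes = 10` and `max_threads = 16`, a limit of 100 bytes would give 5 threads, while a limit of 60 bytes would give only 3 and therefore turn parallel parsing off.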
c1872af to 9803565
@nikitamikhaylov you forgot about max_memory_usage_for_user, or was that the intention?
Totally forgot, will fix.
I hereby agree to the terms of the CLA available at: https://yandex.ru/legal/cla/?lang=en
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Turn off parallel parsing when there is not enough memory for all threads to work simultaneously. Also, there could be exceptions like "Memory limit exceeded" when somebody tries to insert extremely huge rows (> min_chunk_bytes_for_parallel_parsing), because each piece to parse has to be an independent set of strings (one or more).
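A rough sketch of the check described in this entry, extended to max_memory_usage_for_user as requested in the review. This is not the merged code: `ParsingSettings`, `fitsInLimit` and `shouldUseParallelParsing` are hypothetical stand-ins for illustration, and treating a limit of 0 as "no limit" is an assumption.

```cpp
#include <cstdint>

// Simplified stand-in for the settings referenced in the diff;
// not the real ClickHouse Settings class.
struct ParsingSettings
{
    bool input_format_parallel_parsing = true;
    uint64_t max_threads = 8;
    uint64_t min_chunk_bytes_for_parallel_parsing = 0;
    uint64_t max_memory_usage = 0;
    uint64_t max_memory_usage_for_user = 0;
};

// Assumption: a limit of 0 means "no limit".
bool fitsInLimit(uint64_t required, uint64_t limit)
{
    return limit == 0 || required <= limit;
}

bool shouldUseParallelParsing(const ParsingSettings & s, bool has_segmentation_engine)
{
    // Preconditions already present in the diff.
    if (!s.input_format_parallel_parsing || !has_segmentation_engine || s.max_threads < 4)
        return false;

    // Memory estimate from the diff: two chunks per thread.
    // Overflow handling is omitted to keep the sketch short.
    uint64_t required = s.min_chunk_bytes_for_parallel_parsing * s.max_threads * 2;

    // Both the per-query and the per-user limit have to be respected.
    return fitsInLimit(required, s.max_memory_usage)
        && fitsInLimit(required, s.max_memory_usage_for_user);
}
```

Note that this only covers the minimum chunk size: a single row larger than min_chunk_bytes_for_parallel_parsing still has to fit into one chunk, so the "Memory limit exceeded" case mentioned above remains possible.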