Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix executable dictionary source hang #14525

Merged
merged 7 commits into from Sep 10, 2020

Conversation

alexey-milovidov
Copy link
Member

@alexey-milovidov alexey-milovidov commented Sep 7, 2020

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix executable dictionary source hang. In previous versions, when using some formats (e.g. JSONEachRow) data was not feed to a child process before it outputs at least something. This closes #1697. This closes #2455.

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Sep 7, 2020
@alexey-milovidov alexey-milovidov changed the title Fix executable dictionary source hangup Fix executable dictionary source hang Sep 7, 2020
@alexey-milovidov alexey-milovidov added the force tests Force test ignoring fast test output. label Sep 7, 2020
@nikitamikhaylov nikitamikhaylov self-assigned this Sep 7, 2020
Copy link
Member

@nikitamikhaylov nikitamikhaylov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've read the code of ExecutableDictionarySource and everything near with it. Why do you think that BOM caused deadlock? I think it is because of std::packaged_task which is executed asynchronously (by itself, but we call it from another thread for unknown reason). So, we called operator() here, but according to notes here it is synchronized only with calling of get() on its std::future. I think it means that we can't guarantee whether or not the function is executed after the operator() is called. It seems that in our case the underlying function wasn't executed. The author wrote in the issue, that ClickHouse writes to stdin only after the process writes something to stdout, and this behavior is explained by reasoning above.

@@ -18,7 +18,8 @@
"00152_insert_different_granularity",
"00151_replace_partition_with_different_granularity",
"00157_cache_dictionary",
"01193_metadata_loading"
"01193_metadata_loading",
"01474_executable_dictionary" /// informational stderr from sanitizer at start
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think in JSON we don't have comments.

Suggested change
"01474_executable_dictionary" /// informational stderr from sanitizer at start
"01474_executable_dictionary"

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We do.

@alexey-milovidov
Copy link
Member Author

Integration test:
test_adaptive_granularity/test.py::test_version_update_two_nodes

@alexey-milovidov
Copy link
Member Author

AST Fuzzer: #14677

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
force tests Force test ignoring fast test output. no-docs-needed pr-bugfix Pull request with bugfix, not backported by default
Projects
None yet
4 participants