clickhouse-test: improve left queries after the test hardening #36649
Merged: alexey-milovidov merged 4 commits into ClickHouse:master from azat:system.processes-is_all_data_sent on Apr 27, 2022
robot-clickhouse added the pr-improvement label (Pull request with some product improvements) on Apr 26, 2022
azat force-pushed the system.processes-is_all_data_sent branch from 82e8f2b to 8742072 on April 26, 2022 07:56
v2: fix SHOW PROCESSLIST (does not have process list entry)

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
Right now it is possible to get a false positive from this hardening, because there is a small delay (which can be quite significant on CI when it is under pressure) between the moment the server sends EndOfStream and the moment it removes the entry from system.processes.

But system.processes now has an is_all_data_sent column, which indicates that EndOfStream was sent, and we can use it to avoid the false positive.

Here is an example of such a report [1]:

    2022-04-25 03:47:18 00806_alter_update: [ FAIL ] 0.95 sec. - Queries left in background after the test finished:
    2022-04-25 03:47:18 "elapsed": 0.100084746,
    2022-04-25 03:47:18 "is_cancelled": 0,
    2022-04-25 03:47:18 "query": "DROP TABLE alter_update_00806;",
    2022-04-25 03:47:18 "thread_ids": [
    2022-04-25 03:47:18 "8950"
    2022-04-25 03:47:18 ],

    2022.04.25 03:47:17.887095 [ 8950 ] {7c062004-4c22-486c-934a-f405846e2c81} <Debug> executeQuery: (from [::1]:52012) (comment: 00806_alter_update.sql) DROP TABLE alter_update_00806;
    2022.04.25 03:47:17.887493 [ 8950 ] {7c062004-4c22-486c-934a-f405846e2c81} <Trace> ContextAccess (default): Access granted: DROP TABLE ON test_7ntsjn.alter_update_00806
    2022.04.25 03:47:17.887765 [ 8950 ] {7c062004-4c22-486c-934a-f405846e2c81} <Trace> test_7ntsjn.alter_update_00806 (1bc92bca-10a7-444e-be5e-7f61f4650169): Found 2 old parts to remove.
    2022.04.25 03:47:17.887947 [ 8950 ] {7c062004-4c22-486c-934a-f405846e2c81} <Debug> test_7ntsjn.alter_update_00806 (1bc92bca-10a7-444e-be5e-7f61f4650169): Removing part from filesystem 20180101_20180101_1_1_0
    2022.04.25 03:47:17.888960 [ 8950 ] {7c062004-4c22-486c-934a-f405846e2c81} <Debug> test_7ntsjn.alter_update_00806 (1bc92bca-10a7-444e-be5e-7f61f4650169): Removing part from filesystem 20180102_20180102_2_2_0
    2022.04.25 03:47:17.890620 [ 8950 ] {7c062004-4c22-486c-934a-f405846e2c81} <Debug> DatabaseCatalog: Waiting for table 1bc92bca-10a7-444e-be5e-7f61f4650169 to be finally dropped
    2022.04.25 03:47:17.895046 [ 8950 ] {7c062004-4c22-486c-934a-f405846e2c81} <Debug> MemoryTracker: Peak memory usage (for query): 0.00 B.
    ...
    2022.04.25 03:47:17.938328 [ 4422 ] {aa01985a-78f5-4c0e-b646-8d04a4a1dc77} <Debug> executeQuery: (from [::1]:59416) (comment: 00806_alter_update.sql) DROP DATABASE test_7ntsjn
    2022.04.25 03:47:17.938667 [ 4422 ] {aa01985a-78f5-4c0e-b646-8d04a4a1dc77} <Trace> ContextAccess (default): Access granted: DROP DATABASE ON test_7ntsjn.*
    ...
    2022.04.25 03:47:18.154847 [ 8950 ] {} <Debug> TCPHandler: Processed in 0.269358257 sec.
    2022.04.25 03:47:18.154991 [ 8950 ] {} <Debug> TCPHandler: Done processing connection.
    2022.04.25 03:47:18.155181 [ 8950 ] {} <Debug> TCP-Session: e1d8176a-ee62-4e0a-9855-fe9eb52e06dc Destroying unnamed session of user 94309d50-4f52-5250-31bd-74fecac179db

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/36319/a646cf76b6d4699f06aea1e8d777edb1ad6fd2c5/stateless_tests__debug__actions__[1/3]/runlog.log

As you can see, DROP TABLE was captured when elapsed was 0.1, while TCPHandler spent 0.26 seconds processing it.

The same report also shows that DROP DATABASE was executed before TCPHandler stopped processing DROP TABLE.

Signed-off-by: Azat Khuzhin <a.khuzhin@semrush.com>
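The timing argument in this commit message can be checked by hand from the server log timestamps. The snippet below is only an illustrative sketch: the three timestamps are copied from the log above, while the helper name `parse_log_time` and the variable names are made up for this example.

```python
from datetime import datetime

# Timestamp layout used by the server log lines quoted above.
LOG_TIME_FORMAT = "%Y.%m.%d %H:%M:%S.%f"


def parse_log_time(text: str) -> datetime:
    """Parse a server log timestamp (hypothetical helper for this example)."""
    return datetime.strptime(text, LOG_TIME_FORMAT)


# Values taken verbatim from the log excerpt.
drop_table_started = parse_log_time("2022.04.25 03:47:17.887095")
drop_database_started = parse_log_time("2022.04.25 03:47:17.938328")
tcp_handler_processed = parse_log_time("2022.04.25 03:47:18.154847")

# Time between the start of DROP TABLE and the "Processed in" line of its thread.
table_window = (tcp_handler_processed - drop_table_started).total_seconds()
# How long DROP DATABASE was already running before that "Processed in" line.
overlap = (tcp_handler_processed - drop_database_started).total_seconds()

print(f"DROP TABLE start to TCPHandler 'Processed': {table_window:.6f} s")
print(f"DROP DATABASE started {overlap:.6f} s before that point")
```

The window is far longer than the 0.1 s `elapsed` value captured in the report, which is the gap during which the entry could still be visible in system.processes.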
azat force-pushed the system.processes-is_all_data_sent branch from 8742072 to 87421d8 on April 26, 2022 09:15
alexey-milovidov approved these changes on Apr 27, 2022
This test requires attention.
Looks like it makes
Indeed, thanks for the report, should be fixed by #36767
This was referenced Apr 30, 2022
azat added a commit to azat/ClickHouse that referenced this pull request on Apr 30, 2022
This will avoid clickhouse-test complaints about queries left after the test, like in [1].

  [1]: https://s3.amazonaws.com/clickhouse-test-reports/36258/9646487c093a75dc31e3e818417cfad83580b40f/stateful_tests__memory__actions_.html

Follow-up for: ClickHouse#36649
azat added a commit to azat/ClickHouse that referenced this pull request on May 4, 2022
Previously, tests could fail with queries left if there was an implicit reconnect, and without the referenced PR it required manual debugging to find out that the reconnect was the reason.

But the actual problem is that the server did send EndOfStream and then hung before removing the query from system.processes.

After adding is_all_data_sent (ClickHouse#36816, ClickHouse#36767, ClickHouse#36649), clickhouse-test can report as left only those queries for which the server did not send EndOfStream/Exception.

In other words, after adding is_all_data_sent it should no longer be possible to have queries left in such cases.

Reverts: 53be9c5
Reverts: ClickHouse#36587
Labels
pr-improvement: Pull request with some product improvements
testing: Special issue with list of bugs found by CI
Changelog category (leave one):
Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Add is_all_data_sent column into system.processes, and improve internal testing hardening check based on it.

TL;DR;
Right now it is possible to get a false positive from this hardening,
because there is a small delay (which can be quite significant on CI when
it is under pressure) between the moment the server sends EndOfStream and
the moment it removes the entry from system.processes.
But system.processes now has an is_all_data_sent column, which indicates
that EndOfStream was sent, and we can use it to avoid the false positive.
Here is an example of such a report [1]:
As you can see, DROP TABLE was captured when elapsed was 0.1,
while TCPHandler spent 0.26 seconds processing it.
The same report also shows that DROP DATABASE was executed
before TCPHandler stopped processing DROP TABLE.
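
The idea described above can be sketched as a filter over a snapshot of system.processes. This is a minimal illustration, not the actual clickhouse-test code: the function name `queries_still_running` and the sample rows are assumptions made for this example, and the rows are plain dictionaries standing in for a JSON dump of system.processes. The `is_all_data_sent` value on the DROP TABLE row is also hypothetical, since the original report predates the column.

```python
from typing import Any, Dict, List


def queries_still_running(processes: List[Dict[str, Any]]) -> List[Dict[str, Any]]:
    """Keep only entries for which the server has not yet sent all data.

    Entries with is_all_data_sent set are about to be removed from
    system.processes, so they should not count as left in the background.
    """
    return [row for row in processes if not int(row.get("is_all_data_sent", 0))]


# Hypothetical snapshot taken right after a test finished.
snapshot = [
    # Finished from the client's point of view, entry not yet removed.
    {"query": "DROP TABLE alter_update_00806;", "elapsed": 0.100084746, "is_all_data_sent": 1},
    # Made-up example of a query that is really still running.
    {"query": "SELECT sleep(3)", "elapsed": 1.5, "is_all_data_sent": 0},
]

left = queries_still_running(snapshot)
print([row["query"] for row in left])
```

With such a filter, the DROP TABLE entry from the report would no longer fail the test, while a query that has not finished would still be reported.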