Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix optimize_skip_unused_shards_rewrite_in for non-UInt64 types #25798

Conversation

azat
Copy link
Collaborator

@azat azat commented Jun 28, 2021

Changelog category (leave one):

  • Bug Fix

Changelog entry (a user-readable short description of the changes that goes to CHANGELOG.md):
Fix optimize_skip_unused_shards_rewrite_in for non-UInt64 types (may select incorrect shards eventually or throw Cannot infer type of an empty tuple or Function tuple requires at least one argument)

Reported-by: @amosbird

P.S. the fix is a3add4f, other is a tiny refactoring.

@robot-clickhouse robot-clickhouse added the pr-bugfix Pull request with bugfix, not backported by default label Jun 28, 2021
@azat azat force-pushed the optimize_skip_unused_shards_rewrite_in-types-fix branch from dda3576 to 68ae618 Compare June 28, 2021 19:56
@azat
Copy link
Collaborator Author

azat commented Jun 29, 2021

Integration tests (asan) — fail: 2, passed: 1471, flaky: 16
Integration tests (release) — fail: 3, passed: 1472, flaky: 15

Integration tests (thread) — fail: 61, passed: 824, flaky: 10

Seems that container does not have enough resources (after parallel run of integration tests had been enabled), and clickhouse-server got killed, which eventually leads to container is not running error.

Performance — Errors while building the report.

Broken by #25773

@azat azat force-pushed the optimize_skip_unused_shards_rewrite_in-types-fix branch 2 times, most recently from 3d7616d to daa320f Compare July 4, 2021 18:25
@azat
Copy link
Collaborator Author

azat commented Jul 5, 2021

Integration tests (thread) — fail: 14, passed: 1469, flaky: 47

Container is not running issue. Not enough resources for such level of parallelism?

E               Exception: Command ['docker', 'exec', 'roottestdistributedqueriesstress_node1_r1_1', 'bash', '-c', "echo 'select * from dist_two_over_dist where key = 0;\n    select * from dist_two_over_dist where key = 1;\n    select * from dist_two_over_dist where key = 2;\n    select * from dist_two_over_dist where key = 3;\n    select * from dist_two_over_dist;' | clickhouse benchmark --concurrency=100 --cumulative --delay=0 --timelimit=3 --hedged_connection_timeout_ms=200 --connect_timeout_with_failover_ms=200 --connections_with_failover_max_tries=5 --prefer_localhost_replica=0 --optimize_skip_unused_shards=1"] return non-zero code 1: Error response from daemon: Container 39c984334f7280985a2fdfae8d7275c6e137f40bc572cda67336e5aa83bc61e9 is not running
2021.07.05 03:11:57.466621 [ 1 ] {} <Fatal> Application: Child process was terminated by signal 9 (KILL). If it is not done by 'forcestop' command or manually, the possible cause is OOM Kille
r (see 'dmesg' and look at the '/var/log/kern.log' for the details)

And AFAICS this test (test_distributed_queries_stress) hadn't passed, and it should be marked as FAIL?

  • integration_run_parallel2_1.txt and integration_run_parallel2_2.txt contains only error - runner: error: argument -t/--tests_list: expected at least one argument
  • and the row with OK contains failed test.

@azat
Copy link
Collaborator Author

azat commented Jul 5, 2021

Functional stateless tests (release, DatabaseReplicated) — Some tests restarted, fail: 1, passed: 3076, skipped: 111

2021-07-04 22:51:33 01502_long_log_tinylog_deadlock_race:                                   [ FAIL ] 31.80 sec. - having exception:
2021-07-04 22:51:33 Testing TinyLog
2021-07-04 22:51:33 Done TinyLog
2021-07-04 22:51:33 Testing StripeLog
2021-07-04 22:51:33 Code: 999. DB::Exception: Received from localhost:9000. DB::Exception: Session expired (Session expired). 
2021-07-04 22:51:33 Done StripeLog
2021-07-04 22:51:33 Testing Log
2021-07-04 22:51:33 Done Log

Functional stateless tests (release, wide parts enabled) — fail: 1, passed: 3186, skipped: 1

01781_merge_tree_deduplication - it is not flaky (like report says), but failed.

2021-07-04 22:36:27 01781_merge_tree_deduplication:                                         [ FAIL ] 0.47 sec. - result differs with reference:
2021-07-04 22:36:27 --- /usr/share/clickhouse-test/queries/0_stateless/01781_merge_tree_deduplication.reference	2021-07-04 21:30:02.000000000 +0300
2021-07-04 22:36:27 +++ /tmp/clickhouse-test/0_stateless/01781_merge_tree_deduplication.stdout	2021-07-04 22:36:27.750998547 +0300
2021-07-04 22:36:27 @@ -80,6 +80,7 @@
2021-07-04 22:36:27  1	1
2021-07-04 22:36:27  1	1
2021-07-04 22:36:27  1	1
2021-07-04 22:36:27 +1	1
2021-07-04 22:36:27  2	2
2021-07-04 22:36:27  3	3
2021-07-04 22:36:27  4	4

@azat
Copy link
Collaborator Author

azat commented Jul 6, 2021

And AFAICS this test (test_distributed_queries_stress) hadn't passed, and it should be marked as FAIL?

Checked in test_results.tsv, and according to it the test marked correctly as FLAKY (I see retries there).

azat added 5 commits July 7, 2021 00:17
Why it hadn't been skipped before? And I guess this test passed in
arcadia?
The following should happens before:
- incorrect shards was selected
- "Cannot infer type of an empty tuple" exception for empty tuple() for localhost
- "Function tuple requires at least one argument" exception for empty tuple() for remote node
@azat azat force-pushed the optimize_skip_unused_shards_rewrite_in-types-fix branch from daa320f to d5cb792 Compare July 6, 2021 21:18
@azat
Copy link
Collaborator Author

azat commented Jul 7, 2021

Integration tests (thread) — fail: 20, passed: 616, flaky: 7

E               toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit

@alexey-milovidov
Copy link
Member

E               Pulling hdfs1 (sequenceiq/hadoop-docker:2.7.0)...
E               toomanyrequests: You have reached your pull rate limit. You may increase the limit by authenticating and upgrading: https://www.docker.com/increase-rate-limit

@alexey-milovidov alexey-milovidov merged commit af7ecb7 into ClickHouse:master Jul 13, 2021
@alexey-milovidov alexey-milovidov self-assigned this Jul 13, 2021
robot-clickhouse pushed a commit that referenced this pull request Jul 13, 2021
alexey-milovidov added a commit that referenced this pull request Jul 14, 2021
Backport #25798 to 21.8: Fix optimize_skip_unused_shards_rewrite_in for non-UInt64 types
@azat azat deleted the optimize_skip_unused_shards_rewrite_in-types-fix branch July 15, 2021 06:12
zhanglistar pushed a commit to zhanglistar/ClickHouse that referenced this pull request Nov 12, 2021
zhanglistar pushed a commit to zhanglistar/ClickHouse that referenced this pull request Nov 12, 2021
Backport ClickHouse#25798 to 21.8: Fix optimize_skip_unused_shards_rewrite_in for non-UInt64 types
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
pr-bugfix Pull request with bugfix, not backported by default
Projects
None yet
Development

Successfully merging this pull request may close these issues.

3 participants