Cherry pick #102961 to 26.2: Use max_insert_threads for plain INSERTs without materialized views #103238

Merged
robot-ch-test-poll4 merged 8 commits into backport/26.2/102961 from cherrypick/26.2/102961
Apr 21, 2026

@robot-ch-test-poll4
Contributor

Original pull-request #102961

Do not merge this PR manually

This pull-request is a first step of an automated backporting.
It contains changes similar to calling git cherry-pick locally.
If you intend to continue backporting the changes, then resolve all conflicts if any.
Otherwise, if you do not want to backport them, then just close this pull-request.

The check results do not matter at this step; you can safely ignore them.

Troubleshooting

If the conflicts were resolved in a wrong way

If this cherry-pick PR was broken by an incorrect conflict resolution and you want to recreate it:

  • delete the pr-cherrypick label from the PR
  • delete this branch from the repository

You also need to check the original pull request for the pr-backports-created label and delete it if it is present there.

The PR source

The PR is created in the CI job

CheSema and others added 8 commits April 17, 2026 12:44
buildInsertPipeline was requesting max_threads ConcurrencyControl slots
for all INSERTs regardless of pipeline width. For plain INSERTs without
MVs the pipeline is 1-wide, so this wasted CC capacity and spawned
unnecessary threads (up to ~10 for max_threads=16).

Now we use max_insert_threads when no MVs are attached, and max_threads
only when MVs are involved (their inner SELECTs benefit from full
parallelism).

Also adds a trace log in PipelineExecutor::allocateCPU to make CC slot
allocation visible, and a regression test covering both cases.

Fixes: #102947
ConcurrencyControl has an issue (#102947) where CC slots are allocated
eagerly at query start, and spawnThreads has a race where it spawns
threads faster than they can register as idle, resulting in many
unnecessary threads being created.

Until lazy CC slot allocation is implemented, we have no choice other
than to set strict limits on the thread count for insert queries.

buildInsertPipeline was requesting max_threads CC slots for all INSERTs
regardless of pipeline width. For plain INSERTs without MVs the pipeline
is 1-wide, so this wasted CC capacity and spawned unnecessary threads.

Now we use max_insert_threads when no MVs are attached, and max_threads
only when MVs are involved (their inner SELECTs benefit from full
parallelism).

Fixes: #102947
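
The rule described in the two commit messages above can be sketched as follows. This is a minimal illustration, not the actual `buildInsertPipeline` code: `InsertSettings` and `slotsToRequest` are hypothetical names, and treating a limit of 0 as "one thread" is an assumption made here so the sketch always returns a usable value.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the two settings involved; not ClickHouse's real Settings type.
struct InsertSettings
{
    std::size_t max_threads = 16;
    std::size_t max_insert_threads = 1;
};

// Number of ConcurrencyControl slots an INSERT would request under the rule
// from the commit message: max_insert_threads for a plain INSERT (1-wide
// pipeline), max_threads only when materialized views are attached.
std::size_t slotsToRequest(const InsertSettings & settings, bool has_materialized_views)
{
    assert(settings.max_threads >= 1);
    const std::size_t limit = has_materialized_views
        ? settings.max_threads
        : settings.max_insert_threads;
    // Assumption: a limit of 0 is treated as a single thread.
    return limit == 0 ? 1 : limit;
}
```

With `max_threads = 16` and `max_insert_threads = 1`, a plain INSERT requests one slot under this sketch instead of 16, while an INSERT into a table with materialized views still requests 16.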
Style check requires specifying the log table name explicitly.
…ized settings

Pin `input_format_parallel_parsing=0` in INSERT commands so that
parallel parsing threads do not inflate `peak_threads_usage` beyond
the threshold — the test measures INSERT pipeline concurrency, not
parser parallelism.

Add `SETTINGS optimize_if_transform_strings_to_enum = 0` to diagnostic
queries against `system.query_log` to prevent a column name mismatch
when randomized settings enable both this optimization and
`parallel_replicas_local_plan`.

CI report: #102961

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
DatabaseReplicated: `query_log` lookup by `query_id` assumes
single-node execution; replicated setup may miss entries or return
duplicates from other replicas.

AsyncInsert: injected `async_insert=1` queues the INSERT and returns
immediately, so `peak_threads_usage` reflects only the queuing step,
not the MV processing pipeline the test is designed to verify.

CI report: #102961

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
ParallelReplicas settings alter query execution plans and thread
allocation, making `peak_threads_usage` checks unreliable. S3 I/O
threads (multi-part uploads, cache connections) inflate
`peak_threads_usage` beyond the pipeline thread count the test
measures.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=102961&sha=43b3cf5f2a16a3ddc9683efa36016d69558b0531&name_0=PR
PR: #102961

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Use max_insert_threads for plain INSERTs without materialized views
robot-ch-test-poll4 added the pr-cherrypick, do not test, and pr-bugfix labels Apr 21, 2026
robot-ch-test-poll4 merged commit 67a8866 into backport/26.2/102961 Apr 21, 2026
168 of 183 checks passed
robot-ch-test-poll4 deleted the cherrypick/26.2/102961 branch April 21, 2026 08:52
Labels

  • do not test: disable testing on pull request
  • pr-bugfix: Pull request with bugfix, not backported by default
  • pr-cherrypick: Cherry-pick of merge-commit before backporting. Do not use manually - automated use only!
