Skip to content

Use max_insert_threads for plain INSERTs without materialized views#102961

Merged
CheSema merged 7 commits intomasterfrom
sema/fix-insert-thread-count
Apr 20, 2026
Merged

Use max_insert_threads for plain INSERTs without materialized views#102961
CheSema merged 7 commits intomasterfrom
sema/fix-insert-thread-count

Conversation

@CheSema
Copy link
Copy Markdown
Member

@CheSema CheSema commented Apr 17, 2026

ConcurrencyControl has an issue (#102947) where CC slots are allocated eagerly at query start, and spawnThreads has a race where it spawns threads faster than they can register as idle, resulting in many unnecessary threads being created.

Until lazy CC slot allocation is implemented, we have no choice other than to set strict limits on the thread count for insert queries.

buildInsertPipeline was requesting max_threads CC slots for all INSERTs regardless of pipeline width. For plain INSERTs without MVs the pipeline is 1-wide, so this wasted CC capacity and spawned unnecessary threads.

Now we use max_insert_threads when no MVs are attached, and max_threads only when MVs are involved (their inner SELECTs benefit from full parallelism).

Mitigates #102947

Changelog category (leave one):

  • Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry

Plain INSERTs without materialized views no longer request excessive ConcurrencyControl slots and threads (max_threads instead of max_insert_threads), preventing CC slot starvation and thread count blowup on clusters with high INSERT throughput.

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

buildInsertPipeline was requesting max_threads ConcurrencyControl slots
for all INSERTs regardless of pipeline width. For plain INSERTs without
MVs the pipeline is 1-wide, so this wasted CC capacity and spawned
unnecessary threads (up to ~10 for max_threads=16).

Now we use max_insert_threads when no MVs are attached, and max_threads
only when MVs are involved (their inner SELECTs benefit from full
parallelism).

Also adds a trace log in PipelineExecutor::allocateCPU to make CC slot
allocation visible, and a regression test covering both cases.

Fixes: #102947
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Apr 17, 2026

Workflow [PR], commit [c4f5e7e]

Summary:

job_name test_name status info comment
Upgrade check (amd_release) FAIL
Error message in clickhouse-server.log (see upgrade_error_messages.txt) FAIL cidb

AI Review

Summary

This PR changes InterpreterInsertQuery::buildInsertPipeline to use max_insert_threads for plain INSERT without dependent materialized views, and max_threads only when views are involved. The core C++ change is small and aligned with the reported concurrency-slot/thread over-allocation problem, but there are two merge-blocking review issues in metadata/test quality that should be addressed before approval.

Findings

⚠️ Majors

  • [tests/queries/0_stateless/04102_insert_threads_eagerly_spawned.sh:2,37-45,67-76] The test is over-constrained by no-replicated-database and no-async-insert, which narrows CI coverage. Also, system.query_log lookups do not pin is_initial_query = 1 and do not select the latest single row, making the assertion more fragile in mixed-profile runs.
    • Suggested fix: remove those restrictive tags and make the query deterministic in-script (--async_insert=0 --wait_for_async_insert=1, plus is_initial_query = 1 and ORDER BY event_time_microseconds DESC LIMIT 1 in both checks).

💡 Nits

  • [PR description: Changelog entry] The changelog text inverts the settings names and describes the opposite behavior (max_threads vs max_insert_threads) from the code.
    • Suggested fix: update the entry to state that plain INSERT without MVs now uses max_insert_threads, while inserts with MVs continue to use max_threads.
ClickHouse Rules
Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsChangesHistory.cpp
PR metadata quality Changelog entry currently contradicts the implemented behavior
Safe rollout
Compilation time
No large/binary files
Final Verdict
  • Status: ⚠️ Request changes
  • Minimum required actions:
    1. Fix Changelog entry wording to match the actual setting switch (max_insert_threads for plain INSERT, max_threads when MVs are involved).
    2. Make the new stateless test deterministic without profile-narrowing tags (or otherwise preserve equivalent CI profile coverage).

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Apr 17, 2026
ConcurrencyControl has an issue (#102947) where CC slots are allocated
eagerly at query start, and spawnThreads has a race where it spawns
threads faster than they can register as idle, resulting in many
unnecessary threads being created.

Until lazy CC slot allocation is implemented, we have no choice other
than to set strict limits on the thread count for insert queries.

buildInsertPipeline was requesting max_threads CC slots for all INSERTs
regardless of pipeline width. For plain INSERTs without MVs the pipeline
is 1-wide, so this wasted CC capacity and spawned unnecessary threads.

Now we use max_insert_threads when no MVs are attached, and max_threads
only when MVs are involved (their inner SELECTs benefit from full
parallelism).

Fixes: #102947
Comment thread src/Interpreters/InterpreterInsertQuery.cpp
CheSema added 2 commits April 17, 2026 13:03
Style check requires specifying the log table name explicitly.
CheSema added a commit that referenced this pull request Apr 17, 2026
…ized settings

Pin `input_format_parallel_parsing=0` in INSERT commands so that
parallel parsing threads do not inflate `peak_threads_usage` beyond
the threshold — the test measures INSERT pipeline concurrency, not
parser parallelism.

Add `SETTINGS optimize_if_transform_strings_to_enum = 0` to diagnostic
queries against `system.query_log` to prevent a column name mismatch
when randomized settings enable both this optimization and
`parallel_replicas_local_plan`.

CI report: #102961

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
…ized settings

Pin `input_format_parallel_parsing=0` in INSERT commands so that
parallel parsing threads do not inflate `peak_threads_usage` beyond
the threshold — the test measures INSERT pipeline concurrency, not
parser parallelism.

Add `SETTINGS optimize_if_transform_strings_to_enum = 0` to diagnostic
queries against `system.query_log` to prevent a column name mismatch
when randomized settings enable both this optimization and
`parallel_replicas_local_plan`.

CI report: #102961

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@CheSema CheSema force-pushed the sema/fix-insert-thread-count branch from a6bb0a1 to c208281 Compare April 17, 2026 20:00
groeneai added a commit to groeneai/ClickHouse that referenced this pull request Apr 18, 2026
Regression from ClickHouse#92141: server aborts with signal 6 during `loadDataPart`
in debug and sanitizer builds when a part directory contains a leftover
`txn_version.txt.tmp` file with no corresponding `txn_version.txt`. This
state is produced by `VersionMetadataOnDisk::loadMetadata` case 2 (an
incomplete write that was not atomically renamed). It returns a
`VersionInfo` with `creation_tid = Tx::DummyTID` and
`creation_csn = Tx::RolledBackCSN`.

`Tx::DummyTID = {NonTransactionalCSN, DummyLocalTID, Nil} = {1, 2, Nil}`
has `start_csn == NonTransactionalCSN` but `local_tid == DummyLocalTID`
— which fails the invariant asserted in `TransactionID::isNonTransactional`.
The new `VersionMetadata::validateInfo` (added by ClickHouse#92141) and
`VersionInfo::wasInvolvedInTransaction` both call `isNonTransactional` on
the loaded `creation_tid`, so the assertion aborts the server.

Two changes:
- `TransactionID::isNonTransactional`: widen the assertion to allow the
  `DummyTID` special case. The function still correctly returns `false`
  for `DummyTID` because `local_tid == DummyLocalTID != NonTransactionalLocalTID`.
- `VersionMetadata::validateInfo`: early-return for rolled-back parts
  (`creation_csn == RolledBackCSN`). Such parts are marked `Outdated`
  immediately after loading by `MergeTreeData::loadDataPart` and cleaned
  up, so the live-part invariants validated here do not apply.

CI reports showing the crash:
- PR ClickHouse#103047: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=103047&sha=ec0124d77b94c97a4b1b48049b0215b1dddb94a5&name_0=PR&name_1=Stateless%20tests%20%28amd_debug%2C%20distributed%20plan%2C%20s3%20storage%2C%20parallel%29
- PR ClickHouse#102961: same `Stateless tests (amd_debug, distributed plan, s3 storage, parallel)` check

Regression test `04104_transaction_version_metadata_dummy_tid_load.sh`
deterministically creates the scenario by dropping a bogus
`txn_version.txt.tmp` into a part directory and running `DETACH` +
`ATTACH`. Before the fix the server aborts with signal 6 at
`TransactionID.h:90`; after the fix the rolled-back part is marked
`Outdated` and invisible.

Session: cron:clickhouse-ci-task-worker:20260418-221500
DatabaseReplicated: `query_log` lookup by `query_id` assumes
single-node execution; replicated setup may miss entries or return
duplicates from other replicas.

AsyncInsert: injected `async_insert=1` queues the INSERT and returns
immediately, so `peak_threads_usage` reflects only the queuing step,
not the MV processing pipeline the test is designed to verify.

CI report: #102961

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
Comment thread tests/queries/0_stateless/04102_insert_threads_eagerly_spawned.sh Outdated
ParallelReplicas settings alter query execution plans and thread
allocation, making `peak_threads_usage` checks unreliable. S3 I/O
threads (multi-part uploads, cache connections) inflate
`peak_threads_usage` beyond the pipeline thread count the test
measures.

CI report: https://s3.amazonaws.com/clickhouse-test-reports/json.html?PR=102961&sha=43b3cf5f2a16a3ddc9683efa36016d69558b0531&name_0=PR
PR: #102961

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented Apr 20, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.00% 84.00% +0.00%
Functions 91.10% 91.10% +0.00%
Branches 76.50% 76.50% +0.00%

Changed lines: 100.00% (4/4) · Uncovered code

Full report · Diff report

@serxa serxa self-assigned this Apr 20, 2026
@CheSema CheSema added this pull request to the merge queue Apr 20, 2026
Merged via the queue into master with commit 3cda729 Apr 20, 2026
159 of 161 checks passed
@CheSema CheSema deleted the sema/fix-insert-thread-count branch April 20, 2026 17:35
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-synced-to-cloud The PR is synced to the cloud repo label Apr 20, 2026
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Apr 21, 2026
robot-ch-test-poll4 added a commit that referenced this pull request Apr 21, 2026
Cherry pick #102961 to 26.1: Use max_insert_threads for plain INSERTs without materialized views
robot-clickhouse added a commit that referenced this pull request Apr 21, 2026
robot-ch-test-poll4 added a commit that referenced this pull request Apr 21, 2026
Cherry pick #102961 to 26.2: Use max_insert_threads for plain INSERTs without materialized views
robot-clickhouse added a commit that referenced this pull request Apr 21, 2026
robot-ch-test-poll4 added a commit that referenced this pull request Apr 21, 2026
Cherry pick #102961 to 26.3: Use max_insert_threads for plain INSERTs without materialized views
robot-clickhouse added a commit that referenced this pull request Apr 21, 2026
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Apr 21, 2026
clickhouse-gh Bot added a commit that referenced this pull request Apr 21, 2026
Backport #102961 to 26.1: Use max_insert_threads for plain INSERTs without materialized views
clickhouse-gh Bot added a commit that referenced this pull request Apr 21, 2026
Backport #102961 to 26.2: Use max_insert_threads for plain INSERTs without materialized views
CheSema added a commit that referenced this pull request Apr 22, 2026
Backport #102961 to 26.3: Use max_insert_threads for plain INSERTs without materialized views
@CheSema CheSema removed pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels Apr 30, 2026
@robot-ch-test-poll1 robot-ch-test-poll1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label Apr 30, 2026
@robot-ch-test-poll robot-ch-test-poll added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label Apr 30, 2026
clickhouse-gh Bot pushed a commit that referenced this pull request Apr 30, 2026
Use max_insert_threads for plain INSERTs without materialized views
clickhouse-gh Bot added a commit that referenced this pull request Apr 30, 2026
Backport #102961 to 25.12: Use max_insert_threads for plain INSERTs without materialized views
CheSema added a commit that referenced this pull request Apr 30, 2026
Backport #102961 to 25.10: Use max_insert_threads for plain INSERTs without materialized views
kewin-robetti pushed a commit to viasoftkorp/ClickHouse that referenced this pull request May 1, 2026
…thread-count

Use max_insert_threads for plain INSERTs without materialized views
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-bugfix Pull request with bugfix, not backported by default pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo v25.10-must-backport v25.12-must-backport v26.2-must-backport v26.3-must-backport

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants