improvement: use INCORRECT_DATA instead of LOGICAL_ERROR for data-reachable deserialization checks by motsc · Pull Request #99822 · ClickHouse/ClickHouse

motsc · 2026-03-17T21:48:52Z

In sanitizer and debug builds, LOGICAL_ERROR calls abortOnFailedAssertion → abort() before the exception propagates to the caller's catch(...). The row-count mismatch check in NativeReader is reachable from user-supplied binary data and should use INCORRECT_DATA so it is recognized as a data error rather than a logic error, preventing false-positive aborts during fuzzing.
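The difference between the two error codes can be sketched with a small model. This is not ClickHouse's actual `Exception` class: `SketchException`, `abort_on_logical_error`, `checkRows`, and `rejectsMalformedBlock` are hypothetical names, and a runtime flag stands in for the sanitizer/debug build switch. It only illustrates why a `LOGICAL_ERROR` never reaches a `catch` block in such builds, while `INCORRECT_DATA` does.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdlib>
#include <stdexcept>
#include <string>

// Hypothetical error codes for illustration only.
enum class ErrorCode { LOGICAL_ERROR, INCORRECT_DATA };

// Assumption: a runtime flag models "this is a sanitizer/debug build".
static bool abort_on_logical_error = false;

struct SketchException : std::runtime_error
{
    ErrorCode code;

    SketchException(ErrorCode code_, const std::string & msg)
        : std::runtime_error(msg), code(code_)
    {
        // Models the abort path: the process dies while the exception is being
        // constructed, so no catch block further up the stack ever runs.
        if (code == ErrorCode::LOGICAL_ERROR && abort_on_logical_error)
            std::abort();
    }
};

// Hypothetical stand-in for a row-count check after deserializing a column.
void checkRows(std::size_t expected, std::size_t actual)
{
    if (expected != actual)
        throw SketchException(
            ErrorCode::INCORRECT_DATA,
            "Expected " + std::to_string(expected) + " rows, got " + std::to_string(actual));
}

// Returns true if a mismatching block was rejected with a catchable data error.
bool rejectsMalformedBlock(std::size_t expected, std::size_t actual)
{
    try
    {
        checkRows(expected, actual);
    }
    catch (const SketchException & e)
    {
        return e.code == ErrorCode::INCORRECT_DATA;
    }
    return false;
}
```

With `INCORRECT_DATA`, a fuzzer harness that wraps the reader in `try`/`catch` sees an ordinary exception even when the abort flag is set; with `LOGICAL_ERROR` the same input would terminate the process in this model.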

Changes

NativeReader.cpp: LOGICAL_ERROR → INCORRECT_DATA for row count mismatch after column deserialization (user-supplied Native format stream).
SerializationObject.h: add explicit #include <Common/VectorWithMemoryTracking.h> (previously relied on transitive include).
ColumnUnique.h: typo fix ("grated" → "greater") in error message; error code unchanged.
SerializationArray.cpp: error code declaration reorder only; no functional change.

Changelog category (leave one):

Bug Fix (user-visible misbehavior in an official stable release)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix false-positive abort in NativeReader when deserializing a Native format stream with a row-count mismatch: changed from LOGICAL_ERROR to INCORRECT_DATA so the error is handled as a data error rather than triggering abort() in sanitizer/debug builds.

Documentation entry for user-facing changes

Documentation is written (mandatory for new features)

CI report: https://s3.amazonaws.com/clickhouse-test-reports/PRs/99822/76660ad7a27047de3d00837309ab28f384318007/result_pr.json

clickhouse-gh · 2026-03-17T21:49:57Z

Workflow [PR], commit [76660ad]

Summary: ✅

AI Review

Summary

This PR updates error classification in NativeReader from LOGICAL_ERROR to INCORRECT_DATA for a user-reachable deserialization mismatch, plus a typo fix and non-functional cleanup. The main behavior change is correct and improves robustness in sanitizer/debug builds by avoiding a false-positive abort path for malformed input. I did not find correctness, safety, concurrency, compatibility, or performance issues requiring changes.

ClickHouse Rules

Item	Status	Notes
Deletion logging	➖
Serialization versioning	➖
Core-area scrutiny	✅
No test removal	✅
Experimental gate	➖
No magic constants	✅
Backward compatibility	✅
`SettingsChangesHistory.cpp`	➖
PR metadata quality	✅
Safe rollout	✅
Compilation time	✅

Final Verdict

Status: ✅ Approve

clickhouse-gh · 2026-03-17T21:53:12Z

@@ -637,7 +638,7 @@ static void checkIndexes(const ColumnVector<IndexType> & indexes, size_t max_dic
    {


💡 Typo in the exception message: `grated or equal` should be `greater or equal`.

This text is user-visible in diagnostics, so correcting it will reduce confusion when malformed input is reported.

Avogar

I don't understand these changes. Almost all of the LOGICAL_ERRORs that were changed to INCORRECT_DATA in this PR are actual logical errors that can happen only because of bugs in deserialization from MergeTree parts. INCORRECT_DATA makes sense only for cases when we use Native format reader and data is serialized incorrectly. As you can see, in some places we even check settings.native_format to decide whether to throw INCORRECT_DATA or LOGICAL_ERROR. The only place where the change can be justified is probably lines 198 and 224 in SerializationReplicated.cpp.

If we just change all these LOGICAL_ERRORS to INCORRECT_DATA we might silently ignore actual bugs in the code, and we have alerts on LOGICAL_ERRORS both in CI and in production.
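The conditional pattern mentioned in this comment can be sketched as follows. `SketchSettings` and `sizeMismatchCode` are hypothetical names for illustration; only the `native_format` flag and the two error code names come from the discussion. The idea is that the same size check reports a data error when the bytes came from a Native format stream and a logic error when they came from the server's own storage.

```cpp
#include <cassert>

// Hypothetical error codes for illustration only.
enum class ErrorCode { LOGICAL_ERROR, INCORRECT_DATA };

// Assumption: a settings object carrying a flag that says the data comes
// from a Native format stream (and so may be malformed by its producer).
struct SketchSettings
{
    bool native_format = false;
};

// Pick the error code for a size mismatch found during deserialization.
ErrorCode sizeMismatchCode(const SketchSettings & settings)
{
    // Native stream: the producer may have sent bad data, so blame the data.
    // Otherwise the bytes were written by the server itself, so a mismatch
    // points at a bug and should keep raising LOGICAL_ERROR alerts.
    return settings.native_format ? ErrorCode::INCORRECT_DATA : ErrorCode::LOGICAL_ERROR;
}
```

Keeping the non-native branch as `LOGICAL_ERROR` preserves the CI and production alerts described above.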

Avogar · 2026-03-18T13:16:14Z

Also, this PR has some unrelated code changes that revert recent bug fixes in serializations.

Avogar · 2026-03-19T12:13:05Z

Changes in this file revert recent bug fixes.

In ASan and fuzzer builds, `LOGICAL_ERROR` calls `abortOnFailedAssertion` → `abort()` before the exception propagates to the caller's `catch(...)`. Any condition that can be triggered by malformed binary input must use `INCORRECT_DATA` (or another non-`LOGICAL_ERROR` code) so it throws a catchable exception instead of aborting the process.

The following sites were changed because the condition is fully determined by the content of the data stream, not by a programming error:

- `NativeReader.cpp`: row count mismatch after column deserialization
- `SerializationNullable.cpp`: null map / nested column size mismatch (was `native_format ? INCORRECT_DATA : LOGICAL_ERROR`; the non-native branch is also reachable from malformed input)
- `SerializationTuple.cpp`: tuple element size mismatch after deserialization (same `native_format` conditional pattern)
- `SerializationArray.cpp`: element column longer than last offset value
- `SerializationSparse.cpp`: inconsistent offsets / values sizes
- `SerializationString.cpp`: empty data stream in bulk deserialization
- `SerializationObject.cpp`: missing data stream and size mismatches in `STRING`, typed, and dynamic deserialization paths
- `SerializationObjectSharedData.cpp`: ~20 sites across `deserializeStructure`, `deserializePathsInfos`, `deserializePathsData`, and bucket deserialization: all "got empty stream", row-count mismatches, and path-not-found conditions
- `SerializationReplicated.cpp`: `num_rows != limit` (read from stream), invalid `size_of_indexes_type`, missing `ReplicatedElements` substream
- `ColumnUnique.h`: null in non-nullable dictionary; index out of range in dictionary lookup
- `Allocator.cpp`: `checkSize` threw `LOGICAL_ERROR` for sizes ≥ 2^63; changed to `CANNOT_ALLOCATE_MEMORY` since the condition is triggered by an attacker-controlled allocation size, not a bug in the allocator itself

Serialization-side throws and invariant guards that represent true programming errors (e.g., uninitialized callbacks, wrong codec mode) are left as `LOGICAL_ERROR`.

Discovered by fuzzing with `native_reader_fuzzer` and `nested_type_serialization_fuzzer` in ASan build.

**Bug Fix (user-visible misbehavior in official stable or prestable release)**

- Fixed process abort in Native format deserialization when reading malformed data containing inconsistent column sizes, missing substreams, or other structural errors.

….cpp

The previous commit accidentally included an incomplete refactoring of SerializationObject that removed constructor parameters from the .cpp but not from the .h or call sites, causing a compilation error. Restore SerializationObject.cpp to match origin/master's constructor signature while preserving the three LOGICAL_ERROR → INCORRECT_DATA changes.

…Data and fix typo

- Match .cpp constructor definition to .h declaration by restoring the `dynamic_serialization_` parameter (was accidentally dropped, causing an out-of-line definition mismatch that breaks compilation).
- Fix typo in `ColumnUnique::checkIndexes` exception message: `grated or equal` → `greater or equal`.

…ngs in SerializationSparse

- Remove unused `extern const int LOGICAL_ERROR` from four files where all throw sites were changed to `INCORRECT_DATA` in the prior commit but the declaration was left behind, causing style-check failures.
- Remove `values_settings.insert_only_rows_in_current_range_from_substreams_cache` block added by accident to `SerializationSparse::deserializeBinaryBulkWithMultipleStreams`. The flag, when propagated to nested serializers, caused the `Dynamic` element of a `Tuple` to read 0 rows from the substream cache instead of 1, producing an "Unexpected size of tuple element" mismatch during `CHECK TABLE` on a table with sparse-serialized `Tuple(Dynamic, Tuple(Int))` columns. This regressed test `04038_check_table_sparse_tuple_dynamic`.

…tionSparse

Commit e338032 accidentally removed the 'if (max_rows_to_read == 0) return 0' early return that was introduced by PR ClickHouse#99351 to fix CHECK TABLE on Tuple columns containing Dynamic elements with sparse-serialized sub-columns (issue ClickHouse#96588). The removed guard was replaced with 'if (max_rows_to_read && ...)' guards on the two conditions below, but those do not short-circuit the function: the read loop still executes when limit=0, reading an extra row instead of returning immediately. This regresses test 04038_check_table_sparse_tuple_dynamic. Restore the original early return and remove the spurious 'max_rows_to_read &&' guard from the loop condition, since it is now unreachable when limit=0.
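The difference between an early return and a guard on the stop condition can be sketched with a toy reader. This is not the `SerializationSparse` code: `readWithEarlyReturn` and `readWithGuardOnly` are hypothetical functions over a plain vector, and how many rows the real loop over-reads depends on its actual structure. The sketch only shows that a `max_rows_to_read &&` guard does not stop the loop from running when the limit is 0.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical reader: copies up to max_rows_to_read values from `stream` into `out`.
std::size_t readWithEarlyReturn(const std::vector<int> & stream, std::size_t max_rows_to_read, std::vector<int> & out)
{
    if (max_rows_to_read == 0)
        return 0; // nothing requested: leave before the loop touches the stream

    std::size_t read = 0;
    while (read < stream.size())
    {
        out.push_back(stream[read]);
        ++read;
        if (read == max_rows_to_read)
            break;
    }
    return read;
}

// Same loop, but the early return is replaced by a guard on the stop condition.
std::size_t readWithGuardOnly(const std::vector<int> & stream, std::size_t max_rows_to_read, std::vector<int> & out)
{
    std::size_t read = 0;
    while (read < stream.size())
    {
        out.push_back(stream[read]);
        ++read;
        // With a limit of 0 this condition is never true, so the loop keeps reading.
        if (max_rows_to_read && read == max_rows_to_read)
            break;
    }
    return read;
}
```

In this toy version the guard-only variant consumes the stream when asked for zero rows, which is the kind of over-read the early return prevents.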

Per Avogar's review: most LOGICAL_ERROR sites changed in the initial commit are actual programming errors in MergeTree deserialization, not data-format errors. The `INCORRECT_DATA` pattern is appropriate only when reading from an explicitly untrusted stream (Native format over the wire). Changing them unconditionally would silently swallow real bugs and remove the production alerts on LOGICAL_ERROR.

Reverted files (error codes restored to LOGICAL_ERROR or native_format conditional):

- SerializationNullable.cpp: restore native_format ternary
- SerializationTuple.cpp: restore native_format ternary
- SerializationArray.cpp: restore LOGICAL_ERROR
- SerializationString.cpp: restore LOGICAL_ERROR
- SerializationObject.cpp: restore LOGICAL_ERROR / native_format
- SerializationObjectSharedData.cpp: restore all LOGICAL_ERROR
- SerializationSparse.cpp: restore LOGICAL_ERROR + max_rows_to_read guard (max_rows_to_read && ...) which was also incorrectly changed
- ColumnUnique.h: restore LOGICAL_ERROR (kept typo fix "grated" → "greater" separately)

Also removed the 03925_sparse_values_in_substreams_cache_bug test that was added to cover the sparse deserialization change, which is also being reverted.

Remaining changes (kept):

- NativeReader.cpp: row count mismatch is purely data-driven
- SerializationReplicated.cpp: explicitly endorsed by reviewer (lines 198 and 224: num_rows != limit, invalid size_of_indexes_type)
- Allocator.cpp: CANNOT_ALLOCATE_MEMORY for attacker-controlled oversized allocation (separate concern, not serialization)

…st, drop 04038

- `SerializationReplicated`: use `settings.native_format ? INCORRECT_DATA : LOGICAL_ERROR` for the `num_rows != limit` and `size_of_indexes_type` checks; revert the empty-elements-stream throw back to `LOGICAL_ERROR` (unreachable in Native format per Avogar).
- Restore `03925_sparse_values_in_substreams_cache_bug` test that was accidentally removed by the prior revert commit.
- Remove `04038_check_table_sparse_tuple_dynamic` test: the underlying issue (ClickHouse#96588) is not fixed in this PR and the test fails in CI.

… LOGICAL_ERROR, fix 03925 reference, restore 04038 test

Per Avogar review:

- Revert SerializationSparse.cpp to master: restore limit=0 early return and insert_only_rows_in_current_range_from_substreams_cache flag (bug fix for ClickHouse#96588)
- Restore LOGICAL_ERROR in Allocator.cpp checkSize (not reachable via user data)
- Restore LOGICAL_ERROR in SerializationReplicated.cpp (not reachable in Native format)
- Restore deleted regression test 04038_check_table_sparse_tuple_dynamic
- Fix 03925 reference: non-Nullable String.size is 0 not \N, tuple shows '' not NULL

… output format

…s accidentally dropped during rebase

The test was inadvertently changed from Nullable(String)/Nullable(UInt64) with nullable_serialization_version = 'allow_sparse' to plain String/UInt64, weakening the regression coverage for the nullable+sparse substream cache bug. Since this PR no longer touches SerializationSparse.cpp, the original test form passes correctly. Restore to match master exactly.

motsc · 2026-03-24T17:01:55Z

Hi @Avogar — the PR has been substantially narrowed based on your review. The broad LOGICAL_ERROR → INCORRECT_DATA changes across serialization code have been removed. The current diff contains only:

NativeReader.cpp: changes LOGICAL_ERROR → INCORRECT_DATA for the row-count mismatch after deserialization from a Native format stream — this is user-provided data, matching exactly the case you described as justified ("INCORRECT_DATA makes sense only for cases when we use Native format reader").
SerializationObject.cpp/.h: uses VectorWithMemoryTracking<String> instead of std::vector<String> for sorted_dynamic_paths to bound memory when deserializing a JSON column with many dynamic paths from an untrusted source.
ColumnUnique.h: typo fix only ("grated" → "greater"); error code unchanged.
SerializationArray.cpp: declaration reorder only; no error code changes.
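The memory-bounding idea behind a tracked vector can be sketched with a standard allocator that counts bytes. This is an illustration under assumptions, not the `VectorWithMemoryTracking` implementation: `TrackingAllocator`, `TrackedVector`, `tracked_bytes`, and `memory_limit` are hypothetical names, and a single global counter stands in for whatever accounting the real type uses.

```cpp
#include <cassert>
#include <cstddef>
#include <new>
#include <string>
#include <vector>

// Hypothetical global budget; a real tracker would likely be per query or per thread.
static std::size_t tracked_bytes = 0;
static const std::size_t memory_limit = 1 << 20; // 1 MiB, arbitrary for the sketch

template <typename T>
struct TrackingAllocator
{
    using value_type = T;

    TrackingAllocator() = default;
    template <typename U>
    TrackingAllocator(const TrackingAllocator<U> &) {}

    T * allocate(std::size_t n)
    {
        const std::size_t bytes = n * sizeof(T);
        if (tracked_bytes + bytes > memory_limit)
            throw std::bad_alloc(); // refuse instead of growing without bound
        tracked_bytes += bytes;
        return static_cast<T *>(::operator new(bytes));
    }

    void deallocate(T * p, std::size_t n)
    {
        tracked_bytes -= n * sizeof(T);
        ::operator delete(p);
    }
};

template <typename T, typename U>
bool operator==(const TrackingAllocator<T> &, const TrackingAllocator<U> &) { return true; }
template <typename T, typename U>
bool operator!=(const TrackingAllocator<T> &, const TrackingAllocator<U> &) { return false; }

// A std::vector whose buffer allocations are counted against the budget.
template <typename T>
using TrackedVector = std::vector<T, TrackingAllocator<T>>;
```

Only the vector's own buffer is counted in this sketch; heap memory owned by each `std::string` element is not, so a production tracker would need to account for that separately.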

CI is green on the current commit. Would you mind taking another look?

Avogar

Let's also update the PR description, as it still describes the old changes.

…Helpers.h The declaration was added without a corresponding implementation and is not used anywhere. Remove per code review feedback.

motsc · 2026-03-25T06:01:24Z

/recheck

clickhouse-gh · 2026-03-25T08:46:47Z

LLVM Coverage Report

Metric	Baseline	Current	Δ
Lines	78.20%	84.10%	+5.90%
Functions	23.30%	24.40%	+1.10%
Branches	70.70%	76.60%	+5.90%

PR changed-lines coverage: 80.00% (4/5, 0 noise lines excluded)
Diff coverage report
Uncovered code

motsc · 2026-03-25T09:31:44Z

/recheck

clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label Mar 17, 2026

clickhouse-gh Bot reviewed Mar 17, 2026

Comment thread src/DataTypes/Serializations/SerializationObjectSharedData.cpp Outdated

clickhouse-gh Bot reviewed Mar 17, 2026

motsc added bug Confirmed user-visible misbehaviour in official release crash Crash / segfault / abort labels Mar 17, 2026

Avogar self-assigned this Mar 17, 2026

alexey-milovidov removed bug Confirmed user-visible misbehaviour in official release crash Crash / segfault / abort labels Mar 18, 2026

motsc added pr-improvement Pull request with some product improvements and removed pr-bugfix Pull request with bugfix, not backported by default labels Mar 18, 2026

motsc changed the title ~~Fix process abort on malformed binary data: LOGICAL_ERROR → INCORRECT_DATA in deserialization paths~~ improvement: use INCORRECT_DATA instead of LOGICAL_ERROR for data-reachable deserialization checks Mar 18, 2026

clickhouse-gh Bot reviewed Mar 18, 2026

Comment thread src/DataTypes/Serializations/SerializationSparse.cpp Outdated

Avogar requested changes Mar 18, 2026

clickhouse-gh Bot reviewed Mar 19, 2026

Comment thread tests/queries/0_stateless/03925_sparse_values_in_substreams_cache_bug.sql

Avogar reviewed Mar 19, 2026

Comment thread src/DataTypes/Serializations/SerializationReplicated.cpp Outdated

Avogar reviewed Mar 19, 2026

Comment thread tests/queries/0_stateless/03925_sparse_values_in_substreams_cache_bug.sql

Avogar reviewed Mar 19, 2026

Comment thread src/Common/Allocator.cpp Outdated

motsc added 8 commits March 19, 2026 18:21

fix: declare LOGICAL_ERROR in SerializationTuple.cpp ErrorCodes block

108d748

motsc force-pushed the fix/deserialization-logical-error-abort branch from 32ba4d9 to c78a222 Compare March 20, 2026 01:22

clickhouse-gh Bot reviewed Mar 20, 2026

Comment thread src/DataTypes/Serializations/SerializationSparse.cpp Outdated

clickhouse-gh Bot reviewed Mar 20, 2026

Comment thread tests/queries/0_stateless/04038_check_table_sparse_tuple_dynamic.sql

motsc added 2 commits March 19, 2026 21:42

fix: add missing trailing tab in 04038 reference to match CHECK TABLE…

9defa44

… output format

clickhouse-gh Bot reviewed Mar 20, 2026

Comment thread tests/queries/0_stateless/03925_sparse_values_in_substreams_cache_bug.sql Outdated

motsc marked this pull request as ready for review March 23, 2026 15:17

motsc force-pushed the fix/deserialization-logical-error-abort branch from a4408ed to 6d2df50 Compare March 23, 2026 15:28

clickhouse-gh Bot reviewed Mar 23, 2026

Comment thread src/DataTypes/Serializations/SerializationObject.cpp Outdated

motsc force-pushed the fix/deserialization-logical-error-abort branch from a896ce9 to 9defa44 Compare March 23, 2026 16:46

motsc added 2 commits March 23, 2026 10:56

fix: restore VectorWithMemoryTracking and skipWhitespaceAndSQLComment…

87b6168

…s accidentally dropped during rebase

Avogar reviewed Mar 24, 2026

Comment thread src/IO/ReadHelpers.h

fix: remove unused skipWhitespaceAndSQLComments declaration from Read…

db64ce3

…Helpers.h The declaration was added without a corresponding implementation and is not used anywhere. Remove per code review feedback.

clickhouse-gh Bot added pr-bugfix Pull request with bugfix, not backported by default and removed pr-improvement Pull request with some product improvements labels Mar 24, 2026

Merge remote-tracking branch 'origin/master' into update-99822

76660ad

motsc requested a review from Avogar March 25, 2026 21:42

motsc enabled auto-merge March 26, 2026 00:19

Avogar approved these changes Apr 2, 2026

motsc added this pull request to the merge queue Apr 2, 2026

Merged via the queue into ClickHouse:master with commit deb76c4 Apr 2, 2026
449 of 451 checks passed

motsc deleted the fix/deserialization-logical-error-abort branch April 2, 2026 18:09

robot-ch-test-poll3 added the pr-synced-to-cloud The PR is synced to the cloud repo label Apr 2, 2026

4 participants
