Skip to content

Fix use-after-free in AvroConfluentRowInputFormat#104751

Merged
mstetsyuk merged 4 commits into
masterfrom
fix-avro-confluent-deserializer-copy
May 13, 2026
Merged

Fix use-after-free in AvroConfluentRowInputFormat#104751
mstetsyuk merged 4 commits into
masterfrom
fix-avro-confluent-deserializer-copy

Conversation

@mstetsyuk
Copy link
Copy Markdown
Member

@mstetsyuk mstetsyuk commented May 12, 2026

AvroConfluentRowInputFormat::getOrCreateDeserializer built an AvroDeserializer on the stack and copied it into deserializer_cache. The deserializer holds a symbolic_skip_fn_map, and a lambda in AvroDeserializer::row_action which captures a reference into that map ([&skip_fn = it->second]). Copying deep-copies the map to a new address but leaves the lambda referencing the original. When the stack local is destroyed, those references dangle, and the next row deserialization invokes a std::function over freed memory. We should use std::move instead of copying.

Changelog category (leave one):

  • Critical Bug Fix (crash, data loss, RBAC)

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix segfault due to a use-after-free bug in AvroConfluentRowInputFormat.

`AvroConfluentRowInputFormat::getOrCreateDeserializer` built an
`AvroDeserializer` on the stack and copied it into `deserializer_cache`.
The deserializer holds `symbolic_skip_fn_map`, and skip lambdas in its
action tree capture references into that map. Copying deep-copies the
map to a new address while the lambdas still reference the original;
once the stack local is destroyed those references dangle, and the next
row deserialization invokes a `std::function` over freed memory.

Move into the cache instead of copying. `std::map` move-construction
transfers existing nodes without relocating them, so the captured
references remain valid.
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented May 12, 2026

Workflow [PR], commit [5448d84]

Summary:

job_name test_name status info comment
Unit tests (asan_ubsan, function_prop_fuzzer) FAIL
FunctionsStress.stress FAIL cidb
AllTests FAIL cidb

AI Review

Summary

This PR fixes a use-after-free in AvroConfluentRowInputFormat by moving AvroDeserializer into deserializer_cache and making AvroDeserializer non-copyable. The fix itself is directionally correct, but the added regression test is still not strong enough to prove the full contract described in the PR.

Findings

⚠️ Majors

  • [tests/integration/test_format_avro_confluent/test.py:102-122] The PR contract is to fix a lifetime bug around cached deserializers, but test_select_skip_symbolic inserts only one AvroConfluent message. That covers the recursive symbolic skip shape, but it does not verify the same-schema cache reuse path that this PR description calls out (next row deserialization). A regression can still slip through without a multi-row same-schema check.
    Suggested fix: make this test insert at least two records encoded with the same schema id in one insert and assert both rows are decoded correctly.
Final Verdict

Status: ⚠️ Request changes

Minimum required actions:

  1. Strengthen test_select_skip_symbolic to include at least two AvroConfluent messages with the same schema id while preserving the recursive symbolic field that is skipped.

@clickhouse-gh clickhouse-gh Bot added the pr-bugfix Pull request with bugfix, not backported by default label May 12, 2026
@mstetsyuk mstetsyuk changed the title Fix dangling references in cached AvroConfluent deserializer Fix use-after-free in AvroConfluentRowInputFormat May 12, 2026
@mstetsyuk mstetsyuk marked this pull request as draft May 12, 2026 19:31
@Ergus Ergus self-assigned this May 12, 2026
@mstetsyuk mstetsyuk marked this pull request as ready for review May 12, 2026 19:37
Comment thread src/Processors/Formats/Impl/AvroRowInputFormat.cpp
Prevent reintroduction of the use-after-free fixed in the previous commit.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown
Member

@Ergus Ergus left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM. I am just wondering if we can add a test for this somehow to ensure we exercise this path.

AvroDeserializer(const AvroDeserializer &) = delete;
AvroDeserializer & operator=(const AvroDeserializer &) = delete;
AvroDeserializer(AvroDeserializer &&) = default;
AvroDeserializer & operator=(AvroDeserializer &&) = delete;
Copy link
Copy Markdown
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I am a bit old school, I really like this part ;).

@mstetsyuk mstetsyuk added pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! and removed pr-bugfix Pull request with bugfix, not backported by default labels May 13, 2026
mstetsyuk and others added 2 commits May 13, 2026 05:50
The previous schema's symbolic reference sat at the top level of a
skipped field, so createAction resolved it before createSkipFn was
called, never building the reference-capturing lambda.

A recursive Node schema with `next: ["null", "Node"]` puts the
symbolic node inside a union branch, which is one of the spots
createSkipFn recurses into on its own.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@clickhouse-gh
Copy link
Copy Markdown
Contributor

clickhouse-gh Bot commented May 13, 2026

LLVM Coverage Report

Metric Baseline Current Δ
Lines 84.10% 84.10% +0.00%
Functions 91.10% 91.10% +0.00%
Branches 76.60% 76.60% +0.00%

Changed lines: 100.00% (6/6) · Uncovered code

Full report · Diff report

@mstetsyuk mstetsyuk added this pull request to the merge queue May 13, 2026
Merged via the queue into master with commit d0ad032 May 13, 2026
164 of 166 checks passed
@mstetsyuk mstetsyuk deleted the fix-avro-confluent-deserializer-copy branch May 13, 2026 11:30
@robot-clickhouse-ci-2 robot-clickhouse-ci-2 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label May 13, 2026
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-synced-to-cloud The PR is synced to the cloud repo label May 13, 2026
robot-clickhouse-ci-1 added a commit that referenced this pull request May 13, 2026
Cherry pick #104751 to 25.8: Fix use-after-free in `AvroConfluentRowInputFormat`
robot-clickhouse-ci-1 added a commit that referenced this pull request May 13, 2026
Cherry pick #104751 to 26.2: Fix use-after-free in `AvroConfluentRowInputFormat`
robot-clickhouse-ci-1 added a commit that referenced this pull request May 13, 2026
Cherry pick #104751 to 26.3: Fix use-after-free in `AvroConfluentRowInputFormat`
robot-clickhouse-ci-1 added a commit that referenced this pull request May 13, 2026
Cherry pick #104751 to 26.4: Fix use-after-free in `AvroConfluentRowInputFormat`
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label May 13, 2026
@tavplubix tavplubix added v25.12-must-backport v26.2-must-backport and removed pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR labels May 13, 2026
mstetsyuk added a commit that referenced this pull request May 13, 2026
Backport #104751 to 26.2: Fix use-after-free in `AvroConfluentRowInputFormat`
mstetsyuk added a commit that referenced this pull request May 13, 2026
Backport #104751 to 26.3: Fix use-after-free in `AvroConfluentRowInputFormat`
mstetsyuk added a commit that referenced this pull request May 13, 2026
Backport #104751 to 26.4: Fix use-after-free in `AvroConfluentRowInputFormat`
@robot-clickhouse-ci-1 robot-clickhouse-ci-1 added the pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR label May 13, 2026
@robot-ch-test-poll4 robot-ch-test-poll4 added the pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore label May 13, 2026
mstetsyuk added a commit that referenced this pull request May 14, 2026
Backport #104751 to 25.8: Fix use-after-free in `AvroConfluentRowInputFormat`
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

pr-backports-created Backport PRs are successfully created, it won't be processed by CI script anymore pr-critical-bugfix pr-must-backport Pull request should be backported intentionally. Use this label with great care! pr-must-backport-synced The `*-must-backport` labels are synced into the cloud Sync PR pr-synced-to-cloud The PR is synced to the cloud repo v25.12-must-backport v26.2-must-backport

Projects

None yet

Development

Successfully merging this pull request may close these issues.

6 participants