Fix race condition in IPartitionStrategy::computePartitionKey (used by object storage sinks) #99400

Open
arthurpassos wants to merge 5 commits into ClickHouse:master from arthurpassos:fix_race_condition

Contributor

@arthurpassos arthurpassos commented Mar 13, 2026

IPartitionStrategy::computePartitionKey may be called from multiple threads, and it writes to cached_result without any synchronization. Adding a mutex around the write would be the simplest fix, but we can avoid locking altogether by moving the cache write to the constructor.
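
The approach described above can be illustrated in isolation. The following is a minimal sketch, not the code from this PR: `SketchStrategy` and `SketchActions` are hypothetical stand-ins for the strategy class and for PartitionExpressionActionsAndColumnName, and the string-building logic is invented purely for illustration. It shows the cache being written once in the constructor and only read from a const method afterwards.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for PartitionExpressionActionsAndColumnName.
struct SketchActions
{
    std::string column_name;
};

// Hypothetical stand-in for a partition strategy; not the ClickHouse class.
class SketchStrategy
{
public:
    explicit SketchStrategy(std::string expression_)
        : expression(std::move(expression_))
    {
        // The only write to cached_result happens here, while the object is
        // still being constructed and is not yet visible to other threads.
        cached_result = SketchActions{"toString(" + expression + ")"};
    }

    // const: reads cached_result and never writes it, so concurrent callers
    // do not modify shared state.
    std::string computePartitionKey(const std::string & value) const
    {
        assert(cached_result.has_value());
        return cached_result->column_name + "=" + value;
    }

private:
    std::string expression;
    std::optional<SketchActions> cached_result;
};
```

Because the method only reads state that was fully initialized before the object was shared, no mutex is needed on the read path. This sketch caches unconditionally; handling expressions that must not be cached is a separate concern.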

Changelog category (leave one):

  • Improvement

Changelog entry (a user-readable short description of the changes that goes into CHANGELOG.md):

Fix possible race condition in IPartitionStrategy::cached_result introduced in #92844

Documentation entry for user-facing changes

  • Documentation is written (mandatory for new features)

Note

Medium Risk
Touches object-storage partition key computation and caching; while intended to be behavior-preserving, it changes when expression actions are built/cached and could impact partitioning/perf under concurrency.

Overview
Fixes a potential race on IPartitionStrategy::cached_result by removing cache writes during computePartitionKey/getPartitionExpressionActions and instead precomputing and caching deterministic expression actions in the WildcardPartitionStrategy and HiveStylePartitionStrategy constructors.

Refactors action creation into small helpers (buildToStringPartitionAST, getCachedOrBuildActions, cacheDeterministicActions) and makes computePartitionKey/getPartitionExpressionActions const to reflect thread-safe usage.

Written by Cursor Bugbot for commit fb89e89. This will update automatically on new commits.

@cursor cursor bot left a comment

Cursor Bugbot has reviewed your changes and found 1 potential issue.

@kssenii kssenii added the "can be tested" label (Allows running workflows for external contributors) Mar 16, 2026
@kssenii kssenii self-assigned this Mar 16, 2026
Contributor

clickhouse-gh bot commented Mar 16, 2026

Workflow [PR], commit [20b9624]

Summary:

| job_name | test_name | status | info | comment |
|---|---|---|---|---|
| Stateless tests (amd_tsan, s3 storage, parallel, 1/2) | | failure | | |
| | 04004_alter_modify_column_ttl_without_type | FAIL | cidb | |
| Integration tests (arm_binary, distributed plan, 4/4) | | failure | | |
| | test_dotnet_client/test.py::test_dotnet_client | FAIL | cidb | |

AI Review

Summary

This PR makes IPartitionStrategy::computePartitionKey and getPartitionExpressionActions const, and removes unsynchronized lazy writes to cached_result from concurrent call paths by computing and caching deterministic actions during strategy construction. I did not find new high-confidence correctness, safety, or performance regressions in the final diff; the previously flagged const signature mismatch was already fixed in later commits.

Missing context

  • ⚠️ No CI logs or stress-test artifacts were provided in the review inputs, so this assessment is based on code inspection only.

ClickHouse Rules

Item Status Notes
Deletion logging
Serialization versioning
Core-area scrutiny
No test removal
Experimental gate
No magic constants
Backward compatibility
SettingsHistory.cpp
Safe rollout
Compilation time

Final Verdict

  • Status: ✅ Approve

@clickhouse-gh clickhouse-gh bot added the "pr-improvement" label (Pull request with some product improvements) Mar 16, 2026

WildcardPartitionStrategy(KeyDescription partition_key_description_, const Block & sample_block_, ContextPtr context_);

-    ColumnPtr computePartitionKey(const Chunk & chunk) override;
+    ColumnPtr computePartitionKey(const Chunk & chunk) const override;
Contributor

WildcardPartitionStrategy::computePartitionKey signature was made const in the header, but the .cpp definition is still non-const (src/Storages/IPartitionStrategy.cpp, near line 299). This makes the member definition not match any declaration and breaks compilation.

Please update the definition to ColumnPtr WildcardPartitionStrategy::computePartitionKey(const Chunk & chunk) const.
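
The mismatch described above comes from the fact that const is part of a member function's signature, so the declaration and the out-of-class definition have to agree on it. A minimal sketch, using hypothetical `IStrategySketch` and `WildcardSketch` types with `std::string` in place of Chunk and ColumnPtr (none of these are the real ClickHouse types):

```cpp
#include <cassert>
#include <string>
#include <type_traits>

// Hypothetical interface standing in for IPartitionStrategy.
struct IStrategySketch
{
    virtual ~IStrategySketch() = default;
    virtual std::string computePartitionKey(const std::string & chunk) const = 0;
};

struct WildcardSketch : IStrategySketch
{
    // Declaration: the trailing const is part of the signature.
    std::string computePartitionKey(const std::string & chunk) const override;
};

// Definition: repeats the trailing const. Without it, this would name a
// member function that was never declared, and compilation would fail.
std::string WildcardSketch::computePartitionKey(const std::string & chunk) const
{
    return "key:" + chunk;
}

// The const qualifier shows up in the pointer-to-member type as well.
static_assert(std::is_same_v<
    decltype(&WildcardSketch::computePartitionKey),
    std::string (WildcardSketch::*)(const std::string &) const>);

// Callable through a const reference only because the method is const.
inline std::string keyFor(const IStrategySketch & strategy, const std::string & chunk)
{
    return strategy.computePartitionKey(chunk);
}
```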

BuildAST && build_ast)
{
if (cached_result)
return *cached_result;
Member

To be honest, the code structure seems strange: we pass cached_result into this method as an argument and return early if it already has a value. I see no point in passing cached_result to this method unless it also sets cached_result at the end, after finding it empty and rebuilding it.

Contributor Author

I am also not a big fan of it. The idea of the cached_result if statement inside the method was to avoid duplicating it in the two computePartitionKey implementations.

I can make it simpler if you want.

Contributor Author

diff --git a/src/Storages/IPartitionStrategy.cpp b/src/Storages/IPartitionStrategy.cpp
index 77826c80afb..87a503c8685 100644
--- a/src/Storages/IPartitionStrategy.cpp
+++ b/src/Storages/IPartitionStrategy.cpp
@@ -97,27 +97,6 @@ namespace
         return makeASTFunction("toString", std::move(arguments));
     }
 
-    template <typename BuildAST>
-    PartitionExpressionActionsAndColumnName getCachedOrBuildActions(
-        const std::optional<PartitionExpressionActionsAndColumnName> & cached_result,
-        const IPartitionStrategy & partition_strategy,
-        BuildAST && build_ast)
-    {
-        if (cached_result)
-            return *cached_result;
-
-        auto expression_ast = build_ast();
-        return partition_strategy.getPartitionExpressionActions(expression_ast);
-    }
-
-    void cacheDeterministicActions(
-        std::optional<PartitionExpressionActionsAndColumnName> & cached_result,
-        const PartitionExpressionActionsAndColumnName & actions_with_column)
-    {
-        if (!actions_with_column.actions->getActionsDAG().hasNonDeterministic())
-            cached_result = actions_with_column;
-    }
-
     std::shared_ptr<IPartitionStrategy> createHivePartitionStrategy(
         ASTPtr partition_by,
         const Block & sample_block,
@@ -289,19 +268,29 @@ std::shared_ptr<IPartitionStrategy> PartitionStrategyFactory::get(StrategyType s
 WildcardPartitionStrategy::WildcardPartitionStrategy(KeyDescription partition_key_description_, const Block & sample_block_, ContextPtr context_)
     : IPartitionStrategy(partition_key_description_, sample_block_, context_)
 {
-    auto actions_with_column = getCachedOrBuildActions(
-        cached_result,
-        *this,
-        [&] { return buildToStringPartitionAST(partition_key_description.definition_ast); });
-    cacheDeterministicActions(cached_result, actions_with_column);
+    ASTs arguments(1, partition_key_description.definition_ast);
+    ASTPtr partition_by_string = makeASTFunction("toString", std::move(arguments));
+    auto actions_with_column = getPartitionExpressionActions(partition_by_string);
+    if (!actions_with_column.actions->getActionsDAG().hasNonDeterministic())
+    {
+        cached_result = actions_with_column;
+    }
 }
 
 ColumnPtr WildcardPartitionStrategy::computePartitionKey(const Chunk & chunk) const
 {
-    auto actions_with_column = getCachedOrBuildActions(
-        cached_result,
-        *this,
-        [&] { return buildToStringPartitionAST(partition_key_description.definition_ast); });
+    PartitionExpressionActionsAndColumnName actions_with_column;
+
+    if (cached_result)
+    {
+        actions_with_column = *cached_result;
+    }
+    else
+    {
+        ASTs arguments(1, partition_key_description.definition_ast);
+        ASTPtr partition_by_string = makeASTFunction("toString", std::move(arguments));
+        actions_with_column = getPartitionExpressionActions(partition_by_string);
+    }
 
     Block block_with_partition_by_expr = sample_block.cloneWithoutColumns();
     block_with_partition_by_expr.setColumns(chunk.getColumns());
@@ -341,11 +330,13 @@ HiveStylePartitionStrategy::HiveStylePartitionStrategy(
 
     block_without_partition_columns = buildBlockWithoutPartitionColumns(sample_block, partition_columns_name_set);
 
-    auto actions_with_column = getCachedOrBuildActions(
-        cached_result,
-        *this,
-        [&] { return buildHivePartitionAST(partition_key_description.definition_ast, getPartitionColumns()); });
-    cacheDeterministicActions(cached_result, actions_with_column);
+    auto hive_ast = buildHivePartitionAST(partition_key_description.definition_ast, getPartitionColumns());
+    auto actions_with_column = getPartitionExpressionActions(hive_ast);
+
+    if (!actions_with_column.actions->getActionsDAG().hasNonDeterministic())
+    {
+        cached_result = actions_with_column;
+    }
 }
 
 std::string HiveStylePartitionStrategy::getPathForRead(const std::string & prefix)
@@ -385,10 +376,17 @@ std::string HiveStylePartitionStrategy::getPathForWrite(
 
 ColumnPtr HiveStylePartitionStrategy::computePartitionKey(const Chunk & chunk) const
 {
-    auto actions_with_column = getCachedOrBuildActions(
-        cached_result,
-        *this,
-        [&] { return buildHivePartitionAST(partition_key_description.definition_ast, getPartitionColumns()); });
+    PartitionExpressionActionsAndColumnName actions_with_column;
+
+    if (cached_result)
+    {
+        actions_with_column = *cached_result;
+    }
+    else
+    {
+        auto hive_ast = buildHivePartitionAST(partition_key_description.definition_ast, getPartitionColumns());
+        actions_with_column = getPartitionExpressionActions(hive_ast);
+    }

Let me know if you want me to push this diff.

Member

@kssenii kssenii Mar 17, 2026

OK, let's leave it as it is, but add a comment at the beginning of this function noting that cached_result is populated in the constructor when the actions are deterministic; otherwise we always rebuild from scratch.

auto hive_ast = buildHivePartitionAST(partition_key_description.definition_ast, getPartitionColumns());
auto actions_with_column = getPartitionExpressionActions(hive_ast);
auto actions_with_column = getCachedOrBuildActions(
cached_result,
Member

We've already populated cached_result in the HiveStylePartitionStrategy constructor, so why call getCachedOrBuildActions again?

Contributor Author

cacheDeterministicActions might decide not to cache it when ActionsDAG::hasNonDeterministic returns true. In that case cached_result stays empty and the actions have to be rebuilt on every call.
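
The fallback discussed in this thread can be sketched on its own. This is a simplified illustration under stated assumptions, not the PR's implementation: `DeterminismAwareStrategy` and `BuiltActions` are hypothetical names, and a plain boolean flag stands in for the result of ActionsDAG::hasNonDeterministic.

```cpp
#include <cassert>
#include <optional>
#include <string>
#include <utility>

// Hypothetical stand-in for the built expression actions.
struct BuiltActions
{
    std::string expression;
    bool has_non_deterministic = false; // stands in for hasNonDeterministic()
};

// Hypothetical strategy; not the ClickHouse class.
class DeterminismAwareStrategy
{
public:
    DeterminismAwareStrategy(std::string expression_, bool non_deterministic_)
        : expression(std::move(expression_)), non_deterministic(non_deterministic_)
    {
        BuiltActions actions = build();
        // Cache only deterministic actions; otherwise cached_result stays empty.
        if (!actions.has_non_deterministic)
            cached_result = std::move(actions);
    }

    // Returns the cached actions if the constructor stored them, and rebuilds
    // them from scratch otherwise. It never writes to cached_result.
    BuiltActions getActions() const
    {
        if (cached_result)
            return *cached_result;
        return build();
    }

    bool hasCachedActions() const { return cached_result.has_value(); }

private:
    BuiltActions build() const { return BuiltActions{expression, non_deterministic}; }

    std::string expression;
    bool non_deterministic;
    std::optional<BuiltActions> cached_result;
};
```

With this shape, the read path needs the "cached or rebuild" branch even though the constructor already ran, because an empty cached_result is a valid steady state for non-deterministic expressions.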

Contributor

clickhouse-gh bot commented Mar 18, 2026

LLVM Coverage Report

| Metric | Baseline | Current | Δ |
|---|---|---|---|
| Lines | 83.70% | 83.70% | +0.00% |
| Functions | 23.90% | 23.90% | +0.00% |
| Branches | 76.30% | 76.30% | +0.00% |

PR changed-lines coverage: 100.00% (57/57, 0 noise lines excluded)

Labels

  • can be tested: Allows running workflows for external contributors
  • pr-improvement: Pull request with some product improvements
