Found via ClickGap automated review. Please close or comment if this is incorrect or needs adjustment.
Retrospective finding from a historical scan of PR #85134 (merged 2025-08-08). Confirmed on current codebase — close with a note if already fixed.
Describe what's wrong
SELECT * FROM s3('http://...', format = 'CSV', format = 'TSV') — and any other duplicated key-value pair in the new s3 KV-arg syntax — raises Code: 49 (LOGICAL_ERROR) to the user. LOGICAL_ERROR is reserved for impossible/programmer-bug states; ClickHouse stress checks treat any LOGICAL_ERROR as a critical failure.
Root cause: src/Storages/ObjectStorage/Utils.cpp:175 — the duplicate-key branch was newly added by this PR (introduced inside the new parseKeyValueArguments helper) and uses ErrorCodes::LOGICAL_ERROR for an entirely user-controlled input condition. It should use ErrorCodes::BAD_ARGUMENTS like the other validation throws in the same function (Utils.cpp:158, :168).
Why we believe this is a bug: User SQL → S3StorageParsedArguments::fromAST (src/Storages/ObjectStorage/S3/Configuration.cpp:374) → parseKeyValueArguments (src/Storages/ObjectStorage/Utils.cpp:144) → key_value_args.emplace(arg_name, arg_value).second == false on duplicate → throw Exception(ErrorCodes::LOGICAL_ERROR, ...) at Utils.cpp:175.
Affected locations:
src/Storages/ObjectStorage/Utils.cpp:175 — throw Exception(ErrorCodes::LOGICAL_ERROR, "Duplicate key value argument: {}", arg_name) inside parseKeyValueArguments
Impact: User-controlled query produces a LOGICAL_ERROR. This (1) misleads users into filing bug reports against ClickHouse for what is just a typo, (2) is escalated by stress / fuzzer / CI checks which treat any LOGICAL_ERROR as a regression, and (3) violates the documented contract that LOGICAL_ERROR is only thrown for invariant violations, never for user input. Reachable via any caller of S3StorageParsedArguments::fromAST or parseFromDisk (S3 disk).
Does it reproduce on most recent release?
Yes — confirmed on current master (commit ac12688dd9e).
How to reproduce
-- Test: duplicate key-value arguments in s3() table function should throw BAD_ARGUMENTS, not LOGICAL_ERROR.
-- Covers: src/Storages/ObjectStorage/Utils.cpp:175 — parseKeyValueArguments throws LOGICAL_ERROR on duplicate keys, but duplicates come from user input, so should be BAD_ARGUMENTS.
SELECT * FROM s3('http://localhost:1234/foo', format = 'CSV', format = 'TSV'); -- { serverError BAD_ARGUMENTS }
SELECT * FROM s3('http://localhost:1234/foo', secret_access_key = 'a', secret_access_key = 'b'); -- { serverError BAD_ARGUMENTS }
SELECT * FROM s3('http://localhost:1234/foo', 'CSV', structure = 'a Int32', structure = 'b String'); -- { serverError BAD_ARGUMENTS }
Try it on ClickHouse Fiddle
Expected behavior
(empty — three statements should each raise serverError BAD_ARGUMENTS and the test should pass with empty .reference)
Error message and/or stacktrace
Expected server error code: 36 but got: 49 (query: SELECT * FROM s3('http://localhost:1234/foo', format = 'CSV', format = 'TSV'); -- { serverError BAD_ARGUMENTS }).
Received exception from server (version 26.5.1):
Code: 49. DB::Exception: Received from 127.0.0.1:19010. DB::Exception: Duplicate key value argument: format. (LOGICAL_ERROR)
Additional context
Open risks:
- Same
parseKeyValueArguments is also called from parseFromDisk (Utils.cpp:214) used by S3-on-disk table functions, so the bad error code surfaces there too.
Field::safeGet<bool> in getFromPositionOrKeyValue throws BAD_GET (code 170) when a KV value has the wrong field type (e.g. partition_columns_in_data_file = 'true'); same convention violation but separate finding.
Suggested fix: Change ErrorCodes::LOGICAL_ERROR to ErrorCodes::BAD_ARGUMENTS at src/Storages/ObjectStorage/Utils.cpp:175. Same convention as the surrounding BAD_ARGUMENTS throws at lines 158 and 168.
Analysis details: Confidence HIGH | Severity P2 | Testability: STATELESS_SQL
Found during automated review of PR #85134.
ClickGapAI · Confidence: HIGH · Severity: P2 · Finding: h_pr85134_001
Found via ClickGap automated review. Please close or comment if this is incorrect or needs adjustment.
Retrospective finding from a historical scan of PR #85134 (merged 2025-08-08). Confirmed on current codebase — close with a note if already fixed.
Describe what's wrong
SELECT * FROM s3('http://...', format = 'CSV', format = 'TSV')— and any other duplicated key-value pair in the new s3 KV-arg syntax — raisesCode: 49 (LOGICAL_ERROR)to the user.LOGICAL_ERRORis reserved for impossible/programmer-bug states; ClickHouse stress checks treat anyLOGICAL_ERRORas a critical failure.Root cause: src/Storages/ObjectStorage/Utils.cpp:175 — the duplicate-key branch was newly added by this PR (introduced inside the new
parseKeyValueArgumentshelper) and usesErrorCodes::LOGICAL_ERRORfor an entirely user-controlled input condition. It should useErrorCodes::BAD_ARGUMENTSlike the other validation throws in the same function (Utils.cpp:158, :168).Why we believe this is a bug: User SQL →
S3StorageParsedArguments::fromAST(src/Storages/ObjectStorage/S3/Configuration.cpp:374) →parseKeyValueArguments(src/Storages/ObjectStorage/Utils.cpp:144) →key_value_args.emplace(arg_name, arg_value).second == falseon duplicate →throw Exception(ErrorCodes::LOGICAL_ERROR, ...)at Utils.cpp:175.Affected locations:
src/Storages/ObjectStorage/Utils.cpp:175— throw Exception(ErrorCodes::LOGICAL_ERROR, "Duplicate key value argument: {}", arg_name) inside parseKeyValueArgumentsImpact: User-controlled query produces a
LOGICAL_ERROR. This (1) misleads users into filing bug reports against ClickHouse for what is just a typo, (2) is escalated by stress / fuzzer / CI checks which treat anyLOGICAL_ERRORas a regression, and (3) violates the documented contract thatLOGICAL_ERRORis only thrown for invariant violations, never for user input. Reachable via any caller ofS3StorageParsedArguments::fromASTorparseFromDisk(S3 disk).Does it reproduce on most recent release?
Yes — confirmed on current
master(commitac12688dd9e).How to reproduce
Try it on ClickHouse Fiddle
Expected behavior
Error message and/or stacktrace
Additional context
Open risks:
parseKeyValueArgumentsis also called fromparseFromDisk(Utils.cpp:214) used by S3-on-disk table functions, so the bad error code surfaces there too.Field::safeGet<bool>ingetFromPositionOrKeyValuethrowsBAD_GET(code 170) when a KV value has the wrong field type (e.g.partition_columns_in_data_file = 'true'); same convention violation but separate finding.Suggested fix: Change
ErrorCodes::LOGICAL_ERRORtoErrorCodes::BAD_ARGUMENTSat src/Storages/ObjectStorage/Utils.cpp:175. Same convention as the surroundingBAD_ARGUMENTSthrows at lines 158 and 168.Analysis details: Confidence HIGH | Severity P2 | Testability:
STATELESS_SQLFound during automated review of PR #85134.
ClickGapAI · Confidence: HIGH · Severity: P2 · Finding:
h_pr85134_001