Skip to content

Duplicate KV argument in s3(...) throws LOGICAL_ERROR instead of BAD_ARGUMENTS #104267

@clickgapai

Description

@clickgapai

Found via ClickGap automated review. Please close or comment if this is incorrect or needs adjustment.

Retrospective finding from a historical scan of PR #85134 (merged 2025-08-08). Confirmed on current codebase — close with a note if already fixed.

Describe what's wrong

SELECT * FROM s3('http://...', format = 'CSV', format = 'TSV') — and any other duplicated key-value pair in the new s3 KV-arg syntax — raises Code: 49 (LOGICAL_ERROR) to the user. LOGICAL_ERROR is reserved for impossible/programmer-bug states; ClickHouse stress checks treat any LOGICAL_ERROR as a critical failure.

Root cause: src/Storages/ObjectStorage/Utils.cpp:175 — the duplicate-key branch was newly added by this PR (introduced inside the new parseKeyValueArguments helper) and uses ErrorCodes::LOGICAL_ERROR for an entirely user-controlled input condition. It should use ErrorCodes::BAD_ARGUMENTS like the other validation throws in the same function (Utils.cpp:158, :168).

Why we believe this is a bug: User SQL → S3StorageParsedArguments::fromAST (src/Storages/ObjectStorage/S3/Configuration.cpp:374) → parseKeyValueArguments (src/Storages/ObjectStorage/Utils.cpp:144) → key_value_args.emplace(arg_name, arg_value).second == false on duplicate → throw Exception(ErrorCodes::LOGICAL_ERROR, ...) at Utils.cpp:175.

Affected locations:

  • src/Storages/ObjectStorage/Utils.cpp:175 — throw Exception(ErrorCodes::LOGICAL_ERROR, "Duplicate key value argument: {}", arg_name) inside parseKeyValueArguments

Impact: User-controlled query produces a LOGICAL_ERROR. This (1) misleads users into filing bug reports against ClickHouse for what is just a typo, (2) is escalated by stress / fuzzer / CI checks which treat any LOGICAL_ERROR as a regression, and (3) violates the documented contract that LOGICAL_ERROR is only thrown for invariant violations, never for user input. Reachable via any caller of S3StorageParsedArguments::fromAST or parseFromDisk (S3 disk).

Does it reproduce on most recent release?

Yes — confirmed on current master (commit ac12688dd9e).

How to reproduce

-- Test: duplicate key-value arguments in s3() table function should throw BAD_ARGUMENTS, not LOGICAL_ERROR.
-- Covers: src/Storages/ObjectStorage/Utils.cpp:175 — parseKeyValueArguments throws LOGICAL_ERROR on duplicate keys, but duplicates come from user input, so should be BAD_ARGUMENTS.

SELECT * FROM s3('http://localhost:1234/foo', format = 'CSV', format = 'TSV'); -- { serverError BAD_ARGUMENTS }
SELECT * FROM s3('http://localhost:1234/foo', secret_access_key = 'a', secret_access_key = 'b'); -- { serverError BAD_ARGUMENTS }
SELECT * FROM s3('http://localhost:1234/foo', 'CSV', structure = 'a Int32', structure = 'b String'); -- { serverError BAD_ARGUMENTS }

Try it on ClickHouse Fiddle

Expected behavior

(empty — three statements should each raise serverError BAD_ARGUMENTS and the test should pass with empty .reference)

Error message and/or stacktrace

Expected server error code: 36 but got: 49 (query: SELECT * FROM s3('http://localhost:1234/foo', format = 'CSV', format = 'TSV'); -- { serverError BAD_ARGUMENTS }).
Received exception from server (version 26.5.1):
Code: 49. DB::Exception: Received from 127.0.0.1:19010. DB::Exception: Duplicate key value argument: format. (LOGICAL_ERROR)

Additional context

Open risks:

  • Same parseKeyValueArguments is also called from parseFromDisk (Utils.cpp:214) used by S3-on-disk table functions, so the bad error code surfaces there too.
  • Field::safeGet<bool> in getFromPositionOrKeyValue throws BAD_GET (code 170) when a KV value has the wrong field type (e.g. partition_columns_in_data_file = 'true'); same convention violation but separate finding.

Suggested fix: Change ErrorCodes::LOGICAL_ERROR to ErrorCodes::BAD_ARGUMENTS at src/Storages/ObjectStorage/Utils.cpp:175. Same convention as the surrounding BAD_ARGUMENTS throws at lines 158 and 168.

Analysis details: Confidence HIGH | Severity P2 | Testability: STATELESS_SQL

Found during automated review of PR #85134.


ClickGapAI · Confidence: HIGH · Severity: P2 · Finding: h_pr85134_001

Metadata

Metadata

Assignees

No one assigned

    Labels

    No labels
    No labels

    Type

    No type
    No fields configured for issues without a type.

    Projects

    No projects

    Milestone

    No milestone

    Relationships

    None yet

    Development

    No branches or pull requests

    Issue actions