Skip to content

fix: quote-aware store-type parser#29

Merged
alex-clickhouse merged 1 commit into
mainfrom
fix/quote-aware-store-type-parser
May 26, 2026
Merged

fix: quote-aware store-type parser#29
alex-clickhouse merged 1 commit into
mainfrom
fix/quote-aware-store-type-parser

Conversation

@alex-clickhouse
Copy link
Copy Markdown
Collaborator

Summary

ExtractInnerTypes (via SplitTopLevel) and FindMatchingCloseParen counted parens and top-level commas without tracking whether they were inside a single-quoted string literal. Enum value names containing parens or commas — e.g. Enum8(',' = 1, ';' = 2) or 'a)b' — would corrupt the depth counter and either return the wrong close-paren position or split the inner type list at the wrong place, causing nested types like Array(Enum8(...)) and Map(String, Enum8(...)) to fail to resolve.

Both helpers now track quote state and treat any paren or top-level comma inside a '...' literal as opaque content. ClickHouse's doubled-quote escape ('') for embedding a literal apostrophe inside a string is handled too, so Enum8('Bob''s' = 1) parses correctly.

The bug is pre-existing — #25's refactor moved the buggy split out of ExtractInnerTypes into a named SplitTopLevel helper but didn't change the logic.

Test plan

  • New FindMapping_NestedEnumWithSpecialChars_ResolvesWithoutCorruption theory covers 5 nested-type × special-char combinations (Array, Map, Tuple crossed with closing-paren, opening-paren, and comma in enum value names). Without the fix, 3 of the 5 cases fail with Assert.NotNull because the surrounding parameterized type fails to resolve; with it, all 5 pass.
  • Full unit suite (599/599) passes
  • Rebased onto current main (resolved one trivial conflict against Fix: Enum8 and AggregateFunction dropped #25's SplitTopLevel extraction)

🤖 Generated with Claude Code

ExtractInnerTypes and FindMatchingCloseParen counted parens and top-level
commas without tracking whether they were inside a single-quoted string
literal. Enum value names containing parens or commas (e.g.
Enum8(',' = 1, ';' = 2) or 'a)b') would corrupt the depth counter and
either return the wrong close-paren position or split the inner type list
at the wrong place, causing nested types like Array(Enum8(...)) and
Map(String, Enum8(...)) to fail to resolve.

Both helpers now track quote state and treat any paren/comma inside a
'...' literal as opaque content. ClickHouse's doubled-quote escape ('')
for embedding a literal apostrophe inside a string is also handled, so
Enum8('Bob''s' = 1) parses correctly.

Adds a parameterized regression test covering the surrounding-type shapes
(Array, Map, Tuple) crossed with the special-character patterns (closing
paren, opening paren, comma in enum value). Without the fix, three of the
five cases fail; with it, all five resolve correctly.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@codecov
Copy link
Copy Markdown

codecov Bot commented May 26, 2026

Codecov Report

❌ Patch coverage is 73.91304% with 6 lines in your changes missing coverage. Please review.

Files with missing lines Patch % Lines
...se/Storage/Internal/ClickHouseTypeMappingSource.cs 73.91% 4 Missing and 2 partials ⚠️

📢 Thoughts on this report? Let us know!

@alex-clickhouse alex-clickhouse merged commit ffe73ac into main May 26, 2026
2 of 3 checks passed
@alex-clickhouse alex-clickhouse deleted the fix/quote-aware-store-type-parser branch May 26, 2026 14:00
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

1 participant