Conversation
Add a structured `codec` field on `ColumnDefinition` covering general-purpose
codecs (NONE/LZ4/LZ4HC/ZSTD/T64/GCD/ALP), preprocessors (Delta/DoubleDelta/
Gorilla/FPC), and a `codec.raw(...)` escape hatch. Rendering emits the CODEC
clause in CREATE/ALTER; planner detects codec-only removal and emits
`MODIFY COLUMN ... REMOVE CODEC`. ClickHouse introspection reads
`system.columns.compression_codec` and plugin-pull round-trips it back into
structured form. Canonicalization fills ClickHouse defaults so `{kind:'ZSTD'}`
and `{kind:'ZSTD', level:1}` compare equal.
Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Replace free-form typeArgs string with a discriminated union per index type (minmax, set, bloom_filter, tokenbf_v1, ngrambf_v1). Argument validation moves from runtime to the type system, and plugin-pull emits typed fields back into schema files. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
There was a problem hiding this comment.
💡 Codex Review
Here are some automated review suggestions for this pull request.
Reviewed commit: a42cb603e1
ℹ️ About Codex in GitHub
Your team has set up Codex to review pull requests in this repo. Reviews are triggered when you
- Open a pull request for review
- Mark a draft as ready
- Comment "@codex review".
If Codex has suggestions, it will comment; otherwise it will react with 👍.
Codex can also answer questions or update the PR. Try commenting "@codex address that feedback".
| export type ColumnCodec = GeneralColumnCodec | PreprocessingColumnCodec | RawColumnCodec | ||
|
|
||
| /** Single codec or a chain (preprocessors then exactly one general codec). */ | ||
| export type ColumnCodecSpec = ColumnCodec | ColumnCodec[] |
There was a problem hiding this comment.
Forbid empty codec chains in schema types
ColumnCodecSpec currently allows ColumnCodec[], which includes []; that lets callers pass codec: [], and renderCodec then emits CODEC() (invalid SQL) without any validation error. This weakens the “typed codec chain” guarantee and can break generate/migrate at runtime when codec arrays are built programmatically.
Useful? React with 👍 / 👎.
| const level = Number(args[0]) | ||
| if (!Number.isFinite(level)) return undefined | ||
| return { kind: 'ZSTD', level } |
There was a problem hiding this comment.
Fallback to raw when known codecs have extra args
The parser reads only the first argument for ZSTD and ignores any additional ones, so parseCodec('CODEC(ZSTD(3, x))') becomes { kind: 'ZSTD', level: 3 } and re-renders as CODEC(ZSTD(3)), silently dropping information. This breaks pull round-trips for future/extended codec signatures that reuse known names, and should be treated as an unknown shape (raw) instead of truncating.
Useful? React with 👍 / 👎.
ClickHouse requires the CODEC clause to follow DEFAULT and COMMENT in column definitions. The previous order produced invalid SQL whenever a column declared both a codec and a default, surfaced by the e2e EXPLAIN AST check "codec + DEFAULT combined". Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Addresses two Codex review findings: empty codec chains (`codec: []`) emitted invalid `CODEC()` SQL without validation error, and known codec tokens with extra args (e.g. `ZSTD(3, 1)`) were silently truncated on `chkit pull` instead of round-tripping through the raw fallback. Co-Authored-By: Claude Opus 4.7 <noreply@anthropic.com>
Summary
Two related breaking DSL enhancements that replace free-form string fields with structured discriminated unions, moving validation from runtime to the type system.
codec: { kind: 'ZSTD', level: 3 }or a chain).CREATE TABLEandALTERemit theCODEC(...)clause in the correct position,chkit generateemitsMODIFY COLUMN ... REMOVE CODECwhen a codec is dropped, andchkit pullround-trips structured codecs (unknown tokens fall back tocodec.raw(...)).SkipIndexDefinitionbecomes a discriminated union keyed ontype, with typed fields per variant (maxRowsforset,falsePositiveRateforbloom_filter,sizeBytes/hashFunctions/randomSeedfortokenbf_v1, plusngramSizeforngrambf_v1). The oldtypeArgsstring is gone; theindex_type_missing_argsruntime check is removed because it is now a compile-time concern.Both changes cascade major bumps to
@chkit/core,@chkit/clickhouse,@chkit/plugin-pull, andchkit; migration tables live in the respective changesets.Test plan
bun run typecheckcleanbun run testpasses (unit)bun run test:envpasses (e2e via Doppler — SQL validation and pull round-trip)🤖 Generated with Claude Code