**File:** `.claude/skills/add-cel-feature/SKILL.md`
---
name: add-cel-feature
description: Adds new CEL operator, function, macro, or comprehension support to cel2sql across all SQL dialects, touching cel2sql.go visit*, the dialect.Dialect interface, every dialect/<name>/dialect.go, and testcases/<category>_tests.go. Use when wiring a new CEL function (e.g., string.toLowerCase, timestamp.toISOString), a new comprehension form, a new operator overload, or extending the converter to recognise a CEL extension package.
---

# Add CEL Feature

Adding a new piece of CEL surface area (function, operator, macro, comprehension) is the second-most-common contribution shape after adding a dialect (5+ feature PRs in recent history: string-extension functions, comprehensions, multi-dimensional arrays, regex, get/setDayOfWeek). The work has the same shape every time; this skill captures it.

## Quick start

1. **Classify the feature** — function call, operator, macro (like `has()`), or comprehension form (like `all`/`exists`/`filter`/`map`).
2. **Identify the converter file** that owns the surface area — see [references/converter-file-map.md](references/converter-file-map.md).
3. **Decide whether a new `dialect.Dialect` method is needed** — see "New method or inline?" below.
4. **If a new method is needed**, walk [references/dialect-method-checklist.md](references/dialect-method-checklist.md) — the new method must land in `dialect/dialect.go` plus all six dialect packages.
5. **Add a test case** to the appropriate `testcases/<category>_tests.go` with a `WantSQL` entry for every dialect (use the gen-script from `add-sql-dialect` if outputs diverge significantly).
6. **Run** `python .claude/skills/add-sql-dialect/scripts/check_testcase_coverage.py <each-dialect>` to confirm coverage is still complete.

## New method or inline?

The `dialect.Dialect` interface is already large (~47 methods). Add a new method only when dialects genuinely diverge; otherwise inline the SQL in the visitor.

| Situation | Verdict |
|---|---|
| The SQL is identical across all six dialects | Inline in the visitor (e.g., `con.str.WriteString("MOD(")` for `%`). |
| Any dialect needs a different syntax | New `dialect.Dialect` method. |
| Some dialects can't express the feature at all | New method + each unsupported dialect returns `dialect.ErrUnsupportedFeature`. |
| The feature is JSON-related, regex-related, or array-related | Look for an existing `Write…` method first — JSON/array/regex method coverage is broad. |

If you add a method, the compile-time `var _ dialect.Dialect = (*Dialect)(nil)` assertion at the top of every dialect file will catch missing implementations. The build won't pass until all six dialects are updated.
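The assertion pattern can be sketched in miniature (hypothetical, trimmed names — the real interface lives in `dialect/dialect.go` and has ~47 methods):

```go
package main

import "fmt"

// Hypothetical, trimmed stand-in for the real dialect.Dialect interface;
// one method is enough to show the pattern.
type Dialect interface {
	QuoteIdentifier(name string) string
}

type MySQL struct{}

func (MySQL) QuoteIdentifier(name string) string { return "`" + name + "`" }

// Compile-time assertion: the build fails on this line if MySQL is
// missing any method of the interface.
var _ Dialect = MySQL{}

func main() {
	fmt.Println(MySQL{}.QuoteIdentifier("user"))
}
```

Because the assertion is a package-level declaration, `go build ./...` reports every missing method without any test needing to run.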

## Where features live

For example: adding `string.toLowerCase()`. CEL parses this as a method call on a string value. The visitor entry is `visitCall` in `cel2sql.go`. CEL's standard string functions are dispatched by overload ID — see how `startsWith`/`endsWith`/`contains` are wired and follow the same pattern. The output SQL (`LOWER(expr)`) is identical across all dialects, so this is a candidate for inline conversion (no new `Dialect` method).
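An inline branch of that kind can be sketched as follows (hypothetical and simplified: `con.str` stands in for the converter's output builder, and the method name is illustrative, not the real visitor's):

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical, simplified converter: the visitor emits LOWER(...)
// directly because the SQL is identical in every dialect, so no
// Dialect method is needed.
type converter struct {
	str strings.Builder
}

func (con *converter) visitToLowerCall(writeTarget func() error) error {
	con.str.WriteString("LOWER(")
	if err := writeTarget(); err != nil {
		return err
	}
	con.str.WriteString(")")
	return nil
}

func main() {
	con := &converter{}
	_ = con.visitToLowerCall(func() error {
		con.str.WriteString("name")
		return nil
	})
	fmt.Println(con.str.String()) // prints LOWER(name)
}
```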

For another example: adding `timestamp.toUnixSeconds()`. The output diverges (`EXTRACT(EPOCH FROM e)::bigint` on PostgreSQL, `UNIX_TIMESTAMP(e)` on MySQL, etc.), so add a new `Dialect.WriteEpochExtract(w, writeExpr)` method. `cel2sql.go` stays generic; the per-dialect SQL goes in `dialect/<name>/dialect.go`.

For the file-by-file map, see [references/converter-file-map.md](references/converter-file-map.md).

## CEL environment registration

For a new function, the CEL environment must declare it before the converter can compile expressions that use it. Check if it's already registered in:

- `testutil/env.go` `NewDefaultEnv` / `NewTimestampEnv` — for the test environments.
- The user's CEL environment (out of scope for this skill).

If you register a function only in the test environment, mention it in the user-facing documentation so consumers know to register it in their own CEL environment.

## Testcase coverage

Add a representative test to `testcases/<category>_tests.go` (or the appropriate category file). The test must have a `WantSQL[dialect.<Name>]` entry for every dialect, or a `SkipDialect[dialect.<Name>]` reason for any that legitimately can't express the feature.

If the SQL output diverges significantly across dialects, use the `gen_expected_sql.go.tmpl` template from the `add-sql-dialect` skill to generate the actual outputs in bulk:

**Run** `cp .claude/skills/add-sql-dialect/templates/gen_expected_sql.go.tmpl testutil/gen_<dialect>_test.go` for each dialect, transcribe outputs, delete the generators.

After the test case is in place, **run** the coverage check for every dialect:

```sh
for d in postgresql mysql sqlite duckdb bigquery spark; do
python .claude/skills/add-sql-dialect/scripts/check_testcase_coverage.py "$d"
done
```

A new test case missing a single dialect's `WantSQL` entry silently skips for that dialect — an easy regression to ship without this check.

## Resources

- [references/converter-file-map.md](references/converter-file-map.md) — which converter file owns which CEL surface area.
- [references/dialect-method-checklist.md](references/dialect-method-checklist.md) — checklist when adding a method to the `Dialect` interface.
- **Run** `python .claude/skills/add-sql-dialect/scripts/check_testcase_coverage.py <dialect>` — verifies test coverage; reuse the script across this skill and `add-sql-dialect`.
**File:** `.claude/skills/add-cel-feature/references/converter-file-map.md`
# Converter File Map

Where each CEL surface area lives in the converter. Use this to find the visitor that needs updating when adding a feature.

## Contents

- Top-level dispatch
- Operators and special-cased operators
- Comprehensions
- JSON / JSONB
- Timestamps and durations
- Index analysis
- Where it doesn't go

## Top-level dispatch

| File | Owns |
|---|---|
| `cel2sql.go` | The `converter` struct, `Convert` / `ConvertParameterized` entry points, the `visit` dispatch (`visitCall`, `visitSelect`, `visitIdent`, `visitList`, `visitConst`, `visitStruct`, `visitComprehension`), parameter-placeholder writing, identifier/struct-field validation, recursion-depth and output-length guards. |

`visitCall` is the dispatch hub for nearly every CEL function and operator. New function support starts here, almost always by adding a new branch to the overload-ID switch (search for `overloads.Add`, `overloads.Contains`, etc., to see existing wiring).

## Operators and special-cased operators

| File | Owns |
|---|---|
| `operators.go` | `standardSQLBinaryOperators` map (CEL operator → SQL token), reverse lookup, helper predicates like `isStringLiteral`, `isNullLiteral`, `isBoolLiteral`, `isListType`, `isNumericComparison`, `isFieldAccessExpression`. |
| `cel2sql.go` `visitCallBinary` | Binary operator emission with special cases (`||` for string/list concat, `IS`/`IS NOT` for null/bool comparisons, JSON-text numeric coercion via `WriteCastToNumeric`, IN-list dispatch to `WriteArrayMembership` or `WriteJSONArrayMembership`). |

Adding a new operator overload (e.g., timestamp + duration) usually requires updates here.

## Comprehensions

| File | Owns |
|---|---|
| `comprehensions.go` | The `ComprehensionInfo` struct, `identifyComprehension`, comprehension-shape predicates. |
| `cel2sql.go` `visitComprehension`, `visitAllComprehension`, `visitExistsComprehension`, `visitExistsOneComprehension`, `visitFilterComprehension`, `visitMapComprehension`, `writeComprehensionSource` | Per-shape SQL emission. |

The comprehension source is dispatched to either `dialect.WriteUnnest` (native arrays) or `dialect.WriteJSONArrayElements` (JSON arrays).

## JSON / JSONB

| File | Owns |
|---|---|
| `json.go` | JSON-path detection (`shouldUseJSONPath`, `hasJSONFieldInChain`), JSON-variable handling (`isJSONVariable` for `WithJSONVariables`), `has()` macro emission, JSON path construction (`buildJSONPath`, `buildJSONPathInternal`, `buildJSONPathForArray`), JSON field name escaping. |

The detection chain is: schema-driven (via `isFieldJSON`) → JSON-variable (via `WithJSONVariables`) → fallback to chain-walk (`hasJSONFieldInChain`). Adding new JSON behaviour usually plugs into `shouldUseJSONPath` plus a new `dialect.Dialect` method.
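The three-stage fallback can be sketched as follows (hypothetical and simplified: the map arguments stand in for the real schema lookup, `WithJSONVariables` set, and select chain):

```go
package main

import "fmt"

// Hypothetical, simplified version of the detection chain described
// above: schema lookup first, then declared JSON variables, then a
// walk over the field-access chain.
func shouldUseJSONPath(field string, isFieldJSON, jsonVars map[string]bool, chain []string) bool {
	if isFieldJSON[field] { // 1. schema-driven
		return true
	}
	if len(chain) > 0 && jsonVars[chain[0]] { // 2. WithJSONVariables
		return true
	}
	for _, f := range chain { // 3. fallback chain walk
		if isFieldJSON[f] {
			return true
		}
	}
	return false
}

func main() {
	schema := map[string]bool{"metadata": true}
	vars := map[string]bool{"payload": true}
	fmt.Println(shouldUseJSONPath("metadata", schema, vars, nil))                     // true
	fmt.Println(shouldUseJSONPath("name", schema, vars, []string{"payload", "name"})) // true
	fmt.Println(shouldUseJSONPath("age", schema, vars, []string{"user", "age"}))      // false
}
```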

## Timestamps and durations

| File | Owns |
|---|---|
| `timestamps.go` | Custom CEL types for `DATE` / `TIME` / `DATETIME` / `INTERVAL`, duration parsing, EXTRACT helpers, timestamp arithmetic dispatch. |

Adding a new timestamp accessor (e.g., `getQuarter()`) goes here. Day-of-week conversion is handled per-dialect via `Dialect.WriteExtract` (CEL convention: Sunday=0).

## Index analysis

| File | Owns |
|---|---|
| `analysis.go` | `AnalyzeQuery` entry point; pattern detection (`PatternComparison`, `PatternJSONAccess`, `PatternRegexMatch`, `PatternArrayMembership`, `PatternArrayComprehension`, `PatternJSONArrayComprehension`); pattern → DDL delegation to `dialect.IndexAdvisor`. |

Adding a new pattern type means: extend `analysis.go` to detect it, add a `PatternType` constant in `dialect/index_advisor.go`, then implement the new pattern in each dialect's `index_advisor.go`.

## Where it doesn't go

- Per-dialect SQL syntax → `dialect/<name>/dialect.go`. The converter is dialect-agnostic.
- Reserved keywords / identifier rules → `dialect/<name>/validation.go`.
- Regex pattern translation (RE2 → engine-native) → `dialect/<name>/regex.go`.
- Type provider / database-introspection → `<name>/provider.go`.
**File:** `.claude/skills/add-cel-feature/references/dialect-method-checklist.md`
# Dialect Method Checklist

Steps when adding a new method to the `dialect.Dialect` interface. Skip the new-method route entirely if the SQL output is identical across all six dialects (just inline in the converter).

## Contents

- Step-by-step
- Compile-time safety net
- Naming conventions
- Error returns
- Worked example: WriteUnixSeconds

## Step-by-step

1. **Declare the method on the interface** in `dialect/dialect.go`. Group it with related methods (Literals / Operators / Type Casting / Arrays / JSON / Timestamps / String Functions / Comprehensions / Struct / Validation / Regex / Capabilities). Add a one-line doc comment naming the canonical PostgreSQL form ("For PostgreSQL: …") to anchor reviewers.
2. **Implement in all six dialect packages.** The canonical six are:
- `dialect/postgres/dialect.go`
- `dialect/mysql/dialect.go`
- `dialect/sqlite/dialect.go`
- `dialect/duckdb/dialect.go`
- `dialect/bigquery/dialect.go`
- `dialect/spark/dialect.go`
3. **Verify the compile-time assertion fires.** Each dialect file contains `var _ dialect.Dialect = (*Dialect)(nil)` — running `go build ./...` will fail if any dialect is missing the new method.
4. **Wire the method into the converter** in `cel2sql.go` (or whichever visitor file owns the surface area — see the converter-file-map reference). The converter calls `con.dialect.<NewMethod>(...)`.
5. **Add a representative test case** to `testcases/<category>_tests.go`. Set `WantSQL[dialect.<Name>]` for every dialect or document a `SkipDialect[dialect.<Name>]` reason.
6. **Run the testcase coverage check** for every dialect:
```sh
for d in postgresql mysql sqlite duckdb bigquery spark; do
python .claude/skills/add-sql-dialect/scripts/check_testcase_coverage.py "$d"
done
```

## Compile-time safety net

The `var _ dialect.Dialect = (*Dialect)(nil)` line at the top of every dialect file is critical. It's the only mechanism that guarantees all dialects implement every method. If you add a new dialect, add this line first.

## Naming conventions

Methods on the `Dialect` interface that emit SQL fragments are prefixed `Write…`. They take a `*strings.Builder` (the converter's output buffer) plus callbacks `func() error` for any sub-expressions the dialect needs to interleave. They return `error` so dialects that don't support a feature can return `dialect.ErrUnsupportedFeature`.

Capability check methods are `Supports…` and return `bool`.

## Error returns

`dialect.ErrUnsupportedFeature` is the sentinel for "this dialect cannot express this CEL construct." Wrap it with `fmt.Errorf("%w: <reason>", dialect.ErrUnsupportedFeature, ...)` so callers can `errors.Is()` it.

Examples already in the codebase:
- `dialect/sqlite/dialect.go` `WriteRegexMatch` returns `ErrUnsupportedFeature` (SQLite has no native regex).
- `dialect/spark/dialect.go` `WriteArrayLength` returns `ErrUnsupportedFeature` for `dimension > 1` (Spark doesn't support multi-dim arrays).

## Worked example: WriteEpochExtract

Suppose CEL's `int(timestamp)` should emit a per-dialect Unix-seconds expression.

1. **Interface** (`dialect/dialect.go`):
```go
// WriteEpochExtract writes extraction of epoch from a timestamp.
// For PostgreSQL: EXTRACT(EPOCH FROM expr)::bigint.
WriteEpochExtract(w *strings.Builder, writeExpr func() error) error
```
2. **PostgreSQL** (`dialect/postgres/dialect.go`):
```go
func (d *Dialect) WriteEpochExtract(w *strings.Builder, writeExpr func() error) error {
	w.WriteString("EXTRACT(EPOCH FROM ")
	if err := writeExpr(); err != nil {
		return err
	}
	w.WriteString(")::bigint")
	return nil
}
```
3. **MySQL**: `UNIX_TIMESTAMP(<expr>)`.
4. **SQLite**: `CAST(strftime('%s', <expr>) AS INTEGER)`.
5. **DuckDB**: `EXTRACT(EPOCH FROM <expr>)::BIGINT`.
6. **BigQuery**: `UNIX_SECONDS(<expr>)`.
7. **Spark**: `UNIX_TIMESTAMP(<expr>)`.

(All six already implement this method — it's a real example from the codebase.)

8. **Test case** in `testcases/cast_tests.go`:
```go
{
Name: "cast_int_epoch",
CELExpr: `int(created_at)`,
Category: CategoryCast,
WantSQL: map[dialect.Name]string{
dialect.PostgreSQL: "EXTRACT(EPOCH FROM created_at)::bigint",
dialect.MySQL: "UNIX_TIMESTAMP(created_at)",
dialect.SQLite: "CAST(strftime('%s', created_at) AS INTEGER)",
dialect.DuckDB: "EXTRACT(EPOCH FROM created_at)::BIGINT",
dialect.BigQuery: "UNIX_SECONDS(created_at)",
dialect.Spark: "UNIX_TIMESTAMP(created_at)",
},
},
```