Skip to content

feat: backport features and fix from observeinc/cel2sql fork#113

Merged
richardwooding merged 6 commits intomainfrom
backport/observe-features
Apr 27, 2026
Merged

feat: backport features and fix from observeinc/cel2sql fork#113
richardwooding merged 6 commits intomainfrom
backport/observe-features

Conversation

@richardwooding
Copy link
Copy Markdown
Contributor

Summary

Backports four commits from the observeinc/cel2sql fork — three additive options plus one bug fix. Skips the fork's module-path rename commit. All commits attribute the original author via Co-Authored-By.

  • WithJSONVariables(vars ...string) — declare flat JSONB variables; tags.color (or tags["color"]) emits tags->>'color'. Previously cel2sql only recognized JSONB through the two-level table.jsonColumn.key pattern.
  • WithColumnAliases(map[string]string) — map CEL variable names to different SQL columns (e.g., nameusr_name). Aliases are validated through the dialect's field-name validator. Combines with WithJSONVariables.
  • WithParamStartIndex(int) — first placeholder index for ConvertParameterized (e.g., $5, $6, ...). Useful when embedding the CEL fragment inside a larger parameterized query. Values < 1 clamp to 1.
  • BREAKING fix: removed the name-based numeric-cast heuristic from visitIdent. Identifiers named score/value/num/amount/count/level are no longer auto-cast to ::numeric — the heuristic was a footgun that misfired on plain (non-JSON) variables sharing those names. JSON text-extraction comparisons in visitCall are unaffected.

Notes

  • Dropped the fork's unused markJSONIterVar/unmarkJSONIterVar helpers per the project's "no scaffolding for hypothetical futures" guidance — the empty jsonIterVars map already produces the intended "heuristic disabled" behavior.
  • Added 5 tests for WithParamStartIndex (the fork's PR shipped without any).
  • All Co-Authored-By: trailers attribute david.lotfi <david.lotfi@observeinc.com>.

Test plan

  • golangci-lint run — 0 issues
  • go test -short -race ./... — all unit tests pass (15 new, all existing). TestPostgreSQL17Compatibility fails locally due to a missing Docker daemon; this is pre-existing on main and unrelated.
  • CI green
  • make ci on a Docker-enabled host (full integration suite)

Commits

  • feat: add WithJSONVariables option for flat JSONB variable support
  • feat: add WithColumnAliases option for CEL-to-SQL column name mapping
  • fix: remove name-based numeric casting heuristic from visitIdent
  • feat: add WithParamStartIndex for embedding CEL SQL in larger queries
  • docs: document backported observe options + breaking change

🤖 Generated with Claude Code

richardwooding and others added 5 commits April 26, 2026 16:25
Add ConvertOption WithJSONVariables(...string) that declares CEL variable
names corresponding to flat JSONB columns. When marked, both dot notation
(context.host) and bracket notation (context["host"]) emit PostgreSQL JSONB
text-extraction operators (e.g. context->>'host') instead of plain dot
notation.

Previously cel2sql only recognized JSONB fields through the
table.jsonColumn.key pattern (two levels of select expression). Flat CEL
variables typed as maps could not produce JSONB operators.

Backported from observeinc/cel2sql fork (commit 6871574).

Co-Authored-By: david.lotfi <david.lotfi@observeinc.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add ConvertOption WithColumnAliases(map[string]string) that maps CEL
variable names to SQL column names. When a CEL identifier matches a key
in the alias map, the SQL output uses the mapped name instead. This is
useful when database columns use prefixed names (e.g., usr_name) while
the user-facing CEL expressions use clean names (e.g., name).

Works alongside WithJSONVariables: aliased JSON variables produce
correctly prefixed JSONB operators (e.g., tbl_tags->>'color'). The alias
values are validated for SQL injection safety using the dialect's field
name validator.

Backported from observeinc/cel2sql fork (commit da5ff38).

Co-Authored-By: david.lotfi <david.lotfi@observeinc.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
The needsNumericCasting function used a hardcoded list of "common
numeric iteration variable names" (score, value, num, amount, count,
level) to apply ::numeric casts in visitIdent. This caused incorrect
SQL when any CEL variable happened to share one of these names.

Replace the hardcoded list with a jsonIterVars tracking set on the
converter struct. needsNumericCasting now only returns true for
variables explicitly registered in this set. With no callers populating
it, the heuristic-based path is effectively disabled. The context-aware
numeric casting in visitCall already handles JSON text-extraction
comparisons correctly and is unaffected by this change.

BREAKING: any caller relying on the auto-cast for plain (non-JSON)
variables named score/value/num/amount/count/level must add an explicit
cast in CEL or rely on the visitCall JSON-extraction path.

Backported from observeinc/cel2sql fork PR #1 (commit c68ab70).

Co-Authored-By: obs-gh-davidlotfi <david.lotfi@observeinc.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add ConvertOption WithParamStartIndex(n) so ConvertParameterized can emit
placeholders starting at $n (e.g. $5, $6) instead of $1, $2. Callers that
embed the CEL fragment into a larger parameterized query can pass the
next free index and append result.SQL and result.Parameters without
renumbering. Values less than 1 are clamped to 1.

Backported from observeinc/cel2sql fork PR #2 (commit ae931be). Adds
test coverage that the fork did not include: default behavior, explicit
=1, =5 with multiple params, clamping for 0/negative values, and the
many-params case verifying placeholders count up correctly past 9.

Co-Authored-By: obs-gh-davidlotfi <david.lotfi@observeinc.com>
Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Add Unreleased entries for WithJSONVariables, WithColumnAliases, and
WithParamStartIndex (Added) plus the BREAKING removal of the name-based
numeric-cast heuristic (Changed). Extend the Available Options list in
README and add usage sections to CLAUDE.md's Common Patterns covering
each new option with code examples.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
Copy link
Copy Markdown

Copilot AI left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Pull request overview

Backports several features from the observeinc/cel2sql fork to extend CEL→SQL conversion configurability (flat JSONB variables, column aliasing, configurable parameter start index) and removes a brittle numeric-casting heuristic.

Changes:

  • Added WithJSONVariables, WithColumnAliases, and WithParamStartIndex options to the converter.
  • Updated JSON-path detection / numeric-cast heuristic behavior and added new unit tests for the new options.
  • Documented the new options and the breaking numeric-cast heuristic removal in project docs/changelog.

Reviewed changes

Copilot reviewed 7 out of 7 changed files in this pull request and generated 4 comments.

Show a summary per file
File Description
cel2sql.go Adds new conversion options (JSON variables, column aliases, param start index) and wires them into conversion and identifier rendering.
json.go Extends JSON-path detection to recognize flat JSON variables and removes name-based numeric-cast heuristics.
json_variables_test.go New tests covering WithJSONVariables and WithColumnAliases.
param_start_index_test.go New tests covering WithParamStartIndex.
README.md Documents the newly added options.
CLAUDE.md Adds extended documentation/examples for the new options.
CHANGELOG.md Records the new options and the breaking heuristic change.
Comments suppressed due to low confidence (1)

json.go:51

  • WithJSONVariables currently only triggers JSON-path generation when the immediate operand is an IdentExpr. For nested access like tags.corpus.section (where the operand is itself a SelectExpr), shouldUseJSONPath returns false, and the converter falls back to regular select rendering, producing invalid SQL such as tags->>'corpus'.section. Consider (1) treating any select chain rooted at a JSON variable as JSON-path eligible (e.g., extend shouldUseJSONPath/hasJSONFieldInChain to detect con.isJSONVariable at the root), and (2) updating buildJSONPathInternal to avoid the table.column shortcut when tableName is a JSON variable, so nested keys render as tags->'corpus'->>'section'. Adding a test for nested keys would prevent regressions.
// shouldUseJSONPath determines if we should use JSON path operators for field access.
// This function checks if the operand represents a JSON/JSONB field using schema
// information or the WithJSONVariables option for flat JSONB variables.
func (con *converter) shouldUseJSONPath(operand *exprpb.Expr, _ string) bool {
	// Check if the operand is a flat variable declared as JSONB via WithJSONVariables
	if identExpr := operand.GetIdentExpr(); identExpr != nil {
		if con.isJSONVariable(identExpr.GetName()) {
			return true
		}
	}

	// Check if the operand is a direct table.column access where column is JSON
	if selectExpr := operand.GetSelectExpr(); selectExpr != nil {
		// For obj.metadata, check if metadata is a JSON column in obj table
		if tableName, fieldName, ok := con.getTableAndFieldFromSelectChain(operand); ok {
			// Use schema information to determine if this field is JSON
			isJSON := con.isFieldJSON(tableName, fieldName)
			con.logger.LogAttrs(context.Background(), slog.LevelDebug,
				"JSON path detection",
				slog.String("table", tableName),
				slog.String("field", fieldName),
				slog.Bool("is_json", isJSON),
			)
			return isJSON
		}

		// Check if there's a JSON field somewhere in the operand chain
		if con.hasJSONFieldInChain(operand) {
			con.logger.Debug("JSON field detected in select chain")
			return true
		}
	}

Comment thread cel2sql.go Outdated
Comment on lines 408 to 416
depth int // Current recursion depth
maxDepth int // Maximum allowed recursion depth
maxOutputLen int // Maximum allowed SQL output length
comprehensionDepth int // Current comprehension nesting depth
jsonIterVars map[string]bool // Iteration variables from JSON array comprehensions
parameterize bool // Enable parameterized output
parameters []any // Collected parameters for parameterized queries
paramCount int // Parameter counter for placeholders
}
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 16cc2ea. The cel2sql.go:416 dead-code observation was correct — jsonIterVars was never populated. Removed the field, the needsNumericCasting function, and the unreachable cast branch in visitIdent. The heuristic-disabled behavior is preserved by simply omitting the cast logic; the breaking-change CHANGELOG entry still applies.

Comment thread json_variables_test.go
Comment on lines +80 to +84
assert.Equal(t, tt.wantSQL, result.SQL)
if tt.wantArgs != nil {
assert.Equal(t, tt.wantArgs, result.Parameters)
}
})
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 16cc2ea. Both table-driven tests now always assert result.Parameters: when wantArgs is nil they use assert.Empty(...); otherwise they use assert.Equal(wantArgs, ...). Catches regressions where parameters might be unexpectedly collected for boolean / no-arg cases.

Comment thread json_variables_test.go

assert.Equal(t, tt.wantSQL, result.SQL)
if tt.wantArgs != nil {
assert.Equal(t, tt.wantArgs, result.Parameters)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Fixed in 16cc2ea — same pattern: assert.Empty when wantArgs == nil, assert.Equal otherwise. Applied via replace_all so both TestWithJSONVariables_* and TestWithColumnAliases table tests now assert empty-param cases explicitly.

Comment thread cel2sql.go Outdated
Comment on lines +1959 to +1962
if err := con.visit(m); err != nil {
return err
}
return con.dialect.WriteJSONFieldAccess(&con.str, func() error { return nil }, fieldName, true)
Copy link
Copy Markdown
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good catch — fixed in 16cc2ea. Replaced the pre-write + no-op-writeBase pattern with WriteJSONFieldAccess(&con.str, func() error { return con.visit(m) }, fieldName, true). PostgreSQL/DuckDB output is unchanged (operator-style); SQLite (json_extract(base, $.field)) and BigQuery (JSON_VALUE(base, $.field)) now correctly wrap the base inside the function call.

- visitCallMapIndex: use writeBase callback in WriteJSONFieldAccess instead
  of pre-writing the base. The fork's pre-write pattern broke wrapper-style
  dialects (SQLite json_extract, BigQuery JSON_VALUE) where the base must
  be inside the wrapper, not before it. PostgreSQL output is unchanged.

- Remove dead jsonIterVars field and needsNumericCasting function. With no
  callers populating the set, needsNumericCasting always returned false
  and the cast branch in visitIdent was unreachable. The behavior change
  (heuristic disabled) is preserved by simply omitting the cast logic.

- json_variables_test.go: always assert result.Parameters instead of
  skipping when wantArgs is nil. Catches regressions where parameters
  are unexpectedly collected.

- Fix nested JSON-variable access (tags.corpus.section). shouldUseJSONPath
  now treats any select chain rooted at a JSON variable as JSON-path
  eligible, and buildJSONPathInternal skips its table.column shortcut
  when the root is a JSON variable. Result: `tags->'corpus'->>'section'`
  instead of the previous broken `tags->>'corpus'.section`. Adds
  TestWithJSONVariables_NestedDotNotation covering 2- and 3-level depth.

Co-Authored-By: Claude Opus 4.7 (1M context) <noreply@anthropic.com>
@richardwooding
Copy link
Copy Markdown
Contributor Author

Pushed 16cc2ea addressing all four Copilot inline comments + the suppressed low-confidence comment about nested JSON-variable access.

Resolved:

  1. visitCallMapIndex now uses the writeBase callback correctly (works for SQLite json_extract and BigQuery JSON_VALUE).
  2. ✅ Dead jsonIterVars field and needsNumericCasting function removed; visitIdent simplified.
  3. ✅ Tests now always assert result.Parameters (assert.Empty when none expected, assert.Equal otherwise).
  4. ✅ Nested JSON-variable dot-access fixed: tags.corpus.sectiontags->'corpus'->>'section' (was producing invalid tags->>'corpus'.section). New TestWithJSONVariables_NestedDotNotation covers 2- and 3-level depth.

Status checks:

  • Go Vulnerability Check and OSV Security Scan failures are pre-existing on main (transitive Docker SDK vuln via testcontainers-go in test/example code, unrelated to this PR — Security workflow has been red since at least 2026-04-12). Out of scope for the backport; should be tracked separately.
  • All other checks green: lint, unit tests, fuzz, benchmarks, gosec.

@richardwooding richardwooding merged commit 92314f9 into main Apr 27, 2026
7 of 9 checks passed
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

2 participants