fix: pass elem writer to JSON array membership dialect methods by richardwooding · Pull Request #22 · SPANDigital/pycel2sql

richardwooding · 2026-06-09T10:13:53Z

Summary

Ports cel2sql4j PR #20 (commit 1835215) into pycel2sql. Resolves the single port candidate flagged by three weekly upstream-scan issues: #21, #20, #18.

The two JSON-array membership dialect methods (write_json_array_membership / write_nested_json_array_membership) previously received only the array writer, forcing the converter to emit elem = <dialect output> inline. Worse, _visit_in never routed to them at all — they were dead code, and x in <json array field> fell through to plain write_array_membership. The inline elem = form was broken for every dialect except PostgreSQL.
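A minimal sketch of why the element writer has to reach the dialect: the element lands in a different position, with a different operator, in each SQL form. The class names, the `Writer` alias, the `render` helper and the exact parameter order are assumptions made for illustration; only the method name and the emitted SQL shapes come from this PR.

```python
from io import StringIO
from typing import Callable

# Hypothetical type: a callback that writes one SQL fragment into the buffer.
Writer = Callable[[], object]


class SketchBigQuery:
    """Illustrative stand-in, not pycel2sql's real dialect class."""

    def write_json_array_membership(
        self, out: StringIO, write_elem: Writer, write_array: Writer
    ) -> None:
        # The element leads and the operator is IN, not "=".
        write_elem()
        out.write(" IN UNNEST(JSON_VALUE_ARRAY(")
        write_array()
        out.write("))")


class SketchSQLite:
    """Illustrative stand-in, not pycel2sql's real dialect class."""

    def write_json_array_membership(
        self, out: StringIO, write_elem: Writer, write_array: Writer
    ) -> None:
        # The element sits inside the subquery's WHERE clause.
        out.write("EXISTS (SELECT 1 FROM json_each(")
        write_array()
        out.write(") WHERE value = ")
        write_elem()
        out.write(")")


def render(dialect, elem_sql: str, array_sql: str) -> str:
    """Hypothetical helper: run the hook with fixed fragments and return the SQL."""
    out = StringIO()
    dialect.write_json_array_membership(
        out, lambda: out.write(elem_sql), lambda: out.write(array_sql)
    )
    return out.getvalue()


print(render(SketchBigQuery(), "'x'", "t.tags"))
print(render(SketchSQLite(), "'x'", "t.tags"))
```

A converter that prepends `elem = ` itself can only produce the first position with the `=` operator, which is why handing both writers to the dialect is the simpler contract.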

Changes

- dialect/_base.py: widen both ABC methods to take write_elem alongside write_array; each dialect now owns the full boolean predicate.
- 6 dialects: emit correct membership SQL (see table).
- _converter.py: _visit_in now detects a JSON-array RHS (schema-declared JSON field, nested JSON access, or flat json_variables path) and routes to the JSON-membership hooks. A direct table.field carries JSONB-ness via json_func; a deeper chain uses the nested form. Non-JSON RHS still falls back to write_array_membership.
- Tests: replace the two Spark "raises" tests with positive assertions; add cross-dialect converter tests in test_dialect_parametrized.py (direct JSONB/JSON field, nested access, and a non-JSON regression guard).
- Docs: update CLAUDE.md conventions and the Spark dialect-differences line.
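The routing rule for `_visit_in` can be pictured roughly as below. This is a sketch under stated assumptions, not the converter's code: `RhsInfo`, its two fields and `pick_membership_hook` are hypothetical, and how the real converter classifies a flat json_variables path is not modelled. Only the three hook names are taken from the PR.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RhsInfo:
    """Hypothetical summary of the right-hand side of `x in <rhs>`."""

    is_json: bool    # assumed flag: the RHS resolves to a JSON array
    path_depth: int  # assumed: 1 for a direct table.field, more for a deeper chain


def pick_membership_hook(rhs: RhsInfo) -> str:
    """Return the name of the dialect hook the sketch would route to."""
    if not rhs.is_json:
        # Non-JSON arrays keep the existing behaviour.
        return "write_array_membership"
    if rhs.path_depth > 1:
        # A deeper access chain uses the nested form.
        return "write_nested_json_array_membership"
    # A direct table.field uses the flat JSON hook.
    return "write_json_array_membership"


print(pick_membership_hook(RhsInfo(is_json=True, path_depth=1)))
print(pick_membership_hook(RhsInfo(is_json=True, path_depth=3)))
print(pick_membership_hook(RhsInfo(is_json=False, path_depth=1)))
```

Checking the non-JSON case first mirrors the regression guard in the tests: anything that is not positively identified as JSON keeps its old output.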

Per-dialect output (`"x" in t.tags`, `t.tags` a JSON array)

| Dialect | Before (broken) | After |
| --- | --- | --- |
| PostgreSQL | `ANY(ARRAY(SELECT jsonFunc(arr)))` (accidental) | `'x' = ANY(ARRAY(SELECT jsonb_array_elements_text(t.tags)))` |
| BigQuery | `= UNNEST(...)` ❌ invalid | `'x' IN UNNEST(JSON_VALUE_ARRAY(t.tags))` |
| DuckDB | scalar subquery (last row) ❌ | `EXISTS (SELECT 1 FROM json_each(t.tags) WHERE value = 'x')` |
| SQLite | scalar subquery (last row) ❌ | `EXISTS (SELECT 1 FROM json_each(t.tags) WHERE value = 'x')` |
| MySQL | compares elem to 0/1 ❌ | `JSON_OVERLAPS(JSON_ARRAY('x'), t.tags)` |
| Spark | raised `UnsupportedDialectFeatureError` ❌ | `array_contains(from_json(t.tags, 'ARRAY<STRING>'), 'x')` |
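The SQLite "After" form can be tried directly with Python's standard `sqlite3` module. This is only a sanity-check sketch: the table `t`, its rows and the surrounding `SELECT` are made up for illustration, and it assumes the local SQLite build ships the JSON functions (`json_each`), which is the default in recent builds.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, tags TEXT)")
conn.executemany(
    "INSERT INTO t (id, tags) VALUES (?, ?)",
    [
        (1, '["a", "x"]'),       # 'x' is the last element
        (2, '["a", "b"]'),       # no 'x' at all
        (3, '["x", "y", "z"]'),  # 'x' is the first element
    ],
)

# The predicate is the "After" column for SQLite, used as a WHERE clause.
rows = conn.execute(
    "SELECT id FROM t "
    "WHERE EXISTS (SELECT 1 FROM json_each(t.tags) WHERE value = 'x') "
    "ORDER BY id"
).fetchall()

matching_ids = [row[0] for row in rows]
print(matching_ids)  # [1, 3]
conn.close()
```

Rows 1 and 3 match regardless of where `'x'` sits in the array, which is the property a per-row `EXISTS` gives that a single-row scalar comparison does not.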

Verification

- `uv run ruff check src/ tests/`: clean
- `uv run pytest tests/ --ignore=tests/integration`: 728 passed (19 new)
- mypy: only the pre-existing/expected bare-Tree [type-arg] errors (gated continue-on-error in CI)

Closes #21
Closes #20
Closes #18

🤖 Generated with Claude Code

Port of cel2sql4j PR #20 (commit 1835215).

The two JSON-array membership dialect methods previously received only the array writer, so the converter had to emit `elem = <dialect output>` inline, and `_visit_in` never even routed to them (dead code). The inline form was broken for every dialect except PostgreSQL:

- BigQuery: `= UNNEST(...)` is invalid; needs `IN UNNEST(...)`.
- DuckDB/SQLite: scalar subquery returns the last row for multi-element arrays.
- MySQL: compared the element to JSON_CONTAINS's 0/1 result.
- Spark: raised at conversion time (couldn't build a predicate without elem).

Widen `write_json_array_membership` / `write_nested_json_array_membership` to take a `write_elem` writer alongside `write_array`; each dialect now owns the full boolean predicate. Wire `_visit_in` to detect a JSON-array RHS (schema JSON field, nested JSON access, or flat json_variable path) and route to the JSON-membership hooks; non-JSON RHS still falls back to write_array_membership.

Per-dialect output:

- PostgreSQL: elem = ANY(ARRAY(SELECT <json_func>(arr)))
- MySQL: JSON_OVERLAPS(JSON_ARRAY(elem), arr)
- SQLite: EXISTS (SELECT 1 FROM json_each(arr) WHERE value = elem)
- DuckDB: EXISTS (SELECT 1 FROM json_each(arr) WHERE value = elem)
- BigQuery: elem IN UNNEST(JSON_VALUE_ARRAY(arr))
- Spark: array_contains(from_json(arr, 'ARRAY<STRING>'), elem)

Replace the two Spark "raises" tests with positive assertions and add cross-dialect converter tests (direct JSONB/JSON field, nested access, and a non-JSON regression guard). Update CLAUDE.md conventions + Spark dialect notes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>

richardwooding merged commit 70ff849 into main Jun 9, 2026
7 checks passed

richardwooding deleted the fix/json-array-membership-elem-writer branch June 9, 2026 10:18
