fix: pass elem writer to JSON array membership dialect methods #22
Merged
Port of cel2sql4j PR #20 (commit 1835215).

The two JSON-array membership dialect methods previously received only the array writer, so the converter had to emit `elem = <dialect output>` inline, and `_visit_in` never even routed to them (dead code). The inline form was broken for every dialect except PostgreSQL:

- BigQuery: `= UNNEST(...)` is invalid; needs `IN UNNEST(...)`.
- DuckDB/SQLite: scalar subquery returns the last row for multi-element arrays.
- MySQL: compared the element to JSON_CONTAINS's 0/1 result.
- Spark: raised at conversion time (couldn't build a predicate without elem).

Widen `write_json_array_membership` / `write_nested_json_array_membership` to take a `write_elem` writer alongside `write_array`; each dialect now owns the full boolean predicate.

Wire `_visit_in` to detect a JSON-array RHS (schema JSON field, nested JSON access, or flat json_variable path) and route to the JSON-membership hooks; non-JSON RHS still falls back to write_array_membership.

Per-dialect output:

- PostgreSQL: `elem = ANY(ARRAY(SELECT <json_func>(arr)))`
- MySQL: `JSON_OVERLAPS(JSON_ARRAY(elem), arr)`
- SQLite: `EXISTS (SELECT 1 FROM json_each(arr) WHERE value = elem)`
- DuckDB: `EXISTS (SELECT 1 FROM json_each(arr) WHERE value = elem)`
- BigQuery: `elem IN UNNEST(JSON_VALUE_ARRAY(arr))`
- Spark: `array_contains(from_json(arr, 'ARRAY<STRING>'), elem)`

Replace the two Spark "raises" tests with positive assertions and add cross-dialect converter tests (direct JSONB/JSON field, nested access, and a non-JSON regression guard). Update CLAUDE.md conventions + Spark dialect notes.

Co-Authored-By: Claude Opus 4.8 (1M context) <noreply@anthropic.com>
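The widened hook described in the commit message can be illustrated with a small sketch. This is not the pycel2sql source: the class names, the `render` helper, and the simplified `(out, write_elem, write_array)` parameter list are assumptions made for illustration, and the real methods may take additional arguments (for example the JSON function name PostgreSQL needs). Only the emitted SQL shapes are taken from the per-dialect list above.

```python
from io import StringIO
from typing import Callable

# A writer appends one SQL fragment to the shared buffer when called.
# (Hypothetical alias; the real project may type this differently.)
Writer = Callable[[], object]


class BigQuerySketch:
    """Hypothetical stand-in for a dialect class."""

    def write_json_array_membership(
        self, out: StringIO, write_elem: Writer, write_array: Writer
    ) -> None:
        # elem IN UNNEST(JSON_VALUE_ARRAY(arr))
        write_elem()
        out.write(" IN UNNEST(JSON_VALUE_ARRAY(")
        write_array()
        out.write("))")


class SQLiteSketch:
    def write_json_array_membership(
        self, out: StringIO, write_elem: Writer, write_array: Writer
    ) -> None:
        # EXISTS (SELECT 1 FROM json_each(arr) WHERE value = elem)
        out.write("EXISTS (SELECT 1 FROM json_each(")
        write_array()
        out.write(") WHERE value = ")
        write_elem()
        out.write(")")


class MySQLSketch:
    def write_json_array_membership(
        self, out: StringIO, write_elem: Writer, write_array: Writer
    ) -> None:
        # JSON_OVERLAPS(JSON_ARRAY(elem), arr)
        out.write("JSON_OVERLAPS(JSON_ARRAY(")
        write_elem()
        out.write("), ")
        write_array()
        out.write(")")


class SparkSketch:
    def write_json_array_membership(
        self, out: StringIO, write_elem: Writer, write_array: Writer
    ) -> None:
        # array_contains(from_json(arr, 'ARRAY<STRING>'), elem)
        out.write("array_contains(from_json(")
        write_array()
        out.write(", 'ARRAY<STRING>'), ")
        write_elem()
        out.write(")")


def render(dialect, elem_sql: str, array_sql: str) -> str:
    """Hypothetical helper: play the converter's role with literal fragments."""
    out = StringIO()
    dialect.write_json_array_membership(
        out,
        lambda: out.write(elem_sql),
        lambda: out.write(array_sql),
    )
    return out.getvalue()


bigquery_sql = render(BigQuerySketch(), "'x'", "t.tags")
sqlite_sql = render(SQLiteSketch(), "'x'", "t.tags")
mysql_sql = render(MySQLSketch(), "'x'", "t.tags")
spark_sql = render(SparkSketch(), "'x'", "t.tags")
```

Because each dialect decides where the element fragment goes, the element can land before the array (BigQuery), after it (SQLite, Spark), or inside a wrapper call (MySQL). A converter that always prefixes `elem = ` cannot express those shapes, which is the motivation the commit message gives for passing `write_elem` through.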
Summary
Ports cel2sql4j PR #20 (commit 1835215) into pycel2sql. Resolves the single port candidate flagged by three weekly upstream-scan issues: #21, #20, #18.

The two JSON-array membership dialect methods (`write_json_array_membership` / `write_nested_json_array_membership`) previously received only the array writer, forcing the converter to emit `elem = <dialect output>` inline. Worse, `_visit_in` never routed to them at all: they were dead code, and `x in <json array field>` fell through to plain `write_array_membership`. The inline `elem =` form was broken for every dialect except PostgreSQL.

Changes

- `dialect/_base.py`: widen both ABC methods to take `write_elem` alongside `write_array`; each dialect now owns the full boolean predicate.
- `_converter.py`: `_visit_in` now detects a JSON-array RHS (schema-declared JSON field, nested JSON access, or flat `json_variables` path) and routes to the JSON-membership hooks. A direct `table.field` carries JSONB-ness via `json_func`; a deeper chain uses the nested form. Non-JSON RHS still falls back to `write_array_membership`.
- `test_dialect_parametrized.py`: cross-dialect converter tests (direct JSONB/JSON field, nested access, + non-JSON regression guard).
- `CLAUDE.md`: conventions and the Spark dialect-differences line.

Per-dialect output (`"x" in t.tags`, `t.tags` a JSON array)

| Dialect | Before | After |
| --- | --- | --- |
| PostgreSQL | `ANY(ARRAY(SELECT jsonFunc(arr)))` (accidental) | `'x' = ANY(ARRAY(SELECT jsonb_array_elements_text(t.tags)))` |
| BigQuery | `= UNNEST(...)` ❌ invalid | `'x' IN UNNEST(JSON_VALUE_ARRAY(t.tags))` |
| DuckDB | scalar subquery returns the last row for multi-element arrays | `EXISTS (SELECT 1 FROM json_each(t.tags) WHERE value = 'x')` |
| SQLite | scalar subquery returns the last row for multi-element arrays | `EXISTS (SELECT 1 FROM json_each(t.tags) WHERE value = 'x')` |
| MySQL | compared the element to `JSON_CONTAINS`'s 0/1 result | `JSON_OVERLAPS(JSON_ARRAY('x'), t.tags)` |
| Spark | `UnsupportedDialectFeatureError` ❌ | `array_contains(from_json(t.tags, 'ARRAY<STRING>'), 'x')` |

Verification

- `uv run ruff check src/ tests/`: clean
- `uv run pytest tests/ --ignore=tests/integration`: 728 passed (19 new)
- `Tree` `[type-arg]` errors (gated `continue-on-error` in CI)

Closes #21
Closes #20
Closes #18
🤖 Generated with Claude Code
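
As a quick sanity check of the SQLite output shape listed above, the following sketch runs the `EXISTS ... json_each(...)` predicate against an in-memory database using Python's standard `sqlite3` module. The table `t`, its `tags` column, and the sample rows are assumptions made up for the demonstration; it also assumes the local SQLite build includes the JSON functions, which is the default in recent versions.

```python
import sqlite3

# In-memory database with a hypothetical table whose `tags` column
# stores a JSON array as text.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE t (id INTEGER PRIMARY KEY, tags TEXT)")
conn.executemany(
    "INSERT INTO t (id, tags) VALUES (?, ?)",
    [
        (1, '["a", "x", "b"]'),  # multi-element array containing 'x'
        (2, '["a", "b"]'),       # multi-element array without 'x'
        (3, "[]"),               # empty array
    ],
)

# The predicate has the shape the PR lists for SQLite:
#   EXISTS (SELECT 1 FROM json_each(arr) WHERE value = elem)
rows = conn.execute(
    "SELECT id FROM t "
    "WHERE EXISTS (SELECT 1 FROM json_each(t.tags) WHERE value = 'x') "
    "ORDER BY id"
).fetchall()

matching_ids = [row[0] for row in rows]
print(matching_ids)
conn.close()
```

Only the row whose array contains `'x'` matches, regardless of where the element sits in the array, because `EXISTS` tests every row that `json_each` produces and does not collapse the array to a single scalar.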