-
Notifications
You must be signed in to change notification settings - Fork 2
Description
Problem Statement
GitHub Copilot has raised concerns in PR #58 about the correct usage of PostgreSQL JSON path operators (-> vs ->>). We need to validate the actual behavior and ensure our generated SQL follows PostgreSQL semantics correctly.
Background
In PR #58 (fixing SQL injection in JSON field names), three review comments were raised about the expected SQL in tests:
Comment 1 & 2 (lines 34, 40):
- Current test expects:
obj->>'metadata'->>'user_name' = 'test' - Copilot suggests:
obj->'metadata'->>'user_name' = 'test'
Comment 3 (line 123):
- Minor formatting: Use raw strings instead of escaped quotes
Technical Question
For CEL expression: obj.metadata.user_name == "test"
Which SQL should be generated?
Option A (Current): obj->>'metadata'->>'user_name' = 'test'
- Uses
->>on first field access (metadata)
Option B (Suggested): obj->'metadata'->>'user_name' = 'test'
- Uses
->for intermediate field (metadata),->>only for final field (user_name)
PostgreSQL JSON Operator Semantics
According to PostgreSQL documentation:
->returns JSON (for intermediate navigation)->>returns text (for final extraction)
The question is: Is obj a table with a metadata JSONB column, or is obj itself a reference that needs JSON operators?
If obj is a table name and metadata is a JSONB column, then:
obj.metadatashould use normal column access (no operator)- The path should be:
obj.metadata->>'user_name'
If obj itself needs JSON operators, then:
- We need to determine proper operator chain
Investigation Plan
Step 1: Investigate Current Code Behavior (Read-Only)
- Read
cel2sql.govisitSelect() function (around lines 1100-1150) - Read
json.gobuildJSONPath functions - Understand how
shouldUseJSONPath()determines operator usage - Check test schemas in
json_escaping_test.goto see table structure
Step 2: Create PostgreSQL Validation Test
- Use testcontainers to spin up real PostgreSQL instance
- Create test table:
CREATE TABLE test_table (id INT, metadata JSONB) - Insert test data:
INSERT INTO test_table VALUES (1, '{"user_name": "test"}') - Test both SQL variants:
SELECT * FROM test_table WHERE metadata->>'user_name' = 'test'SELECT * FROM test_table WHERE test_table->>'metadata'->>'user_name' = 'test'(expect error)
- Document which one works
Step 3: Fix Based on Findings
If validation shows current code is wrong:
- Update
visitSelect()logic incel2sql.go - Fix
buildJSONPath()injson.go - Update all test expectations
- Ensure escaping still works correctly
If validation shows current code is correct:
- Document why in PR comments
- Explain table vs JSON field access
Step 4: Respond to PR Comments
- Add validation results to PR comments
- Show actual PostgreSQL behavior
- Update code or explain rationale
Related Files
cel2sql.go(lines 1100-1170): visitSelect(), visitHasFunction()json.go(lines 17-67, 133-175, 342-392): JSON path constructionjson_escaping_test.go(lines 14-63): Test cases in questionutils.go(lines 197-201): escapeJSONFieldName() function
Acceptance Criteria
- PostgreSQL testcontainer validation test created
- Actual PostgreSQL behavior documented
- Code corrected if needed OR rationale documented
- All tests passing with correct SQL
- PR fix: Escape single quotes in JSON field names to prevent SQL injection #58 comments addressed with validation evidence
- No regression in SQL injection fix (escaping still works)
References
- PR fix: Escape single quotes in JSON field names to prevent SQL injection #58: Fix SQL injection in JSON field names
- PostgreSQL JSON Functions: https://www.postgresql.org/docs/current/functions-json.html
- Issue SQL Injection Risk via Unescaped Field Names in JSON Path Operators #22: Original SQL injection vulnerability report