-
Notifications
You must be signed in to change notification settings - Fork 5
Closed
Labels
Description
Feature Description
Complete support for PostgreSQL JSON and JSONB operators in expression parsing. The tokenizer already recognizes these operators, but parser integration is needed to handle them in expressions.
SQL Standard
PostgreSQL-specific operators (not in SQL standard, though SQL:2016 added JSON support)
Current Status
From codebase analysis:
- ✅ Tokenizer: Already recognizes JSON operators (
->,->>,#>,#>>,@>,<@,?,?|,?&) - ❌ Parser: Limited or no expression support for JSON operators
- ❌ AST: May need JSON-specific expression nodes
Example SQL
Basic JSON Operators
-- JSON field access (returns JSON)
SELECT data -> 'name' FROM users;
SELECT data -> 'address' -> 'city' FROM users;
-- JSON field access (returns text)
SELECT data ->> 'email' FROM users;
SELECT data -> 'address' ->> 'zipcode' FROM users;
-- JSON path extraction (returns JSON)
SELECT data #> '{address,city}' FROM users;
-- JSON path extraction (returns text)
SELECT data #>> '{address,city}' FROM users;JSON Containment and Existence
-- Contains operator (does left contain right?)
SELECT * FROM products WHERE tags @> '["electronics", "sale"]';
SELECT * FROM users WHERE data @> '{"role": "admin"}';
-- Contained by operator (is left contained in right?)
SELECT * FROM settings WHERE config <@ '{"enabled": true, "debug": false}';
-- Key existence
SELECT * FROM users WHERE data ? 'email'; -- Does key exist?
SELECT * FROM users WHERE data ?| array['phone', 'mobile']; -- Any key exists?
SELECT * FROM users WHERE data ?& array['email', 'name']; -- All keys exist?Complex Expressions
-- JSON operators in WHERE clauses
SELECT * FROM orders
WHERE metadata ->> 'status' = 'shipped'
AND metadata -> 'shipping' ->> 'carrier' = 'USPS';
-- JSON operators in SELECT list
SELECT
id,
data ->> 'name' as name,
data -> 'address' ->> 'city' as city,
(data @> '{"verified": true}') as is_verified
FROM users;
-- JSON operators with functions
SELECT
id,
jsonb_array_length(data -> 'items') as item_count,
data #>> '{items,0,name}' as first_item_name
FROM orders;
-- Chained JSON operations
SELECT
data -> 'user' -> 'profile' ->> 'email' as email,
data #> '{user,settings,notifications}' as notification_prefs
FROM events;Advanced Use Cases
-- JSON in JOIN conditions
SELECT o.id, u.name
FROM orders o
JOIN users u ON o.data ->> 'user_id' = u.id::text;
-- JSON in aggregations
SELECT
data ->> 'category' as category,
COUNT(*) as count,
array_agg(data ->> 'name') as names
FROM products
WHERE data @> '{"in_stock": true}'
GROUP BY data ->> 'category';
-- JSON with CASE expressions
SELECT
id,
CASE
WHEN data @> '{"type": "premium"}' THEN 'Premium'
WHEN data ? 'discount' THEN 'Discounted'
ELSE 'Standard'
END as product_tier
FROM products;Implementation Notes
Parser Changes Required
-
Tokenizer (already complete):
- ✅ All JSON operators tokenized
- Token types:
TOKEN_ARROW,TOKEN_DOUBLE_ARROW,TOKEN_HASH_GT,TOKEN_HASH_DOUBLE_GT, etc.
-
AST Nodes (
pkg/sql/ast/):- Extend
BinaryExpressionto handle JSON operators - Or create specific
JSONAccessExpressionnode:Object Expression- the JSON/JSONB columnOperator string- the JSON operator (->, ->>, #>, etc.)Path Expression- the key/path (string, int, or array)
- May need
JSONContainmentExpressionfor @>, <@ - May need
JSONExistenceExpressionfor ?, ?|, ?&
- Extend
-
Parser Logic (
pkg/sql/parser/):- In
parseExpression()andparseBinaryExpression():- Recognize JSON operator tokens
- Handle operator precedence (JSON operators should be high precedence)
- Support chaining:
data -> 'user' -> 'profile' ->> 'email'
- In
parsePrimaryExpression():- Handle JSON path arrays:
'{address,city}' - Handle array literals for ?| and ?& operators
- Handle JSON path arrays:
- In
-
Operator Precedence:
- JSON access operators (
->,->>,#>,#>>) should be left-associative and high precedence - JSON containment/existence operators (
@>,<@,?,?|,?&) should be comparison-level precedence
- JSON access operators (
Testing Requirements
- Parse single-level JSON access (->)
- Parse chained JSON access (-> -> ->>)
- Parse JSON path access (#>, #>>)
- Parse JSON containment operators (@>, <@)
- Parse JSON existence operators (?, ?|, ?&)
- Parse JSON operators in WHERE clauses
- Parse JSON operators in SELECT list
- Parse JSON operators in JOIN conditions
- Test operator precedence and associativity
- Test with complex nested expressions
- Formatter support to render JSON operators correctly
Compatibility
- PostgreSQL: ✅ Full support for JSON/JSONB operators
- MySQL:
⚠️ Partial support (JSON_EXTRACT function, not operators) - SQL Server:
⚠️ Different syntax (JSON_VALUE, JSON_QUERY functions) - Oracle:
⚠️ Different syntax (JSON_VALUE, JSON_QUERY functions) - SQLite:
⚠️ JSON functions only (json_extract), not operators
JSON Operator Reference
| Operator | Description | Example | Returns |
|---|---|---|---|
-> |
Get JSON field by key | data -> 'name' |
JSON/JSONB |
->> |
Get JSON field as text | data ->> 'name' |
text |
#> |
Get JSON at path | data #> '{a,b}' |
JSON/JSONB |
#>> |
Get JSON at path as text | data #>> '{a,b}' |
text |
@> |
Contains | data @> '{"a":1}' |
boolean |
<@ |
Contained by | '{"a":1}' <@ data |
boolean |
? |
Key exists | data ? 'key' |
boolean |
| `? | ` | Any key exists | `data ? |
?& |
All keys exist | data ?& array['a','b'] |
boolean |
Priority
Medium-High - PostgreSQL heavily uses JSON/JSONB types, and operators are essential for practical JSON querying
Related Issues
- Part of PostgreSQL enhancement roadmap
- Tokenizer already supports these operators (partial work done)
- Related to expression parsing enhancements
Dependencies
- Requires tokenizer JSON operator support (✅ already complete)
- May require AST node enhancements
- May require operator precedence adjustments
Acceptance Criteria
- Parser successfully handles all JSON/JSONB operators
- AST correctly represents JSON expressions
- Support for chained JSON operations
- Correct operator precedence and associativity
- Test coverage for all JSON operators
- Test coverage for complex nested JSON expressions
- Documentation updated with examples
- Formatter preserves JSON operators correctly
- Clear error messages for invalid JSON operator usage
- Dialect-specific notes for cross-database compatibility