
feat(parser): add SQL Server PIVOT/UNPIVOT clause parsing #477

Open
ajitpratap0 wants to merge 3 commits into main from feat/pivot-unpivot

@ajitpratap0
Owner

Summary

  • Add SQL Server / Oracle PIVOT and UNPIVOT operator parsing in FROM clauses
  • PIVOT transforms rows to columns via an aggregate: PIVOT (SUM(sales) FOR region IN ([North], [South]))
  • UNPIVOT performs the reverse column-to-row transformation: UNPIVOT (sales FOR region IN (north_sales, south_sales))
  • New PivotClause and UnpivotClause AST nodes with Pivot/Unpivot fields on TableReference
  • Registers PIVOT/UNPIVOT in the tokenizer keyword map so they tokenize as keywords (not identifiers)
  • Formatter renders PIVOT/UNPIVOT clauses in SQL output
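
To make the shape of the new nodes concrete, here is a minimal sketch of how a table reference might carry a PIVOT or UNPIVOT clause. It is not the PR's code: the types are simplified local stand-ins, and every field name other than AggregateFunction, PivotColumn, Pivot, and Unpivot is an assumption made for illustration. The aggregate is held as a plain string here, whereas the review says the real field is an Expression.

```go
package main

import (
	"fmt"
	"strings"
)

// PivotClause is a simplified stand-in for the AST node described above.
type PivotClause struct {
	AggregateFunction string   // e.g. "SUM(sales)"; a string only for this sketch
	PivotColumn       string   // column named after FOR
	InValues          []string // hypothetical field: entries of the IN list
}

// UnpivotClause is a simplified stand-in; all field names are assumptions.
type UnpivotClause struct {
	ValueColumn string   // column that receives the values
	NameColumn  string   // column that receives the source column names
	InColumns   []string // source columns listed inside IN (...)
}

// TableReference carries an optional Pivot or Unpivot clause.
type TableReference struct {
	Name    string
	Pivot   *PivotClause
	Unpivot *UnpivotClause
}

// Describe renders the reference back to SQL-like text.
func Describe(t TableReference) string {
	var sb strings.Builder
	sb.WriteString(t.Name)
	if p := t.Pivot; p != nil {
		fmt.Fprintf(&sb, " PIVOT (%s FOR %s IN (%s))",
			p.AggregateFunction, p.PivotColumn, strings.Join(p.InValues, ", "))
	}
	if u := t.Unpivot; u != nil {
		fmt.Fprintf(&sb, " UNPIVOT (%s FOR %s IN (%s))",
			u.ValueColumn, u.NameColumn, strings.Join(u.InColumns, ", "))
	}
	return sb.String()
}

func main() {
	ref := TableReference{
		Name: "sales_data", // hypothetical table name
		Pivot: &PivotClause{
			AggregateFunction: "SUM(sales)",
			PivotColumn:       "region",
			InValues:          []string{"[North]", "[South]"},
		},
	}
	fmt.Println(Describe(ref))
}
```

Using pointer fields keeps the common case (no PIVOT or UNPIVOT) at a nil check, which is one plausible reason to attach the clauses to TableReference rather than introduce a separate table-source node.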

Test plan

  • 4 dedicated unit tests: subquery+alias, plain table, no alias, AS alias
  • Testdata files 11_pivot.sql and 12_unpivot.sql now enabled and passing
  • All 20 MSSQL testdata files pass
  • Full test suite (60+ packages) passes with no regressions
  • Race detection passes on parser, formatter, AST, and tokenizer packages

Closes #456

Generated with Claude Code

Add support for SQL Server and Oracle PIVOT/UNPIVOT operators in FROM
clauses. PIVOT transforms rows to columns via an aggregate function,
while UNPIVOT performs the reverse column-to-row transformation.

- Add PivotClause and UnpivotClause AST node types
- Add Pivot/Unpivot fields to TableReference struct
- Implement parsePivotClause/parseUnpivotClause in new pivot.go
- Wire parsing into parseFromTableReference and parseJoinedTableRef
- Add PIVOT/UNPIVOT to tokenizer keyword map for correct token typing
- Update formatter to render PIVOT/UNPIVOT clauses
- Enable testdata/mssql/11_pivot.sql and 12_unpivot.sql
- Add 4 dedicated tests covering subquery+alias, plain table, no alias, AS alias
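
The keyword-map change in the list above can be pictured with a small sketch. The map contents, the TokenType constants, and the Classify helper are all hypothetical and do not reflect the tokenizer's real API; the point is only that a word absent from the map falls through to the identifier type, which is why PIVOT and UNPIVOT need entries.

```go
package main

import (
	"fmt"
	"strings"
)

// TokenType is a hypothetical two-valued token classification.
type TokenType int

const (
	TokenIdentifier TokenType = iota
	TokenKeyword
)

// keywords is a tiny illustrative keyword map, keyed by upper-case spelling.
var keywords = map[string]TokenType{
	"SELECT":  TokenKeyword,
	"FROM":    TokenKeyword,
	"FOR":     TokenKeyword,
	"IN":      TokenKeyword,
	"PIVOT":   TokenKeyword, // without this entry, PIVOT would classify as an identifier
	"UNPIVOT": TokenKeyword,
}

// Classify returns the token type of a bare word, ignoring case.
func Classify(word string) TokenType {
	if tt, ok := keywords[strings.ToUpper(word)]; ok {
		return tt
	}
	return TokenIdentifier
}

func main() {
	for _, w := range []string{"pivot", "UNPIVOT", "region"} {
		fmt.Printf("%s keyword=%v\n", w, Classify(w) == TokenKeyword)
	}
}
```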

Closes #456

Co-Authored-By: Claude Opus 4.6 (1M context) <noreply@anthropic.com>
@github-actions github-actions bot added the parser, tokenizer (Tokenizer / lexer changes), and ast (AST node changes) labels Mar 31, 2026
@vercel

vercel bot commented Mar 31, 2026

The latest updates on your projects. Learn more about Vercel for GitHub.

Project | Deployment | Actions | Updated (UTC)
gosqlx-website | Ready | Preview, Comment | Mar 31, 2026 9:48pm

@github-actions

github-actions bot commented Mar 31, 2026

Claude finished @ajitpratap0's task


PR Review: PIVOT/UNPIVOT Parsing Implementation

I've completed a comprehensive review of this PIVOT/UNPIVOT implementation.

Review Summary

AST Node Implementation ✅

Strengths:

  • Clean struct definitions with appropriate field types
  • Good documentation with examples
  • Proper interface implementations (expressionNode(), TokenLiteral(), Children())
  • Location tracking with models.Location for error reporting
  • PivotClause.AggregateFunction correctly typed as Expression to handle complex aggregates

Minor Issues:

  • PivotClause.PivotColumn and the UnpivotClause fields use string instead of *Identifier, which loses the source position information needed for precise error reporting
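
As a rough illustration of what an identifier node buys over a bare string, the sketch below attaches a position to the column name so that a later error can point at it. Location and Identifier are local stand-ins written for this example (the review mentions models.Location, whose real fields are not shown here), and PivotColumnError is a hypothetical helper.

```go
package main

import "fmt"

// Location is a hypothetical line/column pair, standing in for the
// repository's own location type.
type Location struct{ Line, Column int }

// Identifier pairs a name with the position where it appeared in the source.
type Identifier struct {
	Name string
	Pos  Location
}

// PivotColumnError builds an error that points at the identifier's position,
// which is not possible when the column is stored as a plain string.
func PivotColumnError(id *Identifier, msg string) error {
	return fmt.Errorf("%d:%d: %s: %q", id.Pos.Line, id.Pos.Column, msg, id.Name)
}

func main() {
	col := &Identifier{Name: "regoin", Pos: Location{Line: 3, Column: 22}}
	fmt.Println(PivotColumnError(col, "unknown pivot column"))
}
```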

Parsing Logic Analysis ✅

Strengths:

  • Comprehensive error messages for 18 different syntax error scenarios
  • Proper token type checking before advancing
  • EOF protection in loops to prevent infinite iteration
  • Support for multiple value types in IN lists (identifiers, numbers, strings)
  • Correct token advancement patterns

Improvements Needed:

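  • No validation for empty IN lists (e.g., IN ()); these should be rejected with a syntax error instead of yielding a clause with no values
  • IN-list values of different types (identifiers, numbers, strings) are all stored as strings; consider preserving the original type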
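
One way the empty-list check could look is sketched below. It operates on a pre-split slice of token strings rather than the parser's real token stream, and ParseInList and ErrEmptyInList are hypothetical names; the sketch only shows where the check fits relative to the value/comma loop and the end-of-input guard.

```go
package main

import (
	"errors"
	"fmt"
)

// ErrEmptyInList is returned for "IN ()" (hypothetical error value).
var ErrEmptyInList = errors.New("IN list must contain at least one value")

// ParseInList consumes tokens of the form "(" v1 "," v2 ... ")" and returns
// the values. It rejects empty lists, trailing commas, and truncated input.
func ParseInList(tokens []string) ([]string, error) {
	if len(tokens) == 0 || tokens[0] != "(" {
		return nil, errors.New("expected ( after IN")
	}
	// The empty-list validation: "(" immediately followed by ")".
	if len(tokens) > 1 && tokens[1] == ")" {
		return nil, ErrEmptyInList
	}
	var values []string
	// Values sit at odd indexes, separators at even indexes.
	for i := 1; i < len(tokens); i += 2 {
		if tokens[i] == "," || tokens[i] == ")" {
			return nil, fmt.Errorf("expected value in IN list, got %q", tokens[i])
		}
		values = append(values, tokens[i])
		if i+1 >= len(tokens) {
			break // ran out of tokens before the closing parenthesis
		}
		switch tokens[i+1] {
		case ")":
			return values, nil
		case ",":
			continue
		default:
			return nil, fmt.Errorf("expected , or ) in IN list, got %q", tokens[i+1])
		}
	}
	return nil, errors.New("unexpected end of input in IN list")
}

func main() {
	vals, err := ParseInList([]string{"(", "[North]", ",", "[South]", ")"})
	fmt.Println(vals, err)
	_, err = ParseInList([]string{"(", ")"})
	fmt.Println(err)
}
```

Returning a dedicated error value for the empty case lets callers and tests distinguish it from other syntax errors without matching on message text.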

Formatter Integration ❌

Critical Issue - Keyword casing not respected:

  • pkg/formatter/render.go:1190,1199: Hard-coded "PIVOT"/"UNPIVOT" strings instead of using f.kw() function
  • This breaks the formatter's keyword casing options (KeywordUpper/KeywordLower)
  • Fix needed: Change to sb.WriteString(f.kw("PIVOT")) and sb.WriteString(f.kw("UNPIVOT"))
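
The effect of routing keywords through a casing helper can be sketched as follows. This is not the repository's formatter: the Formatter struct, its Case field, and RenderPivot are assumptions made for the example, and only the names kw, KeywordUpper, and KeywordLower come from the review text.

```go
package main

import (
	"fmt"
	"strings"
)

// KeywordCase selects how keywords are emitted (type name is an assumption).
type KeywordCase int

const (
	KeywordUpper KeywordCase = iota
	KeywordLower
)

// Formatter is a hypothetical stand-in for the real formatter.
type Formatter struct{ Case KeywordCase }

// kw applies the configured casing to a keyword.
func (f *Formatter) kw(s string) string {
	if f.Case == KeywordLower {
		return strings.ToLower(s)
	}
	return strings.ToUpper(s)
}

// RenderPivot writes a PIVOT clause, passing every keyword through kw so the
// casing option is honored; a hard-coded "PIVOT" literal would bypass it.
func (f *Formatter) RenderPivot(agg, col string, values []string) string {
	var sb strings.Builder
	sb.WriteString(f.kw("PIVOT"))
	sb.WriteString(" (")
	sb.WriteString(agg)
	sb.WriteString(" " + f.kw("FOR") + " ")
	sb.WriteString(col)
	sb.WriteString(" " + f.kw("IN") + " (")
	sb.WriteString(strings.Join(values, ", "))
	sb.WriteString("))")
	return sb.String()
}

func main() {
	lower := &Formatter{Case: KeywordLower}
	fmt.Println(lower.RenderPivot("SUM(sales)", "region", []string{"[North]", "[South]"}))
}
```

With KeywordLower the sketch emits lower-case pivot, for, and in while leaving the aggregate and bracketed values untouched, which is the behavior a hard-coded string would break.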

Test Coverage Analysis ✅

Excellent coverage:

  • 4 dedicated unit tests covering different alias patterns
  • Tests both PIVOT and UNPIVOT operations
  • Covers subquery + alias, plain table, AS alias, no alias cases
  • Proper AST validation and field checking
  • Integration with the MSSQL testdata files (11_pivot.sql and 12_unpivot.sql)

Memory Management & Object Pooling ✅

Good practices followed:

  • No object pooling needed for new AST nodes (per repository patterns)
  • Proper integration with existing TableReference pooling
  • No memory leaks detected in parsing logic

Security & Performance ✅

Security:

  • No SQL injection vectors introduced
  • Proper input validation and error handling
  • Safe string handling practices

Performance:

  • Efficient parsing with minimal allocations
  • Good error path performance
  • Integrates well with existing zero-copy optimizations

Final Recommendation

Overall Assessment: Strong implementation with one critical formatter fix needed

Required Changes:

  1. Fix formatter keyword casing - Critical for consistency with repository standards

Suggested Improvements:

  1. Add empty IN list validation
  2. Consider using *Identifier for column references to preserve position info

What's Working Well:

  • Solid AST design and parsing logic
  • Excellent test coverage
  • Proper error handling
  • Good integration with existing codebase patterns

The implementation demonstrates strong understanding of the codebase architecture and SQL parsing requirements. The formatter fix is the only blocking issue.



CVE-2026-32285 affects github.com/buger/jsonparser v1.1.1, which is a
transitive dependency via mark3labs/mcp-go → invopop/jsonschema →
wk8/go-ordered-map → buger/jsonparser.

No fixed version is available upstream. The package is not called
directly by any GoSQLX code and risk is scoped to MCP JSON schema
generation. Added to .trivyignore until a patched version is released.

Fixes Trivy Repository Scan CI failures in PR #475 and #477.
@github-actions

github-actions bot commented Mar 31, 2026

Claude finished @ajitpratap0's task


PR Review: PIVOT/UNPIVOT Parsing Implementation

I've completed a comprehensive review of this PIVOT/UNPIVOT implementation focusing on code quality, potential issues, performance, security, and test coverage.

Review Checklist

  • Read repository guidelines and context
  • Analyze AST node implementations
  • Review parsing logic and error handling
  • Check formatter integration
  • Evaluate test coverage
  • Assess security and performance implications
  • Provide final recommendations




Labels

ast (AST node changes), parser, tokenizer (Tokenizer / lexer changes)


Development

Successfully merging this pull request may close these issues.

P2: SQL Server PIVOT and UNPIVOT parsing
