Summary
Research on SQL formatting, pretty printing theory, and code generation approaches.
Key Findings
Foundational Papers
- Oppen (1980): "Prettyprinting" — original linear-time algorithm
- Hughes (1995): "The Design of a Pretty-printing Library" — algebraic approach
- Wadler (1998): "A prettier printer" — simplified Hughes with group/nest/line combinators
- Lindig (2000): "Strictly Pretty" — implementation of Wadler's algorithm for a strict (eagerly evaluated) language, OCaml
SQL Formatters Analyzed
- sql-formatter: Tokenizer → Parser → CST → Formatter pipeline
- prettier-plugin-sql-cst: CST-based parsing with Wadler-style layout
- pgFormatter: Perl-based, PostgreSQL-specific
- SQLFluff: Python linter+formatter, rule-based
- CockroachDB sqlfmt: Extends Wadler with SQL-specific right-alignment
Architecture Recommendation
Use Wadler-style document algebra as intermediate representation:
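- text(s) — the literal string s
- line — a space if the enclosing group fits on one line, otherwise a newline
- nest(n, doc) — indent doc by n extra columns after each newline
- group(doc) — lay doc out flat if it fits the line width, otherwise break its lines
This separates "what to format" (the document tree built from the SQL) from "how to format" (the layout step that chooses line breaks).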
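The four combinators above can be sketched in a few dozen lines of Python. This is a minimal illustration, not the algorithm from any of the cited papers or tools: the class names, the `concat` helper, and the `render` function are assumptions made for this example, and the fit check is simplified so that it only measures the group itself, not the text that follows it on the same line.

```python
from dataclasses import dataclass
from typing import List, Tuple, Union


@dataclass(frozen=True)
class Text:
    s: str


@dataclass(frozen=True)
class Line:
    pass


@dataclass(frozen=True)
class Nest:
    n: int
    doc: "Doc"


@dataclass(frozen=True)
class Group:
    doc: "Doc"


@dataclass(frozen=True)
class Concat:  # hypothetical helper: sequence of documents
    parts: Tuple["Doc", ...]


Doc = Union[Text, Line, Nest, Group, Concat]

line = Line()


def text(s: str) -> Doc:
    return Text(s)


def nest(n: int, doc: Doc) -> Doc:
    return Nest(n, doc)


def group(doc: Doc) -> Doc:
    return Group(doc)


def concat(*parts: Doc) -> Doc:
    return Concat(tuple(parts))


def flat_width(doc: Doc) -> int:
    """Width of doc when every `line` inside it is rendered as one space."""
    if isinstance(doc, Text):
        return len(doc.s)
    if isinstance(doc, Line):
        return 1
    if isinstance(doc, (Nest, Group)):
        return flat_width(doc.doc)
    return sum(flat_width(p) for p in doc.parts)


def render(doc: Doc, width: int = 80) -> str:
    """Render doc, breaking a group only when it does not fit in `width`."""
    out: List[str] = []
    col = 0
    stack = [(0, False, doc)]  # entries are (indent, flat mode, document)
    while stack:
        indent, flat, d = stack.pop()
        if isinstance(d, Text):
            out.append(d.s)
            col += len(d.s)
        elif isinstance(d, Line):
            if flat:
                out.append(" ")
                col += 1
            else:
                out.append("\n" + " " * indent)
                col = indent
        elif isinstance(d, Nest):
            stack.append((indent + d.n, flat, d.doc))
        elif isinstance(d, Group):
            # Simplification: measure only this group, not what follows it.
            fits = flat or col + flat_width(d.doc) <= width
            stack.append((indent, fits, d.doc))
        else:
            # Push parts in reverse so the first part is popped first.
            for p in reversed(d.parts):
                stack.append((indent, flat, p))
    return "".join(out)


query = group(concat(
    text("SELECT"),
    nest(2, concat(line, text("id,"), line, text("name"))),
    line,
    text("FROM users"),
))

print(render(query, width=80))
print(render(query, width=20))
```

With a width of 80 the whole query fits and is printed on one line; with a width of 20 the group breaks, so the select list is indented by two spaces and FROM starts a new line. The same document tree produces both layouts, which is the separation the recommendation relies on.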
SQL Style Consensus
- One clause per line (SELECT, FROM, WHERE, etc.)
- 2-4 space indentation
- CTEs over subqueries
- Explicit JOIN syntax
- UPPERCASE keywords
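As a rough illustration of several of these conventions together, the sketch below renders a list of clauses with one clause per line, uppercase keywords, and indented multi-item lists. The `format_clauses` function and its input shape are hypothetical, invented for this example; it does not parse SQL and does not cover the CTE convention.

```python
from typing import List, Tuple


def format_clauses(clauses: List[Tuple[str, List[str]]], indent: int = 2) -> str:
    """Render (keyword, items) pairs, starting each clause on its own line.

    A clause with one item stays on the keyword's line; a clause with
    several items puts each item on its own indented, comma-separated line.
    """
    pad = " " * indent
    out: List[str] = []
    for keyword, items in clauses:
        kw = keyword.upper()
        if len(items) == 1:
            out.append(f"{kw} {items[0]}")
        else:
            out.append(kw)
            for i, item in enumerate(items):
                sep = "," if i < len(items) - 1 else ""
                out.append(f"{pad}{item}{sep}")
    return "\n".join(out)


example = format_clauses([
    ("select", ["u.id", "u.name"]),
    ("from", ["users AS u"]),
    ("inner join", ["orders AS o ON o.user_id = u.id"]),
    ("where", ["o.total > 100"]),
])
print(example)
```

The example uses an explicit INNER JOIN rather than a comma-separated FROM list, and the `indent` parameter can be set anywhere in the 2-4 space range noted above.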
Action Items