
Research: SQL Formatting & Pretty Printing Theory #4

@productdevbook

Summary

Research on SQL formatting, pretty printing theory, and code generation approaches.

Key Findings

Foundational Papers

  • Oppen (1980): "Prettyprinting" — original linear-time algorithm
  • Hughes (1995): "The Design of a Pretty-printing Library" — algebraic approach
  • Wadler (1998): "A prettier printer" — simplified Hughes with group/nest/line combinators
  • Lindig (2000): "Strictly Pretty" — reimplementation of Wadler's printer for strict languages (OCaml), without relying on lazy evaluation

SQL Formatters Analyzed

  • sql-formatter: Tokenizer → Parser → CST → Formatter pipeline
  • prettier-plugin-sql-cst: CST-based parsing with Wadler-style layout
  • pgFormatter: Perl-based, PostgreSQL-specific
  • SQLFluff: Python linter+formatter, rule-based
  • CockroachDB sqlfmt: Extends Wadler with SQL-specific right-alignment

Architecture Recommendation

Use Wadler-style document algebra as intermediate representation:

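  • text(s) — literal string, printed as-is
  • line — potential line break: a newline when the enclosing group is broken, a single space when it is laid out flat
  • nest(n, doc) — indent line breaks inside doc by n additional columns
  • group(doc) — try to lay out doc flat on one line; break its lines if it is too wide

This separates "what to format" (building the document from the SQL tree) from "how to format" (the renderer's width-driven layout decisions).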
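
The combinators above can be sketched in TypeScript roughly as follows. This is a minimal illustration, not the project's implementation: the names `Doc`, `concat`, `fitsFlat`, and `render` are hypothetical, and the fit check only measures the group itself (as in Lindig's strict variant) rather than looking ahead to the end of the line.

```typescript
// Hypothetical document IR; none of these names come from an existing library.
type Doc =
  | { kind: "text"; value: string }
  | { kind: "line" }
  | { kind: "concat"; parts: Doc[] }
  | { kind: "nest"; indent: number; doc: Doc }
  | { kind: "group"; doc: Doc };

const text = (value: string): Doc => ({ kind: "text", value });
const line: Doc = { kind: "line" };
const concat = (...parts: Doc[]): Doc => ({ kind: "concat", parts });
const nest = (indent: number, doc: Doc): Doc => ({ kind: "nest", indent, doc });
const group = (doc: Doc): Doc => ({ kind: "group", doc });

type Mode = "flat" | "break";

// True if `doc`, laid out flat (every `line` as one space), fits in `remaining` columns.
function fitsFlat(doc: Doc, remaining: number): boolean {
  const stack: Doc[] = [doc];
  while (remaining >= 0) {
    const d = stack.pop();
    if (d === undefined) return true; // everything measured and it fit
    switch (d.kind) {
      case "text": remaining -= d.value.length; break;
      case "line": remaining -= 1; break;
      case "concat": for (const p of [...d.parts].reverse()) stack.push(p); break;
      case "nest":
      case "group": stack.push(d.doc); break;
    }
  }
  return false; // ran out of columns
}

// Render `doc` so that groups are flat when they fit within `width`, broken otherwise.
function render(doc: Doc, width: number): string {
  let out = "";
  let column = 0;
  const stack: [number, Mode, Doc][] = [[0, "break", doc]];
  while (stack.length > 0) {
    const [indent, mode, d] = stack.pop()!;
    switch (d.kind) {
      case "text":
        out += d.value;
        column += d.value.length;
        break;
      case "line":
        if (mode === "flat") { out += " "; column += 1; }
        else { out += "\n" + " ".repeat(indent); column = indent; }
        break;
      case "concat":
        for (const p of [...d.parts].reverse()) stack.push([indent, mode, p]);
        break;
      case "nest":
        stack.push([indent + d.indent, mode, d.doc]);
        break;
      case "group": {
        // A group inside a flat group stays flat; otherwise decide by width.
        const flat = mode === "flat" || fitsFlat(d.doc, width - column);
        stack.push([indent, flat ? "flat" : "break", d.doc]);
        break;
      }
    }
  }
  return out;
}

// Usage: the same document renders differently depending on the width.
const query = group(concat(
  text("SELECT"),
  nest(2, concat(line, text("id,"), line, text("name"))),
  line,
  text("FROM users"),
));

console.log(render(query, 80)); // SELECT id, name FROM users
console.log(render(query, 20)); // SELECT / "  id," / "  name" / FROM users on four lines
```

The SQL-specific code only builds `Doc` values; the width and line-breaking policy live entirely in `render`, which is the separation the recommendation is after.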

SQL Style Consensus

  • One clause per line (SELECT, FROM, WHERE, etc.)
  • 2-4 space indentation
  • CTEs over subqueries
  • Explicit JOIN syntax
  • UPPERCASE keywords

Action Items

  • Implement Wadler-style document IR in printer/formatter.ts
  • Support 3 output modes: compact, formatted, debug (with params inlined)
  • Follow Holywell/Kickstarter style guide conventions
  • Add configurable indentation and keyword casing
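
One possible shape for the options and the debug-mode parameter inlining is sketched below. Everything here is an assumption for illustration: the `FormatOptions` fields, the short `KEYWORDS` list, and the PostgreSQL-style `$1`, `$2` placeholder syntax are not taken from the project.

```typescript
// Hypothetical option types; field names are illustrative only.
type KeywordCase = "upper" | "lower" | "preserve";
type OutputMode = "compact" | "formatted" | "debug";

interface FormatOptions {
  mode: OutputMode;         // compact: one line; formatted: pretty-printed; debug: params inlined
  indent: number;           // spaces per indentation level
  keywordCase: KeywordCase; // how SQL keywords are cased in the output
}

const defaultOptions: FormatOptions = { mode: "formatted", indent: 2, keywordCase: "upper" };

// Deliberately incomplete keyword list, just enough for the example.
const KEYWORDS = new Set(["select", "from", "where", "join", "on", "and", "or", "with", "as"]);

// Apply the configured casing to keywords; identifiers pass through unchanged.
function applyKeywordCase(word: string, keywordCase: KeywordCase): string {
  if (!KEYWORDS.has(word.toLowerCase())) return word;
  if (keywordCase === "upper") return word.toUpperCase();
  if (keywordCase === "lower") return word.toLowerCase();
  return word;
}

type Param = string | number | boolean | null;

// Debug mode only: substitute $1, $2, ... with SQL literals for readability.
// The result is for logs, never for execution. This naive regex would also
// rewrite a "$1" that appears inside a string literal.
function inlineParams(sql: string, params: readonly Param[]): string {
  return sql.replace(/\$(\d+)/g, (match: string, index: string) => {
    const i = Number(index) - 1;
    if (i < 0 || i >= params.length) return match; // leave unknown placeholders as-is
    const value = params[i];
    if (value === null) return "NULL";
    if (typeof value === "boolean") return value ? "TRUE" : "FALSE";
    if (typeof value === "string") return `'${value.replace(/'/g, "''")}'`;
    return String(value);
  });
}

console.log(applyKeywordCase("select", defaultOptions.keywordCase)); // SELECT
console.log(inlineParams("SELECT * FROM users WHERE id = $1 AND name = $2", [42, "O'Brien"]));
// SELECT * FROM users WHERE id = 42 AND name = 'O''Brien'
```

Keeping inlining as a separate string-level step means the compact and formatted modes can keep emitting parameterized SQL untouched.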

Labels

  • formatting: SQL formatting/pretty printing
  • research: Research findings
