A lightweight tool that parses SQL ETL scripts (BigQuery dialect) and auto-generates an interactive DAG to visualize end-to-end data flow across granular, intermediate, and output layers.
- Parses `CREATE TABLE`, `INSERT INTO`, and `MERGE INTO` statements.
- Identifies source and target tables across scripts.
- Constructs directed dependency graphs (source → target).
- Interactive Streamlit UI using PyVis.
- CLI for lineage extraction and export to JSON/DOT.
- Works locally without database access.
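To make the source → target idea concrete, here is a minimal, standard-library-only sketch of lineage extraction. It is not the project's parser: the tool uses `sqlglot`, while this stand-in uses simplified regular expressions. The names `TARGET_RE`, `SOURCE_RE`, and `extract_edges` are hypothetical. It assumes statements are separated by `;` and will treat CTE names as sources.

```python
import re

# Hypothetical, simplified patterns. A real SQL parser such as sqlglot
# handles far more syntax (CTEs, subqueries, comments, quoting rules).
TARGET_RE = re.compile(
    r"\b(?:CREATE\s+(?:OR\s+REPLACE\s+)?TABLE(?:\s+IF\s+NOT\s+EXISTS)?"
    r"|INSERT\s+INTO|MERGE\s+INTO)\s+`?([\w.\-]+)`?",
    re.IGNORECASE,
)
SOURCE_RE = re.compile(r"\b(?:FROM|JOIN|USING)\s+`?([\w.\-]+)`?", re.IGNORECASE)


def extract_edges(sql: str) -> set[tuple[str, str]]:
    """Return (source, target) table pairs found in a SQL script."""
    edges = set()
    # Naive statement split: assumes no ';' inside string literals.
    for statement in sql.split(";"):
        target = TARGET_RE.search(statement)
        if target is None:
            continue  # not a CREATE TABLE / INSERT INTO / MERGE INTO statement
        tgt = target.group(1).upper()
        for src in SOURCE_RE.findall(statement):
            src = src.upper()
            if src != tgt:  # skip self-references
                edges.add((src, tgt))
    return edges
```

Running `extract_edges` over every script in a directory and taking the union of the results yields the edge set of the dependency graph.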
`uv sync`
`uv run sqlviz examples/sql --json lineage.json --dot lineage.dot`
Output:
- `lineage.json`: Node-link representation of lineage graph.
- `lineage.dot`: Graphviz export for visualization or documentation.
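For orientation, the sketch below shows roughly what the two export formats could look like for a small edge list. The JSON keys (`directed`, `nodes`, `links`) are modeled on the common node-link layout and are an assumption; the exact fields the tool writes may differ. `to_node_link` and `to_dot` are hypothetical helpers, not part of the CLI.

```python
import json


def to_node_link(edges):
    """Build a node-link style dict from (source, target) pairs."""
    nodes = sorted({name for edge in edges for name in edge})
    return {
        "directed": True,
        "nodes": [{"id": name} for name in nodes],
        "links": [{"source": s, "target": t} for s, t in sorted(edges)],
    }


def to_dot(edges, name="lineage"):
    """Render (source, target) pairs as a Graphviz DOT digraph."""
    lines = [f"digraph {name} {{", "  rankdir=LR;"]  # left-to-right layout
    lines += [f'  "{s}" -> "{t}";' for s, t in sorted(edges)]
    lines.append("}")
    return "\n".join(lines)


edges = {("RAW.CUSTOMER_BASE", "STG.CUST_ACCT_LINK")}
print(json.dumps(to_node_link(edges), indent=2))
print(to_dot(edges))
```

A DOT file like this can be rendered with Graphviz, for example `dot -Tpng lineage.dot -o lineage.png`.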
`uv run streamlit run sql_visualizer/app_streamlit.py -- examples/sql`
Then open the Streamlit app in your browser to explore the DAG.
SOURCE_SYSTEM.CUSTOMERS ─┐
                         ├─> RAW.CUSTOMER_BASE ─┐
SOURCE_SYSTEM.ACCOUNTS  ─┘                      │
                                                ├─> STG.CUST_ACCT_LINK ─> MART.SEGMENT_BALANCE_SUMMARY
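The flow above can be expressed as a small DAG and grouped into layers. The sketch below uses Python's standard `graphlib` rather than the project's `networkx`-based builder, and includes only the edges visible in the diagram. `layers` is a hypothetical helper. `TopologicalSorter.prepare()` raises `CycleError` if the edges contain a cycle, which doubles as a basic validity check.

```python
from graphlib import CycleError, TopologicalSorter

# Edges read off the example diagram as (source, target) pairs.
EDGES = [
    ("SOURCE_SYSTEM.CUSTOMERS", "RAW.CUSTOMER_BASE"),
    ("SOURCE_SYSTEM.ACCOUNTS", "RAW.CUSTOMER_BASE"),
    ("RAW.CUSTOMER_BASE", "STG.CUST_ACCT_LINK"),
    ("STG.CUST_ACCT_LINK", "MART.SEGMENT_BALANCE_SUMMARY"),
]


def layers(edges):
    """Group tables into layers so each table comes after all of its sources."""
    sorter = TopologicalSorter()
    for source, target in edges:
        sorter.add(target, source)  # target depends on source
    sorter.prepare()  # raises CycleError if the graph is not a DAG
    result = []
    while sorter.is_active():
        ready = sorted(sorter.get_ready())  # tables whose sources are all done
        result.append(ready)
        sorter.done(*ready)
    return result


for depth, tables in enumerate(layers(EDGES)):
    print(depth, tables)
```

Here the two `SOURCE_SYSTEM` tables land in the first layer and `MART.SEGMENT_BALANCE_SUMMARY` in the last, matching the left-to-right reading of the diagram.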
sql-etl-visualizer/
├── sql_visualizer/
│   ├── sql_parser.py      # Parse SQL to extract lineage
│   ├── graph_builder.py   # Construct dependency DAG
│   ├── cli.py             # CLI interface
│   └── app_streamlit.py   # Streamlit UI (PyVis)
├── examples/sql/ # Sample SQL scripts
├── pyproject.toml # Project configuration
└── README.md # Documentation
pyproject.toml includes all dependencies and dev tools.
Main dependencies:
- `sqlglot` for SQL parsing
- `networkx` for graph modeling
- `pyvis` for visualization
- `streamlit` for UI
Development tools:
- `ruff` for linting
- `pytest` for testing
To test the parser:
`pytest -v`
To lint:
`ruff check .`
- Add layer-based coloring to project, dataset, and table/view in UI.
- Include metadata tooltips (file, modified date, statement ID).
- Add export buttons for PNG/JSON from Streamlit UI.
- Integrate basic unit tests for parser and graph validation.
MIT License.