A metadata compiler that turns database schemas into AI-optimized documentation.
dbook connects to your database, introspects every table, and automatically generates structured metadata that AI agents can navigate -- enum values, data lineage, example queries, auto-detected metrics, and PII markers. One command, fully automated, no manual authoring. Agents with dbook score 4.7/5 on SQL tasks vs 3.2/5 with raw DDL, while reading 77% fewer tokens.
- Clean plugin architecture -- dbook core is fully self-contained, no external LLM dependency required
- LLM providers as plugins -- agentlib, OpenAI, Anthropic, Gemini supported via
LLMProviderprotocol - Own token counter -- no longer depends on agentlib for token counting
- Removed agentlib shim -- cleaner dependency graph, faster imports
Your AI agents are blind to your data.
Raw DDL tells agents the structure -- but not the meaning:
status VARCHAR(20)-- agents guess "active", "enabled", "1"... the real values are "pending", "shipped", "delivered"user_id INTEGER REFERENCES users(id)-- but what IS this relationship? The customer? The assignee? The creator?- Your gold layer exists because consumers couldn't read silver -- but AI agents CAN, with the right metadata
The result:
- You maintain expensive gold layer ETL just for AI consumption
- Every agent re-discovers the schema independently (10 agents = 10x cost)
- Schema changes break agents silently -- no one knows until production fails
- Agents access PII columns unknowingly -- compliance risk with every query
- Agents guess enum values and write wrong SQL -- silent data quality issues
One command connects to your database, introspects every table, runs SELECT DISTINCT on enum columns, traces foreign key chains, detects PII patterns, and generates a complete metadata directory -- no configuration, no manual authoring:
pip install dbook
dbook compile "postgresql://user:pass@host/db" --output ./my_dbook1. Enum value documentation -- auto-detected via SELECT DISTINCT
status: pending, confirmed, shipped, delivered, cancelled
method: credit_card, debit_card, paypal, bank_transfer
2. Semantic FK descriptions -- agents understand relationships
-> users via user_id -- the customer who placed this order
<- order_items.order_id -- line items in this order
3. Example queries -- patterns agents can follow
- By status: SELECT * FROM orders WHERE status IN ('pending', 'confirmed')
- Revenue over time: SELECT DATE(created_at), SUM(total) FROM orders GROUP BY DATE(created_at)4. Auto-detected metrics -- common aggregations ready to use
- Total Amount: SELECT SUM(total) FROM orders
- Count by Status: SELECT status, COUNT(*) FROM orders GROUP BY status
- Amount over time: SELECT DATE(created_at), SUM(total) FROM orders GROUP BY DATE(created_at)
5. Data lineage -- how tables connect in the data flow
Source tables: users, products (no dependencies)
Intermediate: orders -> depends on users | <- used by order_items, invoices
Leaf: payments -> depends on invoices
6. PII detection -- marks sensitive columns, redacts sample data
| email | VARCHAR(255) | EMAIL (0.90) | high |
| card_last_four | VARCHAR(4) | CREDIT_CARD_PARTIAL (0.70) | low |
7. Query validation -- SQLGlot-powered, catches errors before execution
validator = QueryValidator(book)
result = validator.validate("SELECT * FROM orders WHERE status = 'completed'")
# Warning: 'completed' not in known values: pending, confirmed, shipped, delivered, cancelleddbook is not a documentation tool you maintain by hand. It is a compiler that connects to your live database, runs real queries, and generates everything automatically.
| Raw DDL | Manual docs | dbook | |
|---|---|---|---|
| What agents read | Full schema dump | Whatever you wrote | Only the tables they need |
| Enum values | Not available | You maintain them | Auto-discovered via SELECT DISTINCT |
| Metrics | Agent guesses | You define them | Auto-detected (SUM columns, COUNT-by-enum, time series) |
| Data lineage | Agent traces FKs manually | You diagram it | Auto-mapped from FK chains (root, intermediate, leaf) |
| Example queries | None | You write them | Auto-generated (FK joins, unique-key lookups) |
| PII detection | None | You flag columns | Auto-detected (email, SSN, phone patterns) |
| Schema changes | Re-dump everything | You update manually | Per-table checksums, incremental recompilation |
| Setup effort | Zero | Hours per schema | One command: dbook compile |
| Token cost | 100% (reads everything) | Varies | 23% (reads only what's needed) |
The token savings come from the architecture, not from compression. dbook organizes metadata into navigable layers so agents read 2-3 files per task instead of the entire schema. But the quality improvement comes from what those files contain -- actual enum values, real relationship semantics, working query patterns, and pre-computed metrics that raw DDL simply does not have.
Tested against a realistic e-commerce database modeled after Amazon: 7 schemas, 34 tables, covering users, orders, inventory, payments, and support. 15 real agent tasks across 3 agent personas (Billing, Care, Sales) -- each requiring the agent to find the right tables, write correct SQL, execute it, and return accurate results. Every scenario scored by a judge on 4 dimensions.
- dbook Score: 4.7/5 vs Baseline (raw DDL) 3.2/5 -- improvement of +1.5
- Token savings: 77% (7,792 vs 33,656 tokens per scenario)
The improvement is not just efficiency -- it is correctness. dbook agents find the right tables, use valid enum values in WHERE clauses, join on correct foreign keys, and return accurate results. Baseline agents reading raw DDL frequently guess wrong enum values, miss relevant tables, and produce SQL that returns empty or incorrect results.
The biggest gains are in SQL correctness and result accuracy -- exactly the dimensions where enum values, relationship metadata, and example queries make the difference.
| Dimension | dbook | Baseline (DDL) | Delta |
|---|---|---|---|
| Table Discovery | 4.7 | 4.3 | +0.4 |
| SQL Correctness | 4.7 | 3.0 | +1.7 |
| Result Accuracy | 4.3 | 2.6 | +1.7 |
| Response Quality | 4.7 | 2.7 | +2.0 |
| Agent Type | dbook | Baseline | Token Savings |
|---|---|---|---|
| Billing | 4.8/5 | 3.0/5 | 77% |
| Care | 4.8/5 | 3.0/5 | 77% |
| Sales | 4.5/5 | 3.5/5 | 77% |
Benchmarked against a 34-table e-commerce schema on PostgreSQL. All scores from automated test runs — see
benchmarks/for scenarios, seed data, and reproducible results.
dbook uses an L0/L1/L2 layered navigation architecture (inspired by agentlib) that organizes metadata into progressively detailed layers. This architecture is what delivers the 77% token savings -- agents read 2-3 files per task instead of the entire schema.
| Layer | File | What it contains |
|---|---|---|
| L0 -- Overview | NAVIGATION.md | Schema listing with row counts, table descriptions |
| L1 -- Section | _manifest.md | Per-schema details, cross-table relationships |
| L2 -- Detail | {table}.md | Columns, enum values, FKs, metrics, sample data, example queries |
| Lookup | concepts.json | Table/column term index with mechanical + LLM aliases |
| Protocol | SKILL.md | Agent navigation instructions |
dbook's core -- compilation, metadata generation, enum discovery, FK analysis, PII detection -- works without any LLM. LLM enrichment is optional and adds semantic summaries, concept aliases, and schema narratives on top of the mechanical output.
dbook defines an LLMProvider protocol that any LLM backend can implement:
MockProvider-- built-in, deterministic responses for testingAgentlibProvider-- uses agentlib's multi-provider LLM layer (Anthropic, OpenAI, Gemini, xAI, DeepSeek)- Custom providers -- implement the
LLMProviderprotocol with a singlecomplete()method
To use LLM enrichment, install a provider backend. For example, with agentlib:
pip install agentlib
dbook compile "postgresql://..." --output ./my_dbook --llm --llm-provider anthropic --llm-key sk-...Database-agnostic via Catalog protocol. Default SQLAlchemyCatalog supports any SQLAlchemy-compatible database. DB type auto-detected from URL.
PostgreSQL, MySQL, SQLite, Snowflake, BigQuery -- any database with a SQLAlchemy dialect.
dbook compile "postgresql://user:pass@host/db" --output ./my_dbookpip install "dbook[pii]"
dbook compile "postgresql://..." --output ./my_dbook --piipip install "dbook[llm]"
dbook compile "postgresql://..." --output ./my_dbook --llm --llm-provider anthropic --llm-key sk-...dbook check ./my_dbook "postgresql://user:pass@host/db"dbook compile "postgresql://..." --output ./my_dbook --incrementalfrom dbook.catalog import SQLAlchemyCatalog
from dbook.compiler import compile_book
from dbook.validator import QueryValidator
# Compile
catalog = SQLAlchemyCatalog("postgresql://user:pass@host/db")
book = catalog.introspect_all()
compile_book(book, "./my_dbook")
# Validate agent SQL
validator = QueryValidator(book)
result = validator.validate("SELECT * FROM orders WHERE status = 'delivered'")
print(result.valid, result.errors, result.warnings)| Feature | Install | Flag | What it adds |
|---|---|---|---|
| PII detection | pip install "dbook[pii]" |
--pii |
Column sensitivity markers, sample data redaction |
| LLM enrichment | pip install "dbook[llm]" |
--llm |
Semantic summaries, concept aliases, schema narratives |
| Metrics | pip install "dbook[metrics]" |
--metrics |
User-defined canonical business metrics |
Traditional data pipelines create gold layers because consumers can't read raw data. With dbook, AI agents can understand silver directly -- reducing the need for gold views for discovery and ad-hoc queries.
Note: dbook reduces the need for gold views for discovery and ad-hoc queries. Gold layers still provide value for: enforced business rules, canonical metric definitions, data quality guarantees, and grain standardization. For critical metrics, define them in
metrics.yaml-- dbook includes them in its output so agents use the canonical definition, not their own interpretation.
pip install -e ".[dev]"
pytest tests/ -q --tb=short139 tests covering: introspection, compilation, CLI, PII detection, LLM enrichment, query validation, and realistic agent simulation benchmarks.
Apache License 2.0
