Table Formats Explorer

Side-by-side comparison of three modern table/storage formats through runnable Python wrappers and agent-driven exploration.

What this repo teaches

Format	Category	Key concept
Delta Lake	ACID data-lake table	Transaction log (`_delta_log`) drives versioning; checkpoints compact the log
Apache Iceberg	Analytic table format	Snapshot/manifest tree separates metadata from data; branches are snapshot refs
SlateDB	Embedded LSM key-value store	WAL → L0 SSTs → compacted runs; every state transition is visible in manifests

Intended workflow

This repo is for educational purposes and is designed to be explored with a coding agent:

Clone the repo and install dependencies.
Open a chat.
Ask questions like "show me what happens to Iceberg metadata when I append a row" or "walk me through the SlateDB LSM phases".
The agent writes and runs code using the wrappers in table_formats_demo/, saves tables under demo_output/<format>/, and exports human-readable YAML explanation files under demo_output/<format>/scratch/.
The agent explains what changed and links to the generated files.

Rule for agents: always use the wrappers in table_formats_demo/ (or write code that calls their APIs) to generate outputs. Never hand-craft YAML or JSON manifests. See AGENTS.md for full guidance.

Project structure

All code in this repo was generated by a coding agent, with only a very brief review by the author. The repo is meant only for demo and educational purposes.

table_formats_demo/
├── base/           # Shared models (TableRow, OperationResult, …) and abstract TableFormat
├── delta/          # DeltaFormat  – wraps deltalake
├── iceberg/        # IcebergFormat – wraps pyiceberg with SQLite catalog
├── slatedb/        # SlateDBFormat – wraps slatedb (LSM-tree key-value store)
└── utils/          # yaml_helpers, logging_config

demos/              # Runnable end-to-end demo scripts
tests/              # pytest test suite (one file per wrapper)
demo_output/        # Generated at runtime, git-ignored
  delta/
    users/          # Delta table files
    scratch/        # YAML explanation files (transaction log entries)
  iceberg/
    default/users/  # Iceberg data + metadata
    catalog/        # SQLite catalog
    scratch/        # YAML explanation files (metadata snapshots, manifests)
  slatedb/
    users/          # SlateDB table files (wal/, manifest/, compacted/)
    scratch/        # YAML explanation files (manifest versions)

Setup

Prerequisites: Python 3.11+, uv

git clone https://github.com/vigneshc/TableFormatsDemo.git
cd TableFormatsDemo
uv sync --dev

Running demos

Each demo script creates a fresh table, runs a complete lifecycle of operations, and writes YAML scratch files for inspection.

uv run python demos/delta_demo.py
uv run python demos/iceberg_demo.py
uv run python demos/slatedb_demo.py

Output is written to demo_output/<format>/. Scratch files land in demo_output/<format>/scratch/.
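
To check what a demo run produced, a small helper along these lines could list the scratch files for one format. This is a sketch only: `list_scratch_files` is a hypothetical helper, not part of the repo, and it assumes the explanation files end in `.yaml` or `.yml`.

```python
from pathlib import Path


def list_scratch_files(output_root: str, table_format: str) -> list[str]:
    """Return the scratch file names for one format, sorted by name."""
    scratch_dir = Path(output_root) / table_format / "scratch"
    if not scratch_dir.is_dir():
        return []  # the demo for this format has not been run yet
    # Assumption: explanation files use a .yaml or .yml extension.
    return sorted(
        p.name
        for p in scratch_dir.iterdir()
        if p.is_file() and p.suffix in {".yaml", ".yml"}
    )
```

Calling `list_scratch_files("demo_output", "delta")` after running the Delta demo should return the generated file names, or an empty list if the demo has not been run.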

Running tests

uv run pytest
uv run pytest --cov=table_formats_demo --cov-report=html

Key APIs used by agents

Delta Lake

from table_formats_demo.delta.delta_format import DeltaFormat
delta = DeltaFormat(base_path="demo_output/delta", table_name="users")
delta.create_table(initial_data=...)   # writes _delta_log/00000000000000000000.json
delta.append_data(...)                  # new version entry in transaction log
delta.perform_maintenance()             # checkpoint + vacuum + compact
delta.export_scratch("demo_output/delta/scratch")  # YAML per log entry
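
To make the "transaction log drives versioning" idea concrete, here is a toy, stdlib-only model of log replay. It is not the `deltalake` API or the repo's wrapper: `write_commit`, `live_files`, and `demo` are hypothetical names, the data file names are made up, and real commits carry more action types (for example metadata and protocol entries) that this sketch ignores.

```python
import json
import tempfile
from pathlib import Path
from typing import Optional


def write_commit(log_dir: Path, version: int, actions: list[dict]) -> Path:
    """Write one commit as newline-delimited JSON, named by zero-padded version."""
    log_dir.mkdir(parents=True, exist_ok=True)
    commit_path = log_dir / f"{version:020d}.json"
    commit_path.write_text("\n".join(json.dumps(a) for a in actions) + "\n")
    return commit_path


def live_files(log_dir: Path, as_of_version: Optional[int] = None) -> set[str]:
    """Replay commits in version order and return the data files still active."""
    active: set[str] = set()
    # Zero-padded names sort lexicographically in version order.
    for commit_path in sorted(log_dir.glob("*.json")):
        if as_of_version is not None and int(commit_path.stem) > as_of_version:
            break
        for line in commit_path.read_text().splitlines():
            action = json.loads(line)
            if "add" in action:
                active.add(action["add"]["path"])
            elif "remove" in action:
                active.discard(action["remove"]["path"])
    return active


def demo() -> dict[int, set[str]]:
    """Build a three-commit log and return the live file set at each version."""
    with tempfile.TemporaryDirectory() as tmp:
        log_dir = Path(tmp) / "_delta_log"
        write_commit(log_dir, 0, [{"add": {"path": "part-0.parquet"}}])  # create
        write_commit(log_dir, 1, [{"add": {"path": "part-1.parquet"}}])  # append
        # Compaction-style commit: two small files replaced by one larger file.
        write_commit(log_dir, 2, [
            {"remove": {"path": "part-0.parquet"}},
            {"remove": {"path": "part-1.parquet"}},
            {"add": {"path": "part-2.parquet"}},
        ])
        return {v: live_files(log_dir, v) for v in range(3)}
```

Reading the log up to an earlier version gives that version's view of the table, which is why older versions stay queryable until their data files are vacuumed.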

Apache Iceberg

from table_formats_demo.iceberg.iceberg_format import IcebergFormat
iceberg = IcebergFormat(base_path="demo_output/iceberg", table_name="users")
iceberg.create_table(initial_data=...)      # snapshot 0, manifest list/manifest
iceberg.append_data(...)                     # new snapshot with new manifest
iceberg.perform_maintenance()                # full-table rewrite → compacted snapshot
iceberg.create_branch("feature")             # named snapshot ref in metadata
iceberg.export_scratch("demo_output/iceberg/scratch")  # YAML per .json and .avro file
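
The snapshot and branch behaviour can be sketched with a toy in-memory model. This is not `pyiceberg` and not the repo's wrapper: `Snapshot` and `ToyTable` are hypothetical classes, snapshot ids are simple counters rather than real Iceberg ids, and each manifest is reduced to a tuple of data file names.

```python
from dataclasses import dataclass, field
from typing import Optional


@dataclass(frozen=True)
class Snapshot:
    snapshot_id: int
    parent_id: Optional[int]
    manifests: tuple[tuple[str, ...], ...]  # each manifest lists data files


@dataclass
class ToyTable:
    snapshots: dict[int, Snapshot] = field(default_factory=dict)
    refs: dict[str, int] = field(default_factory=dict)  # branch name -> snapshot id

    def append(self, data_files: list[str], branch: str = "main") -> Snapshot:
        """Commit a new snapshot: reuse the parent's manifests, add one new manifest."""
        parent_id = self.refs.get(branch)
        inherited = self.snapshots[parent_id].manifests if parent_id is not None else ()
        snap = Snapshot(
            snapshot_id=len(self.snapshots),
            parent_id=parent_id,
            manifests=inherited + (tuple(data_files),),
        )
        self.snapshots[snap.snapshot_id] = snap
        self.refs[branch] = snap.snapshot_id  # only the ref moves; old snapshots stay
        return snap

    def create_branch(self, name: str, from_branch: str = "main") -> None:
        """A branch is a new named ref pointing at an existing snapshot."""
        self.refs[name] = self.refs[from_branch]

    def files(self, branch: str = "main") -> list[str]:
        """List the data files visible from the snapshot a branch points at."""
        snap = self.snapshots[self.refs[branch]]
        return [f for manifest in snap.manifests for f in manifest]
```

In this model, creating a branch copies no data: after `create_branch("feature")`, appending to `main` advances only the `main` ref, and `files("feature")` still returns the older file list.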

SlateDB

from table_formats_demo.slatedb.slatedb_format import SlateDBFormat
db = SlateDBFormat(base_path="demo_output/slatedb", table_name="users")
db.create_table(initial_data=...)        # opens DB, writes rows
db.flush_wal_only()                      # WAL SST on disk, memtable unchanged
db.flush_memtable_to_l0()               # memtable → L0 SST, manifest updated
db.compact_l0_to_lower_levels()         # L0 SSTs → compacted runs
db.create_clone(clone_name="snapshot")  # zero-copy checkpoint-based clone
db.export_scratch("demo_output/slatedb/scratch")  # YAML per manifest version
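
The LSM phases named in the comments above can be modelled with a few dictionaries. This toy is not the `slatedb` API or the repo's wrapper: `ToyLSM` and its methods are hypothetical, everything lives in memory, and the compacted state is simplified to a single sorted run.

```python
from typing import Optional


class ToyLSM:
    def __init__(self) -> None:
        self.wal: list[tuple[str, str]] = []             # writes not yet in a WAL SST
        self.wal_ssts: list[list[tuple[str, str]]] = []  # flushed WAL segments
        self.memtable: dict[str, str] = {}
        self.l0: list[dict[str, str]] = []               # L0 SSTs, newest last
        self.compacted: dict[str, str] = {}              # one sorted run

    def put(self, key: str, value: str) -> None:
        """Record the write in the WAL buffer and apply it to the memtable."""
        self.wal.append((key, value))
        self.memtable[key] = value

    def flush_wal(self) -> None:
        """Turn buffered writes into a WAL SST; the memtable is left unchanged."""
        if self.wal:
            self.wal_ssts.append(self.wal)
            self.wal = []

    def flush_memtable_to_l0(self) -> None:
        """Freeze the memtable into a sorted L0 SST and start a fresh memtable."""
        if self.memtable:
            self.l0.append(dict(sorted(self.memtable.items())))
            self.memtable = {}

    def compact_l0(self) -> None:
        """Merge all L0 SSTs into the compacted run; newer values win."""
        merged = dict(self.compacted)
        for sst in self.l0:  # oldest first, so later updates overwrite earlier ones
            merged.update(sst)
        self.compacted = dict(sorted(merged.items()))
        self.l0 = []

    def get(self, key: str) -> Optional[str]:
        """Read path: memtable, then L0 newest to oldest, then the compacted run."""
        if key in self.memtable:
            return self.memtable[key]
        for sst in reversed(self.l0):
            if key in sst:
                return sst[key]
        return self.compacted.get(key)
```

The read order in `get` is the reason compaction matters: every extra L0 SST is one more place to look before reaching the compacted run.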

Format comparison cheat-sheet

Concept	Delta Lake	Iceberg	SlateDB
Versioning unit	Transaction log entry	Snapshot	Manifest
Metadata format	JSON (+ Parquet checkpoint)	JSON + Avro	FlatBuffer (readable via admin API)
Compaction	`optimize.compact()`	Full-table overwrite	L0 → compacted runs
Branching	Not supported (use clone)	Snapshot refs	Checkpoint-based clone
Catalog	None (path-based)	SQL (SQLite here)	N/A

License

MIT
