A single-binary playground for Apache Iceberg
Five minutes to first query
Quick Start • Features • Usage Guide • Contributing
Icebox is a zero-configuration data lakehouse that gets you from zero to querying Iceberg tables in under five minutes. Perfect for:
- Experimenting with Apache Iceberg table format
- Learning lakehouse concepts and workflows
- Prototyping data pipelines locally
- Testing Iceberg integrations before production
No servers, no complex setup, no dependencies - just a single binary and your data.
Icebox is alpha software: functional, fast-moving, and rapidly evolving.
The core is there. Now we're looking for early contributors to help shape what comes next, whether through code, docs, testing, or ideas.
- Single binary - No installation complexity
- Embedded catalog - SQLite-based, no external database needed
- JSON catalog - Local JSON-based catalog for development and prototyping
- REST catalog support - Connect to existing Iceberg REST catalogs
- Embedded MinIO server - S3-compatible storage for testing production workflows
- Parquet & Avro import with automatic schema inference
- Enhanced table creation - Full support for partitioning and sort orders
- DuckDB v1.3.0 integration - High-performance analytics with native Iceberg support
- Universal catalog compatibility - All catalog types work seamlessly with the query engine
- Interactive SQL shell with command history and multi-line support
- Time-travel queries - Query tables at any point in their history
- Transaction support with proper ACID guarantees
- Go 1.21+ for building from source
- DuckDB v1.3.0+ for optimal Iceberg support (automatically bundled with the Go driver)
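To confirm the Go prerequisite before building, a small shell sketch like the one below can compare the installed toolchain against 1.21. The helper `version_at_least` is hypothetical and not part of Icebox; it assumes plain dotted versions such as `1.21` or `1.22.3` and compares at most three components.

```shell
#!/bin/sh
# Hypothetical helper (not part of Icebox): succeeds when version $1 >= version $2.
# Assumes dotted numeric versions such as "1.21" or "1.22.3".
version_at_least() {
  local have="$1" want="$2" h w n
  for n in 1 2 3; do
    # Take the leading component and ignore any non-numeric suffix (e.g. "22rc1").
    h=${have%%.*}; h=${h%%[!0-9]*}; h=${h:-0}
    w=${want%%.*}; w=${w%%[!0-9]*}; w=${w:-0}
    [ "$h" -gt "$w" ] && return 0
    [ "$h" -lt "$w" ] && return 1
    # Drop the component just compared; missing components count as 0.
    case $have in *.*) have=${have#*.} ;; *) have=0 ;; esac
    case $want in *.*) want=${want#*.} ;; *) want=0 ;; esac
  done
  return 0
}

# `go version` prints something like "go version go1.22.3 linux/amd64".
if command -v go >/dev/null 2>&1; then
  go_version=$(go version | awk '{print $3}' | sed 's/^go//')
  if version_at_least "$go_version" "1.21"; then
    echo "Go $go_version is new enough"
  else
    echo "Go $go_version is too old; building Icebox needs 1.21+"
  fi
else
  echo "Go is not installed"
fi
```

Components are compared numerically rather than as strings, so `1.9` correctly counts as older than `1.21`.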
# Build from source
git clone https://github.com/TFMV/icebox.git
cd icebox
go build -o icebox cmd/icebox/main.go
# Add to your PATH for global access
sudo mv icebox /usr/local/bin/
# Or add the current directory to PATH
export PATH=$PATH:$(pwd)
Tip: Add `export PATH=$PATH:/usr/local/bin` to your shell profile (`.bashrc`, `.zshrc`) for permanent access.
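If you script the profile change, it helps to make it idempotent so that repeated runs do not stack duplicate lines. The following is a minimal sketch; the function name `add_path_to_profile` is an assumption for illustration and is not something Icebox ships.

```shell
#!/bin/sh
# Hypothetical helper (not part of Icebox): append a PATH export to a shell
# profile only when that exact line is not already there.
add_path_to_profile() {
  local dir="$1" profile="$2"
  local line="export PATH=\$PATH:$dir"
  touch "$profile"
  # -x matches whole lines, -F treats the pattern as a literal string.
  if grep -qxF "$line" "$profile"; then
    echo "already present in $profile"
  else
    printf '%s\n' "$line" >> "$profile"
    echo "added to $profile"
  fi
}

# Example: pick the profile that matches your shell (.bashrc or .zshrc).
# add_path_to_profile /usr/local/bin "$HOME/.bashrc"
```

The example call is left commented out so that sourcing the snippet never edits a profile by accident.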
# Create a new lakehouse project (default: SQLite catalog)
./icebox init my-lakehouse
cd my-lakehouse
# Or use the JSON catalog for version-control-friendly development
./icebox init my-lakehouse --catalog json
cd my-lakehouse
# Import a Parquet or Avro file into an Iceberg table
./icebox import data.parquet --table sales
# or
./icebox import data.avro --table users
✅ Successfully imported table!
Import Results:
Table: [default sales]
Records: 1,000,000
Size: 45.2 MB
Location: file:///.icebox/data/default/sales
# Create tables with partitioning and sorting for better performance
./icebox table create analytics_events \
--partition-by "date,region" \
--sort-by "timestamp ASC,user_id ASC" \
--schema events_schema.json
✅ Successfully created table!
✅ Applied partition specification with 2 field(s)
✅ Applied sort order with 2 field(s)
# Import data into the optimized table
./icebox import events.parquet --table analytics_events
# Run SQL queries
./icebox sql "SELECT COUNT(*) FROM sales"
Registered 1 tables for querying
⏱️ Query executed in 45ms
1 rows returned
┌──────────────┐
│ count_star() │
├──────────────┤
│      1000000 │
└──────────────┘
# Use the interactive shell for complex analysis
./icebox shell
Icebox SQL Shell v0.1.0
Interactive SQL querying for Apache Iceberg
Type \help for help, \quit to exit
icebox> SELECT region, AVG(amount) as avg_amount FROM sales GROUP BY region;
⏱️ Query executed in 23ms
3 rows returned
┌────────┬────────────┐
│ region │ avg_amount │
├────────┼────────────┤
│ North  │    1250.50 │
│ South  │     980.75 │
│ West   │    1450.25 │
└────────┴────────────┘
icebox> \quit
You now have a working Iceberg lakehouse with your data and SQL querying!
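The quick-start steps can also be chained in one script, for example as a smoke test. The sketch below uses only the `init`, `import --table`, and `sql` invocations shown above. The function name `quickstart` and the `ICEBOX` environment variable are assumptions for illustration, not part of Icebox.

```shell
#!/bin/sh
# Sketch only: chains the quick-start commands from this README.
# The function name and the ICEBOX variable are hypothetical, not part of Icebox.
# The body runs in a subshell, so the `cd` and the variables do not leak to the caller.
quickstart() (
  project=$1    # directory name passed to `icebox init`
  data_file=$2  # absolute path to a Parquet or Avro file
  table=$3      # table to create from the file
  icebox=${ICEBOX:-icebox}

  "$icebox" init "$project" || exit 1
  cd "$project" || exit 1
  "$icebox" import "$data_file" --table "$table" || exit 1
  "$icebox" sql "SELECT COUNT(*) FROM $table"
)

# Example (assumes icebox is on PATH and the file exists):
# quickstart my-lakehouse "$PWD/data.parquet" sales
```

The data file should be given as an absolute path because the script changes into the project directory before importing.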
Storage Type | Description | Use Case |
---|---|---|
Local Filesystem | File-based storage | Development, testing |
In-Memory | Temporary fast storage | Unit testing, experiments |
Embedded MinIO | S3-compatible local server | Cloud workflow testing |
External MinIO | Remote MinIO instance | Shared development |
Catalog Type | Description | Use Case |
---|---|---|
SQLite | Embedded local catalog | Single-user development |
JSON | Local JSON-based catalog | Development, prototyping, embedded use |
REST | External Iceberg REST catalog | Multi-user, production |
Icebox is designed to be approachable for developers at all levels.
- Fork the repository and create a feature branch
- Write tests for your changes
- Update documentation as needed
- Ensure tests pass with `go test ./...`
- Submit a pull request
# Prerequisites: Go 1.21+, DuckDB v1.3.0+ (for local CLI testing)
# Install DuckDB locally (optional, for CLI testing)
# macOS: brew install duckdb
# Linux: See https://duckdb.org/docs/installation/
# Build from source
git clone https://github.com/TFMV/icebox.git
cd icebox
go mod tidy
go build -o icebox cmd/icebox/main.go
# Run tests
go test ./...
# Add to PATH for development
export PATH=$PATH:$(pwd)
- Bug fixes and stability improvements
- Documentation and examples
- New features and enhancements
- Test coverage improvements
- CLI/UX enhancements
For comprehensive documentation and advanced features, see our Usage Guide.
This project is licensed under the Apache License 2.0 - see the LICENSE file for details.
Made with ❤️ for the data community