Experimental client for Apache Iceberg in Rust
icepick provides simple access to Apache Iceberg tables in AWS S3 Tables and Cloudflare R2 Data Catalog. Built on the official iceberg-rust library, icepick handles authentication, REST API details, and platform compatibility so you can focus on working with your data.
Why not use iceberg-rust directly? icepick supports WASM as a compilation target (not yet supported by iceberg-rust) and focuses on "serverless" catalogs that implement a subset of the full Iceberg specification.
- AWS S3 Tables — Full support with SigV4 authentication (native platforms only)
- Cloudflare R2 Data Catalog — Full support with bearer token auth (WASM-compatible)
- Generic REST Catalog — Build clients for any Iceberg REST endpoint (Nessie, Glue REST, custom)
- Direct S3 Parquet Writes — Write Arrow data directly to S3 without Iceberg metadata
- Clean API — Simple factory methods, no complex builders
- Type-safe errors — Comprehensive error handling with context
- Zero-config auth — Uses AWS credential chain and Cloudflare API tokens
- Production-ready — Used in real applications with real data
| Catalog | Linux/macOS/Windows | WASM (browser/Cloudflare Workers) |
|---|---|---|
| S3 Tables | ✅ | ❌ (requires AWS SDK) |
| R2 Data Catalog | ✅ | ✅ |
| No catalog (direct Parquet writes to object storage) | ✅ | ✅ |
Note: R2 Data Catalog and direct Parquet writes are fully WASM-compatible, making them suitable for Cloudflare Workers, browser applications, and other WASM environments.
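To make the table concrete, here is a small illustrative sketch, not icepick API, just standard Rust `cfg` gating; the catalog name, ARN, account ID, bucket, and token below are placeholders. It selects the S3 Tables catalog on native targets and the WASM-compatible R2 Data Catalog on `wasm32`:

```rust
// Target-gated catalog selection (placeholder names throughout).
#[cfg(not(target_arch = "wasm32"))]
use icepick::S3TablesCatalog;
#[cfg(target_arch = "wasm32")]
use icepick::R2Catalog;

// On native platforms use S3 Tables; on wasm32 (browsers, Cloudflare Workers)
// fall back to the WASM-compatible R2 Data Catalog.
#[cfg(not(target_arch = "wasm32"))]
pub async fn open_catalog() -> Result<S3TablesCatalog, Box<dyn std::error::Error>> {
    Ok(S3TablesCatalog::from_arn(
        "my-catalog",
        "arn:aws:s3tables:us-west-2:123456789012:bucket/my-bucket",
    )
    .await?)
}

#[cfg(target_arch = "wasm32")]
pub async fn open_catalog() -> Result<R2Catalog, Box<dyn std::error::Error>> {
    Ok(R2Catalog::new("my-catalog", "account-id", "bucket-name", "api-token").await?)
}
```

A real WASM build would typically obtain the token from its runtime environment (for example, a Cloudflare Workers secret) rather than a string literal.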
Add to your Cargo.toml:
```toml
[dependencies]
icepick = "0.1"
```

Connect to an AWS S3 Tables catalog and load a table:

```rust
use icepick::S3TablesCatalog;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create catalog from S3 Tables ARN
    let catalog = S3TablesCatalog::from_arn(
        "my-catalog",
        "arn:aws:s3tables:us-west-2:123456789012:bucket/my-bucket",
    ).await?;

    // Load a table
    let table = catalog.load_table(&"namespace.table_name".parse()?).await?;

    Ok(())
}
```

Or connect to a Cloudflare R2 Data Catalog:

```rust
use icepick::R2Catalog;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create catalog for R2
    let catalog = R2Catalog::new(
        "my-catalog",
        "account-id",
        "bucket-name",
        "api-token",
    ).await?;

    // Load a table
    let table = catalog.load_table(&"namespace.table_name".parse()?).await?;

    Ok(())
}
```

For any other Iceberg REST endpoint, use the generic `RestCatalog` builder:

```rust
use icepick::{FileIO, RestCatalog};
use opendal::Operator;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure your FileIO (S3, R2, filesystem, etc.)
    let operator = Operator::via_iter(opendal::Scheme::Memory, [])?;
    let file_io = FileIO::new(operator);

    // Build a catalog for any Iceberg REST endpoint (Nessie, Glue REST, custom services)
    let catalog = RestCatalog::builder("nessie", "https://nessie.example.com/api/iceberg")
        .with_prefix("warehouse")
        .with_file_io(file_io)
        .with_bearer_token(std::env::var("NESSIE_TOKEN")?)
        .build()?;

    let table = catalog.load_table(&"namespace.table".parse()?).await?;

    Ok(())
}
```

S3 Tables authentication uses the AWS default credential provider chain, in the following order:
- Environment variables (`AWS_ACCESS_KEY_ID`, `AWS_SECRET_ACCESS_KEY`)
- AWS credentials file (`~/.aws/credentials`)
- IAM instance profile (EC2)
- ECS task role
Important: Ensure your credentials have S3 Tables permissions.
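The sketch below (placeholder catalog name and ARN, same as the quick start) makes the zero-config behavior concrete: nothing credential-related is passed in, and an authorization failure usually points at missing S3 Tables permissions rather than a wrong key.

```rust
use icepick::S3TablesCatalog;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Nothing credential-related is passed in: the AWS default provider chain
    // resolves credentials from env vars, ~/.aws/credentials, an EC2 instance
    // profile, or an ECS task role, in that order.
    let catalog = S3TablesCatalog::from_arn(
        "my-catalog",
        "arn:aws:s3tables:us-west-2:123456789012:bucket/my-bucket",
    ).await?;

    // If this call is rejected, check that the resolved identity actually has
    // S3 Tables permissions.
    let table = catalog.load_table(&"namespace.table_name".parse()?).await?;
    Ok(())
}
```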
R2 Data Catalog authentication uses Cloudflare API tokens. To set one up:
- Log into the Cloudflare dashboard
- Navigate to My Profile → API Tokens
- Create a token with R2 read/write permissions
- Pass the token when constructing the catalog, as in the sketch below
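A minimal sketch of that last step, assuming the token is exposed through an environment variable (`CLOUDFLARE_API_TOKEN` is just a conventional name here, not something icepick reads itself; the other arguments are placeholders):

```rust
use icepick::R2Catalog;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Keep the API token out of source control by reading it from the environment.
    let token = std::env::var("CLOUDFLARE_API_TOKEN")?;

    let catalog = R2Catalog::new(
        "my-catalog",    // catalog name (placeholder)
        "account-id",    // Cloudflare account ID (placeholder)
        "bucket-name",   // R2 bucket name (placeholder)
        token.as_str(),
    ).await?;

    let table = catalog.load_table(&"namespace.table_name".parse()?).await?;
    Ok(())
}
```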
Need to write Parquet files directly to S3 for external tools (Spark, DuckDB, etc.) without Iceberg metadata? Use the `arrow_to_parquet` function:

```rust
use icepick::{arrow_to_parquet, FileIO, io::AwsCredentials};
use arrow::array::{Int32Array, StringArray};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use parquet::basic::Compression;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Setup FileIO with AWS credentials
    let file_io = FileIO::from_aws_credentials(
        AwsCredentials {
            access_key_id: "your-key".to_string(),
            secret_access_key: "your-secret".to_string(),
            session_token: None,
        },
        "us-west-2".to_string(),
    );

    // Create Arrow data
    let schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int32, false),
        Field::new("name", DataType::Utf8, false),
    ]));
    let batch = RecordBatch::try_new(
        schema,
        vec![
            Arc::new(Int32Array::from(vec![1, 2, 3])),
            Arc::new(StringArray::from(vec!["a", "b", "c"])),
        ],
    )?;

    // Simple write with defaults
    arrow_to_parquet(&batch, "s3://my-bucket/output.parquet", &file_io).await?;

    // With compression
    arrow_to_parquet(&batch, "s3://my-bucket/compressed.parquet", &file_io)
        .with_compression(Compression::ZSTD(parquet::basic::ZstdLevel::default()))
        .await?;

    // Manual partitioning (Hive-style or any structure)
    let date = "2025-01-15";
    let path = format!("s3://my-bucket/data/date={}/data.parquet", date);
    arrow_to_parquet(&batch, &path, &file_io).await?;

    Ok(())
}
```

Note: This writes standalone Parquet files without Iceberg metadata. For writing to Iceberg tables, use the Transaction API instead.
Explore complete working examples in the `examples/` directory:

| Example | Description | Command |
|---|---|---|
| `s3_tables_basic.rs` | Complete S3 Tables workflow | `cargo run --example s3_tables_basic` |
| `r2_basic.rs` | Complete R2 Data Catalog workflow | `cargo run --example r2_basic` |
Run the test suite:

```sh
cargo test
```

Verify that `R2Catalog` compiles for WASM:

```sh
cargo build --target wasm32-unknown-unknown
```

Before submitting changes:

```sh
# Format code
cargo fmt

# Run linter
cargo clippy -- -D warnings

# Check documentation
cargo doc --no-deps --all-features
```

Contributions are welcome! Please feel free to submit issues and pull requests.
Built on the official iceberg-rust library from the Apache Iceberg project.