
icepick


Experimental client for Apache Iceberg in Rust

icepick provides simple access to Apache Iceberg tables in AWS S3 Tables and Cloudflare R2 Data Catalog. Built on the official iceberg-rust library, icepick handles authentication, REST API details, and platform compatibility so you can focus on working with your data.


Why icepick?

Why not use iceberg-rust directly? icepick supports WASM as a compilation target (not yet supported by iceberg-rust) and focuses on "serverless" catalogs that implement a subset of the overall Iceberg specification.

Features

Catalog Support

  • AWS S3 Tables — Full support with SigV4 authentication (native platforms only)
  • Cloudflare R2 Data Catalog — Full support with bearer token auth (WASM-compatible)
  • Generic REST Catalog — Build clients for any Iceberg REST endpoint (Nessie, Glue REST, custom)
  • Direct S3 Parquet Writes — Write Arrow data directly to S3 without Iceberg metadata

Developer Experience

  • Clean API — Simple factory methods, no complex builders
  • Type-safe errors — Comprehensive error handling with context (see the error-chain sketch after this list)
  • Zero-config auth — Uses AWS credential chain and Cloudflare API tokens
  • Production-ready — Used in real applications with real data
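
Since catalog calls in the Quick Start examples below return errors as Box<dyn std::error::Error>, the attached context can be recovered by walking the source() chain. A minimal, library-agnostic sketch:

use std::error::Error;

/// Print an error and every underlying cause in its source() chain.
fn print_error_chain(err: &dyn Error) {
    eprintln!("error: {err}");
    let mut source = err.source();
    while let Some(cause) = source {
        eprintln!("  caused by: {cause}");
        source = cause.source();
    }
}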

Platform Support

| Catalog | Linux/macOS/Windows | WASM (browser/Cloudflare Workers) |
| --- | --- | --- |
| S3 Tables | ✅ | ❌ (requires AWS SDK) |
| R2 Data Catalog | ✅ | ✅ |
| No Catalog (direct Parquet to object storage) | ✅ | ✅ |

Note: R2 Data Catalog and direct Parquet writes are fully WASM-compatible, making them suitable for Cloudflare Workers, browser applications, and other WASM environments.

Installation

Add to your Cargo.toml:

[dependencies]
icepick = "0.1"

Quick Start

AWS S3 Tables

use icepick::S3TablesCatalog;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create catalog from S3 Tables ARN
    let catalog = S3TablesCatalog::from_arn(
        "my-catalog",
        "arn:aws:s3tables:us-west-2:123456789012:bucket/my-bucket"
    ).await?;

    // Load a table
    let table = catalog.load_table(
        &"namespace.table_name".parse()?
    ).await?;

    Ok(())
}

Cloudflare R2 Data Catalog

use icepick::R2Catalog;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Create catalog for R2
    let catalog = R2Catalog::new(
        "my-catalog",
        "account-id",
        "bucket-name",
        "api-token"
    ).await?;

    // Load a table
    let table = catalog.load_table(
        &"namespace.table_name".parse()?
    ).await?;

    Ok(())
}

Generic Iceberg REST Catalog

use icepick::{FileIO, RestCatalog};
use opendal::Operator;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Configure your FileIO (S3, R2, filesystem, etc.)
    let operator = Operator::via_iter(opendal::Scheme::Memory, [])?;
    let file_io = FileIO::new(operator);

    // Build a catalog for any Iceberg REST endpoint (Nessie, Glue REST, custom services)
    let catalog = RestCatalog::builder("nessie", "https://nessie.example.com/api/iceberg")
        .with_prefix("warehouse")
        .with_file_io(file_io)
        .with_bearer_token(std::env::var("NESSIE_TOKEN")?)
        .build()?;

    let table = catalog.load_table(&"namespace.table".parse()?).await?;
    Ok(())
}
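
Once a table is loaded (from any of the catalogs above), a typical next step is scanning it into Arrow record batches. The sketch below assumes icepick exposes iceberg-rust's scan API, since it is built on that library; the scan(), select_all(), and to_arrow() names come from iceberg-rust and may differ between versions. It also assumes the futures crate for stream iteration. Inside the async main above, after load_table:

use futures::TryStreamExt;

let mut batches = table
    .scan()
    .select_all()   // project every column
    .build()?
    .to_arrow()     // stream of Arrow RecordBatches
    .await?;

while let Some(batch) = batches.try_next().await? {
    println!("read a batch with {} rows", batch.num_rows());
}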

Authentication

AWS S3 Tables

Uses the AWS default credential provider chain in the following order:

  1. Environment variables (AWS_ACCESS_KEY_ID, AWS_SECRET_ACCESS_KEY)
  2. AWS credentials file (~/.aws/credentials)
  3. ECS task role
  4. EC2 IAM instance profile (IMDS)

Important: Ensure your credentials have S3 Tables permissions.
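
In practice no credentials appear in code. A minimal sketch assuming environment variables (step 1 above) are the source in play; the warning loop is purely illustrative:

use icepick::S3TablesCatalog;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Illustrative only: note which static credentials are absent, so a
    // fallback to the credentials file or a role is expected.
    for var in ["AWS_ACCESS_KEY_ID", "AWS_SECRET_ACCESS_KEY"] {
        if std::env::var(var).is_err() {
            eprintln!("{var} not set; later steps in the chain will be tried");
        }
    }

    // No credentials are passed explicitly; the chain resolves them.
    let _catalog = S3TablesCatalog::from_arn(
        "my-catalog",
        "arn:aws:s3tables:us-west-2:123456789012:bucket/my-bucket",
    ).await?;

    Ok(())
}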

Cloudflare R2 Data Catalog

Uses Cloudflare API tokens. To set up:

  1. Log into the Cloudflare dashboard
  2. Navigate to My Profile → API Tokens
  3. Create a token with R2 read/write permissions
  4. Pass the token when constructing the catalog (see the sketch below)
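
For example, reading the token from an environment variable and passing it to the constructor shown in the Quick Start (the R2_API_TOKEN variable name is just a convention for this sketch):

use icepick::R2Catalog;

// Inside an async context, as in the Quick Start above:
let token = std::env::var("R2_API_TOKEN")?;
let catalog = R2Catalog::new(
    "my-catalog",
    "account-id",
    "bucket-name",
    token.as_str(),
).await?;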

Direct S3 Parquet Writes

Need to write Parquet files directly to S3 for external tools (Spark, DuckDB, etc.) without Iceberg metadata? Use the arrow_to_parquet function:

use icepick::{arrow_to_parquet, FileIO, io::AwsCredentials};
use arrow::array::{Int32Array, StringArray};
use arrow::datatypes::{DataType, Field, Schema};
use arrow::record_batch::RecordBatch;
use parquet::basic::Compression;
use std::sync::Arc;

#[tokio::main]
async fn main() -> Result<(), Box<dyn std::error::Error>> {
    // Setup FileIO with AWS credentials
    let file_io = FileIO::from_aws_credentials(
        AwsCredentials {
            access_key_id: "your-key".to_string(),
            secret_access_key: "your-secret".to_string(),
            session_token: None,
        },
        "us-west-2".to_string()
    );

    // Create Arrow data
    let schema = Arc::new(Schema::new(vec![
        Field::new("id", DataType::Int32, false),
        Field::new("name", DataType::Utf8, false),
    ]));

    let batch = RecordBatch::try_new(
        schema,
        vec![
            Arc::new(Int32Array::from(vec![1, 2, 3])),
            Arc::new(StringArray::from(vec!["a", "b", "c"])),
        ],
    )?;

    // Simple write with defaults
    arrow_to_parquet(&batch, "s3://my-bucket/output.parquet", &file_io).await?;

    // With compression
    arrow_to_parquet(&batch, "s3://my-bucket/compressed.parquet", &file_io)
        .with_compression(Compression::ZSTD(parquet::basic::ZstdLevel::default()))
        .await?;

    // Manual partitioning (Hive-style or any structure)
    let date = "2025-01-15";
    let path = format!("s3://my-bucket/data/date={}/data.parquet", date);
    arrow_to_parquet(&batch, &path, &file_io).await?;

    Ok(())
}

Note: This writes standalone Parquet files without Iceberg metadata. For writing to Iceberg tables, use the Transaction API instead.
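
To sanity-check a file produced this way, it can be read back with the parquet crate (already used in the example above). A minimal sketch against a local copy of the file; reading directly from S3 would need an object-store-backed reader instead of File:

use std::fs::File;
use parquet::arrow::arrow_reader::ParquetRecordBatchReaderBuilder;

fn read_back(path: &str) -> Result<(), Box<dyn std::error::Error>> {
    let file = File::open(path)?;
    let reader = ParquetRecordBatchReaderBuilder::try_new(file)?.build()?;
    for batch in reader {
        let batch = batch?;
        println!("{} rows, {} columns", batch.num_rows(), batch.num_columns());
    }
    Ok(())
}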

Examples

Explore complete working examples in the examples/ directory:

| Example | Description | Command |
| --- | --- | --- |
| s3_tables_basic.rs | Complete S3 Tables workflow | cargo run --example s3_tables_basic |
| r2_basic.rs | Complete R2 Data Catalog workflow | cargo run --example r2_basic |

Development

Running Tests

cargo test

WASM Build

Verify R2Catalog compiles for WASM:

cargo build --target wasm32-unknown-unknown
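
If the target is not installed yet, add it first with rustup:

rustup target add wasm32-unknown-unknown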

Code Quality

# Format code
cargo fmt

# Run linter
cargo clippy -- -D warnings

# Check documentation
cargo doc --no-deps --all-features

Contributing

Contributions are welcome! Please feel free to submit issues and pull requests.

Acknowledgments

Built on the official iceberg-rust library from the Apache Iceberg project.
