api-evangelist/table-format

Table Format

Open table formats are open standards for organizing and managing data in data lakehouses. The three dominant formats are Apache Iceberg (the emerging industry standard), Delta Lake (Databricks-originated), and Apache Hudi (upsert-optimized). These formats bring ACID transactions, schema evolution, time travel, and efficient query planning to data lake storage. Apache Iceberg also defines a REST Catalog API that standardizes catalog operations across implementations.

APIs

Apache Iceberg REST Catalog API

An open REST API specification for interacting with Apache Iceberg table catalogs. It defines standard operations for managing namespaces, tables, views, and table metadata.
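
As a rough illustration of the namespace operations, the sketch below builds a list-namespaces request and parses a response body. The `/v1/{prefix}/namespaces` path and the `namespaces` response field follow the Iceberg REST specification as understood here; the base URL, the `prod` prefix, and both helper names are hypothetical, and no request is actually sent.

```python
import json
from urllib.parse import quote
from urllib.request import Request


def list_namespaces_request(base_url, prefix=""):
    """Build (but do not send) a GET request for listing namespaces."""
    parts = [base_url.rstrip("/"), "v1"]
    if prefix:
        # The prefix is optional and catalog-specific (an assumption here).
        parts.append(quote(prefix, safe=""))
    parts.append("namespaces")
    return Request(
        "/".join(parts),
        headers={"Accept": "application/json"},
        method="GET",
    )


def parse_namespaces(body):
    """Turn a response body into dotted namespace names.

    Each namespace arrives as a list of levels, e.g. ["accounting", "tax"].
    """
    payload = json.loads(body)
    return [".".join(levels) for levels in payload.get("namespaces", [])]


# Hypothetical catalog endpoint and a hand-written response body.
req = list_namespaces_request("https://catalog.example.com/", prefix="prod")
sample_body = '{"namespaces": [["accounting"], ["accounting", "tax"]]}'
names = parse_namespaces(sample_body)
```

Passing `req` to `urllib.request.urlopen` would send it; a real catalog will usually also require an authorization header, which is left out of this sketch.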

Delta Lake

Open-source storage framework that uses a transaction log to provide ACID guarantees on data lakes.

Apache Hudi

Lakehouse platform optimized for record-level upserts and incremental data processing.

Unity Catalog

Open, multi-modal catalog supporting the Iceberg REST Catalog API, Hive Metastore, and Delta Sharing.

Artifacts

OpenAPI Specifications

Spec                                      Description
apache-iceberg-rest-catalog-openapi.yml   Apache Iceberg REST Catalog API - namespaces, tables, views, commits

JSON Schema

Schema                                   Description
table-format-iceberg-table-schema.json   Schema for Apache Iceberg table metadata (v2 format)

JSON Structure

Structure                                   Description
table-format-iceberg-table-structure.json   Field documentation for Iceberg table metadata

JSON-LD

Context                       Description
table-format-context.jsonld   Linked data context for table format entities

Examples

Example                                        Description
apache-iceberg-list-namespaces-example.json    List namespaces in a catalog
apache-iceberg-create-table-example.json       Create a new Iceberg table
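
To give a feel for what a create-table request might carry, here is a minimal sketch of a request body. It is not the content of the example file listed above: the table name, the columns, and the `create_table_payload` helper are hypothetical, and the body shape (a `name` plus a `struct` schema with numbered fields) is an assumption based on the Iceberg REST specification, which is understood to accept such a body at `POST /v1/{prefix}/namespaces/{namespace}/tables`.

```python
import json


def create_table_payload(name, columns):
    """Build a create-table request body from (name, type, required) tuples.

    Field IDs are assigned sequentially from 1 for illustration only.
    """
    fields = [
        {"id": field_id, "name": col, "type": col_type, "required": required}
        for field_id, (col, col_type, required) in enumerate(columns, start=1)
    ]
    return {"name": name, "schema": {"type": "struct", "fields": fields}}


# Hypothetical table with two columns.
payload = create_table_payload(
    "events",
    [("event_id", "long", True), ("occurred_at", "timestamptz", False)],
)
body = json.dumps(payload, indent=2)
```

The numeric field IDs matter more than they look: Iceberg tracks columns by ID rather than by name, which is what lets a rename happen without rewriting data files.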

Vocabulary

File                          Description
table-format-vocabulary.yml   Domain vocabulary for open table format concepts

Key Concepts

  • Snapshot - Immutable point-in-time table state enabling time travel
  • Manifest File - Avro file tracking data files with column-level statistics
  • Catalog - Service mapping table names to metadata file locations
  • REST Catalog - Standardized HTTP API for catalog operations
  • Schema Evolution - Add/drop/rename columns without rewriting data
  • Partition Evolution - Change partitioning strategy without rewrites
  • ACID Transactions - Atomicity, Consistency, Isolation, Durability on object storage
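
The snapshot and time-travel concepts above can be sketched with a toy metadata document. The keys `snapshots`, `snapshot-id`, `timestamp-ms`, `manifest-list`, and `current-snapshot-id` follow Iceberg table metadata as understood here; the values, the storage paths, and the `snapshot_as_of` helper are made up for illustration, and a real reader would go on to load the manifest list rather than stop at the snapshot entry.

```python
def snapshot_as_of(metadata, timestamp_ms):
    """Return the newest snapshot committed at or before timestamp_ms, or None."""
    eligible = [
        snap
        for snap in metadata.get("snapshots", [])
        if snap["timestamp-ms"] <= timestamp_ms
    ]
    return max(eligible, key=lambda snap: snap["timestamp-ms"], default=None)


# Hand-written metadata with three snapshots (all values hypothetical).
metadata = {
    "format-version": 2,
    "current-snapshot-id": 3,
    "snapshots": [
        {"snapshot-id": 1, "timestamp-ms": 1_000, "manifest-list": "s3://bucket/meta/snap-1.avro"},
        {"snapshot-id": 2, "timestamp-ms": 2_000, "manifest-list": "s3://bucket/meta/snap-2.avro"},
        {"snapshot-id": 3, "timestamp-ms": 3_000, "manifest-list": "s3://bucket/meta/snap-3.avro"},
    ],
}

# Time travel: which snapshot was current at t = 2500 ms?
past = snapshot_as_of(metadata, 2_500)
```

Because snapshots are immutable, reading the table as of an earlier time only means picking an older entry; nothing is recomputed or copied.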

Tags

  • Data Lakehouse
  • Open Table Format
  • Apache Iceberg
  • Delta Lake
  • Apache Hudi
  • ACID Transactions
  • Schema Evolution
  • Time Travel

About

This repository catalogs open table formats for data lakehouses (Apache Iceberg, Delta Lake, and Apache Hudi) together with the API specifications, schemas, examples, and vocabulary that describe them.
