This is an incubator project for a new provider in the Transferia ecosystem.
Iceberg is a provider implementation that handles data processing and transformation tasks. It's designed to be integrated into the Transferia ecosystem as a new data processing provider.
- Go 1.23 or higher
- Docker (for running tests with testcontainers)
- Make
- Clone the repository:
git clone https://github.com/transferia/iceberg.git
cd iceberg
- Install dependencies:
go mod download
- Build the project:
make build
Run the test suite:
make test
For detailed test reports:
make run-tests
Test reports are generated in the reports/ directory.
The project uses standard Go tooling and Make for common tasks:
- make clean - Remove build artifacts
- make build - Build the project
- make test - Run tests
- make run-tests - Run tests with detailed reporting
- cmd/ - Main application entry points: a custom main file, the same as in transfer but with the extra plugin
- reports/ - Test reports
- binaries/ - Compiled binaries
- doc/ - Documentation, including design documents
- ...rest - Plugin code base
The Iceberg Provider implements a Table Reading mechanism that:
- Provides efficient data access through optimized manifest processing
- Ensures data consistency through snapshot-based reading
- Implements advanced optimization techniques like partition pruning and column projection
For more details, see the Iceberg Table Reading Design Document.
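The two optimizations named above can be illustrated with a minimal sketch. It is not the provider's actual code: DataFile, PruneByDay, and Project are hypothetical names, and the sketch assumes each manifest entry exposes min/max bounds for a single integer partition column.

```go
package main

import "fmt"

// DataFile is a hypothetical, simplified stand-in for a manifest entry:
// a file path plus the min/max bounds of its partition column.
type DataFile struct {
	Path   string
	MinDay int // lower bound of the partition value
	MaxDay int // upper bound of the partition value
}

// PruneByDay keeps only the files whose [MinDay, MaxDay] range overlaps
// the requested [from, to] range, so the other files are never opened.
func PruneByDay(files []DataFile, from, to int) []DataFile {
	var kept []DataFile
	for _, f := range files {
		if f.MaxDay >= from && f.MinDay <= to {
			kept = append(kept, f)
		}
	}
	return kept
}

// Project returns a copy of the row that contains only the requested columns.
// Columns missing from the row are skipped.
func Project(row map[string]any, columns []string) map[string]any {
	out := make(map[string]any, len(columns))
	for _, c := range columns {
		if v, ok := row[c]; ok {
			out[c] = v
		}
	}
	return out
}

func main() {
	files := []DataFile{
		{Path: "a.parquet", MinDay: 1, MaxDay: 10},
		{Path: "b.parquet", MinDay: 11, MaxDay: 20},
		{Path: "c.parquet", MinDay: 21, MaxDay: 30},
	}
	// Only files overlapping days 12..22 survive pruning.
	for _, f := range PruneByDay(files, 12, 22) {
		fmt.Println("scan:", f.Path)
	}

	row := map[string]any{"id": 1, "name": "x", "payload": "large blob"}
	fmt.Println("projected columns:", len(Project(row, []string{"id", "name"})))
}
```

In a real reader the bounds would come from the table's manifests and the projection would be pushed down to the Parquet reader, so unused columns are never decoded.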
The Iceberg Provider implements a Snapshot Sink mechanism that:
- Efficiently transforms incoming data into Parquet files
- Tracks files generated by each worker
- Coordinates file registration using a central coordinator
- Atomically commits all files to the target table in a single transaction
For more details, see the Snapshot Sink Design Document.
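The worker and coordinator flow described above can be sketched roughly as follows. This is an in-memory illustration under stated assumptions, not the provider's implementation: Coordinator, Register, and CommitAll are hypothetical names, and the commit callback stands in for the single table transaction.

```go
package main

import (
	"fmt"
	"sort"
	"sync"
)

// Coordinator is a hypothetical in-memory stand-in for the central
// coordinator that tracks which Parquet files each worker produced.
type Coordinator struct {
	mu    sync.Mutex
	files map[int][]string // worker ID -> files written by that worker
}

// NewCoordinator returns an empty coordinator.
func NewCoordinator() *Coordinator {
	return &Coordinator{files: make(map[int][]string)}
}

// Register records files written by one worker. It is safe for concurrent use.
func (c *Coordinator) Register(workerID int, paths ...string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.files[workerID] = append(c.files[workerID], paths...)
}

// CommitAll passes every registered file to commit in a single call,
// which models committing all files in one transaction.
func (c *Coordinator) CommitAll(commit func(paths []string) error) error {
	c.mu.Lock()
	defer c.mu.Unlock()
	var all []string
	for _, paths := range c.files {
		all = append(all, paths...)
	}
	sort.Strings(all) // deterministic order for the example
	if err := commit(all); err != nil {
		return err // registrations are kept, so the commit can be retried
	}
	c.files = make(map[int][]string)
	return nil
}

func main() {
	c := NewCoordinator()
	var wg sync.WaitGroup
	for w := 0; w < 3; w++ {
		wg.Add(1)
		go func(id int) {
			defer wg.Done()
			// Each worker reports the file it wrote.
			c.Register(id, fmt.Sprintf("part-%d.parquet", id))
		}(w)
	}
	wg.Wait()

	err := c.CommitAll(func(paths []string) error {
		fmt.Println("committing", len(paths), "files in one transaction")
		return nil
	})
	if err != nil {
		fmt.Println("commit failed:", err)
	}
}
```

Keeping the registrations when the commit fails means a retry sees the same file set, so readers observe either none of the snapshot's files or all of them.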
The Iceberg Provider also implements a Streaming Sink mechanism that:
- Processes data in real-time as it arrives
- Maintains continuous data ingestion with minimal latency
- Provides exactly-once semantics for data delivery
- Supports automatic schema evolution and data type mapping
Note: The Streaming Sink is intended for append-only sources only; it does not support CDC (change data capture).
This project is part of the Transferia ecosystem and follows its contribution guidelines. Please refer to the main Transferia repository for more information.