OLake

The fastest open-source tool for replicating databases to Apache Iceberg or a data lakehouse. ⚡ Efficient, fast, and scalable data ingestion for real-time analytics, starting with MongoDB. Visit olake.io/docs for the full documentation and benchmarks.

The OLake connector ecosystem focuses on these key points:

Integrated writers, so that reads are not blocked and data is pushed directly into destinations
Connector autonomy
Avoiding operations that don't contribute to increasing record throughput
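
As a rough illustration of the first point, the Go sketch below decouples a reader from a writer with a buffered channel, so the reader keeps producing while the writer drains. It is not OLake's actual implementation: `Record`, `Replicate`, and the buffer size are hypothetical names and values chosen for this example.

```go
package main

import (
	"fmt"
	"sync"
)

// Record is a hypothetical stand-in for one row or document read from a source.
type Record struct {
	ID int
}

// Replicate streams n simulated records from a reader to a writer.
// The buffered channel lets the reader keep producing while the writer
// drains, so neither side waits for the other to finish a whole batch.
func Replicate(n, buffer int) int {
	records := make(chan Record, buffer)
	var wg sync.WaitGroup
	written := 0

	// Writer goroutine: consumes records as soon as they arrive.
	wg.Add(1)
	go func() {
		defer wg.Done()
		for range records {
			written++ // a real writer would push to the destination here
		}
	}()

	// Reader: produces records and hands them straight to the writer.
	for i := 0; i < n; i++ {
		records <- Record{ID: i}
	}
	close(records)

	// Wait for the writer to drain the channel before reading the count.
	wg.Wait()
	return written
}

func main() {
	fmt.Println("records written:", Replicate(1000, 64))
}
```

A larger buffer absorbs short stalls on the writer side at the cost of memory; with a buffer of zero the reader would block on every record until the writer picks it up.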

Getting Started with OLake

Source / Connectors

Writers / Destination

Source/Connector Functionalities

Functionality	MongoDB	Postgres	MySQL
Full Refresh Sync Mode	✅	✅	✅
Incremental Sync Mode	❌	❌	❌
CDC Sync Mode	✅	✅	✅
Full Parallel Processing	✅	✅	✅
CDC Parallel Processing	✅	❌	❌
Resumable Full Load	✅	✅	✅
CDC Heart Beat	❌	❌	❌

Additional planned sources: AWS S3 and Kafka.

Writer Functionalities

Functionality	Local Filesystem	AWS S3
Flattening & Normalization (L1)	✅	✅
Partitioning	✅	✅
Schema Changes	✅	✅
Schema Evolution	✅	✅
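
The table does not define "L1". One plausible reading is level-1 flattening: top-level fields become columns, and nested objects or arrays are kept as JSON strings. The Go sketch below illustrates that reading only; `FlattenL1` is a hypothetical helper, not a function from the OLake codebase.

```go
package main

import (
	"encoding/json"
	"fmt"
)

// FlattenL1 keeps top-level scalar fields as they are and serializes nested
// objects and arrays to JSON strings. This is an assumed interpretation of
// "Flattening & Normalization (L1)", not OLake's confirmed behavior.
func FlattenL1(doc map[string]any) (map[string]any, error) {
	out := make(map[string]any, len(doc))
	for key, value := range doc {
		switch value.(type) {
		case map[string]any, []any:
			// Nested value: store it as a single JSON string column.
			encoded, err := json.Marshal(value)
			if err != nil {
				return nil, fmt.Errorf("flatten %q: %w", key, err)
			}
			out[key] = string(encoded)
		default:
			// Scalar value: keep it as its own column.
			out[key] = value
		}
	}
	return out, nil
}

func main() {
	doc := map[string]any{
		"id":      1,
		"address": map[string]any{"city": "Pune"},
		"tags":    []any{"a", "b"},
	}
	row, err := FlattenL1(doc)
	if err != nil {
		panic(err)
	}
	fmt.Println(row["id"], row["address"], row["tags"])
}
```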

Supported Catalogs For Iceberg Writer

Catalog	Status
Glue Catalog	WIP
Hive Metastore	Upcoming
JDBC Catalog	Upcoming
REST Catalog - Nessie	Upcoming
REST Catalog - Polaris	Upcoming
REST Catalog - Unity	Upcoming
REST Catalog - Gravitino	Upcoming
Azure Purview	Not Planned, submit a request
BigLake Metastore	Not Planned, submit a request

Core

Core, or the framework, is the logic that has been abstracted out of the connectors to follow the DRY (don't repeat yourself) principle. It includes the base CLI commands, state logic, validation logic, type detection for unstructured data, handling of the Config, State, Catalog, and Writer config files, logging, and so on.

Core includes an HTTP server that exposes live stats about the running sync, such as:

Estimated finish time
Concurrently running processes
Live record count
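
To make the idea concrete, here is a minimal Go sketch of an HTTP handler that reports the three stats listed above. The type `SyncStats`, the JSON field names, the `/stats` path, and the linear time estimate are all assumptions made for illustration; the source does not describe OLake's real endpoint or response format.

```go
package main

import (
	"encoding/json"
	"fmt"
	"net/http"
	"net/http/httptest"
	"sync/atomic"
	"time"
)

// SyncStats holds hypothetical live counters for a running sync.
type SyncStats struct {
	RecordCount      atomic.Int64 // records synced so far
	RunningProcesses atomic.Int64 // concurrently running processes
	TotalRecords     int64        // expected total, used for the estimate
	StartedAt        time.Time
}

// Snapshot is the JSON payload returned to clients (field names are assumed).
type Snapshot struct {
	RecordCount        int64  `json:"record_count"`
	RunningProcesses   int64  `json:"running_processes"`
	EstimatedRemaining string `json:"estimated_remaining"`
}

// Snapshot estimates the remaining time by assuming the average
// per-record duration observed so far stays constant.
func (s *SyncStats) Snapshot(elapsed time.Duration) Snapshot {
	done := s.RecordCount.Load()
	remaining := "unknown"
	if done > 0 && s.TotalRecords >= done {
		perRecord := elapsed / time.Duration(done)
		remaining = (perRecord * time.Duration(s.TotalRecords-done)).String()
	}
	return Snapshot{
		RecordCount:        done,
		RunningProcesses:   s.RunningProcesses.Load(),
		EstimatedRemaining: remaining,
	}
}

// ServeHTTP writes the current snapshot as JSON.
func (s *SyncStats) ServeHTTP(w http.ResponseWriter, r *http.Request) {
	w.Header().Set("Content-Type", "application/json")
	_ = json.NewEncoder(w).Encode(s.Snapshot(time.Since(s.StartedAt)))
}

func main() {
	stats := &SyncStats{TotalRecords: 300, StartedAt: time.Now().Add(-10 * time.Second)}
	stats.RecordCount.Store(100)
	stats.RunningProcesses.Store(4)

	// Exercise the handler in-process so the example terminates; a real
	// server would pass the handler to http.ListenAndServe instead.
	rec := httptest.NewRecorder()
	stats.ServeHTTP(rec, httptest.NewRequest(http.MethodGet, "/stats", nil))
	fmt.Print(rec.Body.String())
}
```

Atomic counters let the sync workers update the numbers without locks while the handler reads them concurrently.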

Core handles the following commands for interacting with a driver:

spec command: returns a renderable JSON Schema that can be consumed by rjsf libraries in the frontend
check command: performs all necessary checks on the Config, Catalog, State, and Writer config
discover command: returns all streams and their schemas
sync command: extracts data from the source and writes it into the destinations
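
The four commands above could be routed to a driver along the following lines. This Go sketch only borrows the command names from the list; `Handler`, `Dispatch`, `PlaceholderHandlers`, and the returned strings are hypothetical and do not reflect OLake's actual CLI code or flags.

```go
package main

import (
	"errors"
	"fmt"
)

// Handler is a hypothetical signature for one command implementation.
type Handler func() (string, error)

// Dispatch looks up the handler for args[0] and runs it.
func Dispatch(handlers map[string]Handler, args []string) (string, error) {
	if len(args) == 0 {
		return "", errors.New("usage: <spec|check|discover|sync>")
	}
	handler, ok := handlers[args[0]]
	if !ok {
		return "", fmt.Errorf("unknown command %q", args[0])
	}
	return handler()
}

// PlaceholderHandlers returns stub handlers; a real driver would
// implement each command against its source and destination.
func PlaceholderHandlers() map[string]Handler {
	return map[string]Handler{
		"spec":     func() (string, error) { return "JSON Schema for the config form", nil },
		"check":    func() (string, error) { return "config, catalog, state and writer config validated", nil },
		"discover": func() (string, error) { return "streams and their schemas", nil },
		"sync":     func() (string, error) { return "records extracted and written", nil },
	}
}

func main() {
	handlers := PlaceholderHandlers()
	for _, command := range []string{"spec", "check", "discover", "sync"} {
		out, err := Dispatch(handlers, []string{command})
		if err != nil {
			fmt.Println("error:", err)
			continue
		}
		fmt.Printf("%s: %s\n", command, out)
	}
}
```

Keeping the dispatch in one shared place is what lets every connector expose the same commands without repeating the routing logic.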

Find out more about how OLake works here.

Roadmap

Check out the GitHub Project Roadmap and the Upcoming OLake Roadmap to track and influence what we build. If you have ideas, questions, or feedback, please share them on our GitHub Discussions or raise an issue.

Contributing

We ❤️ contributions, big or small. Check out our Bounty Program. As always, thanks to our amazing contributors!

To contribute to OLake, check CONTRIBUTING.md.
To contribute to the UI, visit the OLake UI Repository.
To contribute to the OLake website and documentation (olake.io), visit the OLake Docs Repository.

