
GlassFlow is an open-source ETL tool that enables real-time data processing from Kafka to ClickHouse with features like deduplication and temporal joins.
This guide walks you through a local installation using Docker Compose, which is well suited to development, testing, or trying out GlassFlow on your machine.
- Clone the repository:
git clone https://github.com/glassflow/clickhouse-etl.git
cd clickhouse-etl
- Start the services:
docker compose up
- Access the web interface at http://localhost:8080 to configure your pipeline.
- View the logs:
# Follow logs in real-time for all containers
docker compose logs -f
# logs for the backend app
docker compose logs app -f
# logs for the UI
docker compose logs ui -f
GlassFlow can be installed in a variety of environments depending on your use case. Below is a quick overview:
Method | Use Case | Docs Link |
---|---|---|
🐳 Local with Docker Compose | Quick evaluation and local testing | Local Docker Guide |
☁️ AWS EC2 with Docker Compose | Lightweight cloud deployment for testing | AWS EC2 Guide |
☸️ Kubernetes with Helm | Kubernetes deployment | Kubernetes Helm Guide |
ℹ️ Note: The current GlassFlow deployment is not horizontally scalable yet. A new Kubernetes-native, scalable deployment is in development and expected by end of July.
See a working demo of GlassFlow in action at demo.glassflow.dev.
Figure: GlassFlow pipeline showing real-time streaming from Kafka through GlassFlow to ClickHouse.
For detailed documentation, visit docs.glassflow.dev. The documentation includes:
- Installation Guide
- Usage Guide
- Pipeline Configuration
- Local Testing
- Architecture
- Load Test Results: performance benchmarks and test results
Check out our public roadmap to see what's coming next in GlassFlow. We're actively working on new features and improvements based on community feedback.
Want to suggest a feature? We'd love to hear from you! Please use our GitHub Discussions to share your ideas and help shape the future of GlassFlow.
- Real-time data processing from Kafka to ClickHouse
- Deduplication with configurable time windows
- Temporal joins between multiple Kafka topics
- Web-based UI for pipeline management
- Docker-based deployment
- Local development environment
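To make the deduplication and temporal join features above more concrete, here is a minimal conceptual sketch in Python. It is not GlassFlow's implementation and does not use any GlassFlow API; the function names, the `ts` timestamp field (in seconds), and the example keys are all assumptions made for illustration. It also assumes events arrive ordered by timestamp and fit in memory, which a real streaming system cannot rely on.

```python
from typing import Any, Dict, Iterable, Iterator, List, Tuple

Event = Dict[str, Any]  # hypothetical event shape: a dict with a "ts" field in seconds


def deduplicate(events: Iterable[Event], key: str, window_seconds: float) -> Iterator[Event]:
    """Drop an event if one with the same key was emitted less than window_seconds earlier.

    Assumes events are ordered by their "ts" field.
    """
    last_emitted: Dict[Any, float] = {}  # key value -> timestamp of the last emitted event
    for event in events:
        k, ts = event[key], event["ts"]
        previous = last_emitted.get(k)
        if previous is not None and ts - previous < window_seconds:
            continue  # duplicate inside the window: skip it
        last_emitted[k] = ts
        yield event


def temporal_join(
    left: Iterable[Event], right: Iterable[Event], key: str, window_seconds: float
) -> Iterator[Tuple[Event, Event]]:
    """Pair each left event with right events that share its key and lie within the time window."""
    right_by_key: Dict[Any, List[Event]] = {}
    for r in right:
        right_by_key.setdefault(r[key], []).append(r)
    for l in left:
        for r in right_by_key.get(l[key], []):
            if abs(l["ts"] - r["ts"]) <= window_seconds:
                yield (l, r)


if __name__ == "__main__":
    clicks = [
        {"id": "a", "ts": 0},
        {"id": "a", "ts": 5},   # same id within 10 s of the first: dropped
        {"id": "b", "ts": 6},
        {"id": "a", "ts": 20},  # outside the window: kept
    ]
    for event in deduplicate(clicks, key="id", window_seconds=10):
        print(event)

    orders = [{"user": "u1", "ts": 10, "order": 1}]
    profiles = [
        {"user": "u1", "ts": 7, "plan": "free"},
        {"user": "u1", "ts": 30, "plan": "pro"},  # too far from the order in time
    ]
    for pair in temporal_join(orders, profiles, key="user", window_seconds=5):
        print(pair)
```

In this sketch the window bounds how long a key is remembered for deduplication and how far apart in time two records may be to join; how GlassFlow itself stores and expires that state is described in its documentation rather than here.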
This project is licensed under the Apache License 2.0.