░██████╗███╗░░██╗░█████╗░██████╗░██████╗░
██╔════╝████╗░██║██╔══██╗██╔══██╗██╔══██╗
╚█████╗░██╔██╗██║███████║██████╔╝██████╔╝
░╚═══██╗██║╚████║██╔══██║██╔═══╝░██╔═══╝░
██████╔╝██║░╚███║██║░░██║██║░░░░░██║░░░░░
╚═════╝░╚═╝░░╚══╝╚═╝░░╚═╝╚═╝░░░░░╚═╝░░░░░
Spring Boot service for ingesting and querying traffic data using ClickHouse.
docker compose up --build
http://localhost:8080/swagger-ui.html
POST /api/traffic/{request_id}
GET /api/traffic/daily/{user_id}?page=0&size=10
- Async ingestion
- Idempotent writes
- ClickHouse ReplacingMergeTree
- Metrics via Actuator
- Layered architecture (Controller / Service / Repository)
- Append-only writes optimized for ClickHouse
- Logical updates via ReplacingMergeTree
- Idempotent write API using request_id
- Asynchronous ingestion for low-latency writes
- Centralized exception handling
- Built-in observability (metrics & health checks)
- Engine: ReplacingMergeTree
- Partitioning: Monthly (toYYYYMM(created_at))
- Ordering: (user_id, created_at, request_id)
- Chosen to balance:
- High write throughput
- Efficient time-based aggregation
- Update capability without mutable storage
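The schema choices above could look like the following DDL sketch; the column names beyond request_id, user_id, and created_at are assumptions inferred from the API examples, not taken verbatim from the project's init.sql:

```sql
-- Sketch of the events table (assumed columns).
-- ReplacingMergeTree(version) keeps, after merges, the row with the
-- highest version for each distinct ORDER BY key.
CREATE TABLE IF NOT EXISTS traffic.traffic_events
(
    request_id UUID,
    user_id    UInt64,
    created_at DateTime,
    page_url   String,
    version    UInt64
)
ENGINE = ReplacingMergeTree(version)
PARTITION BY toYYYYMM(created_at)
ORDER BY (user_id, created_at, request_id);
```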
- Why eclipse-temurin:17-jre?
I used eclipse-temurin:17-jre because it provides a secure, vendor-neutral OpenJDK distribution. The JRE variant is sufficient for running a Spring Boot application and keeps the container image minimal and production-ready.
- What does init.sql do?
The init.sql file is mounted into the ClickHouse container under /docker-entrypoint-initdb.d/ and is executed automatically only once, during the first startup of the container when the data directory is empty. Its responsibility is to bootstrap the database schema by creating the required database and tables, including defining the storage engine, partitioning, and ordering strategy. In this project, the table uses the ReplacingMergeTree(version) engine because ClickHouse does not support traditional row-level updates; instead, updates are modeled as new inserts with a higher version value, and during background merge operations ClickHouse keeps the latest version of each record (based on the ORDER BY key) while discarding older ones. This approach enables an update endpoint at the application level while remaining fully aligned with ClickHouse’s OLAP-oriented, append-only design principles.
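A minimal sketch of the update-as-insert pattern described above, assuming the column layout (request_id, user_id, created_at, page_url, version); the values are illustrative:

```sql
-- Initial write (version 1)
INSERT INTO traffic.traffic_events VALUES
    ('11111111-1111-1111-1111-111111111111', 42,
     '2025-01-01 12:00:00', 'https://example.com/home', 1);

-- Logical update: same ORDER BY key, higher version
INSERT INTO traffic.traffic_events VALUES
    ('11111111-1111-1111-1111-111111111111', 42,
     '2025-01-01 12:00:00', 'https://example.com/profile', 2);

-- FINAL forces deduplication at query time, so only the
-- highest-version row per key is returned even before a
-- background merge has run.
SELECT * FROM traffic.traffic_events FINAL
WHERE user_id = 42;
```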
- How is observability handled?
I exposed application metrics via Micrometer and Actuator, scraped them using Prometheus, and visualized ingestion throughput and API behavior in Grafana. This allows observing system health and data ingestion performance in real time.
| Service | URL |
|---|---|
| Traffic API | http://localhost:8080 |
| Swagger | http://localhost:8080/swagger-ui.html |
| Actuator | http://localhost:8080/actuator |
| Prometheus | http://localhost:9090 |
| Grafana | http://localhost:3000 |
| Component | Version | Notes |
|---|---|---|
| Java | 17 (LTS) | Long-term support, production standard |
| Spring Boot | 3.2.5 | Modern Spring stack, Jakarta EE |
| ClickHouse Server | latest (Docker) | Stable; HTTP & native protocol support |
| ClickHouse JDBC Driver | 0.6.3 | Official driver |
| Springdoc OpenAPI | 2.5.0 | Swagger UI for API documentation |
| Micrometer | Spring Boot managed | Metrics instrumentation |
| Prometheus Registry | Spring Boot managed | Metrics export |
| Spring Boot Actuator | Spring Boot managed | Health & metrics endpoints |
| Docker | latest | Containerization |
| Docker Compose | v3.8 | Local orchestration |
| JUnit Jupiter | Spring Boot managed | Testing framework |
| Testcontainers | 1.19.8 | Integration tests against real ClickHouse, not mocks |
- Write request
curl -X POST http://localhost:8080/api/traffic/11111111-1111-1111-1111-111111111111 \
  -H "Content-Type: application/json" \
  -d '{
    "userId": 42,
    "createdAt": "2025-01-01T12:00:00",
    "pageUrl": "https://example.com/home"
  }'
- Idempotency test (re-sending the same request_id)
curl -X POST http://localhost:8080/api/traffic/11111111-1111-1111-1111-111111111111 \
  -H "Content-Type: application/json" \
  -d '{ ... }'
- Update request
curl -X PUT http://localhost:8080/api/traffic/11111111-1111-1111-1111-111111111111 \
  -H "Content-Type: application/json" \
  -d '{
    "userId": 42,
    "createdAt": "2025-01-01T12:05:00",
    "pageUrl": "https://example.com/profile"
  }'
- Read request (daily report)
curl "http://localhost:8080/api/traffic/daily/42?page=0&size=10"
| Category | Purpose | Command |
|---|---|---|
| Run | Build and run | docker compose up --build |
| Stop | Stop and remove volumes | docker compose down -v |
| Logs | View app logs | docker compose logs traffic-app |
| Logs | View ClickHouse logs | docker compose logs clickhouse |
| Logs | View Prometheus logs | docker compose logs prometheus |
| Shell | Enter traffic app container | docker exec -it traffic-app /bin/sh |
| Shell | Enter ClickHouse client | docker exec -it clickhouse clickhouse-client |
| Shell | Enter Prometheus container | docker exec -it prometheus /bin/sh |
| API test | Health check | curl http://localhost:8080/actuator/health |
| API test | Prometheus metrics | curl http://localhost:8080/actuator/prometheus |
| API test | Insert traffic event | curl -X POST http://localhost:8080/api/traffic/{uuid} |
| API test | Update traffic event | curl -X PUT http://localhost:8080/api/traffic/{uuid} |
| API test | Daily report | curl http://localhost:8080/api/traffic/daily/{userId} |
| Database | Show tables | SHOW TABLES FROM traffic; |
| Database | Inspect data | SELECT * FROM traffic.traffic_events; |
| Database | Validate daily aggregation | SELECT page_url, COUNT() FROM traffic.traffic_events WHERE user_id = 42 AND created_at >= now() - INTERVAL 1 DAY GROUP BY page_url; |
| Prometheus | Access UI | http://localhost:9090 |
| Category | Purpose | Command |
|---|---|---|
| Build | Clean and compile | mvn clean compile |
| Test | Run all unit tests | mvn test |
| Test | Run a specific test class | mvn -Dtest=TrafficServiceTest test |
| Test | Run a specific test method | mvn -Dtest=TrafficServiceTest#dailyReport_shouldReturnCounts test |
| Package | Create executable JAR | mvn clean package |
| Package | Skip tests during build | mvn clean package -DskipTests |
| Package | Package without running tests | mvn clean package -Dmaven.test.skip=true |
| Docker | Build Docker image via Spring Boot | mvn spring-boot:build-image |
| Docker | Build image with custom name | mvn spring-boot:build-image -Dspring-boot.build-image.imageName=traffic-ingestion:latest |
| Dependency | Show dependency tree | mvn dependency:tree |
| Debug | Run with debug output | mvn clean test -X |
| Debug | Resume failed build from a module | mvn install -rf :traffic-ingestion |
Identifying the Bottleneck in the Current System

In the current system, ClickHouse is used directly as the primary storage for ingesting traffic events. While ClickHouse is an excellent OLAP database, this design makes it the main bottleneck under heavy write and large-scale read workloads.
- Why ClickHouse Becomes the Bottleneck: ClickHouse is optimized for analytical queries and batch-oriented data ingestion, not for high-frequency transactional writes.
The main bottlenecks are:
- High-frequency small writes:
- ClickHouse performs best with large batch inserts.
- Continuous single-row inserts create many small data parts.
- This significantly increases background merge operations.
- Expensive background merges:
- MergeTree engines continuously merge parts to optimize query performance.
- Under heavy write load, merge operations consume CPU and disk I/O.
- This directly impacts query latency.
- Read–write contention:
- OLAP queries and ingestion share the same resources.
- Analytical queries suffer from increased latency during peak writes.
- Lack of write buffering:
- The system writes directly to ClickHouse synchronously.
- Sudden traffic spikes cannot be absorbed gracefully.
As a result, ClickHouse becomes the primary bottleneck and limits system scalability beyond a certain write throughput.
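Within the current design, ClickHouse's server-side asynchronous inserts can partially absorb small-write pressure by buffering rows into larger parts before flushing. This is a mitigation, not a substitute for the redesign that follows; the values below are illustrative:

```sql
-- Buffer small inserts server-side and flush them as larger parts.
-- wait_for_async_insert = 0 acknowledges the write before the flush,
-- trading durability guarantees for lower write latency.
INSERT INTO traffic.traffic_events
SETTINGS async_insert = 1, wait_for_async_insert = 0
VALUES ('11111111-1111-1111-1111-111111111111', 42,
        '2025-01-01 12:00:00', 'https://example.com/home', 1);
```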
The redesigned system must satisfy the following requirements:
- Tolerance for heavy write loads
- Low-latency OLAP queries
- High availability
- Scalable analytics on very large datasets
- Support both real-time and batch processing
- Use only open-source technologies
Apache Flink (Real-Time Processing)
- Query engine over the data lakehouse (DLH)
- Windowed aggregations
- Enrichment
- Stateful processing
- Flink CDC for ingestion; Flink SQL for unified batch/stream processing (Table & DataStream APIs)
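As an illustration of the windowed aggregations mentioned above, a hypothetical Flink SQL job could look like the following; the table and connector names are assumptions, and created_at is assumed to be declared as the event-time attribute of the source table:

```sql
-- Hypothetical Flink SQL: per-minute page-view counts via a tumbling-window TVF,
-- written continuously into a ClickHouse-backed sink table.
INSERT INTO clickhouse_daily_counts
SELECT user_id,
       page_url,
       window_start,
       COUNT(*) AS views
FROM TABLE(
    TUMBLE(TABLE traffic_events_stream, DESCRIPTOR(created_at), INTERVAL '1' MINUTE))
GROUP BY user_id, page_url, window_start, window_end;
```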
ClickHouse (RTOLAP Serving Layer)
- Used strictly for analytical queries
- Optimized schema (e.g., AggregatingMergeTree)
- No direct user writes
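A sketch of what the optimized serving schema might look like, using AggregatingMergeTree as suggested above; all object names here are assumptions for illustration:

```sql
-- Pre-aggregated daily counts; AggregateFunction(count) stores partial
-- aggregation states that are combined during background merges.
CREATE TABLE traffic.daily_page_counts
(
    user_id  UInt64,
    day      Date,
    page_url String,
    views    AggregateFunction(count)
)
ENGINE = AggregatingMergeTree
PARTITION BY toYYYYMM(day)
ORDER BY (user_id, day, page_url);

-- Populate the aggregate table as events arrive.
CREATE MATERIALIZED VIEW traffic.daily_page_counts_mv
TO traffic.daily_page_counts AS
SELECT user_id, toDate(created_at) AS day, page_url, countState() AS views
FROM traffic.traffic_events
GROUP BY user_id, day, page_url;

-- Read side: finalize partial states with the -Merge combinator.
SELECT page_url, countMerge(views) AS views
FROM traffic.daily_page_counts
WHERE user_id = 42 AND day = today()
GROUP BY page_url;
```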
Apache Fluss (Ingestion & Hot Layer)
- Removes the need for separate deduplication ETL jobs
- Storage unification across all layers
- Enables delta joins
Iceberg or Paimon (Analytics Layer)
- Stores large-scale, cold datasets
- Provides ACID transactions, time travel, and data versioning
- Uses snapshots and a metadata layer for consistent reads