The Fan-Out Engine is a Java-based distributed data processing application. It reads records from a source file and sends them to multiple sinks (REST API, gRPC, Message Queue, Wide-Column Database) with specific transformations applied for each sink.
It is designed for high-performance processing, handling large files efficiently using concurrency, throttling, and error handling.
Concurrent Processing: Processes records in parallel using Java ExecutorService.
Memory Efficient: Streams input files without loading the entire file into memory.
Multiple Sinks:
REST API
gRPC
Message Queue
Wide-Column Database (Cassandra / Aerospike / DynamoDB / ScyllaDB)
Transformations (Strategy Pattern):
JSON → JSON
JSON → Protobuf
JSON → XML
JSON → Wide-Column Map
Throttling & Resilience:
Configurable rate limits per sink
Retry logic (up to 3 retries per record)
Backpressure using Semaphore or BlockingQueue
Observability: Prints processing statistics for monitoring.
Java 8 or higher
Maven 3.6 or higher
Git
1️⃣ Clone the repository git clone https://github.com/Anil-kumar9861/Fanout-Engine.git cd Fanout-Engine
2️⃣ Build the project mvn clean install
3️⃣ Run the application java -cp target/classes com.fanout.Main
4️⃣ Sample Input (input.jsonl) {"id":1,"name":"Anil"} {"id":2,"name":"Kumar"} {"id":3,"name":"Engineer"}
5️⃣ Sample Output REST Sink received: {"id":1,"name":"Anil"} Queue Sink published message: {"id":1,"name":"Anil"} gRPC Sink sent payload size: 22 bytes, content: {"id":1,"name":"Anil"}
REST Sink received: {"id":2,"name":"Kumar"} Queue Sink published message: {"id":2,"name":"Kumar"} gRPC Sink sent payload size: 23 bytes, content: {"id":2,"name":"Kumar"}
REST Sink received: {"id":3,"name":"Engineer"} Queue Sink published message: {"id":3,"name":"Engineer"} gRPC Sink sent payload size: 26 bytes, content: {"id":3,"name":"Engineer"}
WideColumn Sink upserted record: {"id":1,"name":"Anil"} WideColumn Sink upserted record: {"id":2,"name":"Kumar"} WideColumn Sink upserted record: {"id":3,"name":"Engineer"}
Processing completed.
Concurrency: ExecutorService with FixedThreadPool allows parallel processing for each sink.
Backpressure: Sinks use semaphores or blocking queues to avoid memory overflow.
Transformations: Implemented via Strategy Pattern for extensibility.
Throttling: Each sink has configurable rate limits to avoid overwhelming external systems.
Retries: Each record is retried up to 3 times on failure.
Input file must be in JSON Lines format (.jsonl), one record per line.
Transformations are byte-level; REST, gRPC, and DB calls are mocked.
Wide-column DB sink simulates asynchronous upserts using byte arrays.
Implement actual REST/gRPC/DB clients instead of mocks.
Support more file formats like CSV or Fixed-width.
Add dynamic configuration from application.yaml.
Add metrics collection and monitoring dashboards.