Anil-kumar9861/Fanout-Engine

Fan-Out Engine

📖 Overview

The Fan-Out Engine is a Java-based distributed data processing application. It reads records from a source file and sends them to multiple sinks (REST API, gRPC, Message Queue, Wide-Column Database) with specific transformations applied for each sink.

It is designed to process large files efficiently: input is streamed rather than loaded whole, records are dispatched concurrently, and each sink is protected by throttling, backpressure, and retry-based error handling.

✨ Features

Concurrent Processing: Processes records in parallel using Java ExecutorService.

Memory Efficient: Streams input files without loading the entire file into memory.

Multiple Sinks: REST API, gRPC, Message Queue, and Wide-Column Database (Cassandra / Aerospike / DynamoDB / ScyllaDB).

Transformations (Strategy Pattern): JSON → JSON, JSON → Protobuf, JSON → XML, and JSON → Wide-Column Map.

Throttling & Resilience: Configurable rate limits per sink, retry logic (up to 3 retries per record), and backpressure using Semaphore or BlockingQueue.

Observability: Prints processing statistics for monitoring.
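As a rough illustration of the Strategy Pattern mentioned above, the sketch below defines one transformer interface and two interchangeable implementations. All names here (`Transformer`, `JsonPassThrough`, `JsonToXml`, the `"rest"` and `"queue"` keys) are hypothetical and not taken from the project's source, and the sink-to-transformation mapping is an assumption. The XML variant simply wraps the raw JSON text in one element; the Protobuf and Wide-Column Map variants are left out because they need external libraries or a JSON parser.

```java
import java.nio.charset.StandardCharsets;
import java.util.LinkedHashMap;
import java.util.Map;

/** Entry point for the sketch; picks a strategy per (hypothetical) sink name. */
final class TransformerSketch {

    /** Assumed mapping of sink name to strategy; the real project may wire this differently. */
    static Map<String, Transformer> defaults() {
        Map<String, Transformer> strategies = new LinkedHashMap<>();
        strategies.put("rest", new JsonPassThrough());
        strategies.put("queue", new JsonToXml());
        return strategies;
    }

    public static void main(String[] args) {
        String line = "{\"id\":1,\"name\":\"Anil\"}";
        for (Map.Entry<String, Transformer> entry : defaults().entrySet()) {
            byte[] payload = entry.getValue().transform(line);
            System.out.println(entry.getKey() + " payload size: " + payload.length + " bytes");
        }
    }
}

/** Strategy interface (hypothetical): turns one JSON line into a sink-specific payload. */
interface Transformer {
    byte[] transform(String jsonLine);
}

/** JSON -> JSON: passes the record through unchanged as UTF-8 bytes. */
final class JsonPassThrough implements Transformer {
    @Override
    public byte[] transform(String jsonLine) {
        return jsonLine.getBytes(StandardCharsets.UTF_8);
    }
}

/** JSON -> XML: wraps the escaped JSON text in a single element (no per-field mapping). */
final class JsonToXml implements Transformer {
    @Override
    public byte[] transform(String jsonLine) {
        // Escape '&' first so the entities produced below are not escaped again.
        String escaped = jsonLine.replace("&", "&amp;").replace("<", "&lt;").replace(">", "&gt;");
        return ("<record>" + escaped + "</record>").getBytes(StandardCharsets.UTF_8);
    }
}
```

Adding a new output format then means adding one more `Transformer` implementation and registering it, without touching the sinks.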

⚙️ Prerequisites

Java 8 or higher

Maven 3.6 or higher

Git

🚀 Setup & Run

1️⃣ Clone the repository

    git clone https://github.com/Anil-kumar9861/Fanout-Engine.git
    cd Fanout-Engine

2️⃣ Build the project

    mvn clean install

3️⃣ Run the application

    java -cp target/classes com.fanout.Main

4️⃣ Sample Input (input.jsonl)

    {"id":1,"name":"Anil"}
    {"id":2,"name":"Kumar"}
    {"id":3,"name":"Engineer"}

5️⃣ Sample Output

    REST Sink received: {"id":1,"name":"Anil"}
    Queue Sink published message: {"id":1,"name":"Anil"}
    gRPC Sink sent payload size: 22 bytes, content: {"id":1,"name":"Anil"}

    REST Sink received: {"id":2,"name":"Kumar"}
    Queue Sink published message: {"id":2,"name":"Kumar"}
    gRPC Sink sent payload size: 23 bytes, content: {"id":2,"name":"Kumar"}

    REST Sink received: {"id":3,"name":"Engineer"}
    Queue Sink published message: {"id":3,"name":"Engineer"}
    gRPC Sink sent payload size: 26 bytes, content: {"id":3,"name":"Engineer"}

    WideColumn Sink upserted record: {"id":1,"name":"Anil"}
    WideColumn Sink upserted record: {"id":2,"name":"Kumar"}
    WideColumn Sink upserted record: {"id":3,"name":"Engineer"}

    Processing completed.
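To show how an input file like the sample above could be streamed one record at a time, here is a minimal sketch using `BufferedReader`. The class and method names (`JsonlStreamSketch`, `forEachRecord`) are hypothetical, and skipping blank lines is an assumption; the project's actual reader may differ. The payload size printed is the UTF-8 byte length of the line, which is consistent with the sizes shown in the sample output.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.function.Consumer;

final class JsonlStreamSketch {

    /**
     * Reads the file line by line, so only the current line is held in memory,
     * and passes each non-blank line to the handler. Returns the record count.
     */
    static long forEachRecord(Path file, Consumer<String> handler) throws IOException {
        long count = 0;
        try (BufferedReader reader = Files.newBufferedReader(file, StandardCharsets.UTF_8)) {
            String line;
            while ((line = reader.readLine()) != null) {
                if (line.trim().isEmpty()) {
                    continue; // assumption: blank lines are not records
                }
                handler.accept(line);
                count++;
            }
        }
        return count;
    }

    public static void main(String[] args) throws IOException {
        // Defaults to input.jsonl in the working directory; pass a path to override.
        Path input = Paths.get(args.length > 0 ? args[0] : "input.jsonl");
        long records = forEachRecord(input, line -> {
            int size = line.getBytes(StandardCharsets.UTF_8).length;
            System.out.println("payload size: " + size + " bytes, content: " + line);
        });
        System.out.println("Records read: " + records);
    }
}
```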

📝 Design Decisions

Concurrency: An ExecutorService backed by a fixed thread pool processes records for each sink in parallel.

Backpressure: Sinks use semaphores or blocking queues to cap the number of in-flight records, so a slow sink cannot cause unbounded memory growth.

Transformations: Implemented via Strategy Pattern for extensibility.

Throttling: Each sink has configurable rate limits to avoid overwhelming external systems.

Retries: Each record is retried up to 3 times on failure.
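The concurrency, backpressure, and retry decisions above can be sketched together as follows. This is not the project's code: `FanOutSketch`, the nested `Sink` interface, and the `MAX_IN_FLIGHT` bound are hypothetical, and "up to 3 retries" is read here as one initial attempt plus at most three more. Rate limiting is left out to keep the example short.

```java
import java.util.Arrays;
import java.util.List;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.Semaphore;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.atomic.AtomicInteger;

final class FanOutSketch {
    static final int MAX_RETRIES = 3;   // from the README: up to 3 retries per record
    static final int MAX_IN_FLIGHT = 4; // hypothetical cap on queued plus running tasks

    /** Hypothetical sink contract; the real project's interface may differ. */
    interface Sink {
        void send(String record) throws Exception;
    }

    /** Tries once, then retries up to MAX_RETRIES times. Returns true on success. */
    static boolean sendWithRetry(Sink sink, String record) {
        for (int attempt = 0; attempt <= MAX_RETRIES; attempt++) {
            try {
                sink.send(record);
                return true;
            } catch (Exception e) {
                // Swallow and move on to the next attempt; a real sink would log this.
            }
        }
        return false;
    }

    /** Sends all records through a fixed thread pool. Returns the number of failed records. */
    static int process(List<String> records, Sink sink, int threads) throws InterruptedException {
        ExecutorService pool = Executors.newFixedThreadPool(threads);
        Semaphore inFlight = new Semaphore(MAX_IN_FLIGHT);
        AtomicInteger failures = new AtomicInteger();
        try {
            for (String record : records) {
                inFlight.acquire(); // backpressure: the producer blocks while the cap is reached
                pool.execute(() -> {
                    try {
                        if (!sendWithRetry(sink, record)) {
                            failures.incrementAndGet();
                        }
                    } finally {
                        inFlight.release();
                    }
                });
            }
        } finally {
            pool.shutdown();
            pool.awaitTermination(1, TimeUnit.MINUTES);
        }
        return failures.get();
    }

    public static void main(String[] args) throws InterruptedException {
        // Mock sink that fails the first attempt for every record, then succeeds.
        Set<String> seen = ConcurrentHashMap.newKeySet();
        Sink flaky = record -> {
            if (seen.add(record)) {
                throw new IllegalStateException("simulated failure");
            }
            System.out.println("REST Sink received: " + record);
        };
        List<String> records = Arrays.asList("{\"id\":1}", "{\"id\":2}", "{\"id\":3}");
        int failed = process(records, flaky, 2);
        System.out.println("Processing completed. Failed records: " + failed);
    }
}
```

Acquiring the semaphore in the producing thread, before the task is submitted, is what keeps the executor's internal queue from growing without bound when a sink is slow.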

⚖️ Assumptions

Input file must be in JSON Lines format (.jsonl), one record per line.

Transformations are byte-level; REST, gRPC, and DB calls are mocked.

Wide-column DB sink simulates asynchronous upserts using byte arrays.
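One way a mocked wide-column sink could simulate asynchronous upserts over byte arrays is sketched below with `CompletableFuture`. The class `MockWideColumnSink`, its methods, and the use of an in-memory map as the "table" are all assumptions made for illustration; the project's mock may be structured differently.

```java
import java.nio.charset.StandardCharsets;
import java.util.Map;
import java.util.concurrent.CompletableFuture;
import java.util.concurrent.ConcurrentHashMap;

final class MockWideColumnSink {
    // In-memory stand-in for a table: row key -> serialized record bytes.
    private final Map<String, byte[]> table = new ConcurrentHashMap<>();

    /** Simulated async upsert: inserts the row, or overwrites it if the key already exists. */
    CompletableFuture<Void> upsertAsync(String key, byte[] payload) {
        return CompletableFuture.runAsync(() -> {
            table.put(key, payload);
            System.out.println("WideColumn Sink upserted record: "
                    + new String(payload, StandardCharsets.UTF_8));
        });
    }

    int rowCount() {
        return table.size();
    }

    byte[] get(String key) {
        return table.get(key);
    }

    public static void main(String[] args) {
        MockWideColumnSink sink = new MockWideColumnSink();
        CompletableFuture<Void> first =
                sink.upsertAsync("1", "{\"id\":1,\"name\":\"Anil\"}".getBytes(StandardCharsets.UTF_8));
        CompletableFuture<Void> second =
                sink.upsertAsync("2", "{\"id\":2,\"name\":\"Kumar\"}".getBytes(StandardCharsets.UTF_8));
        // Wait for both simulated writes before reporting.
        CompletableFuture.allOf(first, second).join();
        System.out.println("Rows stored: " + sink.rowCount());
    }
}
```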

🔮 Future Improvements

Implement actual REST/gRPC/DB clients instead of mocks.

Support more file formats, such as CSV or fixed-width.

Add dynamic configuration from application.yaml.

Add metrics collection and monitoring dashboards.
