YAK is our attempt at a Kafka clone: a pub-sub messaging system with log replication, a leader–follower architecture, and offset-based message delivery.
- Leader–Follower Replication: Leader accepts messages, replicates to followers with ACK tracking
- High Watermark Durability: Only committed records (≤ HW) are visible to consumers
- Topic-less Design: Single partition log for simplicity
- Redis Metadata: Persistent metadata for ISR, HW, and follower LEOs
- CLI Clients: Producer and consumer tools
- Docker Compose: Multi-node cluster simulation
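The replication and high-watermark ideas above can be sketched as a minimal in-memory simulation. All class and method names here are illustrative, not YAK's actual API:

```python
# Illustrative model of leader/follower replication with ACK tracking
# and a high watermark; not the shipped YAK implementation.

class Follower:
    def __init__(self, node_id):
        self.node_id = node_id
        self.log = []

    def replicate(self, record):
        self.log.append(record)
        return len(self.log)  # the ACK carries the follower's new LEO

class Leader:
    def __init__(self, followers):
        self.log = []
        self.followers = followers
        self.high_watermark = 0

    def produce(self, record):
        self.log.append(record)
        leos = [f.replicate(record) for f in self.followers]
        # A record is committed once every in-sync replica has it:
        # HW = min(LEO) over the leader and all followers.
        self.high_watermark = min([len(self.log)] + leos)
        return len(self.log) - 1  # offset of the appended record

    def fetch(self, offset):
        # Consumers only see committed records (offsets below the HW).
        return self.log[offset:self.high_watermark]

leader = Leader([Follower("f1"), Follower("f2")])
leader.produce("hello")
```

In the real system the `replicate` call is a network request and ACKs arrive asynchronously, but the commit rule is the same.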
The codebase is structured with clear TODOs to guide your implementation:
- Start with `yak/broker/common.py`: implement config, logging, metadata, and replication utilities
- Then `yak/broker/logstore.py`: build the append-only log with persistence
- Next, `yak/broker/leader.py`: implement the leader broker (produce, fetch, replication)
- Follow with `yak/broker/follower.py`: implement follower replication handling
- Finally, `yak/client/`: build the producer and consumer CLI tools
Each file has detailed TODO comments explaining what to implement.
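As a rough sketch of the kind of append-only log `logstore.py` asks for (class name, file format, and method signatures here are assumptions, not the shipped code):

```python
import json
import os

# Hypothetical append-only log with line-oriented JSON persistence;
# the real yak/broker/logstore.py will differ in detail.
class LogStore:
    def __init__(self, path):
        self.path = path
        self.records = []
        if os.path.exists(path):  # recover existing records on startup
            with open(path) as f:
                self.records = [json.loads(line) for line in f]

    @property
    def leo(self):
        # Log End Offset: the offset the next record will get.
        return len(self.records)

    def append(self, record):
        # Persist first, then update the in-memory view.
        with open(self.path, "a") as f:
            f.write(json.dumps(record) + "\n")
        self.records.append(record)
        return self.leo - 1

    def read(self, start, end):
        # Read records in [start, end); callers cap `end` at the HW.
        return self.records[start:end]
```

Reopening the store with the same path replays the file, which is what gives a restarted broker its log back.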
To see the working implementation:

```bash
git checkout kaustubh-dev
```

To switch back to the boilerplate:

```bash
git checkout main
```

Install the package:

```bash
uv pip install -e .
# or with dev dependencies
uv pip install -e ".[dev]"
```

Bring up the cluster:

```bash
cd infra
docker compose up --build
```

This starts:
- Redis on `localhost:6379`
- Leader on `localhost:8080`
- Follower-1 on `localhost:8081`
- Follower-2 on `localhost:8082`
See DISTRIBUTED_DEPLOYMENT.md for complete instructions on running YAK across 4 different machines using ZeroTier for networking.
Quick summary:
- System 1: Leader broker
- System 2: Follower broker
- System 3: Redis metadata store
- System 4: Producer/Consumer clients
Then run the failover demo:

```bash
bash demo_failover.sh
```

Produce messages:

```bash
python -m yak.client.producer --message "Hello YAK"
python -m yak.client.producer --message "Second message"
```

Consume from the beginning:

```bash
python -m yak.client.consumer --from-beginning
```

Check broker health:

```bash
curl http://localhost:8080/health
```

Project layout:

```
yak/
├── broker/              # Leader & follower implementations
│   ├── common.py        # Config, logging, metadata, replication
│   ├── logstore.py      # Append-only log with HW/LEO
│   ├── leader.py        # Leader node
│   ├── follower.py      # Follower node
│   └── server.py        # CLI entrypoint (leader or follower)
├── client/              # Producer & consumer CLI tools
│   ├── producer.py
│   ├── consumer.py
│   └── utils.py         # Shared client utilities (placeholder)
└── tests/               # Unit and integration tests
infra/
├── docker-compose.yml
├── Dockerfile
└── redis.conf
docs/
└── architecture.md
```
- Leader: Accepts produce requests, appends to log, replicates to followers, manages HW
- Followers: Receive replication requests, append locally, ACK to leader
- High Watermark: min(LEO of all in-sync replicas) → ensures consistency
- Consumers: Only read committed records (≤ HW)
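The high-watermark rule can be written down directly (LEO values here are made up for illustration):

```python
# HW = min(LEO over the leader and all in-sync replicas): every replica
# in the ISR holds all records below the HW, so those offsets are safe
# to expose to consumers even if the leader fails.
leader_leo = 7
follower_leos = {"follower-1": 7, "follower-2": 5}  # made-up values
high_watermark = min([leader_leo] + list(follower_leos.values()))
```

With these numbers the HW is 5: offsets 0 through 4 are committed, while offsets 5 and 6 exist on the leader but are not yet visible to consumers.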
Run the test suite:

```bash
python -m pytest yak/tests/
```

Or run individual smoke tests:

```bash
python yak/tests/test_replication.py
```

Configuration is controlled via environment variables (see `yak/common/config.py`):

- `YAK_ROLE`: `leader` or `follower`
- `YAK_NODE_ID`: unique node identifier
- `YAK_HOST`, `YAK_PORT`: bind address
- `YAK_FOLLOWERS`: comma-separated follower URLs (leader only)
- `YAK_REDIS_URL`: Redis connection string
- `YAK_DATA_DIR`: log persistence directory
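Reading that configuration might look something like this (the variable names match the list above, but the defaults are assumptions, not YAK's actual values):

```python
import os

# Sketch of env-driven config; every default here is a guess.
role = os.environ.get("YAK_ROLE", "leader")        # "leader" or "follower"
node_id = os.environ.get("YAK_NODE_ID", "node-1")
host = os.environ.get("YAK_HOST", "0.0.0.0")
port = int(os.environ.get("YAK_PORT", "8080"))
# Empty YAK_FOLLOWERS must yield an empty list, not [""].
followers = [u for u in os.environ.get("YAK_FOLLOWERS", "").split(",") if u]
redis_url = os.environ.get("YAK_REDIS_URL", "redis://localhost:6379")
data_dir = os.environ.get("YAK_DATA_DIR", "./data")
```

Keeping all knobs in the environment is what lets the same image run as leader or follower in the Docker Compose setup.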
See TODO.md for the full development plan and milestones.
MIT