YAK - Yet Another Kafka

Introduction and Motivation

YAK is our attempt at a Kafka clone: a pub-sub messaging system with log replication, a leader–follower architecture, and offset-based message delivery.

Features

  • Leader–Follower Replication: Leader accepts messages, replicates to followers with ACK tracking
  • High Watermark Durability: Only committed records (≤ HW) are visible to consumers
  • Topic-less Design: Single partition log for simplicity
  • Redis Metadata: Persistent metadata for ISR, HW, and follower LEOs
  • CLI Clients: Producer and consumer tools
  • Docker Compose: Multi-node cluster simulation

🎯 Implementation Guide

The codebase is structured with clear TODOs to guide your implementation:

  1. Start with yak/broker/common.py - Implement config, logging, metadata, and replication utilities
  2. Then yak/broker/logstore.py - Build the append-only log with persistence
  3. Next yak/broker/leader.py - Implement the leader broker (produce, fetch, replication)
  4. Follow with yak/broker/follower.py - Implement follower replication handling
  5. Finally yak/client/ - Build producer and consumer CLI tools

Each file has detailed TODO comments explaining what to implement.
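To give a feel for step 2, here is a minimal sketch of an append-only log with LEO/HW bookkeeping. The class and method names (`AppendOnlyLog`, `read_committed`) are illustrative assumptions, not YAK's actual API, and persistence to `YAK_DATA_DIR` is omitted:

```python
class AppendOnlyLog:
    """Toy in-memory append-only log; the real logstore persists to disk."""

    def __init__(self):
        self.records = []        # list of raw record payloads
        self.high_watermark = 0  # offsets below HW are committed

    @property
    def leo(self):
        """Log end offset: the offset the next record will receive."""
        return len(self.records)

    def append(self, record: bytes) -> int:
        self.records.append(record)
        return self.leo - 1  # offset assigned to this record

    def read_committed(self, offset: int):
        """Consumers only see records below the high watermark."""
        return self.records[offset:self.high_watermark]


log = AppendOnlyLog()
log.append(b"Hello YAK")
log.append(b"Second message")
log.high_watermark = 1            # only the first record is replicated so far
print(log.read_committed(0))      # [b'Hello YAK']
```

Note that `leo` advances on every append, while `high_watermark` only advances once followers have acknowledged replication.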

To see the working implementation:

git checkout kaustubh-dev

To switch back to boilerplate:

git checkout main

Quick Start

1. Install Dependencies (local dev)

uv pip install -e .
# or with dev dependencies
uv pip install -e ".[dev]"

2a. Run Locally with Docker Compose (Single Machine)

cd infra
docker compose up --build

This starts:

  • Redis on localhost:6379
  • Leader on localhost:8080
  • Follower-1 on localhost:8081
  • Follower-2 on localhost:8082

2b. Run Distributed Deployment (4 Separate Systems)

See DISTRIBUTED_DEPLOYMENT.md for complete instructions on running YAK across 4 different machines using ZeroTier for networking.

Quick summary:

  • System 1: Leader broker
  • System 2: Follower broker
  • System 3: Redis metadata store
  • System 4: Producer/Consumer clients

Then run the failover demo:

bash demo_failover.sh

3. Produce Messages

python -m yak.client.producer --message "Hello YAK"
python -m yak.client.producer --message "Second message"

4. Consume Messages

python -m yak.client.consumer --from-beginning

5. Check Health

curl http://localhost:8080/health

Project Structure

yak/
├── broker/          # Leader & follower implementations
│   ├── common.py    # Config, logging, metadata, replication
│   ├── logstore.py  # Append-only log with HW/LEO
│   ├── leader.py    # Leader node
│   ├── follower.py  # Follower node
│   └── server.py    # CLI entrypoint (leader or follower)
├── client/          # Producer & consumer CLI tools
│   ├── producer.py
│   ├── consumer.py
│   └── utils.py     # Shared client utilities (placeholder)
└── tests/           # Unit and integration tests
infra/
├── docker-compose.yml
├── Dockerfile
└── redis.conf
docs/
└── architecture.md

Architecture

  • Leader: Accepts produce requests, appends to log, replicates to followers, manages HW
  • Followers: Receive replication requests, append locally, ACK to leader
  • High Watermark: min(LEO of all in-sync replicas) → ensures consistency
  • Consumers: Only read committed records (≤ HW)
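The high-watermark rule above can be sketched as a few lines of Python. The `Replica` dataclass and `advance_hw` helper are illustrative names, not YAK's actual internals:

```python
from dataclasses import dataclass


@dataclass
class Replica:
    node_id: str
    leo: int        # log end offset reported by this replica
    in_sync: bool   # whether the replica is currently in the ISR


def advance_hw(replicas):
    """HW = min(LEO of all in-sync replicas); records below HW are committed."""
    isr_leos = [r.leo for r in replicas if r.in_sync]
    return min(isr_leos) if isr_leos else 0


replicas = [
    Replica("leader", leo=10, in_sync=True),
    Replica("follower-1", leo=9, in_sync=True),
    Replica("follower-2", leo=4, in_sync=False),  # lagging, out of ISR
]
print(advance_hw(replicas))  # 9 -> consumers may read offsets 0..8
```

Because the lagging follower is excluded from the ISR, it cannot hold the high watermark back; once it catches up and rejoins, its LEO counts again.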

Testing

python -m pytest yak/tests/

Or run individual smoke tests:

python yak/tests/test_replication.py

Configuration

Controlled via environment variables (see yak/broker/common.py):

  • YAK_ROLE: leader or follower
  • YAK_NODE_ID: unique identifier
  • YAK_HOST, YAK_PORT: bind address
  • YAK_FOLLOWERS: comma-separated follower URLs (leader only)
  • YAK_REDIS_URL: Redis connection string
  • YAK_DATA_DIR: log persistence directory
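For example, a leader node might be configured like this. The values are illustrative only (ports match the Docker Compose setup above; the data directory is an assumption):

```shell
export YAK_ROLE=leader
export YAK_NODE_ID=leader-1
export YAK_HOST=0.0.0.0
export YAK_PORT=8080
export YAK_FOLLOWERS=http://localhost:8081,http://localhost:8082
export YAK_REDIS_URL=redis://localhost:6379/0
export YAK_DATA_DIR=./data/leader-1

python -m yak.broker.server
```

A follower would set YAK_ROLE=follower and omit YAK_FOLLOWERS.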

Next Steps

See TODO.md for the full development plan and milestones.

License

MIT
