A from-scratch implementation of Redis featuring dual server architectures for comprehensive distributed systems learning. This project demonstrates both event-driven and multi-threaded approaches to high-performance server design.
This project recreates Redis's core functionality with two distinct server architectures to provide deep insights into:
- Event-driven programming with I/O multiplexing (select/poll)
- Multi-threaded server design with synchronization challenges
- Distributed system concepts and concurrent programming patterns
- Network programming and protocol implementation
- High-performance I/O and memory management
- Side-by-side comparison of event-driven vs multi-threaded architectures
- Command-line mode selection for easy switching between implementations
- Production-quality buffer management and protocol handling
- Comprehensive documentation of design decisions and trade-offs
redis-clone-cpp/
├── CMakeLists.txt                  # Top-level CMake configuration
├── cmake/                          # CMake modules and utilities
│   ├── Dependencies.cmake          # External dependency management
│   └── TestHelper.cmake            # Test configuration helpers
├── src/                            # Source code
│   ├── main.cpp                    # Entry point with command-line parsing
│   ├── CMakeLists.txt              # Source build configuration
│   ├── storage/                    # Storage engine implementation
│   │   ├── CMakeLists.txt
│   │   ├── include/                # Public headers
│   │   │   └── storage/
│   │   │       └── database.h
│   │   └── src/                    # Implementation files
│   │       └── database.cpp
│   └── network/                    # Network server implementation
│       ├── CMakeLists.txt
│       ├── include/                # Public headers
│       │   └── network/
│       │       ├── server.h        # Both server architectures
│       │       └── redis_utils.h   # Shared protocol utilities
│       └── src/                    # Implementation files
│           ├── server.cpp          # Server implementations
│           └── redis_utils.cpp     # Shared utilities
├── test/                           # Test files
│   ├── CMakeLists.txt              # Test configuration
│   └── unit/                       # Unit tests
│       ├── storage/                # Storage engine tests
│       │   ├── CMakeLists.txt
│       │   └── database_test.cpp
│       └── network/                # Network component tests
│           ├── CMakeLists.txt
│           └── server_test.cpp
└── scripts/                        # Build and test scripts
    ├── build.sh                    # Build script
    └── test.sh                     # Test runner script
Production-style architecture using I/O multiplexing for handling multiple clients:
- Single-threaded event loop built on the select() system call
- Non-blocking I/O operations (MSG_DONTWAIT)
- Client state management with per-client read/write buffers
- Memory efficient - no thread overhead per connection
- Scalable to thousands of concurrent connections
Best for: Learning modern server architectures, understanding event loops, and high-concurrency scenarios.
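As a minimal sketch of this pattern (illustrative only, not the project's server.cpp), a select()-based loop watches the listening socket and every client socket from a single thread; error handling is trimmed to keep it short:

// Minimal sketch of a select()-based event loop (illustrative, not the
// project's server.cpp). Error handling is reduced for brevity.
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <algorithm>
#include <vector>

int main() {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(6379);
    bind(listen_fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    listen(listen_fd, SOMAXCONN);

    std::vector<int> clients;
    for (;;) {
        fd_set read_fds;
        FD_ZERO(&read_fds);
        FD_SET(listen_fd, &read_fds);
        int max_fd = listen_fd;
        for (int fd : clients) { FD_SET(fd, &read_fds); max_fd = std::max(max_fd, fd); }

        // Block until the listening socket or any client socket becomes readable
        select(max_fd + 1, &read_fds, nullptr, nullptr, nullptr);

        if (FD_ISSET(listen_fd, &read_fds)) {
            clients.push_back(accept(listen_fd, nullptr, nullptr));  // new client
        }
        for (auto it = clients.begin(); it != clients.end();) {
            if (FD_ISSET(*it, &read_fds)) {
                char buf[1024];
                ssize_t n = recv(*it, buf, sizeof(buf), MSG_DONTWAIT);
                if (n <= 0) { close(*it); it = clients.erase(it); continue; }
                // ...append to this client's read buffer and parse complete commands...
            }
            ++it;
        }
    }
}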
Traditional thread-per-client architecture for understanding threading concepts:
- One thread per client connection model
- Thread-safe database operations with mutex synchronization
- Signal handling in worker threads
- Resource isolation per client connection
- Simpler logic but higher memory overhead
Best for: Understanding threading, synchronization challenges, and resource management.
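A minimal sketch of this model (again illustrative, not the project's server.cpp): the main thread blocks in accept() and spawns one detached worker thread per connection, with a mutex guarding the shared store:

// Minimal sketch of the thread-per-client model (illustrative only).
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>

std::unordered_map<std::string, std::string> g_store;
std::mutex g_store_mutex;  // guards g_store across worker threads

void handle_client(int client_fd) {
    char buf[1024];
    ssize_t n;
    while ((n = recv(client_fd, buf, sizeof(buf), 0)) > 0) {
        // ...accumulate into a per-client buffer, extract complete commands, then:
        std::lock_guard<std::mutex> lock(g_store_mutex);
        // ...apply GET/SET/DEL/EXISTS to g_store and send() the response...
    }
    close(client_fd);  // client disconnected or recv() failed
}

int main() {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(6379);
    bind(listen_fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    listen(listen_fd, SOMAXCONN);

    for (;;) {
        int client_fd = accept(listen_fd, nullptr, nullptr);
        std::thread(handle_client, client_fd).detach();  // one thread per client
    }
}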
- Dual Server Architectures:
  - Event-driven (default): Single-threaded with I/O multiplexing using select()
  - Multi-threaded (educational): Thread-per-client with mutex synchronization
- Command-Line Interface:
  - Mode selection: --mode=eventloop|threaded
  - Port configuration: --port=<number> (default: 6379)
  - Help system: -h, --help for usage information
  - Backward compatibility: Supports legacy positional arguments
- Storage Engine: Thread-safe key-value storage implementation (a sketch appears after this list)
  - In-memory key-value store using std::unordered_map
  - String values with GET/SET/DEL/EXISTS operations
  - Thread-safe database operations (multi-threaded mode)
  - Template-based storage abstraction (event-loop mode)
- Network Layer: Production-quality TCP server with robust protocol handling
  - Redis Protocol (RESP): Complete command parsing and response formatting
  - Buffer Management: Handles partial commands across multiple recv() calls
  - Command Processing: Flexible termination handling (\r\n and \n)
  - Connection Management: Graceful client disconnection and cleanup
  - Error Handling: Comprehensive error responses and network failure recovery
- Shared Utilities: redis_utils module for protocol consistency
  - Command parsing: Structured command extraction (CommandParts)
  - Template specialization: Support for different storage backends
  - Protocol compliance: RESP format responses for all commands
  - Code reuse: Shared logic between both server architectures
- Data Structures: Redis-like data structure support
- Replication: Master-slave replication system
- Persistence: Snapshot and AOF persistence
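The storage engine bullets above map naturally onto a small mutex-guarded wrapper around std::unordered_map. Below is a hypothetical sketch of such a class; the actual storage/database.h interface may use different names and signatures.

// Hypothetical sketch of a thread-safe key-value store; the real
// storage/database.h may differ in naming and interface.
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>

class Database {
public:
    void set(const std::string& key, const std::string& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        data_[key] = value;
    }

    std::optional<std::string> get(const std::string& key) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = data_.find(key);
        if (it == data_.end()) return std::nullopt;
        return it->second;
    }

    bool del(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_.erase(key) > 0;
    }

    bool exists(const std::string& key) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_.count(key) > 0;
    }

private:
    mutable std::mutex mutex_;  // mutable so const readers can still lock
    std::unordered_map<std::string, std::string> data_;
};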
The project uses CMake as its build system with a modern setup:
- Top-level CMake: Project-wide settings and component inclusion
- Component-level CMake: Individual component configurations
- Modern CMake Practices:
  - Target-based configuration
  - Proper dependency management
  - Clear public/private interfaces
- C++17 standard
- GoogleTest for unit testing
- Modular component architecture
- Separate build outputs for binaries and libraries
- CMake 3.14 or higher
- C++17 compatible compiler
- Git
# Clone the repository
git clone https://github.com/yourusername/redis-clone-cpp.git
cd redis-clone-cpp
# Build the project
./scripts/build.sh
# Run all tests
./scripts/test.sh
# Run specific test components
./scripts/test.sh network_test # Run network tests
./scripts/test.sh database_test # Run storage tests
# Run tests without rebuilding
./scripts/test.sh --no-build network_test
# Build the project
./scripts/build.sh
# Start with default settings (Event-driven server on port 6379)
./build/bin/redis-clone-cpp
# Expected output:
# Redis Clone Server v0.1.0
# Mode: Event Loop
# Port: 6379
# PID: 12345
# --------------------------------
# Event loop server ready to accept connections
# Event-driven server (default) - recommended for learning modern architectures
./build/bin/redis-clone-cpp --mode=eventloop
# Multi-threaded server - educational comparison
./build/bin/redis-clone-cpp --mode=threaded
# Custom port
./build/bin/redis-clone-cpp --mode=eventloop --port=8080
# Using environment variable
REDIS_CLONE_PORT=9000 ./build/bin/redis-clone-cpp --mode=threaded
# Display all options
./build/bin/redis-clone-cpp --help
# Output:
# Usage: ./build/bin/redis-clone-cpp [OPTIONS]
# Options:
# --mode=<type> Server mode: 'eventloop' (default) or 'threaded'
# --port=<number> Port number (default: 6379)
# -h, --help Show this help message
#
# Examples:
# ./build/bin/redis-clone-cpp # Event loop server on port 6379
# ./build/bin/redis-clone-cpp --mode=threaded # Multi-threaded server
# ./build/bin/redis-clone-cpp --port=8080 # Custom port
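Internally, flags like these can be parsed with a short hand-rolled loop. The following is a hypothetical sketch of such parsing, not the project's actual main.cpp (which also accepts legacy positional arguments):

// Hypothetical sketch of --mode/--port parsing; the project's main.cpp may differ.
#include <iostream>
#include <string>

int main(int argc, char* argv[]) {
    std::string mode = "eventloop";  // default mode
    int port = 6379;                 // default Redis port

    for (int i = 1; i < argc; ++i) {
        std::string arg = argv[i];
        if (arg.rfind("--mode=", 0) == 0) {
            mode = arg.substr(7);
        } else if (arg.rfind("--port=", 0) == 0) {
            port = std::stoi(arg.substr(7));
        } else if (arg == "-h" || arg == "--help") {
            std::cout << "Usage: " << argv[0] << " [OPTIONS]\n";
            return 0;
        }
    }

    std::cout << "Mode: " << mode << ", Port: " << port << '\n';
    // ...start the selected server implementation here...
}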
Once the server is running, you can test it using netcat:
# Connect to server
nc localhost 6379
# Send commands manually:
SET name john
GET name
SET age 25
GET age
DEL name
EXISTS name
EXISTS age
QUIT # Gracefully close the connection
You can test both server architectures with multiple simultaneous connections:
# Terminal 1: Start event-driven server
./build/bin/redis-clone-cpp --mode=eventloop
# Terminal 2-4: Connect multiple clients simultaneously
nc localhost 6379 # Each terminal can send commands independently
# Example commands in each terminal:
# Terminal 2: SET user1 alice
# Terminal 3: SET user2 bob
# Terminal 4: GET user1
# Then test multi-threaded mode:
./build/bin/redis-clone-cpp --mode=threaded
# Connect same number of clients and observe behavior
Both server architectures implement the same Redis protocol but use fundamentally different approaches to concurrency:
| Aspect | Event-Driven (eventloop) | Multi-Threaded (threaded) |
|---|---|---|
| Concurrency Model | Single thread + I/O multiplexing | One thread per client |
| Resource Usage | Low memory, single thread | Higher memory, N threads |
| Scalability | Handles thousands of clients | Limited by thread overhead |
| Complexity | Complex state management | Simpler per-client logic |
| Debugging | Single-threaded debugging | Multi-threaded race conditions |
| Best Use Case | High-concurrency production | Educational/simple scenarios |
Event Loop (Single Thread)
┌─────────────────────────────────────────────────────────────┐
│ select() System Call │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Client │ │ Client │ │ Client │ │
│ │ Socket │ │ Socket │ │ Socket │ ... │
│ │ #3 │ │ #4 │ │ #5 │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │ │ │ │
│ ┌──────▼──────┐ ┌─────▼─────┐ ┌─────▼─────┐ │
│ │ Read Buffer │ │Read Buffer│ │Read Buffer│ │
│ │Write Buffer │ │Write Buffer│ │Write Buffer│ │
│ └─────────────┘ └───────────┘ └───────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ Shared Data Store ││
│ │ std::unordered_map<string, string> ││
│ └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘
Key Benefits:
- Memory Efficient: No thread stack overhead per client
- Cache Friendly: Single thread uses CPU cache effectively
- No Locks Needed: No synchronization primitives required
- Predictable Performance: No context switching overhead
Main Thread Worker Threads
│ │
┌────▼────┐ ┌──────▼──────┐
│ Accept │ │ Client │
│ Loop │─────────────────▶ │ Handler │
│ │ │ Thread │
└─────────┘ └─────────────┘
│
┌──────▼──────┐
│ Command │
│ Buffer │◄─── recv() calls
│ Manager │
└─────────────┘
│
┌──────▼──────┐
│ Database │◄─── Mutex protected
│ Operations │ std::lock_guard
└─────────────┘
Key Benefits:
- Simple Logic: Each client handled independently
- Isolation: Client failures don't affect others
- Familiar Model: Traditional threading approach
- Educational Value: Demonstrates synchronization challenges
Both architectures use the same protocol handling code for consistency:
// Shared command parsing
struct CommandParts {
    std::string command;   // "SET", "GET", etc.
    std::string key;       // Key name
    std::string value;     // Value (for SET)
};

// Template-based processing for different storage types
template<typename DataStore>
std::string process_command_with_store(const CommandParts& parts, DataStore& data);
This design enables:
- Code Reuse: Same protocol logic across architectures
- Consistency: Identical command behavior in both modes
- Maintainability: Single source of truth for Redis protocol
- Testing: Compare architectures with identical functionality
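To make the template concrete, here is a hedged sketch of how process_command_with_store could be implemented over a map-like DataStore, using the CommandParts struct above; the project's redis_utils.cpp may differ in details:

// Sketch of template-based command processing over a map-like store;
// the real process_command_with_store may differ.
#include <string>

template <typename DataStore>
std::string process_command_with_store(const CommandParts& parts, DataStore& data) {
    if (parts.command == "SET") {
        data[parts.key] = parts.value;
        return "+OK\r\n";                               // RESP simple string
    }
    if (parts.command == "GET") {
        auto it = data.find(parts.key);
        if (it == data.end()) return "$-1\r\n";         // RESP null bulk string
        return "$" + std::to_string(it->second.size()) + "\r\n" + it->second + "\r\n";
    }
    if (parts.command == "DEL") {
        return ":" + std::to_string(data.erase(parts.key)) + "\r\n";   // RESP integer
    }
    if (parts.command == "EXISTS") {
        return ":" + std::to_string(data.count(parts.key)) + "\r\n";   // RESP integer
    }
    return "-ERR unknown command\r\n";                  // RESP error
}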
A critical networking challenge solved in both server implementations:
The Problem: TCP doesn't guarantee that complete commands arrive in a single recv()
call. Network conditions and command size can cause partial command reception.
Example Scenario:
# Large command might arrive fragmented:
Command: "SET user:12345:profile:data some_very_long_value_that_spans_multiple_packets"
# recv() call 1: "SET user:12345:profile:data some_very_lo"
# recv() call 2: "ng_value_that_spans_multiple_packets"
Robust Solution Implemented:
- Persistent Buffers: Maintain a std::string read_buffer per client
- Data Accumulation: Append all recv() data: read_buffer.append(buffer, bytes_read)
- Boundary Detection: Scan for complete commands: buffer.find('\n')
- Command Extraction: Extract complete commands and preserve partial data
- Flexible Terminators: Handle both \r\n (Redis standard) and \n (telnet/netcat)
Code Example:
// Both architectures use this pattern
while ((pos = read_buffer.find('\n')) != std::string::npos) {
    std::string complete_command = read_buffer.substr(0, pos);
    read_buffer.erase(0, pos + 1); // Remove processed command

    // Remove \r if present (\r\n handling)
    if (!complete_command.empty() && complete_command.back() == '\r') {
        complete_command.pop_back();
    }

    // Process complete command
    process_command(complete_command);
}
// Partial commands remain in buffer for next recv() cycle
Testing Verification: Simulated partial reception by reducing buffer sizes to force command fragmentation across multiple network calls.
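The same behavior can also be exercised in a unit test by feeding a command in fragments. The sketch below assumes a small local helper (extract_commands) wrapping the buffer loop shown above; it is illustrative rather than the project's actual test code.

// Hypothetical GoogleTest case: feed a command in two fragments and verify
// that only complete lines are extracted while partial data is preserved.
#include <gtest/gtest.h>
#include <string>
#include <vector>

static std::vector<std::string> extract_commands(std::string& read_buffer) {
    std::vector<std::string> commands;
    std::string::size_type pos;
    while ((pos = read_buffer.find('\n')) != std::string::npos) {
        std::string cmd = read_buffer.substr(0, pos);
        read_buffer.erase(0, pos + 1);
        if (!cmd.empty() && cmd.back() == '\r') cmd.pop_back();
        commands.push_back(cmd);
    }
    return commands;
}

TEST(PartialCommandTest, FragmentedSetCommand) {
    std::string read_buffer;

    read_buffer.append("SET name jo");                      // first recv() fragment
    EXPECT_TRUE(extract_commands(read_buffer).empty());     // no complete command yet

    read_buffer.append("hn\r\nGET na");                     // second fragment + start of next command
    auto commands = extract_commands(read_buffer);
    ASSERT_EQ(commands.size(), 1u);
    EXPECT_EQ(commands[0], "SET name john");
    EXPECT_EQ(read_buffer, "GET na");                       // partial command preserved
}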
- Create component directory in src/
- Add component's CMakeLists.txt
- Update top-level CMakeLists.txt
- Add corresponding test directory
- Update test CMakeLists.txt
- Unit tests using GoogleTest framework
- Tests located in test/unit/
- Each component has its own test directory
- Test helper functions in cmake/TestHelper.cmake
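A storage test in this layout might look roughly like the following sketch, which assumes the hypothetical Database interface sketched earlier and may not match the actual database_test.cpp:

// Hypothetical unit test for the storage engine; assumes the Database sketch
// shown earlier, so names may not match the real database_test.cpp.
#include <gtest/gtest.h>

TEST(DatabaseTest, SetGetDeleteRoundTrip) {
    Database db;

    db.set("name", "john");
    ASSERT_TRUE(db.exists("name"));
    EXPECT_EQ(db.get("name").value(), "john");

    EXPECT_TRUE(db.del("name"));
    EXPECT_FALSE(db.exists("name"));
    EXPECT_FALSE(db.get("name").has_value());
}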
- Understanding distributed systems concepts
- Learning concurrent programming patterns
- Implementing high-performance data structures
- Network programming fundamentals
- Database system internals
- ✅ Dual Server Architectures
  - Event-driven server with I/O multiplexing (select())
  - Multi-threaded server with thread-per-client model
  - Command-line mode selection and configuration
  - Production-quality buffer management for both architectures
- ✅ Redis Protocol Implementation
  - RESP (Redis Serialization Protocol) compliance
  - GET/SET/DEL/EXISTS operations with proper error handling
  - Flexible command termination (\r\n and \n)
  - Shared protocol utilities for code consistency
- ✅ Production-Quality Networking
  - Robust partial command handling across multiple recv() calls
  - Graceful client connection management and cleanup
  - Signal handling and graceful shutdown procedures
  - Thread-safe database operations (multi-threaded mode)
- 🚧 Advanced Features (Planned)
  - Additional Redis commands (INCR, DECR, APPEND, etc.)
  - Data structure operations (Lists, Sets, Hashes)
  - Persistence mechanisms (RDB snapshots, AOF)
  - Basic replication (master-slave)
This project provides hands-on experience with:
- Concurrency Models: Event-driven vs multi-threaded approaches
- I/O Multiplexing: Understanding select(), poll(), and event loops
- Resource Management: Memory usage patterns and scalability trade-offs
- Performance Analysis: Comparing architectures under different loads
- Network Programming: TCP sockets, partial reads, connection management
- Protocol Implementation: RESP parsing and response formatting
- Signal Handling: Graceful shutdown in multi-threaded environments
- Buffer Management: Production-quality data handling across network calls
- Modular Design: Clean separation of concerns (networking, storage, protocol)
- Template Programming: Generic interfaces for different storage backends
- Code Reuse: Shared utilities across different architectures
- API Design: Command-line interfaces and configuration management
This is primarily a learning project, but contributions are welcome! Areas of interest:
- Performance Benchmarking: Add tools to compare architecture performance
- Additional Commands: Implement more Redis commands (INCR, APPEND, etc.)
- Protocol Enhancements: Support for Redis pipelining
- Testing: Expand test coverage for edge cases
- Persistence: Add RDB or AOF persistence mechanisms
- Replication: Implement basic master-slave replication
- Clustering: Explore Redis cluster concepts
- Monitoring: Add metrics and observability features
Please feel free to:
- Report bugs or unexpected behavior
- Suggest architectural improvements
- Submit pull requests with enhancements
- Share performance analysis or benchmarks
- Redis for the inspiration and protocol specification
- Stevens' Network Programming for socket programming concepts
- Event-driven programming patterns from modern server architectures
- C++ community for modern C++ practices and guidelines