Redis Clone in C++

A from-scratch implementation of Redis core functionality with two distinct server architectures, built as a hands-on study of distributed systems. The project demonstrates both event-driven and multi-threaded approaches to high-performance server design.

Project Overview

This project recreates Redis's core functionality with two distinct server architectures to provide deep insights into:

  • Event-driven programming with I/O multiplexing (select/poll)
  • Multi-threaded server design with synchronization challenges
  • Distributed system concepts and concurrent programming patterns
  • Network programming and protocol implementation
  • High-performance I/O and memory management

Key Educational Features

  • Side-by-side comparison of event-driven vs multi-threaded architectures
  • Command-line mode selection for easy switching between implementations
  • Production-quality buffer management and protocol handling
  • Comprehensive documentation of design decisions and trade-offs

Project Structure

redis-clone-cpp/
├── CMakeLists.txt          # Top-level CMake configuration
├── cmake/                  # CMake modules and utilities
│   ├── Dependencies.cmake  # External dependency management
│   └── TestHelper.cmake   # Test configuration helpers
├── src/                   # Source code
│   ├── main.cpp          # Entry point with command-line parsing
│   ├── CMakeLists.txt    # Source build configuration
│   ├── storage/          # Storage engine implementation
│   │   ├── CMakeLists.txt
│   │   ├── include/     # Public headers
│   │   │   └── storage/
│   │   │       └── database.h
│   │   └── src/        # Implementation files
│   │       └── database.cpp
│   └── network/         # Network server implementation
│       ├── CMakeLists.txt
│       ├── include/     # Public headers
│       │   └── network/
│       │       ├── server.h      # Both server architectures
│       │       └── redis_utils.h # Shared protocol utilities
│       └── src/        # Implementation files
│           ├── server.cpp      # Server implementations
│           └── redis_utils.cpp # Shared utilities
├── test/                # Test files
│   ├── CMakeLists.txt  # Test configuration
│   └── unit/           # Unit tests
│       ├── storage/    # Storage engine tests
│       │   ├── CMakeLists.txt
│       │   └── database_test.cpp
│       └── network/    # Network component tests
│           ├── CMakeLists.txt
│           └── server_test.cpp
└── scripts/            # Build and test scripts
    ├── build.sh       # Build script
    └── test.sh        # Test runner script

Server Architectures

🚀 Event-Driven Server (Default) - RedisServerEventLoop

Production-style architecture using I/O multiplexing for handling multiple clients:

  • Single-threaded event loop with select() system call
  • Non-blocking I/O operations (MSG_DONTWAIT)
  • Client state management with per-client read/write buffers
  • Memory efficient - no thread overhead per connection
  • Scalable to thousands of concurrent connections

Best for: Learning modern server architectures, understanding event loops, and high-concurrency scenarios.

🧵 Multi-Threaded Server (Educational) - RedisServer

Traditional thread-per-client architecture for understanding threading concepts:

  • One thread per client connection model
  • Thread-safe database operations with mutex synchronization
  • Signal handling in worker threads
  • Resource isolation per client connection
  • Simpler logic but higher memory overhead

Best for: Understanding threading, synchronization challenges, and resource management.

Components

Current Implementation

  • Dual Server Architectures:

    • Event-driven (default): Single-threaded with I/O multiplexing using select()
    • Multi-threaded (educational): Thread-per-client with mutex synchronization
  • Command-Line Interface:

    • Mode selection: --mode=eventloop|threaded
    • Port configuration: --port=<number> (default: 6379)
    • Help system: -h, --help for usage information
    • Backward compatibility: Supports legacy positional arguments
  • Storage Engine: Thread-safe key-value storage implementation

    • In-memory key-value store using std::unordered_map
    • String values support with GET/SET/DEL/EXISTS operations
    • Thread-safe database operations (multi-threaded mode)
    • Template-based storage abstraction (event-loop mode)
  • Network Layer: Production-quality TCP server with robust protocol handling

    • Redis Protocol (RESP): Complete command parsing and response formatting
    • Buffer Management: Handles partial commands across multiple recv() calls
    • Command Processing: Flexible termination handling (\r\n and \n)
    • Connection Management: Graceful client disconnection and cleanup
    • Error Handling: Comprehensive error responses and network failure recovery
  • Shared Utilities: redis_utils module for protocol consistency

    • Command parsing: Structured command extraction (CommandParts)
    • Template specialization: Support for different storage backends
    • Protocol compliance: RESP format responses for all commands
    • Code reuse: Shared logic between both server architectures
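For reference, the RESP responses the commands above produce can be sketched as small formatting helpers. The function names here are illustrative, not the project's actual redis_utils API:

```cpp
#include <string>

// Illustrative RESP formatting helpers (names are hypothetical).
std::string resp_simple(const std::string& s)  { return "+" + s + "\r\n"; }       // e.g. +OK
std::string resp_error(const std::string& msg) { return "-ERR " + msg + "\r\n"; }
std::string resp_bulk(const std::string& v) {                                     // $<len>\r\n<data>\r\n
    return "$" + std::to_string(v.size()) + "\r\n" + v + "\r\n";
}
std::string resp_null_bulk() { return "$-1\r\n"; }                                // GET on a missing key
std::string resp_integer(long long n) { return ":" + std::to_string(n) + "\r\n"; } // EXISTS / DEL counts
```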

Planned Components

  • Data Structures: Redis-like data structure support
  • Replication: Master-slave replication system
  • Persistence: Snapshot and AOF persistence

Build System

The project uses CMake as its build system with a modern setup:

CMake Structure

  • Top-level CMake: Project-wide settings and component inclusion
  • Component-level CMake: Individual component configurations
  • Modern CMake Practices:
    • Target-based configuration
    • Proper dependency management
    • Clear public/private interfaces

Build Configuration

  • C++17 standard
  • GoogleTest for unit testing
  • Modular component architecture
  • Separate build outputs for binaries and libraries

Getting Started

Prerequisites

  • CMake 3.14 or higher
  • C++17 compatible compiler
  • Git

Building the Project

# Clone the repository
git clone https://github.com/yourusername/redis-clone-cpp.git
cd redis-clone-cpp

# Build the project
./scripts/build.sh

Running Tests

# Run all tests
./scripts/test.sh

# Run specific test components
./scripts/test.sh network_test     # Run network tests
./scripts/test.sh database_test    # Run storage tests

# Run tests without rebuilding
./scripts/test.sh --no-build network_test

Running the Server

Quick Start

# Build the project
./scripts/build.sh

# Start with default settings (Event-driven server on port 6379)
./build/bin/redis-clone-cpp

# Expected output:
# Redis Clone Server v0.1.0
# Mode: Event Loop
# Port: 6379
# PID: 12345
# --------------------------------
# Event loop server ready to accept connections

Server Mode Selection

# Event-driven server (default) - recommended for learning modern architectures
./build/bin/redis-clone-cpp --mode=eventloop

# Multi-threaded server - educational comparison
./build/bin/redis-clone-cpp --mode=threaded

# Custom port
./build/bin/redis-clone-cpp --mode=eventloop --port=8080

# Using environment variable
REDIS_CLONE_PORT=9000 ./build/bin/redis-clone-cpp --mode=threaded

Help and Usage

# Display all options
./build/bin/redis-clone-cpp --help

# Output:
# Usage: ./build/bin/redis-clone-cpp [OPTIONS]
# Options:
#   --mode=<type>     Server mode: 'eventloop' (default) or 'threaded'
#   --port=<number>   Port number (default: 6379)
#   -h, --help        Show this help message
#
# Examples:
#   ./build/bin/redis-clone-cpp                    # Event loop server on port 6379
#   ./build/bin/redis-clone-cpp --mode=threaded    # Multi-threaded server
#   ./build/bin/redis-clone-cpp --port=8080        # Custom port

Testing with netcat

Once the server is running, you can test it using netcat:

# Connect to server
nc localhost 6379

# Send commands manually:
SET name john
GET name
SET age 25
GET age
DEL name
EXISTS name
EXISTS age
QUIT    # Gracefully close the connection

Testing Multiple Concurrent Connections

You can test both server architectures with multiple simultaneous connections:

# Terminal 1: Start event-driven server
./build/bin/redis-clone-cpp --mode=eventloop

# Terminal 2-4: Connect multiple clients simultaneously
nc localhost 6379  # Each terminal can send commands independently

# Example commands in each terminal:
# Terminal 2: SET user1 alice
# Terminal 3: SET user2 bob  
# Terminal 4: GET user1

# Then test multi-threaded mode:
./build/bin/redis-clone-cpp --mode=threaded
# Connect same number of clients and observe behavior

Architecture Comparison & Design

Event-Driven vs Multi-Threaded: A Learning Comparison

Both server architectures implement the same Redis protocol but use fundamentally different approaches to concurrency:

Aspect             | Event-Driven (eventloop)          | Multi-Threaded (threaded)
-------------------|-----------------------------------|-------------------------------
Concurrency Model  | Single thread + I/O multiplexing  | One thread per client
Resource Usage     | Low memory, single thread         | Higher memory, N threads
Scalability        | Handles thousands of clients      | Limited by thread overhead
Complexity         | Complex state management          | Simpler per-client logic
Debugging          | Single-threaded debugging         | Multi-threaded race conditions
Best Use Case      | High-concurrency production       | Educational/simple scenarios

Event-Driven Server Architecture (RedisServerEventLoop)

Event Loop (Single Thread)
┌─────────────────────────────────────────────────────────────┐
│                    select() System Call                     │
│  ┌─────────────┐  ┌─────────────┐  ┌─────────────┐          │
│  │   Client    │  │   Client    │  │   Client    │          │
│  │   Socket    │  │   Socket    │  │   Socket    │   ...    │
│  │     #3      │  │     #4      │  │     #5      │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
│         │                │                │                 │
│  ┌──────▼──────┐  ┌──────▼──────┐  ┌──────▼──────┐          │
│  │ Read Buffer │  │ Read Buffer │  │ Read Buffer │          │
│  │Write Buffer │  │Write Buffer │  │Write Buffer │          │
│  └─────────────┘  └─────────────┘  └─────────────┘          │
│                                                              │
│  ┌─────────────────────────────────────────────────────────┐│
│  │             Shared Data Store                           ││
│  │         std::unordered_map<string, string>              ││
│  └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘

Key Benefits:

  • Memory Efficient: No thread stack overhead per client
  • Cache Friendly: Single thread uses CPU cache effectively
  • No Locks Needed: No synchronization primitives required
  • Predictable Performance: No context switching overhead
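The readiness check at the heart of this loop can be sketched with select() on an ordinary pipe standing in for client sockets (POSIX only). poll_once is a hypothetical helper for illustration, not the project's API:

```cpp
#include <sys/select.h>
#include <unistd.h>
#include <string>

// One iteration of a select()-style readiness check: ask the kernel whether
// read_fd has data, and if so, drain what is currently available.
std::string poll_once(int read_fd) {
    fd_set readfds;
    FD_ZERO(&readfds);
    FD_SET(read_fd, &readfds);

    timeval tv{0, 0};  // zero timeout: poll without blocking
    int ready = select(read_fd + 1, &readfds, nullptr, nullptr, &tv);
    if (ready > 0 && FD_ISSET(read_fd, &readfds)) {
        char buf[64];
        ssize_t n = read(read_fd, buf, sizeof(buf));
        return n > 0 ? std::string(buf, static_cast<std::size_t>(n)) : "";
    }
    return "";  // nothing readable yet; the real loop would move on to other fds
}
```

The real event loop does the same thing across the listening socket and every client socket at once, which is what lets one thread serve many connections.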

Multi-Threaded Server Architecture (RedisServer)

Main Thread                     Worker Threads
     │                               │
┌────▼────┐                   ┌──────▼──────┐
│ Accept  │                   │   Client    │
│  Loop   │─────────────────▶ │   Handler   │
│         │                   │   Thread    │
└─────────┘                   └─────────────┘
                                     │
                              ┌──────▼──────┐
                              │   Command   │
                              │   Buffer    │◄─── recv() calls
                              │   Manager   │
                              └─────────────┘
                                     │
                              ┌──────▼──────┐
                              │  Database   │◄─── Mutex protected
                              │ Operations  │     std::lock_guard
                              └─────────────┘

Key Benefits:

  • Simple Logic: Each client handled independently
  • Isolation: Client failures don't affect others
  • Familiar Model: Traditional threading approach
  • Educational Value: Demonstrates synchronization challenges
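The mutex-protected database access in this mode can be sketched as follows; ThreadSafeStore and simulate_clients are illustrative stand-ins, not the project's actual classes:

```cpp
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>
#include <vector>

// Hypothetical sketch of thread-safe storage in threaded mode:
// every operation takes the mutex, as in the real std::lock_guard pattern.
class ThreadSafeStore {
public:
    void set(const std::string& key, const std::string& value) {
        std::lock_guard<std::mutex> lock(mutex_);  // serialize concurrent writers
        data_[key] = value;
    }
    std::size_t size() {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_.size();
    }
private:
    std::mutex mutex_;
    std::unordered_map<std::string, std::string> data_;
};

// Each "client" runs on its own thread, mirroring the accept loop's
// thread-per-connection model.
std::size_t simulate_clients(ThreadSafeStore& store, int n_clients) {
    std::vector<std::thread> threads;
    for (int i = 0; i < n_clients; ++i)
        threads.emplace_back([&store, i] {
            store.set("user" + std::to_string(i), "value");
        });
    for (auto& t : threads) t.join();
    return store.size();
}
```

Without the lock_guard, concurrent writes to the unordered_map would be a data race; this is exactly the synchronization cost the event-loop design avoids.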

Shared Protocol Implementation (redis_utils)

Both architectures use the same protocol handling code for consistency:

// Shared command parsing
struct CommandParts {
    std::string command;  // "SET", "GET", etc.
    std::string key;      // Key name
    std::string value;    // Value (for SET)
};

// Template-based processing for different storage types
template<typename DataStore>
std::string process_command_with_store(const CommandParts& parts, DataStore& data);

This design enables:

  • Code Reuse: Same protocol logic across architectures
  • Consistency: Identical command behavior in both modes
  • Maintainability: Single source of truth for Redis protocol
  • Testing: Compare architectures with identical functionality

Production-Quality Buffer Management

A critical networking challenge solved in both server implementations:

The Problem: TCP doesn't guarantee that complete commands arrive in a single recv() call. Network conditions and command size can cause partial command reception.

Example Scenario:

# Large command might arrive fragmented:
Command: "SET user:12345:profile:data some_very_long_value_that_spans_multiple_packets"

# recv() call 1: "SET user:12345:profile:data some_very_lo"
# recv() call 2: "ng_value_that_spans_multiple_packets"

Robust Solution Implemented:

  1. Persistent Buffers: Maintain std::string read_buffer per client
  2. Data Accumulation: Append all recv() data: read_buffer.append(buffer, bytes_read)
  3. Boundary Detection: Scan for complete commands: buffer.find('\n')
  4. Command Extraction: Extract complete commands and preserve partial data
  5. Flexible Terminators: Handle both \r\n (Redis standard) and \n (telnet/netcat)

Code Example:

// Both architectures use this pattern
while ((pos = read_buffer.find('\n')) != std::string::npos) {
    std::string complete_command = read_buffer.substr(0, pos);
    read_buffer.erase(0, pos + 1);  // Remove processed command
    
    // Remove \r if present (\r\n handling)
    if (!complete_command.empty() && complete_command.back() == '\r') {
        complete_command.pop_back();
    }
    
    // Process complete command
    process_command(complete_command);
}
// Partial commands remain in buffer for next recv() cycle

Testing Verification: Partial reception was simulated by reducing buffer sizes, forcing commands to fragment across multiple network calls.
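The accumulation loop above can also be exercised directly, without a socket, by feeding recv()-sized fragments into a buffer and checking that commands only emerge once complete. on_data is a small test harness around the same logic, not the project's API:

```cpp
#include <string>
#include <vector>

// Feed one "recv() fragment" into the persistent buffer and return any
// complete commands found, mirroring the buffer-management loop above.
std::vector<std::string> on_data(std::string& read_buffer, const std::string& fragment) {
    read_buffer.append(fragment);  // step 2: accumulate all received data
    std::vector<std::string> commands;
    std::size_t pos;
    while ((pos = read_buffer.find('\n')) != std::string::npos) {  // step 3: boundary scan
        std::string cmd = read_buffer.substr(0, pos);
        read_buffer.erase(0, pos + 1);                             // step 4: extract, keep the rest
        if (!cmd.empty() && cmd.back() == '\r') cmd.pop_back();    // step 5: \r\n or \n
        commands.push_back(cmd);
    }
    return commands;  // partial data stays in read_buffer for the next call
}
```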

Development Workflow

Adding New Components

  1. Create component directory in src/
  2. Add component's CMakeLists.txt
  3. Update top-level CMakeLists.txt
  4. Add corresponding test directory
  5. Update test CMakeLists.txt

Testing

  • Unit tests using GoogleTest framework
  • Tests located in test/unit/
  • Each component has its own test directory
  • Test helper functions in cmake/TestHelper.cmake

Project Goals

Educational Objectives

  • Understanding distributed systems concepts
  • Learning concurrent programming patterns
  • Implementing high-performance data structures
  • Network programming fundamentals
  • Database system internals

Technical Goals

  1. Dual Server Architectures

    • Event-driven server with I/O multiplexing (select())
    • Multi-threaded server with thread-per-client model
    • Command-line mode selection and configuration
    • Production-quality buffer management for both architectures
  2. Redis Protocol Implementation

    • RESP (Redis Serialization Protocol) compliance
    • GET/SET/DEL/EXISTS operations with proper error handling
    • Flexible command termination (\r\n and \n)
    • Shared protocol utilities for code consistency
  3. Production-Quality Networking

    • Robust partial command handling across multiple recv() calls
    • Graceful client connection management and cleanup
    • Signal handling and graceful shutdown procedures
    • Thread-safe database operations (multi-threaded mode)
  4. 🚧 Advanced Features (Planned)

    • Additional Redis commands (INCR, DECR, APPEND, etc.)
    • Data structure operations (Lists, Sets, Hashes)
    • Persistence mechanisms (RDB snapshots, AOF)
    • Basic replication (master-slave)

Learning Outcomes

This project provides hands-on experience with:

Distributed Systems Concepts

  • Concurrency Models: Event-driven vs multi-threaded approaches
  • I/O Multiplexing: Understanding select(), poll(), and event loops
  • Resource Management: Memory usage patterns and scalability trade-offs
  • Performance Analysis: Comparing architectures under different loads

Systems Programming

  • Network Programming: TCP sockets, partial reads, connection management
  • Protocol Implementation: RESP parsing and response formatting
  • Signal Handling: Graceful shutdown in multi-threaded environments
  • Buffer Management: Production-quality data handling across network calls

Software Architecture

  • Modular Design: Clean separation of concerns (networking, storage, protocol)
  • Template Programming: Generic interfaces for different storage backends
  • Code Reuse: Shared utilities across different architectures
  • API Design: Command-line interfaces and configuration management

Contributing

This is primarily a learning project, but contributions are welcome! Areas of interest:

Immediate Opportunities

  • Performance Benchmarking: Add tools to compare architecture performance
  • Additional Commands: Implement more Redis commands (INCR, APPEND, etc.)
  • Protocol Enhancements: Support for Redis pipelining
  • Testing: Expand test coverage for edge cases

Advanced Features

  • Persistence: Add RDB or AOF persistence mechanisms
  • Replication: Implement basic master-slave replication
  • Clustering: Explore Redis cluster concepts
  • Monitoring: Add metrics and observability features

Please feel free to:

  • Report bugs or unexpected behavior
  • Suggest architectural improvements
  • Submit pull requests with enhancements
  • Share performance analysis or benchmarks

Acknowledgments

  • Redis for the inspiration and protocol specification
  • Stevens' Network Programming for socket programming concepts
  • Event-driven programming patterns from modern server architectures
  • C++ community for modern C++ practices and guidelines
