A from-scratch implementation of Redis featuring dual server architectures for comprehensive distributed systems learning. This project demonstrates both event-driven and multi-threaded approaches to high-performance server design.
This project recreates Redis's core functionality with two distinct server architectures to provide deep insights into:
- Event-driven programming with I/O multiplexing (select/poll)
- Multi-threaded server design with synchronization challenges
- Distributed system concepts and concurrent programming patterns
- Network programming and protocol implementation
- High-performance I/O and memory management
- Side-by-side comparison of event-driven vs multi-threaded architectures
- Command-line mode selection for easy switching between implementations
- Production-quality buffer management and protocol handling
- Comprehensive documentation of design decisions and trade-offs
redis-clone-cpp/
├── CMakeLists.txt                  # Top-level CMake configuration
├── cmake/                          # CMake modules and utilities
│   ├── Dependencies.cmake          # External dependency management
│   └── TestHelper.cmake            # Test configuration helpers
├── src/                            # Source code
│   ├── main.cpp                    # Entry point with command-line parsing
│   ├── CMakeLists.txt              # Source build configuration
│   ├── storage/                    # Storage engine implementation
│   │   ├── CMakeLists.txt
│   │   ├── include/                # Public headers
│   │   │   └── storage/
│   │   │       └── database.h
│   │   └── src/                    # Implementation files
│   │       └── database.cpp
│   └── network/                    # Network server implementation
│       ├── CMakeLists.txt
│       ├── include/                # Public headers
│       │   └── network/
│       │       ├── server.h        # Both server architectures
│       │       └── redis_utils.h   # Shared protocol utilities
│       └── src/                    # Implementation files
│           ├── server.cpp          # Server implementations
│           └── redis_utils.cpp     # Shared utilities
├── test/                           # Test files
│   ├── CMakeLists.txt              # Test configuration
│   └── unit/                       # Unit tests
│       ├── storage/                # Storage engine tests
│       │   ├── CMakeLists.txt
│       │   └── database_test.cpp
│       └── network/                # Network component tests
│           ├── CMakeLists.txt
│           └── server_test.cpp
└── scripts/                        # Build and test scripts
    ├── build.sh                    # Build script
    └── test.sh                     # Test runner script
Production-style architecture using I/O multiplexing for handling multiple clients:
- Single-threaded event loop built on the select() system call
- Non-blocking I/O operations (MSG_DONTWAIT)
- Client state management with per-client read/write buffers
- Memory efficient - no thread overhead per connection
- Scalable to thousands of concurrent connections
Best for: Learning modern server architectures, understanding event loops, and high-concurrency scenarios.
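As a minimal sketch of this pattern (illustrative only, not the project's server.cpp), a select()-based loop watches the listening socket and every client socket from a single thread; error handling is trimmed to keep it short:

// Minimal sketch of a select()-based event loop (illustrative, not the
// project's server.cpp). Error handling is reduced for brevity.
#include <sys/select.h>
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <algorithm>
#include <vector>

int main() {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(6379);
    bind(listen_fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    listen(listen_fd, SOMAXCONN);

    std::vector<int> clients;
    for (;;) {
        fd_set read_fds;
        FD_ZERO(&read_fds);
        FD_SET(listen_fd, &read_fds);
        int max_fd = listen_fd;
        for (int fd : clients) { FD_SET(fd, &read_fds); max_fd = std::max(max_fd, fd); }

        // Block until the listening socket or any client socket becomes readable
        select(max_fd + 1, &read_fds, nullptr, nullptr, nullptr);

        if (FD_ISSET(listen_fd, &read_fds)) {
            clients.push_back(accept(listen_fd, nullptr, nullptr));  // new client
        }
        for (auto it = clients.begin(); it != clients.end();) {
            if (FD_ISSET(*it, &read_fds)) {
                char buf[1024];
                ssize_t n = recv(*it, buf, sizeof(buf), MSG_DONTWAIT);
                if (n <= 0) { close(*it); it = clients.erase(it); continue; }
                // ...append to this client's read buffer and parse complete commands...
            }
            ++it;
        }
    }
}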
Traditional thread-per-client architecture for understanding threading concepts:
- One thread per client connection model
- Thread-safe database operations with mutex synchronization
- Signal handling in worker threads
- Resource isolation per client connection
- Simpler logic but higher memory overhead
Best for: Understanding threading, synchronization challenges, and resource management.
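A minimal sketch of this model (again illustrative, not the project's server.cpp): the main thread blocks in accept() and spawns one detached worker thread per connection, with a mutex guarding the shared store:

// Minimal sketch of the thread-per-client model (illustrative only).
#include <sys/socket.h>
#include <netinet/in.h>
#include <unistd.h>
#include <mutex>
#include <string>
#include <thread>
#include <unordered_map>

std::unordered_map<std::string, std::string> g_store;
std::mutex g_store_mutex;  // guards g_store across worker threads

void handle_client(int client_fd) {
    char buf[1024];
    ssize_t n;
    while ((n = recv(client_fd, buf, sizeof(buf), 0)) > 0) {
        // ...accumulate into a per-client buffer, extract complete commands, then:
        std::lock_guard<std::mutex> lock(g_store_mutex);
        // ...apply GET/SET/DEL/EXISTS to g_store and send() the response...
    }
    close(client_fd);  // client disconnected or recv() failed
}

int main() {
    int listen_fd = socket(AF_INET, SOCK_STREAM, 0);
    sockaddr_in addr{};
    addr.sin_family = AF_INET;
    addr.sin_addr.s_addr = INADDR_ANY;
    addr.sin_port = htons(6379);
    bind(listen_fd, reinterpret_cast<sockaddr*>(&addr), sizeof(addr));
    listen(listen_fd, SOMAXCONN);

    for (;;) {
        int client_fd = accept(listen_fd, nullptr, nullptr);
        std::thread(handle_client, client_fd).detach();  // one thread per client
    }
}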
- Dual Server Architectures:
  - Event-driven (default): Single-threaded with I/O multiplexing using select()
  - Multi-threaded (educational): Thread-per-client with mutex synchronization
- Command-Line Interface:
  - Mode selection: --mode=eventloop|threaded
  - Port configuration: --port=<number> (default: 6379)
  - Help system: -h, --help for usage information
  - Backward compatibility: Supports legacy positional arguments
- Storage Engine: Thread-safe key-value storage implementation (a sketch appears after this list)
  - In-memory key-value store using std::unordered_map
  - String values with GET/SET/DEL/EXISTS operations
  - Thread-safe database operations (multi-threaded mode)
  - Template-based storage abstraction (event-loop mode)
- Network Layer: Production-quality TCP server with robust protocol handling
  - Redis Protocol (RESP): Complete command parsing and response formatting
  - Buffer Management: Handles partial commands across multiple recv() calls
  - Command Processing: Flexible termination handling (\r\n and \n)
  - Connection Management: Graceful client disconnection and cleanup
  - Error Handling: Comprehensive error responses and network failure recovery
- Shared Utilities: redis_utils module for protocol consistency
  - Command parsing: Structured command extraction (CommandParts)
  - Template specialization: Support for different storage backends
  - Protocol compliance: RESP format responses for all commands
  - Code reuse: Shared logic between both server architectures
- Data Structures: Redis-like data structure support
- Replication: Master-slave replication system
- Persistence: Snapshot and AOF persistence
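The storage engine bullets above map naturally onto a small mutex-guarded wrapper around std::unordered_map. Below is a hypothetical sketch of such a class; the actual storage/database.h interface may use different names and signatures.

// Hypothetical sketch of a thread-safe key-value store; the real
// storage/database.h may differ in naming and interface.
#include <mutex>
#include <optional>
#include <string>
#include <unordered_map>

class Database {
public:
    void set(const std::string& key, const std::string& value) {
        std::lock_guard<std::mutex> lock(mutex_);
        data_[key] = value;
    }

    std::optional<std::string> get(const std::string& key) const {
        std::lock_guard<std::mutex> lock(mutex_);
        auto it = data_.find(key);
        if (it == data_.end()) return std::nullopt;
        return it->second;
    }

    bool del(const std::string& key) {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_.erase(key) > 0;
    }

    bool exists(const std::string& key) const {
        std::lock_guard<std::mutex> lock(mutex_);
        return data_.count(key) > 0;
    }

private:
    mutable std::mutex mutex_;  // mutable so const readers can still lock
    std::unordered_map<std::string, std::string> data_;
};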
The project uses CMake as its build system with a modern setup:
- Top-level CMake: Project-wide settings and component inclusion
- Component-level CMake: Individual component configurations
- Modern CMake Practices:
  - Target-based configuration
  - Proper dependency management
  - Clear public/private interfaces
- C++17 standard
- GoogleTest for unit testing
- Modular component architecture
- Separate build outputs for binaries and libraries
- CMake 3.14 or higher
- C++17 compatible compiler
- Git
# Clone the repository
git clone https://github.com/yourusername/redis-clone-cpp.git
cd redis-clone-cpp
# Build the project
./scripts/build.sh
# Run all tests
./scripts/test.sh
# Run specific test components
./scripts/test.sh network_test # Run network tests
./scripts/test.sh database_test # Run storage tests
# Run tests without rebuilding
./scripts/test.sh --no-build network_test
# Build the project
./scripts/build.sh
# Start with default settings (Event-driven server on port 6379)
./build/bin/redis-clone-cpp
# Expected output:
# Redis Clone Server v0.1.0
# Mode: Event Loop
# Port: 6379
# PID: 12345
# --------------------------------
# Event loop server ready to accept connections
# Event-driven server (default) - recommended for learning modern architectures
./build/bin/redis-clone-cpp --mode=eventloop
# Multi-threaded server - educational comparison
./build/bin/redis-clone-cpp --mode=threaded
# Custom port
./build/bin/redis-clone-cpp --mode=eventloop --port=8080
# Using environment variable
REDIS_CLONE_PORT=9000 ./build/bin/redis-clone-cpp --mode=threaded
# Display all options
./build/bin/redis-clone-cpp --help
# Output:
# Usage: ./build/bin/redis-clone-cpp [OPTIONS]
# Options:
# --mode=<type> Server mode: 'eventloop' (default) or 'threaded'
# --port=<number> Port number (default: 6379)
# -h, --help Show this help message
#
# Examples:
# ./build/bin/redis-clone-cpp # Event loop server on port 6379
# ./build/bin/redis-clone-cpp --mode=threaded # Multi-threaded server
# ./build/bin/redis-clone-cpp --port=8080 # Custom port
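Internally, flags like these can be parsed with a short hand-rolled loop. The following is a hypothetical sketch of such parsing, not the project's actual main.cpp (which also accepts legacy positional arguments):

// Hypothetical sketch of --mode/--port parsing; the project's main.cpp may differ.
#include <iostream>
#include <string>

int main(int argc, char* argv[]) {
    std::string mode = "eventloop";  // default mode
    int port = 6379;                 // default Redis port

    for (int i = 1; i < argc; ++i) {
        std::string arg = argv[i];
        if (arg.rfind("--mode=", 0) == 0) {
            mode = arg.substr(7);
        } else if (arg.rfind("--port=", 0) == 0) {
            port = std::stoi(arg.substr(7));
        } else if (arg == "-h" || arg == "--help") {
            std::cout << "Usage: " << argv[0] << " [OPTIONS]\n";
            return 0;
        }
    }

    std::cout << "Mode: " << mode << ", Port: " << port << '\n';
    // ...start the selected server implementation here...
}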
Once the server is running, you can test it using netcat:
# Connect to server
nc localhost 6379
# Send commands manually:
SET name john
GET name
SET age 25
GET age
DEL name
EXISTS name
EXISTS age
QUIT # Gracefully close the connection
You can test both server architectures with multiple simultaneous connections:
# Terminal 1: Start event-driven server
./build/bin/redis-clone-cpp --mode=eventloop
# Terminal 2-4: Connect multiple clients simultaneously
nc localhost 6379 # Each terminal can send commands independently
# Example commands in each terminal:
# Terminal 2: SET user1 alice
# Terminal 3: SET user2 bob
# Terminal 4: GET user1
# Then test multi-threaded mode:
./build/bin/redis-clone-cpp --mode=threaded
# Connect same number of clients and observe behavior
Both server architectures implement the same Redis protocol but use fundamentally different approaches to concurrency:
| Aspect | Event-Driven (eventloop) | Multi-Threaded (threaded) |
|---|---|---|
| Concurrency Model | Single thread + I/O multiplexing | One thread per client |
| Resource Usage | Low memory, single thread | Higher memory, N threads |
| Scalability | Handles thousands of clients | Limited by thread overhead |
| Complexity | Complex state management | Simpler per-client logic |
| Debugging | Single-threaded debugging | Multi-threaded race conditions |
| Best Use Case | High-concurrency production | Educational/simple scenarios |
Event Loop (Single Thread)
┌─────────────────────────────────────────────────────────────┐
│ select() System Call │
│ ┌─────────────┐ ┌─────────────┐ ┌─────────────┐ │
│ │ Client │ │ Client │ │ Client │ │
│ │ Socket │ │ Socket │ │ Socket │ ... │
│ │ #3 │ │ #4 │ │ #5 │ │
│ └─────────────┘ └─────────────┘ └─────────────┘ │
│ │ │ │ │
│ ┌──────▼──────┐ ┌─────▼─────┐ ┌─────▼─────┐ │
│ │ Read Buffer │ │Read Buffer│ │Read Buffer│ │
│ │Write Buffer │ │Write Buffer│ │Write Buffer│ │
│ └─────────────┘ └───────────┘ └───────────┘ │
│ │
│ ┌─────────────────────────────────────────────────────────┐│
│ │ Shared Data Store ││
│ │ std::unordered_map<string, string> ││
│ └─────────────────────────────────────────────────────────┘│
└─────────────────────────────────────────────────────────────┘
Key Benefits:
- Memory Efficient: No thread stack overhead per client
- Cache Friendly: Single thread uses CPU cache effectively
- No Locks Needed: No synchronization primitives required
- Predictable Performance: No context switching overhead
Main Thread Worker Threads
│ │
┌────▼────┐ ┌──────▼──────┐
│ Accept │ │ Client │
│ Loop │─────────────────▶ │ Handler │
│ │ │ Thread │
└─────────┘ └─────────────┘
│
┌──────▼──────┐
│ Command │
│ Buffer │◄─── recv() calls
│ Manager │
└─────────────┘
│
┌──────▼──────┐
│ Database │◄─── Mutex protected
│ Operations │ std::lock_guard
└─────────────┘
Key Benefits:
- Simple Logic: Each client handled independently
- Isolation: Client failures don't affect others
- Familiar Model: Traditional threading approach
- Educational Value: Demonstrates synchronization challenges
Both architectures use the same protocol handling code for consistency:
// Shared command parsing
struct CommandParts {
    std::string command;   // "SET", "GET", etc.
    std::string key;       // Key name
    std::string value;     // Value (for SET)
};

// Template-based processing for different storage types
template<typename DataStore>
std::string process_command_with_store(const CommandParts& parts, DataStore& data);
This design enables:
- Code Reuse: Same protocol logic across architectures
- Consistency: Identical command behavior in both modes
- Maintainability: Single source of truth for Redis protocol
- Testing: Compare architectures with identical functionality
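To make the template concrete, here is a hedged sketch of how process_command_with_store could be implemented over a map-like DataStore, using the CommandParts struct above; the project's redis_utils.cpp may differ in details:

// Sketch of template-based command processing over a map-like store;
// the real process_command_with_store may differ.
#include <string>

template <typename DataStore>
std::string process_command_with_store(const CommandParts& parts, DataStore& data) {
    if (parts.command == "SET") {
        data[parts.key] = parts.value;
        return "+OK\r\n";                               // RESP simple string
    }
    if (parts.command == "GET") {
        auto it = data.find(parts.key);
        if (it == data.end()) return "$-1\r\n";         // RESP null bulk string
        return "$" + std::to_string(it->second.size()) + "\r\n" + it->second + "\r\n";
    }
    if (parts.command == "DEL") {
        return ":" + std::to_string(data.erase(parts.key)) + "\r\n";   // RESP integer
    }
    if (parts.command == "EXISTS") {
        return ":" + std::to_string(data.count(parts.key)) + "\r\n";   // RESP integer
    }
    return "-ERR unknown command\r\n";                  // RESP error
}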
A critical networking challenge solved in both server implementations:
The Problem: TCP doesn't guarantee that complete commands arrive in a single recv()
call. Network conditions and command size can cause partial command reception.
Example Scenario:
# Large command might arrive fragmented:
Command: "SET user:12345:profile:data some_very_long_value_that_spans_multiple_packets"
# recv() call 1: "SET user:12345:profile:data some_very_lo"
# recv() call 2: "ng_value_that_spans_multiple_packets"
Robust Solution Implemented:
- Persistent Buffers: Maintain a std::string read_buffer per client
- Data Accumulation: Append all recv() data: read_buffer.append(buffer, bytes_read)
- Boundary Detection: Scan for complete commands: buffer.find('\n')
- Command Extraction: Extract complete commands and preserve partial data
- Flexible Terminators: Handle both \r\n (Redis standard) and \n (telnet/netcat)
Code Example:
// Both architectures use this pattern
while ((pos = read_buffer.find('\n')) != std::string::npos) {
    std::string complete_command = read_buffer.substr(0, pos);
    read_buffer.erase(0, pos + 1); // Remove processed command

    // Remove \r if present (\r\n handling)
    if (!complete_command.empty() && complete_command.back() == '\r') {
        complete_command.pop_back();
    }

    // Process complete command
    process_command(complete_command);
}
// Partial commands remain in buffer for next recv() cycle
Testing Verification: Simulated partial reception by reducing buffer sizes to force command fragmentation across multiple network calls.
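The same behavior can also be exercised in a unit test by feeding a command in fragments. The sketch below assumes a small local helper (extract_commands) wrapping the buffer loop shown above; it is illustrative rather than the project's actual test code.

// Hypothetical GoogleTest case: feed a command in two fragments and verify
// that only complete lines are extracted while partial data is preserved.
#include <gtest/gtest.h>
#include <string>
#include <vector>

static std::vector<std::string> extract_commands(std::string& read_buffer) {
    std::vector<std::string> commands;
    std::string::size_type pos;
    while ((pos = read_buffer.find('\n')) != std::string::npos) {
        std::string cmd = read_buffer.substr(0, pos);
        read_buffer.erase(0, pos + 1);
        if (!cmd.empty() && cmd.back() == '\r') cmd.pop_back();
        commands.push_back(cmd);
    }
    return commands;
}

TEST(PartialCommandTest, FragmentedSetCommand) {
    std::string read_buffer;

    read_buffer.append("SET name jo");                      // first recv() fragment
    EXPECT_TRUE(extract_commands(read_buffer).empty());     // no complete command yet

    read_buffer.append("hn\r\nGET na");                     // second fragment + start of next command
    auto commands = extract_commands(read_buffer);
    ASSERT_EQ(commands.size(), 1u);
    EXPECT_EQ(commands[0], "SET name john");
    EXPECT_EQ(read_buffer, "GET na");                       // partial command preserved
}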
- Create component directory in src/
- Add component's CMakeLists.txt
- Update top-level CMakeLists.txt
- Add corresponding test directory
- Update test CMakeLists.txt
- Unit tests using GoogleTest framework
- Tests located in test/unit/
- Each component has its own test directory
- Test helper functions in cmake/TestHelper.cmake
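A storage test in this layout might look roughly like the following sketch, which assumes the hypothetical Database interface sketched earlier and may not match the actual database_test.cpp:

// Hypothetical unit test for the storage engine; assumes the Database sketch
// shown earlier, so names may not match the real database_test.cpp.
#include <gtest/gtest.h>

TEST(DatabaseTest, SetGetDeleteRoundTrip) {
    Database db;

    db.set("name", "john");
    ASSERT_TRUE(db.exists("name"));
    EXPECT_EQ(db.get("name").value(), "john");

    EXPECT_TRUE(db.del("name"));
    EXPECT_FALSE(db.exists("name"));
    EXPECT_FALSE(db.get("name").has_value());
}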
- Understanding distributed systems concepts
- Learning concurrent programming patterns
- Implementing high-performance data structures
- Network programming fundamentals
- Database system internals
- ✅ Dual Server Architectures
  - Event-driven server with I/O multiplexing (select())
  - Multi-threaded server with thread-per-client model
  - Command-line mode selection and configuration
  - Production-quality buffer management for both architectures
- ✅ Redis Protocol Implementation
  - RESP (Redis Serialization Protocol) compliance
  - GET/SET/DEL/EXISTS operations with proper error handling
  - Flexible command termination (\r\n and \n)
  - Shared protocol utilities for code consistency
- ✅ Production-Quality Networking
  - Robust partial command handling across multiple recv() calls
  - Graceful client connection management and cleanup
  - Signal handling and graceful shutdown procedures
  - Thread-safe database operations (multi-threaded mode)
- 🚧 Advanced Features (Planned)
  - Additional Redis commands (INCR, DECR, APPEND, etc.)
  - Data structure operations (Lists, Sets, Hashes)
  - Persistence mechanisms (RDB snapshots, AOF)
  - Basic replication (master-slave)
This project provides hands-on experience with:
- Concurrency Models: Event-driven vs multi-threaded approaches
- I/O Multiplexing: Understanding select(), poll(), and event loops
- Resource Management: Memory usage patterns and scalability trade-offs
- Performance Analysis: Comparing architectures under different loads
- Network Programming: TCP sockets, partial reads, connection management
- Protocol Implementation: RESP parsing and response formatting
- Signal Handling: Graceful shutdown in multi-threaded environments
- Buffer Management: Production-quality data handling across network calls
- Modular Design: Clean separation of concerns (networking, storage, protocol)
- Template Programming: Generic interfaces for different storage backends
- Code Reuse: Shared utilities across different architectures
- API Design: Command-line interfaces and configuration management
This is primarily a learning project, but contributions are welcome! Areas of interest:
- Performance Benchmarking: Add tools to compare architecture performance
- Additional Commands: Implement more Redis commands (INCR, APPEND, etc.)
- Protocol Enhancements: Support for Redis pipelining
- Testing: Expand test coverage for edge cases
- Persistence: Add RDB or AOF persistence mechanisms
- Replication: Implement basic master-slave replication
- Clustering: Explore Redis cluster concepts
- Monitoring: Add metrics and observability features
Please feel free to:
- Report bugs or unexpected behavior
- Suggest architectural improvements
- Submit pull requests with enhancements
- Share performance analysis or benchmarks
- Redis for the inspiration and protocol specification
- Stevens' Network Programming for socket programming concepts
- Event-driven programming patterns from modern server architectures
- C++ community for modern C++ practices and guidelines