A high-performance in-memory store written in C++.
The architecture is shared-nothing and lock-free: data is partitioned across multiple threads. Which thread accepts a connection depends on the underlying OS. On Linux, the kernel load-balances connections to the TCP port across the threads. On other operating systems, a single thread accepts every connection and passes the data to the correct thread based on the request key.
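Routing by request key can be pictured as hashing the key and taking the result modulo the number of partitions. The sketch below is only an illustration in Python: the name `partition_for_key` and the choice of CRC32 are assumptions, since the hash the store actually uses is not described here.

```python
import zlib


def partition_for_key(key: bytes, num_partitions: int) -> int:
    """Return the index of the partition (thread) that would own `key`."""
    if num_partitions <= 0:
        raise ValueError("num_partitions must be positive")
    # CRC32 is a stand-in hash (an assumption). It is stable across runs,
    # unlike Python's built-in hash() for bytes, so a key always maps to
    # the same partition.
    return zlib.crc32(key) % num_partitions
```

Because the mapping is deterministic, every request for a given key reaches the same thread, which is what lets each partition be accessed without locks.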
Uses a custom binary protocol over TCP called FMP. Features implemented:
- Pipelining: avoids head-of-line blocking
- Persistent connections: multiple requests per connection
- Python client
A request consists of a fixed-size header followed by a variable number of "parts" (arguments).
| Section | Field | Size | Description |
|---|---|---|---|
| Header | Magic Number | 4 bytes | 0xCAFEBEEF (Sanity check) |
| Header | Total Length | 4 bytes | Total size (Header + Body) |
| Header | Request ID | 4 bytes | Unique ID |
| Body | Part Length | 4 bytes | Length of the argument (repeated for each part) |
| Body | Part Data | Variable | The argument data (repeated for each part) |
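As a rough illustration of the request layout above, the following Python sketch packs a request frame. The byte order is not stated in the table, so network (big-endian) order is assumed here; `encode_request` and the command names in the usage note are hypothetical.

```python
import struct
from typing import Sequence

MAGIC = 0xCAFEBEEF
# Assumption: network (big-endian) byte order; the table does not specify it.
REQUEST_HEADER = struct.Struct("!III")  # magic number, total length, request ID


def encode_request(request_id: int, parts: Sequence[bytes]) -> bytes:
    """Build one request frame: fixed header, then length-prefixed parts."""
    # Each part is a 4-byte length followed by the raw argument bytes.
    body = b"".join(struct.pack("!I", len(part)) + part for part in parts)
    # Total Length covers the header and the body.
    total_length = REQUEST_HEADER.size + len(body)
    return REQUEST_HEADER.pack(MAGIC, total_length, request_id) + body
```

For example, `encode_request(1, [b"SET", b"key", b"value"])` yields a 35-byte frame: 12 bytes of header plus three parts of 7, 7, and 9 bytes.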

A response consists of a fixed-size header followed by a variable-length payload.

| Section | Field | Size | Description |
|---|---|---|---|
| Header | Magic Number | 4 bytes | 0xCAFEBEEF |
| Header | Total Length | 4 bytes | Total size |
| Header | Request ID | 4 bytes | Matches request ID |
| Header | Status | 4 bytes | Status code |
| Body | Payload | Variable | Response data |
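The response layout above can be parsed along the following lines. This is a sketch under two assumptions: fields use network (big-endian) byte order, and Total Length counts the header plus the payload, as it does for requests. The name `decode_responses` is hypothetical. Because responses may arrive back to back on a pipelined connection, the function splits a buffer into as many complete frames as it contains.

```python
import struct

MAGIC = 0xCAFEBEEF
# Assumption: network (big-endian) byte order.
RESPONSE_HEADER = struct.Struct("!IIII")  # magic, total length, request ID, status


def decode_responses(buffer: bytes):
    """Split `buffer` into complete response frames.

    Returns (responses, remainder): `responses` is a list of
    (request_id, status, payload) tuples, and `remainder` holds the
    trailing bytes of a frame that has not fully arrived yet.
    """
    responses = []
    offset = 0
    while len(buffer) - offset >= RESPONSE_HEADER.size:
        magic, total_length, request_id, status = RESPONSE_HEADER.unpack_from(
            buffer, offset
        )
        if magic != MAGIC:
            raise ValueError("bad magic number")
        if total_length < RESPONSE_HEADER.size:
            raise ValueError("total length is smaller than the header")
        if len(buffer) - offset < total_length:
            break  # incomplete frame: wait for more bytes
        payload = buffer[offset + RESPONSE_HEADER.size : offset + total_length]
        responses.append((request_id, status, payload))
        offset += total_length
    return responses, buffer[offset:]
```

A caller would append newly received bytes to the remainder and call the function again, matching each returned request ID against the requests it has in flight.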
- Client
  - Messaging multiplexing: `request_id` is already specified in FMP.
  - Connection pooling: one client should be able to handle many concurrent requests.
- Server
  - Messaging multiplexing
  - Use stack allocation more to maximize performance
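One way client-side multiplexing could build on the request ID is a table of in-flight requests keyed by that ID. The class below is a hypothetical sketch, not part of the existing Python client: `PendingRequests`, `register`, and `resolve` are illustrative names.

```python
import itertools


class PendingRequests:
    """Track in-flight requests so responses can be matched by request ID."""

    def __init__(self):
        self._ids = itertools.count(1)
        self._pending = {}

    def register(self, on_response):
        """Reserve a request ID and remember the callback waiting on it."""
        # Mask to 32 bits so the ID fits the 4-byte Request ID field.
        request_id = next(self._ids) & 0xFFFFFFFF
        self._pending[request_id] = on_response
        return request_id

    def resolve(self, request_id, status, payload):
        """Deliver a response to its callback; return False if the ID is unknown."""
        callback = self._pending.pop(request_id, None)
        if callback is None:
            return False
        callback(status, payload)
        return True
```

Since each response carries the ID of its request, responses can be resolved in whatever order they arrive on a shared connection.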