STDIO | Stateful HTTP | Stateless HTTP | SSE
A Production-Ready Model Context Protocol Server Teaching Hybrid Storage, Distributed Systems, and Resilient Error Handling
Learn by building a world-class, horizontally-scalable MCP server that is robust by design.
This repository is a masterclass in building distributed systems with the Model Context Protocol. It is a comprehensive reference implementation that demonstrates how to build a robust, scalable, and fault-tolerant MCP server using a stateful, hybrid-storage architecture.
This project is designed to teach five core concepts:
- 🏗️ Clean Architecture: Master a clean separation of concerns by organizing code into a
types.ts
for data contracts and aserver.ts
for application logic. - ⚙️ Hybrid Storage (Strategy Pattern): Implement a system that runs with zero dependencies locally (in-memory) and seamlessly transitions to a distributed architecture using Redis for production.
- 🔒 Scalability & Zero-Downtime: Build a system that scales horizontally and supports zero-downtime deployments by externalizing state and eliminating the need for "sticky sessions".
- ⚡ Advanced State Management: Learn critical patterns for distributed systems, including storage abstraction (
ISessionStore
), race condition prevention, and just-in-time instance reconstruction. - 🛡️ Resilience & Predictability: Implement a robust error handling strategy using specific, typed errors and a global error boundary to build a server that fails gracefully and predictably.
This stateful, distributed architecture is the ideal choice for complex, high-availability applications:
- Enterprise Applications: Systems that require persistent user sessions and must remain available during deployments or node failures.
- Collaborative Tools: Scenarios where multiple users or agents interact with a shared context that must be centrally managed.
- Multi-Turn Conversational Agents: Complex chatbots or agents that need to remember the entire history of an interaction to provide coherent responses.
- Any system where losing session state or failing unpredictably is unacceptable.
This server is designed to work in two modes: a simple local mode and a scalable production mode.
Run the server instantly on your machine with zero external dependencies.
# Clone the repository
git clone https://github.com/yigitkonur/example-mcp-server-streamable-http
cd example-mcp-server-streamable-http
# Install dependencies
npm install
# Start the server in development mode (uses in-memory storage)
npm run dev
# The server starts on port 1453 with the message:
# ✅ Using In-Memory for single-node state management.
Test the full distributed architecture using the provided Docker Compose setup.
# Make sure Docker is running on your machine
# This single command starts the server and a Redis instance
docker-compose up --build
# The server starts on port 1453 and connects to the Redis container:
# ✅ Using Redis for distributed state management.
# INFO: Redis Client Connected
This project follows a clean architecture with a deliberate separation of concerns.
src/
├── types.ts # Data Contracts: Interfaces, Zod Schemas, Custom Errors
└── server.ts # Application Logic: Storage Impls, Server Factory, HTTP Endpoints
- Storage Abstraction (Strategy Pattern): The core application logic is decoupled from the storage mechanism (
in-memory
vs.Redis
) via anISessionStore
interface defined intypes.ts
. - Stateless Nodes, Stateful System: Individual server nodes hold only a temporary cache of session objects. The authoritative state lives in a central store (Redis), allowing the system as a whole to be stateful and resilient.
- Just-in-Time Reconstruction: Any server node can handle a request for any session ID. If a session is not in a node's local cache, it is reconstructed on-the-fly from the central store. This eliminates the need for sticky sessions.
- Predictable Error Handling: The server uses a multi-layered error strategy. It throws specific, typed errors for known failure modes (like an invalid session) and uses a global Express error handler as a safety net to catch all unexpected issues, ensuring the client always receives a secure, protocol-compliant error response.
┌─────────────────────────────────────────┐
│ Express Server │
│ ┌─────────────────────────────────────┐ │
│ │ Global Error Handler (Safety Net) │ │
│ └─────────────────────────────────────┘ │
│ Rate Limiting | CORS | Health Checks │
├─────────────────────────────────────────┤
│ In-Memory Session Store │
│ (Ephemeral Map<id, Data>) │
├─────────────────────────────────────────┤
│ Per-Session MCP Server Instances │
│ (Cached in an in-memory Map) │
└─────────────────────────────────────────┘
Load Balancer (No Sticky Sessions)
|
+--------------+--------------+
| | |
┌─────────┐ ┌─────────┐ ┌─────────┐
│ Server A│ │ Server B│ │ Server C│
└─────────┘ └─────────┘ └─────────┘
| | |
+--------------+--------------+
|
┌────────────────────────────┐
│ Redis Cluster │
│ (Authoritative Session & │
│ Event Store) │
└────────────────────────────┘
This section highlights the most important code patterns that define this architecture.
The Principle: Code to an interface, not a concrete implementation. This decouples our application logic from the storage technology.
The Implementation (src/types.ts
):
// The contract that any storage backend must adhere to.
export interface ISessionStore {
get(sessionId: string): Promise<SessionData | null>;
set(sessionId: string, data: SessionData): Promise<void>;
// ... and other methods
}
// Application logic in server.ts only ever interacts with this interface.
The Principle: To achieve horizontal scalability without sticky sessions, any server node must be able to handle a request for any active session.
The Implementation (src/server.ts
):
// DRY Implementation: Single helper function eliminates code duplication
async function getOrCreateInstances(
sessionId: string,
): Promise<{ transport: StreamableHTTPServerTransport; server: McpServer }> {
// 1. Check high-performance local cache first
let instances = sessionInstances.get(sessionId);
if (instances) return instances;
// 2. Verify session exists in authoritative persistent store
const sessionData = await sessionStore.get(sessionId);
if (!sessionData) {
throw new SessionNotFoundError('Session does not exist or has expired.', { sessionId });
}
// 3. Reconstruct instances from persistent state
console.log(`Reconstructing instances for session ${sessionId} on this node`);
// ... reconstruction logic ...
return instances;
}
// Used consistently across POST, GET, and DELETE endpoints
const instances = await getOrCreateInstances(sessionId);
The Principle: To prevent race conditions in a distributed system, the session record must be saved to the persistent store before the McpServer
instance is created.
The Implementation (src/server.ts
):
// 1. A new session request arrives. Generate a session ID.
const newSessionId = randomUUID();
// 2. Create the initial session data object.
const sessionData = createNewSessionData();
// 3. CRITICAL: Persist the session data to Redis/memory FIRST.
await sessionStore.set(newSessionId, sessionData);
// 4. NOW it's safe to create the McpServer instance, which may need to read this data.
const server = await createMCPServer(newSessionId);
The Principle: A robust server fails predictably. We use specific error types for known issues and a global safety net for everything else.
The Implementation:
1. Define Custom, Specific Errors (src/types.ts
): We create a hierarchy of error classes to represent distinct failure modes.
// A base class for all our application's errors.
export class CalculatorServerError extends McpError {
/* ... */
}
// A specific error for when a session is not found.
export class SessionNotFoundError extends CalculatorServerError {
/* ... */
}
// A specific error for when a database/Redis operation fails.
export class StorageOperationFailedError extends CalculatorServerError {
/* ... */
}
2. Throw Specific Errors in Logic (src/server.ts
): Instead of returning generic errors, our code throws these specific types.
// Inside an HTTP handler...
const sessionData = await sessionStore.get(sessionId);
if (!sessionData) {
// This is a known, predictable failure. Throw the specific error.
throw new SessionNotFoundError('Session not found or expired', { sessionId });
}
3. Complete Error Boundary Coverage (src/server.ts
): Every endpoint throws specific errors instead of direct HTTP responses, ensuring 100% coverage by the global handler. This prevents any error from bypassing our safety net.
// All endpoints throw errors instead of sending responses directly
if (!sessionId) {
throw new McpError(ErrorCode.InvalidRequest, 'Mcp-Session-Id header is required');
}
// Global Express middleware catches ALL errors
app.use((err: Error, req: Request, res: Response, next: express.NextFunction) => {
// 1. Log the full, detailed error for our internal records.
console.error('[GLOBAL ERROR HANDLER] Unhandled error caught:', err);
// 2. Handle specific error types with proper codes and context
let code = ErrorCode.InternalError;
let message = 'An internal server error occurred.';
let data: unknown = undefined;
if (err instanceof CalculatorServerError) {
code = err.code;
message = err.message;
data = err.context; // Include contextual information for debugging
} else if (err instanceof McpError) {
code = err.code;
message = err.message;
data = err.data;
}
// 3. Always send protocol-compliant JSON-RPC error responses
res.status(500).json({ jsonrpc: '2.0', id: rpcId, error: { code, message, data } });
});
This server implements a comprehensive set of capabilities to demonstrate a production-grade system.
Feature | Description | Key Pattern Demonstrated |
---|---|---|
Hybrid Storage | Switches between in-memory and Redis via USE_REDIS env var. |
Strategy Pattern and environment-based configuration. |
Persistent History | Calculation history is saved as part of the session data. | Stateful Tool Use: Tools modify session state which is then persisted. |
Gold-Standard Error Handling | Complete error boundary coverage with typed errors and comprehensive TSDoc documentation. | Multi-Layered Defense: Custom error hierarchy + global safety net + contextual error data. |
DRY Code Architecture | Single getOrCreateInstances helper eliminates reconstruction logic duplication. |
Maintainability: Critical patterns abstracted into reusable, well-documented functions. |
Health Checks | /health endpoint reports server status, including Redis connectivity. |
Observability: Providing critical system status for monitoring. |
Prometheus Metrics | /metrics endpoint exposes mcp_active_sessions and more. |
Monitoring: Exposing key performance indicators for a metrics scraper. |
Complete Documentation | Every tool, resource, and prompt handler documents exact failure modes with @throws annotations. |
Predictable APIs: Clear contracts for all failure scenarios. |
Verify the server's operational status and view its metrics. The /health
endpoint is aware of the storage mode.
# Check basic health (works in both modes)
curl http://localhost:1453/health
# In Redis mode, a healthy response will include:
# "storageMode": "redis", "redis": "ready"
# Check Prometheus-style metrics
curl http://localhost:1453/metrics
Use curl
to test the full session lifecycle.
# 1. Initialize a session and capture the Mcp-Session-Id header
SESSION_ID=$(curl -i -X POST http://localhost:1453/mcp \
-H "Content-Type: application/json" \
-d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"1.0.0"}}' \
| grep -i Mcp-Session-Id | awk '{print $2}' | tr -d '\r')
echo "Acquired Session ID: $SESSION_ID"
# 2. Use the session ID to call a tool
curl -X POST http://localhost:1453/mcp \
-H "Content-Type: application/json" \
-H "Mcp-Session-Id: $SESSION_ID" \
-d '{"jsonrpc":"2.0","id":2,"method":"tools/call","params":{"name":"calculate","arguments":{"a":100,"b":50,"op":"add"}}}'
Use the official inspector to interactively test the stateful server.
# The inspector will handle the session ID automatically.
npx @modelcontextprotocol/inspector --cli http://localhost:1453/mcp
The server is configured using environment variables.
Variable | Description | Default |
---|---|---|
PORT |
The port for the HTTP server to listen on. | 1453 |
USE_REDIS |
Set to true to enable Redis for distributed state. |
false |
REDIS_URL |
The connection string for the Redis instance. | redis://localhost:6379 |
LOG_LEVEL |
Logging verbosity (debug , info , warn , error ). |
info |
CORS_ORIGIN |
Allowed origin for CORS. Use a specific domain in production. | * |
SAMPLE_TOOL_NAME |
(Educational) Demonstrates dynamic tool registration via environment variables. When set, adds a simple echo tool with the specified name that takes a value parameter and returns test string print: {value} . This pattern shows how MCP servers can be configured at runtime. |
None |
This server is designed for high-availability, horizontally-scaled deployments.
- Containerization: The multi-stage
Dockerfile
creates a lean, secure production image. Thedocker-compose.yml
file is ready for multi-replica scaling (docker-compose up --scale mcp-server=4
). - Load Balancing: Deploy behind any standard load balancer. Sticky sessions are not required due to the "Just-in-Time Reconstruction" architecture.
- Zero-Downtime Updates: Because session state is externalized to Redis, you can perform rolling deployments of new server versions without interrupting or losing active user sessions.