A production-ready Model Context Protocol (MCP) server template built with Rust (stable) and Axum, with integrated observability.
- MCP Protocol: Full support for Tools, Resources, and Prompts via `#[tool]` macros and provider structs
- Async Runtime: Tokio-based async for all I/O, scalable under concurrent load
- Type-safe HTTP Client: Compile-time schema validation via `serde` + `schemars`
- Observability: Prometheus metrics + OpenTelemetry traces out of the box
- Production Ready: Health probes, graceful shutdown, circuit breaker, Kubernetes manifests
- Rust stable (install via rustup)
- Docker (optional, for containerization)
This template uses WeatherAPI.com as an example external API.
- Go to weatherapi.com/signup.aspx
- Create a free account (1M calls/month free tier)
- Copy your API key from the dashboard
```bash
# Set Weather API key
export WEATHER_API_KEY=your_api_key_here

# Start the server
cargo run
```

Access points:
- Health: http://localhost:8181/health
- Metrics: http://localhost:8181/metrics
- MCP: http://localhost:8181/mcp
```bash
# Run tests
cargo test

# Verbose test output
RUST_LOG=debug cargo test -- --nocapture
```

```bash
# One-time: install the coverage tool
cargo install cargo-tarpaulin

# Generate an HTML coverage report
cargo tarpaulin --out html
# Report at: tarpaulin-report.html
```

```bash
# Health check
curl http://localhost:8181/health

# Prometheus metrics
curl http://localhost:8181/metrics

# MCP endpoint
curl -X POST http://localhost:8181/mcp \
  -H "Content-Type: application/json" \
  -d '{"jsonrpc":"2.0","id":1,"method":"tools/list"}'
```

```bash
# Streamable HTTP (recommended)
npx @modelcontextprotocol/inspector --url http://localhost:8181/mcp
```

```bash
cargo run
```

```bash
# Build optimized binary
cargo build --release

# Run
./target/release/rust-mcp-server
```

```bash
docker build -f docker/Dockerfile -t rust-mcp-server:latest .
```

```bash
# Dev mode (default): OTEL disabled, DEBUG logs
docker run -d \
  --name rust-mcp-server \
  -p 8181:8181 \
  -e WEATHER_API_KEY=your_api_key_here \
  rust-mcp-server:latest

# Prod mode: enable OTEL, INFO logs
docker run -d \
  --name rust-mcp-server \
  -p 8181:8181 \
  -e RUN_MODE=production \
  -e WEATHER_API_KEY=your_api_key_here \
  rust-mcp-server:latest
```

```bash
# Check the container is running
docker ps | grep rust-mcp-server

# Test health endpoint
curl -s http://localhost:8181/health/live | jq .

# Test MCP endpoint
curl -s http://localhost:8181/mcp -X POST \
  -H 'Content-Type: application/json' \
  -H 'Accept: application/json, text/event-stream' \
  -d '{"jsonrpc":"2.0","id":1,"method":"initialize","params":{"protocolVersion":"2024-11-05","capabilities":{},"clientInfo":{"name":"test","version":"1.0"}}}'
```

```bash
# Run the smoke test suite
./scripts/smoke_test.sh http://localhost:8181

# Follow container logs
docker logs -f rust-mcp-server

# Stop and remove the container
docker stop rust-mcp-server && docker rm rust-mcp-server
```

```bash
claude mcp add rust-mcp-server --transport streamable-http --url http://localhost:8181/mcp
```

Or add to `~/.claude/settings.json`:
```json
{
  "mcpServers": {
    "rust-mcp-server": {
      "type": "streamable-http",
      "url": "http://localhost:8181/mcp"
    }
  }
}
```

Settings → Tools → AI Assistant → Model Context Protocol (MCP) → Add:
| Field | Value |
|---|---|
| Name | rust-mcp-server |
| Transport | Streamable HTTP |
| URL | http://localhost:8181/mcp |
Tools are functions that AI clients can invoke. Well-designed tools are the foundation of a useful MCP server.
1. Single Responsibility
Each tool should do one thing well. Prefer multiple focused tools over one multipurpose tool.
```rust
// Good: Focused tools
#[tool(description = "Get current weather conditions for a city")]
async fn get_current_weather(&self, city: String) -> Result<CallToolResult, McpError> { }

#[tool(description = "Get weather forecast for upcoming days")]
async fn get_forecast(&self, city: String, days: u32) -> Result<CallToolResult, McpError> { }

// Bad: One multipurpose tool
#[tool(description = "Get weather data")]
async fn get_weather(&self, city: String, kind: String, days: u32, include_alerts: bool) -> Result<CallToolResult, McpError> { }
```

2. Descriptive Names and Documentation
Tool names should be verbs. Descriptions should explain what the tool does, not how.
```rust
#[tool(description = "Search for cities and locations matching a query. Returns matching city names, regions, and countries.")]
async fn search_locations(
    &self,
    #[tool(param)]
    request: SearchLocationsRequest,
) -> Result<CallToolResult, McpError> { }

#[derive(Deserialize, Serialize, JsonSchema)]
pub struct SearchLocationsRequest {
    #[schemars(description = "City name, postal code, or coordinates (lat,lon)")]
    pub query: String,
}
```

3. Clear Parameter Descriptions
Every parameter needs a description via schemars. Include valid values, ranges, and defaults.
```rust
#[derive(Deserialize, Serialize, JsonSchema)]
pub struct GetForecastRequest {
    #[schemars(description = "City name (e.g., 'London', 'New York')")]
    pub city: String,
    #[schemars(description = "Number of forecast days (1-7, default: 3)")]
    #[serde(default = "default_days")]
    pub days: u32,
}

fn default_days() -> u32 {
    3
}
```

4. Structured Return Types
Return structured data via CallToolResult. AI clients understand structured data better.
```rust
// Good: Structured response
#[derive(Serialize)]
pub struct WeatherResult {
    pub city: String,
    pub country: String,
    pub temperature_celsius: f64,
    pub temperature_fahrenheit: f64,
    pub humidity_percent: u32,
    pub condition: String,
    pub icon_url: String,
}

// Return as JSON text content
let json = serde_json::to_string_pretty(&result)?;
Ok(CallToolResult::success(vec![TextContent::new(json).into()]))
```

5. Handle Errors Gracefully
Return error information in the response rather than panicking.
```rust
#[tool(description = "Get weather for a city")]
async fn get_current_weather(
    &self,
    #[tool(param)] request: GetWeatherRequest,
) -> Result<CallToolResult, McpError> {
    if request.city.trim().is_empty() {
        return Ok(CallToolResult::error(vec![
            TextContent::new("City name is required").into(),
        ]));
    }
    match self.client.get_current_weather(&request.city).await {
        Ok(data) => {
            let json = serde_json::to_string_pretty(&data).unwrap_or_default();
            Ok(CallToolResult::success(vec![TextContent::new(json).into()]))
        }
        Err(e) => Ok(CallToolResult::error(vec![
            TextContent::new(format!("Failed to fetch weather: {e}")).into(),
        ])),
    }
}
```

```rust
#[derive(Clone)]
pub struct MyTools {
    client: Arc<MyApiClient>,
    metrics: McpMetrics,
    tool_router: ToolRouter<Self>,
}

#[tool_router]
impl MyTools {
    #[tool(description = "Clear, concise description of what this tool does")]
    async fn my_tool(
        &self,
        #[tool(param)] request: MyToolRequest,
    ) -> Result<CallToolResult, McpError> {
        // 1. Validate inputs
        if request.required_param.trim().is_empty() {
            return Ok(CallToolResult::error(vec![
                TextContent::new("required_param is required").into(),
            ]));
        }
        // 2. Call external service with metrics
        let result = self.metrics.timed_async("my_tool", async {
            self.client.get_data(&request.required_param).await
        }).await;
        // 3. Transform to result
        match result {
            Ok(data) => Ok(CallToolResult::success(vec![TextContent::new(data).into()])),
            Err(e) => Ok(CallToolResult::error(vec![TextContent::new(e.to_string()).into()])),
        }
    }
}
```

Resources provide read-only data that AI clients can access.
```rust
#[derive(Clone)]
pub struct ResourceProvider {
    server_port: u16,
}

impl ResourceProvider {
    pub fn list_resources(&self) -> Vec<RawResource> {
        vec![
            RawResource::new("config://api-endpoints", "API Endpoints")
                .with_description("Available API endpoints and their documentation")
                .with_mime_type("application/json"),
        ]
    }

    pub fn read_resource(&self, uri: &str) -> Option<ResourceContents> {
        match uri {
            "config://api-endpoints" => Some(self.api_endpoints_resource()),
            _ => None,
        }
    }

    fn api_endpoints_resource(&self) -> ResourceContents {
        let json = serde_json::json!({
            "endpoints": {
                "weather": {
                    "description": "Weather data API",
                    "docs": "https://weatherapi.com/docs"
                }
            }
        });
        ResourceContents::text(
            serde_json::to_string_pretty(&json).unwrap_or_default(),
            "config://api-endpoints",
        )
    }
}
```

| Scheme | Purpose | Example |
|---|---|---|
| `config://` | Configuration data | `config://server-info` |
| `docs://` | Documentation | `docs://tools` |
| `data://` | Static data | `data://countries` |
| `file://` | File contents | `file://templates/email` |
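A provider can route `read_resource` calls on the scheme prefix of the URI. A minimal, dependency-free sketch of scheme extraction (a hypothetical helper, not the template's actual routing code):

```rust
// Extract the scheme from a resource URI like "config://server-info".
// Returns None when the string has no "://" separator.
fn uri_scheme(uri: &str) -> Option<&str> {
    uri.split_once("://").map(|(scheme, _)| scheme)
}

fn main() {
    assert_eq!(uri_scheme("config://server-info"), Some("config"));
    assert_eq!(uri_scheme("docs://tools"), Some("docs"));
    // Malformed URIs fall through to None, which maps naturally to
    // "resource not found" in read_resource.
    assert_eq!(uri_scheme("not-a-uri"), None);
    println!("ok");
}
```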
Prompts are reusable templates that guide AI behavior.
```rust
#[derive(Clone)]
pub struct PromptProvider;

impl PromptProvider {
    pub fn list_prompts(&self) -> Vec<Prompt> {
        vec![
            Prompt::new(
                "code_review",
                Some("Review code for quality, security, and best practices"),
                Some(vec![
                    PromptArgument {
                        name: "language".into(),
                        description: Some("Programming language".into()),
                        required: Some(true),
                        ..Default::default()
                    },
                    PromptArgument {
                        name: "code".into(),
                        description: Some("Code to review".into()),
                        required: Some(true),
                        ..Default::default()
                    },
                ]),
            ),
        ]
    }

    pub fn get_prompt(
        &self,
        name: &str,
        args: &HashMap<String, String>,
    ) -> Option<GetPromptResult> {
        match name {
            "code_review" => {
                let language = args.get("language").map(|s| s.as_str()).unwrap_or("unknown");
                let code = args.get("code").map(|s| s.as_str()).unwrap_or("");
                let content = format!(
                    "Review this {language} code for:\n\n\
                     1. **Correctness**: Logic errors, edge cases\n\
                     2. **Security**: Injection, auth issues, data exposure\n\
                     3. **Performance**: Inefficiencies, N+1 queries\n\
                     4. **Maintainability**: Naming, structure, complexity\n\n\
                     Code:\n```{language}\n{code}\n```\n\n\
                     Provide specific, actionable feedback."
                );
                Some(GetPromptResult {
                    description: Some("Code review".into()),
                    messages: vec![PromptMessage {
                        role: PromptMessageRole::User,
                        content: PromptMessageContent::Text { text: content },
                    }],
                })
            }
            _ => None,
        }
    }
}
```

MCP tools, resources, and prompts form a contract with AI clients. Breaking changes confuse models and break workflows. Follow these guidelines for backward-compatible evolution.
These changes are backward-compatible and can be deployed without coordination:
```rust
// Original request
#[derive(Deserialize, Serialize, JsonSchema)]
pub struct GetWeatherRequest {
    pub city: String,
}

// Safe evolution: add optional parameter with default
#[derive(Deserialize, Serialize, JsonSchema)]
pub struct GetWeatherRequest {
    pub city: String,
    #[serde(default)]
    pub unit: TemperatureUnit, // NEW - defaults via Default impl (Celsius)
}

// Safe evolution: add new field to response
#[derive(Serialize)]
pub struct WeatherResult {
    pub city: String,
    pub temperature: f64,
    pub condition: String,
    pub icon_url: String, // NEW - clients ignore unknown fields
}
```

These changes break existing clients and require explicit versioning:
| Change | Why It Breaks | Solution |
|---|---|---|
| Rename tool | AI learned old name | Create new tool, deprecate old |
| Remove parameter | Calls with old param fail | Keep accepting, ignore if unused |
| Change parameter type | Type mismatch | New tool version |
| Remove response field | Client expects field | Keep field, populate with default |
| Change field semantics | Different interpretation | New field with new name |
For breaking changes, create versioned tools rather than modifying existing ones:
```rust
// Original (keep for backward compatibility)
#[tool(description = "[Deprecated: use get_weather_v2] Get weather for a city")]
async fn get_current_weather(&self, request: GetWeatherRequest) -> Result<CallToolResult, McpError> {
    self.get_weather_v2(GetWeatherV2Request {
        city: request.city,
        unit: "celsius".into(),
    }).await
}

// New version with breaking changes
#[tool(description = "Get weather for a city with unit selection")]
async fn get_weather_v2(&self, request: GetWeatherV2Request) -> Result<CallToolResult, McpError> {
    // New implementation
}
```

- **Mark deprecated** - Add a `[Deprecated: ...]` prefix to the description
- **Log usage** - Track calls to deprecated tools for migration planning
- **Set removal date** - Communicate the timeline to consumers
- **Remove after migration** - Only after confirming no active usage
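The "log usage" step can be as simple as a counter per deprecated tool. A dependency-free sketch of the idea (in the template itself, a labeled counter on `McpMetrics` would be the natural home; this standalone tracker is hypothetical):

```rust
use std::collections::HashMap;
use std::sync::atomic::{AtomicU64, Ordering};

// Tracks calls to deprecated tools to support migration planning:
// remove a tool only once its counter stops moving.
struct DeprecationTracker {
    counts: HashMap<&'static str, AtomicU64>,
}

impl DeprecationTracker {
    fn new(deprecated_tools: &[&'static str]) -> Self {
        let counts = deprecated_tools
            .iter()
            .map(|name| (*name, AtomicU64::new(0)))
            .collect();
        Self { counts }
    }

    // Record one call; returns the new total, or 0 for untracked tools.
    fn record(&self, tool: &str) -> u64 {
        match self.counts.get(tool) {
            Some(c) => c.fetch_add(1, Ordering::Relaxed) + 1,
            None => 0,
        }
    }
}

fn main() {
    let tracker = DeprecationTracker::new(&["get_current_weather"]);
    tracker.record("get_current_weather");
    assert_eq!(tracker.record("get_current_weather"), 2);
    // Non-deprecated tools are not tracked.
    assert_eq!(tracker.record("get_weather_v2"), 0);
    println!("ok");
}
```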
Document your tool schemas for LLM consumption. AI clients use descriptions for tool selection:
```rust
#[tool(description = "Get current weather conditions for a city.\n\n\
    Returns: temperature, humidity, wind speed, and conditions.\n\
    Supports: city names, airport codes, coordinates (lat,lon).\n\
    Rate limit: 100 calls/minute per API key.")]
```

Test that old clients work with the new server:
```rust
#[tokio::test]
async fn legacy_call_without_new_parameter_uses_default() {
    // Simulate an old client call without the new 'unit' parameter
    let request = serde_json::json!({"city": "London"});
    let parsed: GetWeatherRequest = serde_json::from_value(request).unwrap();
    // Should use the default (Celsius), not fail
    assert_eq!(parsed.unit, TemperatureUnit::Celsius);
}
```

```rust
pub struct WeatherClient {
    client: ClientWithMiddleware,
    base_url: String,
    api_key: String,
    circuit_breaker: Option<CircuitBreaker>,
}

impl WeatherClient {
    pub fn new(config: &WeatherConfig) -> Result<Self> {
        let retry_policy = ExponentialBackoff::builder()
            .retry_bounds(config.retry_min_wait(), config.retry_max_wait())
            .build_with_max_retries(config.retry_max_attempts);
        let client = ClientBuilder::new(
            reqwest::Client::builder()
                .timeout(config.timeout())
                .connect_timeout(Duration::from_secs(5))
                .build()
                .context("failed to create HTTP client")?,
        )
        .with(RetryTransientMiddleware::new_with_policy(retry_policy))
        .build();
        Ok(Self {
            client,
            base_url: config.url.clone(),
            api_key: config.key.clone(),
            circuit_breaker: None, // configure via config.circuit_breaker
        })
    }
}
```

```rust
#[derive(Debug, Deserialize)]
pub struct WeatherApiResponse {
    pub location: Location,
    pub current: CurrentWeather,
}

#[derive(Debug, Deserialize)]
pub struct Location {
    pub name: String,
    pub region: String,
    pub country: String,
}
```

```toml
# config/default.toml
[weather]
url = "https://api.weatherapi.com"
key = ""               # set via WEATHER_API_KEY env var
timeout_secs = 10
retry_max_attempts = 3
retry_min_wait_ms = 500
retry_max_wait_ms = 5000
```

```text
src/
├── main.rs              # Entry point
├── lib.rs               # Library exports
├── app.rs               # Application setup, router
├── config.rs            # Configuration structs
├── tool/                # MCP Tools (callable functions)
│   ├── mod.rs
│   ├── weather.rs       # Weather tools implementation
│   └── result.rs        # Tool result helpers
├── client/              # HTTP clients (external APIs)
│   ├── mod.rs
│   ├── weather_client.rs
│   └── model.rs         # Response DTOs
├── resource/            # MCP Resources (read-only data)
│   ├── mod.rs
│   └── server_info.rs
├── prompt/              # MCP Prompts (templates)
│   ├── mod.rs
│   └── code_assist.rs
├── observability/       # Metrics and tracing
│   ├── mod.rs
│   ├── metrics.rs       # Prometheus metrics
│   ├── tracing.rs       # OpenTelemetry setup
│   └── correlation.rs   # Correlation ID middleware
├── common/              # Shared types
│   ├── mod.rs
│   ├── circuit_breaker.rs
│   ├── result.rs
│   └── trace_id.rs
├── health/              # Health checks
│   ├── mod.rs
│   └── checks.rs
└── lifecycle/           # Application lifecycle
    ├── mod.rs
    └── shutdown.rs      # Graceful shutdown
```
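The `common/circuit_breaker.rs` module pairs with the `circuit_breaker: Option<CircuitBreaker>` field on `WeatherClient`. A rough sketch of the state machine such a module typically holds (names, thresholds, and the manual half-open handling here are illustrative, not the template's actual API):

```rust
use std::time::{Duration, Instant};

// Minimal circuit breaker: closed (requests flow) until N consecutive
// failures, then open (requests rejected) until a cooldown elapses,
// after which one trial request is allowed (half-open behavior).
struct CircuitBreaker {
    failure_threshold: u32,
    cooldown: Duration,
    consecutive_failures: u32,
    opened_at: Option<Instant>,
}

impl CircuitBreaker {
    fn new(failure_threshold: u32, cooldown: Duration) -> Self {
        Self { failure_threshold, cooldown, consecutive_failures: 0, opened_at: None }
    }

    // May a request be issued right now?
    fn allow(&self) -> bool {
        match self.opened_at {
            None => true,
            Some(t) => t.elapsed() >= self.cooldown, // half-open trial
        }
    }

    fn record_success(&mut self) {
        self.consecutive_failures = 0;
        self.opened_at = None; // close the breaker again
    }

    fn record_failure(&mut self) {
        self.consecutive_failures += 1;
        if self.consecutive_failures >= self.failure_threshold {
            self.opened_at = Some(Instant::now()); // trip open
        }
    }
}

fn main() {
    let mut cb = CircuitBreaker::new(3, Duration::from_secs(30));
    assert!(cb.allow());
    cb.record_failure();
    cb.record_failure();
    assert!(cb.allow()); // still under threshold
    cb.record_failure(); // third failure trips the breaker
    assert!(!cb.allow());
    cb.record_success(); // e.g. a successful half-open trial
    assert!(cb.allow());
    println!("ok");
}
```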
| Variable | Description | Default |
|---|---|---|
| `WEATHER_API_KEY` | WeatherAPI.com API key | (required, empty default) |
| `APP_WEATHER_URL` | Weather API base URL | `https://api.weatherapi.com` |
| `RUN_MODE` | Config profile: `default`, `development`, `production` | `default` |
| `CONFIG_DIR` | Directory containing `*.toml` config files | `config` |
| `RUST_LOG` | Log directive (overrides `observability.log_level`) | (unset) |
| `APP_SERVER_PORT` | Override server port | `8181` |
| Profile | Logging | Traces | Format |
|---|---|---|---|
| `development` | DEBUG, pretty | Disabled | Pretty console |
| `production` | INFO, JSON | Enabled (OTLP) | Structured JSON |
The server can gate access via the `X-API-Key` header. This is disabled by default, which is appropriate for local development or trusted-network deployments. Enable it whenever the server is reachable from untrusted clients.
```toml
# config/production.toml (or any profile)
[auth]
enabled = true
allowed_keys = [
    "prod-client-alice",
    "prod-client-bob",
]
```

Behavior when `enabled = true`:
- Every request must carry `X-API-Key: <one-of-the-allowed-keys>`.
- Missing or non-matching keys get `401 Unauthorized` with `WWW-Authenticate: APIKey`.
- Keys are compared in constant time per entry (via `subtle`) so valid key prefixes cannot be discovered through timing.

Behavior when `enabled = false` (default): the middleware is a no-op; all requests pass through. `allowed_keys` is ignored in this mode.
Important: this is entirely separate from the `weather.key` that the server uses for its outbound calls to WeatherAPI. `weather.key` is the server's own credential for the upstream service; `auth.allowed_keys` is the list of client credentials accepted on inbound requests.
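The constant-time comparison can be sketched without the `subtle` crate. Production code should still use `subtle`, but the underlying idea is to examine every byte regardless of where the first mismatch occurs:

```rust
// Constant-time byte comparison: the loop always runs over the full
// length, so timing reveals nothing about how long a matching prefix
// is. (Length mismatch short-circuits; key length is not a secret.)
fn constant_time_eq(a: &[u8], b: &[u8]) -> bool {
    if a.len() != b.len() {
        return false;
    }
    let mut diff: u8 = 0;
    for (x, y) in a.iter().zip(b.iter()) {
        diff |= x ^ y; // accumulate differences without branching
    }
    diff == 0
}

fn main() {
    assert!(constant_time_eq(b"prod-client-alice", b"prod-client-alice"));
    assert!(!constant_time_eq(b"prod-client-alice", b"prod-client-bobby"));
    assert!(!constant_time_eq(b"short", b"longer-key"));
    println!("ok");
}
```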
```bash
# Create secrets
kubectl create secret generic rust-mcp-server-secrets \
  --from-literal=weather-api-key=your_key

# Deploy
kubectl apply -f k8s/base/

# Or production with HPA
kubectl apply -f k8s/overlays/prod/
```

| Endpoint | Purpose |
|---|---|
| `/health/live` | Liveness probe (is the process alive?) |
| `/health/ready` | Readiness probe (can it serve traffic?) |
| `/health/started` | Startup probe (is initial setup complete?) |
| `/metrics` | Prometheus metrics |
This template includes built-in metrics for tool invocations via `McpMetrics`:
| Metric | Type | Labels | Purpose |
|---|---|---|---|
| `mcp_tool_calls_total` | Counter | `tool`, `status` | Tool invocation count (success/error) |
| `mcp_tool_duration_seconds` | Histogram | `tool` | Latency distribution (p50, p95, p99) |
| `mcp_tool_payload_bytes_total` | Counter | `tool`, `direction` | Request/response size |
```rust
let result = self.metrics.timed_async("my_tool", async {
    // Tool implementation - automatically tracked
    self.client.get_data(&param).await
}).await;
```

```promql
# Error rate by tool (last 5 minutes)
sum(rate(mcp_tool_calls_total{status="error"}[5m])) by (tool)
  / sum(rate(mcp_tool_calls_total[5m])) by (tool)

# p95 latency by tool
histogram_quantile(0.95, sum(rate(mcp_tool_duration_seconds_bucket[5m])) by (le, tool))

# Tool usage ranking
topk(10, sum(rate(mcp_tool_calls_total[1h])) by (tool))

# Payload throughput (proxy for token cost)
sum(rate(mcp_tool_payload_bytes_total[5m])) by (tool, direction)
```
MCP servers don't see actual token counts—that's computed by the LLM. However, you can estimate costs using payload size as a proxy:
Why payload size correlates with tokens:
- Tool arguments are serialized to JSON → tokens for the LLM to parse
- Tool responses are serialized to JSON → tokens in the context window
- Larger payloads = more tokens = higher cost
Tracking payload size:

```rust
// Automatic with McpMetrics::timed_async()
let result = self.metrics.timed_async("my_tool", async {
    let result = self.client.get_data(&param).await;
    result // Response size tracked automatically
}).await;
```

Grafana dashboard query for cost estimation:

```promql
# Estimate daily token usage (rough: 1 token ≈ 4 bytes)
sum(increase(mcp_tool_payload_bytes_total[24h])) / 4
```
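The same 4-bytes-per-token heuristic in code form, for estimating cost from payload counters in application code (a hypothetical helper, not part of the template; real token counts vary by model and content):

```rust
// Rough token estimate from payload bytes, using the same
// 1 token ≈ 4 bytes heuristic as the Prometheus query above.
fn estimate_tokens(payload_bytes: u64) -> u64 {
    payload_bytes / 4
}

fn main() {
    // A 2 KiB tool response is roughly 512 tokens.
    assert_eq!(estimate_tokens(2048), 512);
    println!("ok");
}
```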
Traces are exported via OpenTelemetry. Each tool call creates a span:
```text
MCP Request
└── tool:get_current_weather
    ├── http:weatherapi.com/v1/current.json
    └── transform:response
```

Configuration:

```toml
# config/production.toml
[observability]
otel_enabled = true
otel_endpoint = "http://tempo:4317"
```

1. Tool not being selected by AI:
- Check the tool description is clear and specific
- Verify parameter descriptions explain expected values
- Look at `mcp_tool_calls_total` to confirm the tool is callable
2. Tool errors:
```promql
# Find failing tools
mcp_tool_calls_total{status="error"}
```

Check logs for error details:

```bash
grep "tool error" /var/log/rust-mcp-server.log
```

3. Slow tool performance:
```promql
# Find slow tools (p95 > 1s)
histogram_quantile(0.95, rate(mcp_tool_duration_seconds_bucket[5m])) > 1
```
4. High token usage:
```promql
# Tools with largest payloads
topk(5, sum(rate(mcp_tool_payload_bytes_total[1h])) by (tool))
```
Consider:
- Pagination for large result sets
- Summary responses with detail-on-demand tools
- Caching for repeated queries
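The pagination option can be as simple as slicing plus a has-more flag carried in the tool's request and response. A dependency-free sketch (hypothetical helper; the template does not prescribe a pagination scheme):

```rust
// Return one page of results plus whether more pages remain.
// `index` is the zero-based page number requested by the client.
fn page<T>(items: &[T], index: usize, per_page: usize) -> (&[T], bool) {
    let start = index.saturating_mul(per_page).min(items.len());
    let end = (start + per_page).min(items.len());
    (&items[start..end], end < items.len())
}

fn main() {
    let data: Vec<u32> = (0..10).collect();
    let (first, more) = page(&data, 0, 4);
    assert_eq!(first, &[0, 1, 2, 3]);
    assert!(more);
    let (last, more) = page(&data, 2, 4);
    assert_eq!(last, &[8, 9]);
    assert!(!more); // the AI client can stop requesting pages here
    println!("ok");
}
```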
Example Prometheus alerting rules for MCP servers:
```yaml
groups:
  - name: mcp-server
    rules:
      - alert: McpToolHighErrorRate
        expr: |
          sum(rate(mcp_tool_calls_total{status="error"}[5m])) by (tool)
            / sum(rate(mcp_tool_calls_total[5m])) by (tool) > 0.1
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Tool {{ $labels.tool }} has >10% error rate"
      - alert: McpToolHighLatency
        expr: |
          histogram_quantile(0.95, sum(rate(mcp_tool_duration_seconds_bucket[5m])) by (le, tool)) > 5
        for: 5m
        labels:
          severity: warning
        annotations:
          summary: "Tool {{ $labels.tool }} p95 latency >5s"
      - alert: McpServerUnhealthy
        expr: up{job="mcp-server"} == 0
        for: 1m
        labels:
          severity: critical
        annotations:
          summary: "MCP server is down"
```