A production-ready Cookiecutter template for building MCP servers with LangGraph's Functional API. Features comprehensive authentication (JWT), fine-grained authorization (OpenFGA), secrets management (Infisical), and OpenTelemetry-based observability.
An opinionated, production-grade foundation for your MCP server projects.
# Generate your own MCP server project
pip install cookiecutter
cookiecutter gh:vishnu2kmohan/mcp_server_langgraph
# Answer a few questions and get a fully configured project!
See TEMPLATE_USAGE.md for detailed instructions.
- Multi-LLM Support (LiteLLM): 100+ LLM providers - Anthropic, OpenAI, Google, Azure, AWS Bedrock, Ollama
- Open-Source Models: Llama 3.1, Qwen 2.5, Mistral, DeepSeek, and more via Ollama
- LangGraph Functional API: Stateful agent with conditional routing and checkpointing
- MCP Server: Standard protocol for exposing AI agents as tools (stdio, StreamableHTTP, SSE)
- Authentication: JWT-based authentication with token validation
- Fine-Grained Authorization: OpenFGA (Zanzibar-style) relationship-based access control
- Secrets Management: Infisical integration for secure secret storage and retrieval
- Dual Observability: OpenTelemetry + LangSmith for comprehensive monitoring
- OpenTelemetry: Distributed tracing with Jaeger, metrics with Prometheus
- LangSmith: LLM-specific tracing, prompt engineering, evaluations
- Structured Logging: JSON logging with trace context correlation
- Full Observability Stack: Docker Compose setup with OpenFGA, Jaeger, Prometheus, and Grafana
- LangGraph Platform: Deploy to managed LangGraph Cloud with one command
- Automatic Fallback: Resilient multi-model fallback for high availability
- Full Documentation - Complete guides, API reference, and tutorials
- API Documentation - Interactive OpenAPI/Swagger UI (when running locally)
- Deployment Guide - Mintlify documentation deployment instructions
┌─────────────────┐
│   MCP Client    │
│ (Claude Desktop │
│    or other)    │
└────────┬────────┘
         │
         ▼
┌─────────────────────────────────┐
│   MCP Server (mcp_server.py)    │
│  ┌───────────────────────────┐  │
│  │ Auth Middleware           │  │
│  │ - JWT Verification        │  │
│  │ - RBAC Authorization      │  │
│  └───────────────────────────┘  │
│  ┌───────────────────────────┐  │
│  │ LangGraph Agent           │  │
│  │ - Routing                 │  │
│  │ - Tool Usage              │  │
│  │ - Response Generation     │  │
│  └───────────────────────────┘  │
└────────┬────────────────────────┘
         │
         ▼
┌─────────────────────────────────┐
│      Observability (OTEL)       │
│  ┌──────────┐   ┌────────────┐  │
│  │  Traces  │   │  Metrics   │  │
│  │ (Jaeger) │   │(Prometheus)│  │
│  └─────┬────┘   └─────┬──────┘  │
│        └───────┬──────┘         │
│                ▼                │
│          ┌──────────┐           │
│          │ Grafana  │           │
│          └──────────┘           │
└─────────────────────────────────┘
Get the complete stack running in 2 minutes:
# Quick start script handles everything
./scripts/docker-compose-quickstart.sh
This starts:
- Agent API: http://localhost:8000 (MCP agent)
- OpenFGA: http://localhost:8080 (authorization)
- OpenFGA Playground: http://localhost:3001
- Jaeger UI: http://localhost:16686 (distributed tracing)
- Prometheus: http://localhost:9090 (metrics)
- Grafana: http://localhost:3000 (visualization, admin/admin)
- PostgreSQL: localhost:5432 (OpenFGA storage)
Then set up OpenFGA:
python scripts/setup_openfga.py
# Add OPENFGA_STORE_ID and OPENFGA_MODEL_ID to .env
docker-compose restart agent
Test the agent:
curl http://localhost:8000/health
See Docker Compose documentation for details.
- Install dependencies:
pip install -r requirements.txt
- Start infrastructure (without agent):
# Start only supporting services
docker-compose up -d openfga postgres otel-collector jaeger prometheus grafana
- Configure environment:
cp .env.example .env
# Edit .env with your API keys:
# - GOOGLE_API_KEY (get from https://aistudio.google.com/apikey)
# - ANTHROPIC_API_KEY or OPENAI_API_KEY (optional)
- Set up OpenFGA:
python scripts/setup_openfga.py
# Save OPENFGA_STORE_ID and OPENFGA_MODEL_ID to .env
- Run the agent locally:
python mcp_server_streamable.py
- Test:
# Test with example client
python examples/example_client.py
# Or curl
curl http://localhost:8000/health
# Or run the stdio server and example client
python mcp_server.py
python example_client.py
Add to your MCP client config (e.g., Claude Desktop):
{
  "mcpServers": {
    "langgraph-agent": {
      "command": "python",
      "args": ["/path/to/mcp_server_langgraph/mcp_server.py"]
    }
  }
}
from auth import AuthMiddleware
from config import settings

auth = AuthMiddleware(secret_key=settings.jwt_secret_key)

# Create a token (expires in one hour)
token = auth.create_token("alice", expires_in=3600)

# Authenticate a user
result = await auth.authenticate("alice")
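For debugging, you can inspect a token's claims locally with PyJWT; a minimal sketch assuming standard JWT encoding (the claim names shown are illustrative):

import jwt  # PyJWT

# Decode WITHOUT signature verification -- debugging only;
# always verify signatures in production
claims = jwt.decode(token, options={"verify_signature": False})
print(claims)  # e.g. {"sub": "alice", "exp": 1730000000}; claim names may vary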
Uses relationship-based access control (Google Zanzibar model):
from config import settings
from openfga_client import OpenFGAClient

client = OpenFGAClient(
    api_url=settings.openfga_api_url,
    store_id=settings.openfga_store_id,
    model_id=settings.openfga_model_id,
)

# Check permission
allowed = await client.check_permission(
    user="user:alice",
    relation="executor",
    object="tool:chat",
)

# Grant permission
await client.write_tuples([
    {"user": "user:alice", "relation": "executor", "object": "tool:chat"}
])

# List accessible resources
resources = await client.list_objects(
    user="user:alice",
    relation="executor",
    object_type="tool",
)
- alice: Premium user, member and admin of organization:acme
- bob: Standard user, member of organization:acme
- admin: Admin user with elevated privileges
See auth.py:30-50 for user definitions.
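The demo relationships above could be written as OpenFGA tuples; a sketch reusing the client from the previous example (the relation names are assumptions and may differ from the project's actual model):

# Relation names ("admin", "member") are illustrative assumptions
await client.write_tuples([
    {"user": "user:alice", "relation": "admin", "object": "organization:acme"},
    {"user": "user:alice", "relation": "member", "object": "organization:acme"},
    {"user": "user:bob", "relation": "member", "object": "organization:acme"},
])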
This project supports dual observability: OpenTelemetry for infrastructure metrics and LangSmith for LLM-specific tracing.
LangSmith provides comprehensive LLM and agent observability:
Setup:
# Add to .env
LANGSMITH_API_KEY=your-key-from-smith.langchain.com
LANGSMITH_TRACING=true
LANGSMITH_PROJECT=mcp-server-langgraph
Features:
- Automatic Tracing: All LLM calls and agent steps traced
- Prompt Engineering: Iterate on prompts with production data
- Evaluations: Compare model performance on datasets
- User Feedback: Collect and analyze user ratings
- Cost Tracking: Monitor LLM API costs per user/session
- Debugging: Root cause analysis with full context
View traces: https://smith.langchain.com/
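With these variables set, LLM and agent calls made through LangChain/LangGraph are traced automatically. To trace your own helper functions as well, the LangSmith SDK's traceable decorator can be applied; a minimal sketch (the function here is hypothetical):

from langsmith import traceable

@traceable  # runs of this function appear as spans in your LangSmith project
def summarize(text: str) -> str:
    # ... call your model or tool here ...
    return text[:100]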
See LANGSMITH_INTEGRATION.md for complete LangSmith guide.
Every request is traced end-to-end with OpenTelemetry:
from observability import tracer
with tracer.start_as_current_span("my_operation") as span:
    span.set_attribute("custom.attribute", "value")
    # Your code here
View traces in Jaeger: http://localhost:16686
Standard metrics are automatically collected:
- agent.tool.calls: Tool invocation counter
- agent.calls.successful: Successful operation counter
- agent.calls.failed: Failed operation counter
- auth.failures: Authentication failure counter
- authz.failures: Authorization failure counter
- agent.response.duration: Response time histogram
View metrics in Prometheus: http://localhost:9090
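To record custom metrics alongside the built-in ones, you can use the OpenTelemetry metrics API directly; a minimal sketch (the meter name and attribute keys are illustrative assumptions):

from opentelemetry import metrics

meter = metrics.get_meter("mcp-server-langgraph")

# A counter for a custom event, exported through the same OTLP pipeline
custom_counter = meter.create_counter(
    "agent.custom.events",
    description="Count of custom agent events",
)

custom_counter.add(1, {"tool": "chat", "outcome": "success"})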
Structured logging with trace context:
from observability import logger
logger.info("Event occurred", extra={
    "user_id": "user_123",
    "custom_field": "value"
})
Logs include trace_id and span_id for correlation with traces.
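Correlation works because log records emitted inside an active span pick up the current trace context; for example:

from observability import logger, tracer

with tracer.start_as_current_span("handle_request"):
    # Emitted inside an active span, so the JSON record carries the
    # current trace_id and span_id for correlation in Jaeger
    logger.info("Processing request", extra={"user_id": "user_123"})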
The agent uses the functional API with:
- State Management: TypedDict-based state with message history
- Conditional Routing: Dynamic routing based on message content
- Tool Integration: Extensible tool system (extend in agent.py)
- Checkpointing: Conversation persistence with MemorySaver
Add tools in agent.py:
def custom_tool(state: AgentState) -> AgentState:
    # Your tool logic
    return state

workflow.add_node("custom_tool", custom_tool)
workflow.add_edge("router", "custom_tool")
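The conditional routing mentioned above can be sketched the same way; the routing function and node names here are illustrative assumptions:

def route(state: AgentState) -> str:
    # Inspect the latest message to choose the next node
    last_message = state["messages"][-1].content
    return "custom_tool" if "weather" in last_message.lower() else "respond"

workflow.add_conditional_edges("router", route)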
All settings are configurable via environment variables, Infisical, or a .env file:
Variable | Description | Default
---|---|---
SERVICE_NAME | Service identifier | mcp-server-langgraph
OTLP_ENDPOINT | OpenTelemetry collector | http://localhost:4317
JWT_SECRET_KEY | Secret for JWT signing | (loaded from Infisical)
ANTHROPIC_API_KEY | Anthropic API key | (loaded from Infisical)
MODEL_NAME | Claude model to use | claude-3-5-sonnet-20241022
LOG_LEVEL | Logging level | INFO
OPENFGA_API_URL | OpenFGA server URL | http://localhost:8080
OPENFGA_STORE_ID | OpenFGA store ID | (from setup)
OPENFGA_MODEL_ID | OpenFGA model ID | (from setup)
INFISICAL_CLIENT_ID | Infisical auth client ID | (optional)
INFISICAL_CLIENT_SECRET | Infisical auth secret | (optional)
INFISICAL_PROJECT_ID | Infisical project ID | (optional)
See config.py for all options.
Settings are resolved in the following priority order:
1. Infisical (if configured)
2. Environment variables (fallback)
3. Default values (last resort)
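As an illustration of this precedence, a minimal sketch (fetch_from_infisical is a hypothetical helper standing in for the real Infisical lookup):

import os
from typing import Optional

def fetch_from_infisical(name: str) -> Optional[str]:
    # Hypothetical stand-in: return the secret from Infisical,
    # or None when Infisical is not configured
    return None

def get_setting(name: str, default: Optional[str] = None) -> Optional[str]:
    value = fetch_from_infisical(name)    # 1. Infisical (if configured)
    if value is not None:
        return value
    return os.environ.get(name, default)  # 2. env var, 3. default

jwt_secret = get_setting("JWT_SECRET_KEY")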
Access Grafana at http://localhost:3000 (admin/admin) and create dashboards using:
- Prometheus datasource: Metrics visualization
- Jaeger datasource: Trace exploration
Example queries:
- Request rate: rate(agent_tool_calls_total[5m])
- Error rate: rate(agent_calls_failed_total[5m])
- P95 latency: histogram_quantile(0.95, rate(agent_response_duration_bucket[5m]))
Production Checklist:
- Store JWT secret in Infisical
- Use production Infisical project with proper access controls
- Configure OpenFGA with PostgreSQL backend (not in-memory)
- Enable OpenFGA audit logging
- Enable TLS for all services (OTLP, OpenFGA, PostgreSQL)
- Implement rate limiting on MCP endpoints
- Use production-grade user database
- Review and minimize OpenFGA permissions
- Set up secret rotation in Infisical
- Enable monitoring alerts for auth failures
- Implement token rotation and revocation
- Use separate OpenFGA stores per environment
- Enable MFA for Infisical access
Deploy to LangGraph Platform for fully managed, serverless hosting:
# Install CLI
pip install langgraph-cli
# Login
langgraph login
# Deploy
langgraph deploy
Benefits:
- ✅ Zero infrastructure management
- ✅ Integrated LangSmith observability
- ✅ Automatic versioning and rollbacks
- ✅ Built-in scaling and load balancing
- ✅ One-command deployment
See LANGGRAPH_PLATFORM_DEPLOYMENT.md for complete platform guide.
Deploy to Google Cloud Run for fully managed, serverless deployment:
# Quick deploy
cd cloudrun
./deploy.sh --setup
# Or use gcloud directly
gcloud run deploy mcp-server-langgraph \
--source . \
--region us-central1 \
--allow-unauthenticated
Benefits:
- ✅ Serverless autoscaling (0 to 100+ instances)
- ✅ Pay only for actual usage
- ✅ Automatic HTTPS and SSL certificates
- ✅ Integrated with Google Secret Manager
- ✅ Built-in monitoring and logging
See CLOUDRUN_DEPLOYMENT.md for complete Cloud Run guide.
The agent is fully containerized and ready for Kubernetes deployment. Supported platforms:
- Google Kubernetes Engine (GKE)
- Amazon Elastic Kubernetes Service (EKS)
- Azure Kubernetes Service (AKS)
- Rancher
- VMware Tanzu
Quick Deploy:
# Build and push image
docker build -t your-registry/langgraph-agent:v1.0.0 .
docker push your-registry/langgraph-agent:v1.0.0
# Deploy with Helm
helm install langgraph-agent ./helm/langgraph-agent \
--namespace langgraph-agent \
--create-namespace \
--set image.repository=your-registry/langgraph-agent \
--set image.tag=v1.0.0
# Or deploy with Kustomize
kubectl apply -k kustomize/overlays/production
See KUBERNETES_DEPLOYMENT.md for complete deployment guide.
Kong API Gateway integration provides:
- Rate Limiting: Tiered limits (60-1000 req/min) per consumer/tier
- Authentication: JWT, API Key, OAuth2
- Traffic Control: Request transformation, routing, load balancing
- Security: IP restriction, bot detection, CORS
- Monitoring: Prometheus metrics, request logging
# Deploy with Kong rate limiting
helm install langgraph-agent ./helm/langgraph-agent \
--set kong.enabled=true \
--set kong.rateLimitTier=premium
# Or apply Kong manifests directly
kubectl apply -k kubernetes/kong/
See KONG_INTEGRATION.md for complete Kong setup and rate limiting configuration.
The agent supports multiple MCP transports:
- StreamableHTTP (Recommended): Modern HTTP streaming for production
- stdio: For Claude Desktop and local applications
- HTTP/SSE (Deprecated): Legacy Server-Sent Events
# StreamableHTTP (recommended for web/production)
python mcp_server_streamable.py
# stdio (local/desktop)
python mcp_server.py
# HTTP/SSE (deprecated, legacy only)
python mcp_server_http.py
# Access StreamableHTTP endpoints
POST /message # Main MCP endpoint (streaming or regular)
GET /tools # List tools
GET /resources # List resources
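A quick way to exercise these endpoints from Python, using httpx; the JSON-RPC payload shape below is an assumption for illustration:

import httpx

BASE = "http://localhost:8000"  # local StreamableHTTP server

# List the available tools
print(httpx.get(f"{BASE}/tools").json())

# Send an MCP message (payload shape is illustrative)
resp = httpx.post(f"{BASE}/message", json={
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {"name": "chat", "arguments": {"message": "Hello"}},
})
print(resp.json())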
Why StreamableHTTP?
- ✅ Modern HTTP/2+ streaming
- ✅ Better load balancer/proxy compatibility
- ✅ Proper request/response pairs
- ✅ Full MCP spec compliance
- ✅ Works with Kong rate limiting
Registry compliant - Includes manifest files for MCP Registry publication.
See MCP_REGISTRY.md for registry deployment and transport configuration.
Thanks to all the amazing people who have contributed to this project!
This project follows the all-contributors specification.
Want to be listed here? See CONTRIBUTING.md!
Need help? Check out our Support Guide for:
- Documentation links
- Where to ask questions
- How to report bugs
- Security reporting
MIT - see LICENSE file for details
Built with:
- LangGraph - Agent framework
- MCP - Model Context Protocol
- OpenFGA - Authorization
- LiteLLM - Multi-LLM support
- OpenTelemetry - Observability
Special thanks to the open source community!
We welcome contributions from the community!
1. Read the guides:
   - CONTRIBUTING.md - Contribution guidelines
   - DEVELOPMENT.md - Developer setup
2. Find something to work on
3. Get help
- Code: Features, bug fixes, performance improvements
- Documentation: Guides, tutorials, API docs
- Testing: Unit tests, integration tests, test coverage
- Security: Security improvements, audits
- Translations: i18n support (future)
- Ideas: Feature requests, architecture discussions
All contributors will be recognized in our Contributors section!