⚠️ Architecture Update: The monolithic `mcp_server.py` is now deprecated in favor of the modular `app/main.py` structure. Both are functionally identical, but new code should use the modular version. See the Migration Guide.
A persistent backend service that reduces AI agent context window bloat by centralizing memory, tools, and logic.
Before (Tool-in-Prompt):
- Tool schemas repeated in every LLM call
- Memory/state re-sent constantly
- Context window fills up fast
- Poor multi-agent support
After (MCP Server):
- Tools live on external server
- Memory persists across sessions
- Lean prompts (10x token reduction)
- Scalable multi-agent architecture
pip install -r requirements.txt
Production (Recommended):
python -m uvicorn app.main:app --host 0.0.0.0 --port 8000
Development with auto-reload:
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
Server runs at: http://localhost:8000
Optional Environment Variables:
# Enable API key authentication (recommended for production)
export MCP_API_KEY="your-secret-key"
# GitHub API token (optional but recommended for higher rate limits)
export MCP_GITHUB_TOKEN="your-github-token"
# Figma API token (required for Figma actions)
export MCP_FIGMA_TOKEN="your-figma-token"
# Configure CORS allowed origins (default: *)
export MCP_CORS_ORIGINS="https://yourdomain.com,https://app.yourdomain.com"
# Configure rate limit (default: 100/minute)
export MCP_RATE_LIMIT="200/minute"
# Configure log retention limit (default: 1000)
export MCP_LOG_RETENTION="5000"
python mcp_client_example.py
The MCP Server provides auto-generated, interactive API documentation powered by OpenAPI (Swagger) and ReDoc:
Swagger UI (Interactive):
http://localhost:8000/docs
- Try out API endpoints directly from your browser
- View request/response examples
- See detailed parameter descriptions
ReDoc (Clean Documentation):
http://localhost:8000/redoc
- Clean, searchable API reference
- Three-panel design for easy navigation
- Mobile-friendly interface
OpenAPI Specification (JSON):
http://localhost:8000/openapi.json
- Download the complete OpenAPI 3.1.0 specification
- Use with code generators (OpenAPI Generator, Swagger Codegen)
- Import into API testing tools (Postman, Insomnia)
All endpoints return responses in this standard format:
{
"success": true,
"data": { ... },
"message": "Action completed successfully",
"timestamp": "2026-02-16T16:00:00.000000"
}
Response Fields:
- `success` (boolean) - Whether the operation succeeded
- `data` (any) - Response data (varies by endpoint)
- `message` (string) - Human-readable status message
- `timestamp` (string) - ISO 8601 timestamp
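A client can validate this envelope before trusting `data`. The following is a minimal sketch, not part of the project: `MCPResponse` and `parse_response` are hypothetical names, and the `data` payload in the example is invented for illustration.

```python
from dataclasses import dataclass
from datetime import datetime
from typing import Any


@dataclass
class MCPResponse:
    """Typed view of the standard response envelope (hypothetical helper)."""
    success: bool
    data: Any
    message: str
    timestamp: datetime


def parse_response(body: dict) -> MCPResponse:
    """Check that the four documented fields are present and parse the timestamp."""
    missing = {"success", "data", "message", "timestamp"} - body.keys()
    if missing:
        raise ValueError(f"response is missing fields: {sorted(missing)}")
    return MCPResponse(
        success=bool(body["success"]),
        data=body["data"],
        message=str(body["message"]),
        timestamp=datetime.fromisoformat(body["timestamp"]),
    )


# Example envelope; the "data" content is made up for this sketch.
example = {
    "success": True,
    "data": {"users": ["alice"]},
    "message": "Action completed successfully",
    "timestamp": "2026-02-16T16:00:00.000000",
}
parsed = parse_response(example)
```

Parsing the timestamp up front means a malformed response fails at the boundary rather than deep inside agent code.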
The MCP Server supports API versioning to ensure backward compatibility and allow for future enhancements without breaking existing clients.
Current API Versions:
- v1 (`/api/v1/*`) - Current stable version (recommended)
- v2 (`/api/v2/*`) - Placeholder for future extensions
- Legacy (`/mcp/*`) - Original endpoints (deprecated, maintained for backward compatibility)
Migration Guide: All legacy `/mcp/*` endpoints are available at `/api/v1/*`. For example:
- `/mcp/state` → `/api/v1/state`
- `/mcp/query` → `/api/v1/query`
- `/mcp/logs` → `/api/v1/logs`
- `/mcp/reset` → `/api/v1/reset`
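Because the mapping is a pure prefix swap, existing clients can be migrated mechanically. A small sketch, assuming a hypothetical helper named `to_v1_path` (it is not part of the project):

```python
LEGACY_PREFIX = "/mcp/"
V1_PREFIX = "/api/v1/"


def to_v1_path(path: str) -> str:
    """Rewrite a legacy /mcp/* path to /api/v1/*; other paths pass through unchanged."""
    if path.startswith(LEGACY_PREFIX):
        return V1_PREFIX + path[len(LEGACY_PREFIX):]
    return path


# Query strings are preserved because only the prefix is replaced.
migrated = [to_v1_path(p) for p in ("/mcp/state", "/mcp/logs?limit=10", "/api/v1/query")]
```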
| Endpoint | Method | Description |
|---|---|---|
| `/` | GET | Health check (shows available API versions) |
| `/api/v1/query` | POST | Main action endpoint |
| `/api/v1/state` | GET | Get memory snapshot (supports filtering & pagination) |
| `/api/v1/logs` | GET | View structured action logs with configurable retention |
| `/api/v1/reset` | POST | Reset all memory |
| Endpoint | Method | Description | Replacement |
|---|---|---|---|
| `/mcp/query` | POST | Main action endpoint | Use `/api/v1/query` |
| `/mcp/state` | GET | Get memory snapshot | Use `/api/v1/state` |
| `/mcp/logs` | GET | View action logs | Use `/api/v1/logs` |
| `/mcp/reset` | POST | Reset all memory | Use `/api/v1/reset` |
Note: Legacy endpoints are maintained for backward compatibility but may be removed in a future major version. Please migrate to versioned endpoints.
User Management:
- `list_users` - Get all users
- `add_user` - Add new user
- `remove_user` - Remove user
- `get_user` - Get user details

Task Management:
- `list_tasks` - List all tasks (with filters)
- `add_task` - Create new task
- `update_task` - Update task
- `delete_task` - Delete task
- `search_tasks` - Search by query

Configuration:
- `get_config` - Get config values
- `update_config` - Update config

Utilities:
- `calculate` - Perform calculations
- `summarize_data` - Get data summary

GitHub:
- `github_search_repositories` - Search repos
- `github_search_issues` - Search issues/PRs
- `github_get_repository` - Repo details
- `github_list_issues` - List issues
- `github_list_pulls` - List PRs

Figma:
- `figma_get_file` - File metadata
- `figma_get_nodes` - Node metadata
- `figma_get_components` - File components
- `figma_get_styles` - File styles

Playwright:
- `playwright_get_title` - Page title
- `playwright_get_text` - Page text
- `playwright_screenshot` - Screenshot
from mcp_client_example import MCPClient
client = MCPClient()
# Add user
client.add_user("alice")
# Create task
task = client.add_task(
    title="Build feature X",
    priority="high",
    assigned_to="alice"
)
# List tasks for user
alice_tasks = client.list_tasks(assigned_to="alice")
# Get summary
summary = client.get_summary()
Using V1 API (Recommended):
# List all users
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "list_users",
"params": {}
}'
# Add a new user
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "add_user",
"params": {
"username": "alice",
"role": "admin",
"metadata": {
"team": "engineering",
"location": "San Francisco"
}
}
}'
# Get user details
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "get_user",
"params": {
"username": "alice"
}
}'
# Remove a user
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "remove_user",
"params": {
"username": "alice"
}
}'
# Create a new task
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "add_task",
"params": {
"title": "Implement API documentation",
"description": "Add comprehensive OpenAPI/Swagger docs",
"priority": "high",
"assigned_to": "alice"
}
}'
# List all tasks
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "list_tasks",
"params": {}
}'
# List tasks for a specific user
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "list_tasks",
"params": {
"assigned_to": "alice"
}
}'
# List tasks by status
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "list_tasks",
"params": {
"status": "pending"
}
}'
# Update a task
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "update_task",
"params": {
"task_id": 1,
"status": "completed",
"priority": "medium"
}
}'
# Delete a task
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "delete_task",
"params": {
"task_id": 1
}
}'
# Search tasks by keyword
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "search_tasks",
"params": {
"query": "documentation"
}
}'
# Get all configuration
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "get_config",
"params": {}
}'
# Get specific config value
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "get_config",
"params": {
"key": "app_name"
}
}'
# Update configuration
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "update_config",
"params": {
"key": "app_name",
"value": "My MCP Server"
}
}'
# Perform calculations
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "calculate",
"params": {
"operation": "sum",
"numbers": [10, 20, 30, 40]
}
}'
# Get data summary
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-d '{
"action": "summarize_data",
"params": {}
}'
# Get full state
curl http://localhost:8000/api/v1/state
# Get filtered state (tasks only, first 10)
curl "http://localhost:8000/api/v1/state?entity=tasks&limit=10"
# Get pending tasks
curl "http://localhost:8000/api/v1/state?entity=tasks&status=pending"
# Get logs
curl "http://localhost:8000/api/v1/logs?limit=10"
# Reset all memory (use with caution!)
curl -X POST http://localhost:8000/api/v1/reset
When the MCP_API_KEY environment variable is set, include the API key in all requests:
# Using API key authentication
curl -X POST http://localhost:8000/api/v1/query \
-H "Content-Type: application/json" \
-H "X-API-Key: your-secret-key-here" \
-d '{
"action": "list_users",
"params": {}
}'
# Get state with authentication
curl http://localhost:8000/api/v1/state \
-H "X-API-Key: your-secret-key-here"
Legacy endpoints (still work but deprecated):
# Legacy endpoints - replace /mcp/ with /api/v1/
curl http://localhost:8000/mcp/state # Deprecated
curl http://localhost:8000/api/v1/state # Use this instead
The `/api/v1/state` (and legacy `/mcp/state`) endpoint supports optional query parameters for filtering and pagination:
- `entity`: Filter by entity type (users|tasks|config|logs)
- `limit`: Maximum number of items to return
- `offset`: Number of items to skip (for pagination)
- `status`: Filter tasks by status (only applies when `entity=tasks`)
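When these URLs are built from code, letting the standard library encode the query string avoids quoting mistakes. A sketch under stated assumptions: `state_url` and `BASE_URL` are hypothetical names, and the entity check only mirrors the four values listed above.

```python
from typing import Optional
from urllib.parse import urlencode

BASE_URL = "http://localhost:8000"  # default address from this README
ENTITIES = {"users", "tasks", "config", "logs"}


def state_url(entity: Optional[str] = None, status: Optional[str] = None,
              limit: Optional[int] = None, offset: Optional[int] = None) -> str:
    """Build a /api/v1/state URL, omitting parameters that were not supplied."""
    if entity is not None and entity not in ENTITIES:
        raise ValueError(f"unknown entity: {entity!r}")
    params = {"entity": entity, "status": status, "limit": limit, "offset": offset}
    query = urlencode({k: v for k, v in params.items() if v is not None})
    return f"{BASE_URL}/api/v1/state" + (f"?{query}" if query else "")


first_pending = state_url(entity="tasks", status="pending", limit=5)
```

Filtering on `is not None` rather than truthiness keeps `offset=0` in the URL.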
Examples:
# Get only tasks
curl "http://localhost:8000/api/v1/state?entity=tasks"
# Get first 5 pending tasks
curl "http://localhost:8000/api/v1/state?entity=tasks&status=pending&limit=5"
# Get users with pagination (skip first 10, return next 20)
curl "http://localhost:8000/api/v1/state?entity=users&offset=10&limit=20"
Instead of defining tools in your LLM prompt, just give it the endpoint:
# Old way (bloated prompt):
prompt = """
You have access to these tools:
[...massive tool schemas...]
[...memory state...]
"""
# New way (lean prompt):
prompt = """
You can query the MCP server at http://localhost:8000/api/v1/query
Available actions: list_users, add_task, list_tasks, etc.
Example:
POST /api/v1/query
{
"action": "list_users",
"params": {}
}
"""
For a full end-to-end walkthrough (start server → IDE → tool calls), see docs/IDE_INTEGRATION.md. For the integration action catalog, see docs/MCP_INTEGRATIONS.md.
Your AI assistant can call the MCP server directly from generated code:
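import requests

def get_user_tasks(username):
    # Ask the MCP server for the tasks assigned to one user
    response = requests.post(
        "http://localhost:8000/api/v1/query",
        json={
            "action": "list_tasks",
            "params": {"assigned_to": username}
        },
        timeout=10,  # avoid hanging if the server is unreachable
    )
    response.raise_for_status()  # surface HTTP errors (e.g. 429) instead of a KeyError
    return response.json()["data"]
Create .vscode/tasks.json: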
{
"version": "2.0.0",
"tasks": [
{
"label": "Start MCP Server",
"type": "shell",
"command": "python",
"args": ["mcp_cli.py", "start"],
"isBackground": true,
"problemMatcher": []
}
]
}
Run with: Terminal > Run Task > Start MCP Server
Create .vscode/launch.json:
{
"version": "0.2.0",
"configurations": [
{
"name": "MCP Server",
"type": "python",
"request": "launch",
"program": "${workspaceFolder}/mcp_cli.py",
"args": ["start", "--reload"],
"console": "integratedTerminal"
}
]
}
Debug with breakpoints: F5
All actions are logged in a structured JSON format that includes:
- timestamp: ISO 8601 timestamp of the action
- action: The action that was performed
- payload: Action parameters and result (truncated to 200 chars)
- status: `success` or `error`
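The fields above, together with a bounded retention limit, can be sketched as follows. This is an illustration only: the server's actual logging code is not shown in this README, so `make_log_entry`, `new_log_store`, the use of `deque(maxlen=...)`, and truncating the stringified result (rather than the whole payload) are all assumptions.

```python
from collections import deque
from datetime import datetime
from typing import Any, Deque, Dict

PAYLOAD_LIMIT = 200    # documented truncation length
LOG_RETENTION = 1000   # documented default retention


def make_log_entry(action: str, params: Dict[str, Any], result: Any, ok: bool) -> Dict[str, Any]:
    """Build one structured entry; the stringified result is cut to PAYLOAD_LIMIT chars."""
    return {
        "timestamp": datetime.now().isoformat(),
        "action": action,
        "payload": {"params": params, "result": str(result)[:PAYLOAD_LIMIT]},
        "status": "success" if ok else "error",
    }


def new_log_store(retention: int = LOG_RETENTION) -> Deque[Dict[str, Any]]:
    """A deque with maxlen drops the oldest entry automatically once it is full."""
    return deque(maxlen=retention)


logs = new_log_store()
logs.append(make_log_entry("add_user", {"username": "alice"},
                           {"username": "alice", "added": True}, ok=True))
```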
View recent actions:
# Using v1 API (recommended)
curl "http://localhost:8000/api/v1/logs?limit=10"
# Legacy endpoint (deprecated)
curl "http://localhost:8000/mcp/logs?limit=10"
Example log entry:
{
"timestamp": "2026-02-16T15:14:57.661938",
"action": "add_user",
"payload": {
"params": {"username": "alice"},
"result": "{'username': 'alice', 'added': True}"
},
"status": "success"
}
Logs are automatically trimmed to maintain the configured retention limit (default: 1000 entries). Configure via environment variable:
export MCP_LOG_RETENTION="5000" # Keep last 5000 log entries
Via /api/v1/logs endpoint (recommended):
# Get last 10 logs
curl "http://localhost:8000/api/v1/logs?limit=10"
Via /api/v1/state endpoint with entity filter:
# Get logs with pagination
curl "http://localhost:8000/api/v1/state?entity=logs&limit=20&offset=10"
Legacy endpoints (deprecated):
curl "http://localhost:8000/mcp/logs?limit=10"
curl "http://localhost:8000/mcp/state?entity=logs&limit=20&offset=10"
Check server health:
curl http://localhost:8000/
Preferred (modular app): add a service and register it in the v1 router.
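# app/services/notification_service.py
from typing import Any, Dict
# Session is the app's database session type; import it the same way the existing services do.

async def send_notification(params: Dict[str, Any], db: Session) -> Dict[str, Any]:
    user = params.get("user")
    message = params.get("message")
    # Reject incomplete requests; a ValueError is returned to the client as HTTP 400
    if not user or not message:
        raise ValueError("user and message are required")
    return {"sent": True, "user": user}

# app/routers/v1.py
from app.services import notification_service

handlers = {
    # ... existing handlers ...
    "send_notification": notification_service.send_notification,
}
Legacy (monolithic app): add new actions to mcp_server.py: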
# 1. Add handler function
async def handle_my_custom_tool(params: Dict[str, Any]) -> Any:
    # Your logic here
    value = params.get("input")
    if value is None:
        raise ValueError("input is required")  # avoids a TypeError on None * 2
    return {"result": value * 2}

# 2. Register in handlers dict
handlers = {
    # ... existing handlers ...
    "my_custom_tool": handle_my_custom_tool,
}
Then call it:
client.query("my_custom_tool", {"input": 42})
For production, replace in-memory storage with a database:
import sqlite3

class DatabaseMemory:
    def __init__(self, db_path="mcp.db"):
        self.conn = sqlite3.connect(db_path, check_same_thread=False)
        self.init_tables()

    def init_tables(self):
        self.conn.execute("""
            CREATE TABLE IF NOT EXISTS tasks (
                id INTEGER PRIMARY KEY,
                title TEXT,
                priority TEXT,
                assigned_to TEXT,
                created_at TEXT
            )
        """)
        self.conn.commit()

    def add_task(self, task):
        self.conn.execute(
            "INSERT INTO tasks (title, priority, assigned_to, created_at) VALUES (?, ?, ?, ?)",
            (task["title"], task["priority"], task["assigned_to"], task["created_at"])
        )
        self.conn.commit()

import psycopg2

conn = psycopg2.connect(
    host="localhost",
    database="mcp_db",
    user="user",
    password="password"
)
Create Dockerfile:
FROM python:3.11-slim
WORKDIR /app
COPY requirements.txt .
RUN pip install --no-cache-dir -r requirements.txt
COPY . .
EXPOSE 8000
CMD ["python", "mcp_cli.py", "start", "--host", "0.0.0.0", "--port", "8000"]
Build and run:
docker build -t mcp-server .
docker run -p 8000:8000 mcp-server
Create /etc/systemd/system/mcp-server.service:
[Unit]
Description=MCP Server
After=network.target
[Service]
Type=simple
User=youruser
WorkingDirectory=/path/to/mcp
ExecStart=/usr/bin/python3 /path/to/mcp/mcp_cli.py start
Restart=always
[Install]
WantedBy=multi-user.target
Enable:
sudo systemctl enable mcp-server
sudo systemctl start mcp-server
API key authentication is built-in and can be enabled by setting an environment variable:
export MCP_API_KEY="your-secret-key-here"
python mcp_cli.py start
Then include the API key in requests:
# Using v1 API (recommended)
curl http://localhost:8000/api/v1/state \
-H "X-API-Key: your-secret-key-here"
# Legacy endpoint
curl http://localhost:8000/mcp/state \
-H "X-API-Key: your-secret-key-here"
If MCP_API_KEY is not set, authentication is disabled (useful for development).
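The behavior described here (check `X-API-Key` when a key is configured, allow everything otherwise) can be sketched in a few lines. This is not the project's `auth.py`, which is not shown; `is_authorized` is a hypothetical name, and the constant-time comparison is a choice made for this sketch.

```python
import hmac
import os
from typing import Mapping, Optional


def is_authorized(headers: Mapping[str, str], expected_key: Optional[str]) -> bool:
    """Return True if the request may proceed.

    With no key configured, authentication is disabled. Otherwise the
    X-API-Key header must match exactly.
    """
    if not expected_key:
        return True
    supplied = headers.get("X-API-Key", "")
    # compare_digest avoids leaking how many leading characters matched
    return hmac.compare_digest(supplied.encode(), expected_key.encode())


# The server would read the expected key once at startup.
configured_key = os.environ.get("MCP_API_KEY")
```

A plain dict lookup is case-sensitive; a real web framework normalizes header names, so this detail only matters for the sketch.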
CORS is enabled by default with wildcard origins for development. For production, restrict to specific domains:
# Allow specific origins (comma-separated)
export MCP_CORS_ORIGINS="https://yourdomain.com,https://app.yourdomain.com"
python mcp_cli.py start
The CORS middleware supports:
- Configurable allowed origins
- Credentials support
- Standard HTTP methods (GET, POST, PUT, DELETE, OPTIONS)
- All headers allowed
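As an illustration of how a comma-separated `MCP_CORS_ORIGINS` value might be turned into an allow-list, here is a small sketch. `parse_cors_origins` and `is_origin_allowed` are hypothetical helpers; the server's own parsing is not shown in this README.

```python
import os
from typing import List, Optional


def parse_cors_origins(raw: Optional[str]) -> List[str]:
    """Split the comma-separated value; fall back to the documented default of "*"."""
    if raw is None:
        return ["*"]
    origins = [origin.strip() for origin in raw.split(",") if origin.strip()]
    return origins or ["*"]


def is_origin_allowed(origin: str, allowed: List[str]) -> bool:
    """Exact-match check; "*" allows every origin."""
    return "*" in allowed or origin in allowed


allowed_origins = parse_cors_origins(os.environ.get("MCP_CORS_ORIGINS"))
```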
Rate limiting is automatically enabled to protect against abuse:
# Configure rate limit (default: 100/minute)
export MCP_RATE_LIMIT="200/minute"
python mcp_cli.py start
Supported formats:
- `100/minute` - 100 requests per minute
- `10/second` - 10 requests per second
- `1000/hour` - 1000 requests per hour
When the rate limit is exceeded, the API returns:
- Status: `429 Too Many Requests`
- Response: `{"error":"Rate limit exceeded: 100 per 1 minute"}`
Rate limiting applies to all endpoints and is enforced per IP address.
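To make the `count/period` format and per-IP enforcement concrete, here is a minimal fixed-window limiter. It is a sketch, not the server's implementation (which is not shown here and may use a different algorithm); `parse_rate_limit` and `FixedWindowLimiter` are hypothetical names.

```python
import time
from typing import Callable, Dict, Tuple

PERIOD_SECONDS = {"second": 1, "minute": 60, "hour": 3600}


def parse_rate_limit(spec: str) -> Tuple[int, int]:
    """Parse "100/minute" into (100, 60)."""
    count, _, period = spec.partition("/")
    count, period = count.strip(), period.strip()
    if not count.isdigit() or period not in PERIOD_SECONDS:
        raise ValueError(f"unsupported rate limit: {spec!r}")
    return int(count), PERIOD_SECONDS[period]


class FixedWindowLimiter:
    """Counts requests per IP within fixed time windows."""

    def __init__(self, spec: str, clock: Callable[[], float] = time.monotonic):
        self.limit, self.window = parse_rate_limit(spec)
        self.clock = clock
        self.counters: Dict[str, Tuple[int, int]] = {}  # ip -> (window index, count)

    def allow(self, ip: str) -> bool:
        """Return False when the caller should receive 429 Too Many Requests."""
        current = int(self.clock() // self.window)
        index, count = self.counters.get(ip, (current, 0))
        if index != current:  # a new window started, so reset the counter
            index, count = current, 0
        if count >= self.limit:
            return False
        self.counters[ip] = (index, count + 1)
        return True


limiter = FixedWindowLimiter("100/minute")
```

The injectable `clock` exists so the limiter can be tested without sleeping.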
- Use connection pooling for database connections
- Add caching for frequently accessed data (Redis)
- Configure rate limiting based on your traffic patterns (see Security section)
- Use async handlers for I/O operations
- Add pagination for large result sets
- Set appropriate CORS origins to reduce unauthorized requests
The project includes a comprehensive pytest-based test suite covering all endpoints and operations.
# Install test dependencies
pip install -r requirements.txt
# Run all tests
pytest
# Run with verbose output
pytest -v
# Run specific test file
pytest tests/test_users.py
# Run with coverage report
pytest --cov=. --cov-report=html
The test suite includes 78+ tests covering:
- Endpoints: Health check, state, query, logs, reset
- User Operations: List, add, remove, get user details
- Task Operations: CRUD, search, filtering
- Config Operations: Get and update configuration
- Authentication: API key validation and security
- Error Handling: Invalid inputs, missing parameters, edge cases
tests/
├── conftest.py # Pytest fixtures and configuration
├── test_endpoints.py # Core endpoint tests
├── test_users.py # User management tests
├── test_tasks.py # Task management tests
├── test_config.py # Configuration tests
├── test_auth.py # Authentication tests
└── test_error_handling.py # Error handling and edge cases
All tests use an in-memory SQLite database for speed and isolation. See tests/README.md for detailed documentation.
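The isolation that an in-memory SQLite database provides can be seen in a few lines. This sketch uses only the standard library; `make_db` and the `users` table are hypothetical stand-ins, and the project's real fixtures live in `tests/conftest.py`, which is not reproduced here.

```python
import sqlite3


def make_db() -> sqlite3.Connection:
    """Each call returns a brand-new database that exists only in memory."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE users (username TEXT PRIMARY KEY)")
    return conn


def test_add_user():
    conn = make_db()
    conn.execute("INSERT INTO users (username) VALUES (?)", ("alice",))
    assert conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 1
    conn.close()


def test_fresh_db_is_empty():
    # Rows inserted by other tests are not visible: the database is new.
    conn = make_db()
    assert conn.execute("SELECT COUNT(*) FROM users").fetchone()[0] == 0
    conn.close()
```

pytest collects both functions by name; no test can leak state into another because nothing is written to disk.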
The codebase has been reorganized for clarity and maintainability:
| Old Location | New Location | Status |
|---|---|---|
| `mcp_server.py` | `app/main.py` | |
| `routers/v1.py` | `app/routers/v1.py` | |
| `routers/v2.py` | `app/routers/v2.py` | |
| `models.py` | `app/models/` | |
| `auth.py` | `app/auth.py` | |
| `database.py` | `app/database.py` | |
| `log_manager.py` | `app/log_manager.py` | |
Old startup command:
python mcp_server.py
New startup command (recommended):
python -m uvicorn app.main:app --reload --host 0.0.0.0 --port 8000
- All endpoints remain identical
- `/mcp/*` routes are now marked `@deprecated` but still functional
- Prefer `/api/v1/*` routes for new clients
- `/api/v2/*` is now available for future enhancements

- All `ValueError` exceptions now automatically return HTTP 400 Bad Request
- No user-facing API changes required
- Improves consistency across all endpoints
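The `ValueError` to HTTP 400 rule can be pictured as a single wrapper around every handler. This is a framework-free sketch: `run_action`, `envelope`, and `require_username` are hypothetical, and the assumption that error responses reuse the standard envelope is not confirmed by this README.

```python
from datetime import datetime
from typing import Any, Callable, Dict, Tuple


def envelope(success: bool, data: Any, message: str) -> Dict[str, Any]:
    """Standard response format; reusing it for errors is an assumption."""
    return {"success": success, "data": data, "message": message,
            "timestamp": datetime.now().isoformat()}


def run_action(handler: Callable[[Dict[str, Any]], Any],
               params: Dict[str, Any]) -> Tuple[int, Dict[str, Any]]:
    """Run a handler and map ValueError to a 400 status in one place."""
    try:
        data = handler(params)
    except ValueError as exc:
        return 400, envelope(False, None, str(exc))
    return 200, envelope(True, data, "Action completed successfully")


def require_username(params: Dict[str, Any]) -> Dict[str, Any]:
    """Example handler that validates its input."""
    username = params.get("username")
    if not username:
        raise ValueError("username is required")
    return {"username": username}


status_ok, body_ok = run_action(require_username, {"username": "alice"})
status_bad, body_bad = run_action(require_username, {})
```

Handlers then only need to raise `ValueError` for bad input; they do not build error responses themselves.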
- Cleaner code organization (separation of concerns)
- Easier to test individual components
- Better scalability for adding new features
- Services layer enables code reuse
Both implementations work identically. The old files remain functional for backward compatibility and can be phased out at your own pace.
MIT
- Fork the repo
- Create feature branch
- Add your enhancements
- Submit pull request
Built for AI agents to reduce context window bloat