Autonomous incident response using Claude Agent SDK with Skills and Model Context Protocol
Transform incident response from a 45-minute manual process into a 3-5 minute autonomous resolution using:
- Agent Skills (.claude/skills/ - filesystem-based instructions with progressive disclosure)
- MCP Servers (FastMCP - tool integrations)
- Claude Agent SDK (Autonomous agentic execution)
- Structured Workflows (6-phase incident response with phase transition summaries)
Required: Claude Code CLI 2.0.55 or later (for Skill tool support)
# Check version
claude --version # Should show 2.0.55 or higher
# Upgrade if needed
claude update

# Clone the repository
git clone https://github.com/My3VM/CladeSkillDemo.git
cd CladeSkillDemo

Note: Throughout the documentation, paths like /Users/<username>/CladeSkillDemo/analytics/ are shown with <username> as a placeholder. Replace <username> with your actual username.
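The version requirement above can also be checked from a script. The following is a minimal sketch, assuming that `claude --version` prints a dotted X.Y.Z version somewhere in its output; the helper names (`parse_version`, `meets_minimum`, `installed_cli_ok`) are hypothetical and not part of this repository. Versions are compared as integer tuples so that 2.0.100 correctly counts as newer than 2.0.55.

```python
import re
import subprocess

MIN_VERSION = (2, 0, 55)  # minimum Claude Code CLI version stated in this README


def parse_version(text):
    """Extract the first X.Y.Z version found in text as a tuple of ints."""
    match = re.search(r"(\d+)\.(\d+)\.(\d+)", text)
    if match is None:
        raise ValueError(f"no version number found in {text!r}")
    return tuple(int(part) for part in match.groups())


def meets_minimum(text, minimum=MIN_VERSION):
    """Return True if the version found in text is at least `minimum`."""
    return parse_version(text) >= minimum


def installed_cli_ok():
    """Run `claude --version` and compare its output against MIN_VERSION.

    Assumption: the CLI is on PATH and prints its version to stdout.
    """
    result = subprocess.run(
        ["claude", "--version"], capture_output=True, text=True, check=True
    )
    return meets_minimum(result.stdout)
```

Calling `installed_cli_ok()` before starting the demo would surface an outdated CLI early instead of failing later when the Skill tool is unavailable.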
# 1. Setup
python3 -m venv cenv
source cenv/bin/activate
pip install -r requirements.txt
# 2. Start MCP servers (Terminal 1)
./scripts/start-servers.sh
# 3. Start Web UI (Terminal 2)
./scripts/run-web-ui.sh
# 4. Open browser to http://localhost:8000

# 1. Setup
python3 -m venv cenv
source cenv/bin/activate
pip install -r requirements.txt
# 2. Start MCP servers (Terminal 1)
./scripts/start-servers.sh
# 3. Run demo (Terminal 2)
python demos/run_scenario.py

ClaudeSkillsSDK/
├── .claude/skills/                        # Agent Skills (SKILL.md files)
│   ├── incident-analysis/                 # 6-phase incident response workflow
│   └── log-analytics/                     # Code generation for log analysis
│
├── mcp-servers/                           # FastMCP servers
│   ├── monitoring-analysis-server.py      # Port 9001
│   ├── workflow-orchestration-server.py   # Port 9002
│   └── log-analytics-server.py            # Port 9003
│
├── claude-agent/
│   ├── agent.py                           # Agent SDK integration
│   └── utils/
│       └── todo_tracker.py                # Live progress tracking
│
├── analytics/                             # Generated log analysis outputs
│
├── web_ui.py                              # Web UI (localhost:8000)
│
├── demos/
│   └── run_scenario.py                    # CLI demo runner
│
└── scripts/
    ├── setup.sh
    ├── run-demo.sh
    ├── run-web-ui.sh                      # Web UI launcher
    └── start-servers.sh
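Because the agent depends on all three MCP servers, it can help to confirm that they are listening before starting a demo. The sketch below probes the ports listed in the tree above with a plain TCP connection. The server labels are hypothetical names derived from the file names, and the check assumes the servers bind to localhost; it does not verify that a port actually speaks MCP.

```python
import socket

# Ports as listed in the project tree; the keys are hypothetical labels.
MCP_PORTS = {
    "monitoring-analysis": 9001,
    "workflow-orchestration": 9002,
    "log-analytics": 9003,
}


def port_is_open(port, host="127.0.0.1", timeout=1.0):
    """Return True if a TCP connection to host:port succeeds within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False


def server_status(ports=MCP_PORTS):
    """Map each server label to whether its port accepts connections."""
    return {name: port_is_open(port) for name, port in ports.items()}


if __name__ == "__main__":
    for name, is_up in server_status().items():
        print(f"{name}: {'up' if is_up else 'DOWN'}")
```

Running this after `./scripts/start-servers.sh` gives a quick readiness signal without involving the agent.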
Filesystem-based instructions that Claude autonomously invokes:
---
name: incident-analysis
description: Analyze and resolve production incidents
---
# Incident Analysis Skill
[Instructions for Claude...]

When Claude sees: "Our API is slow, investigate and resolve"
It automatically uses: incident-analysis Skill
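To make the matching step more concrete: the `name` and `description` fields in the frontmatter are the metadata available for deciding whether a Skill applies, while the body holds the instructions loaded afterwards. The sketch below separates the two for the example file above. It is an illustrative simplification that handles only flat `key: value` lines, not full YAML, and `parse_skill_frontmatter` is a hypothetical helper, not how Claude Code itself loads Skills.

```python
def parse_skill_frontmatter(text):
    """Split a SKILL.md string into (metadata dict, body).

    Handles only flat `key: value` lines between two `---` markers.
    """
    lines = text.strip().splitlines()
    if not lines or lines[0].strip() != "---":
        raise ValueError("SKILL.md must start with a '---' line")
    try:
        end = next(
            i for i, line in enumerate(lines[1:], start=1) if line.strip() == "---"
        )
    except StopIteration:
        raise ValueError("frontmatter is missing its closing '---'") from None

    metadata = {}
    for line in lines[1:end]:
        if not line.strip():
            continue
        key, sep, value = line.partition(":")
        if not sep:
            raise ValueError(f"expected 'key: value', got {line!r}")
        metadata[key.strip()] = value.strip()

    body = "\n".join(lines[end + 1:]).strip()
    return metadata, body


EXAMPLE = """---
name: incident-analysis
description: Analyze and resolve production incidents
---
# Incident Analysis Skill
[Instructions for Claude...]
"""

metadata, body = parse_skill_frontmatter(EXAMPLE)
print(metadata["name"])         # incident-analysis
print(metadata["description"])  # Analyze and resolve production incidents
```

Since the description drives invocation, a specific one such as "Analyze and resolve production incidents" is more likely to match a prompt about a slow API than a generic one would be.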
FastMCP tools that Claude calls explicitly:
import json
from typing import Optional

# mcp_server (the server's FastMCP instance) and metrics are defined elsewhere
@mcp_server.tool()
async def get_system_metrics(incident_type: Optional[str] = None) -> str:
    """Get current system metrics"""
    return json.dumps(metrics)

import asyncio
from claude_agent_sdk import query, ClaudeAgentOptions

# project_root, mcp_servers, and user_prompt are defined elsewhere
options = ClaudeAgentOptions(
    cwd=str(project_root),                                  # .claude/skills/ location
    setting_sources=["user", "project"],                    # Load Skills from filesystem
    mcp_servers=mcp_servers,                                # Connect to MCP servers
    allowed_tools=["Skill", "Read", "TodoWrite", "Bash"],   # Enable core tools
    permission_mode='bypassPermissions',                    # Autonomous execution
    model="us.anthropic.claude-sonnet-4-5-20250929-v1:0",
    max_turns=100,                                          # Allow full workflow completion
)

async def main():
    async for message in query(prompt=user_prompt, options=options):
        # Agent autonomously:
        # 1. Invokes Skill tool for incident-analysis
        # 2. Reads phase files progressively
        # 3. Calls MCP tools as instructed
        # 4. Tracks progress with TodoWrite
        print(message)

asyncio.run(main())

python demos/run_scenario.py --scenario connection_leak
python demos/run_scenario.py --scenario memory_leak

- ARCHITECTURE.md - Complete architecture
- SKILLS_README.md - Agent Skills guide
- MCP_SERVERS.md - MCP server details
- QUICKSTART.md - 2-minute quick start
- User prompt: "Production system is degraded. Users can't log in."
- Claude invokes: Skill tool with incident-analysis skill
- Progressive disclosure: Reads phase files sequentially:
  - Phase 1: phases/triage.md → Gathers metrics, creates incident
  - Phase 2: phases/investigation.md → Analyzes logs, forms hypotheses
  - Phase 3: phases/rca.md → Confirms root cause with 90%+ confidence
  - Phase 4: phases/remediation.md → Plans fix strategy
  - Phase 5: phases/execution.md → Executes and verifies remediation
  - Phase 6: phases/documentation.md → Documents and notifies team
- Structured transitions: Provides summary before each phase transition
- MCP tool calls: get_system_metrics(), analyze_logs(), execute_remediation(), etc.
- Autonomous resolution: Complete incident lifecycle with audit trail
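The six-phase sequence above can be pictured as an ordered list in which each phase file is read only when the previous phase has finished and been summarized. The sketch below models that ordering. The phase files and purposes come from the list above, while the `Phase` class, the helper functions, and the summary wording are hypothetical illustrations rather than the repository's implementation.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Phase:
    number: int
    file: str      # path relative to the skill directory
    purpose: str


# Phase files and purposes as listed in the workflow above.
PHASES = [
    Phase(1, "phases/triage.md", "Gathers metrics, creates incident"),
    Phase(2, "phases/investigation.md", "Analyzes logs, forms hypotheses"),
    Phase(3, "phases/rca.md", "Confirms root cause with 90%+ confidence"),
    Phase(4, "phases/remediation.md", "Plans fix strategy"),
    Phase(5, "phases/execution.md", "Executes and verifies remediation"),
    Phase(6, "phases/documentation.md", "Documents and notifies team"),
]


def next_phase(current):
    """Return the phase after `current`, or None once phase 6 is done."""
    index = PHASES.index(current)
    return PHASES[index + 1] if index + 1 < len(PHASES) else None


def transition_summary(current, findings):
    """Build a summary line to emit before moving on from `current`.

    The wording here is a placeholder; the real format is set by the skill files.
    """
    following = next_phase(current)
    target = f"Phase {following.number} ({following.file})" if following else "done"
    return f"Phase {current.number} complete: {findings}. Next: {target}"


print(transition_summary(PHASES[0], "incident created"))
```

Keeping later phase files unread until they are needed is what keeps the context small early in the incident, which is the point of progressive disclosure.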
- Progressive Disclosure: Phase instructions revealed incrementally
- Structured Transitions: Mandated summaries between phases
- Domain Expertise: Codified incident response best practices
- Autonomous Invocation: Model-invoked based on description matching
- Audit Trail: Each phase creates documented decision points
- System Integrations: Monitoring, logging, remediation
- Data Operations: Metrics collection, log analysis
- Concrete Actions: Health checks, incident creation, notifications
- Explicit Invocation: Called by Claude as instructed by Skills
- Real-time Updates: Phase completion tracking
- Structured Task List: 6 todos matching 6 phases
- Status Transitions: pending → in_progress → completed
- Completion Rate: Visual progress indicator
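The progress-tracking behavior described in the last four items can be sketched as a small state machine over six todos. The interface of todo_tracker.py is not shown in this document, so everything below (the dictionary shape, `advance`, `completion_rate`, and the phase labels) is a hypothetical illustration of the status rules and the completion-rate calculation, not the project's code.

```python
# Each status may only move forward: pending -> in_progress -> completed.
ALLOWED_TRANSITIONS = {
    "pending": {"in_progress"},
    "in_progress": {"completed"},
    "completed": set(),
}


def make_todos(phase_names):
    """Create one pending todo per phase."""
    return [{"content": name, "status": "pending"} for name in phase_names]


def advance(todo, new_status):
    """Return a copy of todo with new_status, rejecting invalid transitions."""
    if new_status not in ALLOWED_TRANSITIONS[todo["status"]]:
        raise ValueError(f"cannot move from {todo['status']} to {new_status}")
    return {**todo, "status": new_status}


def completion_rate(todos):
    """Fraction of todos marked completed (0.0 for an empty list)."""
    if not todos:
        return 0.0
    done = sum(1 for todo in todos if todo["status"] == "completed")
    return done / len(todos)


todos = make_todos(
    ["Triage", "Investigation", "RCA", "Remediation", "Execution", "Documentation"]
)
todos[0] = advance(advance(todos[0], "in_progress"), "completed")
todos[1] = advance(todos[1], "in_progress")
print(f"{completion_rate(todos):.0%}")  # 17%
```

With one of six phases completed, the rate is 1/6, which the visual indicator would show as roughly 17%.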
CRITICAL: The log-analytics skill enforces secure file paths to prevent sensitive data exposure.
Allowed:
- analytics/incident_logs.json - Raw log data
- analytics/parse_logs_*.py - Generated analysis scripts
- analytics/analysis_results_*.json - Analysis outputs
Forbidden:
- /tmp/ - Temporary files accessible to other processes
- /private/tmp/ - Same security risk
- Any path outside the project workspace
Why: Incident logs contain sensitive production data. Saving to /tmp/ exposes this data to other system processes, creating a security vulnerability.
Enforcement: Each phase file in .claude/skills/log-analytics/phases/ explicitly warns against temp directory usage and requires verification of file paths.
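The skill enforces this rule through instructions in its phase files. The same rule could also be checked in code before a generated script writes anything. The sketch below is one way to do that, assuming outputs must sit directly inside the project's analytics/ directory and match one of the allowed file name patterns listed above. `is_allowed_output` is a hypothetical helper, not part of the repository.

```python
from fnmatch import fnmatch
from pathlib import Path

# File name patterns taken from the "Allowed" list above.
ALLOWED_PATTERNS = (
    "incident_logs.json",
    "parse_logs_*.py",
    "analysis_results_*.json",
)


def is_allowed_output(path, project_root):
    """Return True only if path resolves to an allowed file in <project_root>/analytics/."""
    analytics_dir = (Path(project_root) / "analytics").resolve()
    candidate = Path(path)
    if not candidate.is_absolute():
        candidate = Path(project_root) / candidate
    # resolve() collapses ".." segments and follows symlinks, so paths such as
    # analytics/../secrets.json or /tmp/... cannot pass as analytics files.
    resolved = candidate.resolve()
    if resolved.parent != analytics_dir:
        return False
    return any(fnmatch(resolved.name, pattern) for pattern in ALLOWED_PATTERNS)
```

Checking the resolved location rather than the raw string means /tmp/ and /private/tmp/ are rejected because they fall outside the workspace, with no need for a separate deny list.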
mkdir -p .claude/skills/your-skill

Create .claude/skills/your-skill/SKILL.md:
---
name: your-skill
description: When to use this skill
---
# Your Skill
Instructions for Claude...

In mcp-servers/*/server.py:
@mcp_server.tool()
async def your_tool(param: str) -> str:
    """Tool description"""
    return json.dumps(result)

MIT License - see LICENSE file
Built to demonstrate autonomous AI agents with Claude Skills + MCP