A LangGraph-based multi-agent system for testing Damn Vulnerable MCP Server using Gemini 2.5 Flash with intelligent function calling.
This repository contains an interactive security testing agent designed specifically for testing the vulnerabilities in the Damn Vulnerable MCP Server project. It provides a hands-on learning environment for understanding MCP (Model Context Protocol) security issues through 10 different challenge scenarios.
- Multi-Agent Architecture: Orchestrator + 10 individual challenge agents
- Function Calling Integration: LLM dynamically calls MCP tools based on user requests
- Conversation Persistence: Full conversation history within threads
- LangSmith Tracing: Complete observability and debugging
- LangGraph Studio UI: Interactive chat interface for testing
- Full MCP Support: Tools, Resources, and Prompts
- Challenge-Specific Configs: Each agent has customized objectives, resources, and hints
- Educational: Learn security concepts through hands-on exploration
```
┌─────────────────────────────────────────────────────┐
│         LangChain Agent Chat UI (Port 2024)         │
└──────────────────────────┬──────────────────────────┘
                           │
              ┌────────────┴────────────┐
              │   Orchestrator Agent    │
              │     (Guides users)      │
              └────────────┬────────────┘
                           │
         ┌─────────────────┼─────────────────┐
         │                 │                 │
   ┌─────┴─────┐     ┌─────┴─────┐     ┌─────┴─────┐
   │ C1 (9001) │     │ C2 (9002) │ ... │ C10 (9010)│
   └─────┬─────┘     └─────┬─────┘     └─────┬─────┘
         │                 │                 │
         ▼                 ▼                 ▼
    MCP Server        MCP Server        MCP Server
```
Each challenge agent connects to its respective MCP server running on ports 9001-9010.
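The challenge-to-port convention is simply port 9000 + challenge number. A tiny hypothetical helper (the function names and the `/sse` endpoint path are illustrative assumptions, not code from this repository):

```python
# Hypothetical helpers illustrating the challenge-to-port convention
# (ports 9001-9010). Names and the /sse path are assumptions for
# illustration, not this repository's actual API.

def challenge_port(challenge_number: int) -> int:
    """Map challenge N (1-10) to its MCP server port (9000 + N)."""
    if not 1 <= challenge_number <= 10:
        raise ValueError("challenge_number must be between 1 and 10")
    return 9000 + challenge_number

def challenge_url(challenge_number: int) -> str:
    """SSE endpoint a challenge agent would connect to."""
    return f"http://localhost:{challenge_port(challenge_number)}/sse"
```

For example, `challenge_port(3)` returns `9003`, the port Challenge 3's server listens on.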
```
langgraph-security-mcp/
├── orchestrator_agent.py        # Main orchestrator to guide users
├── challenge_agents.py          # 10 challenge-specific agents (C1-C10)
├── challenge_configs.py         # Challenge metadata (name, tools, resources, objectives)
├── mcp_client.py                # MCP client wrapper
├── hint_agent.py                # (Legacy) Interactive hint agent
├── langgraph.json               # LangGraph configuration
├── pytest.ini                   # Pytest configuration
├── requirements.txt             # Python dependencies
├── tests/                       # Automated test suite
│   ├── conftest.py              # Test fixtures and MCP server management
│   ├── test_challenge_agents.py # Main test suite
│   └── README.md                # Testing documentation
├── .env                         # Environment variables (not committed)
├── .gitignore                   # Git ignore rules
└── README.md                    # This file
```
```bash
git clone git@github.com:prashantkul/mcp_security_test_agent.git
cd mcp_security_test_agent

conda create -n langgraph_mcp python=3.11
conda activate langgraph_mcp
pip install -r requirements.txt
```
Create a `.env` file:

```bash
# Gemini Configuration
GOOGLE_API_KEY=your_gemini_api_key_here
GEMINI_MODEL=gemini-2.5-flash

# LangSmith Configuration (optional but recommended)
LANGCHAIN_TRACING_V2=true
LANGCHAIN_API_KEY=your_langsmith_api_key_here
LANGCHAIN_PROJECT=mcp-security-agent
```
Get API Keys:
- Gemini API: Google AI Studio
- LangSmith API: LangSmith Settings
This project includes a comprehensive test suite with automatic MCP server spawning:
```bash
# Install test dependencies
pip install pytest pytest-asyncio

# Run all tests (without MCP servers)
pytest tests/ -v -m "not integration"

# Run integration tests (automatically spawns MCP servers)
export DVMCP_SERVER_PATH="$HOME/path/to/damn-vulnerable-MCP-server"
pytest tests/ -v -m integration

# Run all tests
pytest tests/ -v
```
Test Features:
- Automatic MCP server spawning and cleanup
- Conversation persistence validation
- Function calling and tool execution tests
- Challenge-specific vulnerability tests
- Error handling and edge cases
See tests/README.md for detailed testing documentation.
First, clone and run the vulnerable MCP server:
```bash
# Clone the vulnerable server repository
git clone https://github.com/harishsg993010/damn-vulnerable-MCP-server.git
cd damn-vulnerable-MCP-server

# Start Challenge 1 (or any challenge you want to test)
cd challenges/easy/challenge1
python server_sse.py  # Runs on port 9001
```
The server will start on the respective port (9001-9010 depending on the challenge).
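Before pointing an agent at a challenge, you can check that the server is actually listening. A small hypothetical helper (not part of this repository):

```python
import socket

def is_port_open(host: str, port: int, timeout: float = 2.0) -> bool:
    """Return True if a TCP connection to host:port succeeds."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        return False

# Once challenge 1's server_sse.py is running,
# is_port_open("localhost", 9001) should return True.
```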
In a separate terminal:
```bash
conda activate langgraph_mcp
langgraph dev
```
This will:
- Start the LangGraph API server on `http://127.0.0.1:2024`
- Open LangGraph Studio in your browser
- Register all 11 agents (orchestrator + Challenge1-Challenge10)
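Agent registration is driven by `langgraph.json`. A sketch of what such a config might look like (the graph names and module paths here are illustrative guesses, not the repository's actual contents):

```json
{
  "dependencies": ["."],
  "graphs": {
    "orchestrator": "./orchestrator_agent.py:graph",
    "Challenge1": "./challenge_agents.py:challenge1_graph",
    "Challenge2": "./challenge_agents.py:challenge2_graph"
  },
  "env": ".env"
}
```

Each entry under `graphs` maps an agent name shown in the Studio dropdown to a compiled graph object in a Python module.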
The orchestrator provides guidance on how to use the challenge agents:
- Select "orchestrator" from the agent dropdown
- Ask questions or request guidance
- Switch to specific challenge agents as instructed
For hands-on testing, directly select a challenge agent:
- Select "Challenge1" (or any Challenge2-10) from the dropdown
- Start testing with commands like:
  - `list resources` - See available MCP resources
  - `list tools` - See available MCP tools
  - `list prompts` - See available MCP prompts
  - `run get_user_info with user=admin` - Execute an MCP tool
  - `read resource internal://credentials` - Read an MCP resource
The LLM will intelligently interpret your requests and call the appropriate MCP tools!
```
You: list resources
Agent: Resources:
[
  {
    "uri": "notes://{user_id}",
    "name": "user_notes",
    "description": "Access user notes"
  },
  {
    "uri": "internal://credentials",
    "name": "credentials",
    "description": "System credentials"
  }
]

You: run get_user_info tool with user=admin
Agent: Tool result:
{
  "username": "admin",
  "role": "administrator",
  "email": "admin@example.com"
}
```
The agent supports all 10 challenges from the Damn Vulnerable MCP Server:
| Challenge | Port | Difficulty | Description |
|---|---|---|---|
| Challenge 1 | 9001 | Easy | Basic Prompt Injection |
| Challenge 2 | 9002 | Easy | Tool Poisoning |
| Challenge 3 | 9003 | Easy | Excessive Permission Scope |
| Challenge 4 | 9004 | Medium | Rug Pull Attack |
| Challenge 5 | 9005 | Medium | Tool Shadowing |
| Challenge 6 | 9006 | Medium | Indirect Prompt Injection |
| Challenge 7 | 9007 | Medium | Token Theft |
| Challenge 8 | 9008 | Hard | Malicious Code Execution |
| Challenge 9 | 9009 | Hard | Remote Access Control |
| Challenge 10 | 9010 | Hard | Multi-Vector Attack |
Instead of hardcoded command parsing, the agent uses LangChain function calling:
- MCP tools are exposed as LangChain tools:
  - `list_mcp_tools()` - List available tools
  - `list_mcp_resources()` - List available resources
  - `list_mcp_prompts()` - List available prompts
  - `get_user_info(username)` - Get user information
  - `read_mcp_resource(uri)` - Read an MCP resource
- LLM decides when to call tools: Gemini 2.5 Flash intelligently interprets natural-language requests and calls the appropriate tools
- Agent-Tool Loop: the graph executes tools and feeds results back to the LLM
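The agent-tool loop can be sketched in plain Python. Here a stub function stands in for Gemini, and the message shapes are simplified illustrations rather than this repository's actual code:

```python
# Minimal sketch of an agent-tool loop: the model either requests a
# tool call or gives a final answer; tool results are appended to the
# conversation and fed back until the model answers. All names are
# illustrative stand-ins, not the project's API.

TOOLS = {
    "get_user_info": lambda username: {"username": username, "role": "administrator"},
}

def stub_model(messages):
    """Stand-in for Gemini: request a tool once, then answer."""
    if not any(m["role"] == "tool" for m in messages):
        return {"role": "assistant",
                "tool_call": ("get_user_info", {"username": "admin"})}
    return {"role": "assistant", "content": "admin is an administrator"}

def agent_loop(user_input, model=stub_model, max_turns=5):
    messages = [{"role": "user", "content": user_input}]
    for _ in range(max_turns):
        reply = model(messages)
        messages.append(reply)
        call = reply.get("tool_call")
        if call is None:                 # model produced a final answer
            return messages
        name, args = call
        result = TOOLS[name](**args)     # execute the requested tool
        messages.append({"role": "tool", "content": result})
    return messages
```

Running `agent_loop("run get_user_info with user=admin")` yields a conversation whose last message is the model's final answer, with the tool result sandwiched in between.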
Uses the `add_messages` reducer to maintain full conversation history:

```python
from typing import Annotated, Sequence

from typing_extensions import TypedDict
from langchain_core.messages import BaseMessage
from langgraph.graph.message import add_messages

class ChallengeState(TypedDict):
    messages: Annotated[Sequence[BaseMessage], add_messages]
```
This ensures the agent remembers the entire conversation within each thread.
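The effect of an append-style reducer can be illustrated without LangGraph at all. The function below is a simplified stand-in for `add_messages` (the real reducer also handles message IDs and in-place updates):

```python
# Simplified stand-in for LangGraph's add_messages reducer: rather than
# replacing the `messages` field on each state update, new messages are
# appended, so the full history survives across turns.

def add_messages_demo(existing, updates):
    return list(existing) + list(updates)

state = {"messages": []}
for turn in (["user: list resources"],
             ["assistant: here are the resources"]):
    state["messages"] = add_messages_demo(state["messages"], turn)
```

After both turns, `state["messages"]` holds the entire two-message history instead of only the latest update.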
When enabled, you get:
- Full trace visualization of agent execution
- Token usage and cost tracking
- Debug information for each LLM call
- Performance metrics and latency analysis
View your traces at: https://smith.langchain.com
- LangGraph: Multi-agent orchestration and state management
- LangChain: LLM integration and tool calling
- Gemini 2.5 Flash: Backend LLM with function calling
- MCP Python SDK: Official MCP protocol client
- LangGraph Studio: Interactive UI for testing
- LangSmith: Tracing and observability
- Damn Vulnerable MCP Server: The vulnerable MCP server this agent tests against
- MCP Protocol: Official Model Context Protocol documentation
This is a security research and educational tool. Use responsibly and only against systems you have permission to test.
- Built to test harishsg993010's Damn Vulnerable MCP Server
- Powered by Anthropic's Model Context Protocol
- Created with LangGraph