diff --git a/README.md b/README.md index 6401fbb..8032127 100644 --- a/README.md +++ b/README.md @@ -45,34 +45,36 @@ graph TB - ๐Ÿ Python 3.12+ - โšก [uv](https://docs.astral.sh/uv/getting-started/installation/) package manager -### ๐Ÿš€ Setup +### ๐Ÿš€ Setup & Testing ```bash git clone https://github.com/DeepDiagnostix-AI/spark-history-server-mcp.git cd spark-history-server-mcp -uv sync --frozen -uv run main.py + +# Install Task (if not already installed) +brew install go-task # macOS, see https://taskfile.dev/installation/ for others + +# Setup and start testing +task install # Install dependencies +task start-spark-bg # Start Spark History Server with sample data +task start-mcp-bg # Start MCP Server +task start-inspector-bg # Start MCP Inspector + +# Opens http://localhost:6274 for interactive testing +# When done: task stop-all ``` ### โš™๏ธ Configuration -Edit `config.yaml`: +Edit `config.yaml` for your Spark History Server: ```yaml servers: local: - default: true # if server name is not provided in tool calls, this Spark History Server is used + default: true url: "http://your-spark-history-server:18080" auth: # optional username: "user" password: "pass" ``` -### ๐Ÿ”ฌ Testing with MCP Inspector -```bash -# Start MCP server with Inspector (opens browser automatically) -npx @modelcontextprotocol/inspector -``` - -**๐ŸŒ Test in Browser** - The MCP Inspector opens at http://localhost:6274 for interactive tool testing! - ## ๐Ÿ“ธ Screenshots ### ๐Ÿ” Get Spark Application @@ -81,44 +83,34 @@ npx @modelcontextprotocol/inspector ### โšก Job Performance Comparison ![Job Comparison](screenshots/job-compare.png) -## ๐Ÿ› ๏ธ Available Tools - -### ๐Ÿ“Š Application & Job Analysis -| ๐Ÿ”ง Tool | ๐Ÿ“ Description | -|---------|----------------| -| `get_application` | Get detailed information about a specific Spark application | -| `get_jobs` | Get a list of all jobs for a Spark application | -| `get_slowest_jobs` | Get the N slowest jobs for a Spark application | - -### โšก Stage & Task Analysis -| ๐Ÿ”ง Tool | ๐Ÿ“ Description | -|---------|----------------| -| `get_stages` | Get a list of all stages for a Spark application | -| `get_slowest_stages` | Get the N slowest stages for a Spark application | -| `get_stage` | Get information about a specific stage | -| `get_stage_task_summary` | Get task metrics summary for a specific stage | -### ๐Ÿ–ฅ๏ธ Executor & Resource Analysis -| ๐Ÿ”ง Tool | ๐Ÿ“ Description | -|---------|----------------| -| `get_executors` | Get executor information for an application | -| `get_executor` | Get information about a specific executor | -| `get_executor_summary` | Get aggregated metrics across all executors | -| `get_resource_usage_timeline` | Get resource usage timeline for an application | +## ๐Ÿ› ๏ธ Available Tools -### ๐Ÿ” SQL & Performance Analysis +### Core Analysis Tools (All Integrations) | ๐Ÿ”ง Tool | ๐Ÿ“ Description | |---------|----------------| -| `get_slowest_sql_queries` | Get the top N slowest SQL queries for an application | -| `get_job_bottlenecks` | Identify performance bottlenecks in a Spark job | -| `get_environment` | Get comprehensive Spark runtime configuration | - -### ๐Ÿ“ˆ Comparison Tools +| `get_application` | ๐Ÿ“Š Get detailed application information | +| `get_jobs` | ๐Ÿ”— List jobs within an application | +| `compare_job_performance` | ๐Ÿ“ˆ Compare performance between applications | +| `compare_sql_execution_plans` | ๐Ÿ”Ž Compare SQL execution plans | +| `get_job_bottlenecks` | ๐Ÿšจ Identify performance bottlenecks | +| 
`get_slowest_jobs` | โฑ๏ธ Find slowest jobs in application | + +### Additional Tools (LlamaIndex/LangGraph HTTP Mode) | ๐Ÿ”ง Tool | ๐Ÿ“ Description | |---------|----------------| -| `compare_job_performance` | Compare performance metrics between two Spark jobs | -| `compare_job_environments` | Compare Spark environment configurations between two jobs | -| `compare_sql_execution_plans` | Compare SQL execution plans between two Spark jobs | +| `list_applications` | ๐Ÿ“‹ List Spark applications with filtering | +| `get_application_details` | ๐Ÿ“Š Get comprehensive application info | +| `get_stage_details` | โšก Analyze stage-level metrics | +| `get_task_details` | ๐ŸŽฏ Examine individual task performance | +| `get_executor_summary` | ๐Ÿ–ฅ๏ธ Review executor utilization | +| `get_application_environment` | โš™๏ธ Review Spark configuration | +| `get_storage_info` | ๐Ÿ’พ Analyze RDD storage usage | +| `get_sql_execution_details` | ๐Ÿ”Ž Deep dive into SQL queries | +| `get_resource_usage_timeline` | ๐Ÿ“ˆ Resource allocation over time | +| `compare_job_environments` | โš™๏ธ Compare Spark configurations | +| `get_slowest_stages` | โฑ๏ธ Find slowest stages | +| `get_task_metrics` | ๐Ÿ“Š Detailed task performance metrics | ## ๐Ÿš€ Production Deployment @@ -139,69 +131,13 @@ helm install spark-history-mcp ./deploy/kubernetes/helm/spark-history-mcp/ \ ๐Ÿ“š See [`deploy/kubernetes/helm/`](deploy/kubernetes/helm/) for complete deployment manifests and configuration options. -## ๐Ÿงช Testing & Development - -### ๐Ÿ”ฌ Local Development - -#### ๐Ÿ“‹ Prerequisites -- Install [Task](https://taskfile.dev/installation/) for running development commands: - ```bash - # macOS - brew install go-task - - # Other platforms - see https://taskfile.dev/installation/ - ``` - -*Note: uv will be automatically installed when you run `task install`* - -#### ๐Ÿš€ Development Commands - -**Quick Setup:** -```bash -# ๐Ÿ“ฆ Install dependencies and setup pre-commit hooks -task install -task pre-commit-install - -# ๐Ÿš€ Start services one by one (all in background) -task start-spark-bg # Start Spark History Server -task start-mcp-bg # Start MCP Server -task start-inspector-bg # Start MCP Inspector - -# ๐ŸŒ Then open http://localhost:6274 in your browser - -# ๐Ÿ›‘ When done, stop all services -task stop-all -``` - -**Essential Commands:** -```bash - -# ๐Ÿ›‘ Stop all background services -task stop-all - -# ๐Ÿงช Run tests and checks -task test # Run pytest -task lint # Check code style -task pre-commit # Run all pre-commit hooks -task validate # Run lint + tests - -# ๐Ÿ”ง Development utilities -task format # Auto-format code -task clean # Clean build artifacts -``` - -*For complete command reference, see `Taskfile.yml`* - -### ๐Ÿ“Š Sample Data +## ๐Ÿ“Š Sample Data The repository includes real Spark event logs for testing: - `spark-bcec39f6201b42b9925124595baad260` - โœ… Successful ETL job - `spark-110be3a8424d4a2789cb88134418217b` - ๐Ÿ”„ Data processing job - `spark-cc4d115f011443d787f03a71a476a745` - ๐Ÿ“ˆ Multi-stage analytics job -They are available in the [`examples/basic/events`](examples/basic/events) directory. -The [`start_local_spark_history.sh`](start_local_spark_history.sh) script automatically makes them available for local testing. 
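+To verify that the sample applications are loaded, you can query the Spark History Server REST API after `task start-spark-bg` (a quick check that assumes the default local port 18080):
+
+```bash
+curl http://localhost:18080/api/v1/applications
+# Should list the 3 sample applications above
+```
+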
- -๐Ÿ“– **Complete testing guide**: **[TESTING.md](TESTING.md)** +๐Ÿ“– **Advanced testing**: **[TESTING.md](TESTING.md)** ## โš™๏ธ Configuration @@ -229,12 +165,17 @@ MCP_DEBUG=false ## ๐Ÿค– AI Agent Integration -For production AI agent integration, see [`examples/integrations/`](examples/integrations/): +### Quick Start Options -- ๐Ÿฆ™ [LlamaIndex](examples/integrations/llamaindex.md) - Vector indexing and search -- ๐Ÿ”— [LangGraph](examples/integrations/langgraph.md) - Multi-agent workflows +| Integration | Transport | Entry Point | Best For | +|-------------|-----------|-------------|----------| +| **[Local Testing](TESTING.md)** | HTTP | `main.py` | Development, testing tools | +| **[Claude Desktop](examples/integrations/claude-desktop/)** | STDIO | `main_stdio.py` | Interactive analysis | +| **[Amazon Q CLI](examples/integrations/amazon-q-cli/)** | STDIO | `main_stdio.py` | Command-line automation | +| **[LlamaIndex](examples/integrations/llamaindex.md)** | HTTP | `main.py` | Knowledge systems, RAG | +| **[LangGraph](examples/integrations/langgraph.md)** | HTTP | `main.py` | Multi-agent workflows | -๐Ÿงช **For local testing and development, use [TESTING.md](TESTING.md) with MCP Inspector.** +**Note**: Claude Desktop and Amazon Q CLI use STDIO transport with 6 core tools. LlamaIndex/LangGraph use HTTP transport with 18 comprehensive tools. ## ๐ŸŽฏ Example Use Cases diff --git a/TESTING.md b/TESTING.md index fb3d1ed..a3c702e 100644 --- a/TESTING.md +++ b/TESTING.md @@ -2,32 +2,44 @@ ## ๐Ÿงช Quick Test with MCP Inspector (5 minutes) +**Use this for**: Local development, testing tools, understanding capabilities + ### Prerequisites - Docker must be running (for Spark History Server) - Node.js installed (for MCP Inspector) +- Python 3.12+ with uv package manager - Run commands from project root directory -### Setup (2 terminals) +### Setup Repository +```bash +git clone https://github.com/DeepDiagnostix-AI/spark-history-server-mcp.git +cd spark-history-server-mcp + +# Install Task (if not already installed) +brew install go-task # macOS +# or see https://taskfile.dev/installation/ for other platforms + +# Setup dependencies +task install +``` + +### Start Testing ```bash -# Terminal 1: Start Spark History Server with sample data -./start_local_spark_history.sh +# One-command setup (recommended) +task start-spark-bg && task start-mcp-bg && task start-inspector-bg -# Terminal 2: Start MCP server with Inspector -npx @modelcontextprotocol/inspector uv run main.py -# This will open http://localhost:6274 in your browser +# Opens http://localhost:6274 automatically in your browser +# When done: task stop-all ``` -### Alternative: Start MCP Server Separately +**Alternative** (if you prefer manual control): ```bash # Terminal 1: Start Spark History Server -./start_local_spark_history.sh - -# Terminal 2: Start MCP Server -uv run main.py +task start-spark -# Terminal 3: Start MCP Inspector (connects to existing MCP server) -DANGEROUSLY_OMIT_AUTH=true npx @modelcontextprotocol/inspector +# Terminal 2: Start MCP server with Inspector +task start-inspector ``` #### Expected Output from Terminal 1: @@ -43,9 +55,11 @@ DANGEROUSLY_OMIT_AUTH=true npx @modelcontextprotocol/inspector ### Test Applications Available Your 3 real Spark applications (all successful): -- `spark-bcec39f6201b42b9925124595baad260` -- `spark-110be3a8424d4a2789cb88134418217b` -- `spark-cc4d115f011443d787f03a71a476a745` +- `spark-bcec39f6201b42b9925124595baad260` - ETL job (104K events) +- `spark-110be3a8424d4a2789cb88134418217b` - 
Data processing job (512K events) +- `spark-cc4d115f011443d787f03a71a476a745` - Multi-stage analytics job (704K events) + +**Note**: Testing uses HTTP transport with `main.py` providing access to all 18 tools. ## ๐ŸŒ Using MCP Inspector @@ -75,63 +89,6 @@ Once the MCP Inspector opens in your browser (http://localhost:6274), you can: - `spark_id2` = `spark-110be3a8424d4a2789cb88134418217b` - **Expected**: Performance comparison metrics -## ๐Ÿ”ฌ Detailed Test Cases - -### 1. **Basic Connectivity** -```json -Tool: list_applications -Expected: 3 applications returned -``` - -### 2. **Job Environment Comparison** -```json -Tool: compare_job_environments -Parameters: { - "spark_id1": "spark-bcec39f6201b42b9925124595baad260", - "spark_id2": "spark-110be3a8424d4a2789cb88134418217b" -} -Expected: Configuration differences including: -- Runtime comparison (Java/Scala versions) -- Spark property differences -- System property differences -``` - -### 3. **Performance Comparison** -```json -Tool: compare_job_performance -Parameters: { - "spark_id1": "spark-bcec39f6201b42b9925124595baad260", - "spark_id2": "spark-cc4d115f011443d787f03a71a476a745" -} -Expected: Performance metrics including: -- Resource allocation comparison -- Executor metrics comparison -- Job performance ratios -``` - -### 4. **Bottleneck Analysis** -```json -Tool: get_job_bottlenecks -Parameters: { - "spark_id": "spark-cc4d115f011443d787f03a71a476a745" -} -Expected: Performance analysis with: -- Slowest stages identification -- Resource bottlenecks -- Optimization recommendations -``` - -### 5. **Resource Timeline** -```json -Tool: get_resource_usage_timeline -Parameters: { - "spark_id": "spark-bcec39f6201b42b9925124595baad260" -} -Expected: Timeline showing: -- Executor addition/removal events -- Stage execution timeline -- Resource utilization over time -``` ## โœ… Success Criteria diff --git a/Taskfile.yml b/Taskfile.yml index 34c2a49..da8c58e 100644 --- a/Taskfile.yml +++ b/Taskfile.yml @@ -76,13 +76,13 @@ tasks: pre-commit: desc: Run all pre-commit checks cmds: - - pre-commit run --all-files + - uv run pre-commit run --all-files - echo "โœ… Pre-commit checks completed!" pre-commit-install: desc: Install pre-commit hooks cmds: - - pre-commit install + - uv run pre-commit install - echo "โœ… Pre-commit hooks installed!" clean: diff --git a/examples/integrations/README.md b/examples/integrations/README.md deleted file mode 100644 index 795a592..0000000 --- a/examples/integrations/README.md +++ /dev/null @@ -1,107 +0,0 @@ -# AI Agent Integration Examples - -This directory contains comprehensive guides for integrating the Spark History Server MCP with various AI agent frameworks and platforms. - -## Available Integrations - -### ๐Ÿ”ง **Production AI Framework Integrations** -- **[LlamaIndex](llamaindex.md)** - RAG systems and knowledge bases for Spark data - - Vector indexing of Spark application data - - Query engines for performance analysis - - Real-time monitoring chat systems - - Custom embeddings for technical content - -- **[LangGraph](langgraph.md)** - Multi-agent workflows and state machines - - Complex analysis workflows - - Multi-agent monitoring systems - - Optimization recommendation pipelines - - State-based failure investigations - -## Quick Start Guide - -1. **Choose Your Platform**: Start with Claude Desktop for immediate interactive analysis -2. **Review Integration Guide**: Each guide includes complete setup instructions -3. **Test Locally**: Use the provided sample data and local Spark History Server -4. 
**Customize**: Adapt the examples to your specific use cases - -## Common Integration Patterns - -### **Interactive Analysis** -Perfect for ad-hoc investigation and exploration: -- Claude Desktop integration -- Jupyter notebook workflows -- Real-time query interfaces - -### **Automated Monitoring** -Ideal for production monitoring and alerting: -- LangChain monitoring agents -- Custom alerting systems -- Integration with existing monitoring tools - -### **Knowledge Systems** -Great for building organizational knowledge bases: -- LlamaIndex RAG systems -- Historical pattern analysis -- Performance regression detection - -### **Complex Workflows** -For sophisticated analysis pipelines: -- LangGraph state machines -- Multi-step optimization workflows -- Batch failure investigations - -## ๐Ÿงช Local Testing and Development - -For local testing and development, use the **MCP Inspector** instead of complex AI agent setups: - -- **[TESTING.md](../../TESTING.md)** - Complete guide for testing with MCP Inspector -- **Interactive Testing** - Use browser-based MCP Inspector for immediate tool testing -- **No Configuration Required** - Simple one-command setup for development - -The MCP Inspector provides the fastest way to test your MCP server locally before deploying to production with AI agents. - -## Best Practices - -### **Development** -1. Start with MCP Inspector for local testing -2. Use the sample Spark applications for development -3. Implement error handling and retries -4. Log all interactions for debugging - -### **Production** -1. Deploy using Kubernetes + Helm charts -2. Implement proper authentication -3. Add rate limiting and timeouts -4. Monitor agent performance and set up alerting - -### **Performance** -1. Batch API calls when possible -2. Cache frequently accessed data -3. Use appropriate similarity thresholds -4. Optimize query patterns - -## Sample Data - -All integration examples work with the provided sample data: -- **spark-bcec39f6201b42b9925124595baad260** - Successful ETL job -- **spark-110be3a8424d4a2789cb88134418217b** - Data processing job -- **spark-cc4d115f011443d787f03a71a476a745** - Multi-stage analytics job - -Use these applications to test your integrations before connecting to production data. - -## Contributing - -We welcome contributions to expand the integration examples: - -1. **New Framework Integrations**: Add support for additional AI frameworks -2. **Production Examples**: Share real-world deployment patterns -3. **Specialized Agents**: Contribute domain-specific analysis agents -4. **Best Practices**: Document lessons learned from production deployments - -See the main project [Contributing Guide](../../README.md#-contributing) for details. - -## Support - -- ๐Ÿ› **Issues**: [GitHub Issues](https://github.com/DeepDiagnostix-AI/spark-history-server-mcp/issues) -- ๐Ÿ’ก **Discussions**: [GitHub Discussions](https://github.com/DeepDiagnostix-AI/spark-history-server-mcp/discussions) -- ๐Ÿ“– **Documentation**: [Project Wiki](https://github.com/DeepDiagnostix-AI/spark-history-server-mcp/wiki) diff --git a/examples/integrations/amazon-q-cli/README.md b/examples/integrations/amazon-q-cli/README.md new file mode 100644 index 0000000..0aae468 --- /dev/null +++ b/examples/integrations/amazon-q-cli/README.md @@ -0,0 +1,93 @@ +# Amazon Q CLI Integration + +Connect Amazon Q CLI to Spark History Server for command-line Spark analysis. + +## Prerequisites + +1. 
**Clone and setup repository**: +```bash +git clone https://github.com/DeepDiagnostix-AI/spark-history-server-mcp.git +cd spark-history-server-mcp + +# Install Task (if not already installed) +brew install go-task # macOS +# or see https://taskfile.dev/installation/ for other platforms + +# Setup dependencies +task install +``` + +2. **Start Spark History Server with sample data**: +```bash +task start-spark-bg +# Starts server at http://localhost:18080 with 3 sample applications +``` + +3. **Verify setup**: +```bash +curl http://localhost:18080/api/v1/applications +# Should return 3 applications +``` + +## Setup + +1. **Add MCP server**: +```bash +q mcp add \ + --name spark-history-server-mcp \ + --command /Users/username/.local/bin/uv \ + --args "run,--project,/Users/username/spark-history-server-mcp,python,main_stdio.py" \ + --scope workspace +``` + +**โš ๏ธ Important**: +- Replace `/Users/username/.local/bin/uv` with output of `which uv` +- Replace `/Users/username/spark-history-server-mcp` with your actual repository path + +2. **Test connection**: `q chat --trust-all-tools` + +## Usage + +Start interactive session: +```bash +q chat --trust-all-tools +``` + +![amazon-q-cli](amazon-q-cli.png) + +Example query: +``` +Compare performance between spark-cc4d115f011443d787f03a71a476a745 and spark-110be3a8424d4a2789cb88134418217b +``` + +## Batch Analysis +```bash +echo "What are the bottlenecks in spark-cc4d115f011443d787f03a71a476a745?" | q chat --trust-all-tools +``` + +## Management +- List servers: `q mcp list` +- Remove: `q mcp remove --name spark-history-server-mcp` + +## Remote Spark History Server + +To connect to a remote Spark History Server, edit `config.yaml` in the repository: + +```yaml +servers: + production: + default: true + url: "https://spark-history-prod.company.com:18080" + auth: + username: "user" + password: "pass" +``` + +**Note**: Amazon Q CLI requires local MCP server execution. For remote MCP servers, consider: +- SSH tunnel: `ssh -L 18080:remote-server:18080 user@server` +- Deploy MCP server locally pointing to remote Spark History Server + +## Troubleshooting +- **Path errors**: Use full paths (`which uv`) +- **Tool issues**: Always use `--trust-all-tools` +- **Connection fails**: Check Spark History Server is running and accessible diff --git a/examples/integrations/amazon-q-cli/amazon-q-cli.png b/examples/integrations/amazon-q-cli/amazon-q-cli.png new file mode 100644 index 0000000..b92e8da Binary files /dev/null and b/examples/integrations/amazon-q-cli/amazon-q-cli.png differ diff --git a/examples/integrations/amazon-q-cli/amazon-q-shs-mcp-config.json b/examples/integrations/amazon-q-cli/amazon-q-shs-mcp-config.json new file mode 100644 index 0000000..a8ffe58 --- /dev/null +++ b/examples/integrations/amazon-q-cli/amazon-q-shs-mcp-config.json @@ -0,0 +1,18 @@ +{ + "mcpServers": { + "spark-history-server-mcp": { + "command": "/path/to/.local/bin/uv", + "args": [ + "run", + "--project", + "/path/to/spark-history-server-mcp", + "python", + "main_stdio.py" + ], + "env": {}, + "cwd": "/path/to/spark-history-server-mcp", + "timeout": 120000, + "disabled": false + } + } +} diff --git a/examples/integrations/claude-desktop/README.md b/examples/integrations/claude-desktop/README.md new file mode 100644 index 0000000..7c9f4af --- /dev/null +++ b/examples/integrations/claude-desktop/README.md @@ -0,0 +1,102 @@ +# Claude Desktop Integration + +Connect Claude Desktop to Spark History Server for AI-powered job analysis. + +## Prerequisites + +1. 
**Clone and setup repository**: +```bash +git clone https://github.com/DeepDiagnostix-AI/spark-history-server-mcp.git +cd spark-history-server-mcp + +# Install Task (if not already installed) +brew install go-task # macOS +# or see https://taskfile.dev/installation/ for other platforms + +# Setup dependencies +task install +``` + +2. **Start Spark History Server with sample data**: +```bash +task start-spark-bg +# Starts server at http://localhost:18080 with 3 sample applications +``` + +3. **Verify setup**: +```bash +curl http://localhost:18080/api/v1/applications +# Should return 3 applications +``` + +## Setup + +1. **Configure Claude Desktop** (`~/Library/Application Support/Claude/claude_desktop_config.json`): + +```json +{ + "mcpServers": { + "spark-history-server-mcp": { + "command": "uv", + "args": ["run", "--project", "/Users/username/spark-history-server-mcp", "python", "main_stdio.py"], + "cwd": "/Users/username/spark-history-server-mcp" + } + } +} +``` + +**โš ๏ธ Important**: Replace `/Users/username/spark-history-server-mcp` with your actual repository path. + +2. **Restart Claude Desktop** + +## Test Connection + +Ask Claude: "Are you connected to the Spark History Server? What tools are available?" + +You should see 6 core tools for Spark analysis: +- get_application, get_jobs, compare_job_performance, compare_sql_execution_plans, get_job_bottlenecks, get_slowest_jobs + +## Example Usage + +``` +Compare performance between these Spark applications: +- spark-cc4d115f011443d787f03a71a476a745 +- spark-110be3a8424d4a2789cb88134418217b + +Analyze execution times, bottlenecks, and provide optimization recommendations. +``` + +![claude-desktop](claude-desktop.png) + +## Available Tools + +- `get_application` - Application details +- `get_jobs` - Job information +- `compare_job_performance` - Performance comparison +- `compare_sql_execution_plans` - SQL plan analysis +- `get_job_bottlenecks` - Identify issues +- `get_slowest_jobs` - Find slow jobs + +## Remote Spark History Server + +To connect to a remote Spark History Server, edit `config.yaml` in the repository: + +```yaml +servers: + production: + default: true + url: "https://spark-history-prod.company.com:18080" + auth: + username: "user" + password: "pass" +``` + +**Note**: Claude Desktop requires local MCP server execution. 
For remote MCP servers, consider: +- SSH tunnel: `ssh -L 18080:remote-server:18080 user@server` +- Deploy MCP server locally pointing to remote Spark History Server + +## Troubleshooting + +- **Connection fails**: Check paths in config file +- **No tools**: Restart Claude Desktop +- **No apps found**: Ensure Spark History Server is running and accessible diff --git a/examples/integrations/claude-desktop/claude-desktop-shs-mcp-config.json b/examples/integrations/claude-desktop/claude-desktop-shs-mcp-config.json new file mode 100644 index 0000000..52660eb --- /dev/null +++ b/examples/integrations/claude-desktop/claude-desktop-shs-mcp-config.json @@ -0,0 +1,15 @@ +{ + "mcpServers": { + "spark-history-server-mcp": { + "command": "uv", + "args": [ + "run", + "--project", + "/path/to/spark-history-server-mcp", + "python", + "main_stdio.py" + ], + "cwd": "/path/to/spark-history-server-mcp" + } + } +} diff --git a/examples/integrations/claude-desktop/claude-desktop.png b/examples/integrations/claude-desktop/claude-desktop.png new file mode 100644 index 0000000..d867067 Binary files /dev/null and b/examples/integrations/claude-desktop/claude-desktop.png differ diff --git a/main_stdio.py b/main_stdio.py new file mode 100644 index 0000000..5792e46 --- /dev/null +++ b/main_stdio.py @@ -0,0 +1,254 @@ +"""Official MCP SDK entry point for STDIO transport (Claude Desktop, Amazon Q CLI).""" + +import asyncio +import json +import logging +import sys +from typing import Any + +from mcp.server import Server +from mcp.server.stdio import stdio_server +from mcp.types import Tool + +import tools +from config import Config +from spark_client import SparkRestClient + +# Configure logging to stderr for debugging +logging.basicConfig( + level=logging.DEBUG, + format="%(asctime)s - %(name)s - %(levelname)s - %(message)s", + stream=sys.stderr, +) +logger = logging.getLogger(__name__) + +# Global variables for clients +clients: dict[str, SparkRestClient] = {} +default_client: SparkRestClient | None = None + +server = Server("spark-history-server-mcp") + + +# Context class to mimic FastMCP context for compatibility with tools.py +class MockContext: + def __init__(self, clients_dict, default_client_instance): + self.request_context = MockRequestContext(clients_dict, default_client_instance) + + +class MockRequestContext: + def __init__(self, clients_dict, default_client_instance): + self.lifespan_context = MockLifespanContext( + clients_dict, default_client_instance + ) + + +class MockLifespanContext: + def __init__(self, clients_dict, default_client_instance): + self.clients = clients_dict + self.default_client = default_client_instance + + +# Monkey patch to provide context to tools.py functions +def get_mock_context(): + return MockContext(clients, default_client) + + +# Replace the mcp.get_context function used in tools.py +tools.mcp.get_context = get_mock_context + + +@server.list_tools() +async def list_tools() -> list[Tool]: + """List available tools.""" + return [ + Tool( + name="get_application", + description="Get detailed information about a specific Spark application", + inputSchema={ + "type": "object", + "properties": { + "spark_id": { + "type": "string", + "description": "Spark application ID", + }, + "server": {"type": "string", "description": "Optional server name"}, + }, + "required": ["spark_id"], + }, + ), + Tool( + name="get_jobs", + description="Get a list of all jobs for a Spark application", + inputSchema={ + "type": "object", + "properties": { + "spark_id": { + "type": "string", + "description": 
"Spark application ID", + }, + "server": {"type": "string", "description": "Optional server name"}, + "status": { + "type": "array", + "items": {"type": "string"}, + "description": "Optional job status filter", + }, + }, + "required": ["spark_id"], + }, + ), + Tool( + name="compare_job_performance", + description="Compare performance metrics between two Spark jobs", + inputSchema={ + "type": "object", + "properties": { + "spark_id1": { + "type": "string", + "description": "First Spark application ID", + }, + "spark_id2": { + "type": "string", + "description": "Second Spark application ID", + }, + "server": {"type": "string", "description": "Optional server name"}, + }, + "required": ["spark_id1", "spark_id2"], + }, + ), + Tool( + name="compare_sql_execution_plans", + description="Compare SQL execution plans between two Spark jobs", + inputSchema={ + "type": "object", + "properties": { + "spark_id1": { + "type": "string", + "description": "First Spark application ID", + }, + "spark_id2": { + "type": "string", + "description": "Second Spark application ID", + }, + "execution_id1": { + "type": "integer", + "description": "Optional execution ID for first app", + }, + "execution_id2": { + "type": "integer", + "description": "Optional execution ID for second app", + }, + "server": {"type": "string", "description": "Optional server name"}, + }, + "required": ["spark_id1", "spark_id2"], + }, + ), + Tool( + name="get_job_bottlenecks", + description="Identify performance bottlenecks in a Spark job", + inputSchema={ + "type": "object", + "properties": { + "spark_id": { + "type": "string", + "description": "Spark application ID", + }, + "server": {"type": "string", "description": "Optional server name"}, + "top_n": { + "type": "integer", + "description": "Number of bottlenecks to return", + "default": 5, + }, + }, + "required": ["spark_id"], + }, + ), + Tool( + name="get_slowest_jobs", + description="Get the N slowest jobs for a Spark application", + inputSchema={ + "type": "object", + "properties": { + "spark_id": { + "type": "string", + "description": "Spark application ID", + }, + "server": {"type": "string", "description": "Optional server name"}, + "include_running": { + "type": "boolean", + "description": "Include running jobs", + "default": False, + }, + "n": { + "type": "integer", + "description": "Number of jobs to return", + "default": 5, + }, + }, + "required": ["spark_id"], + }, + ), + ] + + +@server.call_tool() +async def call_tool(name: str, arguments: dict[str, Any]) -> list[dict]: + """Handle tool calls.""" + try: + # Convert result to JSON string for consistent output + def format_result(result): + try: + from app import DateTimeEncoder + + return json.dumps(result, cls=DateTimeEncoder, indent=2, default=str) + except Exception: + return str(result) + + # Map tool names to functions from tools.py + tool_functions = { + "get_application": tools.get_application, + "get_jobs": tools.get_jobs, + "compare_job_performance": tools.compare_job_performance, + "compare_sql_execution_plans": tools.compare_sql_execution_plans, + "get_job_bottlenecks": tools.get_job_bottlenecks, + "get_slowest_jobs": tools.get_slowest_jobs, + } + + if name in tool_functions: + func = tool_functions[name] + result = func(**arguments) + return [{"type": "text", "text": format_result(result)}] + else: + return [{"type": "text", "text": f"Unknown tool: {name}"}] + + except Exception as e: + logger.error(f"Tool call error for {name}: {e}") + return [{"type": "text", "text": f"Error calling {name}: {str(e)}"}] + + 
+async def main(): + """Main entry point.""" + global clients, default_client + + try: + logger.info("Loading configuration...") + config = Config.from_file("config.yaml") + + logger.info("Initializing clients...") + for name, server_config in config.servers.items(): + clients[name] = SparkRestClient(server_config) + if server_config.default: + default_client = clients[name] + + logger.info("Starting MCP server with stdio transport...") + async with stdio_server() as (read_stream, write_stream): + await server.run( + read_stream, write_stream, server.create_initialization_options() + ) + + except Exception as e: + logger.error(f"Failed to start MCP server: {e}") + sys.exit(1) + + +if __name__ == "__main__": + asyncio.run(main())