Merged
4 changes: 2 additions & 2 deletions .github/pull_request_template.md
@@ -14,15 +14,15 @@ Brief description of changes and motivation.

## πŸ§ͺ Testing
<!-- Describe how you tested your changes -->
-- [ ] βœ… All existing tests pass (`uv run pytest`)
+- [ ] βœ… All existing tests pass (`task test`)
- [ ] πŸ”¬ Tested with MCP Inspector
- [ ] πŸ“Š Tested with sample Spark data
- [ ] πŸš€ Tested with real Spark History Server (if applicable)

### πŸ”¬ Test Commands Run
```bash
# Example:
-# uv run pytest test_tools.py -v
+# task test
# npx @modelcontextprotocol/inspector uv run main.py
```

65 changes: 0 additions & 65 deletions Makefile

This file was deleted.

101 changes: 80 additions & 21 deletions README.md
@@ -80,25 +80,45 @@ npx @modelcontextprotocol/inspector

### ⚑ Job Performance Comparison
![Job Comparison](screenshots/job-compare.png)
*Compare performance metrics between different Spark jobs*
![alt text](job-compare.png)


## πŸ› οΈ Available Tools

### πŸ“Š Application & Job Analysis
| πŸ”§ Tool | πŸ“ Description |
|---------|----------------|
| `get_application` | Get detailed information about a specific Spark application |
| `get_jobs` | Get a list of all jobs for a Spark application |
| `get_slowest_jobs` | Get the N slowest jobs for a Spark application |

### ⚑ Stage & Task Analysis
| πŸ”§ Tool | πŸ“ Description |
|---------|----------------|
| `get_stages` | Get a list of all stages for a Spark application |
| `get_slowest_stages` | Get the N slowest stages for a Spark application |
| `get_stage` | Get information about a specific stage |
| `get_stage_task_summary` | Get task metrics summary for a specific stage |

### πŸ–₯️ Executor & Resource Analysis
| πŸ”§ Tool | πŸ“ Description |
|---------|----------------|
-| `list_applications` | πŸ“‹ List Spark applications with filtering |
-| `get_application_details` | πŸ“Š Get comprehensive application info |
-| `get_application_jobs` | πŸ”— List jobs within an application |
-| `get_job_details` | πŸ” Get detailed job information |
-| `get_stage_details` | ⚑ Analyze stage-level metrics |
-| `get_task_details` | 🎯 Examine individual task performance |
-| `get_executor_summary` | πŸ–₯️ Review executor utilization |
-| `compare_job_performance` | πŸ“ˆ Compare multiple jobs |
-| `get_application_environment` | βš™οΈ Review Spark configuration |
-| `get_storage_info` | πŸ’Ύ Analyze RDD storage usage |
-| `get_sql_execution_details` | πŸ”Ž Deep dive into SQL queries |
+| `get_executors` | Get executor information for an application |
+| `get_executor` | Get information about a specific executor |
+| `get_executor_summary` | Get aggregated metrics across all executors |
+| `get_resource_usage_timeline` | Get resource usage timeline for an application |

### πŸ” SQL & Performance Analysis
| πŸ”§ Tool | πŸ“ Description |
|---------|----------------|
| `get_slowest_sql_queries` | Get the top N slowest SQL queries for an application |
| `get_job_bottlenecks` | Identify performance bottlenecks in a Spark job |
| `get_environment` | Get comprehensive Spark runtime configuration |

### πŸ“ˆ Comparison Tools
| πŸ”§ Tool | πŸ“ Description |
|---------|----------------|
| `compare_job_performance` | Compare performance metrics between two Spark jobs |
| `compare_job_environments` | Compare Spark environment configurations between two jobs |
| `compare_sql_execution_plans` | Compare SQL execution plans between two Spark jobs |
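Under the hood, an MCP client invokes any of the tools above with a JSON-RPC `tools/call` request, as defined by the Model Context Protocol specification. The sketch below builds such a payload for `get_slowest_jobs`; the argument names (`app_id`, `n`) are illustrative assumptions, not the server's actual input schema — check the tool's declared schema via `tools/list`.

```python
import json

# Minimal sketch of the JSON-RPC 2.0 request an MCP client sends to
# invoke a tool. Tool name comes from the tables above; the argument
# names ("app_id", "n") are assumptions for illustration only.
request = {
    "jsonrpc": "2.0",
    "id": 1,
    "method": "tools/call",
    "params": {
        "name": "get_slowest_jobs",
        "arguments": {
            "app_id": "spark-bcec39f6201b42b9925124595baad260",
            "n": 5,
        },
    },
}
print(json.dumps(request, indent=2))
```

In practice the MCP Inspector (or any MCP-capable agent) constructs this request for you; the sample `app_id` here is the bundled event log mentioned under Sample Data below.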

## πŸš€ Production Deployment

@@ -122,17 +142,56 @@ helm install spark-history-mcp ./deploy/kubernetes/helm/spark-history-mcp/ \
## πŸ§ͺ Testing & Development

### πŸ”¬ Local Development

#### πŸ“‹ Prerequisites
- Install [Task](https://taskfile.dev/installation/) for running development commands:
```bash
# macOS
brew install go-task

# Other platforms - see https://taskfile.dev/installation/
```

*Note: uv will be automatically installed when you run `task install`*
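
To make that note concrete, a `task install` target can bootstrap uv before syncing dependencies. The fragment below is a hypothetical sketch of what such a Taskfile entry might look like — the authoritative definitions live in the repository's own `Taskfile.yml`:

```yaml
version: '3'

tasks:
  install:
    desc: Install uv if missing, then sync project dependencies
    cmds:
      # Assumed bootstrap step: install uv via its official installer
      - command -v uv >/dev/null 2>&1 || curl -LsSf https://astral.sh/uv/install.sh | sh
      - uv sync
```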

#### πŸš€ Development Commands

**Quick Setup:**
```bash
-# πŸ”₯ Start local Spark History Server with sample data
-./start_local_spark_history.sh
+# πŸ“¦ Install dependencies and setup pre-commit hooks
+task install
+task pre-commit-install

-# ⚑ Start MCP server
-uv run main.py
+# πŸš€ Start services one by one (all in background)
+task start-spark-bg      # Start Spark History Server
+task start-mcp-bg        # Start MCP Server
+task start-inspector-bg  # Start MCP Inspector
+
+# 🌐 Then open http://localhost:6274 in your browser

-# 🌐 Test with MCP Inspector
-npx @modelcontextprotocol/inspector uv run main.py
+# πŸ›‘ When done, stop all services
+task stop-all
```

**Essential Commands:**
```bash
# πŸ›‘ Stop all background services
task stop-all

# πŸ§ͺ Run tests and checks
task test # Run pytest
task lint # Check code style
task pre-commit # Run all pre-commit hooks
task validate # Run lint + tests

# πŸ”§ Development utilities
task format # Auto-format code
task clean # Clean build artifacts
```

*For complete command reference, see `Taskfile.yml`*

### πŸ“Š Sample Data
The repository includes real Spark event logs for testing:
- `spark-bcec39f6201b42b9925124595baad260` - βœ… Successful ETL job
@@ -215,7 +274,7 @@ For production AI agent integration, see [`examples/integrations/`](examples/int
1. 🍴 Fork the repository
2. 🌿 Create feature branch: `git checkout -b feature/new-tool`
3. πŸ§ͺ Add tests for new functionality
-4. βœ… Run tests: `uv run pytest`
+4. βœ… Run tests: `task test`
5. πŸ“€ Submit pull request

## πŸ“„ License