CLI for turning research papers into executable code
MathPilot is an intelligent agent capable of transforming natural language requests into fully functional, executable Python projects. It bridges the gap between academic research and practical implementation by automating the discovery, understanding, and coding of complex algorithms from scientific papers.
- 🔍 arXiv Paper Search: Instantly find relevant research papers using natural language queries.
- 🧠 Intelligent Parsing: Extracts core algorithms, methods, and equations from PDFs using advanced LLMs (Gemini Pro, Claude 3, etc.).
- 📋 Automated Planning: Generates structured, step-by-step implementation plans (workflows) derived directly from the paper's methodology.
- 💻 Code Generation: Produces high-quality, documented Python code for each step of the workflow, including verification tests.
- 📂 Project Management: Automatically creates organized project directories with requirements.txt, source code, and data folders.
- 🛡️ Safe Execution: Runs generated code in a controlled environment to verify correctness (supports sandboxing).
- ⚡ Interactive CLI: A rich terminal interface for seamless searching, planning, and coding.
Get up and running in minutes.
- Python 3.7+ (3.12+ Recommended)
- A Gemini API key (for parsing and code generation), or an Anthropic or Groq API key.
# Clone the repository
git clone https://github.com/yourusername/mathpilot.git
cd mathpilot
# Install in editable mode
pip install -e .

Verify your installation by implementing a classic algorithm:
# Set your API key
export GEMINI_API_KEY="your_api_key_here"
# Run the end-to-end test
mathpilot implement "linear regression" --execute

This command will:
- Search arXiv for "linear regression".
- Download and parse a relevant paper.
- Generate an implementation plan.
- Write the Python code.
- Execute the result!
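To make the first step above concrete, here is a minimal sketch of an arXiv search using only the standard library. It is not MathPilot's own search code: the function names `build_search_url` and `parse_entries` are hypothetical, and the only external detail relied on is the public arXiv API endpoint and its Atom response format.

```python
import urllib.parse
import xml.etree.ElementTree as ET
from typing import Dict, List

ARXIV_API = "http://export.arxiv.org/api/query"  # public arXiv API endpoint
ATOM_NS = {"atom": "http://www.w3.org/2005/Atom"}


def build_search_url(query: str, max_results: int = 5) -> str:
    """Build an arXiv API URL that searches all fields for `query`."""
    params = {
        "search_query": f"all:{query}",
        "start": 0,
        "max_results": max_results,
    }
    return f"{ARXIV_API}?{urllib.parse.urlencode(params)}"


def parse_entries(atom_xml: str) -> List[Dict[str, str]]:
    """Extract the arXiv ID and title of each entry in an Atom feed."""
    root = ET.fromstring(atom_xml)
    entries = []
    for entry in root.findall("atom:entry", ATOM_NS):
        abs_url = entry.findtext("atom:id", default="", namespaces=ATOM_NS)
        title = entry.findtext("atom:title", default="", namespaces=ATOM_NS)
        entries.append({
            # The entry id is a URL such as http://arxiv.org/abs/<id>.
            "arxiv_id": abs_url.rsplit("/", 1)[-1],
            # Titles may be wrapped across lines; collapse the whitespace.
            "title": " ".join(title.split()),
        })
    return entries
```

Fetching the URL (for example with `urllib.request.urlopen`) and passing the response body to `parse_entries` would give a list of candidate papers to choose from.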
MathPilot offers a powerful Command Line Interface (CLI) built with Typer and Rich.
The easiest way to use MathPilot is the interactive REPL.
mathpilot interactive

Follow the on-screen prompts to search for papers, browse local PDFs, and generate code.
Find papers on arXiv without leaving your terminal.
mathpilot search "kalman filter sensor fusion" --max-results 5

Create an implementation plan from a specific paper (by arXiv ID or local PDF).
# From arXiv ID
mathpilot plan 2103.12345 --output plan.json
# From local PDF
mathpilot plan ./papers/my_research.pdf

Turn a plan into a working project.
mathpilot generate plan.json --project-name my_kalman_filter

This creates a folder ~/mathpilot_projects/my_kalman_filter with the generated source code.
The implement command combines all steps into one.
mathpilot implement "implement the algorithm from this paper" --paper-id 2103.12345

MathPilot is configurable via environment variables or a .mathpilot.yaml file in your home directory.
| Variable | Description |
|---|---|
| GEMINI_API_KEY | Required. API key for Google Gemini models. |
| ANTHROPIC_API_KEY | Optional. For using Claude models. |
| GROQ_API_KEY | Optional. For using Groq-hosted open source models. |
| LLM_PROVIDER | Default LLM provider (e.g., gemini, anthropic, groq). |
| LLM_MODEL | Specific model name (e.g., gemini-1.5-pro-latest). |
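As an illustration of how these variables might fit together, the sketch below picks the provider, model, and matching API key from the environment. It is an assumption-laden example, not MathPilot's configuration loader: the `resolve_llm_settings` function, the `KEY_VARS` mapping, and the choice of gemini as the fallback provider are all hypothetical.

```python
import os
from typing import Mapping, Optional, Tuple

# Hypothetical mapping from provider name to the variable that holds its key.
KEY_VARS = {
    "gemini": "GEMINI_API_KEY",
    "anthropic": "ANTHROPIC_API_KEY",
    "groq": "GROQ_API_KEY",
}


def resolve_llm_settings(
    env: Mapping[str, str] = os.environ,
) -> Tuple[str, Optional[str], str]:
    """Return (provider, model, api_key) based on the given environment."""
    # Assumption: fall back to gemini when LLM_PROVIDER is not set.
    provider = env.get("LLM_PROVIDER", "gemini").lower()
    if provider not in KEY_VARS:
        raise ValueError(f"Unknown LLM provider: {provider!r}")

    key_var = KEY_VARS[provider]
    api_key = env.get(key_var)
    if not api_key:
        raise RuntimeError(f"{key_var} is not set")

    # LLM_MODEL is optional; None means "use the provider's default model".
    return provider, env.get("LLM_MODEL"), api_key
```

Passing a plain dictionary instead of `os.environ` makes the lookup easy to exercise without touching the real environment.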
llm:
  provider: gemini
  model: gemini-1.5-pro-latest
arxiv:
  cache_dir: ~/.mathpilot/cache
  max_results: 10
executor:
  sandbox: true
  timeout_seconds: 300

When MathPilot generates a project, it creates a clean, standard structure:
~/mathpilot_projects/my_project/
├── workflow.yaml # The implementation plan metadata
├── requirements.txt # Detected Python dependencies
├── src/ # Source code directory
│ ├── __init__.py
│ ├── main.py # Entry point
│ ├── step_01_setup.py # Modular implementation steps
│ ├── step_02_data.py
│ └── ...
├── tests/ # Generated tests (if requested)
├── data/ # Data storage
└── logs/ # Execution logs
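For readers who want to reproduce this layout by hand, the following sketch creates the same skeleton with `pathlib`. The `scaffold_project` helper is hypothetical and only creates empty files; it does not reflect how MathPilot itself writes projects.

```python
from pathlib import Path
from typing import Iterable


def scaffold_project(root: Path, step_names: Iterable[str]) -> Path:
    """Create an empty project skeleton matching the layout shown above."""
    for sub in ("src", "tests", "data", "logs"):
        (root / sub).mkdir(parents=True, exist_ok=True)

    # Top-level metadata files.
    (root / "workflow.yaml").touch()
    (root / "requirements.txt").touch()

    # Package marker and entry point.
    (root / "src" / "__init__.py").touch()
    (root / "src" / "main.py").touch()

    # One module per workflow step, numbered from 01.
    for index, name in enumerate(step_names, start=1):
        (root / "src" / f"step_{index:02d}_{name}.py").touch()

    return root
```

For example, `scaffold_project(Path.home() / "mathpilot_projects" / "my_project", ["setup", "data"])` would produce `step_01_setup.py` and `step_02_data.py` under `src/`.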
- mathpilot.search: Handles arXiv API interactions and caching.
- mathpilot.parser: Uses Vision-Language Models (VLMs) to "read" PDFs and extract algorithmic details.
- mathpilot.planner: Breaks down complex algorithms into logical coding tasks.
- mathpilot.generator: The coding engine. Writes modular, documented Python code.
- mathpilot.executor: Safely runs generated code and captures output/errors.
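To give a feel for what the executor's job involves, here is a minimal sketch that runs a script in a child Python process with a timeout and captures its output. The `run_script` function and its return shape are hypothetical, and a bare subprocess is not a sandbox: it only isolates the process and bounds its runtime, so it should not be read as MathPilot's actual isolation mechanism.

```python
import subprocess
import sys
from typing import Any, Dict


def run_script(path: str, timeout_seconds: float = 300) -> Dict[str, Any]:
    """Run a Python script in a child process and capture its output."""
    try:
        proc = subprocess.run(
            [sys.executable, str(path)],
            capture_output=True,   # collect stdout and stderr
            text=True,             # decode output as str
            timeout=timeout_seconds,
        )
    except subprocess.TimeoutExpired:
        # The child is killed by subprocess.run when the timeout expires.
        return {
            "ok": False,
            "returncode": None,
            "stdout": "",
            "stderr": f"timed out after {timeout_seconds}s",
        }
    return {
        "ok": proc.returncode == 0,
        "returncode": proc.returncode,
        "stdout": proc.stdout,
        "stderr": proc.stderr,
    }
```

The default of 300 seconds mirrors the `timeout_seconds` value in the example configuration; a real sandbox would additionally restrict filesystem and network access.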
We welcome contributions!
# Install dev dependencies
pip install -e ".[dev,llm]"
# Install pre-commit hooks (optional but recommended)
pre-commit install

# Run the full test suite
pytest
# Run specific tests
pytest tests/test_search.py

# Format code
black mathpilot tests
# Check types
mypy mathpilot

Q: "Paper not found" error?
A: Try using the exact arXiv title in quotes or the specific arXiv ID (e.g., 2103.12345).
Q: API Errors / Rate Limits?
A: Ensure your GEMINI_API_KEY is set and valid. If using the free tier, you may hit rate limits; wait a minute and try again.
Q: Generated code has bugs?
A: MathPilot creates starter code. While often functional, complex algorithms may require manual fine-tuning. Check the src/ files and debug as you would any other project.
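For the rate-limit case mentioned above, "wait a minute and try again" can be automated with a retry loop. The sketch below shows generic exponential backoff; `call_with_backoff` is a hypothetical helper, not part of MathPilot, and which exception types signal a rate limit depends on the LLM client in use.

```python
import time
from typing import Callable, Tuple, Type, TypeVar

T = TypeVar("T")


def call_with_backoff(
    func: Callable[[], T],
    max_attempts: int = 4,
    base_delay: float = 1.0,
    retry_on: Tuple[Type[BaseException], ...] = (Exception,),
    sleep: Callable[[float], None] = time.sleep,
) -> T:
    """Call `func`, retrying with exponentially growing delays on failure."""
    if max_attempts < 1:
        raise ValueError("max_attempts must be at least 1")

    for attempt in range(max_attempts):
        try:
            return func()
        except retry_on:
            # Give up and re-raise once the last attempt has failed.
            if attempt == max_attempts - 1:
                raise
            # Delays grow as base_delay * 1, 2, 4, ...
            sleep(base_delay * (2 ** attempt))

    raise AssertionError("unreachable")
```

Narrowing `retry_on` to the client's rate-limit exception avoids retrying errors, such as an invalid API key, that will never succeed.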
This project is licensed under the MIT License. See the LICENSE file for details.