Give your AI coding agent ML engineering superpowers.
SuperML adds two things your coding agent doesn't have:
ML Pipeline: Seven skills that encode the workflow you already follow. Plan against real framework docs. Catch config mistakes before they cost you GPU hours. Debug OOM, NaN, and divergence by root cause, not by guessing. Get ranked next steps when metrics plateau. An agentic experiment memory carries hypotheses, results, and lessons across sessions — your agent stops repeating failed experiments and starts compounding what works.
Memory: Backed by Leeroopedia, 27k+ pages across 1000+ ML/AI frameworks. Config references, debugging heuristics, implementation patterns, and battle-tested defaults from vLLM to DeepSpeed to LangChain. Built by the Leeroo continuous learning system, structured as a browsable wiki, and continuously updated by AI and human engineers. When your agent recommends a config, it points to the page it learned it from.
Works with Claude Code, Cursor, Codex, OpenCode, and Gemini CLI.
- A session hook loads automatically, with zero setup per conversation.
- Skills guide the ML workflow: verify before launch, debug by root cause, iterate on results, and track what worked.
- MCP tools connect to the Leeroopedia knowledge base, so your agent looks things up and cites real docs instead of guessing.
- A persistent ML agent (ml-expert) handles deeper tasks and remembers your hardware, experiments, and lessons across sessions.
We gave 38 ML tasks to Claude Code — once with SuperML, once without — and had an independent LLM judge rate both. Each response is scored out of 15 across correctness, specificity, mistake prevention, actionability, and grounding. Tasks cover QLoRA fine-tuning, distributed training, LLM inference optimization, alignment (DPO/GRPO), RAG pipelines, model merging, quantization, and more.
| | With SuperML | Without |
|---|---|---|
| ML task average | 13.2 / 15 | 8.3 / 15 |
| ML task win rate | 91% | 9% |
See TESTED_TASKS.md for the full list of tasks and scores.
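The rubric arithmetic can be sketched as below. This is a minimal illustration, assuming each of the five criteria is an integer from 0 to 3 (the text only states the 15-point total, so the per-criterion scale is an assumption). `rubric_total` is a hypothetical helper, and the sample scores are made up rather than taken from TESTED_TASKS.md.

```shell
# Sum five rubric criteria into a score out of 15.
# Assumption: each criterion is an integer from 0 to 3.
rubric_total() {
  # args: correctness specificity mistake_prevention actionability grounding
  if [ "$#" -ne 5 ]; then
    echo "usage: rubric_total C S M A G" >&2
    return 1
  fi
  total=0
  for score in "$@"; do
    case "$score" in
      0|1|2|3) total=$((total + score)) ;;
      *) echo "score out of range: $score" >&2; return 1 ;;
    esac
  done
  echo "$total"
}

# Illustrative values only, not taken from the benchmark.
rubric_total 3 3 2 3 2
```

The range check rejects anything outside the assumed 0 to 3 scale, so a mistyped score fails loudly instead of inflating the total.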
The plugin works without an API key — skills use web search to ground answers. With a key, your agent gets access to the Leeroopedia knowledge base (27k+ pages, faster and more precise lookups). The plugin will tell you if it's running without a key.
To get a key: app.leeroopedia.com — $20 free credit on signup, no credit card.
export LEEROOPEDIA_API_KEY=kpsk_your_key_here
Add to your shell profile (~/.bashrc, ~/.zshrc) so it persists.
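Before starting a session, you may want to confirm that the key is actually visible to your shell. The following is a small sketch using only standard shell features. `leeroo_key_status` is a hypothetical helper name, not something the plugin ships.

```shell
# Report whether LEEROOPEDIA_API_KEY is set in the current shell.
# leeroo_key_status is a hypothetical helper, not part of SuperML.
leeroo_key_status() {
  if [ -n "${LEEROOPEDIA_API_KEY:-}" ]; then
    echo "set"
  else
    echo "missing"
  fi
}

leeroo_key_status
```

If this prints `missing` in a new terminal, the export line has probably not been added to the profile file your shell reads.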
In Claude Code, register the marketplace, then install the plugin:
/plugin marketplace add leeroo-ai/leeroo-marketplace
/plugin install superml@leeroo-marketplace
Or install directly from GitHub:
claude plugin add --from-github leeroo-ai/superml
In Cursor Agent chat (waiting for Cursor team approval):
/add-plugin superml
Or clone into your project — Cursor auto-detects .cursor-plugin/plugin.json:
git clone https://github.com/leeroo-ai/superml.git
For Codex, see .codex/INSTALL.md.
For OpenCode, see .opencode/INSTALL.md.
For Gemini CLI:
git clone https://github.com/leeroo-ai/superml.git
gemini extension add ./superml/gemini-extension.json
If you just want the knowledge base without the full plugin, see leeroopedia-mcp for setup instructions.
You get the MCP tools (memory) but not the workflow skills (process).
Start a conversation and try something like:
I'm fine-tuning Llama 3.1 8B on 50k instruction pairs with 1xA100 80GB.
Set up the full training config — QLoRA, proper chat template, loss masking on prompts.
If it's working, your agent will ground its answer in documentation (KB citations or web sources), catch common pitfalls before they waste a training run, and give you a runnable config.
| Skill | What it does |
|---|---|
| ml-plan | Plan training runs, architectures, and multi-step pipelines |
| ml-verify | Check configs, code, and math before you burn GPU hours |
| ml-debug | Debug OOM, NaN, divergence, crashes, bad throughput |
| ml-iterate | Ranked next steps when results aren't where you want them |
| ml-experiment | Track experiments — hypotheses, results, and learnings across sessions |
| ml-research | Deep-dive into ML topics, compare approaches, survey frameworks |
| using-superml | Loaded at session start — wires up skills to KB tools and sets quality standards |
ml-expert: a persistent ML engineer agent for the bigger stuff, such as pipeline reviews, deep analysis, and framework deep-dives. It remembers your hardware setup, past experiments, and lessons learned across sessions.
SuperML is integrated in our enterprise platform — forecasting & planning, fraud & anomaly detection, customer analytics, recommendation systems, document intelligence, and customer service automation.
See CONTRIBUTING.md for how to report bugs, suggest improvements, and submit PRs.
- Leeroopedia: the ML/AI knowledge base behind the memory
- leeroopedia-mcp: MCP server repo
- Leeroo: the team behind SuperML