Use Claude Code's UltraCode mode (xhigh effort + the Workflow/deep-reasoning
harness) with any model you already pay for — pick it live from the /model
menu.
One icon. Open Claude Code, type /model, and choose any backend you've set up —
all running with the full UltraCode harness. Your normal Claude Code install is
left untouched.
The example config ships ready-to-use entries for GPT‑5.5 (Codex login),
MiniMax‑M3, MiMo v2.5 Pro, DeepSeek V4 Pro/Flash, Step Flash,
Ollama Cloud, OpenCode Go, OpenRouter, and local models — keep
the ones you have a plan for, delete the rest. (Cursor's Composer needs the
cursor-agent CLI and isn't HTTP-based — see
docs/ADD_A_MODEL.md.)
How is this possible? At the API level, "UltraCode" is just
effort=xhigh+ adaptive thinking + a bigmax_tokens+ one system reminder — there is no secret model. The proxy adds that envelope to every request, so any backend gets the UltraCode treatment. Full breakdown (with the reverse‑engineering evidence) in docs/HOW_IT_WORKS.md.
UltraCode shines on long, autonomous runs — deep reasoning, multi-step Workflows, multi-agent fan-out. The catch with any "route to a third-party backend" shim is that those backends occasionally hiccup, and on a 40-minute agent run a single unhandled hiccup can wedge the whole session. We hardened the proxy against the three failure modes we actually hit in production, so it keeps going instead of stalling:
- 🔁 Empty turns auto-retry. A backend that returns a turn with no text and no tool call (a transient blip, or a budget-exhausted reasoning turn at high effort) is transparently re-issued. It buffers only until the first real token, so a normal turn adds zero latency and output is never duplicated — and it never retries after real output or a fatal error.
- ⏱️ A stalled stream can't freeze the run. If a GPT‑5.5/codex stream opens and then goes silent mid-turn, a bounded idle timeout turns the stall into a quick retry instead of a ~10-minute hang — so one stuck sub-agent no longer freezes an entire multi-agent / dynamic-workflow run.
- 🛠️ Rejecting a tool call just works. Declining (or skipping) a tool mid-run no longer 400s strict backends like DeepSeek — the proxy repairs the tool-call sequence and synthesizes a stub reply for anything you didn't answer, including partial parallel calls. (#3)
All three are tunable via env vars and locked down by the offline self-test in CI. Details and knobs: docs/HOW_IT_WORKS.md → Reliability.
There's a ready-to-run scenario in examples/demo/ — a buggy
little Game of Life. Launch UltraCode there, pick any model, enable auto mode,
and paste the prompt: it fixes the bug, adds an
animated color renderer + starting patterns, and runs its own self-test, ending
on a glider crawling across the screen.
Verified live against real backends: GPT‑5.5 (Codex login) and Cursor Composer, plus an offline self-test that runs in CI on Linux/Windows × Python 3.8/3.12.
- Claude Code CLI with UltraCode access (
npm i -g @anthropic-ai/claude-code). - Python 3.8+ (standard library only — there is nothing to
pip install). - At least one backend credential, e.g. an API key (MiMo / OpenRouter / OpenAI /
a local server) and/or a
codex loginfor GPT‑5.5. You only set up the ones you have.
Tested on Windows 11 (no WSL needed). macOS/Linux/WSL work too via bin/ultracode.
git clone https://github.com/OnlyTerp/UltraCode-Shim.git
cd UltraCode-Shim
# 1. Sanity-check your machine and config (safe to run anytime)
python scripts\doctor.py
# 2. Tell it which models you want (see "Configure your models" below)
# Copy config.example.json to config.json, keep the models you have,
# and put your keys in it (config.json is gitignored).
copy config.example.json config.json
# 3. Create Desktop icons (one for UltraCode, one for normal Claude Code)
.\windows\Install-DesktopIcons.ps1
# 4. Double-click "UltraCode (All Models)" — then type /model and pick a backend.Run python3 scripts/doctor.py then ./bin/ultracode.
(The launchers copy config.example.json → config.json for you on first run if
you skip step 2.)
Everything is in one file: config.json (copied from config.example.json).
It has two sections you edit:
models— what shows up in the/modelmenu. Everyidmust start withclaudeoranthropic(Claude Code filters the rest out).routes— where each of those ids actually goes. The route key must match the modelid.
Example — MiMo and an OpenRouter model:
Put your key right in config.json (it's gitignored) or use ${ENV_VAR} and
export it — or drop keys into a gitignored ultracode.env the launchers load.
Route types:
type |
Use for | Needs |
|---|---|---|
| (omit) | Real Claude or any Anthropic-compatible endpoint | nothing, or auth/upstream |
openai_compat |
MiMo, DeepSeek, OpenRouter, OpenAI, Ollama, local llama.cpp — anything that speaks OpenAI Chat Completions (tools included) | an API key |
codex_oauth |
GPT‑5.5 via a ChatGPT/Codex login (no API key) | codex login once |
cursor_agent |
Cursor Composer (experimental) | cursor-agent login |
Reasoning models (MiniMax‑M3, etc.): an
openai_compatroute can carry a"body": { ... }dict of extra params merged into every request. MiniMax‑M3 needs"body": { "reasoning_split": true }so its<think>chain‑of‑thought is returned separately instead of leaking into the visible answer — the shipped example already sets this. See docs/ADD_A_MODEL.md.
Full walkthrough: docs/ADD_A_MODEL.md.
Yes. The UltraCode launcher only sets environment variables for the launched
process and uses a session-scoped --settings file. It never edits your global
Claude config or credentials. The installer also gives you a "Claude Code (Normal)"
icon, so you can always start the plain version. Remove everything with
windows\Uninstall.ps1.
This repo is built so you can hand it to an assistant. Point it at AGENTS.md — that's a step-by-step runbook (install → configure → test → troubleshoot) written for an AI to follow.
| Doc | What |
|---|---|
| AGENTS.md | Runbook for an AI assistant to install/configure/test |
| docs/SETUP.md | Human setup guide (Windows + macOS/Linux) |
| docs/HOW_IT_WORKS.md | The mechanism + reverse-engineering evidence |
| docs/ADD_A_MODEL.md | Add any backend to the /model menu |
| docs/TROUBLESHOOTING.md | Symptom → cause → fix |
MIT — see LICENSE. This is an unofficial, community project; it is not affiliated with Anthropic, OpenAI, or any model provider. You are responsible for complying with the terms of whatever accounts you route through it.




{ "models": [ { "id": "claude-mimo", "display_name": "MiMo v2.5 Pro" }, { "id": "claude-openrouter", "display_name": "Llama 3.3 70B (OpenRouter)" } ], "routes": { "claude-mimo": { "type": "openai_compat", "upstream": "https://token-plan-sgp.xiaomimimo.com/v1", "model": "mimo-v2.5-pro", "auth": "Bearer ${MIMO_API_KEY}" }, "claude-openrouter": { "type": "openai_compat", "upstream": "https://openrouter.ai/api/v1", "model": "meta-llama/llama-3.3-70b-instruct", "auth": "Bearer ${OPENROUTER_API_KEY}" } } }