Loopi makes AI agents challenge each other on purpose.
Instead of trusting one model to plan, implement, review, and reinforce its own blind spots, Loopi turns multiple models into a structured workflow: one can plan, another can do the legwork, another can critique, and the system can loop until the result is good enough.
That means:
- smarter models can handle judgment-heavy steps
- cheaper or free models can handle execution-heavy work
- different models can catch each other's weak spots
- the workflow can keep refining instead of stopping after one pass
The result is not just "more AI." It is better output from controlled disagreement, explicit refinement loops, and grounded reference material.
Loopi is a workflow engine for getting the strongest result you can out of the models you already use.
One fast example:
- plan with Claude
- implement with Codex or OpenCode
- review with Gemini
- rerun the same workflow with different loop counts, fallback rules, or reference context
That is the core idea. Loopi turns "use multiple models" from a vague habit into a workflow you can actually inspect, reuse, and improve.
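Expressed as a task file, the example above might look something like this (a sketch modeled on the worked example later in this README; exact fields can vary by version):

```json
{
  "mode": "one-shot",
  "prompt": "Plan, implement, and review a small feature",
  "agents": ["claude", "codex", "gemini"],
  "settings": {
    "planLoops": 2,
    "qualityLoops": 2
  }
}
```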
A single coding agent has one training history, one alignment profile, one set of defaults, and one set of blind spots. When it reviews its own output, it often agrees with itself.
Loopi is built around structured disagreement.
A plan written by one model can be challenged by another. An implementation produced by one agent can be reviewed by a different one. A synthesis step then reconciles the conflict into a final decision instead of letting one model dominate the whole workflow.
This is the point: different models fail differently.
With Loopi, you can:
- plan with one model, implement with another, and review with a third
- run parallel reviews so you can see where models agree and where they conflict
- force stage-to-stage handoffs through structured artifacts instead of loose chat memory
- keep reviewers read-only while the chosen implementer is allowed to write
Loopi is not multi-agent for novelty. It is multi-agent so different models can expose each other's blind spots before those blind spots become your problem.
Most AI tools give you one pass and hide the rest.
Loopi makes refinement explicit.
You can run a couple of synthesis loops for a fast, already-powerful result. Or you can push a task through deeper planning, implementation, review, and repair cycles when you want the strongest output the system can produce.
That means Loopi can be:
- a quick two-loop quality pass
- a heavier multi-stage review cycle
- a long unattended workflow that keeps improving the output while you are away
You control how much quality pressure a task gets, and how much compute and token spend it deserves.
Loopi exposes three independent loop controls:
| Setting | Used by | What it controls |
|---|---|---|
| `planLoops` | plan, one-shot | plan-review-synthesis cycles (for plan mode) or plan cycles per quality loop (for one-shot) |
| `qualityLoops` | one-shot | total outer one-shot reruns of the entire sequence |
| `sectionImplementLoops` | one-shot | per-section implement-review-repair cycles |
| `implementLoops` | implement | standalone implement -> review -> repair cycles |
Those controls let you do things like:
- loop the plan multiple times before implementation starts
- keep implementation cheap but review-heavy
- run more repair cycles only when a task is broken into units
- increase quality pressure without paying for your most expensive model at every stage
A powerful default pattern looks like this:
- use your smartest model to plan
- use a cheaper or free coding agent to implement
- use another model to review and challenge the result
- repeat the loop until the output is good enough
That is the leverage Loopi gives you.
Most AI tools work from whatever is already in the repo, whatever fits in the prompt, or whatever happens to be in the current chat.
Loopi lets you attach an explicit context folder to a task so the workflow has real reference material to reason against during planning, implementation, and review.
That context can include things like:
- design docs
- research notes
- example code
- schemas
- specifications
- contracts or policy documents
- review rubrics
- supporting project files
Prepared context roots can also include supported source formats such as `pdf`, `docx`, `ipynb`, and common code files like `js`, `ts`, `py`, `html`, and `css`. Loopi normalizes those files into a generated `.loopi-context/` cache automatically, and unsupported formats are skipped rather than silently treated as promptable context.
This matters because better workflows need better evidence.
Instead of hoping one model remembers the right details, you can point Loopi at the exact body of material that should shape the work. That gives you control over not just which models run and how many times they loop, but what source material they reason against.
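The format-handling rule above amounts to a filter over file extensions. Here is a hypothetical sketch of that decision (the real supported-format list and cache layout live in Loopi itself; this is illustrative only):

```javascript
// Hypothetical sketch of the context-normalization decision described above.
// The extension list mirrors the formats named in this README, not Loopi's
// actual internal list.
const SUPPORTED = new Set(["pdf", "docx", "ipynb", "js", "ts", "py", "html", "css"]);

function classifyContextFile(filename) {
  const ext = filename.split(".").pop().toLowerCase();
  return SUPPORTED.has(ext)
    ? "normalize into .loopi-context/"
    : "skip (unsupported)";
}

console.log(classifyContextFile("design.pdf")); // → normalize into .loopi-context/
console.log(classifyContextFile("photo.raw")); // → skip (unsupported)
```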
One practical Loopi workflow looks like this:
- Attach a `context` folder with the docs, examples, schemas, specs, or other reference material that matters
- Plan with Claude
- Implement with Codex or OpenCode
- Review with Gemini or another model
- Repeat the implement -> review -> repair cycle until the result is strong enough
- Save the scratchpad and structured per-run artifacts
- Re-run later with different models, loop counts, fallback rules, provider assignments, or context rules
Loopi is not trying to replace the individual agent tools. It is the workflow layer above them.
Start with a bare idea, let one model plan the architecture, another write the implementation, and another review the result until the output is strong enough to keep.
Point Loopi at a live repo plus supporting docs in the context folder, then use the workflow for feature work, refactors, bug hunts, and full reviews against the actual codebase.
Loopi keeps planning, review, and repair steps explicit. Instead of hidden internal reasoning, teams get a record of what was proposed, challenged, changed, and accepted.
The same workflow pattern can be applied beyond code: legal drafts using case law and example contracts, business plans grounded in research material, or academic writing built around source documents and structured review.
- explicit plan, implement, review, and repair stages instead of one long prompt thread
- multi-agent and multi-provider workflows with controlled write access
- structured artifacts and handoffs you can inspect instead of relying on chat history alone
- cost-aware model assignment across stages
- independent loop counts for outer quality cycles, implement/repair cycles, and per-unit one-shot cycles
- controlled reference context through a task-level `context` folder
- context and fallback controls you can tune for cost, reliability, and review quality
In practice, that means better output from the models you already use, with more visibility and less guesswork.
- developers who want more out of AI than one model in one chat can give them
- teams that want repeatable workflows with visible decision steps and recorded outputs
- people doing high-context work where planning, evidence, critique, and refinement all matter
- Windows is the primary platform today. The CLI and test workflow are exercised most heavily in Windows PowerShell.
- Node.js 20 or newer
- At least one supported AI coding CLI installed and authenticated
- A local Git repository for the project you want the agents to work on
```
git clone https://github.com/Concrete333/Loopi.git my-project-folder
cd my-project-folder
npm install
```

You only need to install the agent CLIs you actually want to use.
One agent is enough to get started. Two or three is where Loopi starts to show what it can really do.
Loopi works with multiple coding-agent CLIs, and it can also route stages to OpenAI-compatible HTTP providers.
| Agent | Install / docs | Auth / setup | Loopi override |
|---|---|---|---|
| Claude Code | Anthropic setup docs | Run `claude`, then follow the Anthropic / Claude login flow | `LOOPI_CLAUDE_PATH` |
| Codex CLI | OpenAI Codex CLI getting started | Run `codex auth login` or sign in when prompted | `LOOPI_CODEX_JS` |
| Gemini CLI | Gemini CLI quickstart | Run `gemini`, then choose your Google auth flow | `LOOPI_GEMINI_JS` |
| Kilo Code CLI | Kilo Code CLI | Run `kilo auth login` and configure the provider you want to use | `LOOPI_KILO_PATH` |
| Qwen Code | Qwen Code docs | Run `qwen`, then complete the Qwen OAuth / account setup | `LOOPI_QWEN_JS` |
| OpenCode | OpenCode docs | Run `opencode`, then use `/connect` or `opencode auth login` to configure a provider | `LOOPI_OPENCODE_PATH` |
Any OpenAI-compatible HTTP endpoint can also be registered as a provider, whether that is a local inference server, an internal deployment, or a hosted service.
HTTP providers are always read-only in Loopi today. They can plan, review, and synthesize, but they cannot be the implementer.
After you install at least one agent CLI, validate your setup. This works even before `shared/task.json` exists:

```
npm run cli -- doctor
```

Then generate your first task interactively:

```
npm run cli -- plan
```

If your task uses a prepared context root, build the reusable context cache before the first run, and rebuild it after context files or context rules change:

```
npm run cli -- context prepare
```

The shorter alias works too:

```
npm run context:prepare
```

If you prefer a browser-based setup flow, launch the local UI:

```
npm run ui
```

On Windows, the shipped repo also includes a clickable launcher in the repo root:

```
Launch Loopi UI.cmd
```
That starts a localhost control plane for setup checks, task configuration, presets, and run monitoring. The launcher also refreshes a sibling `Launch Loopi UI.lnk` shortcut with the branded Loopi icon, so the Windows-friendly shortcut can carry a custom icon even though the batch file itself cannot.

Recent UI improvements:

- broken saved task files are treated as first-class errors instead of being hidden
- the Runs tab shows live background sessions while a run is still in flight
- the Setup tab can launch explicit install/login helpers for supported adapters
- Run Now validates the current draft before persisting it, so a blocked launch does not overwrite the saved task file
- session polling slows down for hidden tabs or repeated failures instead of hammering at a fixed rate

See docs/ui.md for the screen-by-screen guide.
If you configure context in the UI, the Settings tab can check draft-aware context status, distinguish an invalid context path from a missing prepared cache, prepare the cache in place, and block run launch early when the prepared cache is missing or stale instead of starting a doomed session.
Typical flow:
```
What do you want the agents to do: Plan a small calculator app
Supported agents: 1) claude, 2) codex, 3) gemini, 4) kilo, 5) qwen, 6) opencode
Enter agent names or numbers separated by commas.
Which agents should help: 1,3
Run now? [Y/n]: y
Task written. Starting run...
```

If you answer `n` to `Run now?`, Loopi writes `shared/task.json` and prints the command to run it later.
A good first run is not "use every model." It is:
- one strong planner
- one implementer
- one different reviewer
That is usually enough to feel why the workflow matters.
| Mode | Flow | Primary loop settings |
|---|---|---|
| `plan` | initial plan -> review(s) -> synthesis | `planLoops` |
| `implement` | implement -> review(s) -> repair | `implementLoops` |
| `review` | initial review -> parallel reviews -> synthesis | (single pass by design) |
| `one-shot` | plan -> per-unit implement/review -> replan | `qualityLoops`, `planLoops`, `sectionImplementLoops` |
For example, one-shot with `qualityLoops = 2`, `planLoops = 2`, `sectionImplementLoops = 1` becomes:

```
[plan x 2] -> [implement each section x 1] -> [plan x 2] -> [implement each section x 1]
```
With 3 planned sections, that is 4 total plan cycles and 6 total section implementations.
The important point is not just that Loopi has different modes. It is that each mode exposes a different kind of refinement loop, and you decide how much quality pressure and token spend a task deserves.
You can also assign different agents to different seats in the workflow. In one-shot, for example, `settings.oneShotOrigins` lets one agent own planning, another own implementation, and another own review. A separate `roles.fallback` target can be used if a primary provider fails.
One of the simplest useful Loopi patterns is also one of the strongest:
- use your smartest and most expensive model to plan
- use a cheaper or free coding agent to implement
- use another model to review and challenge the result
- repeat the review/repair cycle until the work is good enough to keep
That is the leverage Loopi gives you.
You do not need to pay top-tier rates for every token in the workflow. You can place expensive intelligence where judgment matters most, cheaper execution where it is sufficient, and structured critique where quality needs pressure.
This is what makes Loopi feel different in practice: it lets you treat model quality, workflow structure, and token spend as things you can actually control.
Loopi exposes separate loop controls because different tasks need different kinds of pressure.
| Setting | Used by | What it controls |
|---|---|---|
| `planLoops` | plan, one-shot | plan-review-synthesis cycles (for plan mode) or plan cycles per quality loop (for one-shot) |
| `qualityLoops` | one-shot | total outer one-shot reruns of the entire sequence |
| `sectionImplementLoops` | one-shot | per-section implement-review-repair cycles |
| `implementLoops` | implement | standalone implement -> review -> repair cycles |
In one-shot mode, the loop controls nest as follows:
- For each outer `qualityLoops` cycle, run the plan stage `planLoops` times.
- After the final plan result for that outer cycle is ready, implement each planned section.
- For each section, run the implement-review-repair loop `sectionImplementLoops` times.
- If there is another outer `qualityLoops` cycle remaining, rerun the full sequence again using the one-shot replan flow.
Worked example:
```json
{
  "mode": "one-shot",
  "useCase": "academic-paper",
  "prompt": "Write a research paper on AI safety",
  "agents": ["claude", "codex", "gemini"],
  "settings": {
    "planLoops": 4,
    "qualityLoops": 2,
    "sectionImplementLoops": 2
  }
}
```

If the plan has 3 sections, this configuration means:

- `8` total plan cycles (4 plan loops x 2 quality loops)
- `12` total section implementation cycles (3 sections x 2 section loops x 2 quality loops)
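The nesting described above can be sketched as plain loops. This is illustrative only — the real Loopi runner does far more than count cycles — but it reproduces the arithmetic of the worked example:

```javascript
// Sketch of the one-shot loop nesting: outer quality reruns, plan cycles,
// then per-section implement-review-repair passes.
function oneShotCycleCounts({ qualityLoops, planLoops, sectionImplementLoops, sections }) {
  let planCycles = 0;
  let sectionImplementations = 0;
  for (let q = 0; q < qualityLoops; q++) {           // outer quality reruns
    for (let p = 0; p < planLoops; p++) planCycles++; // plan stage per quality loop
    for (let s = 0; s < sections; s++) {              // each planned section
      for (let i = 0; i < sectionImplementLoops; i++) {
        sectionImplementations++;                     // implement-review-repair pass
      }
    }
  }
  return { planCycles, sectionImplementations };
}

// The worked example above: 3 planned sections, planLoops=4, qualityLoops=2,
// sectionImplementLoops=2.
console.log(oneShotCycleCounts({
  qualityLoops: 2, planLoops: 4, sectionImplementLoops: 2, sections: 3,
}));
// → { planCycles: 8, sectionImplementations: 12 }
```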
That means you can do things like:
- loop the plan multiple times before implementation starts
- keep implementation cheap but review-heavy
- run more repair cycles only when a task is broken into units
- increase quality pressure without paying for your most expensive model at every stage
- let a workflow keep improving while you are away instead of stopping after one pass
These loops are explicit and inspectable. Every pass writes artifacts, records which agent ran which stage, and leaves behind a workflow you can review, compare, and rerun later.
Loopi now records more than just the final answer.
Each run leaves behind a lightweight audit trail so a human can go back later and answer:
- which agent ran which stage
- when each step happened
- which write-enabled steps changed the worktree
- what patch snapshot was captured at run start, before and after write-enabled steps, and at run end
- whether a later attempt was manually forked from an earlier run
The main files and folders to look at are:
- `shared/scratchpad.txt`
- `shared/log.json`
- `shared/runs.ndjson`
- `shared/tasks/<runId>/task.json`
- `shared/tasks/<runId>/steps.ndjson`
- `shared/tasks/<runId>/artifacts/*.json`
- `shared/tasks/<runId>/patches/*.patch`
In practice:
- `scratchpad.txt` is the fastest human-readable summary
- `log.json` is the legacy machine-readable run log
- `steps.ndjson` tells you which agent ran which stage and when
- `worktree-snapshot` artifacts capture run-start, pre-step, post-step, and run-end states; patch files are persisted for run-start/post-step/run-end, while pre-step is metadata-only by default
- `fork-record` artifacts record manual lineage when one run is explicitly based on an earlier run or step
This is meant to give you a durable record, not a fully automated replay system.
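Because `steps.ndjson` is newline-delimited JSON, it is easy to skim with a small script. Here is a sketch; the field names in the sample records are hypothetical, so inspect a real `shared/tasks/<runId>/steps.ndjson` for the actual schema Loopi writes:

```javascript
// Parse an NDJSON string (one JSON object per line) into an array of records.
function parseNdjson(text) {
  return text
    .split("\n")
    .filter((line) => line.trim().length > 0) // skip blank lines
    .map((line) => JSON.parse(line));
}

// Sample records with hypothetical field names, for illustration only.
const sample = [
  '{"stage":"plan","agent":"claude"}',
  '{"stage":"implement","agent":"codex"}',
  '{"stage":"review","agent":"gemini"}',
].join("\n");

for (const step of parseNdjson(sample)) {
  console.log(`${step.stage}: ${step.agent}`);
}
```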
If you want to retry a prior attempt manually, you can include an optional top-level `fork` block in `shared/task.json` before you run it:

```json
{
  "mode": "implement",
  "prompt": "Retry the prior attempt with tighter scope.",
  "agents": ["codex", "gemini"],
  "fork": {
    "forkedFromRunId": "run-2026-04-21T12-34-56-789Z",
    "forkedFromStepId": "implement-4",
    "baseCommit": "abc123def456",
    "reason": "Retry with different reviewer feedback",
    "recordedBy": "manual"
  }
}
```

When present, Loopi writes a `fork-record` artifact and includes the lineage in the scratchpad and run log.

See `shared/task.example.json` for a fuller `manualForkExample`.
Loopi is open source under the Apache License 2.0.
You are free to use, modify, and build on the project under the terms of that license.
If Loopi is useful to you or your team, there are a few ways to support the work:
- star and share the project on GitHub
- open issues and suggestions
- email cb1384@exeter.ac.uk for consulting, workflow design, implementation help, or custom integration support
If your team likes the workflow but wants help applying it in practice, the consulting path is there to accelerate adoption rather than gate the software.
See LICENSE for the full license text and LICENSING.md for a plain-language FAQ.
- Run `npm run cli -- doctor` first. Without a task file it performs an environment/setup check; with `shared/task.json` present it also validates the task configuration, selected agents, and prepared-context readiness when `context` is configured.
- If your task uses `context`, run `npm run cli -- context prepare` after changing context files, include/exclude patterns, or manifest annotations. `npm run context:prepare` is the equivalent shortcut.
- If an agent is installed but not detected, set the matching `LOOPI_*` override.
- To find an installed CLI path on Windows, use `where.exe claude`, `where.exe codex`, `where.exe gemini`, and so on.
- On macOS or Linux, use `which claude`, `which codex`, `which gemini`, and so on.
- Advanced or developer override: set `LOOPI_PROJECT_ROOT` to point the CLI at a different project root.
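For example, on macOS or Linux an override can be exported before running the CLI (the path below is illustrative, not a real install location):

```shell
# Point Loopi at a specific Claude CLI binary (illustrative path).
export LOOPI_CLAUDE_PATH="/usr/local/bin/claude"

# Windows PowerShell equivalent (shown as a comment):
#   $env:LOOPI_CLAUDE_PATH = (Get-Command claude).Source

echo "$LOOPI_CLAUDE_PATH"
```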
The README is the front door.
For deeper configuration and runtime details, see:
Those tools are excellent at single-agent execution. Loopi is for workflows where one model, one pass, and one internal line of reasoning are not enough.
Use Loopi when you want to:
- Route plan, implement, and review to agents trained by different organizations on different data, so a single model's failure mode does not become the workflow's failure mode
- Run implement -> review -> repair for as many cycles as the task needs, with a different reviewer each pass
- Give the workflow an explicit body of reference material through the `context` folder instead of relying only on repo state or chat history
- Mix CLI agents and OpenAI-compatible providers in the same workflow
- Control which step can write and which steps stay read-only
- Keep workflow state in structured artifacts instead of ephemeral chat context
- Tune context delivery, fallback behavior, and loop counts per task instead of accepting one default runtime model
Loopi is not trying to replace the agent tools themselves.
It is the layer that makes them work together harder, more visibly, and more usefully than they do alone.