aibuildai v2.5.1 — 2026-06-24

aibuildai reads a dataset and a task description and builds a model: it designs
candidate models, trains them, scores them on held-out data, and writes out the
best submission.

Compared to v2.5.0, this release fixes a splitter bug.

Features

Parallel model search. aibuildai designs several candidate models, trains
and scores each on held-out data, revises the promising ones, and writes out the
single best submission. Longer trainings run concurrently when your hardware has
room; quick ones finish in sequence. Set the breadth in search
(num_designs, parallel, num_revisions, max_nodes).
Live terminal dashboard. Watch the search as it runs: a tree of every
candidate with its status, training curves, running cost, and elapsed time.
Every metric your training reports gets its own live curve tab, so you can
follow each one as it moves. Arrow keys move between candidates, Tab switches
panels, and you can open any candidate to read the full step-by-step trajectory
of how it was built — written up as formatted text, with each step showing what
was actually submitted. Ctrl+C stops the run cleanly.
Replay a finished run. aibuildai replay <run-dir> re-opens that dashboard
from the run's recorded log — no model calls, no cost. The replay faithfully
reproduces the live run: the same candidate statuses, scores, and timings you
saw while it ran.
Resume a run. aibuildai resume picks up a prior run where it left off —
pass a run id or a run directory, or run it with no argument to choose from an
interactive picker of your recent runs.
Memory across runs. aibuildai memorize task.yaml folds a task's finished
runs into an editable memory file; later runs of the same task read it and build
on what already worked.
Research tools over MCP. List MCP servers under the top-level mcps
registry (for example a literature or dataset search) and the agents can call
them while they build. Give a server a roles list to scope it to specific
agents (designer, coder, reviser) instead of all of them.
Your choice of model provider. Anthropic, DeepSeek, or any
Anthropic-compatible endpoint (see Choose a model provider below).
Time and cost caps. Bound a run by wall-clock minutes
(budget.pipeline_budget_minutes) and by dollars (budget.pipeline_cost_cap_usd,
budget.per_agent_cost_cap_usd); the run stops on its own when a cap is hit.
Stays up to date. The binary updates itself to the latest release on launch,
so you keep getting fixes without re-downloading by hand. Set
AIBUILDAI_AUTO_UPDATE=0 to turn this off.

This release also makes long runs more resilient — a diverged training curve or
an unusual cost event no longer ends the run — and a config error for an unknown
key now points you at the key that replaced it. Includes security and reliability
hardening; upgrading is recommended.

Upgrading from v2.0.0 (read this first)

The task is no longer described with command-line flags; it now lives in a YAML
config file. The old flags are removed, and an unrecognized flag now stops the
run instead of being ignored. Translate an old invocation field by field:

v2.0.0 flag	v2.5.0 YAML field
`--task-name`	`run.task_name`
`--data-dir`	`run.data_root`
`--instruction`	`run.instruction`
`--playground-dir`	`run.playground_root`
model flag	`llm.model`

To upgrade an existing install, re-run the install steps below; the binary is
overwritten in place and your aibuildai login session is kept.

Run a task

aibuildai config > task.yaml      # starter config: every field with its default
# set run.task_name, run.data_root, run.instruction, run.playground_root, llm.model
aibuildai run task.yaml

Your data must already be on disk

aibuildai reads the dataset from <data_root>/<task_name>/public/. Put the files
there before running — e.g. for task_name: titanic, data_root: /data:
/data/titanic/public/{train.csv, test.csv, sample_submission.csv}.

run.task_name is the dataset's directory name, not a free-text label.
run.instruction is a plain-English description of the ML problem to solve.
run.playground_root is where the run writes its work and the final submission,
under <playground_root>/<task_name>/.

Choose a model provider

Set llm.model in task.yaml:

Anthropic (default, claude-* ids) — sign in with aibuildai login, or set
AIBUILDAI_API_KEY.
DeepSeek — a bare deepseek-* id (for example deepseek-v4-flash) routes to
DeepSeek's endpoint; export your DeepSeek key as AIBUILDAI_API_KEY.
Another Anthropic-compatible endpoint — set AIBUILDAI_BASE_URL to it and
AIBUILDAI_API_KEY to its key.

Account commands

aibuildai whoami shows the signed-in account, aibuildai logout signs out, and
aibuildai account opens your plan page.

Requirements

Linux x86_64 only (no macOS, Windows, or ARM). glibc 2.31 or newer
(Ubuntu 20.04+, Debian 11+, CentOS 8+).
An NVIDIA GPU with a CUDA 12.x runtime.
16 GB RAM; ~300 MB disk for the unpacked binary, plus room for a per-task Python
environment and the run's output.
conda on your PATH (install Miniconda if you do not have it).
A Max plan — sign in with aibuildai login; start a plan at
https://www.aibuildai.io/#products or run aibuildai account.

Install

curl -L -o aibuildai.tar.gz \
  https://github.com/aibuildai/AI-Build-AI/releases/download/v2.5.1/aibuildai-linux-x86_64-v2.5.1.tar.gz


tar -xzf aibuildai.tar.gz
cd aibuildai-linux-x86_64-v2.5.1
./install.sh                      # installs to ~/.local/bin/aibuildai

command -v aibuildai              # expect ~/.local/bin/aibuildai
# different path -> ~/.local/bin is shadowed; put it ahead on PATH
# "command not found" -> restart the shell or: source ~/.bashrc

aibuildai login
aibuildai config > task.yaml
aibuildai run task.yaml

To uninstall: rm -rf ~/.local/bin/aibuildai ~/.local/lib/aibuildai.

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

aibuildai v2.5.1

Choose a tag to compare

Sorry, something went wrong.