aibuildai v2.5.1 — 2026-06-24
aibuildai reads a dataset and a task description and builds a model: it designs
candidate models, trains them, scores them on held-out data, and writes out the
best submission.
Compared to v2.5.0, this release fixes a splitter bug.
Features
- Parallel model search. aibuildai designs several candidate models, trains
and scores each on held-out data, revises the promising ones, and writes out the
single best submission. Longer trainings run concurrently when your hardware has
room; quick ones finish in sequence. Set the breadth insearch
(num_designs,parallel,num_revisions,max_nodes). - Live terminal dashboard. Watch the search as it runs: a tree of every
candidate with its status, training curves, running cost, and elapsed time.
Every metric your training reports gets its own live curve tab, so you can
follow each one as it moves. Arrow keys move between candidates,Tabswitches
panels, and you can open any candidate to read the full step-by-step trajectory
of how it was built — written up as formatted text, with each step showing what
was actually submitted.Ctrl+Cstops the run cleanly. - Replay a finished run.
aibuildai replay <run-dir>re-opens that dashboard
from the run's recorded log — no model calls, no cost. The replay faithfully
reproduces the live run: the same candidate statuses, scores, and timings you
saw while it ran. - Resume a run.
aibuildai resumepicks up a prior run where it left off —
pass a run id or a run directory, or run it with no argument to choose from an
interactive picker of your recent runs. - Memory across runs.
aibuildai memorize task.yamlfolds a task's finished
runs into an editable memory file; later runs of the same task read it and build
on what already worked. - Research tools over MCP. List MCP servers under the top-level
mcps
registry (for example a literature or dataset search) and the agents can call
them while they build. Give a server aroleslist to scope it to specific
agents (designer, coder, reviser) instead of all of them. - Your choice of model provider. Anthropic, DeepSeek, or any
Anthropic-compatible endpoint (see Choose a model provider below). - Time and cost caps. Bound a run by wall-clock minutes
(budget.pipeline_budget_minutes) and by dollars (budget.pipeline_cost_cap_usd,
budget.per_agent_cost_cap_usd); the run stops on its own when a cap is hit. - Stays up to date. The binary updates itself to the latest release on launch,
so you keep getting fixes without re-downloading by hand. Set
AIBUILDAI_AUTO_UPDATE=0to turn this off.
This release also makes long runs more resilient — a diverged training curve or
an unusual cost event no longer ends the run — and a config error for an unknown
key now points you at the key that replaced it. Includes security and reliability
hardening; upgrading is recommended.
Upgrading from v2.0.0 (read this first)
The task is no longer described with command-line flags; it now lives in a YAML
config file. The old flags are removed, and an unrecognized flag now stops the
run instead of being ignored. Translate an old invocation field by field:
| v2.0.0 flag | v2.5.0 YAML field |
|---|---|
--task-name |
run.task_name |
--data-dir |
run.data_root |
--instruction |
run.instruction |
--playground-dir |
run.playground_root |
| model flag | llm.model |
To upgrade an existing install, re-run the install steps below; the binary is
overwritten in place and your aibuildai login session is kept.
Run a task
aibuildai config > task.yaml # starter config: every field with its default
# set run.task_name, run.data_root, run.instruction, run.playground_root, llm.model
aibuildai run task.yamlYour data must already be on disk
aibuildai reads the dataset from <data_root>/<task_name>/public/. Put the files
there before running — e.g. for task_name: titanic, data_root: /data:
/data/titanic/public/{train.csv, test.csv, sample_submission.csv}.
run.task_nameis the dataset's directory name, not a free-text label.run.instructionis a plain-English description of the ML problem to solve.run.playground_rootis where the run writes its work and the final submission,
under<playground_root>/<task_name>/.
Choose a model provider
Set llm.model in task.yaml:
- Anthropic (default,
claude-*ids) — sign in withaibuildai login, or set
AIBUILDAI_API_KEY. - DeepSeek — a bare
deepseek-*id (for exampledeepseek-v4-flash) routes to
DeepSeek's endpoint; export your DeepSeek key asAIBUILDAI_API_KEY. - Another Anthropic-compatible endpoint — set
AIBUILDAI_BASE_URLto it and
AIBUILDAI_API_KEYto its key.
Account commands
aibuildai whoami shows the signed-in account, aibuildai logout signs out, and
aibuildai account opens your plan page.
Requirements
- Linux x86_64 only (no macOS, Windows, or ARM). glibc 2.31 or newer
(Ubuntu 20.04+, Debian 11+, CentOS 8+). - An NVIDIA GPU with a CUDA 12.x runtime.
- 16 GB RAM; ~300 MB disk for the unpacked binary, plus room for a per-task Python
environment and the run's output. condaon yourPATH(install Miniconda if you do not have it).- A Max plan — sign in with
aibuildai login; start a plan at
https://www.aibuildai.io/#products or runaibuildai account.
Install
curl -L -o aibuildai.tar.gz \
https://github.com/aibuildai/AI-Build-AI/releases/download/v2.5.1/aibuildai-linux-x86_64-v2.5.1.tar.gz
tar -xzf aibuildai.tar.gz
cd aibuildai-linux-x86_64-v2.5.1
./install.sh # installs to ~/.local/bin/aibuildai
command -v aibuildai # expect ~/.local/bin/aibuildai
# different path -> ~/.local/bin is shadowed; put it ahead on PATH
# "command not found" -> restart the shell or: source ~/.bashrc
aibuildai login
aibuildai config > task.yaml
aibuildai run task.yamlTo uninstall: rm -rf ~/.local/bin/aibuildai ~/.local/lib/aibuildai.