Skip to content

aibuildai v2.5.1

Latest

Choose a tag to compare

@t2ance t2ance released this 24 Jun 06:15
· 10 commits to main since this release

aibuildai v2.5.1 — 2026-06-24

aibuildai reads a dataset and a task description and builds a model: it designs
candidate models, trains them, scores them on held-out data, and writes out the
best submission.

Compared to v2.5.0, this release fixes a splitter bug.

Features

  • Parallel model search. aibuildai designs several candidate models, trains
    and scores each on held-out data, revises the promising ones, and writes out the
    single best submission. Longer trainings run concurrently when your hardware has
    room; quick ones finish in sequence. Set the breadth in search
    (num_designs, parallel, num_revisions, max_nodes).
  • Live terminal dashboard. Watch the search as it runs: a tree of every
    candidate with its status, training curves, running cost, and elapsed time.
    Every metric your training reports gets its own live curve tab, so you can
    follow each one as it moves. Arrow keys move between candidates, Tab switches
    panels, and you can open any candidate to read the full step-by-step trajectory
    of how it was built — written up as formatted text, with each step showing what
    was actually submitted. Ctrl+C stops the run cleanly.
  • Replay a finished run. aibuildai replay <run-dir> re-opens that dashboard
    from the run's recorded log — no model calls, no cost. The replay faithfully
    reproduces the live run: the same candidate statuses, scores, and timings you
    saw while it ran.
  • Resume a run. aibuildai resume picks up a prior run where it left off —
    pass a run id or a run directory, or run it with no argument to choose from an
    interactive picker of your recent runs.
  • Memory across runs. aibuildai memorize task.yaml folds a task's finished
    runs into an editable memory file; later runs of the same task read it and build
    on what already worked.
  • Research tools over MCP. List MCP servers under the top-level mcps
    registry (for example a literature or dataset search) and the agents can call
    them while they build. Give a server a roles list to scope it to specific
    agents (designer, coder, reviser) instead of all of them.
  • Your choice of model provider. Anthropic, DeepSeek, or any
    Anthropic-compatible endpoint (see Choose a model provider below).
  • Time and cost caps. Bound a run by wall-clock minutes
    (budget.pipeline_budget_minutes) and by dollars (budget.pipeline_cost_cap_usd,
    budget.per_agent_cost_cap_usd); the run stops on its own when a cap is hit.
  • Stays up to date. The binary updates itself to the latest release on launch,
    so you keep getting fixes without re-downloading by hand. Set
    AIBUILDAI_AUTO_UPDATE=0 to turn this off.

This release also makes long runs more resilient — a diverged training curve or
an unusual cost event no longer ends the run — and a config error for an unknown
key now points you at the key that replaced it. Includes security and reliability
hardening; upgrading is recommended.

Upgrading from v2.0.0 (read this first)

The task is no longer described with command-line flags; it now lives in a YAML
config file. The old flags are removed, and an unrecognized flag now stops the
run instead of being ignored. Translate an old invocation field by field:

v2.0.0 flag v2.5.0 YAML field
--task-name run.task_name
--data-dir run.data_root
--instruction run.instruction
--playground-dir run.playground_root
model flag llm.model

To upgrade an existing install, re-run the install steps below; the binary is
overwritten in place and your aibuildai login session is kept.

Run a task

aibuildai config > task.yaml      # starter config: every field with its default
# set run.task_name, run.data_root, run.instruction, run.playground_root, llm.model
aibuildai run task.yaml

Your data must already be on disk

aibuildai reads the dataset from <data_root>/<task_name>/public/. Put the files
there before running — e.g. for task_name: titanic, data_root: /data:
/data/titanic/public/{train.csv, test.csv, sample_submission.csv}.

  • run.task_name is the dataset's directory name, not a free-text label.
  • run.instruction is a plain-English description of the ML problem to solve.
  • run.playground_root is where the run writes its work and the final submission,
    under <playground_root>/<task_name>/.

Choose a model provider

Set llm.model in task.yaml:

  • Anthropic (default, claude-* ids) — sign in with aibuildai login, or set
    AIBUILDAI_API_KEY.
  • DeepSeek — a bare deepseek-* id (for example deepseek-v4-flash) routes to
    DeepSeek's endpoint; export your DeepSeek key as AIBUILDAI_API_KEY.
  • Another Anthropic-compatible endpoint — set AIBUILDAI_BASE_URL to it and
    AIBUILDAI_API_KEY to its key.

Account commands

aibuildai whoami shows the signed-in account, aibuildai logout signs out, and
aibuildai account opens your plan page.

Requirements

  • Linux x86_64 only (no macOS, Windows, or ARM). glibc 2.31 or newer
    (Ubuntu 20.04+, Debian 11+, CentOS 8+).
  • An NVIDIA GPU with a CUDA 12.x runtime.
  • 16 GB RAM; ~300 MB disk for the unpacked binary, plus room for a per-task Python
    environment and the run's output.
  • conda on your PATH (install Miniconda if you do not have it).
  • A Max plan — sign in with aibuildai login; start a plan at
    https://www.aibuildai.io/#products or run aibuildai account.

Install

curl -L -o aibuildai.tar.gz \
  https://github.com/aibuildai/AI-Build-AI/releases/download/v2.5.1/aibuildai-linux-x86_64-v2.5.1.tar.gz


tar -xzf aibuildai.tar.gz
cd aibuildai-linux-x86_64-v2.5.1
./install.sh                      # installs to ~/.local/bin/aibuildai

command -v aibuildai              # expect ~/.local/bin/aibuildai
# different path -> ~/.local/bin is shadowed; put it ahead on PATH
# "command not found" -> restart the shell or: source ~/.bashrc

aibuildai login
aibuildai config > task.yaml
aibuildai run task.yaml

To uninstall: rm -rf ~/.local/bin/aibuildai ~/.local/lib/aibuildai.