████████╗████████╗ ██████╗ ██╗ ██████╗
╚══██╔══╝╚══██╔══╝ ██╔══██╗ ██║ ██╔═══██╗
██║ ██║ █████╗ ██████╔╝ ██║ ██║ ██║
██║ ██║ ╚════╝ ██╔══██╗ ██║ ██║ ██║
██║ ██║ ██████╔╝ ██║ ╚██████╔╝
╚═╝ ╚═╝ ╚═════╝ ╚═╝ ╚═════╝
Important
TT-Boltz is now TT-Bio
TT-Bio runs Boltz-2, ESMFold2, and Protenix-v2 structure prediction and BoltzGen binder design on Tenstorrent Blackhole and Wormhole, supporting single-card and multi-card configurations (e.g. QuietBox with 4 cards or Galaxy server with 32 cards). Multiple machines can also be combined into a single prediction run.
Create a Python virtual environment with Python 3.10 or 3.12, install TT-Bio, then install the matching Tenstorrent system dependencies.
python3.10 -m venv env
source env/bin/activate
pip install "tt-bio @ git+https://github.com/moritztng/tt-bio.git"
tt-bio install-deps
tt-bio install-deps installs the SFPI compiler version that matches the installed ttnn wheel and clears stale TT-Metal kernel cache entries. It may ask for your sudo password.
git clone https://github.com/moritztng/tt-bio.git
cd tt-bio
pip install -e .
tt-bio install-deps
If you need to build from source, follow the Tenstorrent Installation Guide.
tt-bio --help
tt-bio predict --help
tt-bio msa --help
tt-bio predict examples/prot.yaml --model boltz2 --use_msa_server --override
Every command names its model with --model:
- boltz2: folds complexes of proteins, DNA, RNA, and ligands and predicts binding affinity. Needs an MSA for each protein chain.
- esmfold2 / esmfold2-fast: fold a single protein sequence on-device, no MSA required (esmfold2-fast is the lighter, faster checkpoint):
  tt-bio predict seq.fasta --model esmfold2-fast --fast
- protenix-v2: folds a single protein with an optional MSA (an AlphaFold3-family model, the Protenix reproduction; an MSA is recommended for best accuracy):
  tt-bio predict seq.fasta --model protenix-v2 --use_msa_server # or fold single-sequence with no MSA flag
ESMFold2 and Protenix-v2 are protein-only, so the ligand, affinity, potential, constraint, template, and energy options below apply to Boltz-2 only. Both can use an MSA: pass --use_msa_server (or a precomputed a3m via the input file / --msa_db_path); with no MSA source they fold single-sequence. The shared options (--fast, --recycling_steps, --sampling_steps, --diffusion_samples, --output_format, the MSA flags, and the multi-card / multi-machine flags) work for every model. Each model downloads its weights automatically on first use.
Boltz-2 needs an MSA (multiple sequence alignment) for each protein chain.
--use_msa_server sends sequences to the ColabFold MSA API and downloads the resulting alignments (online MSA).
--fast makes some operations use block-fp8, a lower-precision numeric format that runs faster. Accuracy is typically very close.
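To give a feel for what a block floating-point format trades away, the sketch below quantizes a small block of values that share one power-of-two scale. The 7-bit magnitude, the block contents, and the function names are assumptions chosen for illustration; they are not taken from TT-Bio or ttnn, and the on-device format may differ.

```python
import math

MANTISSA_BITS = 7  # assumed magnitude bits per value (illustrative only)


def quantize_block(values):
    """Quantize a block to signed integers that share one power-of-two scale."""
    largest = max(abs(v) for v in values)
    if largest == 0.0:
        return [0] * len(values), 1.0
    # frexp returns (m, e) with largest == m * 2**e and 0.5 <= m < 1.
    _, shared_exp = math.frexp(largest)
    scale = 2.0 ** (shared_exp - MANTISSA_BITS)
    limit = 2 ** MANTISSA_BITS - 1
    codes = [max(-limit, min(limit, round(v / scale))) for v in values]
    return codes, scale


def dequantize_block(codes, scale):
    """Map the integer codes back to floats."""
    return [c * scale for c in codes]


block = [1.0, -0.5, 0.01, 0.3]
codes, scale = quantize_block(block)
restored = dequantize_block(codes, scale)
for original, approx in zip(block, restored):
    print(f"{original:+.4f} -> {approx:+.6f}")
```

Values that are small relative to the largest entry in their block lose the most relative precision, which is the general reason a lower-precision block format can shift results slightly.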
predict accepts either a single YAML/FASTA file or a directory containing many input files.
A live display shows the progress of each protein. On a multi-card machine such
as a QuietBox or Galaxy server, every card is used in parallel and labelled in
the display (quietbox:tt0, quietbox:tt1, ...). Models load once per card
and stay resident, so jobs flow through without per-protein reloads:
tt-bio predict proteins/ --model boltz2 --out_dir results --use_msa_server --fast
If you have additional machines with Tenstorrent cards, you can add them to a single run; see Optional: Multi-Machine Prediction.
Use this if you have enough disk and RAM and want local MSA. This avoids external MSA server calls and is faster for repeated runs.
tt-bio msa
tt-bio predict examples/prot.yaml --model boltz2 --override
tt-bio msa downloads UniRef30 to ~/.boltz/msa_db (~100GB download, ~500GB on disk after indexing). predict auto-detects this path.
EnvDB can improve MSA coverage when UniRef30 hits are weak, at a higher disk and RAM cost. To add EnvDB and use it in prediction:
tt-bio msa --db all
tt-bio predict examples/prot.yaml --model boltz2 --use_envdb --override
Key Options:
- --override: Re-run from scratch, ignoring cached files
- --use_msa_server: Generate MSA via ColabFold API
- --msa_db_path: Use a local database at a custom path (e.g. --msa_db_path /data/colabfold_db)
- --use_envdb: Include EnvDB in offline MSA (tt-bio msa --db all)
- --accelerator=tenstorrent: Use Tenstorrent hardware (default, or use cpu / gpu)
- --fast: Makes some operations use block-fp8, a lower-precision numeric format that runs faster; accuracy is typically very close
- --debug: Show all raw output from the hardware and libraries instead of the progress display
- --debug --log: Same as --debug, but also print what each device is currently working on
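When predictions are launched from a script, the options above can be assembled into an argument list. This is a minimal sketch: the helper name build_predict_command and its keyword arguments are hypothetical, and only flags named in this README are used.

```python
import shlex


def build_predict_command(input_path, model="boltz2", out_dir=None,
                          use_msa_server=False, fast=False, override=False):
    """Assemble a `tt-bio predict` argument list from the options listed above."""
    cmd = ["tt-bio", "predict", str(input_path), "--model", model]
    if out_dir is not None:
        cmd += ["--out_dir", str(out_dir)]
    if use_msa_server:
        cmd.append("--use_msa_server")
    if fast:
        cmd.append("--fast")
    if override:
        cmd.append("--override")
    return cmd


cmd = build_predict_command("proteins/", out_dir="results",
                            use_msa_server=True, fast=True)
print(shlex.join(cmd))
# To execute it: subprocess.run(cmd, check=True)
```

Passing the list to subprocess.run avoids shell quoting issues with paths that contain spaces.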
Predict binding affinity for protein-ligand complexes:
tt-bio predict examples/affinity.yaml --model boltz2 --use_msa_server --override --affinity_mw_correction
The --affinity_mw_correction flag applies molecular weight correction for more accurate predictions.
ESMFold2 takes a plain protein FASTA or a YAML with one or more protein chains. The richer inputs below — ligands, affinity, DNA/RNA, constraints, and templates — are Boltz-2 features.
Create a YAML file describing your complex:
version: 1
sequences:
  - protein:
      id: A
      sequence: MVTPEGNVSLVDESLLVGVTDEDRAVRSAHQFYERLIGLWAPAVMEAAHELGVFAALAEAPADSGELARRLDCDARAMRVLLDALYAYDVIDRIHDTNGFRYLLSAEARECLLPGTLFSLVGKFMHDINVAWPAWRNLAEVVRHGARDTSGAESPNGIAQEDYESLVGGINFWAPPIVTTLSRKLRASGRSGDATASVLDVGCGTGLYSQLLLREFPRWTATGLDVERIATLANAQALRLGVEERFATRAGDFWRGGWGTGYDLVLFANIFHLQTPASAVRLMRHAAACLAPDGLVAVVDQIVDADREPKTPQDRFALLFAASMTNTGGGDAYTFQEYEEWFTAAGLQRIETLDTPMHRILLARRATEPSAVPEGQASENLYFQ
  - ligand:
      id: B
      smiles: 'N[C@@H](Cc1ccc(O)cc1)C(=O)O'
properties:
  - affinity:
      binder: B
Entity Types:
- Polymers (protein, dna, rna): provide sequence
- Ligands (ligand): provide smiles or ccd code
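For batches of protein-ligand pairs it can be convenient to generate these input files from a script. The sketch below renders the layout shown above as text; the function name boltz2_yaml is hypothetical, and the short sequence is only a placeholder.

```python
def boltz2_yaml(protein_seq, ligand_smiles, protein_id="A", ligand_id="B",
                affinity=True):
    """Render a protein-ligand input in the layout shown above."""
    lines = [
        "version: 1",
        "sequences:",
        "  - protein:",
        f"      id: {protein_id}",
        f"      sequence: {protein_seq}",
        "  - ligand:",
        f"      id: {ligand_id}",
        # Single quotes keep SMILES characters such as '[' and '@' literal.
        f"      smiles: '{ligand_smiles}'",
    ]
    if affinity:
        lines += ["properties:", "  - affinity:", f"      binder: {ligand_id}"]
    return "\n".join(lines) + "\n"


text = boltz2_yaml("MVTPEGNVSLVDES", "N[C@@H](Cc1ccc(O)cc1)C(=O)O")
print(text)
```

Writing the returned string to a .yaml file inside an input directory lets predict pick it up along with the other inputs.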
Multiple Identical Chains:
- protein:
    id: [A, B] # Two identical chains
    sequence: ...
boltz_results_prot/
├── structures/
│ ├── prot.cif # Best-ranked predicted structure
│ └── prot_model_1.cif # Additional samples (if diffusion_samples > 1)
├── results.json # One entry per target with confidence/affinity metrics
├── power_profile.csv # (optional, --report-energy)
├── power_profile.png # (optional, --report-energy)
├── prot_pae.npz # (optional, --write_pae)
├── prot_pde.npz # (optional, --write_pde)
└── prot_embeddings.npz # (optional, --write_embeddings)
MSA results are cached in <out_dir>/msa/ (default ./msa/), keyed by sequence hash. The same protein sequence is never searched twice, even across different input files or runs. The MSA search uses all available CPU threads and keeps the database index memory-mapped for maximum speed.
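The idea of keying a cache by sequence hash can be sketched as follows. The hash function (SHA-256), the normalisation step, the .a3m file name, and the helper name msa_cache_path are all assumptions for illustration; TT-Bio's actual key format is not documented here.

```python
import hashlib
from pathlib import Path


def msa_cache_path(sequence, out_dir="."):
    """Map a protein sequence to a cache file name derived from its hash."""
    # Normalise so that case or stray whitespace does not change the key.
    normalised = "".join(sequence.split()).upper()
    digest = hashlib.sha256(normalised.encode("ascii")).hexdigest()
    return Path(out_dir) / "msa" / f"{digest}.a3m"


a = msa_cache_path("MVTPEGNVSLVDES", out_dir="results")
b = msa_cache_path("mvtpegnvslvdes\n", out_dir="results")
print(a == b)  # the same sequence maps to the same cache entry
```

Because the key depends only on the sequence, two input files that contain the same chain resolve to the same cached alignment.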
Each target entry in results.json contains confidence metrics. The fields below are Boltz-2's; an ESMFold2 entry instead carries plddt (mean, 0-1), ptm when available, and n_residues / n_chains.
{
"id": "prot",
"status": "ok",
"confidence_score": 0.84,
"ptm": 0.84,
"iptm": 0.82,
"complex_plddt": 0.84,
"chains_ptm": {
"0": 0.85,
"1": 0.83
},
"pair_chains_iptm": {
"0": {"0": 0.85, "1": 0.72},
"1": {"0": 0.82, "1": 0.83}
}
}
- confidence_score: Overall confidence (0-1, higher is better), calculated as 0.8 × complex_plddt + 0.2 × iptm. Models are ranked by this score
- ptm: Predicted TM-score for complex (0-1)
- iptm: Interface TM-score (0-1)
- complex_plddt: Average per-residue confidence (0-1)
- chains_ptm: Per-chain TM-scores (0-1)
- pair_chains_iptm: Per-chain-pair interface TM-scores (0-1)
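The weighting behind confidence_score can be reproduced directly from the two underlying fields. In this minimal sketch the sample entries and their ids are made up for illustration; only the 0.8 / 0.2 weights come from the description above.

```python
def confidence_score(complex_plddt, iptm):
    """Combine the two metrics with the 0.8 / 0.2 weights given above."""
    return 0.8 * complex_plddt + 0.2 * iptm


def rank_samples(samples):
    """Sort result entries from most to least confident."""
    return sorted(
        samples,
        key=lambda s: confidence_score(s["complex_plddt"], s["iptm"]),
        reverse=True,
    )


# Hypothetical entries; real values come from results.json.
samples = [
    {"id": "sample_a", "complex_plddt": 0.80, "iptm": 0.90},
    {"id": "sample_b", "complex_plddt": 0.84, "iptm": 0.82},
]
for s in rank_samples(samples):
    print(s["id"], round(confidence_score(s["complex_plddt"], s["iptm"]), 3))
```

With the example entry above (complex_plddt 0.84, iptm 0.82) the formula gives 0.836, which rounds to the reported 0.84.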
For affinity targets, the same results.json entry also contains:
{
"affinity_pred_value": 2.47,
"affinity_probability_binary": 0.41,
"affinity_pred_value1": 2.55,
"affinity_pred_value2": 2.19,
"affinity_probability_binary1": 0.50,
"affinity_probability_binary2": 0.42
}
- affinity_probability_binary: Probability of binding (0-1). Use for hit discovery (higher = more likely to bind)
- affinity_pred_value: Predicted binding affinity as log10(IC50) in μM. Use for ligand optimization (lower = stronger binding). Only compare between known active molecules
- affinity_pred_value1, affinity_pred_value2: Individual model predictions for binding affinity
- affinity_probability_binary1, affinity_probability_binary2: Individual model predictions for binding probability
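Since affinity_pred_value is a base-10 logarithm of an IC50 expressed in μM, it can be converted to more familiar units with a little arithmetic. The helper names below are hypothetical; the conversions follow only from the unit definition (1 μM = 1e-6 M).

```python
def ic50_micromolar(affinity_pred_value):
    """Undo the log10: the field is log10(IC50) with IC50 in micromolar."""
    return 10.0 ** affinity_pred_value


def pic50(affinity_pred_value):
    """pIC50 = -log10(IC50 in molar); 1 uM = 1e-6 M, hence the offset of 6."""
    return 6.0 - affinity_pred_value


value = 2.47  # affinity_pred_value from the example entry above
print(f"IC50 ~ {ic50_micromolar(value):.0f} uM, pIC50 ~ {pic50(value):.2f}")
```

A lower affinity_pred_value therefore corresponds to a smaller IC50 and a larger pIC50, consistent with "lower = stronger binding".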
- protein:
    id: A
    sequence: MVTPEGNVSLVDES...
    msa: ./path/to/msa.a3m

- protein:
    id: A
    sequence: MVTPEGNVSLVDES...
    modifications:
      - position: 5
        ccd: PTR # Modified residue code

- ligand:
    id: B
    smiles: 'CC1=CC=CC=C1' # SMILES string
    # OR
    ccd: ATP # CCD code

Pocket Constraints (binding site):
constraints:
  - pocket:
      binder: B # Ligand chain
      contacts: [[A, 10], [A, 11], [A, 12]] # Binding site residues
      max_distance: 6.0 # Angstroms (4-20A, default 6A)
      force: false # Use potential to enforce (default: false)
Contact Constraints:
constraints:
  - contact:
      token1: [A, 10]
      token2: [A, 50]
      max_distance: 8.0
      force: false
Use experimental structures as templates:
templates:
  - cif: ./template.cif
    chain_id: A
    template_id: A
    force: true # Enforce template alignment
    threshold: 2.0 # Max deviation in Angstroms
Options apply to every model unless tagged (Boltz-2).
Common Options:
| Option | Default | Description |
|---|---|---|
| --model | boltz2 | boltz2, esmfold2, esmfold2-fast (single-sequence ESMFold2), or protenix-v2 (AlphaFold3-family folder) |
| --out_dir | ./ | Output directory |
| --cache | ~/.boltz | (Boltz-2) model cache directory; ESMFold2 uses the Hugging Face cache |
| --accelerator | tenstorrent | (Boltz-2) tenstorrent, cpu, or gpu; ESMFold2 always runs on Tenstorrent |
| --recycling_steps | 3 | Number of recycling iterations |
| --sampling_steps | 200 | Diffusion sampling steps |
| --diffusion_samples | 1 | Number of structure samples |
| --output_format | cif | cif or pdb |
| --override | False | Re-run from scratch |
| --use_msa_server | False | Use the online ColabFold API for MSAs (Boltz-2 needs an MSA from this server or a local database; optional for ESMFold2) |
| --use_potentials | False | (Boltz-2) Apply physical constraints |
| --affinity_mw_correction | False | (Boltz-2) Apply MW correction to affinity |
| --num_devices | 0 | Number of TT devices (0=all available) |
| --device_ids | (none) | Comma-separated TT device IDs (e.g. 0,2) |
| --fast | False | Makes some operations use block-fp8, a lower-precision numeric format that runs faster; accuracy is typically very close |
| --listen | (none) | Accept worker connections from other machines; see Multi-Machine Prediction |
| --report-energy | False | (Boltz-2) Enables optional energy profiling for one TT device (requires tt-mgmt add-on); writes power_profile.csv and power_profile.png |
| --energy-metric | both | (Boltz-2) Choose power channel(s): tdp, input, or both |
| --energy-sample-hz | 20.0 | (Boltz-2) Sampling rate in Hz for both power_w and input_power_w channels |
Affinity-Specific Options (Boltz-2):
| Option | Default | Description |
|---|---|---|
| --sampling_steps_affinity | 200 | Sampling steps for affinity |
| --diffusion_samples_affinity | 5 | Number of affinity samples |
MSA Options (Boltz-2; used by ESMFold2 only when you opt into an MSA):
| Option | Default | Description |
|---|---|---|
| --msa_db_path | auto-detect | Path to local ColabFold database |
| --use_envdb | False | Also search environmental database |
| --use_msa_server | False | Use ColabFold API for MSA |
| --msa_server_url | https://api.colabfold.com | MSA server URL |
| --msa_pairing_strategy | greedy | greedy or complete |
| --max_msa_seqs | 8192 | Maximum MSA sequences |
| --subsample_msa | False | Subsample MSA |
| --num_subsampled_msa | 1024 | Number of subsampled sequences |
MSA Database Setup Options:
| Option | Default | Description |
|---|---|---|
| --db | uniref30 | uniref30 (~500GB), envdb (~800GB), or all |
| --path | ~/.boltz/msa_db | Where to store the databases |
| --install-tools | True | Auto-install missing mmseqs/colabfold_search |
For --use_msa_server:
Basic Authentication:
export BOLTZ_MSA_USERNAME=myuser
export BOLTZ_MSA_PASSWORD=mypassword
tt-bio predict ... --model boltz2 --use_msa_server
API Key Authentication:
export MSA_API_KEY_VALUE=your-api-key
tt-bio predict ... --model boltz2 --use_msa_server
Combine the cards across any mix of Tenstorrent machines (a workstation, one or more QuietBoxes, one or more Galaxy servers) into a single run.
On the machine driving the run:
tt-bio predict ./proteins --model boltz2 --listen 8765 --use_msa_server --fast
On every additional machine, replace HOST with the driving machine's hostname or IP:
tt-bio worker --connect http://HOST:8765
Use --report-energy to profile energy during prediction:
tt-bio predict examples/686.yaml --model boltz2 --override --device_ids 0 --report-energy --energy-metric both --energy-sample-hz 5
Behavior:
- Select metric channel(s) with --energy-metric (tdp, input, both)
- Uses one sampling rate (--energy-sample-hz, default 20 Hz)
- Supports only Tenstorrent runs with one selected device
- Records two power channels when available:
  - power_w: tt-mgmt UMD telemetry power (TDP channel)
  - input_power_w: tt-mgmt UMD telemetry input power
- Requires optional tt-mgmt installation:
  - git clone --recursive https://github.com/aperezvicente-TT/tt-mgmt.git
  - pip install -e ./tt-mgmt
- Prints energy summary metrics for selected channels
- Always writes:
  - power_profile.csv
  - power_profile.png
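A power trace can be turned into an energy figure by integrating power over time. The sketch below applies the trapezoidal rule to a small in-memory stand-in for power_profile.csv; the power_w and input_power_w column names come from the list above, while the time_s column name and the sample values are assumptions, so check the header of the real file before reusing this.

```python
import csv
import io


def energy_joules(rows, time_key="time_s", power_key="power_w"):
    """Integrate power over time with the trapezoidal rule (W * s = J)."""
    total = 0.0
    for prev, cur in zip(rows, rows[1:]):
        dt = float(cur[time_key]) - float(prev[time_key])
        mean_power = (float(cur[power_key]) + float(prev[power_key])) / 2.0
        total += mean_power * dt
    return total


# Stand-in for power_profile.csv; the time_s column name is an assumption.
sample = io.StringIO(
    "time_s,power_w,input_power_w\n"
    "0.0,100.0,110.0\n"
    "0.2,120.0,130.0\n"
    "0.4,110.0,125.0\n"
)
rows = list(csv.DictReader(sample))
print(energy_joules(rows), "J on the power_w channel")
print(energy_joules(rows, power_key="input_power_w"), "J on input_power_w")
```

For a real run, replace the StringIO object with an open file handle on power_profile.csv.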
BoltzGen designs protein binders against a target. The pipeline runs design → inverse folding → folding → analysis → filtering and writes the top-ranked binders to <output>/final_ranked_designs/.
tt-bio gen run examples/binder.yaml --num_designs 10
This automatically uses every available card (splitting the designs across them and merging the results) and writes to ./binder/. Add --device_ids 0,2 to run on specific cards only.
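One way to picture the split across cards is an even division with the remainder handed to the first few devices. This is only an illustrative sketch: the function name split_designs and the even-split policy are assumptions, and the actual scheduling inside tt-bio gen run may differ.

```python
def split_designs(num_designs, device_ids):
    """Divide num_designs across devices as evenly as possible."""
    base, extra = divmod(num_designs, len(device_ids))
    # The first `extra` devices take one additional design each.
    return {
        dev: base + (1 if i < extra else 0)
        for i, dev in enumerate(device_ids)
    }


print(split_designs(10, [0, 1, 2, 3]))
print(split_designs(10, [0, 2]))
```

The per-device counts always sum to the requested number of designs, so merging the partial results yields the full set.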
entities:
  - protein:
      id: B
      sequence: 80..120 # designed chain, sampled length per design
  - file:
      path: target.cif # target structure (path relative to this yaml)
      include:
        - chain:
            id: A

80..120 randomises the binder length per design; a fixed integer pins it. Ligand, DNA, and RNA targets use the same YAML grammar as tt-bio predict. See the BoltzGen examples for binding sites, scaffolds, and residue constraints.
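The difference between a range and a fixed length can be sketched as below. The helper names are hypothetical, and uniform sampling with inclusive bounds is an assumption; BoltzGen's own sampling rule is not described here.

```python
import random


def parse_length_spec(spec):
    """Return (min, max) for 'LO..HI', or (n, n) for a fixed integer."""
    text = str(spec)
    if ".." in text:
        lo, hi = (int(part) for part in text.split("..", 1))
    else:
        lo = hi = int(text)
    if lo > hi:
        raise ValueError(f"empty length range: {spec!r}")
    return lo, hi


def sample_length(spec, rng=random):
    """Draw one binder length; assumes uniform sampling, inclusive bounds."""
    lo, hi = parse_length_spec(spec)
    return rng.randint(lo, hi)


rng = random.Random(0)  # seeded so repeated runs give the same draws
print([sample_length("80..120", rng) for _ in range(5)])
print(sample_length(100))
```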
--protocol sets defaults appropriate for the binder type.
| Protocol | Use for |
|---|---|
| protein-anything (default) | de-novo protein binder |
| peptide-anything | peptide binder |
| nanobody-anything | nanobody / VHH |
| antibody-anything | antibody |
| protein-small_molecule | binder against a small-molecule target (adds affinity step) |
| protein-redesign | re-design existing residues (e.g. symmetric dimers) |
--steps restricts the pipeline.
tt-bio gen run examples/binder.yaml --steps design --num_designs 10
tt-bio gen run examples/binder.yaml --output existing/ --steps analysis filtering

| Option | Default | Description |
|---|---|---|
| --protocol | protein-anything | Protocol; sets defaults appropriate for the binder type |
| --num_designs | 10000 | Number of binders to generate |
| --budget | 30 | Number of top designs kept after filtering |
| --output | ./<basename>/ | Output directory |
| --steps | (all) | Run only specific stages |
| --config STEP key=val | (none) | Override per-stage config (e.g. --config design sampling_steps=200) |
| --device_ids | all cards | Restrict to specific cards (e.g. 0,2) |
| --fast | False | Use block-fp8 for some ops (slightly lower precision, faster) |
| --cache | ~/.boltz/boltzgen | Cache for downloaded weights |
| --debug | False | Disable live display; show raw stage output |
| --debug --log | False | Add per-stage progress markers |
If you use this code or the models in your research, please cite the following papers:
@article{passaro2025boltz2,
author = {Passaro, Saro and Corso, Gabriele and Wohlwend, Jeremy and Reveiz, Mateo and Thaler, Stephan and Somnath, Vignesh Ram and Getz, Noah and Portnoi, Tally and Roy, Julien and Stark, Hannes and Kwabi-Addo, David and Beaini, Dominique and Jaakkola, Tommi and Barzilay, Regina},
title = {Boltz-2: Towards Accurate and Efficient Binding Affinity Prediction},
year = {2025},
doi = {10.1101/2025.06.14.659707},
journal = {bioRxiv}
}
@article{stark2025boltzgen,
author = {Stark, Hannes and Faltings, Felix and Choi, MinGyu and Xie, Yuxin and Hur, Eunsu and O'Donnell, Timothy John and Bushuiev, Anton and U{\c c}ar, Talip and Passaro, Saro and Mao, Weian and Reveiz, Mateo and Bushuiev, Roman and Pluskal, Tom{\'a}{\v s} and Sivic, Josef and Kreis, Karsten and Vahdat, Arash and Ray, Shamayeeta and Goldstein, Jonathan T. and Savinov, Andrew and Hambalek, Jacob A. and Gupta, Anshika and Taquiri-Diaz, Diego A. and Zhang, Yaotian and Hatstat, A. Katherine and Arada, Angelika and Kim, Nam Hyeong and Tackie-Yarboi, Ethel and Boselli, Dylan and Schnaider, Lee and Liu, Chang C. and Li, Gene-Wei and Hnisz, Denes and Sabatini, David M. and DeGrado, William F. and Wohlwend, Jeremy and Corso, Gabriele and Barzilay, Regina and Jaakkola, Tommi},
title = {BoltzGen: Toward Universal Binder Design},
year = {2025},
doi = {10.1101/2025.11.20.689494},
journal = {bioRxiv}
}
@article{wohlwend2024boltz1,
author = {Wohlwend, Jeremy and Corso, Gabriele and Passaro, Saro and Getz, Noah and Reveiz, Mateo and Leidal, Ken and Swiderski, Wojtek and Atkinson, Liam and Portnoi, Tally and Chinn, Itamar and Silterra, Jacob and Jaakkola, Tommi and Barzilay, Regina},
title = {Boltz-1: Democratizing Biomolecular Interaction Modeling},
year = {2024},
doi = {10.1101/2024.11.19.624167},
journal = {bioRxiv}
}
@misc{candido2026language,
author = {Candido, Salvatore and Hayes, Thomas and Derry, Alexander and Rao, Roshan and Lin, Zeming and Verkuil, Robert and others},
title = {Language Modeling Materializes a World Model of Protein Biology},
year = {2026},
url = {https://biohub.ai/papers/esm_protein.pdf},
note = {Preprint; ESMC / ESMFold2}
}
@misc{protenix2025,
author = {{ByteDance AML AI4Science Team}},
title = {Protenix: An AlphaFold3 Reproduction for Biomolecular Structure Prediction},
year = {2025},
url = {https://github.com/bytedance/Protenix}
}
In addition, if you use the automatic MSA generation, please cite:
@article{mirdita2022colabfold,
title={ColabFold: making protein folding accessible to all},
author={Mirdita, Milot and Sch{\"u}tze, Konstantin and Moriwaki, Yoshitaka and Heo, Lim and Ovchinnikov, Sergey and Steinegger, Martin},
journal={Nature methods},
year={2022}
}
tt-bio is released under the MIT License (see LICENSE) and is built on the MIT-licensed Boltz-2 / Boltz-1 code. It bundles third-party code, each under its upstream license: the ESMFold2 host-side reference under tt_bio/_vendor/ (the esm pipeline, MIT, © Chan Zuckerberg Biohub; and the HuggingFace ESMFold2 model definition, Apache-2.0) and the BoltzGen binder-design source under tt_bio/boltzgen/ (MIT, © Hannes Stärk). Protenix-v2 is an independent ttnn reimplementation (no upstream code is vendored), and its weights download from ByteDance's Hugging Face mirror under Apache-2.0. See NOTICE for sources, versions, and modifications.