# 🧬 Boltz-2 Python Client Demo



Copyright (c) 2025, NVIDIA CORPORATION. All rights reserved.



This notebook demonstrates the **boltz2-python-client** package, a Python client for NVIDIA's Boltz-2 protein structure prediction service.

## 📦 Installation

```bash
pip install boltz2-python-client
```

## 🎯 What this package provides:

- **Protein structure prediction** from amino acid sequences
- **Protein-ligand complex modeling** (SMILES/CCD ligands) 
- **Covalent complex prediction** with bond constraints
- **DNA-protein interaction modeling**
- **MSA-guided predictions** for improved accuracy
- **Rich CLI interface** with progress bars
- **Both local and NVIDIA hosted endpoint** support


## 🖥️ Command Line Interface Examples

The package provides a rich CLI with the `boltz2` command:


In [1]:
# Check available commands
!boltz2 --help


Usage: boltz2 [OPTIONS] COMMAND [ARGS]...

  Boltz-2 Python Client CLI

  Supports both local deployments and NVIDIA hosted endpoints.

  Examples:

  # Local endpoint boltz2 --base-url http://localhost:8000 protein
  "MKTVRQERLK..."

  # NVIDIA hosted endpoint   boltz2 --base-url https://health.api.nvidia.com
  --endpoint-type nvidia_hosted --api-key YOUR_KEY protein "MKTVRQERLK..."

  # Using environment variable for API key export NVIDIA_API_KEY=your_api_key
  boltz2 --base-url https://health.api.nvidia.com --endpoint-type
  nvidia_hosted protein "MKTVRQERLK..."

Options:
  --base-url TEXT                 Service base URL
  --api-key TEXT                  API key for NVIDIA hosted endpoints (or set
                                  NVIDIA_API_KEY env var)
  --endpoint-type [local|nvidia_hosted]
                                  Type of endpoint: local or nvidia_hosted
  --timeout FLOAT                 Request timeout in seconds
  --poll-seconds INTEGER          Polling interval for 

In [2]:
# 6. Health check
!boltz2 health


✅ Service is healthy (Status: healthy)

In [3]:
# 1. Basic protein structure prediction (local endpoint)
!boltz2 protein "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"


ℹ️ Predicting structure for protein sequence (length: 65)
ℹ️ Parameters: recycling_steps=3, sampling_steps=50
ℹ️             diffusion_samples=1, step_scale=1.638
Prediction completed! 0:00:06
✅ Prediction completed successfully!
ℹ️ Generated 1 structure(s)
ℹ️ Average confidence: 0.939
ℹ️ Structures saved to: .


In [4]:
# 2. Protein-ligand complex (aspirin)
!boltz2 ligand "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG" \
    --smiles "CC(=O)OC1=CC=CC=C1C(=O)O" --no-save


ℹ️ Predicting protein-ligand complex
ℹ️ Protein length: 65
ℹ️ Ligand: CC(=O)OC1=CC=CC=C1C(=O)O
Prediction completed! 0:00:07
✅ Complex prediction completed successfully!
ℹ️ Generated 1 structure(s)
ℹ️ Average confidence: 0.866


In [5]:
# 3. Covalent complex with bond constraints
!boltz2 covalent "MKTVRQERLKSCVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG" \
    --ccd U4U --bond A:12:SG:LIG:C22 --no-save


ℹ️ Predicting covalent complex structure
ℹ️ Protein length: 65
ℹ️ Ligand CCD: U4U
ℹ️ Bond constraints: 1
ℹ️   1. A:12:SG ↔ LIG:1:C22
Prediction completed! 0:00:08
✅ Covalent complex prediction completed successfully!
ℹ️ Generated 1 structure(s)
ℹ️ Average confidence: 0.834


In [6]:
# 4. DNA-protein complex
!boltz2 dna-protein \
    --protein-sequences "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG" \
    --dna-sequences "ATCGATCGATCGATCG" --no-save


ℹ️ Predicting DNA-protein complex
ℹ️ Proteins: 1 sequences
ℹ️ DNA: 1 sequences
ℹ️ Concatenate MSAs: False
Prediction completed! 0:00:06
✅ DNA-protein complex prediction completed successfully!
ℹ️ Generated 1 structure(s)
ℹ️ Average confidence: 0.862


In [7]:
# 5. NVIDIA hosted endpoint with API key  
# Note: Set NVIDIA_API_KEY environment variable first
# !boltz2 --base-url https://health.api.nvidia.com --endpoint-type nvidia_hosted \
#     protein "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG" --no-save
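
The same hosted-endpoint call can be assembled from Python before handing it to `subprocess`. The sketch below only builds the argument list from the flags shown in the `--help` output above and relies on the `NVIDIA_API_KEY` environment variable rather than passing `--api-key`. The helper name `hosted_protein_command` is hypothetical and not part of the package.

```python
import os


def hosted_protein_command(sequence: str, save: bool = False) -> list[str]:
    """Build the argv for a hosted-endpoint `boltz2 protein` call (hypothetical helper)."""
    # The CLI help says the key may come from NVIDIA_API_KEY, so fail early if it is unset.
    if not os.environ.get("NVIDIA_API_KEY"):
        raise RuntimeError("Set NVIDIA_API_KEY before calling the hosted endpoint")
    argv = [
        "boltz2",
        "--base-url", "https://health.api.nvidia.com",
        "--endpoint-type", "nvidia_hosted",
        "protein", sequence,
    ]
    if not save:
        argv.append("--no-save")
    return argv
```

To run it, pass the list to `subprocess.run(hosted_protein_command(sequence), check=True)`. Passing a list avoids shell quoting problems with long sequences.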


In [8]:
# Affinity Prediction Using CCD code (Chemical Component Dictionary)
# !boltz2 ligand GMGLGYGSWEIDPKDLTFLKELGTGQFGVVKYGKWRGQYDVAIKMIKEGSMSEDEFIEEAKVMMNLSHEKLVQLYGVCTKQRPIFIITEYMANGCLLNYLREMRHRFQTQQLLEMCKDVCEAMEYLESKQFLHRDLAARNCLVNDQGVVKVSDFGLSRYVLDDEYTSSVGSKFPVRWSPPEVLMYSKFSSKSDIWAFGVLMWEIYSLGKMPYERFTNSETAEHIAQGLRLYRPHLASEKVYTIMYSCWHEKADERPTFKILLSNILDVMDEES \
#   --ccd Y7W \
#   --predict-affinity \
#   --output-dir ./affinity_results

# The run below uses a SMILES string (aspirin) instead of a CCD code


!boltz2 ligand GMGLGYGSWEIDPKDLTFLKELGTGQFGVVKYGKWRGQYDVAIKMIKEGSMSEDEFIEEAKVMMNLSHEKLVQLYGVCTKQRPIFIITEYMANGCLLNYLREMRHRFQTQQLLEMCKDVCEAMEYLESKQFLHRDLAARNCLVNDQGVVKVSDFGLSRYVLDDEYTSSVGSKFPVRWSPPEVLMYSKFSSKSDIWAFGVLMWEIYSLGKMPYERFTNSETAEHIAQGLRLYRPHLASEKVYTIMYSCWHEKADERPTFKILLSNILDVMDEES \
  --smiles "CC(=O)Oc1ccccc1C(=O)O" \
  --predict-affinity \
  --output-dir ./affinity_results

ℹ️ Predicting protein-ligand complex
ℹ️ Protein length: 273
ℹ️ Ligand: CC(=O)Oc1ccccc1C(=O)O
ℹ️ Affinity prediction: ENABLED
ℹ️   - Sampling steps: 200
ℹ️   - Diffusion samples: 5
ℹ️   - MW correction: False
Prediction completed! 0:00:37
✅ Complex prediction completed successfully!
ℹ️ Generated 1 structure(s)
ℹ️ Average confidence: 0.914

📊 Affinity Prediction Results:
┏━━━━━━━━━━━━━━━━━━━━━┳━━━━━━━━━━━━┓
┃ Metric              ┃ Value      ┃
┡━━━━━━━━━━━━━━━━━━━━━╇━━━━━━━━━━━━┩
│ pIC50               │ 5.418      │
│ log
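
To read the pIC50 value as a concentration, the usual definition is pIC50 = -log10(IC50 in mol/L). The sketch below applies that definition to the value reported above. It assumes the service follows this convention, which the output itself does not state.

```python
def pic50_to_ic50_molar(pic50: float) -> float:
    """Convert pIC50 to IC50 in mol/L, assuming pIC50 = -log10(IC50 [M])."""
    return 10.0 ** (-pic50)


ic50_m = pic50_to_ic50_molar(5.418)   # value taken from the table above
ic50_um = ic50_m * 1e6                # mol/L -> micromolar
print(f"IC50 ≈ {ic50_um:.2f} µM")     # prints: IC50 ≈ 3.82 µM
```

Under that assumption, a higher pIC50 means tighter predicted binding: each unit corresponds to a tenfold lower IC50.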

## 🐍 Python API Examples

The Python API provides both synchronous and asynchronous interfaces:


In [9]:
# Minimal imports - that's all you need!
from boltz2_client import Boltz2Client
print("Ready to predict! 🚀")


Ready to predict! 🚀


### 🚀 Ultra-Minimal Usage Examples

The absolute shortest ways to use boltz2-python-client:

**Note**: In Jupyter notebooks, use the async client with `await`. The `Boltz2SyncClient` is for regular Python scripts only.
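
A plausible reason for this restriction is that a notebook cell already runs inside an event loop, and `asyncio.run()` refuses to start a second one. The sketch below shows that behaviour with a stand-in coroutine (`fake_predict` is hypothetical, not part of the package). Whether `Boltz2SyncClient` uses `asyncio.run()` internally is an assumption.

```python
import asyncio


async def fake_predict(sequence: str) -> int:
    """Stand-in for an async client call; returns the sequence length."""
    await asyncio.sleep(0)
    return len(sequence)


def run_from_script(sequence: str) -> int:
    # In a plain script no loop is running yet, so asyncio.run() works.
    return asyncio.run(fake_predict(sequence))


async def run_inside_loop(sequence: str) -> str:
    # Inside a running loop (as in a Jupyter cell) asyncio.run() raises RuntimeError.
    coro = fake_predict(sequence)
    try:
        asyncio.run(coro)
    except RuntimeError as exc:
        coro.close()  # the coroutine never started; close it to avoid a warning
        return f"RuntimeError: {exc}"
    return "no error"


if __name__ == "__main__":
    print(run_from_script("MKTV"))
    print(asyncio.run(run_inside_loop("MKTV")))
```

In a notebook, `await fake_predict("MKTV")` works directly, which matches the `await` usage in the cells that follow.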


In [10]:
# ONE-LINER: Predict protein structure (Jupyter notebook)
result = await Boltz2Client().predict_protein_structure(sequence="MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG")

# For synchronous usage (outside Jupyter - in regular Python scripts):
# from boltz2_client import Boltz2SyncClient
# result = Boltz2SyncClient().predict_protein_structure(sequence="...")

# TWO-LINER: Create client once, use multiple times
client = Boltz2Client()
result = await client.predict_protein_structure(sequence="MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG")

# Save structure directly
await client.predict_protein_structure(sequence="MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG", save_structures=True)


PredictionResponse(structures=[StructureData(format='mmcif', structure="data_model\n_entry.id model\n_struct.entry_id model\n_struct.pdbx_model_details .\n_struct.pdbx_structure_determination_methodology computational\n_struct.title .\n_audit_conform.dict_location https://raw.githubusercontent.com/ihmwg/ModelCIF/80e1e22/dist/mmcif_ma.dic\n_audit_conform.dict_name mmcif_ma.dic\n_audit_conform.dict_version 1.4.7\n#\nloop_\n_chem_comp.id\n_chem_comp.type\n_chem_comp.name\n_chem_comp.formula\n_chem_comp.formula_weight\n_chem_comp.ma_provenance\nALA 'L-peptide linking' . . . 'CCD Core'\nARG 'L-peptide linking' . . . 'CCD Core'\nASN 'L-peptide linking' . . . 'CCD Core'\nASP 'L-peptide linking' . . . 'CCD Core'\nGLN 'L-peptide linking' . . . 'CCD Core'\nGLU 'L-peptide linking' . . . 'CCD Core'\nGLY 'L-peptide linking' . . . 'CCD Core'\nILE 'L-peptide linking' . . . 'CCD Core'\nLEU 'L-peptide linking' . . . 'CCD Core'\nLYS 'L-peptide linking' . . . 'CCD Core'\nMET 'L-peptide linking' . . . 'CC
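
The repr above shows that each entry in `result.structures` carries a `format` and the structure text itself in `structure`. A minimal sketch for writing those entries to disk by hand follows. The helper `write_structures` and the file naming are hypothetical, and the stand-in object only mimics the two fields visible in the repr.

```python
from pathlib import Path
from types import SimpleNamespace


def write_structures(result, out_dir: str = ".", stem: str = "structure") -> list[Path]:
    """Write each predicted structure's text to its own file (hypothetical helper)."""
    out = Path(out_dir)
    out.mkdir(parents=True, exist_ok=True)
    paths = []
    for i, item in enumerate(result.structures):
        # The repr reports format='mmcif'; use the conventional .cif extension for it.
        ext = "cif" if item.format == "mmcif" else item.format
        path = out / f"{stem}_{i}.{ext}"
        path.write_text(item.structure)
        paths.append(path)
    return paths


# Stand-in with the same two fields, so the sketch runs without a server.
demo = SimpleNamespace(
    structures=[SimpleNamespace(format="mmcif", structure="data_model\n#\n")]
)
print(write_structures(demo, out_dir="demo_structures"))
```

With a real response, `write_structures(result)` could be used instead of `save_structures=True` when you want control over file names.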

### 1. Basic Protein Structure Prediction (3 Lines!)


In [11]:
# Minimal protein prediction (3 lines!)
from boltz2_client import Boltz2Client

client = Boltz2Client()
result = await client.predict_protein_structure(sequence="MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG")
print(f"Confidence: {result.confidence_scores[0]:.3f}")


Confidence: 0.940


### 2. Protein-Ligand Complex (2 Lines!)


In [12]:
# Minimal protein-ligand complex (2 lines!)
result = await client.predict_protein_ligand_complex(
    protein_sequence="MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG",
    ligand_smiles="CC(=O)OC1=CC=CC=C1C(=O)O"  # Aspirin
)
print(f"Ligand complex confidence: {result.confidence_scores[0]:.3f}")


Ligand complex confidence: 0.873


### 3. More Minimal Examples


In [13]:
# Covalent complex (using bond constraints)
from boltz2_client import BondConstraint

result = await client.predict_covalent_complex(
    protein_sequence="MKTVRQERLKSCVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG",
    ligand_ccd="U4U",
    covalent_bonds=[(12, "SG", "C22")]  # (residue_number, protein_atom, ligand_atom)
)
print(f"Covalent complex confidence: {result.confidence_scores[0]:.3f}")

# DNA-protein complex (one line!)
result = await client.predict_dna_protein_complex(
    protein_sequences=["MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"],
    dna_sequences=["ATCGATCGATCGATCG"]
)
print(f"DNA-protein confidence: {result.confidence_scores[0]:.3f}")

# With MSA guidance - only if you have an MSA file!
# First check if example MSA file exists
import os
msa_path = "boltz2-python-client/examples/msa-kras-g12c_combined.a3m"
if os.path.exists(msa_path):
    result = await client.predict_protein_structure(
        sequence="MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG",
        msa_files=[(msa_path, "a3m")]
    )
    print(f"MSA-guided confidence: {result.confidence_scores[0]:.3f}")
else:
    print("ℹ️ MSA example skipped - no MSA file found")
    print("   To use MSA, provide path to your .a3m, .sto, .fasta, or .csv alignment file")


Covalent complex confidence: 0.847
DNA-protein confidence: 0.872
ℹ️ MSA example skipped - no MSA file found
   To use MSA, provide path to your .a3m, .sto, .fasta, or .csv alignment file
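
The CLI bond string `A:12:SG:LIG:C22` and the Python tuple `(12, "SG", "C22")` appear to describe the same bond. The sketch below converts one into the other and checks that residue 12 of the example sequence is a cysteine, the residue that carries an `SG` atom. The field order and the 1-based numbering are inferred from the examples above, and `parse_bond_spec` is a hypothetical helper.

```python
from typing import NamedTuple


class BondSpec(NamedTuple):
    chain: str
    residue: int
    protein_atom: str
    ligand_id: str
    ligand_atom: str


def parse_bond_spec(spec: str) -> BondSpec:
    """Parse 'CHAIN:RESIDUE:ATOM:LIGAND:ATOM' (field order inferred from the CLI example)."""
    parts = spec.split(":")
    if len(parts) != 5:
        raise ValueError(f"expected CHAIN:RESIDUE:ATOM:LIGAND:ATOM, got {spec!r}")
    chain, residue, protein_atom, ligand_id, ligand_atom = parts
    return BondSpec(chain, int(residue), protein_atom, ligand_id, ligand_atom)


def to_api_tuple(bond: BondSpec) -> tuple[int, str, str]:
    """Return (residue_number, protein_atom, ligand_atom) as used in the cell above."""
    return (bond.residue, bond.protein_atom, bond.ligand_atom)


sequence = "MKTVRQERLKSCVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"
bond = parse_bond_spec("A:12:SG:LIG:C22")
print(to_api_tuple(bond))             # (12, 'SG', 'C22')
print(sequence[bond.residue - 1])     # C  (assuming 1-based residue numbering)
```

A check like this catches off-by-one errors before a prediction is submitted. Note that the covalent example sequence has `C` at position 12 where the other examples have `I`.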


## ⚡ Raw API vs Client Comparison

Here's a direct comparison showing the advantages of using the boltz2-python-client:


### 🔴 Raw API Request (Complex & Error-Prone)


In [14]:
import httpx
import asyncio
import os

async def predict_protein_raw_api(sequence):
    """
    Raw API approach - complex and error-prone!
    """
    # Manual endpoint configuration
    base_url = "https://health.api.nvidia.com"
    api_key = os.getenv("NVIDIA_API_KEY")
    
    if not api_key:
        print("❌ API key required")
        return None
    
    # Manual header construction
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
        "NVCF-POLL-SECONDS": "10"
    }
    
    # Manual payload construction
    payload = {
        "polymers": [{
            "id": "A",
            "molecule_type": "protein", 
            "sequence": sequence
        }],
        "recycling_steps": 1,
        "sampling_steps": 10
    }
    
    print("🔴 Raw API: Making request...")
    
    try:
        async with httpx.AsyncClient(timeout=300.0) as client:
            # Initial request
            response = await client.post(
                f"{base_url}/v1/biology/mit/boltz2/predict",
                json=payload,
                headers=headers
            )
            
            # Manual 202 handling for NVIDIA endpoints
            if response.status_code == 202:
                print("🔴 Raw API: Got 202, handling polling manually...")
                
                # Extract task ID
                task_id = response.headers.get("nvcf-reqid")
                if not task_id:
                    raise Exception("No task ID in 202 response")
                
                # Manual polling loop
                poll_url = f"https://api.nvcf.nvidia.com/v2/nvcf/pexec/status/{task_id}"
                
                max_attempts = 30
                for attempt in range(max_attempts):
                    print(f"🔴 Raw API: Polling attempt {attempt + 1}/{max_attempts}...")
                    await asyncio.sleep(10)  # Fixed polling interval
                    
                    status_response = await client.get(poll_url, headers=headers)
                    
                    if status_response.status_code == 200:
                        print("🔴 Raw API: Task completed!")
                        response = status_response
                        break
                    elif status_response.status_code in [400, 401, 404, 422, 500]:
                        raise Exception(f"Task failed: {status_response.status_code} - {status_response.text}")
                    
                    # Continue polling for other status codes
                else:
                    raise Exception("Polling timeout - task did not complete")
            
            elif response.status_code != 200:
                raise Exception(f"Request failed: {response.status_code} - {response.text}")
            
            # Manual response parsing
            data = response.json()
            
            # Extract confidence (manual parsing)
            confidence = None
            if "confidence_scores" in data and data["confidence_scores"]:
                confidence = data["confidence_scores"][0]
            
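            # Formatting None with :.3f would raise TypeError, so guard the missing case
            if confidence is not None:
                print(f"🔴 Raw API: Success! Confidence: {confidence:.3f}")
            else:
                print("🔴 Raw API: Success! No confidence score returned")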
            return data
            
    except Exception as e:
        print(f"🔴 Raw API: Error - {e}")
        return None

# Example usage (commented out to avoid actual API calls)
# sequence = "MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG"
# result = await predict_protein_raw_api(sequence)

print("🔴 Raw API code shown above - complex with manual polling, error handling, etc.")


🔴 Raw API code shown above - complex with manual polling, error handling, etc.
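
The raw example polls at a fixed 10-second interval. One common refinement is exponential backoff, sketched below with an injected `fetch_status` coroutine so it runs without a network. This is an illustration only, not the client's actual implementation. `poll_until_done` and its parameters are hypothetical, and treating 202 as "still pending" follows the raw example above.

```python
import asyncio
from typing import Awaitable, Callable, Optional, Tuple


async def poll_until_done(
    fetch_status: Callable[[], Awaitable[Tuple[int, Optional[dict]]]],
    *,
    initial_delay: float = 1.0,
    max_delay: float = 30.0,
    factor: float = 2.0,
    max_attempts: int = 10,
) -> dict:
    """Poll until fetch_status returns HTTP 200, backing off between attempts."""
    delay = initial_delay
    for _ in range(max_attempts):
        status, body = await fetch_status()
        if status == 200:
            return body or {}
        if status != 202:  # anything other than "still pending" is treated as failure
            raise RuntimeError(f"task failed with HTTP {status}")
        await asyncio.sleep(delay)
        delay = min(delay * factor, max_delay)  # grow the wait, capped at max_delay
    raise TimeoutError(f"task not finished after {max_attempts} attempts")


async def demo() -> dict:
    # Dummy responses: pending twice, then done. The payload is a placeholder.
    responses = iter([(202, None), (202, None), (200, {"done": True})])

    async def fake_fetch() -> Tuple[int, Optional[dict]]:
        return next(responses)

    return await poll_until_done(fake_fetch, initial_delay=0.01)


if __name__ == "__main__":
    print(asyncio.run(demo()))
```

In a notebook, call `await demo()` instead of `asyncio.run(demo())`.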


### 🟢 boltz2-python-client (Simple & Robust)


In [15]:
# boltz2-python-client: ALL complexity handled automatically! (2 lines)
result = await client.predict_protein_structure(sequence="MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG")
print(f"🟢 Success! Confidence: {result.confidence_scores[0]:.3f}")

# That's it! 
# ✅ Automatic polling for NVIDIA endpoints
# ✅ Automatic error handling and retries  
# ✅ Automatic response parsing
# ✅ Type-safe with full IDE support


🟢 Success! Confidence: 0.941


In [16]:
from boltz2_client import Boltz2Client

client = Boltz2Client()
result = await client.predict_protein_structure(
    sequence="MKTVRQERLKSIVRILERSKEPVSGAQLAEELSVSRQVIVQDIAYLRSLGYNIVATPRGYVLAGG" 
)
print(f"Confidence: {result.confidence_scores[0]:.3f}")

Confidence: 0.941


## 📊 Key Advantages Summary

| Feature | Raw API | boltz2-python-client |
|---------|---------|----------------------|
| **Lines of code** | 80+ lines | **1-3 lines!** |
| **Error handling** | Manual try/except | **Automatic with retries** |
| **Rate limiting** | Manual delays | **Built-in protection** |
| **NVIDIA polling** | Manual 202 handling | **Automatic polling** |
| **Response parsing** | Manual JSON parsing | **Pydantic models** |
| **Type safety** | None | **Full type hints** |
| **Progress tracking** | None | **Rich progress bars** |
| **File handling** | Manual save/load | **Automatic CIF output** |
| **Configuration** | Manual payload building | **YAML + Python objects** |
| **CLI interface** | None | **Rich CLI with colors** |

### 🎯 The client transforms:
- **80+ lines of complex code** → **1-3 lines of simple code**
- **Manual error handling** → **Automatic retries and graceful failures**
- **Raw JSON responses** → **Typed Python objects**
- **No progress feedback** → **Rich progress bars and status updates**


### 🚀 Minimal Examples Shown:
- **Basic prediction**: 3 lines (or 1 line!)
- **Protein-ligand**: 2 lines
- **Synchronous (no async)**: 1 line with `Boltz2SyncClient`
- **Any complex**: Usually just 1-2 lines!


## 🎉 Conclusion

The **boltz2-python-client** package provides:

✅ **Simplified API** - Complex predictions in just a few lines  
✅ **Production-ready** - Built-in error handling, retries, and rate limiting  
✅ **Rich CLI** - Beautiful command-line interface with progress bars  
✅ **Type safety** - Full Pydantic model validation  
✅ **Flexible endpoints** - Seamless local ↔ NVIDIA hosted switching  
✅ **Advanced features** - Batch processing, MSA support, covalent complexes  

### 📦 Installation:
```bash
pip install boltz2-python-client
```

### 📚 Documentation:
- README.md - Getting started guide
- YAML_GUIDE.md - YAML configuration reference  
- ASYNC_GUIDE.md - Async programming patterns
- COVALENT_COMPLEX_GUIDE.md - Covalent bonding examples
- PARAMETERS.md - Complete parameter reference

**Transform complex protein structure prediction into simple, reliable Python code!** 🧬✨
