# Covalent Protein-Ligand Complex Prediction using Boltz-2 NIM and Visualization with Molstar
Copyright (c) 2025, NVIDIA CORPORATION. Licensed under the Apache License, Version 2.0 (the "License"); you may not use this file except in compliance with the License. You may obtain a copy of the License at http://www.apache.org/licenses/LICENSE-2.0 Unless required by applicable law or agreed to in writing, software distributed under the License is distributed on an "AS IS" BASIS, WITHOUT WARRANTIES OR CONDITIONS OF ANY KIND, either express or implied. See the License for the specific language governing permissions and limitations under the License.

## Prerequisites
This notebook leverages NVIDIA BioNeMo Boltz-2 NIM hosted locally. It is also possible to use NVIDIA-hosted NIM to run this workflow.  
Visit https://build.nvidia.com for instructions to run self-hosted or NVIDIA-hosted NIMs and system requirements for individual NIMs.

### Steps to launch the Boltz-2 NIM locally
Execute the following code snippets in a bash terminal.
```bash
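# Log in to the NVIDIA container registry. At the prompts, enter:
#   Username: $oauthtoken   (the literal string, not a shell variable)
#   Password: <PASTE_API_KEY_HERE>
docker login nvcr.io

export NGC_API_KEY="<your personal NGC key>"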
export LOCAL_NIM_CACHE=~/.cache/nim
mkdir -p $LOCAL_NIM_CACHE

docker run -it \
    --runtime=nvidia \
    -p 8000:8000 \
    -e NGC_API_KEY \
    -v "$LOCAL_NIM_CACHE":/opt/nim/.cache \
    nvcr.io/nim/mit/boltz2:1.1.0
```

--- 

__This notebook demonstrates covalent protein-ligand complex prediction using Boltz-2 NIM.__

### Example Details 
- **Source**: Human KRAS G12C bound to a covalent ligand (https://www.rcsb.org/structure/8DNJ)
- **Protein**: Ras-like protein (168-residue sequence as used in this notebook)
- **Ligand**: U4U (Chemical Component Dictionary code)
- **Covalent Bond**: Protein Chain A, Cys12 SG ↔ LIG C22

## Setup and Imports

In [1]:
# Install required packages
import subprocess
import sys

def install_package(package):
    """Install `package` with pip unless it can already be imported."""
    # Drop any version pin; this assumes the import name matches the package name
    module_name = package.split('==')[0]
    try:
        __import__(module_name)
        print(f"✅ {package} is already installed")
    except ImportError:
        print(f"📦 Installing {package}...")
        subprocess.check_call([sys.executable, "-m", "pip", "install", package])
        print(f"✅ {package} installed successfully")

# Install required packages
install_package("httpx")
install_package("ipymolstar")
print("\n🎉 All packages ready!")

✅ httpx is already installed
✅ ipymolstar is already installed

🎉 All packages ready!


In [2]:
import asyncio
import json
import os
import time
from pathlib import Path
from datetime import datetime
from typing import Dict, Any, Optional
import httpx
from ipymolstar import PDBeMolstar
from IPython.display import display, HTML

print("All imports successful!")

All imports successful!


### Configuration

In [3]:
# Local Boltz-2 NIM endpoint
BOLTZ2_URL = "http://localhost:8000/biology/mit/boltz2/predict"
HEALTH_URL = "http://localhost:8000/v1/health/live"

# Protein sequence from covalent.txt
PROTEIN_SEQUENCE = "MTEYKLVVVGACGVGKSALTIQLIQNHFVDEYDPTIEDSYRKQVVIDGETCLLDILDTAGQEEYSAMRDQYMRTGEGFLCVFAINNTKSFEDIHHYREQIKRVKDSEDVPMVLVGNKCDLPSRTVDTKQAQDLARSYGIPFIETSAKTRQGVDDAFYTLVREIRKHKE"

print(f"Boltz-2 Endpoint: {BOLTZ2_URL}")
print(f"Protein sequence length: {len(PROTEIN_SEQUENCE)} residues")

Boltz-2 Endpoint: http://localhost:8000/biology/mit/boltz2/predict
Protein sequence length: 168 residues


### Health Check

In [4]:
async def check_nim_health():
    try:
        async with httpx.AsyncClient(timeout=10.0) as client:
            response = await client.get(HEALTH_URL)
            if response.status_code == 200:
                print("✅ Boltz-2 NIM is running and accessible")
                return True
            else:
                print(f"⚠️ Health check returned status {response.status_code}")
                return False
    except Exception as e:
        print(f"❌ Cannot connect to Boltz-2 NIM: {e}")
        return False

nim_healthy = await check_nim_health()

✅ Boltz-2 NIM is running and accessible
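
A single health check can fail simply because the container is still starting, for example while model files are being fetched into the local cache on first launch. The following is a minimal sketch of polling until the service is live. `wait_until_ready` is a hypothetical helper (not part of the NIM or `httpx`), and the attempt count and delay are illustrative assumptions.

```python
import asyncio
from typing import Awaitable, Callable


async def wait_until_ready(check: Callable[[], Awaitable[bool]],
                           attempts: int = 30,
                           delay_s: float = 10.0) -> bool:
    """Poll an async health check until it returns True or attempts run out."""
    for attempt in range(1, attempts + 1):
        if await check():
            return True
        # Sleep between attempts, but not after the final one
        if attempt < attempts:
            await asyncio.sleep(delay_s)
    return False
```

In this notebook it could be used as `nim_healthy = await wait_until_ready(check_nim_health)`, since `check_nim_health` already returns a boolean.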


### API Client Functions

In [5]:
async def make_boltz2_prediction(request_data: Dict[str, Any], timeout: int = 900) -> Optional[Dict]:
    headers = {"Content-Type": "application/json"}
    
    async with httpx.AsyncClient(timeout=timeout) as client:
        print(f"🚀 Making prediction request...")
        
        try:
            start_time = time.time()
            response = await client.post(BOLTZ2_URL, json=request_data, headers=headers)
            duration = time.time() - start_time
            
            print(f"📡 Response received in {duration:.2f} seconds")
            print(f"📊 Status code: {response.status_code}")
            
            if response.status_code == 200:
                print("✅ Prediction successful!")
                return response.json()
            else:
                print(f"❌ Prediction failed: {response.status_code}")
                print(f"Error: {response.text}")
                return None
        except Exception as e:
            print(f"❌ Request failed: {e}")
            return None

print("API client functions defined!")

API client functions defined!


### Molstar Visualization Functions

In [6]:
def create_molstar_viewer(structure_data: str, title: str = "Covalent Complex", 
                         width="900px", height="600px"):
    """
    Create Molstar viewer for covalent complex visualization.
    
    Args:
        structure_data: mmCIF structure data as string
        title: Title for the visualization (accepted but not rendered by the viewer)
        width: Viewer width as a CSS size string, e.g. "900px"
        height: Viewer height as a CSS size string, e.g. "600px"
    
    Returns:
        PDBeMolstar viewer widget
    """
    # The viewer is given bytes, so encode the mmCIF text directly (no temporary file needed)
    structure_bytes = structure_data.encode("utf-8")
    
    # Create custom data for Molstar
    custom_data = {
        'data': structure_bytes,
        'format': 'cif',
        'binary': False,
    }
    
    # Create Molstar viewer with enhanced settings
    viewer = PDBeMolstar(
        bg_color="black",
        custom_data=custom_data,
        theme='dark',
        hide_water=True,
        hide_carbs=True,
        hide_non_standard=False,
        width=width,
        height=height,
        hide_controls_icon=False,
        hide_expand_icon=False,
        hide_settings_icon=False,
        hide_selection_icon=False,
        hide_animation_icon=False
    )
    
    return viewer

def create_molstar_viewer_minimal(structure_data: str, title: str = "Minimal View", 
                                 width="900px", height="600px"):
    """
    Create minimal Molstar viewer with hidden controls for clean presentation.
    """
    # Encode the mmCIF text to bytes directly (no temporary file needed)
    structure_bytes = structure_data.encode("utf-8")
    
    custom_data = {
        'data': structure_bytes,
        'format': 'cif',
        'binary': False,
    }
    
    # Minimal viewer for clean presentation
    viewer = PDBeMolstar(
        custom_data=custom_data,
        theme='light',
        hide_water=True,
        hide_carbs=True,
        width=width,
        height=height,
        hide_controls_icon=True,
        hide_expand_icon=True,
        hide_settings_icon=True,
        hide_selection_icon=True,
        hide_animation_icon=True
    )
    
    return viewer

print("Molstar visualization functions defined!")

Molstar visualization functions defined!


### Information Panels

In [7]:
def create_covalent_analysis_panel():
    """Create analysis panel for covalent interactions."""
    return """
    <div style='background: #f8f9fa; border: 2px solid #28a745; border-radius: 8px; padding: 20px; margin: 15px 0; font-family: Arial, sans-serif;'>
        <h4 style='margin-top: 0; color: #28a745; text-align: center;'>🔗 Covalent Complex Analysis Guide</h4>
        <div style='display: grid; grid-template-columns: repeat(3, 1fr); gap: 15px; font-size: 14px;'>
            <div style='background: black; padding: 12px; border-radius: 5px; border-left: 4px solid #dc3545;'>
                <h5 style='margin-top: 0; color: #dc3545;'>🎯 Binding Site</h5>
                <p style='margin: 0; line-height: 1.5;'>
                    <strong>Residue:</strong> Cys12<br>
                    <strong>Atom:</strong> SG (Sulfur)<br>
                    <strong>Role:</strong> Nucleophilic attack site
                </p>
            </div>
            <div style='background: black; padding: 12px; border-radius: 5px; border-left: 4px solid #007bff;'>
                <h5 style='margin-top: 0; color: #007bff;'>💊 Ligand</h5>
                <p style='margin: 0; line-height: 1.5;'>
                    <strong>Code:</strong> U4U<br>
                    <strong>Atom:</strong> C22 (Carbon)<br>
                    <strong>Role:</strong> Electrophilic center
                </p>
            </div>
            <div style='background: black; padding: 12px; border-radius: 5px; border-left: 4px solid #ffc107;'>
                <h5 style='margin-top: 0; color: #ffc107;'>⚡ Bond</h5>
                <p style='margin: 0; line-height: 1.5;'>
                    <strong>Type:</strong> Covalent C-S<br>
                    <strong>Mechanism:</strong> Nucleophilic substitution<br>
                    <strong>Strength:</strong> Irreversible
                </p>
            </div>
        </div>
        <div style='margin-top: 15px; padding: 12px; background: black; border-radius: 5px; text-align: center;'>
            <strong>🔍 Analysis Tips:</strong> Look for the covalent bond between Cys12 and the ligand. 
            Use Molstar's measurement tools to analyze bond lengths and angles.
        </div>
    </div>
    """

print("Information panel functions defined!")

Information panel functions defined!


## Covalent Complex Request Setup

In [8]:
# Covalent complex request payload
covalent_request_data = {
    "polymers": [
        {
            "id": "A",
            "molecule_type": "protein",
            "sequence": PROTEIN_SEQUENCE,
            "cyclic": False,
            "modifications": []
        }
    ],
    "ligands": [
        {
            "id": "LIG",
            "ccd": "U4U"
        }
    ],
    "constraints": [
        {
            "constraint_type": "bond",
            "atoms": [
                {
                    "id": "A",
                    "residue_index": 12,
                    "atom_name": "SG"
                },
                {
                    "id": "LIG",
                    "residue_index": 1,
                    "atom_name": "C22"
                }
            ]
        }
    ],
    "recycling_steps": 4,
    "sampling_steps": 75,
    "diffusion_samples": 1,
    "step_scale": 1.4,
    "without_potentials": False,
    "output_format": "mmcif",
    "concatenate_msas": False
}

print("🧬 Covalent Complex Setup")
print(f"Protein: Ras-like ({len(PROTEIN_SEQUENCE)} residues)")
print(f"Ligand: U4U (CCD code)")
print(f"Covalent bond: Cys12 SG ↔ LIG C22")
print(f"Parameters: {covalent_request_data['recycling_steps']} recycling, {covalent_request_data['sampling_steps']} sampling")

🧬 Covalent Complex Setup
Protein: Ras-like (168 residues)
Ligand: U4U (CCD code)
Covalent bond: Cys12 SG ↔ LIG C22
Parameters: 4 recycling, 75 sampling
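
Before sending the request, it can help to confirm that the bond constraint points at the residue you intend. The sketch below checks that each constrained atom refers to a known polymer or ligand id and that an `SG` atom sits on a cysteine. `check_bond_constraints` is a hypothetical helper, and it assumes `residue_index` counts from 1, which is consistent with index 12 being used for Cys12 in this notebook but is not confirmed here against the API schema.

```python
from typing import Any, Dict, List


def check_bond_constraints(request: Dict[str, Any]) -> List[str]:
    """Return human-readable problems found in 'bond' constraints (empty if none)."""
    sequences = {p["id"]: p["sequence"] for p in request.get("polymers", [])}
    ligand_ids = {lig["id"] for lig in request.get("ligands", [])}
    problems: List[str] = []

    for constraint in request.get("constraints", []):
        if constraint.get("constraint_type") != "bond":
            continue
        for atom in constraint.get("atoms", []):
            chain, index, name = atom["id"], atom["residue_index"], atom["atom_name"]
            if chain in sequences:
                seq = sequences[chain]
                # Assumption: residue_index counts from 1, as "Cys12" suggests
                if not 1 <= index <= len(seq):
                    problems.append(f"{chain}:{index} is outside the sequence (length {len(seq)})")
                elif name == "SG" and seq[index - 1] != "C":
                    problems.append(f"{chain}:{index} is '{seq[index - 1]}', but SG belongs to cysteine")
            elif chain not in ligand_ids:
                problems.append(f"'{chain}' matches no polymer or ligand id")
    return problems


# Toy request using only the first 12 residues of the notebook's sequence
demo_request = {
    "polymers": [{"id": "A", "molecule_type": "protein", "sequence": "MTEYKLVVVGAC"}],
    "ligands": [{"id": "LIG", "ccd": "U4U"}],
    "constraints": [{
        "constraint_type": "bond",
        "atoms": [
            {"id": "A", "residue_index": 12, "atom_name": "SG"},
            {"id": "LIG", "residue_index": 1, "atom_name": "C22"},
        ],
    }],
}
print(check_bond_constraints(demo_request))  # prints []
```

Calling `check_bond_constraints(covalent_request_data)` on the full payload would apply the same checks to the real request.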


## Run Prediction

In [9]:
if nim_healthy:
    print(f"🎯 Starting covalent complex prediction at {datetime.now()}")
    
    covalent_result = await make_boltz2_prediction(covalent_request_data, timeout=900)
    
    if covalent_result:
        timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
        output_file = f"covalent_molstar_{timestamp}.json"
        
        with open(output_file, 'w') as f:
            json.dump(covalent_result, f, indent=2)
        
        print(f"💾 Results saved to: {output_file}")
        
        # Save structure files
        structure_files = []
        if 'structures' in covalent_result:
            for i, structure in enumerate(covalent_result['structures']):
                if structure.get('format') == 'mmcif':
                    structure_file = f"covalent_molstar_structure_{i+1}_{timestamp}.cif"
                    with open(structure_file, 'w') as f:
                        f.write(structure['structure'])
                    structure_files.append(structure_file)
                    print(f"💾 Structure {i+1} saved to: {structure_file}")
    else:
        print("❌ Prediction failed")
        covalent_result = None
else:
    print("❌ NIM not accessible")
    covalent_result = None

🎯 Starting covalent complex prediction at 2025-06-05 11:38:28.979915
🚀 Making prediction request...
📡 Response received in 7.19 seconds
📊 Status code: 200
✅ Prediction successful!
💾 Results saved to: covalent_molstar_20250605_113836.json
💾 Structure 1 saved to: covalent_molstar_structure_1_20250605_113836.cif
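
Because the full response is saved as JSON, the structures can be recovered later without rerunning the prediction. This is a minimal sketch that mirrors the response layout used in the cell above (a `structures` list whose entries carry `format` and `structure` keys). `load_mmcif_structures` is a hypothetical helper name.

```python
import json
from pathlib import Path
from typing import List


def load_mmcif_structures(json_path: str) -> List[str]:
    """Read a saved prediction response and return its mmCIF structures as text."""
    result = json.loads(Path(json_path).read_text(encoding="utf-8"))
    return [
        entry["structure"]
        for entry in result.get("structures", [])
        if entry.get("format") == "mmcif"
    ]
```

For the run shown above, `load_mmcif_structures("covalent_molstar_20250605_113836.json")` would return a list with one mmCIF string, which can be passed to `create_molstar_viewer`.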


## 🌟 Visualize with Molstar

Let's visualize the resulting .cif file(s) using Molstar:

In [10]:
if covalent_result and 'structures' in covalent_result:
    print("🌟 Creating Molstar visualizations...")
    
    # Display covalent analysis panel
    display(HTML(create_covalent_analysis_panel()))
    
    # Average confidence across all returned scores, if the response includes any
    confidence_info = ""
    scores = covalent_result.get('confidence_scores')
    if scores:
        confidence_info = f" (Confidence: {sum(scores) / len(scores):.3f})"
    
    for i, structure in enumerate(covalent_result['structures']):
        if structure.get('format') == 'mmcif':
            structure_data = structure['structure']
            
            # The title is passed to the helper but is not rendered by the viewer
            title = f"Molstar: Ras + U4U Covalent Complex{confidence_info}"
            
            print(f"\n🌟 Molstar Visualization {i+1}:")
            
            # Create Molstar viewer with full controls
            viewer = create_molstar_viewer(structure_data, title, width="1000px", height="700px")
            display(viewer)
else:
    print("❌ No structures available for Molstar visualization")

🌟 Creating Molstar visualizations...



🌟 Molstar Visualization 1:


PDBeMolstar(bg_color='black', custom_data={'data': b"data_model\n_entry.id model\n_struct.entry_id model\n_str…

## Summary

This notebook demonstrates covalent protein-ligand complex prediction with the Boltz-2 NIM and interactive visualization of the result in Molstar:

### ✅ **Key Features:**
1. **Local NIM Integration** - Direct connection to your local Boltz-2 instance
2. **Health Checking** - Verifies NIM availability before making requests
3. **Interactive 3D Visualization** - Molstar (via `ipymolstar`) for structure viewing
4. **Two Viewer Configurations** - A dark viewer with full controls and a minimal light viewer with controls hidden
5. **Covalent Complex Support** - Bond constraint between Cys12 SG and ligand atom C22
6. **Confidence Score Reporting** - Averages `confidence_scores` when the response includes them
7. **File Output** - Saves both JSON results and individual mmCIF structure files
8. **Parameter Flexibility** - Easy to adjust prediction quality vs. speed

### 🎨 **Visualization Features:**
- **Interactive Controls**: Mouse rotation, zoom, pan
- **Viewer Options**: Water and carbohydrates hidden; control, expand, settings, selection, and animation icons shown or hidden
- **Analysis Panel**: HTML guide summarizing the binding site, ligand, and covalent bond
- **Measurements**: Molstar's measurement tools for bond lengths and angles

### **API Parameters:**
- **recycling_steps**: 1-6 (affects accuracy, default: 3)
- **sampling_steps**: 10-1000 (affects quality, default: 50)
- **diffusion_samples**: 1-5 (multiple predictions, default: 1)
- **step_scale**: 0.5-5.0 (temperature, default: 1.638)
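
The ranges listed above can be checked locally before a request is sent. The sketch below copies the ranges from this list as written; they are not verified here against the NIM's API schema, and `find_out_of_range` is a hypothetical helper.

```python
from typing import Any, Dict, Tuple

# Ranges copied from the parameter list above (assumption: they match the deployed NIM)
PARAM_RANGES: Dict[str, Tuple[float, float]] = {
    "recycling_steps": (1, 6),
    "sampling_steps": (10, 1000),
    "diffusion_samples": (1, 5),
    "step_scale": (0.5, 5.0),
}


def find_out_of_range(request: Dict[str, Any]) -> Dict[str, Any]:
    """Return the parameters in `request` whose values fall outside PARAM_RANGES."""
    return {
        name: request[name]
        for name, (low, high) in PARAM_RANGES.items()
        if name in request and not low <= request[name] <= high
    }


# Values used for the prediction in this notebook
settings = {"recycling_steps": 4, "sampling_steps": 75,
            "diffusion_samples": 1, "step_scale": 1.4}
print(find_out_of_range(settings))  # prints {}
```

Parameters missing from the request are skipped, so the check can be run on a full payload such as `covalent_request_data`.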

### 📁 **Output Files:**
- `covalent_molstar_YYYYMMDD_HHMMSS.json` - Complete API response
- `covalent_molstar_structure_N_YYYYMMDD_HHMMSS.cif` - Individual structure files

### 🚀 **Next Steps:**
1. Experiment with different protein sequences
2. Try various ligands using SMILES notation
3. Adjust parameters for your speed/quality needs
4. Export structures for external analysis
5. Compare multiple predictions side-by-side
6. Analyze confidence scores for structure quality assessment
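
For the last two steps, a run with `diffusion_samples` greater than 1 returns several structures that can be ordered by confidence. The following is a minimal sketch that assumes `confidence_scores` holds one number per entry in `structures`, in the same order. That alignment is an assumption rather than something this notebook confirms, and `rank_by_confidence` is a hypothetical helper.

```python
from typing import Any, Dict, List, Tuple


def rank_by_confidence(result: Dict[str, Any]) -> List[Tuple[int, float]]:
    """Return (structure_index, score) pairs sorted from highest to lowest score."""
    structures = result.get("structures", [])
    scores = result.get("confidence_scores", [])
    # Assumption: one score per structure, in matching order
    if len(structures) != len(scores):
        raise ValueError("expected one confidence score per structure")
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return [(i, scores[i]) for i in order]


# Toy response with three placeholder structures
toy_result = {"structures": [{}, {}, {}], "confidence_scores": [0.5, 0.9, 0.7]}
print(rank_by_confidence(toy_result))  # prints [(1, 0.9), (2, 0.7), (0, 0.5)]
```

The top-ranked index can then be used to pick which entry of `covalent_result['structures']` to display first.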