# Structure Server - Comprehensive Test Suite

This notebook tests all MCP tools in `structure_server.py`:

1. **fetch_molecules** - Fetch structures from PDB/AlphaFold/PDB-REDO
2. **inspect_molecules** - Inspect structure files to analyze chains and molecules
3. **split_molecules** - Split multi-chain structures into individual chains
4. **clean_protein** - Clean protein structures for MD simulation
5. **clean_ligand** - Clean and prepare ligands using SMILES template matching
6. **run_antechamber_robust** - GAFF2 parameterization with AM1-BCC charges
7. **prepare_complex** - Complete workflow (split + clean + parameterize)

The tests cover:
- Normal operation
- Edge cases
- Error handling (LLM-friendly error responses)
- Boltz-2 predicted structures (computational models)
- Ligand preparation and force field generation


In [43]:
# Setup
import sys
sys.path.insert(0, '..')

from pathlib import Path
import json
import importlib
import asyncio

# For running async functions in notebook
import nest_asyncio
nest_asyncio.apply()

print("Setup complete")


Setup complete


In [44]:
# Check dependencies
print("Checking dependencies...\n")

deps = {
    "gemmi": "Structure parsing (mmCIF/PDB)",
    "pdbfixer": "Protein structure cleaning",
    "openmm": "Molecular simulation",
    "httpx": "Async HTTP client",
    "rdkit": "Ligand processing and charge estimation"
}

for module, desc in deps.items():
    try:
        __import__(module)
        print(f"✓ {module}: {desc}")
    except ImportError:
        print(f"✗ {module}: {desc} (NOT INSTALLED)")

# Check external tools
print("\nChecking external tools...")
from common.base import BaseToolWrapper

tools = {
    "pdb4amber": "Amber naming conventions",
    "antechamber": "GAFF2 parameterization",
    "parmchk2": "Missing parameter generation",
    "obabel": "Format conversion"
}

for tool, desc in tools.items():
    wrapper = BaseToolWrapper(tool, conda_env="mcp-md")
    print(f"{'✓' if wrapper.is_available() else '✗'} {tool} ({desc})")


Checking dependencies...

✓ gemmi: Structure parsing (mmCIF/PDB)
✓ pdbfixer: Protein structure cleaning
✓ openmm: Molecular simulation
✓ httpx: Async HTTP client
✓ rdkit: Ligand processing and charge estimation

Checking external tools...
✓ pdb4amber (Amber naming conventions)
✓ antechamber (GAFF2 parameterization)
✓ parmchk2 (Missing parameter generation)
✓ obabel (Format conversion)


In [45]:
# Import and reload the structure server module
import servers.structure_server as structure_module
importlib.reload(structure_module)

# Import tools directly
from servers.structure_server import (
    fetch_molecules,
    inspect_molecules,
    split_molecules,
    clean_protein,
    clean_ligand,
    run_antechamber_robust,
    prepare_complex
)

print("Structure server tools imported successfully")


Structure server tools imported successfully


In [46]:
# Helper function to display results nicely
def show_result(result: dict, title: str = "Result"):
    """Display result dictionary with formatting"""
    print(f"\n{'='*60}")
    print(f" {title}")
    print(f"{'='*60}")
    
    # Check success status
    if result.get('success'):
        print("\n✓ SUCCESS")
    else:
        print("\n✗ FAILED")
    
    # Show errors if any
    if result.get('errors'):
        print("\nErrors:")
        for err in result['errors']:
            print(f"  - {err}")
    
    # Show warnings if any
    if result.get('warnings'):
        print("\nWarnings:")
        for warn in result['warnings']:
            print(f"  - {warn}")
    
    # Show key fields
    skip_keys = {'success', 'errors', 'warnings', 'operations'}
    print("\nDetails:")
    for k, v in result.items():
        if k not in skip_keys:
            if isinstance(v, (dict, list)) and len(str(v)) > 100:
                print(f"  {k}: [complex data, {len(v) if isinstance(v, list) else 'dict'}]")
            else:
                print(f"  {k}: {v}")
    
    # Show operations if present
    if result.get('operations'):
        print("\nOperations:")
        for op in result['operations']:
            status_icon = "✓" if op.get('status') in ['success', 'detected', 'added', 'replaced'] else "○"
            print(f"  {status_icon} {op.get('step')}: {op.get('status')} - {(op.get('details') or '')[:60]}")

print("Helper function defined")


Helper function defined


---
## Test 1: fetch_molecules

Test fetching a structure from the PDB, plus error handling for an invalid PDB ID.
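
The fetch tests in this section read the same result dictionary. A minimal sketch of a consistency check over that dictionary, assuming only the keys that appear in the printed test output (`success`, `errors`, `file_path`, `num_atoms`, `chains`). `check_fetch_result` is a hypothetical helper for illustration and is not part of `structure_server.py`:

```python
def check_fetch_result(result: dict) -> list[str]:
    """Return a list of inconsistencies found in a fetch-style result dict."""
    problems = []
    if result.get("success"):
        # A successful fetch should point at a file with atoms and chains.
        if not result.get("file_path"):
            problems.append("success reported but file_path is empty")
        if result.get("num_atoms", 0) <= 0:
            problems.append("success reported but num_atoms is not positive")
        if not result.get("chains"):
            problems.append("success reported but no chains listed")
    else:
        # A failed fetch should explain itself and leave no file behind.
        if not result.get("errors"):
            problems.append("failure reported without any error message")
        if result.get("file_path"):
            problems.append("failure reported but file_path is set")
    return problems


# Sample dictionaries mirroring the fields printed by the fetch tests.
ok = {"success": True, "file_path": "output/1CRN.cif",
      "num_atoms": 327, "chains": ["A"]}
bad = {"success": False, "errors": ["Structure not found: XXXX (HTTP 404)"],
       "file_path": None, "num_atoms": 0, "chains": []}

print(check_fetch_result(ok))
print(check_fetch_result(bad))
```

Returning a list of problems instead of raising keeps the check usable both in assertions and in printed reports.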


In [47]:
# Test 1.1: Fetch from PDB (small protein: 1CRN - crambin)
print("Test 1.1: Fetch 1CRN from PDB")

result = asyncio.run(fetch_molecules("1CRN", source="pdb"))
show_result(result, "Fetch 1CRN from PDB")

# Verify file exists
if result['success'] and result['file_path']:
    print(f"\nFile size: {Path(result['file_path']).stat().st_size} bytes")


2025-12-05 19:09:35,894 - servers.structure_server - INFO - Fetching 1CRN from pdb


Test 1.1: Fetch 1CRN from PDB


2025-12-05 19:09:36,324 - servers.structure_server - INFO - Downloaded 1CRN to output/1CRN.cif


2025-12-05 19:09:36,330 - servers.structure_server - INFO - Successfully fetched 1CRN: 327 atoms, chains: ['A']



 Fetch 1CRN from PDB

✓ SUCCESS

Details:
  pdb_id: 1CRN
  source: pdb
  file_path: output/1CRN.cif
  file_format: cif
  num_atoms: 327
  chains: ['A']

File size: 69506 bytes


In [48]:
# Test 1.2: Fetch non-existent PDB ID (error handling)
print("Test 1.2: Fetch non-existent PDB ID")

result = asyncio.run(fetch_molecules("XXXX", source="pdb"))
show_result(result, "Fetch Invalid PDB ID")

# Check that error handling is LLM-friendly
assert not result['success'], "Should fail for invalid PDB ID"
assert len(result['errors']) > 0, "Should have error messages"
print("\n✓ Error handling works correctly")


2025-12-05 19:09:36,338 - servers.structure_server - INFO - Fetching XXXX from pdb


Test 1.2: Fetch non-existent PDB ID



 Fetch Invalid PDB ID

✗ FAILED

Errors:
  - Structure not found: XXXX (HTTP 404)
  - Hint: Verify the PDB ID is correct. Try searching at https://www.rcsb.org/

  - mmCIF not available, falling back to PDB format

Details:
  pdb_id: XXXX
  source: pdb
  file_path: None
  file_format: None
  num_atoms: 0
  chains: []

✓ Error handling works correctly
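
The HTTP 404 above is only discovered after a network round trip. As a hedged sketch, a caller could screen obviously malformed IDs first, assuming the classic four-character form (one digit followed by three alphanumeric characters). `looks_like_pdb_id` is a hypothetical helper; whether `fetch_molecules` performs a similar check internally is not shown in this notebook:

```python
import re

# Classic PDB IDs: one digit followed by three alphanumeric characters.
# Extended IDs (the newer "pdb_" prefixed form) are deliberately not covered.
_PDB_ID = re.compile(r"[0-9][A-Za-z0-9]{3}")


def looks_like_pdb_id(candidate: str) -> bool:
    """Return True if the string has the shape of a classic PDB ID."""
    return _PDB_ID.fullmatch(candidate.strip()) is not None


for candidate in ("1CRN", "1ake", "XXXX", "1CR"):
    print(candidate, looks_like_pdb_id(candidate))
```

A shape check like this cannot confirm that an entry exists, so the 404 handling exercised in Test 1.2 is still needed.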


---
## Test 2: inspect_molecules

Test inspecting structure files to analyze chains and molecular composition.
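
The chain summary printed in these tests can be cross-checked against the per-chain list. A minimal sketch, assuming each entry of `chains` carries `chain_id` and `chain_type` as the test code reads them; the sample data mirrors the 1AKE chain types reported in the test output, and `tally_chain_types` is a hypothetical helper:

```python
from collections import Counter

# Sample entries shaped like the `chains` list read by the inspect tests.
chains = [
    {"chain_id": "A", "chain_type": "protein"},
    {"chain_id": "B", "chain_type": "protein"},
    {"chain_id": "C", "chain_type": "ligand"},
    {"chain_id": "D", "chain_type": "ligand"},
    {"chain_id": "E", "chain_type": "water"},
    {"chain_id": "F", "chain_type": "water"},
]


def tally_chain_types(chains):
    """Count chains per type and group chain IDs by type."""
    counts = Counter(c["chain_type"] for c in chains)
    ids_by_type = {}
    for c in chains:
        ids_by_type.setdefault(c["chain_type"], []).append(c["chain_id"])
    return counts, ids_by_type


counts, ids_by_type = tally_chain_types(chains)
print(dict(counts))
print(ids_by_type)
```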


In [49]:
# Test 2.1: Inspect 1AKE (homodimer with ligand)
print("Test 2.1: Inspect 1AKE structure")

# First fetch 1AKE
fetch_result = asyncio.run(fetch_molecules("1AKE", source="pdb"))
if fetch_result['success']:
    result = inspect_molecules(fetch_result['file_path'])
    show_result(result, "Inspect 1AKE")
    
    # Show detailed chain information
    if result['success']:
        print("\n--- Header Information ---")
        for k, v in result.get('header', {}).items():
            print(f"  {k}: {v}")
        
        print("\n--- Entities (from header) ---")
        for entity in result.get('entities', []):
            print(f"  Entity {entity['entity_id']}: {entity.get('name') or '(no name)'}")
            print(f"    Type: {entity['entity_type']}, Polymer: {entity.get('polymer_type')}")
            print(f"    Chains: {entity['chain_ids']}")
        
        print("\n--- Chain Summary ---")
        summary = result.get('summary', {})
        print(f"  Proteins: {summary.get('num_protein_chains', 0)} chains {summary.get('protein_chain_ids', [])}")
        print(f"  Ligands: {summary.get('num_ligand_chains', 0)} chains {summary.get('ligand_chain_ids', [])}")
        print(f"  Waters: {summary.get('num_water_chains', 0)} chains {summary.get('water_chain_ids', [])}")
        print(f"  Ions: {summary.get('num_ion_chains', 0)} chains {summary.get('ion_chain_ids', [])}")
        
        print("\n--- Chains Detail ---")
        for chain in result.get('chains', []):
            print(f"  Chain {chain['chain_id']} ({chain['author_chain']}): {chain['chain_type']}")
            print(f"    Entity: {chain.get('entity_name') or chain.get('entity_id') or 'N/A'}")
            print(f"    Residues: {chain['num_residues']}, Atoms: {chain['num_atoms']}")
            if chain.get('sequence'):
                seq = chain['sequence']
                print(f"    Sequence: {seq[:50]}{'...' if len(seq) > 50 else ''}")
else:
    print("Failed to fetch 1AKE for inspect test")


2025-12-05 19:09:37,624 - servers.structure_server - INFO - Fetching 1AKE from pdb


Test 2.1: Inspect 1AKE structure


2025-12-05 19:09:37,664 - servers.structure_server - INFO - Downloaded 1AKE to output/1AKE.cif


2025-12-05 19:09:37,688 - servers.structure_server - INFO - Successfully fetched 1AKE: 3816 atoms, chains: ['A', 'B', 'C', 'D', 'E', 'F']


2025-12-05 19:09:37,690 - servers.structure_server - INFO - Inspecting molecules in: output/1AKE.cif


2025-12-05 19:09:37,691 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:37,713 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:09:37,714 - servers.structure_server - INFO -   Proteins: 2, Ligands: 2, Waters: 2, Ions: 0



 Inspect 1AKE

✓ SUCCESS

Details:
  source_file: output/1AKE.cif
  file_format: cif
  header: [complex data, dict]
  entities: [complex data, 3]
  num_models: 1
  chains: [complex data, 6]
  summary: [complex data, dict]

--- Header Information ---
  pdb_id: 1AKE
  title: STRUCTURE OF THE COMPLEX BETWEEN ADENYLATE KINASE FROM ESCHERICHIA COLI AND THE INHIBITOR AP5A REFINED AT 1.9 ANGSTROMS RESOLUTION: A MODEL FOR A CATALYTIC TRANSITION STATE
  resolution: 2.0
  spacegroup: P 21 2 21
  experiment_method: X-RAY DIFFRACTION

--- Entities (from header) ---
  Entity 1: (no name)
    Type: polymer, Polymer: PeptideL
    Chains: ['A', 'B']
  Entity 2: (no name)
    Type: nonpolymer, Polymer: None
    Chains: ['C', 'D']
  Entity 3: (no name)
    Type: water, Polymer: None
    Chains: ['E', 'F']

--- Chain Summary ---
  Proteins: 2 chains ['A', 'B']
  Ligands: 2 chains ['C', 'D']
  Waters: 2 chains ['E', 'F']
  Ions: 0 chains []

--- Chains Detail ---
  Chain A (A): protein
    Entity: 1
    

In [50]:
# Test 2.2: Inspect Boltz-2 predicted structure (computational model)
print("Test 2.2: Inspect Boltz-2 predicted structure")

boltz_cif = "boltz_results_ligand/predictions/ligand/ligand_model_0.cif"
result = inspect_molecules(boltz_cif)
show_result(result, "Inspect Boltz-2 Prediction")

# Show detailed information for AI-generated structure
if result['success']:
    print("\n--- Header Information ---")
    for k, v in result.get('header', {}).items():
        print(f"  {k}: {v}")
    
    print("\n--- Entities ---")
    for entity in result.get('entities', []):
        print(f"  Entity {entity['entity_id']}: {entity.get('name') or '(no name)'}")
        print(f"    Type: {entity['entity_type']}, Polymer: {entity.get('polymer_type')}")
        print(f"    Chains: {entity['chain_ids']}")
    
    print("\n--- Chains Summary ---")
    summary = result.get('summary', {})
    print(f"  Proteins: {summary.get('num_protein_chains', 0)} chains")
    print(f"  Ligands: {summary.get('num_ligand_chains', 0)} chains")
    
    print("\n--- Chain Details ---")
    for chain in result.get('chains', []):
        print(f"  Chain {chain['chain_id']}: {chain['chain_type']}")
        print(f"    Residues: {chain['num_residues']}, Atoms: {chain['num_atoms']}")
        print(f"    Residue types: {chain['residue_names']}")


2025-12-05 19:09:37,721 - servers.structure_server - INFO - Inspecting molecules in: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


Test 2.2: Inspect Boltz-2 predicted structure


2025-12-05 19:09:37,723 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:37,744 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:09:37,745 - servers.structure_server - INFO -   Proteins: 2, Ligands: 4, Waters: 0, Ions: 0



 Inspect Boltz-2 Prediction

✓ SUCCESS

Details:
  source_file: boltz_results_ligand/predictions/ligand/ligand_model_0.cif
  file_format: cif
  header: {'pdb_id': 'model'}
  entities: [complex data, 3]
  num_models: 1
  chains: [complex data, 6]
  summary: [complex data, dict]

--- Header Information ---
  pdb_id: model

--- Entities ---
  Entity 1: (no name)
    Type: polymer, Polymer: PeptideL
    Chains: ['A', 'B']
  Entity 2: (no name)
    Type: nonpolymer, Polymer: None
    Chains: ['C', 'D']
  Entity 3: (no name)
    Type: nonpolymer, Polymer: None
    Chains: ['E', 'F']

--- Chains Summary ---
  Proteins: 2 chains
  Ligands: 4 chains

--- Chain Details ---
  Chain A: protein
    Residues: 384, Atoms: 2961
    Residue types: ['ALA', 'ARG', 'ASN', 'ASP', 'CYS', 'GLN', 'GLU', 'GLY', 'HIS', 'ILE', 'LEU', 'LYS', 'MET', 'PHE', 'PRO', 'SER', 'THR', 'TRP', 'TYR', 'VAL']
  Chain B: protein
    Residues: 384, Atoms: 2961
    Residue types: ['ALA', 'ARG', 'ASN', 'ASP', 'CYS', 'GLN', 'GLU'

---
## Test 3: split_molecules

Test splitting multi-chain structures into individual chain files.
The `split_molecules` function uses `inspect_molecules` internally.
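
The output files in these tests follow a `<type>_<n>.pdb` naming pattern, with waters skipped. The sketch below reproduces that pattern from a list of chains. It illustrates the naming observed in the logs and is not the server's actual implementation; `plan_split_outputs` and its `exclude_waters` parameter are assumptions made for this example:

```python
def plan_split_outputs(chains, output_dir, exclude_waters=True):
    """Map each chain to an output file named <type>_<n>.pdb."""
    counters = {}  # running index per chain type
    plan = []
    for chain in chains:
        kind = chain["chain_type"]
        if kind == "water" and exclude_waters:
            continue
        counters[kind] = counters.get(kind, 0) + 1
        plan.append({
            "chain_id": chain["chain_id"],
            "chain_type": kind,
            "file": f"{output_dir}/{kind}_{counters[kind]}.pdb",
        })
    return plan


# Chain types as reported for 1AKE in the test output.
chains = [
    {"chain_id": "A", "chain_type": "protein"},
    {"chain_id": "B", "chain_type": "protein"},
    {"chain_id": "C", "chain_type": "ligand"},
    {"chain_id": "D", "chain_type": "ligand"},
    {"chain_id": "E", "chain_type": "water"},
    {"chain_id": "F", "chain_type": "water"},
]

plan = plan_split_outputs(chains, "output/example")
for entry in plan:
    print(f"Chain {entry['chain_id']} ({entry['chain_type']}): {entry['file']}")
```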


In [51]:
# Test 3.1: Split 1AKE (homodimer with ligand)
print("Test 3.1: Split 1AKE structure")

# First fetch 1AKE (if not already available)
fetch_result = asyncio.run(fetch_molecules("1AKE", source="pdb"))
if fetch_result['success']:
    result = split_molecules(fetch_result['file_path'])
    show_result(result, "Split 1AKE")
    
    # Show chain files
    if result['success']:
        print("\nProtein files:")
        for f in result['protein_files']:
            print(f"  - {f}")
        print("\nLigand files:")
        for f in result['ligand_files']:
            print(f"  - {f}")
        if result['ion_files']:
            print("\nIon files:")
            for f in result['ion_files']:
                print(f"  - {f}")
        
        # Show chain mapping
        print("\nChain to file mapping:")
        for info in result.get('chain_file_info', []):
            print(f"  Chain {info['chain_id']} ({info['chain_type']}): {info['file']}")
else:
    print("Failed to fetch 1AKE for split test")


2025-12-05 19:09:37,752 - servers.structure_server - INFO - Fetching 1AKE from pdb


Test 3.1: Split 1AKE structure


2025-12-05 19:09:37,787 - servers.structure_server - INFO - Downloaded 1AKE to output/1AKE.cif


2025-12-05 19:09:37,807 - servers.structure_server - INFO - Successfully fetched 1AKE: 3816 atoms, chains: ['A', 'B', 'C', 'D', 'E', 'F']


2025-12-05 19:09:37,808 - servers.structure_server - INFO - Splitting structure: output/1AKE.cif


2025-12-05 19:09:37,810 - servers.structure_server - INFO - Inspecting molecules in: output/1AKE.cif


2025-12-05 19:09:37,811 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:37,830 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:09:37,831 - servers.structure_server - INFO -   Proteins: 2, Ligands: 2, Waters: 2, Ions: 0


2025-12-05 19:09:37,833 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:37,836 - servers.structure_server - INFO - Chains to export: ['A', 'B', 'C', 'D']


2025-12-05 19:09:37,848 - servers.structure_server - INFO - Wrote protein: output/85347c82/protein_1.pdb


2025-12-05 19:09:37,859 - servers.structure_server - INFO - Wrote protein: output/85347c82/protein_2.pdb


2025-12-05 19:09:37,861 - servers.structure_server - INFO - Wrote ligand: output/85347c82/ligand_1.pdb


2025-12-05 19:09:37,862 - servers.structure_server - INFO - Wrote ligand: output/85347c82/ligand_2.pdb


2025-12-05 19:09:37,864 - servers.structure_server - INFO - Successfully split structure: 2 protein, 2 ligand, 0 ion, 0 water files



 Split 1AKE

✓ SUCCESS

Details:
  job_id: 85347c82
  output_dir: output/85347c82
  source_file: output/1AKE.cif
  file_format: pdb
  protein_files: ['output/85347c82/protein_1.pdb', 'output/85347c82/protein_2.pdb']
  ligand_files: ['output/85347c82/ligand_1.pdb', 'output/85347c82/ligand_2.pdb']
  ion_files: []
  water_files: []
  all_chains: [complex data, 6]
  chain_file_info: [complex data, 4]
  exclude_waters: True

Protein files:
  - output/85347c82/protein_1.pdb
  - output/85347c82/protein_2.pdb

Ligand files:
  - output/85347c82/ligand_1.pdb
  - output/85347c82/ligand_2.pdb

Chain to file mapping:
  Chain A (protein): output/85347c82/protein_1.pdb
  Chain B (protein): output/85347c82/protein_2.pdb
  Chain C (ligand): output/85347c82/ligand_1.pdb
  Chain D (ligand): output/85347c82/ligand_2.pdb


In [52]:
# Test 3.2: Split with chain selection
print("Test 3.2: Split 1AKE - select only chain A")

if fetch_result['success']:
    result = split_molecules(
        fetch_result['file_path'],
        select_chains=['A']
    )
    show_result(result, "Split 1AKE (Chain A only)")
    
    if result['success']:
        print(f"\nExtracted {len(result['protein_files'])} protein chain(s)")
        print(f"Output directory: {result['output_dir']}")
else:
    print("Skipped - 1AKE not available")


2025-12-05 19:09:37,869 - servers.structure_server - INFO - Splitting structure: output/1AKE.cif


Test 3.2: Split 1AKE - select only chain A


2025-12-05 19:09:37,870 - servers.structure_server - INFO - Inspecting molecules in: output/1AKE.cif


2025-12-05 19:09:37,871 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:37,890 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:09:37,891 - servers.structure_server - INFO -   Proteins: 2, Ligands: 2, Waters: 2, Ions: 0


2025-12-05 19:09:37,893 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:37,896 - servers.structure_server - INFO - Chains to export: ['A']


2025-12-05 19:09:37,908 - servers.structure_server - INFO - Wrote protein: output/a8bb87f0/protein_1.pdb


2025-12-05 19:09:37,909 - servers.structure_server - INFO - Successfully split structure: 1 protein, 0 ligand, 0 ion, 0 water files



 Split 1AKE (Chain A only)

✓ SUCCESS

Details:
  job_id: a8bb87f0
  output_dir: output/a8bb87f0
  source_file: output/1AKE.cif
  file_format: pdb
  protein_files: ['output/a8bb87f0/protein_1.pdb']
  ligand_files: []
  ion_files: []
  water_files: []
  all_chains: [complex data, 6]
  chain_file_info: [complex data, 1]
  exclude_waters: True

Extracted 1 protein chain(s)
Output directory: output/a8bb87f0


In [53]:
# Test 3.3: Split Boltz-2 predicted structure
print("Test 3.3: Split Boltz-2 predicted structure")

boltz_cif = "boltz_results_ligand/predictions/ligand/ligand_model_0.cif"
result = split_molecules(boltz_cif)
show_result(result, "Split Boltz-2 Prediction")

if result['success']:
    print("\nProtein files:")
    for f in result['protein_files']:
        print(f"  - {f}")
    print("\nLigand files:")
    for f in result['ligand_files']:
        print(f"  - {f}")
    if result['ion_files']:
        print("\nIon files:")
        for f in result['ion_files']:
            print(f"  - {f}")
    
    # Show chain mapping
    print("\nChain to file mapping:")
    for info in result.get('chain_file_info', []):
        print(f"  Chain {info['chain_id']} ({info['chain_type']}): {info['file']}")


2025-12-05 19:09:37,915 - servers.structure_server - INFO - Splitting structure: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


Test 3.3: Split Boltz-2 predicted structure


2025-12-05 19:09:37,916 - servers.structure_server - INFO - Inspecting molecules in: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


2025-12-05 19:09:37,917 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:37,944 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:09:37,948 - servers.structure_server - INFO -   Proteins: 2, Ligands: 4, Waters: 0, Ions: 0


2025-12-05 19:09:37,950 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:37,957 - servers.structure_server - INFO - Chains to export: ['A', 'B', 'C', 'D', 'E', 'F']


2025-12-05 19:09:37,980 - servers.structure_server - INFO - Wrote protein: output/adec8515/protein_1.pdb


2025-12-05 19:09:38,002 - servers.structure_server - INFO - Wrote protein: output/adec8515/protein_2.pdb


2025-12-05 19:09:38,003 - servers.structure_server - INFO - Wrote ligand: output/adec8515/ligand_1.pdb


2025-12-05 19:09:38,005 - servers.structure_server - INFO - Wrote ligand: output/adec8515/ligand_2.pdb


2025-12-05 19:09:38,006 - servers.structure_server - INFO - Wrote ligand: output/adec8515/ligand_3.pdb


2025-12-05 19:09:38,007 - servers.structure_server - INFO - Wrote ligand: output/adec8515/ligand_4.pdb


2025-12-05 19:09:38,009 - servers.structure_server - INFO - Successfully split structure: 2 protein, 4 ligand, 0 ion, 0 water files



 Split Boltz-2 Prediction

✓ SUCCESS

Details:
  job_id: adec8515
  output_dir: output/adec8515
  source_file: boltz_results_ligand/predictions/ligand/ligand_model_0.cif
  file_format: pdb
  protein_files: ['output/adec8515/protein_1.pdb', 'output/adec8515/protein_2.pdb']
  ligand_files: [complex data, 4]
  ion_files: []
  water_files: []
  all_chains: [complex data, 6]
  chain_file_info: [complex data, 6]
  exclude_waters: True

Protein files:
  - output/adec8515/protein_1.pdb
  - output/adec8515/protein_2.pdb

Ligand files:
  - output/adec8515/ligand_1.pdb
  - output/adec8515/ligand_2.pdb
  - output/adec8515/ligand_3.pdb
  - output/adec8515/ligand_4.pdb

Chain to file mapping:
  Chain A (protein): output/adec8515/protein_1.pdb
  Chain B (protein): output/adec8515/protein_2.pdb
  Chain C (ligand): output/adec8515/ligand_1.pdb
  Chain D (ligand): output/adec8515/ligand_2.pdb
  Chain E (ligand): output/adec8515/ligand_3.pdb
  Chain F (ligand): output/adec8515/ligand_4.pdb


---
## Test 4: clean_protein

Test protein structure cleaning with PDBFixer.
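
One of the cleaning steps exercised here is disulfide detection, after which bonded cysteines are renamed CYS -> CYX for Amber. The criterion used by the server is not shown in this notebook. A common heuristic is to pair cysteine SG atoms that lie within a distance cutoff, since an S-S bond is roughly 2.05 Å long. The sketch below assumes a 2.5 Å cutoff and uses made-up coordinates and residue numbers; `find_disulfides` is a hypothetical helper:

```python
import math
from itertools import combinations


def find_disulfides(sg_atoms, cutoff=2.5):
    """Pair cysteine SG atoms closer than `cutoff` angstroms.

    sg_atoms: list of (residue_number, (x, y, z)) tuples.
    The 2.5 A cutoff is an assumption chosen for this sketch.
    """
    bonds = []
    for (res_i, xyz_i), (res_j, xyz_j) in combinations(sg_atoms, 2):
        if math.dist(xyz_i, xyz_j) <= cutoff:
            bonds.append((res_i, res_j))
    return bonds


# Made-up SG coordinates (angstroms), not taken from any real structure.
sg_atoms = [
    (5, (0.0, 0.0, 0.0)),
    (30, (2.0, 0.0, 0.0)),
    (12, (10.0, 0.0, 0.0)),
    (44, (10.0, 2.05, 0.0)),
    (21, (20.0, 5.0, 0.0)),  # free cysteine, far from all others
]

bonds = find_disulfides(sg_atoms)
print(bonds)

# Residues taking part in a bond would be the ones renamed to CYX.
cyx_residues = sorted({res for pair in bonds for res in pair})
print(cyx_residues)
```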


In [54]:
# Test 4.1: Clean 1CRN (crambin - has disulfide bonds)
print("Test 4.1: Clean 1CRN (crambin with disulfide bonds)")

# First fetch and split
fetch_result = asyncio.run(fetch_molecules("1CRN", source="pdb"))
if fetch_result['success']:
    split_result = split_molecules(fetch_result['file_path'])
    if split_result['success'] and split_result['protein_files']:
        protein_pdb = split_result['protein_files'][0]
        
        result = clean_protein(protein_pdb)
        show_result(result, "Clean 1CRN")
        
        # Check disulfide bonds
        if result.get('disulfide_bonds'):
            print("\nDisulfide bonds detected:")
            for bond in result['disulfide_bonds']:
                print(f"  {bond['residue1']} <-> {bond['residue2']}")
    else:
        print("Failed to split 1CRN")
else:
    print("Failed to fetch 1CRN")


2025-12-05 19:09:38,015 - servers.structure_server - INFO - Fetching 1CRN from pdb


Test 4.1: Clean 1CRN (crambin with disulfide bonds)


2025-12-05 19:09:38,043 - servers.structure_server - INFO - Downloaded 1CRN to output/1CRN.cif


2025-12-05 19:09:38,045 - servers.structure_server - INFO - Successfully fetched 1CRN: 327 atoms, chains: ['A']


2025-12-05 19:09:38,046 - servers.structure_server - INFO - Splitting structure: output/1CRN.cif


2025-12-05 19:09:38,047 - servers.structure_server - INFO - Inspecting molecules in: output/1CRN.cif


2025-12-05 19:09:38,048 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:38,050 - servers.structure_server - INFO - Successfully inspected structure: 1 chains found


2025-12-05 19:09:38,052 - servers.structure_server - INFO -   Proteins: 1, Ligands: 0, Waters: 0, Ions: 0


2025-12-05 19:09:38,053 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:38,055 - servers.structure_server - INFO - Chains to export: ['A']


2025-12-05 19:09:38,058 - servers.structure_server - INFO - Wrote protein: output/d0b00ed3/protein_1.pdb


2025-12-05 19:09:38,059 - servers.structure_server - INFO - Successfully split structure: 1 protein, 0 ligand, 0 ion, 0 water files


2025-12-05 19:09:38,061 - servers.structure_server - INFO - Cleaning protein structure: output/d0b00ed3/protein_1.pdb


2025-12-05 19:09:38,061 - servers.structure_server - INFO - Loading structure with PDBFixer


2025-12-05 19:09:38,078 - servers.structure_server - INFO - Finding missing residues


2025-12-05 19:09:38,079 - servers.structure_server - INFO - Finding non-standard residues


2025-12-05 19:09:38,080 - servers.structure_server - INFO - Removing heterogens (keep_water=False)


2025-12-05 19:09:38,082 - servers.structure_server - INFO - Finding and adding missing atoms


2025-12-05 19:09:38,083 - servers.structure_server - INFO - Detecting disulfide bonds


2025-12-05 19:09:38,085 - servers.structure_server - INFO - Detected 3 disulfide bonds, renamed 6 residues to CYX


2025-12-05 19:09:38,086 - servers.structure_server - INFO - Adding hydrogens at pH 7.4


2025-12-05 19:09:38,218 - servers.structure_server - INFO - Writing cleaned structure to output/d0b00ed3/protein_1.clean.pdb


2025-12-05 19:09:38,222 - servers.structure_server - INFO - Running pdb4amber to convert to Amber conventions


2025-12-05 19:09:39,553 - servers.structure_server - INFO - pdb4amber conversion successful: output/d0b00ed3/protein_1.amber.pdb


2025-12-05 19:09:39,555 - servers.structure_server - INFO - Successfully cleaned protein structure: output/d0b00ed3/protein_1.amber.pdb



 Clean 1CRN

✓ SUCCESS

Details:
  output_file: output/d0b00ed3/protein_1.amber.pdb
  input_file: output/d0b00ed3/protein_1.pdb
  cap_termini_required: False
  statistics: {'initial_chains': 1, 'initial_residues': 46, 'final_residues': 46, 'final_atoms': 618}
  disulfide_bonds: [complex data, 3]
  pdbfixer_output: output/d0b00ed3/protein_1.clean.pdb

Operations:
  ✓ load_structure: success - Loaded 1 chain(s), 46 residue(s)
  ○ missing_residues: none_found - No missing residues found
  ○ nonstandard_residues: none_found - No non-standard residues found
  ✓ remove_heterogens: success - Removed heterogens, water removed
  ○ missing_atoms: none_found - No missing atoms or residues found
  ✓ disulfide_bonds: detected - Found 3 disulfide bond(s), renamed 6 CYS -> CYX for Amber
  ✓ protonation: success - Added hydrogens at pH 7.4
  ✓ write_output: success - Wrote 618 atoms to output/d0b00ed3/protein_1.clean.pdb
  ✓ pdb4amber: success - Converted to Amber conventions: output/d0b00ed3/protein

In [55]:
# Test 4.2: Clean with custom options
print("Test 4.2: Clean with custom options (with termini capping)")

if fetch_result['success'] and split_result['success']:
    protein_pdb = split_result['protein_files'][0]
    
    result = clean_protein(
        protein_pdb,
        cap_termini=True,
        ph=7.0
    )
    show_result(result, "Clean with Custom Options")
else:
    print("Skipped - previous test failed")


2025-12-05 19:09:39,561 - servers.structure_server - INFO - Cleaning protein structure: output/d0b00ed3/protein_1.pdb


Test 4.2: Clean with custom options (with termini capping)


2025-12-05 19:09:39,563 - servers.structure_server - INFO - Loading structure with PDBFixer


2025-12-05 19:09:39,574 - servers.structure_server - INFO - Finding missing residues


2025-12-05 19:09:39,575 - servers.structure_server - INFO - Added ACE/NME caps to missingResidues for chains: ['A']


2025-12-05 19:09:39,576 - servers.structure_server - INFO - Finding non-standard residues


2025-12-05 19:09:39,577 - servers.structure_server - INFO - Removing heterogens (keep_water=False)


2025-12-05 19:09:39,579 - servers.structure_server - INFO - Finding and adding missing atoms


2025-12-05 19:09:39,800 - servers.structure_server - INFO - Added missing atoms/residues: 2 missing residue(s)


2025-12-05 19:09:39,801 - servers.structure_server - INFO - Detecting disulfide bonds


2025-12-05 19:09:39,802 - servers.structure_server - INFO - Detected 3 disulfide bonds, renamed 6 residues to CYX


2025-12-05 19:09:39,803 - servers.structure_server - INFO - Adding hydrogens at pH 7.0


2025-12-05 19:09:39,918 - servers.structure_server - INFO - Writing cleaned structure to output/d0b00ed3/protein_1.clean.pdb


2025-12-05 19:09:39,922 - servers.structure_server - INFO - Running pdb4amber to convert to Amber conventions


2025-12-05 19:09:40,830 - servers.structure_server - INFO - pdb4amber conversion successful: output/d0b00ed3/protein_1.amber.pdb


2025-12-05 19:09:40,832 - servers.structure_server - INFO - Successfully cleaned protein structure: output/d0b00ed3/protein_1.amber.pdb



 Clean with Custom Options

✓ SUCCESS

Details:
  output_file: output/d0b00ed3/protein_1.amber.pdb
  input_file: output/d0b00ed3/protein_1.pdb
  cap_termini_required: True
  statistics: {'initial_chains': 1, 'initial_residues': 46, 'final_residues': 48, 'final_atoms': 627}
  disulfide_bonds: [complex data, 3]
  pdbfixer_output: output/d0b00ed3/protein_1.clean.pdb

Operations:
  ✓ load_structure: success - Loaded 1 chain(s), 46 residue(s)
  ○ terminal_caps: added_to_missing - Added ACE/NME caps as missing residues for 1 chain(s): ['A']
  ○ nonstandard_residues: none_found - No non-standard residues found
  ✓ remove_heterogens: success - Removed heterogens, water removed
  ✓ missing_atoms: added - Added 2 missing residue(s)
  ✓ disulfide_bonds: detected - Found 3 disulfide bond(s), renamed 6 CYS -> CYX for Amber
  ✓ protonation: success - Added hydrogens at pH 7.0
  ✓ write_output: success - Wrote 627 atoms to output/d0b00ed3/protein_1.clean.pdb
  ✓ pdb4amber: success - Converted to Amb

In [56]:
# Test 4.3: Clean non-existent file (error handling)
print("Test 4.3: Clean non-existent file")

result = clean_protein("/nonexistent/protein.pdb")
show_result(result, "Clean Non-existent File")

assert not result['success'], "Should fail for non-existent file"
print("\n✓ File not found error handling works")


2025-12-05 19:09:40,838 - servers.structure_server - INFO - Cleaning protein structure: /nonexistent/protein.pdb


Test 4.3: Clean non-existent file


2025-12-05 19:09:40,840 - servers.structure_server - ERROR - Input file not found: /nonexistent/protein.pdb



 Clean Non-existent File

✗ FAILED

Errors:
  - Input file not found: /nonexistent/protein.pdb

Details:
  output_file: None
  input_file: /nonexistent/protein.pdb
  cap_termini_required: False
  statistics: {}
  disulfide_bonds: []

✓ File not found error handling works


---
## Test 5: clean_ligand

Test ligand cleaning using SMILES template matching.
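
`clean_ligand` is called with a `ligand_id`, the residue name used to look up the SMILES template. A minimal sketch of reading that name directly from a ligand PDB file, assuming the file uses standard fixed-width `HETATM` records with the residue name in columns 18-20. Both helpers are hypothetical, and the sample records are built in code with coordinates omitted:

```python
def ligand_ids_from_pdb(pdb_text: str) -> list[str]:
    """Collect distinct residue names from HETATM records, in file order."""
    seen = []
    for line in pdb_text.splitlines():
        if line.startswith("HETATM") and len(line) >= 20:
            resname = line[17:20].strip()  # PDB columns 18-20
            if resname and resname not in seen:
                seen.append(resname)
    return seen


def hetatm(serial, atom, resname, chain, resseq):
    """Build the fixed-width prefix of a HETATM record (no coordinates)."""
    return f"HETATM{serial:>5} {atom:<4} {resname:>3} {chain}{resseq:>4}"


# Illustrative records only; atom names and numbering are placeholders.
sample = "\n".join([
    hetatm(1, "PA", "AP5", "C", 215),
    hetatm(2, "O1A", "AP5", "C", 215),
    hetatm(3, "O2A", "AP5", "C", 215),
])

print(ligand_ids_from_pdb(sample))
```

Reading the name from the file avoids depending on the position of the ligand chain in a list.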


In [57]:
# Test 5.1: Clean ligand from 1AKE (AP5A inhibitor)
print("Test 5.1: Clean ligand from 1AKE")

# Fetch and split 1AKE to get ligand
fetch_result = asyncio.run(fetch_molecules("1AKE", source="pdb"))
if fetch_result['success']:
    # Split to get ligand chains
    split_result = split_molecules(fetch_result['file_path'])
    
    if split_result['success'] and split_result['ligand_files']:
        ligand_pdb = split_result['ligand_files'][0]
        print(f"Ligand PDB: {ligand_pdb}")
        
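        # Get the ligand ID (residue name) for the first ligand chain by matching
        # on chain_id instead of a hardcoded index into all_chains
        ligand_info = next(c for c in split_result['chain_file_info'] if c['chain_type'] == 'ligand')
        ligand_chain = next(c for c in split_result['all_chains'] if c['chain_id'] == ligand_info['chain_id'])
        ligand_id = ligand_chain['residue_names'][0]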
        print(f"Ligand ID: {ligand_id}")
        
        # Clean ligand using SMILES template matching
        result = clean_ligand(
            ligand_pdb=ligand_pdb,
            ligand_id=ligand_id,  # residue name AP5 (the AP5A inhibitor)
            target_ph=7.4,
            optimize=True
        )
        show_result(result, "Clean 1AKE Ligand (AP5A)")
        
        if result['success']:
            print(f"\nOutput SDF: {result['sdf_file']}")
            print(f"Net charge: {result['net_charge']}")
            print(f"SMILES source: {result['smiles_source']}")
    else:
        print("No ligand files found")
else:
    print("Failed to fetch 1AKE")


2025-12-05 19:09:40,851 - servers.structure_server - INFO - Fetching 1AKE from pdb


Test 5.1: Clean ligand from 1AKE


2025-12-05 19:09:40,890 - servers.structure_server - INFO - Downloaded 1AKE to output/1AKE.cif


2025-12-05 19:09:40,912 - servers.structure_server - INFO - Successfully fetched 1AKE: 3816 atoms, chains: ['A', 'B', 'C', 'D', 'E', 'F']


2025-12-05 19:09:40,915 - servers.structure_server - INFO - Splitting structure: output/1AKE.cif


2025-12-05 19:09:40,916 - servers.structure_server - INFO - Inspecting molecules in: output/1AKE.cif


2025-12-05 19:09:40,917 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:40,938 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:09:40,939 - servers.structure_server - INFO -   Proteins: 2, Ligands: 2, Waters: 2, Ions: 0


2025-12-05 19:09:40,940 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:40,944 - servers.structure_server - INFO - Chains to export: ['A', 'B', 'C', 'D']


2025-12-05 19:09:40,956 - servers.structure_server - INFO - Wrote protein: output/0ca00649/protein_1.pdb


2025-12-05 19:09:40,967 - servers.structure_server - INFO - Wrote protein: output/0ca00649/protein_2.pdb


2025-12-05 19:09:40,969 - servers.structure_server - INFO - Wrote ligand: output/0ca00649/ligand_1.pdb


2025-12-05 19:09:40,970 - servers.structure_server - INFO - Wrote ligand: output/0ca00649/ligand_2.pdb


2025-12-05 19:09:40,971 - servers.structure_server - INFO - Successfully split structure: 2 protein, 2 ligand, 0 ion, 0 water files


2025-12-05 19:09:40,972 - servers.structure_server - INFO - Cleaning ligand: output/0ca00649/ligand_1.pdb (ID: AP5)


Ligand PDB: output/0ca00649/ligand_1.pdb
Ligand ID: AP5


2025-12-05 19:09:41,338 - servers.structure_server - INFO - Fetched SMILES for AP5 from CCD: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CO[...


2025-12-05 19:09:41,715 - servers.structure_server - INFO - Fetched SMILES for AP5 from CCD: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CO[...


2025-12-05 19:09:41,722 - servers.structure_server - INFO - Using SMILES from ccd: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CO[...


2025-12-05 19:09:41,725 - servers.structure_server - INFO - Applying pH 7.4 protonation to SMILES...


2025-12-05 19:09:41,751 - servers.structure_server - INFO - Protonation result: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]... → Nc1ncnc2c1ncn2[C@@H]1O[C@H](CO... (charge: -5)


2025-12-05 19:09:41,753 - servers.structure_server - INFO - Protonated SMILES at pH 7.4: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CO[P@](=O)([O-])O[P@](...


2025-12-05 19:09:41,755 - servers.structure_server - INFO - Calculated net charge: -5


2025-12-05 19:09:41,756 - servers.structure_server - INFO - Read PDB: 57 atoms





2025-12-05 19:09:41,762 - servers.structure_server - INFO - Added hydrogens: 81 total atoms


2025-12-05 19:09:41,763 - servers.structure_server - INFO - Running MMFF94 optimization (max 200 iters)...


2025-12-05 19:09:41,816 - servers.structure_server - INFO - MMFF94 optimization did not converge


2025-12-05 19:09:41,818 - servers.structure_server - INFO - Final net charge: -5 (source: dimorphite)


2025-12-05 19:09:41,819 - servers.structure_server - INFO - Wrote prepared ligand: /Users/yasu/tmp/mcp-md/notebooks/output/0ca00649/ligand_1_prepared.sdf


2025-12-05 19:09:41,820 - servers.structure_server - INFO - Successfully cleaned ligand: /Users/yasu/tmp/mcp-md/notebooks/output/0ca00649/ligand_1_prepared.sdf



 Clean 1AKE Ligand (AP5A)

✓ SUCCESS

  - Template matching with H failed: Template matching failed: No matching found. PDB atoms: 57, Template atoms: 81, trying without H

Details:
  ligand_pdb: output/0ca00649/ligand_1.pdb
  ligand_id: AP5
  sdf_file: /Users/yasu/tmp/mcp-md/notebooks/output/0ca00649/ligand_1_prepared.sdf
  net_charge: -5
  charge_source: dimorphite
  mol_formal_charge: -5
  smiles_used: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CO[P@](=O)([O-])O[P@](=O)([O-])OP(=O)([O-])O[P@@](=O)([O-])O[P@@](=O)([O-])OC[C@H]2O[C@@H](n3cnc4c(N)ncnc43)[C@H](O)[C@@H]2O)[C@@H](O)[C@H]1O
  smiles_original: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CO[P@](=O)(O)O[P@](=O)(O)OP(=O)(O)O[P@@](=O)(O)O[P@@](=O)(O)OC[C@@H]4[C@H]([C@H]([C@@H](O4)n5cnc6c5ncnc6N)O)O)O)O)N
  smiles_source: ccd
  target_ph: 7.4
  num_atoms: 81
  num_heavy_atoms: 57
  optimized: True
  optimization_converged: False
  output_dir: /Users/yasu/tmp/mcp-md/notebooks/output/0ca00649

Output SDF: /Users/yasu/tmp/mcp-md/notebooks/out

In [58]:
# Test 5.2: Clean SAH ligand from Boltz-2 prediction
print("Test 5.2: Clean SAH ligand from Boltz-2 prediction")

boltz_cif = "boltz_results_ligand/predictions/ligand/ligand_model_0.cif"
split_result = split_molecules(boltz_cif)

if split_result['success'] and split_result['ligand_files']:
    # Find SAH ligand chain
    sah_file = None
    sah_chain = None
    for info in split_result['chain_file_info']:
        if info['chain_type'] == 'ligand':
            # Get residue name from all_chains
            for chain in split_result['all_chains']:
                if chain['chain_id'] == info['chain_id']:
                    if 'SAH' in chain['residue_names']:
                        sah_file = info['file']
                        sah_chain = chain
                        break
        if sah_file:
            break
    
    if sah_file:
        print(f"SAH ligand PDB: {sah_file}")
        
        result = clean_ligand(
            ligand_pdb=sah_file,
            ligand_id="SAH",
            target_ph=7.4,
            optimize=True
        )
        show_result(result, "Clean Boltz-2 SAH Ligand")
        
        if result['success']:
            print(f"\nOutput SDF: {result['sdf_file']}")
            print(f"Net charge: {result['net_charge']}")
    else:
        print("SAH ligand not found in Boltz-2 structure")
else:
    print("Failed to split Boltz-2 structure")


2025-12-05 19:09:41,827 - servers.structure_server - INFO - Splitting structure: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


Test 5.2: Clean SAH ligand from Boltz-2 prediction


2025-12-05 19:09:41,828 - servers.structure_server - INFO - Inspecting molecules in: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


2025-12-05 19:09:41,829 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:41,849 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:09:41,850 - servers.structure_server - INFO -   Proteins: 2, Ligands: 4, Waters: 0, Ions: 0


2025-12-05 19:09:41,852 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:41,856 - servers.structure_server - INFO - Chains to export: ['A', 'B', 'C', 'D', 'E', 'F']


2025-12-05 19:09:41,875 - servers.structure_server - INFO - Wrote protein: output/3463ffd6/protein_1.pdb


2025-12-05 19:09:41,895 - servers.structure_server - INFO - Wrote protein: output/3463ffd6/protein_2.pdb


2025-12-05 19:09:41,897 - servers.structure_server - INFO - Wrote ligand: output/3463ffd6/ligand_1.pdb


2025-12-05 19:09:41,898 - servers.structure_server - INFO - Wrote ligand: output/3463ffd6/ligand_2.pdb


2025-12-05 19:09:41,900 - servers.structure_server - INFO - Wrote ligand: output/3463ffd6/ligand_3.pdb


2025-12-05 19:09:41,901 - servers.structure_server - INFO - Wrote ligand: output/3463ffd6/ligand_4.pdb


2025-12-05 19:09:41,902 - servers.structure_server - INFO - Successfully split structure: 2 protein, 4 ligand, 0 ion, 0 water files


2025-12-05 19:09:41,903 - servers.structure_server - INFO - Cleaning ligand: output/3463ffd6/ligand_1.pdb (ID: SAH)


SAH ligand PDB: output/3463ffd6/ligand_1.pdb


2025-12-05 19:09:42,257 - servers.structure_server - INFO - Fetched SMILES for SAH from CCD: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:09:42,616 - servers.structure_server - INFO - Fetched SMILES for SAH from CCD: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:09:42,619 - servers.structure_server - INFO - Using SMILES from ccd: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:09:42,621 - servers.structure_server - INFO - Applying pH 7.4 protonation to SMILES...


2025-12-05 19:09:42,632 - servers.structure_server - INFO - Protonation result: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]... → Nc1ncnc2c1ncn2[C@@H]1O[C@H](CS... (charge: -1)


2025-12-05 19:09:42,633 - servers.structure_server - INFO - Protonated SMILES at pH 7.4: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)[O-])...


2025-12-05 19:09:42,635 - servers.structure_server - INFO - Calculated net charge: -1


2025-12-05 19:09:42,637 - servers.structure_server - INFO - Read PDB: 26 atoms





2025-12-05 19:09:42,641 - servers.structure_server - INFO - Added hydrogens: 45 total atoms


2025-12-05 19:09:42,642 - servers.structure_server - INFO - Running MMFF94 optimization (max 200 iters)...


2025-12-05 19:09:42,666 - servers.structure_server - INFO - MMFF94 optimization did not converge


2025-12-05 19:09:42,668 - servers.structure_server - INFO - Final net charge: -1 (source: dimorphite)


2025-12-05 19:09:42,671 - servers.structure_server - INFO - Wrote prepared ligand: /Users/yasu/tmp/mcp-md/notebooks/output/3463ffd6/ligand_1_prepared.sdf


2025-12-05 19:09:42,674 - servers.structure_server - INFO - Successfully cleaned ligand: /Users/yasu/tmp/mcp-md/notebooks/output/3463ffd6/ligand_1_prepared.sdf



 Clean Boltz-2 SAH Ligand

✓ SUCCESS

  - Template matching with H failed: Template matching failed: No matching found. PDB atoms: 26, Template atoms: 45, trying without H

Details:
  ligand_pdb: output/3463ffd6/ligand_1.pdb
  ligand_id: SAH
  sdf_file: /Users/yasu/tmp/mcp-md/notebooks/output/3463ffd6/ligand_1_prepared.sdf
  net_charge: -1
  charge_source: dimorphite
  mol_formal_charge: -1
  smiles_used: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)[O-])[C@@H](O)[C@H]1O
  smiles_original: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSCC[C@@H](C(=O)O)N)O)O)N
  smiles_source: ccd
  target_ph: 7.4
  num_atoms: 45
  num_heavy_atoms: 26
  optimized: True
  optimization_converged: False
  output_dir: /Users/yasu/tmp/mcp-md/notebooks/output/3463ffd6

Output SDF: /Users/yasu/tmp/mcp-md/notebooks/output/3463ffd6/ligand_1_prepared.sdf
Net charge: -1


In [59]:
# Test 5.3: Clean ligand with user-provided SMILES
print("Test 5.3: Clean ligand with user-provided SMILES")

# Use Boltz-2 SAH ligand with explicit SMILES
boltz_cif = "boltz_results_ligand/predictions/ligand/ligand_model_0.cif"
split_result = split_molecules(boltz_cif)

if split_result['success'] and split_result['ligand_files']:
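    # Get the first ligand file and look up its residue name by chain ID
    ligand_file = split_result['ligand_files'][0]
    ligand_info = [c for c in split_result['chain_file_info'] if c['chain_type'] == 'ligand'][0]
    ligand_chain = next(c for c in split_result['all_chains'] if c['chain_id'] == ligand_info['chain_id'])
    ligand_name = ligand_chain['residue_names'][0]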
    
    print(f"Ligand: {ligand_name}")
    print(f"File: {ligand_file}")
    
    # SAH SMILES from PDB CCD
    sah_smiles = "Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)O)[C@@H](O)[C@H]1O"
    
    result = clean_ligand(
        ligand_pdb=ligand_file,
        ligand_id=ligand_name,
        smiles=sah_smiles,  # User-provided SMILES
        target_ph=7.4,
        optimize=False  # Skip optimization for speed
    )
    show_result(result, "Clean Ligand with User SMILES")
    
    if result['success']:
        print(f"\nSMILES source: {result['smiles_source']}")  # Should be 'user'
else:
    print("Failed to get ligand from Boltz-2")


2025-12-05 19:09:42,682 - servers.structure_server - INFO - Splitting structure: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


Test 5.3: Clean ligand with user-provided SMILES


2025-12-05 19:09:42,685 - servers.structure_server - INFO - Inspecting molecules in: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


2025-12-05 19:09:42,687 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:42,712 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:09:42,713 - servers.structure_server - INFO -   Proteins: 2, Ligands: 4, Waters: 0, Ions: 0


2025-12-05 19:09:42,715 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:42,719 - servers.structure_server - INFO - Chains to export: ['A', 'B', 'C', 'D', 'E', 'F']


2025-12-05 19:09:42,742 - servers.structure_server - INFO - Wrote protein: output/b09df635/protein_1.pdb


2025-12-05 19:09:42,762 - servers.structure_server - INFO - Wrote protein: output/b09df635/protein_2.pdb


2025-12-05 19:09:42,763 - servers.structure_server - INFO - Wrote ligand: output/b09df635/ligand_1.pdb


2025-12-05 19:09:42,764 - servers.structure_server - INFO - Wrote ligand: output/b09df635/ligand_2.pdb


2025-12-05 19:09:42,766 - servers.structure_server - INFO - Wrote ligand: output/b09df635/ligand_3.pdb


2025-12-05 19:09:42,767 - servers.structure_server - INFO - Wrote ligand: output/b09df635/ligand_4.pdb


2025-12-05 19:09:42,768 - servers.structure_server - INFO - Successfully split structure: 2 protein, 4 ligand, 0 ion, 0 water files


2025-12-05 19:09:42,769 - servers.structure_server - INFO - Cleaning ligand: output/b09df635/ligand_1.pdb (ID: SAH)


Ligand: SAH
File: output/b09df635/ligand_1.pdb


2025-12-05 19:09:42,771 - servers.structure_server - INFO - Using user-provided SMILES for SAH


2025-12-05 19:09:42,771 - servers.structure_server - INFO - Using SMILES from user: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)O)[C@...


2025-12-05 19:09:42,773 - servers.structure_server - INFO - Applying pH 7.4 protonation to SMILES...


2025-12-05 19:09:42,779 - servers.structure_server - INFO - Protonation result: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CS... → Nc1ncnc2c1ncn2[C@@H]1O[C@H](CS... (charge: -1)


2025-12-05 19:09:42,780 - servers.structure_server - INFO - Protonated SMILES at pH 7.4: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)[O-])...


2025-12-05 19:09:42,781 - servers.structure_server - INFO - Calculated net charge: -1


2025-12-05 19:09:42,782 - servers.structure_server - INFO - Read PDB: 26 atoms





2025-12-05 19:09:42,785 - servers.structure_server - INFO - Added hydrogens: 45 total atoms


2025-12-05 19:09:42,786 - servers.structure_server - INFO - Final net charge: -1 (source: dimorphite)


2025-12-05 19:09:42,787 - servers.structure_server - INFO - Wrote prepared ligand: /Users/yasu/tmp/mcp-md/notebooks/output/b09df635/ligand_1_prepared.sdf


2025-12-05 19:09:42,788 - servers.structure_server - INFO - Successfully cleaned ligand: /Users/yasu/tmp/mcp-md/notebooks/output/b09df635/ligand_1_prepared.sdf



 Clean Ligand with User SMILES

✓ SUCCESS

  - Template matching with H failed: Template matching failed: No matching found. PDB atoms: 26, Template atoms: 45, trying without H

Details:
  ligand_pdb: output/b09df635/ligand_1.pdb
  ligand_id: SAH
  sdf_file: /Users/yasu/tmp/mcp-md/notebooks/output/b09df635/ligand_1_prepared.sdf
  net_charge: -1
  charge_source: dimorphite
  mol_formal_charge: -1
  smiles_used: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)[O-])[C@@H](O)[C@H]1O
  smiles_original: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)O)[C@@H](O)[C@H]1O
  smiles_source: user
  target_ph: 7.4
  num_atoms: 45
  num_heavy_atoms: 26
  optimized: False
  optimization_converged: False
  output_dir: /Users/yasu/tmp/mcp-md/notebooks/output/b09df635

SMILES source: user
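
The `net_charge` values reported in the Test 5 outputs can be cross-checked by hand: for both ligands above, summing the formal charges written in the `smiles_used` string gives the same number. A minimal standard-library sketch of that check follows. `net_formal_charge` is a hypothetical helper that only reads bracket-atom charges; it is not a full SMILES parser and is not how the server computes the charge.

```python
import re

# Bracket atoms such as [O-], [NH3+], [Fe+2], [C@@H]
_BRACKET = re.compile(r"\[([^\]]+)\]")
# Charge token at the end of a bracket atom, optionally followed by an atom class (:n)
_CHARGE = re.compile(r"(\+\d+|-\d+|\++|-+)(?::\d+)?$")


def net_formal_charge(smiles):
    """Sum the formal charges declared on bracket atoms in a SMILES string."""
    total = 0
    for atom in _BRACKET.findall(smiles):
        match = _CHARGE.search(atom)
        if match is None:
            continue  # e.g. [C@@H] carries no charge
        token = match.group(1)
        sign = 1 if token[0] == "+" else -1
        # "+2" style gives the magnitude as a digit; "++" style repeats the sign
        magnitude = int(token[1:]) if token[1:].isdigit() else len(token)
        total += sign * magnitude
    return total


# smiles_used for SAH, copied from the Test 5.2 output above
sah = "Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)[O-])[C@@H](O)[C@H]1O"
print(net_formal_charge(sah))
```

A mismatch between this sum and `net_charge` would point at the protonation step rather than at antechamber.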


---
## Test 6: run_antechamber_robust

Test GAFF2 parameterization with AM1-BCC charges.
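
AM1-BCC assigns a partial charge to every atom, and those charges should add up to the integer net charge passed to antechamber. The sketch below checks that on a Tripos MOL2 file, assuming the standard `@<TRIPOS>ATOM` layout with the charge in the ninth column. `mol2_total_charge` is a hypothetical helper, and the three-atom record is invented for illustration rather than taken from a real antechamber run.

```python
def mol2_total_charge(mol2_text):
    """Sum the partial charges in the @<TRIPOS>ATOM section of a MOL2 file."""
    total = 0.0
    in_atoms = False
    for line in mol2_text.splitlines():
        stripped = line.strip()
        if stripped.startswith("@<TRIPOS>"):
            # Only the ATOM section carries per-atom charges
            in_atoms = stripped == "@<TRIPOS>ATOM"
            continue
        if in_atoms and stripped:
            fields = stripped.split()
            # Columns: id, name, x, y, z, type, subst_id, subst_name, charge
            if len(fields) >= 9:
                total += float(fields[8])
    return total


# Invented example record (not antechamber output)
demo_mol2 = """@<TRIPOS>MOLECULE
DEMO
@<TRIPOS>ATOM
      1 O1     0.000  0.000  0.000 oh  1 DEM  -0.600
      2 H1     0.960  0.000  0.000 ho  1 DEM   0.300
      3 H2    -0.240  0.930  0.000 ho  1 DEM   0.300
@<TRIPOS>BOND
     1     1     2 1
     2     1     3 1
"""

total = mol2_total_charge(demo_mol2)
print(f"total charge: {total:.4f}")
```

Comparing with a small tolerance (for example `abs(total - expected) < 1e-3`) avoids false alarms from floating-point rounding.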


In [60]:
# Test 6.1: Run antechamber on cleaned SAH ligand
print("Test 6.1: Run antechamber on SAH ligand (GAFF2 + AM1-BCC)")

# First clean the SAH ligand
boltz_cif = "boltz_results_ligand/predictions/ligand/ligand_model_0.cif"
split_result = split_molecules(boltz_cif)

if split_result['success'] and split_result['ligand_files']:
    # Find SAH ligand
    sah_file = None
    for info in split_result['chain_file_info']:
        if info['chain_type'] == 'ligand':
            for chain in split_result['all_chains']:
                if chain['chain_id'] == info['chain_id'] and 'SAH' in chain['residue_names']:
                    sah_file = info['file']
                    break
        if sah_file:
            break
    
    if sah_file:
        # Clean ligand first
        clean_result = clean_ligand(
            ligand_pdb=sah_file,
            ligand_id="SAH",
            target_ph=7.4
        )
        
        if clean_result['success']:
            print(f"Cleaned SDF: {clean_result['sdf_file']}")
            print(f"Net charge: {clean_result['net_charge']}")
            
            # Run antechamber
            result = run_antechamber_robust(
                ligand_file=clean_result['sdf_file'],
                net_charge=clean_result['net_charge'],
                residue_name="SAH"
            )
            show_result(result, "Antechamber SAH")
            
            if result['success']:
                print("\nGenerated files:")
                print(f"  MOL2: {result['mol2']}")
                print(f"  FRCMOD: {result['frcmod']}")
                print(f"  Total charge: {result['total_charge']:.4f}")
                
                # Check frcmod validation
                if result['frcmod_validation']:
                    if result['frcmod_validation']['valid']:
                        print("  frcmod: ✓ Valid")
                    else:
                        print(f"  frcmod: ✗ {result['frcmod_validation']['attn_count']} parameters need attention")
        else:
            print(f"Clean failed: {clean_result['errors']}")
    else:
        print("SAH not found")
else:
    print("Failed to split structure")


2025-12-05 19:09:42,794 - servers.structure_server - INFO - Splitting structure: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


Test 6.1: Run antechamber on SAH ligand (GAFF2 + AM1-BCC)


2025-12-05 19:09:42,795 - servers.structure_server - INFO - Inspecting molecules in: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


2025-12-05 19:09:42,796 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:42,815 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:09:42,816 - servers.structure_server - INFO -   Proteins: 2, Ligands: 4, Waters: 0, Ions: 0


2025-12-05 19:09:42,818 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:09:42,821 - servers.structure_server - INFO - Chains to export: ['A', 'B', 'C', 'D', 'E', 'F']


2025-12-05 19:09:42,840 - servers.structure_server - INFO - Wrote protein: output/2fdaadfa/protein_1.pdb


2025-12-05 19:09:42,859 - servers.structure_server - INFO - Wrote protein: output/2fdaadfa/protein_2.pdb


2025-12-05 19:09:42,861 - servers.structure_server - INFO - Wrote ligand: output/2fdaadfa/ligand_1.pdb


2025-12-05 19:09:42,862 - servers.structure_server - INFO - Wrote ligand: output/2fdaadfa/ligand_2.pdb


2025-12-05 19:09:42,864 - servers.structure_server - INFO - Wrote ligand: output/2fdaadfa/ligand_3.pdb


2025-12-05 19:09:42,865 - servers.structure_server - INFO - Wrote ligand: output/2fdaadfa/ligand_4.pdb


2025-12-05 19:09:42,867 - servers.structure_server - INFO - Successfully split structure: 2 protein, 4 ligand, 0 ion, 0 water files


2025-12-05 19:09:42,868 - servers.structure_server - INFO - Cleaning ligand: output/2fdaadfa/ligand_1.pdb (ID: SAH)


2025-12-05 19:09:43,218 - servers.structure_server - INFO - Fetched SMILES for SAH from CCD: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:09:43,572 - servers.structure_server - INFO - Fetched SMILES for SAH from CCD: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:09:43,575 - servers.structure_server - INFO - Using SMILES from ccd: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:09:43,577 - servers.structure_server - INFO - Applying pH 7.4 protonation to SMILES...


2025-12-05 19:09:43,587 - servers.structure_server - INFO - Protonation result: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]... → Nc1ncnc2c1ncn2[C@@H]1O[C@H](CS... (charge: -1)


2025-12-05 19:09:43,588 - servers.structure_server - INFO - Protonated SMILES at pH 7.4: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)[O-])...


2025-12-05 19:09:43,590 - servers.structure_server - INFO - Calculated net charge: -1


2025-12-05 19:09:43,591 - servers.structure_server - INFO - Read PDB: 26 atoms





2025-12-05 19:09:43,595 - servers.structure_server - INFO - Added hydrogens: 45 total atoms


2025-12-05 19:09:43,596 - servers.structure_server - INFO - Running MMFF94 optimization (max 200 iters)...


2025-12-05 19:09:43,614 - servers.structure_server - INFO - MMFF94 optimization did not converge


2025-12-05 19:09:43,616 - servers.structure_server - INFO - Final net charge: -1 (source: dimorphite)


2025-12-05 19:09:43,617 - servers.structure_server - INFO - Wrote prepared ligand: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.sdf


2025-12-05 19:09:43,619 - servers.structure_server - INFO - Successfully cleaned ligand: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.sdf


2025-12-05 19:09:43,620 - servers.structure_server - INFO - Running robust antechamber: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.sdf


Cleaned SDF: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.sdf
Net charge: -1


2025-12-05 19:09:43,621 - servers.structure_server - INFO - Attempt 1: trying charge = -1


2025-12-05 19:10:17,135 - servers.structure_server - INFO - Antechamber succeeded with charge = -1


2025-12-05 19:10:17,137 - servers.structure_server - INFO - Running parmchk2...


2025-12-05 19:10:17,860 - servers.structure_server - INFO - parmchk2 completed: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.frcmod


2025-12-05 19:10:17,864 - servers.structure_server - INFO - Successfully parameterized ligand: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.gaff.mol2



 Antechamber SAH

✓ SUCCESS

Details:
  mol2: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.gaff.mol2
  frcmod: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.frcmod
  charge_used: -1
  charge_method: bcc
  atom_type: gaff2
  residue_name: SAH
  charges: [complex data, 45]
  total_charge: -1.0000000000000016
  frcmod_validation: [complex data, dict]
  sqm_diagnostics: None
  charge_estimation: None
  diagnostics_dir: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/diagnostics

Generated files:
  MOL2: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.gaff.mol2
  FRCMOD: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.frcmod
  Total charge: -1.0000
  frcmod: ✓ Valid
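
The `attn_count` field read in Test 6.1 reports how many parameters need attention. parmchk2 marks parameters it could not estimate reliably with the comment `ATTN, need revision`, so a rough external check is to count those lines in the frcmod file. The sketch below assumes that marker is what the server's validation looks for, which this notebook does not confirm; `count_attn` is a hypothetical helper and the sample text is invented.

```python
def count_attn(frcmod_text):
    """Count frcmod lines flagged with 'ATTN' (parameters needing revision)."""
    return sum(1 for line in frcmod_text.splitlines() if "ATTN" in line)


# Invented frcmod-like text with one flagged angle parameter
sample = """Remark line goes here
MASS

BOND

ANGLE
c3-ss-c3   0.000     0.000   ATTN, need revision

DIHE

IMPROPER

NONBON
"""

n = count_attn(sample)
print("valid" if n == 0 else f"{n} parameter(s) need attention")
```

A non-zero count means the flagged entries hold placeholder values and should be reviewed before running a simulation.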


In [61]:
# Test 6.2: Run antechamber with auto charge estimation
print("Test 6.2: Run antechamber with auto charge estimation")

# Reuse the cleaned SDF from Test 6.1, but let run_antechamber_robust estimate the charge
if 'clean_result' in globals() and clean_result['success']:
    result = run_antechamber_robust(
        ligand_file=clean_result['sdf_file'],
        net_charge=None,  # Auto-estimate
        residue_name="SAH",
        charge_method="bcc",
        atom_type="gaff2"
    )
    show_result(result, "Antechamber (Auto Charge)")
    
    if result['success']:
        print("\nCharge estimation:")
        if result['charge_estimation']:
            print(f"  Estimated: {result['charge_estimation'].get('estimated_charge_at_ph')}")
            print(f"  Confidence: {result['charge_estimation'].get('confidence')}")
        print(f"  Charge used: {result['charge_used']}")
else:
    print("Skipped - clean_result not available")


2025-12-05 19:10:17,869 - servers.structure_server - INFO - Running robust antechamber: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.sdf


Test 6.2: Run antechamber with auto charge estimation


2025-12-05 19:10:17,871 - servers.structure_server - INFO - Auto-estimating net charge...


2025-12-05 19:10:17,872 - servers.structure_server - INFO - Estimating net charge for: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.sdf


2025-12-05 19:10:17,875 - servers.structure_server - INFO - Estimated charge: -1 (formal: -1, confidence: high)


2025-12-05 19:10:17,876 - servers.structure_server - INFO - Estimated charge: -1 (confidence: high)


2025-12-05 19:10:17,877 - servers.structure_server - INFO - Attempt 1: trying charge = -1


2025-12-05 19:10:51,483 - servers.structure_server - INFO - Antechamber succeeded with charge = -1


2025-12-05 19:10:51,485 - servers.structure_server - INFO - Running parmchk2...


2025-12-05 19:10:52,306 - servers.structure_server - INFO - parmchk2 completed: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.frcmod


2025-12-05 19:10:52,310 - servers.structure_server - INFO - Successfully parameterized ligand: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.gaff.mol2



 Antechamber (Auto Charge)

✓ SUCCESS

Details:
  mol2: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.gaff.mol2
  frcmod: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/ligand_1_prepared.frcmod
  charge_used: -1
  charge_method: bcc
  atom_type: gaff2
  residue_name: SAH
  charges: [complex data, 45]
  total_charge: -1.0000000000000016
  frcmod_validation: [complex data, dict]
  sqm_diagnostics: None
  charge_estimation: [complex data, dict]
  diagnostics_dir: /Users/yasu/tmp/mcp-md/notebooks/output/2fdaadfa/diagnostics

Charge estimation:
  Estimated: -1
  Confidence: high
  Charge used: -1


---
## Test 7: Integration Test

Test the complete workflow through `prepare_complex`: inspect -> split -> clean_protein + clean_ligand -> antechamber
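
Because the workflow processes several chains and ligands, it helps to collect per-item failures in one place. The sketch below assumes the result dictionary has the shape read by the cell that follows (`proteins` and `ligands` lists whose entries carry `success`, `chain_id`, and `ligand_id`). `failed_items` is a hypothetical helper, and the sample dictionary is invented rather than real `prepare_complex` output.

```python
def failed_items(result):
    """List readable labels for proteins and ligands that did not succeed."""
    failures = []
    for protein in result.get("proteins", []):
        if not protein.get("success"):
            failures.append(f"protein chain {protein.get('chain_id', '?')}")
    for ligand in result.get("ligands", []):
        if not ligand.get("success"):
            ligand_id = ligand.get("ligand_id", "?")
            chain_id = ligand.get("chain_id", "?")
            failures.append(f"ligand {ligand_id} (chain {chain_id})")
    return failures


# Invented result with one failing ligand
sample_result = {
    "success": True,
    "proteins": [
        {"chain_id": "A", "success": True},
        {"chain_id": "B", "success": True},
    ],
    "ligands": [
        {"ligand_id": "SAH", "chain_id": "C", "success": True},
        {"ligand_id": "XYZ", "chain_id": "D", "success": False},
    ],
}

for label in failed_items(sample_result):
    print(f"FAILED: {label}")
```

An empty list means every processed protein and ligand reported success.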


In [62]:
# Test 7.1: Complete workflow using prepare_complex
print("Test 7.1: Complete Boltz-2 workflow using prepare_complex")
print("="*60)

boltz_cif = "boltz_results_ligand/predictions/ligand/ligand_model_0.cif"

# Run complete workflow with a single function call
result = prepare_complex(
    structure_file=boltz_cif,
    ph=7.4,
    cap_termini=False,
    process_proteins=True,
    process_ligands=True,
    run_parameterization=True,
    optimize_ligands=True,
    # Optional: provide SMILES for specific ligands
    # ligand_smiles={"SAH": "Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)O)[C@@H](O)[C@H]1O"}
)

show_result(result, "prepare_complex Result")

if result['success']:
    print("\n--- Summary ---")
    print(f"Output directory: {result['output_dir']}")
    
    # Show inspection summary
    if result['inspection']:
        summary = result['inspection'].get('summary', {})
        print(f"\nStructure: {summary.get('num_protein_chains', 0)} proteins, "
              f"{summary.get('num_ligand_chains', 0)} ligands")
    
    # Show processed proteins
    print(f"\nProteins processed: {len(result['proteins'])}")
    for p in result['proteins']:
        status = "✓" if p['success'] else "✗"
        print(f"  {status} Chain {p['chain_id']}: {p.get('output_file', 'N/A')}")
        if p['success'] and p.get('statistics'):
            print(f"      Atoms: {p['statistics'].get('final_atoms', 'N/A')}")
    
    # Show processed ligands
    print(f"\nLigands processed: {len(result['ligands'])}")
    for lig in result['ligands']:
        status = "✓" if lig['success'] else "✗"
        print(f"  {status} {lig['ligand_id']} (Chain {lig['chain_id']})")
        if lig['success']:
            print(f"      SDF: {lig.get('sdf_file', 'N/A')}")
            print(f"      MOL2: {lig.get('mol2_file', 'N/A')}")
            print(f"      Charge: {lig.get('net_charge', 'N/A')}")

print("\n" + "="*60)
print("Workflow complete!")


2025-12-05 19:10:52,316 - servers.structure_server - INFO - Preparing complex: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


Test 7.1: Complete Boltz-2 workflow using prepare_complex


2025-12-05 19:10:52,318 - servers.structure_server - INFO - Step 1: Inspecting structure...


2025-12-05 19:10:52,319 - servers.structure_server - INFO - Inspecting molecules in: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


2025-12-05 19:10:52,320 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:10:52,341 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:10:52,343 - servers.structure_server - INFO -   Proteins: 2, Ligands: 4, Waters: 0, Ions: 0


2025-12-05 19:10:52,344 - servers.structure_server - INFO - Found: 2 proteins, 4 ligands, 0 ions


2025-12-05 19:10:52,345 - servers.structure_server - INFO - Step 2: Splitting structure...


2025-12-05 19:10:52,347 - servers.structure_server - INFO - Splitting structure: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


2025-12-05 19:10:52,348 - servers.structure_server - INFO - Inspecting molecules in: boltz_results_ligand/predictions/ligand/ligand_model_0.cif


2025-12-05 19:10:52,349 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:10:52,368 - servers.structure_server - INFO - Successfully inspected structure: 6 chains found


2025-12-05 19:10:52,370 - servers.structure_server - INFO -   Proteins: 2, Ligands: 4, Waters: 0, Ions: 0


2025-12-05 19:10:52,371 - servers.structure_server - INFO - Reading structure with gemmi (.cif)...


2025-12-05 19:10:52,375 - servers.structure_server - INFO - Chains to export: ['A', 'B', 'C', 'D', 'E', 'F']


2025-12-05 19:10:52,395 - servers.structure_server - INFO - Wrote protein: output/8399c47c/protein_1.pdb


2025-12-05 19:10:52,416 - servers.structure_server - INFO - Wrote protein: output/8399c47c/protein_2.pdb


2025-12-05 19:10:52,417 - servers.structure_server - INFO - Wrote ligand: output/8399c47c/ligand_1.pdb


2025-12-05 19:10:52,419 - servers.structure_server - INFO - Wrote ligand: output/8399c47c/ligand_2.pdb


2025-12-05 19:10:52,420 - servers.structure_server - INFO - Wrote ligand: output/8399c47c/ligand_3.pdb


2025-12-05 19:10:52,421 - servers.structure_server - INFO - Wrote ligand: output/8399c47c/ligand_4.pdb


2025-12-05 19:10:52,422 - servers.structure_server - INFO - Successfully split structure: 2 protein, 4 ligand, 0 ion, 0 water files


2025-12-05 19:10:52,423 - servers.structure_server - INFO - Step 3: Processing 2 protein(s)...


2025-12-05 19:10:52,424 - servers.structure_server - INFO - Cleaning protein structure: output/8399c47c/protein_1.pdb


2025-12-05 19:10:52,425 - servers.structure_server - INFO - Loading structure with PDBFixer


2025-12-05 19:10:52,521 - servers.structure_server - INFO - Finding missing residues


2025-12-05 19:10:52,521 - servers.structure_server - INFO - Finding non-standard residues


2025-12-05 19:10:52,522 - servers.structure_server - INFO - Removing heterogens (keep_water=False)


2025-12-05 19:10:52,534 - servers.structure_server - INFO - Finding and adding missing atoms


2025-12-05 19:10:52,708 - servers.structure_server - INFO - Added missing atoms/residues: 1 terminal atom(s)


2025-12-05 19:10:52,709 - servers.structure_server - INFO - Detecting disulfide bonds


2025-12-05 19:10:52,711 - servers.structure_server - INFO - No disulfide bonds detected


2025-12-05 19:10:52,712 - servers.structure_server - INFO - Adding hydrogens at pH 7.4


2025-12-05 19:10:53,024 - servers.structure_server - INFO - Writing cleaned structure to output/8399c47c/protein_1.clean.pdb


2025-12-05 19:10:53,045 - servers.structure_server - INFO - Running pdb4amber to convert to Amber conventions


2025-12-05 19:10:54,313 - servers.structure_server - INFO - pdb4amber conversion successful: output/8399c47c/protein_1.amber.pdb


2025-12-05 19:10:54,315 - servers.structure_server - INFO - Successfully cleaned protein structure: output/8399c47c/protein_1.amber.pdb


2025-12-05 19:10:54,317 - servers.structure_server - INFO -   ✓ Protein A: output/8399c47c/protein_1.amber.pdb


2025-12-05 19:10:54,318 - servers.structure_server - INFO - Cleaning protein structure: output/8399c47c/protein_2.pdb


2025-12-05 19:10:54,319 - servers.structure_server - INFO - Loading structure with PDBFixer


2025-12-05 19:10:54,425 - servers.structure_server - INFO - Finding missing residues


2025-12-05 19:10:54,426 - servers.structure_server - INFO - Finding non-standard residues


2025-12-05 19:10:54,427 - servers.structure_server - INFO - Removing heterogens (keep_water=False)


2025-12-05 19:10:54,438 - servers.structure_server - INFO - Finding and adding missing atoms


2025-12-05 19:10:54,571 - servers.structure_server - INFO - Added missing atoms/residues: 1 terminal atom(s)


2025-12-05 19:10:54,573 - servers.structure_server - INFO - Detecting disulfide bonds


2025-12-05 19:10:54,574 - servers.structure_server - INFO - No disulfide bonds detected


2025-12-05 19:10:54,575 - servers.structure_server - INFO - Adding hydrogens at pH 7.4


2025-12-05 19:10:54,944 - servers.structure_server - INFO - Writing cleaned structure to output/8399c47c/protein_2.clean.pdb


2025-12-05 19:10:54,965 - servers.structure_server - INFO - Running pdb4amber to convert to Amber conventions


2025-12-05 19:10:56,138 - servers.structure_server - INFO - pdb4amber conversion successful: output/8399c47c/protein_2.amber.pdb


2025-12-05 19:10:56,140 - servers.structure_server - INFO - Successfully cleaned protein structure: output/8399c47c/protein_2.amber.pdb


2025-12-05 19:10:56,142 - servers.structure_server - INFO -   ✓ Protein B: output/8399c47c/protein_2.amber.pdb


2025-12-05 19:10:56,143 - servers.structure_server - INFO - Step 4: Processing 4 ligand(s)...


2025-12-05 19:10:56,144 - servers.structure_server - INFO - Cleaning ligand: output/8399c47c/ligand_1.pdb (ID: SAH)


2025-12-05 19:10:56,500 - servers.structure_server - INFO - Fetched SMILES for SAH from CCD: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:10:56,866 - servers.structure_server - INFO - Fetched SMILES for SAH from CCD: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:10:56,871 - servers.structure_server - INFO - Using SMILES from ccd: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:10:56,874 - servers.structure_server - INFO - Applying pH 7.4 protonation to SMILES...


2025-12-05 19:10:56,887 - servers.structure_server - INFO - Protonation result: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]... → Nc1ncnc2c1ncn2[C@@H]1O[C@H](CS... (charge: -1)


2025-12-05 19:10:56,889 - servers.structure_server - INFO - Protonated SMILES at pH 7.4: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)[O-])...


2025-12-05 19:10:56,891 - servers.structure_server - INFO - Calculated net charge: -1


2025-12-05 19:10:56,892 - servers.structure_server - INFO - Read PDB: 26 atoms





2025-12-05 19:10:56,897 - servers.structure_server - INFO - Added hydrogens: 45 total atoms


2025-12-05 19:10:56,898 - servers.structure_server - INFO - Running MMFF94 optimization (max 200 iters)...


2025-12-05 19:10:56,919 - servers.structure_server - INFO - MMFF94 optimization did not converge


2025-12-05 19:10:56,920 - servers.structure_server - INFO - Final net charge: -1 (source: dimorphite)


2025-12-05 19:10:56,922 - servers.structure_server - INFO - Wrote prepared ligand: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_1_prepared.sdf


2025-12-05 19:10:56,923 - servers.structure_server - INFO - Successfully cleaned ligand: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_1_prepared.sdf


2025-12-05 19:10:56,924 - servers.structure_server - INFO -   ✓ Ligand SAH (C): cleaned, charge=-1


2025-12-05 19:10:56,925 - servers.structure_server - INFO - Running robust antechamber: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_1_prepared.sdf


2025-12-05 19:10:56,927 - servers.structure_server - INFO - Attempt 1: trying charge = -1


2025-12-05 19:11:30,963 - servers.structure_server - INFO - Antechamber succeeded with charge = -1


2025-12-05 19:11:30,966 - servers.structure_server - INFO - Running parmchk2...


2025-12-05 19:11:31,811 - servers.structure_server - INFO - parmchk2 completed: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_1_prepared.frcmod


2025-12-05 19:11:31,815 - servers.structure_server - INFO - Successfully parameterized ligand: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_1_prepared.gaff.mol2


2025-12-05 19:11:31,816 - servers.structure_server - INFO -     ✓ Parameterized: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_1_prepared.gaff.mol2


2025-12-05 19:11:31,817 - servers.structure_server - INFO - Cleaning ligand: output/8399c47c/ligand_2.pdb (ID: SAH)


2025-12-05 19:11:32,171 - servers.structure_server - INFO - Fetched SMILES for SAH from CCD: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:11:32,531 - servers.structure_server - INFO - Fetched SMILES for SAH from CCD: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:11:32,535 - servers.structure_server - INFO - Using SMILES from ccd: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]([C@@H]([C@H](O3)CSC...


2025-12-05 19:11:32,537 - servers.structure_server - INFO - Applying pH 7.4 protonation to SMILES...


2025-12-05 19:11:32,547 - servers.structure_server - INFO - Protonation result: c1nc(c2c(n1)n(cn2)[C@H]3[C@@H]... → Nc1ncnc2c1ncn2[C@@H]1O[C@H](CS... (charge: -1)


2025-12-05 19:11:32,548 - servers.structure_server - INFO - Protonated SMILES at pH 7.4: Nc1ncnc2c1ncn2[C@@H]1O[C@H](CSCC[C@H](N)C(=O)[O-])...


2025-12-05 19:11:32,550 - servers.structure_server - INFO - Calculated net charge: -1


2025-12-05 19:11:32,552 - servers.structure_server - INFO - Read PDB: 26 atoms





2025-12-05 19:11:32,557 - servers.structure_server - INFO - Added hydrogens: 45 total atoms


2025-12-05 19:11:32,558 - servers.structure_server - INFO - Running MMFF94 optimization (max 200 iters)...


2025-12-05 19:11:32,578 - servers.structure_server - INFO - MMFF94 optimization did not converge


2025-12-05 19:11:32,580 - servers.structure_server - INFO - Final net charge: -1 (source: dimorphite)


2025-12-05 19:11:32,582 - servers.structure_server - INFO - Wrote prepared ligand: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_2_prepared.sdf


2025-12-05 19:11:32,583 - servers.structure_server - INFO - Successfully cleaned ligand: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_2_prepared.sdf


2025-12-05 19:11:32,584 - servers.structure_server - INFO -   ✓ Ligand SAH (D): cleaned, charge=-1


2025-12-05 19:11:32,585 - servers.structure_server - INFO - Running robust antechamber: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_2_prepared.sdf


2025-12-05 19:11:32,587 - servers.structure_server - INFO - Attempt 1: trying charge = -1


2025-12-05 19:12:01,634 - servers.structure_server - INFO - Antechamber succeeded with charge = -1


2025-12-05 19:12:01,636 - servers.structure_server - INFO - Running parmchk2...


2025-12-05 19:12:02,492 - servers.structure_server - INFO - parmchk2 completed: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_2_prepared.frcmod


2025-12-05 19:12:02,495 - servers.structure_server - INFO - Successfully parameterized ligand: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_2_prepared.gaff.mol2


2025-12-05 19:12:02,496 - servers.structure_server - INFO -     ✓ Parameterized: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_2_prepared.gaff.mol2


2025-12-05 19:12:02,497 - servers.structure_server - INFO - Cleaning ligand: output/8399c47c/ligand_3.pdb (ID: LIG1)




2025-12-05 19:12:02,856 - servers.structure_server - ERROR - No SMILES found for ligand LIG1




2025-12-05 19:12:02,860 - servers.structure_server - INFO - Cleaning ligand: output/8399c47c/ligand_4.pdb (ID: LIG1)




2025-12-05 19:12:03,220 - servers.structure_server - ERROR - No SMILES found for ligand LIG1




2025-12-05 19:12:03,227 - servers.structure_server - INFO - Complex preparation complete: output/8399c47c


2025-12-05 19:12:03,229 - servers.structure_server - INFO -   Proteins processed: 2/2


2025-12-05 19:12:03,230 - servers.structure_server - INFO -   Ligands processed: 2/4



 prepare_complex Result

✓ SUCCESS

  - Ligand LIG1 cleaning failed: ['No SMILES found for ligand LIG1', "Hint: Provide SMILES manually via the 'smiles' parameter, or add it to KNOWN_LIGAND_SMILES dictionary"]
  - Ligand LIG1 cleaning failed: ['No SMILES found for ligand LIG1', "Hint: Provide SMILES manually via the 'smiles' parameter, or add it to KNOWN_LIGAND_SMILES dictionary"]

Details:
  job_id: 322818b5
  output_dir: output/8399c47c
  source_file: boltz_results_ligand/predictions/ligand/ligand_model_0.cif
  inspection: [complex data, dict]
  split: [complex data, dict]
  proteins: [complex data, 2]
  ligands: [complex data, 4]

--- Summary ---
Output directory: output/8399c47c

Structure: 2 proteins, 4 ligands

Proteins processed: 2
  ✓ Chain A: output/8399c47c/protein_1.amber.pdb
      Atoms: 5848
  ✓ Chain B: output/8399c47c/protein_2.amber.pdb
      Atoms: 5848

Ligands processed: 4
  ✓ SAH (Chain C)
      SDF: /Users/yasu/tmp/mcp-md/notebooks/output/8399c47c/ligand_1_prepare

---
## Summary

This notebook tested all tools in `structure_server.py`:

| Tool | Tests | Purpose |
|------|-------|---------|
| `fetch_molecules` | 2 | Fetch structures from PDB/AlphaFold/PDB-REDO |
| `inspect_molecules` | 2 | Inspect structure files and analyze chains/molecules |
| `split_molecules` | 3 | Split multi-chain structures (including Boltz-2) |
| `clean_protein` | 3 | Clean and prepare proteins for MD |
| `clean_ligand` | 3 | Clean ligands using SMILES template matching |
| `run_antechamber_robust` | 2 | GAFF2 parameterization with AM1-BCC charges |
| `prepare_complex` | 1 | Complete workflow (split + clean + parameterize) |

### Key Features Tested:
- **LLM-friendly error handling**: All tools return structured `success`/`errors`/`warnings` fields
- **SMILES template matching**: Correct bond orders from CCD or user-provided SMILES
- **pH-dependent protonation**: Dimorphite-DL assigns the protonation state and net charge at the target pH (7.4 in these tests)
- **GAFF2 parameterization**: AM1-BCC charges with robust error handling
- **frcmod validation**: Check for missing/estimated parameters
- **Boltz-2 support**: Full workflow for AI-predicted protein-ligand complexes
- **One-step workflow**: `prepare_complex` combines all steps for convenience
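
Because every tool returns `success`/`errors`/`warnings`, a caller can tell a clean run from one that succeeded overall but skipped some inputs, as in the `prepare_complex` run above, where two of four ligands had no SMILES. The following is a minimal sketch of such a check, not code from `structure_server.py`. The helper name `classify_result` and the example dictionary are hypothetical, and whether per-ligand failures land in `errors` or `warnings` is an assumption.

```python
from typing import Any


def classify_result(result: dict[str, Any]) -> str:
    """Classify a tool result as 'failed', 'partial', or 'ok'.

    Hypothetical helper: it only assumes the result carries the
    `success`, `errors`, and `warnings` fields described above.
    """
    if not result.get("success", False):
        return "failed"
    # Overall success, but something was reported along the way.
    if result.get("errors") or result.get("warnings"):
        return "partial"
    return "ok"


# Hypothetical result shaped like the prepare_complex output above;
# placing the ligand failure under `warnings` is an assumption.
example = {
    "success": True,
    "errors": [],
    "warnings": ["Ligand LIG1 cleaning failed: No SMILES found for ligand LIG1"],
}

print(classify_result(example))
```

Treating a non-empty `errors` or `warnings` list as "partial" keeps a run like the one above from being reported as fully successful when some ligands were never parameterized.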


In [63]:
print("All tests completed!")


All tests completed!
