# Example: End-To-End *De Novo* Protein Design Pipeline

## Overview

This notebook demonstrates an end-to-end protein design workflow using three deep learning networks from the Institute for Protein Design:

| Step | Model | Purpose |
|------|-------|---------|
| 1. **Backbone Generation** | RFD3 | Generate novel protein backbones via diffusion |
| 2. **Sequence Design** | MPNN | Design amino acid sequences for the generated backbone |
| 3. **Structure Validation** | RF3 | Predict the structure from designed sequence to validate designability |

All models are unified through [AtomWorks](https://github.com/RosettaCommons/atomworks) (for both inference and training), relying on Biotite `AtomArray` objects.

This notebook assumes the base checkpoints are already downloaded (`foundry install rfd3 ligandmpnn rf3`); alternatively, you can specify the checkpoint paths yourself. To register your Foundry virtual environment as a Jupyter kernel, run `python -m ipykernel install --user --name=foundry --display-name "foundry"`.

### Pipeline Flow
```
RFD3 (backbone) → MPNN (sequence) → RF3 (validation) → RMSD comparison
```
---

## Section 0: Installation

Install the Foundry package (includes RFD3, MPNN, and RF3):

```bash
pip install 'rc-foundry[all]'
```

Download the model weights (~6 GB total, takes a couple of minutes):

```bash
foundry install rfd3 ligandmpnn rf3
```

---

In [1]:
import os

# Force-disable cuequivariance (including hard-coded cases)
os.environ["SHOULD_USE_CUEQUIVARIANCE"] = "0"
os.environ["CUEQUIVARIANCE_USE_FALLBACK"] = "1"          # supported by some versions
os.environ["DISABLE_CUEQUIVARIANCE"] = "1"               # stronger disable
os.environ["TORCH_DTYPE"] = "fp16"                       # force fp16
os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"

print("cuequivariance force-disabled")
print(os.environ.get("SHOULD_USE_CUEQUIVARIANCE"))
print(os.environ.get("DISABLE_CUEQUIVARIANCE"))

cuequivariance force-disabled
0
1


In [2]:
# Shared utilities for visualization (from AtomWorks)
from atomworks.io.utils.visualize import view

Environment variable CCD_MIRROR_PATH not set. Will not be able to use function requiring this variable. To set it you may:
  (1) add the line 'export VAR_NAME=path/to/variable' to your .bashrc or .zshrc file
  (2) set it in your current shell with 'export VAR_NAME=path/to/variable'
  (3) write it to a .env file in the root of the atomworks.io repository
Environment variable PDB_MIRROR_PATH not set. Will not be able to use function requiring this variable. To set it you may:
  (1) add the line 'export VAR_NAME=path/to/variable' to your .bashrc or .zshrc file
  (2) set it in your current shell with 'export VAR_NAME=path/to/variable'
  (3) write it to a .env file in the root of the atomworks.io repository


## Section 1: Backbone Generation with RFD3

RFdiffusion3 (RFD3) generates *de novo* all-atom proteins that meet specific conditioning requirements.

**Parameters Used** *(many more are available for more complex protein design tasks)*:
- `length`: Target protein length in residues
- `diffusion_batch_size`: Number of structures to generate per batch
- `n_batches`: Number of batches to run

**Outputs:** Dictionary of `RFD3Output` objects.
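
As a quick sanity check on the two batching parameters, the total number of designs is their product. The helper below is a hypothetical illustration (`plan_run` is not part of the RFD3 API); the optional `seconds_per_batch` argument is an assumption you would fill in from your own logs.

```python
def plan_run(diffusion_batch_size, n_batches, seconds_per_batch=None):
    """Return (total designs, estimated runtime in seconds or None).

    Hypothetical helper: it only multiplies the two batching parameters
    and, if a per-batch timing is supplied, scales it by the batch count.
    """
    if diffusion_batch_size < 1 or n_batches < 1:
        raise ValueError("diffusion_batch_size and n_batches must be >= 1")
    total = diffusion_batch_size * n_batches
    eta = None if seconds_per_batch is None else n_batches * seconds_per_batch
    return total, eta


total, eta = plan_run(diffusion_batch_size=4, n_batches=10)
print(f"Designs requested: {total}")
```

With `diffusion_batch_size=4` and `n_batches=10`, this corresponds to 40 requested designs.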

In [3]:
from lightning.fabric import seed_everything
from rfd3.engine import RFD3InferenceConfig, RFD3InferenceEngine

20:41:55 DEBUG transforms: Debug mode is on


In [4]:
# Set seed for reproducibility
seed_everything(0)

Seed set to 0


0

In [14]:
# run1
import json
from rfd3.engine import RFD3InferenceConfig, RFD3InferenceEngine

# Load the design specification for the PD-L1 binder example
with open('/home/alex/aidd/foundry/models/rfd3/docs/protein_binder_design.json') as f:
    spec = json.load(f)['pdl1_clean']  # or ['pdl1']

config = RFD3InferenceConfig(
    specification=spec,      # dict, passed in directly
    diffusion_batch_size=4,
)
engine = RFD3InferenceEngine(**config)

outputs = engine.run(
    inputs='/home/alex/aidd/foundry/models/rfd3/docs/input_pdbs/pd_l1_clean.pdb',  # template PDB path
    out_dir='/home/alex/aidd/foundry/out_cdr',
    n_batches=10,            # 4 x 10 = 40 designs
)

19:42:36 INFO rfd3.engine: [rank: 0] Outputs will be written to /home/alex/aidd/foundry/out_cdr.
19:42:36 INFO rfd3.engine: [rank: 0] Prevalidating design specification for example: pd_l1_clean
19:42:36 INFO rfd3.engine: [rank: 0] Found 0 existing example IDs in the output directory.
Using bfloat16 Automatic Mixed Precision (AMP)
19:43:21 INFO rfd3.engine: [rank: 0] Finished inference batch in 33.88 seconds.
19:43:21 INFO rfd3.engine: [rank: 0] Outputs for pd_l1_clean_4_model_0 written to /home/alex/aidd/foundry/out_cdr/pd_l1_clean_4_model_0.
19:43:21 INFO rfd3.engine: [rank: 0] Outputs for pd_l1_clean_4_model_1 written to /home/alex/aidd/foundry/out_cdr/pd_l1_clean_4_model_1.
19:43:21 INFO rfd3.engine: [rank: 0] Outputs for pd_l1_clean_4_model_2 written to /home/alex/aidd/foundry/out_cdr/pd_l1_clean_4_model_2.
19:43:21 INFO rfd3.engine: [rank: 0] Outputs for pd_l1_clean_4_model_3 written to /home/alex/aidd/foundry/out_cdr/pd_l1_clean_4_model_3.
19:43:58 INFO rfd3.engine: [rank: 0] Fin

In [15]:
# View generated example IDs (one key per generated structure)
outputs.keys()

dict_keys([])

In [5]:
from pathlib import Path
from atomworks.io.utils.io_utils import load_any
import biotite.structure as struc

# 1. List all generated CIF files
cif_files = sorted(Path('out_cdr').glob('pd_l1_clean*_model_*.cif.gz'))

# 2. Take the first one
first_cif = cif_files[0]
atom_array = load_any(str(first_cif), model=1)

# 3. Visualize
view(atom_array)

# 4. (Optional) Print a summary
print(f'Loaded {first_cif.name}  ({atom_array.shape[0]} atoms)')

Loaded pd_l1_clean_0_model_0.cif.gz  (982 atoms)
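
When designs are written to `out_dir`, it can help to group the files before picking one to load. The sketch below assumes the naming scheme `<example>_<index>_model_<n>.cif.gz`, which is inferred from the log lines and file names above rather than from RFD3 documentation; `group_designs` and `NAME_RE` are hypothetical names.

```python
import re
from collections import defaultdict

# Assumed naming scheme (inferred from the logs, not documented behaviour):
#   <example>_<index>_model_<n>.cif.gz
NAME_RE = re.compile(r"^(?P<example>.+)_(?P<index>\d+)_model_(?P<model>\d+)\.cif\.gz$")


def group_designs(filenames):
    """Group design file names by (example, index), ordered by model number."""
    groups = defaultdict(list)
    for name in filenames:
        match = NAME_RE.match(name)
        if match is None:
            continue  # skip files that do not follow the assumed scheme
        key = (match["example"], int(match["index"]))
        groups[key].append((int(match["model"]), name))
    # Sort each group by model number and drop the sort key
    return {key: [n for _, n in sorted(items)] for key, items in groups.items()}


names = [
    "pd_l1_clean_4_model_1.cif.gz",
    "pd_l1_clean_4_model_0.cif.gz",
    "pd_l1_clean_0_model_0.cif.gz",
]
grouped = group_designs(names)
for key, files in sorted(grouped.items()):
    print(key, files)
```

In the notebook, `names` could be built with `[p.name for p in cif_files]`.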


In [24]:
# Inspect RFD3 outputs and extract the generated backbone
for idx, data in outputs.items():
    print(f"Batch {idx}: {len(data)} structure(s)")
    print(f"  Output type: {type(data[0]).__name__}")
    print(f"  AtomArray: {data[0].atom_array}")

# Extract the first generated backbone for downstream use
first_key = next(iter(outputs.keys()))
atom_array = outputs[first_key][0].atom_array

# Visualize the generated backbone
view(atom_array)

ModuleNotFoundError: No module named 'atomworks.visualize'

---

## Section 2: Sequence Design with MPNN

ProteinMPNN and LigandMPNN are message passing neural networks (MPNNs) that design amino acid sequences intended to fold into a target backbone structure.

**Model Options:**
- `protein_mpnn`: Original ProteinMPNN for protein-only design
- `ligand_mpnn`: Extended model supporting ligand-aware design

**Key Parameters:**
- `batch_size`: Number of sequences to generate per structure
- `remove_waters`: Whether to exclude water molecules from context

In [6]:
from mpnn.inference_engines.mpnn import MPNNInferenceEngine

# Configure MPNN inference engine
# See mpnn.utils.inference.MPNN_GLOBAL_INFERENCE_DEFAULTS for all options
engine_config = {
    "model_type": "protein_mpnn",  # or "ligand_mpnn" for ligand-aware design
    "is_legacy_weights": True,    # Required for now for ligand_mpnn and protein_mpnn
    "out_directory": None,        # Return results in memory
    "write_structures": False,
    "write_fasta": False,
}

# Configure per-input inference options
# See mpnn.utils.inference.MPNN_PER_INPUT_INFERENCE_DEFAULTS for all options
input_configs = [
    {
        "batch_size": 10,         # Generate 10 sequences per structure
        "remove_waters": True,
    }
]

# Run sequence design on the RFD3-generated backbone
model = MPNNInferenceEngine(**engine_config)
mpnn_outputs = model.run(input_dicts=input_configs, atom_arrays=[atom_array])

20:42:16 INFO mpnn.inference_engines.mpnn: [rank: 0] Loading legacy MPNN weights.
20:42:17 INFO mpnn.utils.inference: Annotated AtomArray has 2097 atoms 
20:42:17 INFO mpnn.inference_engines.mpnn: [rank: 0] Running MPNN inference for input 0, batch 0...


In [7]:
from biotite.structure import get_residue_starts
from biotite.sequence import ProteinSequence

# Extract and display the designed sequences
print(f"Generated {len(mpnn_outputs)} designed sequences:\n")

for i, item in enumerate(mpnn_outputs):
    res_starts = get_residue_starts(item.atom_array)
    # Convert 3-letter codes to 1-letter using Biotite
    seq_1letter = ''.join(
        ProteinSequence.convert_letter_3to1(res_name)
        for res_name in item.atom_array.res_name[res_starts]
    )
    print(f"Sequence {i+1}: {seq_1letter}")

Generated 10 designed sequences:

Sequence 1: SEKEKEELEKLIEELKEFKVSVPTSTYEVKEGSDMSISCNFPVGDSLDVDKLIVLWTKDGKTIIVYVKGKVLEELVDPEYHERASLDLSSLPNGVATLNIKNVTKEDAGTYTCVVSYNGSDFATITVKVVS
Sequence 2: SAAELEALRQEIEALRAFTVSVPESVYEVKLGSDASITCNFPVGSSLDISKLIVLWSKDGKTIIVWVQGKLLEEKIDPRYHERAYLDLSSLPNGRATLVIKNVTLEDAGVYTCVVDYNGLDSAKIEVKVIA
Sequence 3: SEELKKKLDELIEKLKEFTVSVPQSTYYVKEGSTASISCLFPVGESLDVENLIVLWQKDGKTIIVYVQGKVLEELIDPRFHDRAYLDLSSLPNGEATLVIKNVTKEDAGTYLCVVSYNGSDFVEIKLIVES
Sequence 4: SEEEKKELLELIEELREFTVSVPTETYEVKLGSTATISCNFPVGSSLDVSELIVLWSKDGKTIIVYVQGKVLKELVDPRYHERAYLDLSSLPNGVASLVIKNVTEEDAGTYTCVVSYNGSDFKKITLKVVS
Sequence 5: SEEEKEELEELVEELLEFKVSVPKSVYEVKEGSTMSISCNFPVGESLDMSELVVLWEKDGETIIVYVRGEVLKEKVDPRYHERASLDLSSLPKGQATLVIKNVTKEDAGTYTCVVKYNGLDSKTIEVKVVS
Sequence 6: SAAEQAARDAEIAALRAFTVSVPSSTYNVKLGSTASISCNFPVGSSLDVSKLTVLWEKDGKRIIVWVQGKELKEKIDPRYHDRASLDKSSLPNGEATLIIKNVTKEDAGTYTCTVSYNGSDSKTITVNVVG
Sequence 7: SEEEKKKLKELIEKLKEFKVTVPQTTYNVKLGSDASISCNFPVGDSLNVEELIVLWSKDGKTIIVWVKGKVLEELVDPRYHERASLEKSK
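
To get a rough sense of how diverse the designed sequences are, one option is a position-wise identity between pairs of designs. This is a minimal sketch: `sequence_identity` is a hypothetical helper, it performs no alignment, and it assumes both sequences have the same length (which holds for designs on the same backbone). The two strings are the first 20 residues of Sequence 1 and Sequence 2 above.

```python
def sequence_identity(seq_a: str, seq_b: str) -> float:
    """Fraction of identical positions between two equal-length sequences."""
    if len(seq_a) != len(seq_b):
        raise ValueError("sequences must have the same length (no alignment is performed)")
    if not seq_a:
        raise ValueError("sequences must be non-empty")
    matches = sum(a == b for a, b in zip(seq_a, seq_b))
    return matches / len(seq_a)


# First 20 residues of Sequence 1 and Sequence 2 from the output above
seq_1 = "SEKEKEELEKLIEELKEFKV"
seq_2 = "SAAELEALRQEIEALRAFTV"
identity = sequence_identity(seq_1, seq_2)
print(f"Identity over the first 20 residues: {identity:.2f}")
```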

---

## Section 3: Structure Prediction with RF3

RF3 (RoseTTAFold 3) predicts protein structures from sequences. By re-folding the MPNN-designed sequence, we can validate whether the design is likely to adopt the intended backbone structure.

**Outputs:** `RF3Output` objects containing:
- `atom_array`: Predicted structure as Biotite AtomArray
- `summary_confidences`: Overall confidence metrics (pLDDT, PAE, pTM, etc.)
- `confidences`: Per-atom/residue confidence scores

**Confidence Metrics:**
| Metric | Description |
|--------|-------------|
| pLDDT | Per-residue confidence (0-1, higher is better) |
| PAE | Predicted Aligned Error (lower is better) |
| pTM | Predicted TM-score |
| ranking_score | Overall model quality score |
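
These metrics are often combined into a simple pass/fail filter. The sketch below is illustrative only: the threshold values are assumptions chosen for the example, not recommendations from the RF3 authors, and `passes_confidence_filter` is a hypothetical helper. The dictionary keys match those in `summary_confidences`, and the sample values are the ones reported by the RF3 run later in this notebook.

```python
def passes_confidence_filter(summary, min_plddt=0.8, max_pae=5.0, min_ptm=0.7):
    """Return True when a prediction clears every (hypothetical) threshold."""
    return bool(
        summary["overall_plddt"] >= min_plddt   # higher is better
        and summary["overall_pae"] <= max_pae   # lower is better
        and summary["ptm"] >= min_ptm           # higher is better
        and not summary["has_clash"]
    )


# Values copied from the summary confidences printed later in this notebook
example_summary = {
    "overall_plddt": 0.706,
    "overall_pae": 13.76,
    "ptm": 0.487,
    "has_clash": False,
}
print(passes_confidence_filter(example_summary))
```

Tune the thresholds to your own design task; stricter cutoffs trade yield for confidence.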

In [8]:
# import os

# os.environ["PYTORCH_CUDA_ALLOC_CONF"] = "expandable_segments:True"  # avoid fragmentation
# os.environ["CUDA_VISIBLE_DEVICES"] = "0"  # ensure a single GPU

# os.environ["SHOULD_USE_CUEQUIVARIANCE"] = "0"
# os.environ["TORCH_DTYPE"] = "fp16"          # optional, extra enforcement
# # Not needed after a kernel restart, but restarting the kernel after running this and before the RF3 code is recommended
print("Environment variables set:")
print("SHOULD_USE_CUEQUIVARIANCE =", os.environ.get("SHOULD_USE_CUEQUIVARIANCE"))
print("TORCH_DTYPE =", os.environ.get("TORCH_DTYPE"))

Environment variables set:
SHOULD_USE_CUEQUIVARIANCE = 0
TORCH_DTYPE = fp16


In [9]:
from rf3.inference_engines.rf3 import RF3InferenceEngine
from rf3.utils.inference import InferenceInput


# Initialize RF3 inference engine
# inference_engine = RF3InferenceEngine(ckpt_path='rf3', verbose=False)
inference_engine = RF3InferenceEngine(
    ckpt_path='rf3',
    verbose=False,
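    n_recycles=1,              # default 10 -> reduced to 1 to cut memory sharply (recycling is the main memory consumer)
    diffusion_batch_size=1,    # default 5 -> reduced to 1 (a batch of 1 uses the least memory)
    num_steps=20               # default 50 -> reduced to 20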
)

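# Create the RF3 input structure and re-fold it to check that it adopts the intended structure.
# Note: `atom_array` here is the structure loaded from the RFD3 output; to validate an
# MPNN design instead, pass `mpnn_outputs[0].atom_array`.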
input_structure = InferenceInput.from_atom_array(atom_array, example_id="example_protein")
rf3_outputs = inference_engine.run(inputs=input_structure)

# Outputs: dict mapping example_id -> list[RF3Output] (multiple models per input)
print(f"Output keys: {rf3_outputs.keys()}")
print(f"Number of models for 'example_protein': {len(rf3_outputs['example_protein'])}")

20:43:07 INFO rdkit: Enabling RDKit 2025.03.6 jupyter extensions
20:43:07 INFO rf3.inference_engines.rf3: [rank: 0] Loading checkpoint from /home/alex/.foundry/checkpoints/rf3_foundry_01_24_latest_remapped.ckpt...
Using bfloat16 Automatic Mixed Precision (AMP)
20:43:15 INFO rf3.inference_engines.rf3: [rank: 0] Found 1 structures to predict!
20:43:16 INFO rf3.inference_engines.rf3: [rank: 0] Predicting structure 1/1: example_protein


Output keys: dict_keys(['example_protein'])
Number of models for 'example_protein': 1


In [38]:
from rf3.inference_engines.rf3 import RF3InferenceEngine
from rf3.utils.inference import InferenceInput


# Initialize RF3 inference engine
inference_engine = RF3InferenceEngine(ckpt_path='rf3', verbose=False)

# Create the RF3 input structure and re-fold it to check that it adopts the intended structure.
# Note: this passes `atom_array`; to re-fold an MPNN design, pass `mpnn_outputs[0].atom_array` instead.
input_structure = InferenceInput.from_atom_array(atom_array, example_id="example_protein")
rf3_outputs = inference_engine.run(inputs=input_structure)

# Outputs: dict mapping example_id -> list[RF3Output] (multiple models per input)
print(f"Output keys: {rf3_outputs.keys()}")
print(f"Number of models for 'example_protein': {len(rf3_outputs['example_protein'])}")

20:34:22 INFO rf3.inference_engines.rf3: [rank: 0] Loading checkpoint from /home/alex/.foundry/checkpoints/rf3_foundry_01_24_latest_remapped.ckpt...
Using bfloat16 Automatic Mixed Precision (AMP)


InstantiationException: Error in call to target 'rf3.model.RF3.RF3WithConfidence':
OutOfMemoryError('CUDA out of memory. Tried to allocate 20.00 MiB. GPU 0 has a total capacity of 11.55 GiB of which 57.69 MiB is free. Process 61153 has 3.07 GiB memory in use. Including non-PyTorch memory, this process has 7.59 GiB memory in use. Of the allocated memory 7.46 GiB is allocated by PyTorch, and 12.06 MiB is reserved by PyTorch but unallocated. If reserved but unallocated memory is large try setting PYTORCH_CUDA_ALLOC_CONF=expandable_segments:True to avoid fragmentation.  See documentation for Memory Management  (https://pytorch.org/docs/stable/notes/cuda.html#environment-variables)')
full_key: model.net

In [10]:
# Extract the top-ranked prediction
rf3_output = rf3_outputs["example_protein"][0]

# Inspect RF3Output structure
print(f"RF3Output contains:")
print(f"  - atom_array: {len(rf3_output.atom_array)} atoms")
print(f"  - summary_confidences: {list(rf3_output.summary_confidences.keys())}")
print(f"  - confidences: {list(rf3_output.confidences.keys()) if rf3_output.confidences else None}")

# Visualize the predicted structure
view(rf3_output.atom_array)

RF3Output contains:
  - atom_array: 1029 atoms
  - summary_confidences: ['chain_ptm', 'chain_pair_pae_min', 'chain_pair_pde_min', 'chain_pair_pae', 'chain_pair_pde', 'overall_plddt', 'overall_pde', 'overall_pae', 'ptm', 'iptm', 'has_clash', 'ranking_score']
  - confidences: ['atom_chain_ids', 'atom_plddts', 'pae', 'token_chain_ids', 'token_res_ids']


<py3Dmol.view at 0x705e481fef90>

In [11]:
# Summary confidences: overall model quality metrics
summary = rf3_output.summary_confidences

print("=== Summary Confidences ===")
print(f"  Overall pLDDT:    {summary['overall_plddt']:.3f}")
print(f"  Overall PAE:      {summary['overall_pae']:.2f} A")
print(f"  Overall PDE:      {summary['overall_pde']:.3f}")
print(f"  pTM:              {summary['ptm']:.3f}")
print(f"  ipTM:             {summary.get('iptm', 'N/A (single chain)')}")
print(f"  Ranking score:    {summary['ranking_score']:.3f}")
print(f"  Has clash:        {summary['has_clash']}")

=== Summary Confidences ===
  Overall pLDDT:    0.706
  Overall PAE:      13.76 A
  Overall PDE:      4.652
  pTM:              0.487
  ipTM:             0.20275276899337769
  Ranking score:    0.260
  Has clash:        False


In [5]:
# Detailed per-atom/residue confidences
conf = rf3_output.confidences

print("=== Per-Atom/Residue Confidences ===")
print(f"  atom_plddts:      {len(conf['atom_plddts'])} values (one per atom)")
print(f"  atom_chain_ids:   {len(conf['atom_chain_ids'])} values")
print(f"  token_chain_ids:  {len(conf['token_chain_ids'])} values (one per residue)")
print(f"  token_res_ids:    {len(conf['token_res_ids'])} values")
print(f"  PAE matrix:       {len(conf['pae'])}x{len(conf['pae'][0])}")

# Preview first 10 atom pLDDT scores
import numpy as np
print(f"\nFirst 10 atom pLDDTs: {np.round(conf['atom_plddts'][:10], 2).tolist()}")

NameError: name 'rf3_output' is not defined
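
Per-atom scores are easier to read once aggregated. The sketch below averages `atom_plddts` per chain using `atom_chain_ids`; it assumes these two entries are parallel arrays with one value per atom, as the printout above suggests. `mean_plddt_per_chain` is a hypothetical helper and the toy values are made up for illustration.

```python
import numpy as np


def mean_plddt_per_chain(atom_plddts, atom_chain_ids):
    """Average per-atom pLDDT for each chain ID (assumes parallel per-atom arrays)."""
    plddts = np.asarray(atom_plddts, dtype=float)
    chains = np.asarray(atom_chain_ids)
    if plddts.shape != chains.shape:
        raise ValueError("atom_plddts and atom_chain_ids must have the same length")
    return {str(c): float(plddts[chains == c].mean()) for c in np.unique(chains)}


# Toy data standing in for rf3_output.confidences (illustrative values only)
toy_conf = {
    "atom_plddts": [0.9, 0.8, 0.7, 0.5],
    "atom_chain_ids": ["A", "A", "A", "B"],
}
per_chain = mean_plddt_per_chain(toy_conf["atom_plddts"], toy_conf["atom_chain_ids"])
for chain, value in per_chain.items():
    print(f"Chain {chain}: mean pLDDT {value:.2f}")
```

For a binder design, comparing the designed chain against the target chain this way can show which side of the complex the model is less confident about.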

---

## Section 4: Validation and Export

The final step compares the RF3-predicted structure against the original RFD3-generated backbone. A low backbone RMSD indicates the designed sequence is likely to fold into the intended structure (high designability).
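
For intuition about what "superimpose, then compute RMSD" means, here is a NumPy-only sketch of the Kabsch algorithm. It is not the Biotite implementation used in the cells below; `kabsch_rmsd` is a hypothetical helper that assumes two `(N, 3)` coordinate arrays with atoms in matching order.

```python
import numpy as np


def kabsch_rmsd(reference, mobile):
    """RMSD after optimal rigid-body superposition of `mobile` onto `reference`."""
    ref = np.asarray(reference, dtype=float)
    mob = np.asarray(mobile, dtype=float)
    if ref.shape != mob.shape or ref.ndim != 2 or ref.shape[1] != 3:
        raise ValueError("expected two (N, 3) coordinate arrays of equal shape")
    # Remove translation by centering both coordinate sets
    ref_c = ref - ref.mean(axis=0)
    mob_c = mob - mob.mean(axis=0)
    # Covariance matrix between the two sets and its SVD
    u, _, vt = np.linalg.svd(mob_c.T @ ref_c)
    # Guard against an improper rotation (reflection)
    d = np.sign(np.linalg.det(vt.T @ u.T))
    rotation = vt.T @ np.diag([1.0, 1.0, d]) @ u.T
    fitted = mob_c @ rotation.T
    return float(np.sqrt(((fitted - ref_c) ** 2).sum(axis=1).mean()))


# Toy check: a rotated and translated copy should superimpose almost exactly
rng = np.random.default_rng(0)
coords = rng.normal(size=(50, 3))
theta = np.pi / 3
rot_z = np.array([
    [np.cos(theta), -np.sin(theta), 0.0],
    [np.sin(theta), np.cos(theta), 0.0],
    [0.0, 0.0, 1.0],
])
moved = coords @ rot_z.T + np.array([1.0, -2.0, 0.5])
print(f"RMSD after superposition: {kabsch_rmsd(coords, moved):.6f}")
```

Because the second set is an exact rigid-body copy of the first, the reported RMSD should be zero up to floating-point error; real designs will deviate by the amount the re-folded backbone differs from the generated one.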

In [4]:
from biotite.structure import rmsd, superimpose
from atomworks.constants import PROTEIN_BACKBONE_ATOM_NAMES
import numpy as np

# Get structures for comparison
aa_generated = atom_array              # Original RFD3 backbone (from Section 1)
aa_refolded = rf3_output.atom_array    # RF3-predicted structure

# Filter to backbone atoms (N, CA, C, O)
bb_generated = aa_generated[np.isin(aa_generated.atom_name, PROTEIN_BACKBONE_ATOM_NAMES)]
bb_refolded = aa_refolded[np.isin(aa_refolded.atom_name, PROTEIN_BACKBONE_ATOM_NAMES)]

# Superimpose structures and calculate RMSD
bb_refolded_fitted, _ = superimpose(bb_generated, bb_refolded)
rmsd_value = rmsd(bb_generated, bb_refolded_fitted)

print(f"Backbone RMSD: {rmsd_value:.2f} A")
print(f"\nInterpretation: {'Excellent' if rmsd_value < 1.0 else 'Good' if rmsd_value < 2.0 else 'Moderate'} designability")

NameError: name 'atom_array' is not defined

In [3]:
# First confirm the variables exist (if the earlier cells were run, they should still be in memory)
# If this raises an error, the kernel was restarted and the cells that create atom_array and rf3_output must be re-run

from biotite.structure import rmsd, superimpose
from atomworks.constants import PROTEIN_BACKBONE_ATOM_NAMES
import numpy as np

# The original variables are assumed to be:
# atom_array       ← RFD3-generated backbone (generated)
# rf3_output.atom_array ← RF3-predicted structure (refolded)

# Keep only the backbone atoms of chain A
mask_gen = (atom_array.chain_id == "A")
bb_generated = atom_array[mask_gen][np.isin(atom_array[mask_gen].atom_name, PROTEIN_BACKBONE_ATOM_NAMES)]

mask_ref = (rf3_output.atom_array.chain_id == "A")
bb_refolded = rf3_output.atom_array[mask_ref][np.isin(rf3_output.atom_array[mask_ref].atom_name, PROTEIN_BACKBONE_ATOM_NAMES)]

# Check that the atom counts match
if len(bb_generated) != len(bb_refolded):
    print(f"Warning: CDR-H3 backbone atom counts differ! generated: {len(bb_generated)}, refolded: {len(bb_refolded)}")
else:
    # Superimpose and compute the RMSD
    bb_refolded_fitted, _ = superimpose(bb_generated, bb_refolded)
    rmsd_value = rmsd(bb_generated, bb_refolded_fitted)
    print(f"CDR-H3 Backbone RMSD (chain A only): {rmsd_value:.2f} Å")
    print(f"Interpretation: {'Excellent' if rmsd_value < 1.2 else 'Good' if rmsd_value < 2.0 else 'Moderate' if rmsd_value < 3.0 else 'Poor'} designability")

NameError: name 'atom_array' is not defined
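
The length check above only reports a mismatch. One way to still compare the structures is to restrict both to the atoms they share. The sketch below matches atoms by a `(chain_id, res_id, atom_name)` key; `common_atom_indices` is a hypothetical helper, and the toy keys are made up for illustration.

```python
import numpy as np


def common_atom_indices(keys_a, keys_b):
    """Indices into two atom lists for atoms present in both, in the order of `keys_a`."""
    lookup = {key: i for i, key in enumerate(keys_b)}
    idx_a, idx_b = [], []
    for i, key in enumerate(keys_a):
        j = lookup.get(key)
        if j is not None:
            idx_a.append(i)
            idx_b.append(j)
    return np.array(idx_a, dtype=int), np.array(idx_b, dtype=int)


# Toy keys: (chain_id, res_id, atom_name); the refolded set has one extra atom
generated_keys = [("A", 1, "N"), ("A", 1, "CA"), ("A", 1, "C"), ("A", 2, "N"), ("A", 2, "CA")]
refolded_keys = [("A", 1, "N"), ("A", 1, "CA"), ("A", 1, "C"), ("A", 1, "O"), ("A", 2, "N"), ("A", 2, "CA")]

idx_gen, idx_ref = common_atom_indices(generated_keys, refolded_keys)
print(idx_gen.tolist(), idx_ref.tolist())
```

With Biotite `AtomArray` objects, the keys could be built as `list(zip(arr.chain_id.tolist(), arr.res_id.tolist(), arr.atom_name.tolist()))`, and the returned index arrays used to subset both arrays before calling `superimpose` and `rmsd`. This assumes residue numbering is consistent between the generated and re-folded structures, which should be verified first.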

In [2]:
from biotite.structure import rmsd, superimpose
from atomworks.constants import PROTEIN_BACKBONE_ATOM_NAMES
import numpy as np

# Keep only the chain A structure (CDR-H3)
mask_a = (atom_array.chain_id == "A")
mask_a_refolded = (rf3_output.atom_array.chain_id == "A")

aa_generated_cdr = atom_array[mask_a]
aa_refolded_cdr = rf3_output.atom_array[mask_a_refolded]

# Further filter to backbone atoms (N, CA, C, O)
bb_generated = aa_generated_cdr[np.isin(aa_generated_cdr.atom_name, PROTEIN_BACKBONE_ATOM_NAMES)]
bb_refolded  = aa_refolded_cdr[np.isin(aa_refolded_cdr.atom_name, PROTEIN_BACKBONE_ATOM_NAMES)]

# Superimpose and compute the RMSD (CDR portion only)
bb_refolded_fitted, _ = superimpose(bb_generated, bb_refolded)
rmsd_value = rmsd(bb_generated, bb_refolded_fitted)

print(f"CDR-H3 Backbone RMSD (chain A only): {rmsd_value:.2f} Å")
print(f"Interpretation: {'Excellent' if rmsd_value < 1.2 else 'Good' if rmsd_value < 2.0 else 'Moderate' if rmsd_value < 3.0 else 'Poor'} designability")

NameError: name 'atom_array' is not defined

In [1]:
from biotite.structure import rmsd, superimpose
from atomworks.constants import PROTEIN_BACKBONE_ATOM_NAMES
import numpy as np

# Keep only the backbone atoms of chain A (CDR-H3)
mask_a = aa_generated.chain_id == "A"
bb_generated = aa_generated[mask_a][np.isin(aa_generated[mask_a].atom_name, PROTEIN_BACKBONE_ATOM_NAMES)]

mask_a_ref = aa_refolded.chain_id == "A"
bb_refolded = aa_refolded[mask_a_ref][np.isin(aa_refolded[mask_a_ref].atom_name, PROTEIN_BACKBONE_ATOM_NAMES)]

# Make sure the atom counts match (superimposing mismatched lengths would raise an error)
if len(bb_generated) != len(bb_refolded):
    print("Warning: CDR-H3 atom counts differ!")
else:
    # Superimpose and compute the RMSD
    bb_refolded_fitted, _ = superimpose(bb_generated, bb_refolded)
    rmsd_value = rmsd(bb_generated, bb_refolded_fitted)
    print(f"CDR-H3 Backbone RMSD (chain A): {rmsd_value:.2f} Å")
    print(f"Interpretation: {'Excellent' if rmsd_value < 1.2 else 'Good' if rmsd_value < 2.0 else 'Moderate' if rmsd_value < 3.0 else 'Poor'} designability")


NameError: name 'aa_generated' is not defined

In [14]:
from atomworks.io.utils.io_utils import to_cif_file

# Export structures to CIF format for visualization in PyMOL/ChimeraX
to_cif_file(aa_generated, "generated.cif")
to_cif_file(aa_refolded, "refolded.cif")

print("Exported structures:")
print("  - generated.cif: Original RFD3 backbone")
print("  - refolded.cif:  RF3-predicted structure")



Exported structures:
  - generated.cif: Original RFD3 backbone
  - refolded.cif:  RF3-predicted structure


### Superimposed Result

The image below shows the generated backbone (RFD3) superimposed with the re-folded structure (RF3). Close alignment indicates successful design.

![Superimposed Protein](../docs/_static/superimposed_80_residue_protein.png)