# AFToolKit — Google Colab Demo

This notebook installs AFToolKit via [condacolab](https://github.com/conda-incubator/condacolab) and demonstrates basic usage.

> **Important**: Run the **Step 1** cell and wait for the runtime to restart automatically before running the remaining cells.

## Step 1 — Install condacolab and restart runtime

Run this cell **once**. The runtime will restart automatically after `condacolab.install()`.

In [None]:
!pip install -q condacolab
import condacolab
condacolab.install()  # runtime restarts here — continue from Step 2 after restart

## Step 2 — Verify conda and install dependencies

After the runtime restarts, continue from this step.

In [None]:
import condacolab
condacolab.check()

In [None]:
# Install conda dependencies (includes dm-tree via conda-forge to avoid C++ build failures)
!conda install -y -q -c conda-forge -c bioconda -c pytorch -c nvidia \
    python=3.9 \
    setuptools=59.5.0 \
    openmm=7.7 \
    pdbfixer \
    'cudatoolkit=11.6.*' \
    'pytorch-lightning=1.8.4' \
    'biopython=1.79' \
    'numpy=1.21' \
    'PyYAML=5.4.1' \
    requests \
    catboost \
    'tqdm=4.62.2' \
    'typing-extensions=4.10' \
    'modelcif=0.7' \
    dm-tree \
    'pytorch::pytorch=1.13.*' \
    hmmer=3.3.2 \
    hhsuite=3.3.0 \
    'kalign2=2.04'

In [None]:
# Install pip-only dependencies (dm-tree moved to conda above to avoid build errors)
!pip install -q \
    'triton==2.2.0' \
    'deepspeed==0.12.4' \
    'pandas==2.0.0' \
    'scikit-learn==1.4.*' \
    'scipy==1.11.4' \
    'pygad==3.3.1' \
    biopandas \
    transformers \
    git+https://github.com/NVIDIA/dllogger.git

In [None]:
# (Optional) flash-attention for faster attention kernels — GPU only, skip if no CUDA
# Requires CUDA toolkit; comment out if install fails.
import torch
if torch.cuda.is_available():
    !pip install -q 'flash-attn==1.0.9' --no-build-isolation
else:
    print('No GPU detected — skipping flash-attention')
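
Because the cells above pin many versions, it can help to confirm what actually got installed before moving on. The following is a minimal sketch using only the standard library; the helper name `installed_versions` is hypothetical, and the distribution names passed to it (for example `torch` for the conda `pytorch` package) are assumptions that may need adjusting.

```python
from importlib import metadata


def installed_versions(packages):
    """Map each distribution name to its installed version, or None if absent."""
    versions = {}
    for name in packages:
        try:
            versions[name] = metadata.version(name)
        except metadata.PackageNotFoundError:
            # The distribution is not visible to this Python environment.
            versions[name] = None
    return versions


# Distribution names below are assumptions based on the install cells above.
for name, version in installed_versions(
    ["torch", "pytorch-lightning", "deepspeed", "biopython"]
).items():
    print(f"{name}: {version or 'not installed'}")
```

A `not installed` entry usually means the install cell failed or the package went into a different environment than the running kernel.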

## Step 3 — Install AFToolKit

In [None]:
!pip install -q git+https://github.com/venzera/AFToolkit.git

## Step 4 — Verify installation

In [None]:
import AFToolKit
from AFToolKit.processing.protein_task import ProteinTask
from AFToolKit.processing.openfold_wrapper import OpenFoldWrapper
from AFToolKit.processing.arg_parser import parse_mutations
print('AFToolKit imported successfully')

## Step 5 — Run ProteinTask

Upload your PDB file to `/content/` or use the example path below.
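
If you are unsure which chain IDs your file contains, a quick scan of its `ATOM`/`HETATM` records can tell you what to put in `CHAINS`. This is a minimal sketch that relies only on the fixed-column PDB format (chain ID in column 22); `list_chain_ids` is a hypothetical helper, not part of AFToolKit, and the demo file is synthetic.

```python
import os
import tempfile


def list_chain_ids(pdb_path):
    """Return chain IDs in order of first appearance in ATOM/HETATM records."""
    chains = []
    with open(pdb_path) as handle:
        for line in handle:
            # In the fixed-column PDB format the chain ID is column 22 (index 21).
            if line.startswith(("ATOM", "HETATM")) and len(line) > 21:
                chain = line[21]
                if chain not in chains:
                    chains.append(chain)
    return chains


# Synthetic two-chain file, used only to demonstrate the helper.
with tempfile.TemporaryDirectory() as tmp:
    demo_path = os.path.join(tmp, "demo.pdb")
    with open(demo_path, "w") as handle:
        for serial, chain in enumerate(["A", "A", "B"], start=1):
            handle.write(f"ATOM  {serial:>5}  N   SER {chain}{serial:>4}\n")
    demo_chains = list_chain_ids(demo_path)

print(demo_chains)  # ['A', 'B']
```

On a real upload, call `list_chain_ids('/content/your_protein.pdb')` and copy the IDs you need into `CHAINS`.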

Weights (~900 MB) are downloaded automatically into `./weights/` the first time `init_model()` is called.

In [None]:
import torch

# Detect device: use GPU if available, otherwise CPU
device = 'cuda:0' if torch.cuda.is_available() else 'cpu'
print(f'Using device: {device}')

of_wrapper = OpenFoldWrapper(
    device=device,
    inference_n_recycle=3,
    always_use_template=False,
    side_chain_mask=False,
    return_all_cycles=True,
)
of_wrapper.init_model()

In [None]:
# Set paths — replace with your own PDB path and mutation string
PDB_PATH = '/content/your_protein.pdb'  # e.g. upload via Files panel
CHAINS   = ['A']
MUTATION = 'A:S11A'  # format: chain:<wild-type aa><position><mutant aa>
OUTPUT   = '/content/output'

import os, pickle
os.makedirs(OUTPUT, exist_ok=True)

protein_task = ProteinTask()
protein_task.set_input_protein_task(protein_path=PDB_PATH, chains=CHAINS)
protein_task.set_task_mutants(parse_mutations(MUTATION))
protein_task.set_observable_positions()

protein_task.evaluate(of_wrapper=of_wrapper, store_of_protein=True)

out_path = os.path.join(OUTPUT, 'result.pkl')
with open(out_path, 'wb') as f:  # close the file handle once the dump finishes
    pickle.dump(protein_task, f)
print(f'Saved to {out_path}')
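
To make the mutation string format concrete, here is a minimal sketch that splits a single-point mutation such as `A:S11A` into its parts. It is not AFToolKit's `parse_mutations`, which may accept other forms; `split_mutation` and its regular expression are hypothetical and only illustrate the `chain:WTaaPositionMTaa` layout noted in the cell above.

```python
import re

# One chain character, a colon, wild-type residue, position, mutant residue.
MUTATION_RE = re.compile(
    r"^(?P<chain>[A-Za-z0-9]):(?P<wt>[A-Z])(?P<pos>\d+)(?P<mt>[A-Z])$"
)


def split_mutation(mutation):
    """Split 'A:S11A' into (chain, wild-type aa, position, mutant aa)."""
    match = MUTATION_RE.match(mutation)
    if match is None:
        raise ValueError(f"Unexpected mutation format: {mutation!r}")
    return match["chain"], match["wt"], int(match["pos"]), match["mt"]


print(split_mutation("A:S11A"))  # ('A', 'S', 11, 'A')
```

Checking the string this way before calling `evaluate` can catch typos such as a missing chain prefix early.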

## Step 6 — Extract embeddings

In [None]:
features_list = ['pair', 'lddt_logits', 'plddt']
wt_emb, mt_emb = protein_task.get_protein_embeddings(
    features_list=features_list,
    protein_aggregation='mutpos',
    multi_aggregation='sum',
)
print('WT embedding shape:', wt_emb.shape)
print('MT embedding shape:', mt_emb.shape)
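
One common way to use paired embeddings is to look at the difference between mutant and wild type. The sketch below assumes the two embeddings are array-like with identical shapes and can be converted with `np.asarray` (a tensor on the GPU would first need to be moved to the CPU); `embedding_delta` is a hypothetical helper and the inputs are toy values, not real AFToolKit output.

```python
import numpy as np


def embedding_delta(wt, mt):
    """Return (mt - wt) and the L2 norm of that difference."""
    wt = np.asarray(wt, dtype=float)
    mt = np.asarray(mt, dtype=float)
    if wt.shape != mt.shape:
        raise ValueError(f"Shape mismatch: {wt.shape} vs {mt.shape}")
    delta = mt - wt
    return delta, float(np.linalg.norm(delta))


# Toy stand-ins for wt_emb and mt_emb.
delta, norm = embedding_delta([1.0, 2.0, 3.0], [1.0, 2.0, 5.0])
print(norm)  # 2.0
```

With real results, `embedding_delta(wt_emb, mt_emb)` would give a per-feature difference vector that could serve as input to a downstream model.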