## Overview

Use the `af3.ipynb` notebook to run AlphaFold 3 (AF3) structure predictions on multiple input PDB files. Place every PDB file whose structure you want to predict in `pdb_input_folder`. The notebook generates one JSON input file per PDB file, which AF3 then uses to run the prediction. Once the jobs have finished, run the analysis cells to collect the RMSD, pAE, pLDDT, and other relevant AF3 metrics, and to compare each prediction with its initial PDB file.

In [6]:
import os
import pandas as pd
from AF3_functions import *

In [2]:
pdb_input_folder = "input_pdbs"
AF_output_folder = "AF3_outputs"
json_folder = "JsonInputs"
slurm_folder = "SlurmScripts"
dialect="alphafold3" # Usually, stick to this
version=1 # Usually, stick to this
seeds_number=1
predefined_seeds=False
predefined_seeds_list=None
account="ajitj99"
general_env = "/nfs/turbo/umms-ajitj/conda_envs/myenv"
os.makedirs(AF_output_folder, exist_ok=True)
os.makedirs(slurm_folder, exist_ok=True)
os.makedirs(json_folder, exist_ok=True)
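
Before submitting anything, it can help to confirm which files the notebook will pick up. The following is a minimal sketch, assuming inputs are recognized by a `.pdb` extension; how `AF3_functions` actually discovers files is not shown here, and `list_input_pdbs` is a hypothetical helper rather than part of that module.

```python
from pathlib import Path


def list_input_pdbs(folder):
    """Return the sorted PDB file paths found directly inside `folder`.

    Hypothetical helper for illustration; not part of AF3_functions.
    """
    # Assumption: inputs are identified by a case-insensitive ".pdb" suffix.
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() == ".pdb"
    )
```

Calling `list_input_pdbs(pdb_input_folder)` and printing the result gives a quick check that the folder is not empty and contains no stray files.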


### Instructions for MSA

**AF3 builds MSAs**

`generate_json_from_pdb(..., msa=True)`

**MSA-free**

`generate_json_from_pdb(..., msa=False)`

**Use own unpaired A3M**

> unpaired = {
    "A": ">A\nGGGGGGGGGGGGGGGGGGGGGGGG\n",
    "B": ">B\nSSKEVAELK...SRQTVA\n",
    "C": ">C\nVSKEVAELK...SRQTVA\n",
}


`generate_json_from_pdb(...,
    msa=False,
    unpaired_msa=unpaired
)`

Or via file paths:

> unpaired_path = {
    "A": "/path/to/A.a3m",
    "B": "/path/to/B.a3m",
    "C": "/path/to/C.a3m",
}

`generate_json_from_pdb(...,
    msa=False,
    unpaired_msa_path=unpaired_path
)`
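
The two dictionaries above can also be built programmatically. This is a sketch under stated assumptions: `single_sequence_a3m` and `collect_unpaired_paths` are hypothetical helpers (not part of `AF3_functions`), the A3M string format simply mirrors the `">A\n...\n"` entries shown above, and the per-chain files are assumed to be named `<chain>.a3m` as in the example paths.

```python
from pathlib import Path


def single_sequence_a3m(chain_id, sequence):
    """Build a one-sequence A3M string in the same shape as the examples above."""
    return f">{chain_id}\n{sequence}\n"


def collect_unpaired_paths(msa_dir, chains):
    """Map each chain ID to <msa_dir>/<chain>.a3m, failing early if a file is missing.

    Assumption: one A3M file per chain, named after the chain ID.
    """
    paths = {}
    for chain in chains:
        a3m = Path(msa_dir) / f"{chain}.a3m"
        if not a3m.is_file():
            raise FileNotFoundError(f"No A3M file for chain {chain}: {a3m}")
        paths[chain] = str(a3m)
    return paths
```

The resulting dictionaries can then be passed as `unpaired_msa` or `unpaired_msa_path`, respectively. Checking for missing files up front avoids discovering the problem only after a job has been queued.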



In [3]:
# Parameters to run AF3
msa = False
unpaired_msa=None
unpaired_msa_path=None

In [6]:
job_ids = submit_af3_jobs(
    pdb_input_folder, AF_output_folder, json_folder,
    dialect, version, slurm_folder, account,
    seeds_number, predefined_seeds, predefined_seeds_list,
    msa, unpaired_msa, unpaired_msa_path,
)
print(f"Submitted jobs: {job_ids}")

Submitting input_pdbs/8.pdb
JsonInputs/8.json json file generated
AlphaFold3 job for 8.pdb submitted with Job ID: 35788053
Submitted jobs: ['35788053']
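
The analysis should only be run once the submitted jobs have finished. One way to check is to query the scheduler for the returned job IDs. This is a rough sketch, assuming the cluster runs SLURM and that `squeue` is available on the `PATH`; `parse_squeue_states` and `job_states` are hypothetical helpers, not part of `AF3_functions`.

```python
import subprocess


def parse_squeue_states(text):
    """Parse lines of the form 'JOBID STATE' into a {job_id: state} dict."""
    states = {}
    for line in text.splitlines():
        parts = line.split()
        if len(parts) == 2:
            states[parts[0]] = parts[1]
    return states


def job_states(job_ids):
    """Ask squeue for the state of each job ID.

    Jobs that squeue no longer lists (typically finished or failed) are
    reported as 'NOT_IN_QUEUE'; this label is our own, not a SLURM state.
    """
    ids = [str(job_id) for job_id in job_ids]
    result = subprocess.run(
        ["squeue", "--noheader", "--format=%i %T", "--jobs", ",".join(ids)],
        capture_output=True,
        text=True,
    )
    listed = parse_squeue_states(result.stdout)
    return {job_id: listed.get(job_id, "NOT_IN_QUEUE") for job_id in ids}
```

Calling `job_states(job_ids)` with the list returned by `submit_af3_jobs` shows which predictions are still pending or running. A job missing from the queue is not necessarily successful, so it is still worth confirming that its output folder was populated.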


In [4]:
# Parameters to analyze AF3 results
# AF_output_folder is the same as FinishedPredictions
# InitialGeometries is the same as pdb_input_folder
pae_omit_pairs = [['B','C']] # e.g. [['B','C']] or []
plddt_chains = ['A', 'B', 'C']

for output_folder in os.listdir(AF_output_folder):
    output_folder_path = os.path.join(AF_output_folder, output_folder)
    if not os.path.isdir(output_folder_path):
        continue  # skip loose files such as FINAL_AF3_RESULTS.csv
    analyze_af_predictions(account, general_env, output_folder_path, pdb_input_folder, plddt_chains, pae_omit_pairs)

In [14]:
all_results = []
for output_folder in os.listdir(AF_output_folder):
    output_folder_path = os.path.join(AF_output_folder, output_folder)
    if not os.path.isdir(output_folder_path):
        continue
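    result_csv = os.path.join(output_folder_path, "AF_results.csv")
    if not os.path.isfile(result_csv):
        continue  # the analysis has not produced results for this folder yet
    df = pd.read_csv(result_csv)
    all_results.append(df)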

final_df = pd.concat(all_results, ignore_index=True)
final_output_path = os.path.join(AF_output_folder, "FINAL_AF3_RESULTS.csv")
final_df.to_csv(final_output_path, index=False)
final_df.head()

| Unnamed: 0 | ID | AF_rank | average_plddt | ipTM | iPAE | rmsd | ranking_score | geometry |
|---|---|---|---|---|---|---|---|---|
| 0 | 8 | 4 | 65.936629 | 0.09 | 25.875938 | 13.966556 | 0.776395 | 8 |
| 1 | 8 | 5 | 65.357652 | 0.08 | 26.836009 | 15.636184 | 0.731407 | 8 |
| 2 | 8 | 1 | 66.43072 | 0.085 | 26.134678 | 14.595152 | 0.795276 | 8 |
| 3 | 8 | 2 | 65.805303 | 0.085 | 25.888335 | 12.291242 | 0.79013 | 8 |
| 4 | 8 | 3 | 66.068977 | 0.08 | 26.649759 | 14.280988 | 0.783624 | 8 |
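
With several models per input, a common next step is to keep only the top model for each `ID`. Below is a minimal sketch, assuming that a higher `ranking_score` is better (consistent with `AF_rank` 1 having the highest score in the rows above); `best_per_id` is a hypothetical helper, and the example frame reuses three of those rows with only the columns needed here.

```python
import pandas as pd


def best_per_id(df, score_col="ranking_score"):
    """Return, for each ID, the row with the highest value in `score_col`."""
    # idxmax gives the index label of the best-scoring row within each ID group.
    best_idx = df.groupby("ID")[score_col].idxmax()
    return df.loc[best_idx].reset_index(drop=True)


# Three of the rows shown above, reduced to the columns used in this sketch.
example = pd.DataFrame({
    "ID": [8, 8, 8],
    "AF_rank": [4, 5, 1],
    "rmsd": [13.966556, 15.636184, 14.595152],
    "ranking_score": [0.776395, 0.731407, 0.795276],
})
best = best_per_id(example)
```

In the notebook, the same call would be `best_per_id(final_df)`. Note that the top-ranked model is not necessarily the one with the lowest RMSD to the initial structure, so it can be worth inspecting both columns.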
