# A note on how to run AlphaFold 3

**Author**: Xiping Gong (xipinggong@hotmail.com, Department of Food Science and Technology, College of Agricultural and Environmental Sciences, University of Georgia, Griffin, GA, USA)

**Date**: 01/22/2025 (first draft); 04/24/2025 (use a modified "af3.sh" script and more)


# Introduction

AlphaFold 2 has revolutionized biomolecular structure prediction by providing accurate 3D protein structures, which can be used effectively for rapid molecular docking (DOI: https://doi.org/10.1038/s41586-021-03819-2). In 2024, AlphaFold 3 was launched, extending this capability to modeling biomolecule-ligand interactions and potentially offering unprecedented precision in studying PFAS binding to critical toxicological targets, such as proteins (DOI: https://doi.org/10.1038/s41586-024-07487-w). Its predictive accuracy was reported to significantly surpass that of traditional molecular docking tools (e.g., AutoDock Vina), providing more opportunities to understand the PFAS-biomolecule binding mechanisms that drive PFAS bioaccumulation and toxicity (DOI: https://doi.org/10.1038/s41586-024-07487-w). The release of the open-source code in November 2024 (Link: https://github.com/google-deepmind/alphafold3) enables high-throughput use, making it possible to rapidly screen a wide array of biomolecule-ligand interactions. These advancements provide a foundation for generating high-quality structural features of PFAS-biomolecule interactions.

This note uses the PFOA-human serum albumin interaction as an example to demonstrate how AlphaFold 3 can be utilized for docking. Additionally, I discuss the docking results and compare them to the outcomes obtained using AutoDock Vina from our previous note.

AlphaFold 3: https://github.com/google-deepmind/alphafold3


# An example: PFOA - human serum albumin (hSA) protein

The goal of this example is to show how AlphaFold 3 can be used to predict the binding of PFOA to the hSA protein.
To test it, I integrated all scripts (Python and Bash), so that the same workflow can automatically screen other PFAS molecules.


## Background

**Reference**
Maso, Lorenzo, et al. "Unveiling the binding mode of perfluorooctanoic acid to human serum albumin." Protein Science 30.4 (2021): 830-841. DOI: https://doi.org/10.1002/pro.4036

![Alt text](https://onlinelibrary.wiley.com/cms/asset/641b2e4e-b7a8-429b-8b78-d9238385a0ab/pro4036-fig-0001-m.jpg)

**Figure 1**. Structure of hSA in complex with PFOA and Myr. Chemical structure (top) and composite omit maps depicting the (Fo−Fc) electron density (bottom) of PFOA (a) and Myr (b) contoured at 4σ; (c) Crystal structure of hSA-PFOA-Myr complex (white) obtained using a twofold molar excess of PFOA over Myr [PDB identification code: 7AAI]; (d) Superimposition of hSA-PFOA-Myr ternary complex (white) with aligned hSA-Myr binary complex (blue white) [PDB identification code: 7AAE]. The structure of hSA is organized in homologous domains (I, II and III), subdomains (A and B), fatty acids (FA) and Sudlow's binding sites. The α-helices of hSA are represented by cylinders. Bound PFOA and Myr are shown in a ball-and-stick representation with a semi-transparent van der Waals surface and colored by atom type (PFOA: carbon = dark salmon, oxygen = firebrick, fluorine = palecyan; Myr: carbon = smudge green, oxygen = firebrick). The electron density of PFOA and Myr is shown as grey mesh. (Note: I switched the "7AAE" with "7AAI" after checking out both structures from the PDB database.)


## Download the repository

I have created a GitHub repository that includes the required scripts. Download it first:

```bash
$ git clone https://github.com/XipingGong/pfas_docking.git
$ cd pfas_docking # go to this directory, and we will have a test later.
```


## A general script to run the AF3 docking

AF3 requires only a JSON file containing the basic information on the protein-ligand complex, such as the protein sequence and the ligand ID, so running AlphaFold 3 is straightforward.

+ **Step 1: Prepare the input file "af3.json" and request the parameters file**

A JSON example for hSA-PFOA is shown below; save it as "af3.json". For details, see the input documentation: https://github.com/google-deepmind/alphafold3/blob/main/docs/input.md

```json
{
  "name": "af3",
  "sequences": [
    {
      "protein": {
        "id": "A",
        "sequence": "HKSEVAHRFKDLGEENFKALVLIAFAQYLQQCPFEDHVKLVNEVTEFAKTCVADESAENCDKSLHTLFGDKLCTVATLRETYGEMADCCAKQEPERNECFLQHKDDNPNLPRLVRPEVDVMCTAFHDNEETFLKKYLYEIARRHPYFYAPELLFFAKRYKAAFTECCQAADKAACLLPKLDELRDEGKASSAKQRLKCASLQKFGERAFKAWAVARLSQRFPKAEFAEVSKLVTDLTKVHTECCHGDLLECADDRADLAKYICENQDSISSKLKECCEKPLLEKSHCIAEVENDEMPADLPSLAADFVESKDVCKNYAEAKDVFLGMFLYEYARRHPDYSVVLLLRLAKTYETTLEKCCAAADPHECYAKVFDEFKPLVEEPQNLIKQNCELFEQLGEYKFQNALLVRYTKKVPQVSTPTLVEVSRNLGKVGSKCCKHPEAKRMPCAEDYLSVVLNQLCVLHEKTPVSDRVTKCCTESLVNRRPCFSALEVDETYVPKEFNAETFTFHADICTLSEKERQIKKQTALVELVKHKPKATKEQLKAVMDDFAAFVEKCCKADDKETCFAEEGKKLVAASQAAL"
      }
    },
    {
      "ligand": {
        "id": "B",
        "ccdCodes": ["8PF"]
      }
    }
  ],
  "modelSeeds": [1],
  "bondedAtomPairs": [],
  "dialect": "alphafold3",
  "version": 2
}
```
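Because the goal is to screen other PFAS molecules, it can help to generate this file programmatically. The following is a minimal sketch, not part of the repository: the function name `build_af3_input` is hypothetical, the short sequence is a placeholder for illustration only, and the field layout simply mirrors the "af3.json" example above.

```python
import json


def build_af3_input(name, protein_sequence, ligand_ccd_code, seeds=(1,)):
    """Build an input dictionary with the same layout as the af3.json example."""
    return {
        "name": name,
        "sequences": [
            {"protein": {"id": "A", "sequence": protein_sequence}},
            {"ligand": {"id": "B", "ccdCodes": [ligand_ccd_code]}},
        ],
        "modelSeeds": list(seeds),
        "bondedAtomPairs": [],
        "dialect": "alphafold3",
        "version": 2,
    }


# Placeholder sequence (illustration only); use the full hSA sequence in practice.
demo = build_af3_input("af3", "HKSEVAHRFKDLGEENFK", "8PF")

# Print the JSON text; redirect it to "af3.json" or write it with open(...).
print(json.dumps(demo, indent=2))
```

Changing only the CCD code argument gives one input file per ligand while the protein entry stays the same.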

+ **Step 2: Check out the "af3.sh" script**


```bash
# We provide a general script, "af3.sh", to run AlphaFold 3.
# Modify the INPUT section in the "scripts/af3.sh" file for your own system, for example:
# >> scripts_dir='/home/xg69107/program/pfas_docking/scripts' # need to change
# >> af3_param_dir='/home/xg69107/program/alphafold3' # need to change
# >> obabel="/home/xg69107/program/anaconda/anaconda3/envs/gmxMMPBSA/bin/obabel"
# >> python="/home/xg69107/program/anaconda/anaconda3/bin/python"

```
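Before submitting a job, it may be worth confirming that the paths set in the INPUT section exist on your system. The sketch below is only an illustration: the helper `find_missing_paths` is hypothetical and is not part of the repository, and the paths are the example values listed above, which must be replaced with your own.

```python
import os


def find_missing_paths(settings):
    """Return the names of the settings whose path does not exist on disk."""
    return [name for name, path in settings.items() if not os.path.exists(path)]


# Example values copied from the INPUT section above; replace with your own paths.
settings = {
    "scripts_dir": "/home/xg69107/program/pfas_docking/scripts",
    "af3_param_dir": "/home/xg69107/program/alphafold3",
    "obabel": "/home/xg69107/program/anaconda/anaconda3/envs/gmxMMPBSA/bin/obabel",
    "python": "/home/xg69107/program/anaconda/anaconda3/bin/python",
}

missing = find_missing_paths(settings)
print("Missing paths:", ", ".join(missing) if missing else "none")
```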

+ **Step 3: Run the "af3.sh" script**

```bash
# Case 1: we do not have a native structure available
$ mkdir -p test/dock_dir/7AAI_8PF # create a test folder
$ cd test/dock_dir/7AAI_8PF # go to this test folder
$ sbatch ../../../scripts/af3.sh --input_json af3.json # submit an AF3 job (the "af3.json" file from Step 1 must be in this folder)

# Case 2: we have a native structure available
# Download the native structure from the RCSB PDB
$ bash ../../../scripts/get_native_pdb.sh --pdbid 7AAI --ligandid 8PF # it will generate four models because of four ligands
# We can take the "7AAI_8PF_1.pdb" as the native structure
$ cp 7AAI_8PF_1.pdb native_model.pdb
$ mkdir -p native # create a "native" folder to save the native structures
$ bash ../../../scripts/get_ref_for_af3vinammpbsa.sh --input_pdb native_model.pdb --work_dir native # it will generate the native structures
$ sbatch ../../../scripts/af3.sh --input_json af3.json --native_dir native # submit an AF3 job

```
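When screening several ligands, the folder names and submission commands of Case 1 can be generated rather than typed by hand. This is a minimal sketch under stated assumptions: `plan_af3_jobs` is a hypothetical helper, the folder pattern `<PDB ID>_<CCD code>` and the relative script path follow the example above, and nothing is submitted; the commands are only printed.

```python
from pathlib import PurePosixPath


def plan_af3_jobs(pdb_id, ccd_codes, root="test/dock_dir",
                  script="../../../scripts/af3.sh"):
    """Return (work_dir, command) pairs, one per ligand CCD code."""
    jobs = []
    for code in ccd_codes:
        # Same naming scheme as the example folder "7AAI_8PF".
        work_dir = PurePosixPath(root) / f"{pdb_id}_{code}"
        # Only the flag shown in Case 1 is used here.
        command = ["sbatch", script, "--input_json", "af3.json"]
        jobs.append((str(work_dir), command))
    return jobs


# Add further CCD codes to the list to plan one job per ligand.
for work_dir, command in plan_af3_jobs("7AAI", ["8PF"]):
    print(f"cd {work_dir} && {' '.join(command)}")
```

Each folder still needs its own "af3.json" before the printed command is run.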

+ **Step 4: Calculate the RMSD values of predicted structures**

```bash
# To judge whether the AF3-predicted structures are good, we compare them with the native structure using RMSD values
$ python ../../../scripts/check_rmsd.py --ref native/native_modelH.pdb "af3/[bs]*/aligned_model_convert.pdb"
# >> 📂 Loading reference PDB: native/native_modelH.pdb
# >>   - /home/xg69107/program/pfas_docking/test/dock_dir/7AAI_8PF/native/native_modelH.pdb
# >> 📂 Loading target PDB: af3/[bs]*/aligned_model_convert.pdb
# >>   - /home/xg69107/program/pfas_docking/test/dock_dir/7AAI_8PF/af3/best_pose/aligned_model_convert.pdb
# >>   - /home/xg69107/program/pfas_docking/test/dock_dir/7AAI_8PF/af3/seed-1_sample-0/aligned_model_convert.pdb
# >>   - /home/xg69107/program/pfas_docking/test/dock_dir/7AAI_8PF/af3/seed-1_sample-1/aligned_model_convert.pdb
# >>   - /home/xg69107/program/pfas_docking/test/dock_dir/7AAI_8PF/af3/seed-1_sample-2/aligned_model_convert.pdb
# >>   - /home/xg69107/program/pfas_docking/test/dock_dir/7AAI_8PF/af3/seed-1_sample-3/aligned_model_convert.pdb
# >>   - /home/xg69107/program/pfas_docking/test/dock_dir/7AAI_8PF/af3/seed-1_sample-4/aligned_model_convert.pdb
# >> 📊 Protein Backbone RMSD (Direct): Min = 0.766 nm ; [0.7931478  0.79011947 0.78460294 0.7897126  0.7931478  0.7661618 ] nm
# >> 📊 Protein Backbone RMSD (MDTraj): Min = 0.434 nm ; [0.4343595  0.44710496 0.43459356 0.45176113 0.4343595  0.434135  ] nm
# >> 📊 Ligand RMSD (Direct): Min = 0.846 nm ; [0.873869   0.87230045 0.86433727 0.8542202  0.873869   0.84607273] nm
# >> 📊 Ligand RMSD (MDTraj): Min = 0.142 nm ; [0.14207564 0.17133933 0.1692996  0.17180629 0.14207564 0.16421942] nm
# >> 📊 Protein Backbone Pocket RMSD (Direct): Min = 0.077 nm ; [0.07865718 0.07987285 0.07664632 0.08199524 0.07865718 0.0780214 ] nm
# >> 📊 Protein Backbone Pocket RMSD (MDTraj): Min = 0.077 nm ; [0.07865842 0.07987241 0.07664489 0.08199523 0.07865842 0.07802187] nm

# The protein backbone RMSD values (Direct and MDTraj) and the ligand RMSD values (Direct) are well above 0.2 nm, whereas the protein backbone pocket RMSD values are within 0.2 nm. The ligand RMSD values (MDTraj) are also below 0.2 nm (minimum 0.142 nm). In the following section, we can see that the predicted ligand is close to the native ligand, but with a different orientation.

```
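The 0.2 nm comparison above can also be applied automatically to the printed summary lines. The sketch below is an assumption-laden illustration, not part of `check_rmsd.py`: the helpers `parse_rmsd_line` and `within_cutoff` are hypothetical, and the regular expression only assumes the "Min = ... nm ; [...] nm" line format shown in the output above.

```python
import re

# Matches lines such as "Ligand RMSD (MDTraj): Min = 0.142 nm ; [0.14 0.17] nm".
LINE_PATTERN = re.compile(
    r"(?P<label>[A-Za-z ]+\((?:Direct|MDTraj)\)): "
    r"Min = [\d.]+ nm ; \[(?P<values>[^\]]*)\] nm"
)


def parse_rmsd_line(line):
    """Return (label, list of RMSD values in nm), or None if the line does not match."""
    match = LINE_PATTERN.search(line)
    if match is None:
        return None
    values = [float(v) for v in match.group("values").split()]
    return match.group("label").strip(), values


def within_cutoff(values, cutoff_nm=0.2):
    """Flag each RMSD value that is at or below the cutoff (0.2 nm by default)."""
    return [v <= cutoff_nm for v in values]


# Two lines copied from the output above.
mdtraj_line = ">> Ligand RMSD (MDTraj): Min = 0.142 nm ; [0.14207564 0.17133933 0.1692996  0.17180629 0.14207564 0.16421942] nm"
direct_line = ">> Ligand RMSD (Direct): Min = 0.846 nm ; [0.873869   0.87230045 0.86433727 0.8542202  0.873869   0.84607273] nm"

for line in (mdtraj_line, direct_line):
    label, values = parse_rmsd_line(line)
    print(label, "min =", min(values), "all within 0.2 nm:", all(within_cutoff(values)))
```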



# Analysis & Conclusion

<img src="af3_docking_pfoa_hsa.svg" alt="Illustration of PFOA-hSA" style="width:80%;">

**Figure 2**. Comparison of PFOA-hSA interaction structures obtained experimentally and through AlphaFold 3 docking.

The results reveal close agreement between the experimental and AlphaFold 3 structures, with the head group of PFOA showing strong similarity. Notably, no specific binding pocket was predefined in this docking example, indicating that AlphaFold 3 can accurately predict the binding pocket of PFOA in the hSA protein. However, differences are observed in the orientation of the PFOA tail.


# Appendix

## How to submit an AF3 job at Sapelo2@GACRC?

Please also see the GACRC documentation: https://wiki.gacrc.uga.edu/wiki/AlphaFold3-Sapelo2