# Using Applications

To run a protein design application, we first have to import everything we need. In this example, we want to test the LigandMPNN application.

In [28]:
import sys
sys.path.append("..")  # add the path to the protflow directory (e.g. /home/username/ProtFlow)
from protflow.poses import Poses
from protflow.tools.ligandmpnn import LigandMPNN
from protflow.jobstarters import SbatchArrayJobstarter, LocalJobStarter

First, we define our jobstarters. We want to test whether we can run LigandMPNN on GPU or CPU through the SLURM workload manager, and also run it locally. If SLURM is not installed on your machine, you can skip the parts that mention it.

In [29]:
slurm_gpu_jobstarter = SbatchArrayJobstarter(max_cores=10, gpus=1)
slurm_cpu_jobstarter = SbatchArrayJobstarter(max_cores=10, gpus=False)

In [30]:
local_jobstarter = LocalJobStarter(max_cores=1)
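
To make the `max_cores` argument more concrete, the sketch below shows what a local jobstarter does conceptually: it runs a list of commands and caps how many run at the same time. This is not ProtFlow's implementation; `run_commands_locally` is a hypothetical helper built only on the Python standard library.

```python
import subprocess
import sys
from concurrent.futures import ThreadPoolExecutor


def run_commands_locally(cmds, max_cores=1):
    """Run each command as a subprocess, at most `max_cores` at a time.

    Hypothetical helper for illustration only.
    Returns the exit codes in the same order as the input commands.
    """
    def run_one(cmd):
        return subprocess.run(cmd, capture_output=True, text=True).returncode

    with ThreadPoolExecutor(max_workers=max_cores) as pool:
        return list(pool.map(run_one, cmds))


# Three trivial commands, executed one after another because max_cores=1.
demo_cmds = [[sys.executable, "-c", f"print({i})"] for i in range(3)]
exit_codes = run_commands_locally(demo_cmds, max_cores=1)
print(exit_codes)
```

With `max_cores=1` the commands run strictly one after another, which matches the conservative setting chosen for `local_jobstarter` above.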

Next, we load our poses and set local_jobstarter as their default jobstarter.

In [31]:
my_poses = Poses(poses='data/input_pdbs/', glob_suffix='*pdb', work_dir='applications_example', storage_format='csv', jobstarter=local_jobstarter)

To run LigandMPNN, we have to create a runner. Make sure the path to the LigandMPNN script and the path to its Python interpreter are set in protflow/config.py! The lines there should look like this:

#ligandmpnn.py

LIGANDMPNN_SCRIPT_PATH = "/home/user/LigandMPNN/run.py"

LIGANDMPNN_PYTHON_PATH = "/home/user/anaconda3/envs/ligandmpnn_env/bin/python3"

You can also set them when creating the runner, but setting them in the config is recommended if you want to run it again.

If you run this notebook on a cluster, make sure it is also opened from there (e.g. with VS Code installed on the cluster)! Otherwise, the /home/ directories won't match, because the files will be looked up on your local machine and not on the cluster.
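
One quick way to catch both problems (paths missing from the config, or a notebook kernel running on the wrong machine) is to check whether the configured paths exist from the kernel's point of view. The sketch below assumes nothing about ProtFlow itself: `check_paths` is a hypothetical helper, and the two paths are the placeholder values from the config lines above.

```python
import os


def check_paths(paths):
    """Return the names of all entries whose path does not exist on this machine."""
    return [name for name, path in paths.items() if not os.path.exists(path)]


# Placeholder values copied from the config example; replace them with your own.
configured = {
    "LIGANDMPNN_SCRIPT_PATH": "/home/user/LigandMPNN/run.py",
    "LIGANDMPNN_PYTHON_PATH": "/home/user/anaconda3/envs/ligandmpnn_env/bin/python3",
}

missing = check_paths(configured)
if missing:
    print("Not found on this machine:", ", ".join(missing))
```

If a path is reported as missing even though it exists on the cluster, the notebook kernel is most likely running on your local machine.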

In [32]:
ligandmpnn = LigandMPNN()

To run LigandMPNN on our poses, we use the .run() function. All tools and metrics should have this function. It is mandatory to provide a unique prefix for each run. Each generated score is saved to the poses dataframe in the format prefix_scorename. The output files can be found in a folder named after the prefix, inside the work_dir set for the input poses. The .run() function always returns poses.
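
The naming convention described above can be sketched in a few lines. The helpers below (`score_column`, `run_output_dir`, `prefix_in_use`) are hypothetical and only illustrate the convention; they are not part of ProtFlow.

```python
import os


def score_column(prefix, score_name):
    """Column name under which a score is stored: prefix_scorename."""
    return f"{prefix}_{score_name}"


def run_output_dir(work_dir, prefix):
    """Folder that holds the output files of a run: <work_dir>/<prefix>."""
    return os.path.join(work_dir, prefix)


def prefix_in_use(columns, prefix):
    """Rough check: does any existing column already start with this prefix?"""
    return any(col.startswith(prefix + "_") for col in columns)


existing_columns = ["input_poses", "poses", "poses_description", "ligmpnn_local_seed"]

print(score_column("ligmpnn_local", "overall_confidence"))
print(run_output_dir("applications_example", "ligmpnn_local"))
print(prefix_in_use(existing_columns, "ligmpnn_local"))  # True
print(prefix_in_use(existing_columns, "ligmpnn_cpu"))    # False
```

A check like `prefix_in_use` on the columns of the poses dataframe is one way to confirm that a prefix is still free before starting a new run.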

In [33]:
my_poses = ligandmpnn.run(poses=my_poses, prefix='ligmpnn_local', nseq=2, model_type='protein_mpnn', return_seq_threaded_pdbs_as_pose=True)
display(my_poses.df)

Unnamed: 0,input_poses,poses,poses_description,ligmpnn_local_mpnn_origin,ligmpnn_local_seed,ligmpnn_local_description,ligmpnn_local_sequence,ligmpnn_local_T,ligmpnn_local_id,ligmpnn_local_seq_rec,ligmpnn_local_overall_confidence,ligmpnn_local_ligand_confidence,ligmpnn_local_location
0,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0001,structure1,91361.0,structure1_0001,MRAEFEAALAKLRADVAARAAEVDALLAPYIAEVRANPAILATFRK...,0.1,1.0,0.48,0.3437,0.3437,/home/tripp/ProtFlow/examples/applications_exa...
1,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0002,structure1,91361.0,structure1_0002,AEEEFKAALAKLKADIAAKKAEIDALLQPYIDLVKANPAILATFKE...,0.1,2.0,0.495,0.3572,0.3572,/home/tripp/ProtFlow/examples/applications_exa...
2,data/input_pdbs/structure3.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure3_0001,structure3,91361.0,structure3_0001,AREEFEAALAALKADLAANKEEVLALLAPYIEQVRANPSIYETYLA...,0.1,1.0,0.465,0.3426,0.3426,/home/tripp/ProtFlow/examples/applications_exa...
3,data/input_pdbs/structure3.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure3_0002,structure3,91361.0,structure3_0002,AREEFERALAKLREDVEERKEEFDKLLAPYIELVKANPAILATFKE...,0.1,2.0,0.445,0.3217,0.3217,/home/tripp/ProtFlow/examples/applications_exa...
4,data/input_pdbs/structure2.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure2_0001,structure2,91361.0,structure2_0001,MRAEFEAALAALKADLEKNWEKWKALLAPYIEEVKANPEIFATFLR...,0.1,1.0,0.455,0.3418,0.3418,/home/tripp/ProtFlow/examples/applications_exa...
5,data/input_pdbs/structure2.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure2_0002,structure2,91361.0,structure2_0002,MEEEFKAALARLRADLEARRAEVDALLQPYVDLVRANPSILATFLA...,0.1,2.0,0.415,0.3432,0.3432,/home/tripp/ProtFlow/examples/applications_exa...


Notice how the poses dataframe has changed! It now contains all the poses generated by LigandMPNN and the corresponding scores. Since we did not provide a jobstarter when we set up the runner, LigandMPNN ran on the local machine: it fell back to the default jobstarter we set when creating our poses. We can run LigandMPNN with another jobstarter, either by giving the LigandMPNN runner a default jobstarter or by calling .run() with the jobstarter option. In the next run, we use the output poses from the previous run as the new input poses.
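
The fallback behaviour can be pictured as a simple "first one that is set wins" lookup. The sketch below is an illustration, not ProtFlow's code: `pick_jobstarter` is a hypothetical function, and the order in which the `.run()` option and the runner default are checked is an assumption. The text above only confirms that the poses default is used when nothing else is given.

```python
def pick_jobstarter(run_option=None, runner_default=None, poses_default=None):
    """Return the first jobstarter that is set.

    Assumed order: .run() option, then runner default, then poses default.
    """
    for candidate in (run_option, runner_default, poses_default):
        if candidate is not None:
            return candidate
    raise ValueError("No jobstarter was provided anywhere.")


# Plain strings stand in for jobstarter objects.
print(pick_jobstarter(poses_default="local"))                          # local
print(pick_jobstarter(runner_default="slurm_gpu", poses_default="local"))  # slurm_gpu
print(pick_jobstarter(run_option="slurm_cpu", poses_default="local"))      # slurm_cpu
```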

In [34]:
my_poses = ligandmpnn.run(poses=my_poses, prefix='ligmpnn_cpu', nseq=2, model_type='protein_mpnn', jobstarter=slurm_cpu_jobstarter, return_seq_threaded_pdbs_as_pose=True)
display(my_poses.df)

sbatch: defined options
sbatch: -------------------- --------------------
sbatch: array               : 1-6%10
sbatch: error               : /home/tripp/ProtFlow/examples/applications_example/ligmpnn_cpu//ligandmpnn_6176863_slurm.err
sbatch: job-name            : ligandmpnn_6176863
sbatch: output              : /home/tripp/ProtFlow/examples/applications_example/ligmpnn_cpu//ligandmpnn_6176863_slurm.out
sbatch: verbose             : 3
sbatch: wrap                : eval `sed -n ${SLURM_ARRAY_TASK_ID}p /home/tripp/ProtFlow/examples/applications_example/ligmpnn_cpu//ligandmpnn_6176863_cmds`
sbatch: -------------------- --------------------
sbatch: end of defined options
sbatch: debug:  propagating RLIMIT_CPU=18446744073709551615
sbatch: debug:  propagating RLIMIT_FSIZE=18446744073709551615
sbatch: debug:  propagating RLIMIT_DATA=18446744073709551615
sbatch: debug:  propagating RLIMIT_STACK=67108864
sbatch: debug:  propagating RLIMIT_CORE=0
sbatch: debug:  propagating RLIMIT_RSS=18446744073

Unnamed: 0,input_poses,poses,poses_description,ligmpnn_local_mpnn_origin,ligmpnn_local_seed,ligmpnn_local_description,ligmpnn_local_sequence,ligmpnn_local_T,ligmpnn_local_id,ligmpnn_local_seq_rec,...,ligmpnn_cpu_mpnn_origin,ligmpnn_cpu_seed,ligmpnn_cpu_description,ligmpnn_cpu_sequence,ligmpnn_cpu_T,ligmpnn_cpu_id,ligmpnn_cpu_seq_rec,ligmpnn_cpu_overall_confidence,ligmpnn_cpu_ligand_confidence,ligmpnn_cpu_location
0,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0001_0001,structure1,91361.0,structure1_0001,MRAEFEAALAKLRADVAARAAEVDALLAPYIAEVRANPAILATFRK...,0.1,1.0,0.48,...,structure1_0001,15991.0,structure1_0001_0001,MEEEFEAALAAFKADLAANKEEYLKLLQPYIDKVKNNPSIFETYQK...,0.1,1.0,0.575,0.3382,0.3382,/home/tripp/ProtFlow/examples/applications_exa...
1,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0001_0002,structure1,91361.0,structure1_0001,MRAEFEAALAKLRADVAARAAEVDALLAPYIAEVRANPAILATFRK...,0.1,1.0,0.48,...,structure1_0001,15991.0,structure1_0001_0002,MRERFEAALALLRADIEAHKAEIDALLAPYIALVKANPEILATFKK...,0.1,2.0,0.745,0.3357,0.3357,/home/tripp/ProtFlow/examples/applications_exa...
2,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0002_0001,structure1,91361.0,structure1_0002,AEEEFKAALAKLKADIAAKKAEIDALLQPYIDLVKANPAILATFKE...,0.1,2.0,0.495,...,structure1_0002,94066.0,structure1_0002_0001,ARAEFDAALAALAADLAANAAAVAALLAPYIAEVKANPAILATHKA...,0.1,1.0,0.61,0.3502,0.3502,/home/tripp/ProtFlow/examples/applications_exa...
3,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0002_0002,structure1,91361.0,structure1_0002,AEEEFKAALAKLKADIAAKKAEIDALLQPYIDLVKANPAILATFKE...,0.1,2.0,0.495,...,structure1_0002,94066.0,structure1_0002_0002,ARAEFEAALAALEADLAANRAAWDALLAPYIAEVKANPKILETFKK...,0.1,2.0,0.635,0.3489,0.3489,/home/tripp/ProtFlow/examples/applications_exa...
4,data/input_pdbs/structure3.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure3_0001_0001,structure3,91361.0,structure3_0001,AREEFEAALAALKADLAANKEEVLALLAPYIEQVRANPSIYETYLA...,0.1,1.0,0.465,...,structure3_0001,12732.0,structure3_0001_0001,MEAAFAAALAALAADLEARAAEVDALLAPYVALVRANPALLARFLE...,0.1,1.0,0.605,0.3412,0.3412,/home/tripp/ProtFlow/examples/applications_exa...
5,data/input_pdbs/structure3.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure3_0001_0002,structure3,91361.0,structure3_0001,AREEFEAALAALKADLAANKEEVLALLAPYIEQVRANPSIYETYLA...,0.1,1.0,0.465,...,structure3_0001,12732.0,structure3_0001_0002,MEAEFEAALARLRADREARREEWDALLAPYIEEVKANPEILKTFKE...,0.1,2.0,0.66,0.3281,0.3281,/home/tripp/ProtFlow/examples/applications_exa...
6,data/input_pdbs/structure3.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure3_0002_0001,structure3,91361.0,structure3_0002,AREEFERALAKLREDVEERKEEFDKLLAPYIELVKANPAILATFKE...,0.1,2.0,0.445,...,structure3_0002,3264.0,structure3_0002_0001,ARAAFEAALAALRADVEARREEIDALLAPYIAEVKANPAILATFKE...,0.1,1.0,0.74,0.3336,0.3336,/home/tripp/ProtFlow/examples/applications_exa...
7,data/input_pdbs/structure3.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure3_0002_0002,structure3,91361.0,structure3_0002,AREEFERALAKLREDVEERKEEFDKLLAPYIELVKANPAILATFKE...,0.1,2.0,0.445,...,structure3_0002,3264.0,structure3_0002_0002,MEARFQEALAAHKADLAANKKEIDALLQPYIEEVKANPSILETFLK...,0.1,2.0,0.67,0.3241,0.3241,/home/tripp/ProtFlow/examples/applications_exa...
8,data/input_pdbs/structure2.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure2_0001_0001,structure2,91361.0,structure2_0001,MRAEFEAALAALKADLEKNWEKWKALLAPYIEEVKANPEIFATFLR...,0.1,1.0,0.455,...,structure2_0001,75029.0,structure2_0001_0001,MEEEFEKALKKLKEDVKKNKEEFEELLKPYIEEVKNNPEIFKKFKE...,0.1,1.0,0.64,0.3336,0.3336,/home/tripp/ProtFlow/examples/applications_exa...
9,data/input_pdbs/structure2.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure2_0001_0002,structure2,91361.0,structure2_0001,MRAEFEAALAALKADLEKNWEKWKALLAPYIEEVKANPEIFATFLR...,0.1,1.0,0.455,...,structure2_0001,75029.0,structure2_0001_0002,MEERFKEALAAHAADVAANAAAVDALLAPYIALVKANPEILERFLE...,0.1,2.0,0.63,0.3406,0.3406,/home/tripp/ProtFlow/examples/applications_exa...
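
The poses_description column in the tables above shows how pose names grow with each run: every input pose receives one new four-digit, 1-based suffix per generated sequence. The sketch below reproduces that pattern with a hypothetical helper, `expand_descriptions`. It is inferred from the displayed output and is not taken from ProtFlow's code.

```python
def expand_descriptions(descriptions, nseq):
    """Append a four-digit, 1-based index to each description, once per sequence."""
    return [f"{d}_{i:04d}" for d in descriptions for i in range(1, nseq + 1)]


inputs = ["structure1", "structure3", "structure2"]
after_local = expand_descriptions(inputs, nseq=2)     # names after the first run
after_cpu = expand_descriptions(after_local, nseq=2)  # names after the second run

print(after_local[:2])  # ['structure1_0001', 'structure1_0002']
print(after_cpu[:4])
```

Because each run was called with nseq=2 on the output of the previous run, the number of poses should double with every run in this example.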


In [35]:
gpu_ligandmpnn = LigandMPNN(jobstarter=slurm_gpu_jobstarter)
my_poses = gpu_ligandmpnn.run(poses=my_poses, prefix='ligmpnn_gpu', nseq=2, model_type='protein_mpnn', jobstarter=slurm_gpu_jobstarter, return_seq_threaded_pdbs_as_pose=True)
display(my_poses.df)

sbatch: defined options
sbatch: -------------------- --------------------
sbatch: array               : 1-10%10
sbatch: cpus-per-task       : 2
sbatch: error               : /home/tripp/ProtFlow/examples/applications_example/ligmpnn_gpu//ligandmpnn_3316457_slurm.err
sbatch: gpus-per-node       : 1
sbatch: job-name            : ligandmpnn_3316457
sbatch: output              : /home/tripp/ProtFlow/examples/applications_example/ligmpnn_gpu//ligandmpnn_3316457_slurm.out
sbatch: verbose             : 3
sbatch: wrap                : eval `sed -n ${SLURM_ARRAY_TASK_ID}p /home/tripp/ProtFlow/examples/applications_example/ligmpnn_gpu//ligandmpnn_3316457_cmds`
sbatch: -------------------- --------------------
sbatch: end of defined options
sbatch: debug:  propagating RLIMIT_CPU=18446744073709551615
sbatch: debug:  propagating RLIMIT_FSIZE=18446744073709551615
sbatch: debug:  propagating RLIMIT_DATA=18446744073709551615
sbatch: debug:  propagating RLIMIT_STACK=67108864
sbatch: debug:  propagating

Unnamed: 0,input_poses,poses,poses_description,ligmpnn_local_mpnn_origin,ligmpnn_local_seed,ligmpnn_local_description,ligmpnn_local_sequence,ligmpnn_local_T,ligmpnn_local_id,ligmpnn_local_seq_rec,...,ligmpnn_gpu_mpnn_origin,ligmpnn_gpu_seed,ligmpnn_gpu_description,ligmpnn_gpu_sequence,ligmpnn_gpu_T,ligmpnn_gpu_id,ligmpnn_gpu_seq_rec,ligmpnn_gpu_overall_confidence,ligmpnn_gpu_ligand_confidence,ligmpnn_gpu_location
0,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0001_0001_0001,structure1,91361.0,structure1_0001,MRAEFEAALAKLRADVAARAAEVDALLAPYIAEVRANPAILATFRK...,0.1,1.0,0.48,...,structure1_0001_0001,42689.0,structure1_0001_0001_0001,MRERFEEALKKLKEDIEKNKEKIDKILAPYIEKVKNNPEILKKFKE...,0.1,1.0,0.585,0.3347,0.3347,/home/tripp/ProtFlow/examples/applications_exa...
1,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0001_0001_0002,structure1,91361.0,structure1_0001,MRAEFEAALAKLRADVAARAAEVDALLAPYIAEVRANPAILATFRK...,0.1,1.0,0.48,...,structure1_0001_0001,42689.0,structure1_0001_0001_0002,MEAAFAAAVAAFKADLAANKAKVDALLQPYIDYVKNNPEILETFKK...,0.1,2.0,0.68,0.336,0.336,/home/tripp/ProtFlow/examples/applications_exa...
2,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0001_0002_0001,structure1,91361.0,structure1_0001,MRAEFEAALAKLRADVAARAAEVDALLAPYIAEVRANPAILATFRK...,0.1,1.0,0.48,...,structure1_0001_0002,42689.0,structure1_0001_0002_0001,AREAFAAALAALRADLAAHAAEVAALLAPYVEQVRANPEILATYLE...,0.1,1.0,0.63,0.3563,0.3563,/home/tripp/ProtFlow/examples/applications_exa...
3,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0001_0002_0002,structure1,91361.0,structure1_0001,MRAEFEAALAKLRADVAARAAEVDALLAPYIAEVRANPAILATFRK...,0.1,1.0,0.48,...,structure1_0001_0002,42689.0,structure1_0001_0002_0002,AEAEFAAALARLRADLAAHKEEVDALLAPYIEEVKNNPKIFETFKK...,0.1,2.0,0.675,0.3422,0.3422,/home/tripp/ProtFlow/examples/applications_exa...
4,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0002_0001_0001,structure1,91361.0,structure1_0002,AEEEFKAALAKLKADIAAKKAEIDALLQPYIDLVKANPAILATFKE...,0.1,2.0,0.495,...,structure1_0002_0001,64024.0,structure1_0002_0001_0001,MRAAFEAALAALKADIAANKAAVDALLQPYIDKVKANPEILATYKK...,0.1,1.0,0.71,0.3331,0.3331,/home/tripp/ProtFlow/examples/applications_exa...
5,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0002_0001_0002,structure1,91361.0,structure1_0002,AEEEFKAALAKLKADIAAKKAEIDALLQPYIDLVKANPAILATFKE...,0.1,2.0,0.495,...,structure1_0002_0001,64024.0,structure1_0002_0001_0002,ARAAFEEALKKLKADLEKHKEEVLALLAPYIAEVKANPAIFATFKE...,0.1,2.0,0.695,0.3401,0.3401,/home/tripp/ProtFlow/examples/applications_exa...
6,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0002_0002_0001,structure1,91361.0,structure1_0002,AEEEFKAALAKLKADIAAKKAEIDALLQPYIDLVKANPAILATFKE...,0.1,2.0,0.495,...,structure1_0002_0002,64024.0,structure1_0002_0002_0001,MEEEFKAALAKLKEDIEKNKEKVEKLLKPYIEKVKNNPEILETYKK...,0.1,1.0,0.615,0.3544,0.3544,/home/tripp/ProtFlow/examples/applications_exa...
7,data/input_pdbs/structure1.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure1_0002_0002_0002,structure1,91361.0,structure1_0002,AEEEFKAALAKLKADIAAKKAEIDALLQPYIDLVKANPAILATFKE...,0.1,2.0,0.495,...,structure1_0002_0002,64024.0,structure1_0002_0002_0002,MRERFEAALAALEADLAAHREAVEALLAPEIAAVRANPAILATFLA...,0.1,2.0,0.655,0.3403,0.3403,/home/tripp/ProtFlow/examples/applications_exa...
8,data/input_pdbs/structure3.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure3_0001_0001_0001,structure3,91361.0,structure3_0001,AREEFEAALAALKADLAANKEEVLALLAPYIEQVRANPSIYETYLA...,0.1,1.0,0.465,...,structure3_0001_0001,49612.0,structure3_0001_0001_0001,AAARFAAALAALAADRAAHREELDALLQPYIDLVKANPEILATFKR...,0.1,1.0,0.665,0.359,0.359,/home/tripp/ProtFlow/examples/applications_exa...
9,data/input_pdbs/structure3.pdb,/home/tripp/ProtFlow/examples/applications_exa...,structure3_0001_0001_0002,structure3,91361.0,structure3_0001,AREEFEAALAALKADLAANKEEVLALLAPYIEQVRANPSIYETYLA...,0.1,1.0,0.465,...,structure3_0001_0001,49612.0,structure3_0001_0001_0002,ARAAFEAALAALKADIEKNRELVEKLLKPYIEKVKNNPSIYETFLK...,0.1,2.0,0.585,0.3219,0.3219,/home/tripp/ProtFlow/examples/applications_exa...
