# CONNECTING CEDAR with APE-Gen2.0

In this workflow, we demonstrate how to use APE-Gen2.0 with the CEDAR database to model T-cell epitopes.

First, make the imports needed to run the workflow:

In [9]:
import pandas as pd
from subprocess import call
import os
import nglview



Download the T-cell epitope assay database from CEDAR:

In [None]:
call(['wget', 'https://cedar.iedb.org/downloader.php?file_name=doc/tcell_full_v3.zip', '-O', 'tcell_assay.zip'])

Extract it here:

In [3]:
call(['gunzip', '-S', '.zip', 'tcell_assay.zip'])

0
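
The `gunzip -S .zip` call above relies on gzip being able to unpack a single-member zip archive. Where `gunzip` is unavailable, the same step can be sketched with the standard library. `extract_first_member` is a hypothetical helper, and the sketch assumes the archive holds exactly one file, which it writes under the name the next cell expects.

```python
import shutil
import tempfile
import zipfile
from pathlib import Path


def extract_first_member(zip_path, out_path):
    """Write the first file inside `zip_path` to `out_path`; return its archive name."""
    with zipfile.ZipFile(zip_path) as archive:
        member = archive.namelist()[0]
        with archive.open(member) as src, open(out_path, "wb") as dst:
            shutil.copyfileobj(src, dst)
    return member


# Self-check on a throwaway archive (a stand-in for tcell_assay.zip).
with tempfile.TemporaryDirectory() as tmp:
    demo_zip = Path(tmp) / "demo.zip"
    with zipfile.ZipFile(demo_zip, "w") as archive:
        archive.writestr("table.csv", "a,b\n1,2\n")
    demo_out = Path(tmp) / "tcell_assay"
    member_name = extract_first_member(demo_zip, demo_out)
    extracted_text = demo_out.read_text()
```

For the real download, the call would be `extract_first_member("tcell_assay.zip", "tcell_assay")`.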

Now load it, and keep only the fields of interest:

In [4]:
cedar_data = pd.read_csv("tcell_assay", skiprows=1, low_memory=False)
cedar_data = cedar_data[['Name.2', 'Object Type', 'Species', 'Molecule Parent', 'Name', 'Name.10', 'Class', 'Number of Subjects Tested', 'Response Frequency (%)']]
cedar_data.columns = ['Organism', 'Epitope_type', 'Species', 'Molecule Parent', 'Epitope', 'MHC', 'MHC_Class', 'Number of Subjects Tested', 'Response Frequency (%)']
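
Names such as `Name.2` and `Name.10` are not in the file itself: with `skiprows=1` the group header row is dropped, and pandas appends a numeric suffix to each repeated field name. The toy table below (an invented header, far smaller than the real export) sketches that behaviour. Because the suffix depends on column order, it is worth printing `cedar_data.columns` after loading a new CEDAR release to confirm the selection still points at the intended fields.

```python
import io

import pandas as pd

# Toy two-row header: a group row (skipped) and a field row with repeated names.
raw = io.StringIO(
    "Epitope,Epitope,Host,MHC\n"
    "Name,Object Type,Name,Name\n"
    "WLSLLVPFV,Linear peptide,Homo sapiens (human),HLA-A*02:01\n"
)
toy = pd.read_csv(raw, skiprows=1)

# Repeated names are disambiguated in order of appearance.
print(list(toy.columns))  # ['Name', 'Object Type', 'Name.1', 'Name.2']
```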

### Main filtering process:

In this step, you choose the parameters used to filter the database. As an example, we keep epitopes (a) tested in humans, (b) of the linear type, (c) from Hepatitis B virus, (d) tested in more than 10 subjects, and (e) with a 100% response frequency. These criteria target a potent, potentially immunogenic candidate.

As APE-Gen2.0 supports only class-I peptides, we filter out class-II MHCs.

In [2]:
cedar_data = cedar_data[cedar_data['Organism'] == 'Homo sapiens (human)']
cedar_data = cedar_data[cedar_data['Epitope_type'] == 'Linear peptide']
cedar_data = cedar_data[cedar_data['Species'] == 'Hepatitis B virus']
cedar_data = cedar_data[cedar_data['Number of Subjects Tested'] > 10]
cedar_data = cedar_data[cedar_data['Response Frequency (%)'] == 100]
cedar_data = cedar_data[cedar_data['MHC_Class'] == 'I']

cedar_data = cedar_data.drop_duplicates(subset=['Epitope', 'MHC'])

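
The filtering cell only works once `cedar_data` has been loaded, so the cells must be run in order. As a sketch, the same criteria can also be wrapped in a function that builds one boolean mask, which makes the thresholds easy to change. `filter_candidates` and the toy rows are hypothetical; the numeric columns are coerced with `pd.to_numeric` on the assumption that they may arrive as text.

```python
import pandas as pd


def filter_candidates(df, organism, species, min_subjects=10, response=100.0):
    """Apply the tutorial's filters in one mask and drop duplicate epitope/MHC pairs."""
    subjects = pd.to_numeric(df["Number of Subjects Tested"], errors="coerce")
    frequency = pd.to_numeric(df["Response Frequency (%)"], errors="coerce")
    mask = (
        (df["Organism"] == organism)
        & (df["Epitope_type"] == "Linear peptide")
        & (df["Species"] == species)
        & (subjects > min_subjects)
        & (frequency == response)
        & (df["MHC_Class"] == "I")
    )
    return df[mask].drop_duplicates(subset=["Epitope", "MHC"])


# Invented rows: a duplicate pair, a non-human host, and a class-II allele.
toy = pd.DataFrame({
    "Organism": ["Homo sapiens (human)", "Homo sapiens (human)",
                 "Mus musculus (mouse)", "Homo sapiens (human)"],
    "Epitope_type": ["Linear peptide"] * 4,
    "Species": ["Hepatitis B virus"] * 4,
    "Epitope": ["AAAAAAAAA", "AAAAAAAAA", "CCCCCCCCC", "DDDDDDDDD"],
    "MHC": ["HLA-A*02:01", "HLA-A*02:01", "H2-Kb", "HLA-DRB1*01:01"],
    "MHC_Class": ["I", "I", "I", "II"],
    "Number of Subjects Tested": ["12", "15", "20", "30"],
    "Response Frequency (%)": ["100", "100", "100", "100"],
})

selected = filter_candidates(toy, "Homo sapiens (human)", "Hepatitis B virus")
print(selected[["Epitope", "MHC"]])
```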

Printing the results, we get the following set of candidates:

In [None]:
cedar_data

We randomly choose one of them, but we recommend that all c

In [7]:
cwd = os.getcwd()

# Arguments for APE-Gen2.0
peptide = "WLSLLVPFV"
MHC = "HLA-A*02:01"
no_of_conformations = 20

# Absolute output directory, so it stays valid after the os.chdir call below
store_dir = os.path.join(cwd, "CEDAR_dict")
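
The peptide and allele above are typed in by hand. A minimal sketch of drawing the pair from the filtered table instead is shown below; the `candidates` frame is a stand-in for `cedar_data`, and only its first row comes from this notebook.

```python
import pandas as pd

# Stand-in for the filtered cedar_data table; the last two rows are placeholders.
candidates = pd.DataFrame({
    "Epitope": ["WLSLLVPFV", "AAAAAAAAA", "GGGGGGGGG"],
    "MHC": ["HLA-A*02:01", "HLA-A*02:01", "HLA-B*07:02"],
})

# A fixed random_state makes the draw repeatable across notebook runs.
pick = candidates.sample(n=1, random_state=0).iloc[0]
peptide, MHC = pick["Epitope"], pick["MHC"]
print(peptide, MHC)
```

Looping over every row of the table with the same two columns would model all candidates rather than a single one.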

In [None]:
os.chdir("../../data")
call(['python', 'New_APE-Gen.py', peptide, MHC, '--dir', store_dir, '--verbose', '--num_cores', '1', '--num_loops_for_optimization', str(no_of_conformations)])
os.chdir(cwd)
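
If the APE-Gen2.0 call fails, `call` only returns a non-zero code and the notebook stays in `../../data` until `os.chdir(cwd)` runs. One possible alternative, sketched here with a hypothetical `run_in_dir` helper, is to pass the working directory to `subprocess.run` and let `check=True` raise on failure, so the notebook's own directory never changes. The command in the demo is a harmless stand-in, not the APE-Gen2.0 invocation.

```python
import subprocess
import sys


def run_in_dir(cmd, workdir):
    """Run `cmd` with `workdir` as its working directory; raise if it exits non-zero."""
    return subprocess.run(cmd, cwd=workdir, check=True, capture_output=True, text=True)


# Stand-in command so the sketch runs anywhere; replace it with the argument
# list passed to `call` above and use "../../data" as the working directory.
result = run_in_dir([sys.executable, "-c", "print('done')"], ".")
print(result.returncode, result.stdout.strip())
```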

In [19]:
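# Read "Peptide index" as a string so it can be concatenated into the PDB file name
dtype = {"Peptide index": str, "Debug": str, "Affinity": float}
Apegen_res = pd.read_csv("./CEDAR_dict/successful_conformations_statistics.csv", dtype=dtype)
# Sort ascending by affinity and take the index of the first conformation
Apegen_res = Apegen_res.sort_values(by="Affinity")
Peptide_index = Apegen_res['Peptide index'].iloc[0]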
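
The selection step can be checked on a small in-memory table. The sketch below uses invented affinity values and assumes, as the ascending sort above implies, that the lowest `Affinity` score marks the preferred conformation. `idxmin` picks that row directly without sorting the whole table.

```python
import io

import pandas as pd

# Invented statistics with the same three columns the cell above reads.
toy_csv = io.StringIO(
    "Peptide index,Debug,Affinity\n"
    "0,ok,-7.1\n"
    "1,ok,-9.4\n"
    "2,ok,-8.0\n"
)
stats = pd.read_csv(
    toy_csv, dtype={"Peptide index": str, "Debug": str, "Affinity": float}
)

# Row with the lowest affinity score; the index stays a string for path building.
best = stats.loc[stats["Affinity"].idxmin()]
best_index = best["Peptide index"]
print("pMHC_" + best_index + ".pdb")  # pMHC_1.pdb
```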

In [21]:
view = nglview.show_file("./CEDAR_dict/results/5_final_conformations/pMHC_" + Peptide_index + ".pdb")
view

NGLWidget()