# Example tutorial Notebook

This short introduction shows how to use the Neg_Sel_Pred pipeline efficiently.
It walks through example data files containing FASTA sequences of an MHC binding region, a peptide antigen, and the binding region of a T-cell receptor. The purpose of this tutorial is to familiarize you with the expected input and output of the code.

The sample FASTA files are MHCTutorial.fasta, PeptideTutorial.fasta, and BindingTutorial.fasta. Enter their locations at the prompts that appear when the code cell runs; the code also prompts for the location of an amino acid (AA) table file. The data is based on real sequences acquired from the proteome exchange NIH library.
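Before running the pipeline, it can help to check what a FASTA file contains. The following is a minimal sketch of a FASTA reader in plain Python. The function name `parse_fasta` and the example records are illustrative assumptions, not part of Neg_Sel_Pred, and the sequences shown are placeholders rather than the contents of the tutorial files. The notebook itself passes the raw file text to `nsp.convertAA`, so how the pipeline treats header lines is not shown here.

```python
def parse_fasta(text):
    """Parse FASTA-formatted text into a {header: sequence} dict.

    Hypothetical helper for inspecting input files; not part of nsp.
    """
    records = {}
    header = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue  # skip blank lines
        if line.startswith(">"):
            # A '>' line starts a new record; the rest of the line is its name
            header = line[1:].strip()
            records[header] = []
        elif header is not None:
            # Sequence lines may wrap, so collect them and join later
            records[header].append(line)
    return {name: "".join(parts) for name, parts in records.items()}


# Placeholder records, only to show the expected layout of a FASTA file
example = """>example_peptide
SIINFEKL
>example_mhc_fragment
GSHSMRYF
FTSVSRPG
"""

records = parse_fasta(example)
for name, seq in records.items():
    print(name, len(seq), seq)
```

To inspect one of the tutorial files, the same function could be applied to the text returned by `open("PeptideTutorial.fasta").read()`, assuming the file sits in the working directory.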



**Purpose** 

The goal of this project is to create software that models negative selection by comparing two binding strengths: that between MHC molecules and the antigen they present, and that between the mTEC molecule and the antigen being presented. Should time permit, the project scope may expand beyond this single set of interactions. This way, multiple variable regions can be quickly tested against various potential peptides to determine which undergo negative selection and which avoid it altogether.


**The data:**
- MHC I and II: Major Histocompatibility Complex molecules are found on antigen-presenting cells. They help regulate the body's immune responses and help ensure that immune cells do not initiate autoimmune responses against the body's own tissues.

- Variable regions of T-cell receptor molecules: bind to antigens and play a role in the adaptive immune response.

- Peptide antigen: binds to the MHC and to the variable region of the T-cell receptor.

**Steps**
- Enter the location of each corresponding FASTA file when prompted
- See output printed out into results file
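A mistyped path at a prompt makes `open()` fail only after all four prompts have been answered. One possible way to guard against that is sketched below. The helper `prompt_for_file` and its `ask` parameter are hypothetical and not part of the pipeline; the prompt wording simply mirrors the notebook's own messages.

```python
from pathlib import Path


def prompt_for_file(label, ask=input):
    """Keep asking for a path until it points at an existing file.

    Hypothetical helper: `ask` defaults to the built-in input(), and can be
    swapped for any callable that takes a prompt and returns a string.
    """
    while True:
        answer = ask(f"Input the location of the {label} file you want to read in\n")
        candidate = Path(answer.strip())
        if candidate.is_file():
            return candidate
        print(f"Could not find a file at '{candidate}', please try again.")


# Example usage (commented out because it waits for keyboard input):
# mhc_path = prompt_for_file("MHC")
# peptide_path = prompt_for_file("Peptide")
```

The returned `Path` objects can be passed straight to `open()`.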

In [24]:
import numpy as np
import pandas as pd
import math
import sys
import copy
import pprint
from MainCode import negative_sel_project as nsp

In [25]:
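# Ask for the location of each input file
mhc_path = input("Input the location of the MHC file you want to read in\n")
peptide_path = input("Input the location of the Peptide file you want to read in\n")
binding_path = input("Input the location of the Binding Site file you want to read in\n")
aa_table_path = input("Input the location of the AA Table file you want to read in\n")

# Read in the amino acid strings, closing each file once it has been read
with open(mhc_path) as mhc_file:
    MHC = mhc_file.read()
with open(peptide_path) as peptide_file:
    Peptide = peptide_file.read()
with open(binding_path) as binding_file:
    Binding = binding_file.read()

# Build the amino acid lookup table that is passed to convertAA
AA_Table = nsp.AA_import_todict(aa_table_path)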


# Convert each sequence using the amino acid table
MHCCon,MHC,PeptideCon,Peptide,BindingCon,Binding = nsp.convertAA(MHC,Peptide,Binding,AA_Table)

# Folding: build a lattice for each sequence
print("This is the Protein")
MaxPMHC, MHCFace = nsp.createLattice(MHC,MHCCon)
print("This is the Peptide")
MaxPPeptide,PeptideFace = nsp.createLattice(Peptide,PeptideCon)
print("This is the Variable Region")
MaxPBinding,BindingFace = nsp.createLattice(Binding,BindingCon)


# Basic entropy change only; enthalpy is taken as the number of bonds formed.

# F*: length-squared entropy term for each chain
FMHCEntropy = len(MHC)*len(MHC)
FBindingEntropy = len(Binding)*len(Binding)
# B*: the same term reduced by the difference in MaxP between the peptide and its partner
BMHCEntropy = abs(len(MHC)*len(MHC) - abs(MaxPPeptide - MaxPMHC))
BBindingEntropy = abs(len(Binding)*len(Binding) - abs(MaxPPeptide - MaxPBinding))
# D*: entropy change, F minus B
DMHCEntropy = FMHCEntropy - BMHCEntropy
DBindingEntropy = FBindingEntropy - BBindingEntropy



#For now ignoring Enthalpy changes. 


# Variable region side: -50 times the smaller MaxP of the pair, with the binding entropy change
StandardGBinding = nsp.SFree_E(-min(MaxPPeptide, MaxPBinding)*50,DBindingEntropy)

# MHC side: same form, with the MHC entropy change
StandardGMHC = nsp.SFree_E(-min(MaxPPeptide, MaxPMHC)*50,DMHCEntropy)

print(StandardGBinding)
print(StandardGMHC)
#Comparing both
if StandardGBinding <= StandardGMHC:
    print("Safe from negative Selection")
else:
    print("Underwent Negative Selection")
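To make the arithmetic in the cell above easier to follow, here is a worked sketch with made-up numbers. It reproduces the entropy terms and the final comparison, but `free_energy` is only a stand-in: the formula inside `nsp.SFree_E` is not shown in this notebook, so the textbook Gibbs relation ΔG = ΔH − TΔS at an assumed temperature of 298.15 is used instead. The sketch also assumes each pair uses its own entropy change. All lengths and MaxP values are hypothetical.

```python
def entropy_change(length, max_p_peptide, max_p_partner):
    """Mirror the notebook's F, B and D entropy terms for one partner chain."""
    free = length * length                                    # F term
    bound = abs(free - abs(max_p_peptide - max_p_partner))    # B term
    return free - bound                                       # D term


def free_energy(enthalpy, entropy, temperature=298.15):
    """ASSUMPTION: stand-in for nsp.SFree_E, using dG = dH - T * dS."""
    return enthalpy - temperature * entropy


def enthalpy_term(max_p_peptide, max_p_partner):
    """Mirror the notebook: -50 times the smaller MaxP value of the pair."""
    return -min(max_p_peptide, max_p_partner) * 50


# Hypothetical inputs, chosen only to illustrate the calculation
len_mhc, len_binding = 20, 12
max_p_mhc, max_p_peptide, max_p_binding = 7, 3, 5

d_mhc = entropy_change(len_mhc, max_p_peptide, max_p_mhc)
d_binding = entropy_change(len_binding, max_p_peptide, max_p_binding)

g_binding = free_energy(enthalpy_term(max_p_peptide, max_p_binding), d_binding)
g_mhc = free_energy(enthalpy_term(max_p_peptide, max_p_mhc), d_mhc)

# Same comparison as the notebook's final step
if g_binding <= g_mhc:
    verdict = "Safe from negative Selection"
else:
    verdict = "Underwent Negative Selection"

print(d_mhc, d_binding)
print(g_binding, g_mhc)
print(verdict)
```

Whenever the difference in MaxP is smaller than the squared chain length, the D term reduces to that difference, so in this sketch the entropy change is 4 for the MHC pair and 2 for the variable region pair.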


**Conclusion**

This code can be used to quickly understand how changes in amino acid sequence affect how strongly the T-cell receptor variable region binds the expressed peptide antigen. Many variable region sequences can therefore be tested when developing chimeric T cells, to better gauge whether a T cell may undergo negative selection.