<a href="https://colab.research.google.com/github/alanraetz/signatureExomeScan/blob/main/mhcflurry_colab.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Setup

This notebook demonstrates how to generate predictions using MHCflurry.

In [1]:
# Install the package and download models
!pip install -q mhcflurry
!mhcflurry-downloads --quiet fetch models_class1_presentation

135MB [00:12, 10.8MB/s]               
Extracting: 100% 62/62 [00:13<00:00,  4.63it/s]


In [2]:
# Imports
import mhcflurry
from google.colab import files

# Quiet warnings
import warnings
warnings.filterwarnings('ignore')

In [3]:
# Load a predictor
predictor = mhcflurry.Class1PresentationPredictor.load()
predictor

Forcing tensorflow backend.


Instructions for updating:
non-resource variables are not supported in the long term

Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor


<Class1PresentationPredictor at 0x7f2eb2274050 [mhcflurry 2.0.6] generated on Thu Jun 11 13:37:18 2020>

# Predict for specified peptides

In [None]:
peptides = """
NLVPMVATV
RANDMPEPTIDE
SIINFEKL
""".split()

alleles = "A*02:01 B*27:01 H2-Kb".split()

results1 = predictor.predict(peptides, alleles)
results1

Predicting processing.


100%|██████████| 1/1 [00:05<00:00,  5.01s/it]


Predicting affinities.


Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
Instructions for updating:
Call initializer instance with the dtype argument instead of passing it to the constructor
100%|██████████| 3/3 [00:09<00:00,  3.25s/it]


| | peptide | peptide_num | sample_name | affinity | best_allele | processing_score | presentation_score | presentation_percentile |
|---|---|---|---|---|---|---|---|---|
| 0 | NLVPMVATV | 0 | sample1 | 16.570972 | A*02:01 | 0.533008 | 0.970187 | 0.018723 |
| 1 | RANDMPEPTIDE | 1 | sample1 | 21780.313255 | B*27:01 | 0.008492 | 0.004732 | 62.744674 |
| 2 | SIINFEKL | 2 | sample1 | 19.70721 | H2-Kb | 0.26471 | 0.914111 | 0.099511 |
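
In the results above, a high `presentation_score` goes with a low `presentation_percentile`. A natural next step is to keep only the stronger candidates. The following is a minimal sketch using only the standard library. The rows are copied from the output above (a subset of columns). The cutoff `MAX_PERCENTILE` and the helper `top_candidates` are hypothetical names chosen for illustration, not part of MHCflurry, and the cutoff value is arbitrary.

```python
import csv
import io

# Rows copied from the results shown above (subset of columns).
RESULTS_CSV = """\
peptide,affinity,best_allele,presentation_score,presentation_percentile
NLVPMVATV,16.570972,A*02:01,0.970187,0.018723
RANDMPEPTIDE,21780.313255,B*27:01,0.004732,62.744674
SIINFEKL,19.70721,H2-Kb,0.914111,0.099511
"""

# Hypothetical cutoff, chosen only for illustration.
MAX_PERCENTILE = 2.0


def top_candidates(csv_text, max_percentile=MAX_PERCENTILE):
    """Keep rows at or below the percentile cutoff, best score first."""
    rows = csv.DictReader(io.StringIO(csv_text))
    kept = [
        row for row in rows
        if float(row["presentation_percentile"]) <= max_percentile
    ]
    return sorted(
        kept, key=lambda row: float(row["presentation_score"]), reverse=True
    )


candidates = top_candidates(RESULTS_CSV)
for row in candidates:
    print(row["peptide"], row["best_allele"], row["presentation_score"])
```

The same filter could be applied to the file written by `results1.to_csv(...)` by reading that file's text instead of the inline string.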


In [None]:
# Download results
results1.to_csv('mhcflurry-results.csv')
files.download('mhcflurry-results.csv')


In [None]:
# See help for more options:
help(predictor.predict)

Help on method predict in module mhcflurry.class1_presentation_predictor:

predict(peptides, alleles, sample_names=None, n_flanks=None, c_flanks=None, include_affinity_percentile=False, verbose=1, throw=True) method of mhcflurry.class1_presentation_predictor.Class1PresentationPredictor instance
    Predict presentation scores across a set of peptides.
    
    Presentation scores combine predictions for MHC I binding affinity
    and antigen processing.
    
    This method returns a pandas.DataFrame giving presentation scores plus
    the binding affinity and processing predictions and other intermediate
    results.
    
    Example:
    
    >>> predictor = Class1PresentationPredictor.load()
    >>> predictor.predict(
    ...    peptides=["SIINFEKL", "PEPTIDE"],
    ...    n_flanks=["NNN", "SNS"],
    ...    c_flanks=["CCC", "CNC"],
    ...    alleles={
    ...        "sample1": ["A0201", "A0301", "B0702"],
    ...        "sample2": ["A0101", "C0202"],
    ...    },
    ...    verbo

# Predict by scanning across protein sequences
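
The FASTA records in this section come in pairs named `..._wt` and `..._mut` that differ at a single residue. As a rough illustration of what scanning means here, the sketch below parses FASTA text, finds where a wild-type and mutant pair differ, and lists the fixed-length windows of the mutant sequence that cover the changed residue. It uses only the standard library. The helpers `parse_fasta`, `differing_positions`, and `windows_covering` are hypothetical names written for this example and are not MHCflurry's own scanning code. The window length of 9 is an assumption for illustration.

```python
def parse_fasta(text):
    """Parse FASTA text into a {record_name: sequence} dict."""
    records = {}
    name = None
    for line in text.splitlines():
        line = line.strip()
        if not line:
            continue
        if line.startswith(">"):
            name = line[1:]
            records[name] = ""
        elif name is not None:
            records[name] += line
    return records


def differing_positions(wt, mut):
    """Return 0-based indices where two equal-length sequences differ."""
    if len(wt) != len(mut):
        raise ValueError("wt and mut must have the same length")
    return [i for i, (a, b) in enumerate(zip(wt, mut)) if a != b]


def windows_covering(seq, pos, length):
    """Return every substring of `length` residues that includes index `pos`."""
    first = max(0, pos - length + 1)
    last = min(pos, len(seq) - length)
    return [seq[start:start + length] for start in range(first, last + 1)]


# First wt/mut pair from the FASTA input in this section.
sample_fasta = """
>ENSG00000004776_71_wt
YLRAPSVALPVAQVPTDPGHFSVLLDVKHFSPEEIA
>ENSG00000004776_71_mut
YLRAPSVALPVAQVPTDPGYFSVLLDVKHFSPEEIA
"""

records = parse_fasta(sample_fasta)
wt = records["ENSG00000004776_71_wt"]
mut = records["ENSG00000004776_71_mut"]

positions = differing_positions(wt, mut)
nine_mers = windows_covering(mut, positions[0], 9)
print(positions, len(nine_mers), nine_mers)
```

The resulting peptide list could be passed to `predictor.predict(peptides, alleles)` in the same way as in the earlier cell.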

In [None]:
# Paste your FASTA records here
proteins_fasta = """
>ENSG00000004776_71_wt
YLRAPSVALPVAQVPTDPGHFSVLLDVKHFSPEEIA
>ENSG00000004776_71_mut
YLRAPSVALPVAQVPTDPGYFSVLLDVKHFSPEEIA
>ENSG00000004776_101_wt
SPEEIAVKVVGEHVEVHARHEERPDEHGFVAREFHR
>ENSG00000004776_101_mut
SPEEIAVKVVGEHVEVHARYEERPDEHGFVAREFHR
>ENSG00000004776_116_wt
VHARHEERPDEHGFVAREFHRRYRLPPGVDPAAVTS
>ENSG00000004776_116_mut
VHARHEERPDEHGFVAREFYRRYRLPPGVDPAAVTS
>ENSG00000004776_147_wt
AAVTSALSPEGVLSIQAAPASAQAPPPAAAK
>ENSG00000004776_147_mut
AAVTSALSPEGVLSIQAALASAQAPPPAAAK
>ENSG00000004776_153_wt
LSPEGVLSIQAAPASAQAPPPAAAK
>ENSG00000004776_153_mut
LSPEGVLSIQAAPASAQALPPAAAK
>ENSG00000004776_155_wt
PEGVLSIQAAPASAQAPPPAAAK
>ENSG00000004776_155_mut
PEGVLSIQAAPASAQAPPLAAAK
>ENSG00000004776_2_wt
MEIPVPVQPSWLRRASAPLP
>ENSG00000004776_2_mut
MEISVPVQPSWLRRASAPLP
>ENSG00000004776_16_wt
MEIPVPVQPSWLRRASAPLPGLSAPGRLFDQRFG
>ENSG00000004776_16_mut
MEIPVPVQPSWLRRASASLPGLSAPGRLFDQRFG
>ENSG00000004776_19_wt
EIPVPVQPSWLRRASAPLPGLSAPGRLFDQRFGEGL
>ENSG00000004776_19_mut
EIPVPVQPSWLRRASAPLLGLSAPGRLFDQRFGEGL
>ENSG00000004776_24_wt
VQPSWLRRASAPLPGLSAPGRLFDQRFGEGLLEAEL
>ENSG00000004776_24_mut
VQPSWLRRASAPLPGLSALGRLFDQRFGEGLLEAEL
>ENSG00000004776_45_wt
LFDQRFGEGLLEAELAALCPTTLAPYYLRAPSVALP
>ENSG00000004776_45_mut
LFDQRFGEGLLEAELAALCSTTLAPYYLRAPSVALP
>ENSG00000004776_46_wt
FDQRFGEGLLEAELAALCPTTLAPYYLRAPSVALPV
>ENSG00000004776_46_mut
FDQRFGEGLLEAELAALCLTTLAPYYLRAPSVALPV
>ENSG00000004776_50_wt
FGEGLLEAELAALCPTTLAPYYLRAPSVALPVAQVP
>ENSG00000004776_50_mut
FGEGLLEAELAALCPTTLASYYLRAPSVALPVAQVP
>ENSG00000004776_51_wt
GEGLLEAELAALCPTTLAPYYLRAPSVALPVAQVPT
>ENSG00000004776_51_mut
GEGLLEAELAALCPTTLALYYLRAPSVALPVAQVPT
>ENSG00000004776_57_wt
AELAALCPTTLAPYYLRAPSVALPVAQVPTDPGHFS
>ENSG00000004776_57_mut
AELAALCPTTLAPYYLRALSVALPVAQVPTDPGHFS
>ENSG00000004776_62_wt
LCPTTLAPYYLRAPSVALPVAQVPTDPGHFSVLLDV
>ENSG00000004776_62_mut
LCPTTLAPYYLRAPSVALLVAQVPTDPGHFSVLLDV
>ENSG00000004776_69_wt
PYYLRAPSVALPVAQVPTDPGHFSVLLDVKHFSPEE
>ENSG00000004776_69_mut
PYYLRAPSVALPVAQVPTDSGHFSVLLDVKHFSPEE
>ENSG00000004776_70_wt
YYLRAPSVALPVAQVPTDPGHFSVLLDVKHFSPEEI
>ENSG00000004776_70_mut
YYLRAPSVALPVAQVPTDLGHFSVLLDVKHFSPEEI
>ENSG00000004776_105_wt
IAVKVVGEHVEVHARHEERPDEHGFVAREFHRRYRL
>ENSG00000004776_105_mut
IAVKVVGEHVEVHARHEERSDEHGFVAREFHRRYRL
>ENSG00000004776_136_wt
RRYRLPPGVDPAAVTSALSPEGVLSIQAAPASAQAP
>ENSG00000004776_136_mut
RRYRLPPGVDPAAVTSALFPEGVLSIQAAPASAQAP
>ENSG00000004776_137_wt
RYRLPPGVDPAAVTSALSPEGVLSIQAAPASAQAPP
>ENSG00000004776_137_mut
RYRLPPGVDPAAVTSALSLEGVLSIQAAPASAQAPP
>ENSG00000004776_152_wt
ALSPEGVLSIQAAPASAQAPPPAAAK
>ENSG00000004776_152_mut
ALSPEGVLSIQAAPASAQASPPAAAK
>ENSG00000004776_3_wt
MEIPVPVQPSWLRRASAPLPG
>ENSG00000004776_3_mut
MEILVPVQPSWLRRASAPLPG
>ENSG00000004776_5_wt
MEIPVPVQPSWLRRASAPLPGLS
>ENSG00000004776_5_mut
MEIPVLVQPSWLRRASAPLPGLS
>ENSG00000004776_26_wt
PSWLRRASAPLPGLSAPGRLFDQRFGEGLLEAELAA
>ENSG00000004776_26_mut
PSWLRRASAPLPGLSAPGRFFDQRFGEGLLEAELAA
>ENSG00000004776_124_wt
PDEHGFVAREFHRRYRLPPGVDPAAVTSALSPEGVL
>ENSG00000004776_124_mut
PDEHGFVAREFHRRYRLPLGVDPAAVTSALSPEGVL
>ENSG00000004776_127_wt
HGFVAREFHRRYRLPPGVDPAAVTSALSPEGVLSIQ
>ENSG00000004776_127_mut
HGFVAREFHRRYRLPPGVDSAAVTSALSPEGVLSIQ
>ENSG00000004776_133_wt
EFHRRYRLPPGVDPAAVTSALSPEGVLSIQAAPASA
>ENSG00000004776_133_mut
EFHRRYRLPPGVDPAAVTFALSPEGVLSIQAAPASA
>ENSG00000004776_142_wt
PGVDPAAVTSALSPEGVLSIQAAPASAQAPPPAAAK
>ENSG00000004776_142_mut
PGVDPAAVTSALSPEGVLFIQAAPASAQAPPPAAAK
>ENSG00000004776_15_wt
MEIPVPVQPSWLRRASAPLPGLSAPGRLFDQRF
>ENSG00000004776_15_mut
MEIPVPVQPSWLRRALAPLPGLSAPGRLFDQRF
>ENSG00000004776_22_wt
VPVQPSWLRRASAPLPGLSAPGRLFDQRFGEGLLEA
>ENSG00000004776_22_mut
VPVQPSWLRRASAPLPGLLAPGRLFDQRFGEGLLEA
>ENSG00000004776_74_wt
APSVALPVAQVPTDPGHFSVLLDVKHFSPEEIAVKV
>ENSG00000004776_74_mut
APSVALPVAQVPTDPGHFLVLLDVKHFSPEEIAVKV
>ENSG00000004776_83_wt
QVPTDPGHFSVLLDVKHFSPEEIAVKVVGEHVEVHA
>ENSG00000004776_83_mut
QVPTDPGHFSVLLDVKHFLPEEIAVKVVGEHVEVHA
>ENSG00000004776_118_wt
ARHEERPDEHGFVAREFHRRYRLPPGVDPAAVTSAL
>ENSG00000004776_118_mut
ARHEERPDEHGFVAREFHRCYRLPPGVDPAAVTSAL
>ENSG00000004776_149_wt
VTSALSPEGVLSIQAAPASAQAPPPAAAK
>ENSG00000004776_149_mut
VTSALSPEGVLSIQAAPALAQAPPPAAAK
>ENSG00000004776_9_wt
MEIPVPVQPSWLRRASAPLPGLSAPGR
>ENSG00000004776_9_mut
MEIPVPVQPFWLRRASAPLPGLSAPGR
>ENSG00000001617_17_wt
MLVAGLLLWASLLTGAWPSFPTQDHLPATPRVRLS
>ENSG00000001617_17_mut
MLVAGLLLWASLLTGAWLSFPTQDHLPATPRVRLS
>ENSG00000001617_23_wt
LLLWASLLTGAWPSFPTQDHLPATPRVRLSFKELKA
>ENSG00000001617_23_mut
LLLWASLLTGAWPSFPTQDYLPATPRVRLSFKELKA
>ENSG00000001617_44_wt
PATPRVRLSFKELKATGTAHFFNFLLNTTDYRILLK
>ENSG00000001617_44_mut
PATPRVRLSFKELKATGTAYFFNFLLNTTDYRILLK
>ENSG00000001617_64_wt
FFNFLLNTTDYRILLKDEDHDRMYVGSKDYVLSLDL
>ENSG00000001617_64_mut
FFNFLLNTTDYRILLKDEDYDRMYVGSKDYVLSLDL
>ENSG00000001617_97_wt
LDLHDINREPLIIHWAASPQRIEECVLSGKDVNGEC
>ENSG00000001617_97_mut
LDLHDINREPLIIHWAASLQRIEECVLSGKDVNGEC
>ENSG00000001617_154_wt
AYNPMCTYVNRGRRAQATPWTQTQAVRGRGSRATDG
>ENSG00000001617_154_mut
AYNPMCTYVNRGRRAQATLWTQTQAVRGRGSRATDG
>ENSG00000001617_180_wt
RGRGSRATDGALRPMPTAPRQDYIFYLEPERLESGK
>ENSG00000001617_180_mut
RGRGSRATDGALRPMPTALRQDYIFYLEPERLESGK
>ENSG00000001617_258_wt
TAMRTDQYNSRWLNDPSFIHAELIPDSAERNDDKLY
>ENSG00000001617_258_mut
TAMRTDQYNSRWLNDPSFIYAELIPDSAERNDDKLY
>ENSG00000001617_387_wt
AVCVYSMADIRMVFNGPFAHKEGPNYQWMPFSGKMP
>ENSG00000001617_387_mut
AVCVYSMADIRMVFNGPFAYKEGPNYQWMPFSGKMP
>ENSG00000001617_406_wt
HKEGPNYQWMPFSGKMPYPRPGTCPGGTFTPSMKST
>ENSG00000001617_406_mut
HKEGPNYQWMPFSGKMPYLRPGTCPGGTFTPSMKST
>ENSG00000001617_418_wt
SGKMPYPRPGTCPGGTFTPSMKSTKDYPDEVINFMR
>ENSG00000001617_418_mut
SGKMPYPRPGTCPGGTFTLSMKSTKDYPDEVINFMR
>ENSG00000001617_436_wt
PSMKSTKDYPDEVINFMRSHPLMYQAVYPLQRRPLV
>ENSG00000001617_436_mut
PSMKSTKDYPDEVINFMRSYPLMYQAVYPLQRRPLV
>ENSG00000001617_438_wt
MKSTKDYPDEVINFMRSHPLMYQAVYPLQRRPLVVR
>ENSG00000001617_438_mut
MKSTKDYPDEVINFMRSHLLMYQAVYPLQRRPLVVR
>ENSG00000001617_516_wt
DDQELEELMLEEVEVFKDPAPVKTMTISSKRQQLYV
>ENSG00000001617_516_mut
DDQELEELMLEEVEVFKDLAPVKTMTISSKRQQLYV
>ENSG00000001617_701_wt
NFKHVVTRVQLHVLGRDAVHAALFPPLSMSAPPPPG
>ENSG00000001617_701_mut
NFKHVVTRVQLHVLGRDAVYAALFPPLSMSAPPPPG
>ENSG00000001617_707_wt
TRVQLHVLGRDAVHAALFPPLSMSAPPPPGAGPPTP
>ENSG00000001617_707_mut
TRVQLHVLGRDAVHAALFLPLSMSAPPPPGAGPPTP
>ENSG00000001617_708_wt
RVQLHVLGRDAVHAALFPPLSMSAPPPPGAGPPTPP
>ENSG00000001617_708_mut
RVQLHVLGRDAVHAALFPLLSMSAPPPPGAGPPTPP
>ENSG00000001617_715_wt
GRDAVHAALFPPLSMSAPPPPGAGPPTPPYQELAQL
>ENSG00000001617_715_mut
GRDAVHAALFPPLSMSAPLPPGAGPPTPPYQELAQL
>ENSG00000001617_717_wt
DAVHAALFPPLSMSAPPPPGAGPPTPPYQELAQLLA
>ENSG00000001617_717_mut
DAVHAALFPPLSMSAPPPLGAGPPTPPYQELAQLLA
>ENSG00000001617_722_wt
ALFPPLSMSAPPPPGAGPPTPPYQELAQLLAQPEVG
>ENSG00000001617_722_mut
ALFPPLSMSAPPPPGAGPLTPPYQELAQLLAQPEVG
>ENSG00000001617_736_wt
GAGPPTPPYQELAQLLAQPEVGLIHQYCQGYWRHVP
>ENSG00000001617_736_mut
GAGPPTPPYQELAQLLAQLEVGLIHQYCQGYWRHVP
>ENSG00000001617_741_wt
TPPYQELAQLLAQPEVGLIHQYCQGYWRHVPPSPRE
>ENSG00000001617_741_mut
TPPYQELAQLLAQPEVGLIYQYCQGYWRHVPPSPRE
>ENSG00000001617_750_wt
LLAQPEVGLIHQYCQGYWRHVPPSPREAPGAPRSPE
>ENSG00000001617_750_mut
LLAQPEVGLIHQYCQGYWRYVPPSPREAPGAPRSPE
>ENSG00000001617_760_wt
HQYCQGYWRHVPPSPREAPGAPRSPEPQDQKKPRNR
>ENSG00000001617_760_mut
HQYCQGYWRHVPPSPREALGAPRSPEPQDQKKPRNR
>ENSG00000001617_779_wt
GAPRSPEPQDQKKPRNRRHHPPDT
>ENSG00000001617_779_mut
GAPRSPEPQDQKKPRNRRHYPPDT
>ENSG00000001617_19_wt
LVAGLLLWASLLTGAWPSFPTQDHLPATPRVRLSFK
>ENSG00000001617_19_mut
LVAGLLLWASLLTGAWPSFSTQDHLPATPRVRLSFK
>ENSG00000001617_20_wt
VAGLLLWASLLTGAWPSFPTQDHLPATPRVRLSFKE
>ENSG00000001617_20_mut
VAGLLLWASLLTGAWPSFLTQDHLPATPRVRLSFKE
>ENSG00000001617_25_wt
LWASLLTGAWPSFPTQDHLPATPRVRLSFKELKATG
>ENSG00000001617_25_mut
LWASLLTGAWPSFPTQDHLSATPRVRLSFKELKATG
>ENSG00000001617_29_wt
LLTGAWPSFPTQDHLPATPRVRLSFKELKATGTAHF
>ENSG00000001617_29_mut
LLTGAWPSFPTQDHLPATLRVRLSFKELKATGTAHF
>ENSG00000001617_88_wt
VGSKDYVLSLDLHDINREPLIIHWAASPQRIEECVL
>ENSG00000001617_88_mut
VGSKDYVLSLDLHDINREPFIIHWAASPQRIEECVL
>ENSG00000001617_96_wt
SLDLHDINREPLIIHWAASPQRIEECVLSGKDVNGE
>ENSG00000001617_96_mut
SLDLHDINREPLIIHWAAFPQRIEECVLSGKDVNGE
>ENSG00000001617_123_wt
LSGKDVNGECGNFVRLIQPWNRTHLYVCGTGAYNPM
>ENSG00000001617_123_mut
LSGKDVNGECGNFVRLIQLWNRTHLYVCGTGAYNPM
>ENSG00000001617_139_wt
IQPWNRTHLYVCGTGAYNPMCTYVNRGRRAQATPWT
>ENSG00000001617_139_mut
IQPWNRTHLYVCGTGAYNLMCTYVNRGRRAQATPWT
>ENSG00000001617_174_wt
TQTQAVRGRGSRATDGALRPMPTAPRQDYIFYLEPE
>ENSG00000001617_174_mut
TQTQAVRGRGSRATDGALRSMPTAPRQDYIFYLEPE
>ENSG00000001617_177_wt
QAVRGRGSRATDGALRPMPTAPRQDYIFYLEPERLE
>ENSG00000001617_177_mut
QAVRGRGSRATDGALRPMLTAPRQDYIFYLEPERLE
>ENSG00000001617_179_wt
VRGRGSRATDGALRPMPTAPRQDYIFYLEPERLESG
>ENSG00000001617_179_mut
VRGRGSRATDGALRPMPTASRQDYIFYLEPERLESG
>ENSG00000001617_204_wt
FYLEPERLESGKGKCPYDPKLDTASALINEELYAGV
>ENSG00000001617_204_mut
FYLEPERLESGKGKCPYDLKLDTASALINEELYAGV
>ENSG00000001617_254_wt
LGKQTAMRTDQYNSRWLNDPSFIHAELIPDSAERND
>ENSG00000001617_254_mut
LGKQTAMRTDQYNSRWLNDSSFIHAELIPDSAERND
>ENSG00000001617_288_wt
NDDKLYFFFRERSAEAPQSPAVYARIGRICLNDDGG
>ENSG00000001617_288_mut
NDDKLYFFFRERSAEAPQSSAVYARIGRICLNDDGG
>ENSG00000001617_289_wt
DDKLYFFFRERSAEAPQSPAVYARIGRICLNDDGGH
>ENSG00000001617_289_mut
DDKLYFFFRERSAEAPQSLAVYARIGRICLNDDGGH
>ENSG00000001617_325_wt
CCLVNKWSTFLKARLVCSVPGEDGIETHFDELQDVF
>ENSG00000001617_325_mut
CCLVNKWSTFLKARLVCSVSGEDGIETHFDELQDVF
>ENSG00000001617_385_wt
GSAVCVYSMADIRMVFNGPFAHKEGPNYQWMPFSGK
>ENSG00000001617_385_mut
GSAVCVYSMADIRMVFNGLFAHKEGPNYQWMPFSGK
>ENSG00000001617_398_wt
MVFNGPFAHKEGPNYQWMPFSGKMPYPRPGTCPGGT
>ENSG00000001617_398_mut
MVFNGPFAHKEGPNYQWMLFSGKMPYPRPGTCPGGT
>ENSG00000001617_404_wt
FAHKEGPNYQWMPFSGKMPYPRPGTCPGGTFTPSMK
>ENSG00000001617_404_mut
FAHKEGPNYQWMPFSGKMLYPRPGTCPGGTFTPSMK
>ENSG00000001617_405_wt
AHKEGPNYQWMPFSGKMPYPRPGTCPGGTFTPSMKS
>ENSG00000001617_405_mut
AHKEGPNYQWMPFSGKMPYSRPGTCPGGTFTPSMKS
>ENSG00000001617_411_wt
NYQWMPFSGKMPYPRPGTCPGGTFTPSMKSTKDYPD
>ENSG00000001617_411_mut
NYQWMPFSGKMPYPRPGTCSGGTFTPSMKSTKDYPD
>ENSG00000001617_437_wt
SMKSTKDYPDEVINFMRSHPLMYQAVYPLQRRPLVV
>ENSG00000001617_437_mut
SMKSTKDYPDEVINFMRSHSLMYQAVYPLQRRPLVV
>ENSG00000001617_445_wt
PDEVINFMRSHPLMYQAVYPLQRRPLVVRTGAPYRL
>ENSG00000001617_445_mut
PDEVINFMRSHPLMYQAVYSLQRRPLVVRTGAPYRL
>ENSG00000001617_451_wt
FMRSHPLMYQAVYPLQRRPLVVRTGAPYRLTTIAVD
>ENSG00000001617_451_mut
FMRSHPLMYQAVYPLQRRLLVVRTGAPYRLTTIAVD
>ENSG00000001617_459_wt
YQAVYPLQRRPLVVRTGAPYRLTTIAVDQVDAADGR
>ENSG00000001617_459_mut
YQAVYPLQRRPLVVRTGALYRLTTIAVDQVDAADGR
>ENSG00000001617_496_wt
EVLFLGTDRGTVQKVIVLPKDDQELEELMLEEVEVF
>ENSG00000001617_496_mut
EVLFLGTDRGTVQKVIVLLKDDQELEELMLEEVEVF
>ENSG00000001617_518_wt
QELEELMLEEVEVFKDPAPVKTMTISSKRQQLYVAS
>ENSG00000001617_518_mut
QELEELMLEEVEVFKDPALVKTMTISSKRQQLYVAS
>ENSG00000001617_562_wt
LHRCQAYGAACADCCLARDPYCAWDGQACSRYTASS
>ENSG00000001617_562_mut
LHRCQAYGAACADCCLARDSYCAWDGQACSRYTASS
>ENSG00000001617_593_wt
YTASSKRRSRRQDVRHGNPIRQCRGFNSNANKNAVE
>ENSG00000001617_593_mut
YTASSKRRSRRQDVRHGNLIRQCRGFNSNANKNAVE
>ENSG00000001617_627_wt
VESVQYGVAGSAAFLECQPRSPQATVKWLFQRDPGD
>ENSG00000001617_627_mut
VESVQYGVAGSAAFLECQLRSPQATVKWLFQRDPGD
>ENSG00000001617_630_wt
VQYGVAGSAAFLECQPRSPQATVKWLFQRDPGDRRR
>ENSG00000001617_630_mut
VQYGVAGSAAFLECQPRSLQATVKWLFQRDPGDRRR
>ENSG00000001617_706_wt
VTRVQLHVLGRDAVHAALFPPLSMSAPPPPGAGPPT
>ENSG00000001617_706_mut
VTRVQLHVLGRDAVHAALFSPLSMSAPPPPGAGPPT
>ENSG00000001617_713_wt
VLGRDAVHAALFPPLSMSAPPPPGAGPPTPPYQELA
>ENSG00000001617_713_mut
VLGRDAVHAALFPPLSMSASPPPGAGPPTPPYQELA
>ENSG00000001617_716_wt
RDAVHAALFPPLSMSAPPPPGAGPPTPPYQELAQLL
>ENSG00000001617_716_mut
RDAVHAALFPPLSMSAPPPSGAGPPTPPYQELAQLL
>ENSG00000001617_720_wt
HAALFPPLSMSAPPPPGAGPPTPPYQELAQLLAQPE
>ENSG00000001617_720_mut
HAALFPPLSMSAPPPPGAGSPTPPYQELAQLLAQPE
>ENSG00000001617_721_wt
AALFPPLSMSAPPPPGAGPPTPPYQELAQLLAQPEV
>ENSG00000001617_721_mut
AALFPPLSMSAPPPPGAGPSTPPYQELAQLLAQPEV
>ENSG00000001617_753_wt
QPEVGLIHQYCQGYWRHVPPSPREAPGAPRSPEPQD
>ENSG00000001617_753_mut
QPEVGLIHQYCQGYWRHVPSSPREAPGAPRSPEPQD
>ENSG00000001617_754_wt
PEVGLIHQYCQGYWRHVPPSPREAPGAPRSPEPQDQ
>ENSG00000001617_754_mut
PEVGLIHQYCQGYWRHVPLSPREAPGAPRSPEPQDQ
>ENSG00000001617_755_wt
EVGLIHQYCQGYWRHVPPSPREAPGAPRSPEPQDQK
>ENSG00000001617_755_mut
EVGLIHQYCQGYWRHVPPSSREAPGAPRSPEPQDQK
>ENSG00000001617_756_wt
VGLIHQYCQGYWRHVPPSPREAPGAPRSPEPQDQKK
>ENSG00000001617_756_mut
VGLIHQYCQGYWRHVPPSLREAPGAPRSPEPQDQKK
>ENSG00000001617_763_wt
CQGYWRHVPPSPREAPGAPRSPEPQDQKKPRNRRHH
>ENSG00000001617_763_mut
CQGYWRHVPPSPREAPGALRSPEPQDQKKPRNRRHH
>ENSG00000001617_768_wt
RHVPPSPREAPGAPRSPEPQDQKKPRNRRHHPPDT
>ENSG00000001617_768_mut
RHVPPSPREAPGAPRSPELQDQKKPRNRRHHPPDT
>ENSG00000001617_774_wt
PREAPGAPRSPEPQDQKKPRNRRHHPPDT
>ENSG00000001617_774_mut
PREAPGAPRSPEPQDQKKLRNRRHHPPDT
>ENSG00000001617_780_wt
APRSPEPQDQKKPRNRRHHPPDT
>ENSG00000001617_780_mut
APRSPEPQDQKKPRNRRHHSPDT
>ENSG00000001617_24_wt
LLWASLLTGAWPSFPTQDHLPATPRVRLSFKELKAT
>ENSG00000001617_24_mut
LLWASLLTGAWPSFPTQDHFPATPRVRLSFKELKAT
>ENSG00000001617_172_wt
PWTQTQAVRGRGSRATDGALRPMPTAPRQDYIFYLE
>ENSG00000001617_172_mut
PWTQTQAVRGRGSRATDGAFRPMPTAPRQDYIFYLE
>ENSG00000001617_190_wt
ALRPMPTAPRQDYIFYLEPERLESGKGKCPYDPKLD
>ENSG00000001617_190_mut
ALRPMPTAPRQDYIFYLELERLESGKGKCPYDPKLD
>ENSG00000001617_211_wt
LESGKGKCPYDPKLDTASALINEELYAGVYIDFMGT
>ENSG00000001617_211_mut
LESGKGKCPYDPKLDTASAFINEELYAGVYIDFMGT
>ENSG00000001617_264_wt
QYNSRWLNDPSFIHAELIPDSAERNDDKLYFFFRER
>ENSG00000001617_264_mut
QYNSRWLNDPSFIHAELILDSAERNDDKLYFFFRER
>ENSG00000001617_352_wt
HFDELQDVFVQQTQDVRNPVIYAVFTSSGSVFRGSA
>ENSG00000001617_352_mut
HFDELQDVFVQQTQDVRNLVIYAVFTSSGSVFRGSA
>ENSG00000001617_412_wt
YQWMPFSGKMPYPRPGTCPGGTFTPSMKSTKDYPDE
>ENSG00000001617_412_mut
YQWMPFSGKMPYPRPGTCLGGTFTPSMKSTKDYPDE
>ENSG00000001617_427_wt
GTCPGGTFTPSMKSTKDYPDEVINFMRSHPLMYQAV
>ENSG00000001617_427_mut
GTCPGGTFTPSMKSTKDYLDEVINFMRSHPLMYQAV
>ENSG00000001617_446_wt
DEVINFMRSHPLMYQAVYPLQRRPLVVRTGAPYRLT
>ENSG00000001617_446_mut
DEVINFMRSHPLMYQAVYLLQRRPLVVRTGAPYRLT
>ENSG00000001617_461_wt
AVYPLQRRPLVVRTGAPYRLTTIAVDQVDAADGRYE
>ENSG00000001617_461_mut
AVYPLQRRPLVVRTGAPYRFTTIAVDQVDAADGRYE
>ENSG00000001617_558_wt
THLSLHRCQAYGAACADCCLARDPYCAWDGQACSRY
>ENSG00000001617_558_mut
THLSLHRCQAYGAACADCCFARDPYCAWDGQACSRY
>ENSG00000001617_563_wt
HRCQAYGAACADCCLARDPYCAWDGQACSRYTASSK
>ENSG00000001617_563_mut
HRCQAYGAACADCCLARDLYCAWDGQACSRYTASSK
>ENSG00000001617_622_wt
ANKNAVESVQYGVAGSAAFLECQPRSPQATVKWLFQ
>ENSG00000001617_622_mut
ANKNAVESVQYGVAGSAAFFECQPRSPQATVKWLFQ
>ENSG00000001617_642_wt
ECQPRSPQATVKWLFQRDPGDRRREIRAEDRFLRTE
>ENSG00000001617_642_mut
ECQPRSPQATVKWLFQRDLGDRRREIRAEDRFLRTE
>ENSG00000001617_673_wt
FLRTEQGLLLRALQLSDRGLYSCTATENNFKHVVTR
>ENSG00000001617_673_mut
FLRTEQGLLLRALQLSDRGFYSCTATENNFKHVVTR
>ENSG00000001617_704_wt
HVVTRVQLHVLGRDAVHAALFPPLSMSAPPPPGAGP
>ENSG00000001617_704_mut
HVVTRVQLHVLGRDAVHAAFFPPLSMSAPPPPGAGP
>ENSG00000001617_724_wt
FPPLSMSAPPPPGAGPPTPPYQELAQLLAQPEVGLI
>ENSG00000001617_724_mut
FPPLSMSAPPPPGAGPPTPSYQELAQLLAQPEVGLI
>ENSG00000001617_725_wt
PPLSMSAPPPPGAGPPTPPYQELAQLLAQPEVGLIH
>ENSG00000001617_725_mut
PPLSMSAPPPPGAGPPTPLYQELAQLLAQPEVGLIH
>ENSG00000001617_739_wt
PPTPPYQELAQLLAQPEVGLIHQYCQGYWRHVPPSP
>ENSG00000001617_739_mut
PPTPPYQELAQLLAQPEVGFIHQYCQGYWRHVPPSP
>ENSG00000001617_766_wt
YWRHVPPSPREAPGAPRSPEPQDQKKPRNRRHHPPD
>ENSG00000001617_766_mut
YWRHVPPSPREAPGAPRSLEPQDQKKPRNRRHHPPD
>ENSG00000001617_781_wt
PRSPEPQDQKKPRNRRHHPPDT
>ENSG00000001617_781_mut
PRSPEPQDQKKPRNRRHHPSDT
>ENSG00000001617_34_wt
WPSFPTQDHLPATPRVRLSFKELKATGTAHFFNFLL
>ENSG00000001617_34_mut
WPSFPTQDHLPATPRVRLLFKELKATGTAHFFNFLL
>ENSG00000001617_106_wt
PLIIHWAASPQRIEECVLSGKDVNGECGNFVRLIQP
>ENSG00000001617_106_mut
PLIIHWAASPQRIEECVLLGKDVNGECGNFVRLIQP
>ENSG00000001617_195_wt
PTAPRQDYIFYLEPERLESGKGKCPYDPKLDTASAL
>ENSG00000001617_195_mut
PTAPRQDYIFYLEPERLELGKGKCPYDPKLDTASAL
>ENSG00000001617_305_wt
QSPAVYARIGRICLNDDGGHCCLVNKWSTFLKARLV
>ENSG00000001617_305_mut
QSPAVYARIGRICLNDDGGYCCLVNKWSTFLKARLV
>ENSG00000001617_333_wt
TFLKARLVCSVPGEDGIETHFDELQDVFVQQTQDVR
>ENSG00000001617_333_mut
TFLKARLVCSVPGEDGIETYFDELQDVFVQQTQDVR
>ENSG00000001617_400_wt
FNGPFAHKEGPNYQWMPFSGKMPYPRPGTCPGGTFT
>ENSG00000001617_400_mut
FNGPFAHKEGPNYQWMPFLGKMPYPRPGTCPGGTFT
>ENSG00000001617_535_wt
APVKTMTISSKRQQLYVASAVGVTHLSLHRCQAYGA
>ENSG00000001617_535_mut
APVKTMTISSKRQQLYVALAVGVTHLSLHRCQAYGA
>ENSG00000001617_10_wt
MLVAGLLLWASLLTGAWPSFPTQDHLPA
>ENSG00000001617_10_mut
MLVAGLLLWAFLLTGAWPSFPTQDHLPA
>ENSG00000001617_18_wt
MLVAGLLLWASLLTGAWPSFPTQDHLPATPRVRLSF
>ENSG00000001617_18_mut
MLVAGLLLWASLLTGAWPFFPTQDHLPATPRVRLSF
>ENSG00000001617_78_wt
LKDEDHDRMYVGSKDYVLSLDLHDINREPLIIHWAA
>ENSG00000001617_78_mut
LKDEDHDRMYVGSKDYVLFLDLHDINREPLIIHWAA
>ENSG00000001617_200_wt
QDYIFYLEPERLESGKGKCPYDPKLDTASALINEEL
>ENSG00000001617_200_mut
QDYIFYLEPERLESGKGKCSYDPKLDTASALINEEL
>ENSG00000001617_203_wt
IFYLEPERLESGKGKCPYDPKLDTASALINEELYAG
>ENSG00000001617_203_mut
IFYLEPERLESGKGKCPYDSKLDTASALINEELYAG
>ENSG00000001617_249_wt
AIFRTLGKQTAMRTDQYNSRWLNDPSFIHAELIPDS
>ENSG00000001617_249_mut
AIFRTLGKQTAMRTDQYNFRWLNDPSFIHAELIPDS
>ENSG00000001617_263_wt
DQYNSRWLNDPSFIHAELIPDSAERNDDKLYFFFRE
>ENSG00000001617_263_mut
DQYNSRWLNDPSFIHAELISDSAERNDDKLYFFFRE
>ENSG00000001617_360_wt
FVQQTQDVRNPVIYAVFTSSGSVFRGSAVCVYSMAD
>ENSG00000001617_360_mut
FVQQTQDVRNPVIYAVFTFSGSVFRGSAVCVYSMAD
>ENSG00000001617_363_wt
QTQDVRNPVIYAVFTSSGSVFRGSAVCVYSMADIRM
>ENSG00000001617_363_mut
QTQDVRNPVIYAVFTSSGFVFRGSAVCVYSMADIRM
>ENSG00000001617_374_wt
AVFTSSGSVFRGSAVCVYSMADIRMVFNGPFAHKEG
>ENSG00000001617_374_mut
AVFTSSGSVFRGSAVCVYFMADIRMVFNGPFAHKEG
>ENSG00000001617_422_wt
PYPRPGTCPGGTFTPSMKSTKDYPDEVINFMRSHPL
>ENSG00000001617_422_mut
PYPRPGTCPGGTFTPSMKFTKDYPDEVINFMRSHPL
>ENSG00000001617_426_wt
PGTCPGGTFTPSMKSTKDYPDEVINFMRSHPLMYQA
>ENSG00000001617_426_mut
PGTCPGGTFTPSMKSTKDYSDEVINFMRSHPLMYQA
>ENSG00000001617_458_wt
MYQAVYPLQRRPLVVRTGAPYRLTTIAVDQVDAADG
>ENSG00000001617_458_mut
MYQAVYPLQRRPLVVRTGASYRLTTIAVDQVDAADG
>ENSG00000001617_515_wt
KDDQELEELMLEEVEVFKDPAPVKTMTISSKRQQLY
>ENSG00000001617_515_mut
KDDQELEELMLEEVEVFKDSAPVKTMTISSKRQQLY
>ENSG00000001617_573_wt
ADCCLARDPYCAWDGQACSRYTASSKRRSRRQDVRH
>ENSG00000001617_573_mut
ADCCLARDPYCAWDGQACFRYTASSKRRSRRQDVRH
>ENSG00000001617_578_wt
ARDPYCAWDGQACSRYTASSKRRSRRQDVRHGNPIR
>ENSG00000001617_578_mut
ARDPYCAWDGQACSRYTAFSKRRSRRQDVRHGNPIR
>ENSG00000001617_579_wt
RDPYCAWDGQACSRYTASSKRRSRRQDVRHGNPIRQ
>ENSG00000001617_579_mut
RDPYCAWDGQACSRYTASFKRRSRRQDVRHGNPIRQ
>ENSG00000001617_641_wt
LECQPRSPQATVKWLFQRDPGDRRREIRAEDRFLRT
>ENSG00000001617_641_mut
LECQPRSPQATVKWLFQRDSGDRRREIRAEDRFLRT
>ENSG00000001617_676_wt
TEQGLLLRALQLSDRGLYSCTATENNFKHVVTRVQL
>ENSG00000001617_676_mut
TEQGLLLRALQLSDRGLYFCTATENNFKHVVTRVQL
>ENSG00000001617_710_wt
QLHVLGRDAVHAALFPPLSMSAPPPPGAGPPTPPYQ
>ENSG00000001617_710_mut
QLHVLGRDAVHAALFPPLFMSAPPPPGAGPPTPPYQ
>ENSG00000001617_759_wt
IHQYCQGYWRHVPPSPREAPGAPRSPEPQDQKKPRN
>ENSG00000001617_759_mut
IHQYCQGYWRHVPPSPREASGAPRSPEPQDQKKPRN
>ENSG00000001617_765_wt
GYWRHVPPSPREAPGAPRSPEPQDQKKPRNRRHHPP
>ENSG00000001617_765_mut
GYWRHVPPSPREAPGAPRFPEPQDQKKPRNRRHHPP
>ENSG00000001617_210_wt
RLESGKGKCPYDPKLDTASALINEELYAGVYIDFMG
>ENSG00000001617_210_mut
RLESGKGKCPYDPKLDTALALINEELYAGVYIDFMG
>ENSG00000001617_256_wt
KQTAMRTDQYNSRWLNDPSFIHAELIPDSAERNDDK
>ENSG00000001617_256_mut
KQTAMRTDQYNSRWLNDPLFIHAELIPDSAERNDDK
>ENSG00000001617_282_wt
PDSAERNDDKLYFFFRERSAEAPQSPAVYARIGRIC
>ENSG00000001617_282_mut
PDSAERNDDKLYFFFRERLAEAPQSPAVYARIGRIC
>ENSG00000001617_378_wt
SSGSVFRGSAVCVYSMADIRMVFNGPFAHKEGPNYQ
>ENSG00000001617_378_mut
SSGSVFRGSAVCVYSMADICMVFNGPFAHKEGPNYQ
>ENSG00000001617_629_wt
SVQYGVAGSAAFLECQPRSPQATVKWLFQRDPGDRR
>ENSG00000001617_629_mut
SVQYGVAGSAAFLECQPRLPQATVKWLFQRDPGDRR
>ENSG00000001617_649_wt
QATVKWLFQRDPGDRRREIRAEDRFLRTEQGLLLRA
>ENSG00000001617_649_mut
QATVKWLFQRDPGDRRREICAEDRFLRTEQGLLLRA
>ENSG00000001617_671_wt
DRFLRTEQGLLLRALQLSDRGLYSCTATENNFKHVV
>ENSG00000001617_671_mut
DRFLRTEQGLLLRALQLSDCGLYSCTATENNFKHVV
>ENSG00000001617_4_wt
MLVAGLLLWASLLTGAWPSFPT
>ENSG00000001617_4_mut
MLVAGFLLWASLLTGAWPSFPT
>ENSG00000001617_5_wt
MLVAGLLLWASLLTGAWPSFPTQ
>ENSG00000001617_5_mut
MLVAGLFLWASLLTGAWPSFPTQ
>ENSG00000001617_6_wt
MLVAGLLLWASLLTGAWPSFPTQD
>ENSG00000001617_6_mut
MLVAGLLFWASLLTGAWPSFPTQD
>ENSG00000001617_324_wt
HCCLVNKWSTFLKARLVCSVPGEDGIETHFDELQDV
>ENSG00000001617_324_mut
HCCLVNKWSTFLKARLVCFVPGEDGIETHFDELQDV
>ENSG00000001617_361_wt
VQQTQDVRNPVIYAVFTSSGSVFRGSAVCVYSMADI
>ENSG00000001617_361_mut
VQQTQDVRNPVIYAVFTSFGSVFRGSAVCVYSMADI
>ENSG00000001617_368_wt
RNPVIYAVFTSSGSVFRGSAVCVYSMADIRMVFNGP
>ENSG00000001617_368_mut
RNPVIYAVFTSSGSVFRGFAVCVYSMADIRMVFNGP
>ENSG00000001617_419_wt
GKMPYPRPGTCPGGTFTPSMKSTKDYPDEVINFMRS
>ENSG00000001617_419_mut
GKMPYPRPGTCPGGTFTPFMKSTKDYPDEVINFMRS
>ENSG00000001617_525_wt
LEEVEVFKDPAPVKTMTISSKRQQLYVASAVGVTHL
>ENSG00000001617_525_mut
LEEVEVFKDPAPVKTMTIFSKRQQLYVASAVGVTHL
>ENSG00000001617_526_wt
EEVEVFKDPAPVKTMTISSKRQQLYVASAVGVTHLS
>ENSG00000001617_526_mut
EEVEVFKDPAPVKTMTISFKRQQLYVASAVGVTHLS
>ENSG00000001617_611_wt
PIRQCRGFNSNANKNAVESVQYGVAGSAAFLECQPR
>ENSG00000001617_611_mut
PIRQCRGFNSNANKNAVEFVQYGVAGSAAFLECQPR
>ENSG00000006611_32_wt
FLIENDAEKDYLYDVLRMYHQTMDVAVLVGDLKLVI
>ENSG00000006611_32_mut
FLIENDAEKDYLYDVLRMYYQTMDVAVLVGDLKLVI
>ENSG00000006611_66_wt
VINEPSRLPLFDAIRPLIPLKHQVEYDQLTPRRSRK
>ENSG00000006611_66_mut
VINEPSRLPLFDAIRPLILLKHQVEYDQLTPRRSRK
>ENSG00000006611_114_wt
GLGLSVRGGLEFGCGLFISHLIKGGQADSVGLQVGD
>ENSG00000006611_114_mut
GLGLSVRGGLEFGCGLFIFHLIKGGQADSVGLQVGD
>ENSG00000006611_145_wt
LQVGDEIVRINGYSISSCTHEEVINLIRTKKTVSIK
>ENSG00000006611_145_mut
LQVGDEIVRINGYSISSCTYEEVINLIRTKKTVSIK
>ENSG00000006611_238_wt
GLGCSISSGPIQKPGIFISHVKPGSLSAEVGLEIGD
>ENSG00000006611_238_mut
GLGCSISSGPIQKPGIFISYVKPGSLSAEVGLEIGD
>ENSG00000006611_395_wt
WGSKEQLLLPKTITAEVHPVPLRKPKSFGWFYRYDG
>ENSG00000006611_395_mut
WGSKEQLLLPKTITAEVHLVPLRKPKSFGWFYRYDG
>ENSG00000006611_401_wt
LLLPKTITAEVHPVPLRKPKSFGWFYRYDGKFPTIR
>ENSG00000006611_401_mut
LLLPKTITAEVHPVPLRKLKSFGWFYRYDGKFPTIR
>ENSG00000006611_415_wt
PLRKPKSFGWFYRYDGKFPTIRKKGKDKKKAKYGSL
>ENSG00000006611_415_mut
PLRKPKSFGWFYRYDGKFLTIRKKGKDKKKAKYGSL
>ENSG00000006611_526_wt
SEMTTGPPPPPPSVSPLAPPLRRFAGGLHLHTTDLD
>ENSG00000006611_526_mut
SEMTTGPPPPPPSVSPLALPLRRFAGGLHLHTTDLD
>ENSG00000006611_564_wt
PLDMFYYPPKTPSALPVMPHPPPSNPPHKVPAPPVL
>ENSG00000006611_564_mut
PLDMFYYPPKTPSALPVMLHPPPSNPPHKVPAPPVL
>ENSG00000006611_567_wt
MFYYPPKTPSALPVMPHPPPSNPPHKVPAPPVLPLS
>ENSG00000006611_567_mut
MFYYPPKTPSALPVMPHPLPSNPPHKVPAPPVLPLS
>ENSG00000006611_571_wt
PPKTPSALPVMPHPPPSNPPHKVPAPPVLPLSGHVS
>ENSG00000006611_571_mut
PPKTPSALPVMPHPPPSNLPHKVPAPPVLPLSGHVS
>ENSG00000006611_572_wt
PKTPSALPVMPHPPPSNPPHKVPAPPVLPLSGHVSA
>ENSG00000006611_572_mut
PKTPSALPVMPHPPPSNPLHKVPAPPVLPLSGHVSA
>ENSG00000006611_585_wt
PPSNPPHKVPAPPVLPLSGHVSASSSPWVQRTPPPI
>ENSG00000006611_585_mut
PPSNPPHKVPAPPVLPLSGYVSASSSPWVQRTPPPI
>ENSG00000006611_593_wt
VPAPPVLPLSGHVSASSSPWVQRTPPPIPIPPPPSV
>ENSG00000006611_593_mut
VPAPPVLPLSGHVSASSSLWVQRTPPPIPIPPPPSV
>ENSG00000006611_599_wt
LPLSGHVSASSSPWVQRTPPPIPIPPPPSVPTQDLT
>ENSG00000006611_599_mut
LPLSGHVSASSSPWVQRTLPPIPIPPPPSVPTQDLT
>ENSG00000006611_608_wt
SSSPWVQRTPPPIPIPPPPSVPTQDLTPTRPLPSAL
>ENSG00000006611_608_mut
SSSPWVQRTPPPIPIPPPLSVPTQDLTPTRPLPSAL
>ENSG00000006611_620_wt
IPIPPPPSVPTQDLTPTRPLPSALEEALSNHPFRTG
>ENSG00000006611_620_mut
IPIPPPPSVPTQDLTPTRLLPSALEEALSNHPFRTG
>ENSG00000006611_642_wt
ALEEALSNHPFRTGDTGNPVEDWEAKNHSGKPTNSP
>ENSG00000006611_642_mut
ALEEALSNHPFRTGDTGNLVEDWEAKNHSGKPTNSP
>ENSG00000006611_666_wt
AKNHSGKPTNSPVPEQSFPPTPKTFCPSPQPPRGPG
>ENSG00000006611_666_mut
AKNHSGKPTNSPVPEQSFLPTPKTFCPSPQPPRGPG
>ENSG00000006611_669_wt
HSGKPTNSPVPEQSFPPTPKTFCPSPQPPRGPGVST
>ENSG00000006611_669_mut
HSGKPTNSPVPEQSFPPTLKTFCPSPQPPRGPGVST
>ENSG00000006611_674_wt
TNSPVPEQSFPPTPKTFCPSPQPPRGPGVSTISKPV
>ENSG00000006611_674_mut
TNSPVPEQSFPPTPKTFCLSPQPPRGPGVSTISKPV
>ENSG00000006611_676_wt
SPVPEQSFPPTPKTFCPSPQPPRGPGVSTISKPVMV
>ENSG00000006611_676_mut
SPVPEQSFPPTPKTFCPSLQPPRGPGVSTISKPVMV
>ENSG00000006611_679_wt
PEQSFPPTPKTFCPSPQPPRGPGVSTISKPVMVHQE
>ENSG00000006611_679_mut
PEQSFPPTPKTFCPSPQPLRGPGVSTISKPVMVHQE
>ENSG00000006611_693_wt
SPQPPRGPGVSTISKPVMVHQEPNFIYRPAVKSEVL
>ENSG00000006611_693_mut
SPQPPRGPGVSTISKPVMVYQEPNFIYRPAVKSEVL
>ENSG00000006611_703_wt
STISKPVMVHQEPNFIYRPAVKSEVLPQEMLKRMVV
>ENSG00000006611_703_mut
STISKPVMVHQEPNFIYRLAVKSEVLPQEMLKRMVV
>ENSG00000006611_711_wt
VHQEPNFIYRPAVKSEVLPQEMLKRMVVYQTAFRQD
>ENSG00000006611_711_mut
VHQEPNFIYRPAVKSEVLLQEMLKRMVVYQTAFRQD
>ENSG00000006611_744_wt
RQDFRKYEEGFDPYSMFTPEQIMGKDVRLLRIKKEG
>ENSG00000006611_744_mut
RQDFRKYEEGFDPYSMFTLEQIMGKDVRLLRIKKEG
>ENSG00000006611_841_wt
KAWNQGGDWIDLVVAVCPPKEYDDELASLPSSVAES
>ENSG00000006611_841_mut
KAWNQGGDWIDLVVAVCPLKEYDDELASLPSSVAES
>ENSG00000006611_52_wt
QTMDVAVLVGDLKLVINEPSRLPLFDAIRPLIPLKH
>ENSG00000006611_52_mut
QTMDVAVLVGDLKLVINELSRLPLFDAIRPLIPLKH
>ENSG00000006611_65_wt
LVINEPSRLPLFDAIRPLIPLKHQVEYDQLTPRRSR
>ENSG00000006611_65_mut
LVINEPSRLPLFDAIRPLISLKHQVEYDQLTPRRSR
>ENSG00000006611_77_wt
DAIRPLIPLKHQVEYDQLTPRRSRKLKEVRLDRLHP
>ENSG00000006611_77_mut
DAIRPLIPLKHQVEYDQLTSRRSRKLKEVRLDRLHP
>ENSG00000006611_78_wt
AIRPLIPLKHQVEYDQLTPRRSRKLKEVRLDRLHPE
>ENSG00000006611_78_mut
AIRPLIPLKHQVEYDQLTLRRSRKLKEVRLDRLHPE
>ENSG00000006611_93_wt
QLTPRRSRKLKEVRLDRLHPEGLGLSVRGGLEFGCG
>ENSG00000006611_93_mut
QLTPRRSRKLKEVRLDRLHSEGLGLSVRGGLEFGCG
>ENSG00000006611_94_wt
LTPRRSRKLKEVRLDRLHPEGLGLSVRGGLEFGCGL
>ENSG00000006611_94_mut
LTPRRSRKLKEVRLDRLHLEGLGLSVRGGLEFGCGL
>ENSG00000006611_169_wt
NLIRTKKTVSIKVRHIGLIPVKSSPDEPLTWQYVDQ
>ENSG00000006611_169_mut
NLIRTKKTVSIKVRHIGLISVKSSPDEPLTWQYVDQ
>ENSG00000006611_170_wt
LIRTKKTVSIKVRHIGLIPVKSSPDEPLTWQYVDQF
>ENSG00000006611_170_mut
LIRTKKTVSIKVRHIGLILVKSSPDEPLTWQYVDQF
>ENSG00000006611_178_wt
SIKVRHIGLIPVKSSPDEPLTWQYVDQFVSESGGVR
>ENSG00000006611_178_mut
SIKVRHIGLIPVKSSPDEPFTWQYVDQFVSESGGVR
>ENSG00000006611_200_wt
QYVDQFVSESGGVRGSLGSPGNRENKEKKVFISLVG
>ENSG00000006611_200_mut
QYVDQFVSESGGVRGSLGFPGNRENKEKKVFISLVG
>ENSG00000006611_228_wt
KVFISLVGSRGLGCSISSGPIQKPGIFISHVKPGSL
>ENSG00000006611_228_mut
KVFISLVGSRGLGCSISSGSIQKPGIFISHVKPGSL
>ENSG00000006611_229_wt
VFISLVGSRGLGCSISSGPIQKPGIFISHVKPGSLS
>ENSG00000006611_229_mut
VFISLVGSRGLGCSISSGLIQKPGIFISHVKPGSLS
>ENSG00000006611_394_wt
DWGSKEQLLLPKTITAEVHPVPLRKPKSFGWFYRYD
>ENSG00000006611_394_mut
DWGSKEQLLLPKTITAEVHSVPLRKPKSFGWFYRYD
>ENSG00000006611_397_wt
SKEQLLLPKTITAEVHPVPLRKPKSFGWFYRYDGKF
>ENSG00000006611_397_mut
SKEQLLLPKTITAEVHPVPFRKPKSFGWFYRYDGKF
>ENSG00000006611_414_wt
VPLRKPKSFGWFYRYDGKFPTIRKKGKDKKKAKYGS
>ENSG00000006611_414_mut
VPLRKPKSFGWFYRYDGKFSTIRKKGKDKKKAKYGS
>ENSG00000006611_514_wt
RLEQISSADNEISEMTTGPPPPPPSVSPLAPPLRRF
>ENSG00000006611_514_mut
RLEQISSADNEISEMTTGPSPPPPSVSPLAPPLRRF
>ENSG00000006611_517_wt
QISSADNEISEMTTGPPPPPPSVSPLAPPLRRFAGG
>ENSG00000006611_517_mut
QISSADNEISEMTTGPPPPSPSVSPLAPPLRRFAGG
>ENSG00000006611_523_wt
NEISEMTTGPPPPPPSVSPLAPPLRRFAGGLHLHTT
>ENSG00000006611_523_mut
NEISEMTTGPPPPPPSVSLLAPPLRRFAGGLHLHTT
>ENSG00000006611_525_wt
ISEMTTGPPPPPPSVSPLAPPLRRFAGGLHLHTTDL
>ENSG00000006611_525_mut
ISEMTTGPPPPPPSVSPLASPLRRFAGGLHLHTTDL
>ENSG00000006611_527_wt
EMTTGPPPPPPSVSPLAPPLRRFAGGLHLHTTDLDD
>ENSG00000006611_527_mut
EMTTGPPPPPPSVSPLAPLLRRFAGGLHLHTTDLDD
>ENSG00000006611_545_wt
PLRRFAGGLHLHTTDLDDIPLDMFYYPPKTPSALPV
>ENSG00000006611_545_mut
PLRRFAGGLHLHTTDLDDISLDMFYYPPKTPSALPV
>ENSG00000006611_553_wt
LHLHTTDLDDIPLDMFYYPPKTPSALPVMPHPPPSN
>ENSG00000006611_553_mut
LHLHTTDLDDIPLDMFYYPSKTPSALPVMPHPPPSN
>ENSG00000006611_554_wt
HLHTTDLDDIPLDMFYYPPKTPSALPVMPHPPPSNP
>ENSG00000006611_554_mut
HLHTTDLDDIPLDMFYYPLKTPSALPVMPHPPPSNP
>ENSG00000006611_557_wt
TTDLDDIPLDMFYYPPKTPSALPVMPHPPPSNPPHK
>ENSG00000006611_557_mut
TTDLDDIPLDMFYYPPKTLSALPVMPHPPPSNPPHK
>ENSG00000006611_565_wt
LDMFYYPPKTPSALPVMPHPPPSNPPHKVPAPPVLP
>ENSG00000006611_565_mut
LDMFYYPPKTPSALPVMPHSPPSNPPHKVPAPPVLP
>ENSG00000006611_568_wt
FYYPPKTPSALPVMPHPPPSNPPHKVPAPPVLPLSG
>ENSG00000006611_568_mut
FYYPPKTPSALPVMPHPPLSNPPHKVPAPPVLPLSG
>ENSG00000006611_575_wt
PSALPVMPHPPPSNPPHKVPAPPVLPLSGHVSASSS
>ENSG00000006611_575_mut
PSALPVMPHPPPSNPPHKVSAPPVLPLSGHVSASSS
>ENSG00000006611_578_wt
LPVMPHPPPSNPPHKVPAPPVLPLSGHVSASSSPWV
>ENSG00000006611_578_mut
LPVMPHPPPSNPPHKVPAPSVLPLSGHVSASSSPWV
>ENSG00000006611_582_wt
PHPPPSNPPHKVPAPPVLPLSGHVSASSSPWVQRTP
>ENSG00000006611_582_mut
PHPPPSNPPHKVPAPPVLLLSGHVSASSSPWVQRTP
>ENSG00000006611_600_wt
PLSGHVSASSSPWVQRTPPPIPIPPPPSVPTQDLTP
>ENSG00000006611_600_mut
PLSGHVSASSSPWVQRTPPSIPIPPPPSVPTQDLTP
>ENSG00000006611_601_wt
LSGHVSASSSPWVQRTPPPIPIPPPPSVPTQDLTPT
>ENSG00000006611_601_mut
LSGHVSASSSPWVQRTPPLIPIPPPPSVPTQDLTPT
>ENSG00000006611_603_wt
GHVSASSSPWVQRTPPPIPIPPPPSVPTQDLTPTRP
>ENSG00000006611_603_mut
GHVSASSSPWVQRTPPPILIPPPPSVPTQDLTPTRP
>ENSG00000006611_604_wt
HVSASSSPWVQRTPPPIPIPPPPSVPTQDLTPTRPL
>ENSG00000006611_604_mut
HVSASSSPWVQRTPPPIPISPPPSVPTQDLTPTRPL
>ENSG00000006611_606_wt
SASSSPWVQRTPPPIPIPPPPSVPTQDLTPTRPLPS
>ENSG00000006611_606_mut
SASSSPWVQRTPPPIPIPPSPSVPTQDLTPTRPLPS
>ENSG00000006611_611_wt
PWVQRTPPPIPIPPPPSVPTQDLTPTRPLPSALEEA
>ENSG00000006611_611_mut
PWVQRTPPPIPIPPPPSVLTQDLTPTRPLPSALEEA
>ENSG00000006611_617_wt
PPPIPIPPPPSVPTQDLTPTRPLPSALEEALSNHPF
>ENSG00000006611_617_mut
PPPIPIPPPPSVPTQDLTLTRPLPSALEEALSNHPF
>ENSG00000006611_619_wt
PIPIPPPPSVPTQDLTPTRPLPSALEEALSNHPFRT
>ENSG00000006611_619_mut
PIPIPPPPSVPTQDLTPTRSLPSALEEALSNHPFRT
>ENSG00000006611_622_wt
IPPPPSVPTQDLTPTRPLPSALEEALSNHPFRTGDT
>ENSG00000006611_622_mut
IPPPPSVPTQDLTPTRPLLSALEEALSNHPFRTGDT
>ENSG00000006611_633_wt
LTPTRPLPSALEEALSNHPFRTGDTGNPVEDWEAKN
>ENSG00000006611_633_mut
LTPTRPLPSALEEALSNHLFRTGDTGNPVEDWEAKN
>ENSG00000006611_655_wt
GDTGNPVEDWEAKNHSGKPTNSPVPEQSFPPTPKTF
>ENSG00000006611_655_mut
GDTGNPVEDWEAKNHSGKLTNSPVPEQSFPPTPKTF
>ENSG00000006611_658_wt
GNPVEDWEAKNHSGKPTNSPVPEQSFPPTPKTFCPS
>ENSG00000006611_658_mut
GNPVEDWEAKNHSGKPTNFPVPEQSFPPTPKTFCPS
>ENSG00000006611_660_wt
PVEDWEAKNHSGKPTNSPVPEQSFPPTPKTFCPSPQ
>ENSG00000006611_660_mut
PVEDWEAKNHSGKPTNSPVSEQSFPPTPKTFCPSPQ
>ENSG00000006611_665_wt
EAKNHSGKPTNSPVPEQSFPPTPKTFCPSPQPPRGP
>ENSG00000006611_665_mut
EAKNHSGKPTNSPVPEQSFSPTPKTFCPSPQPPRGP
>ENSG00000006611_667_wt
KNHSGKPTNSPVPEQSFPPTPKTFCPSPQPPRGPGV
>ENSG00000006611_667_mut
KNHSGKPTNSPVPEQSFPLTPKTFCPSPQPPRGPGV
>ENSG00000006611_668_wt
NHSGKPTNSPVPEQSFPPTPKTFCPSPQPPRGPGVS
>ENSG00000006611_668_mut
NHSGKPTNSPVPEQSFPPTSKTFCPSPQPPRGPGVS
>ENSG00000006611_673_wt
PTNSPVPEQSFPPTPKTFCPSPQPPRGPGVSTISKP
>ENSG00000006611_673_mut
PTNSPVPEQSFPPTPKTFCSSPQPPRGPGVSTISKP
>ENSG00000006611_675_wt
NSPVPEQSFPPTPKTFCPSPQPPRGPGVSTISKPVM
>ENSG00000006611_675_mut
NSPVPEQSFPPTPKTFCPSSQPPRGPGVSTISKPVM
>ENSG00000006611_681_wt
QSFPPTPKTFCPSPQPPRGPGVSTISKPVMVHQEPN
>ENSG00000006611_681_mut
QSFPPTPKTFCPSPQPPRGSGVSTISKPVMVHQEPN
>ENSG00000006611_737_wt
VVYQTAFRQDFRKYEEGFDPYSMFTPEQIMGKDVRL
>ENSG00000006611_737_mut
VVYQTAFRQDFRKYEEGFDSYSMFTPEQIMGKDVRL

"""

import mhcflurry.fasta

with open("temp.fa", "w") as fd:
    fd.write(proteins_fasta)

proteins = mhcflurry.fasta.read_fasta_to_dataframe("temp.fa").set_index("sequence_id")
proteins

sequence_id,sequence
ENSG00000004776_71_wt,YLRAPSVALPVAQVPTDPGHFSVLLDVKHFSPEEIA
ENSG00000004776_71_mut,YLRAPSVALPVAQVPTDPGYFSVLLDVKHFSPEEIA
ENSG00000004776_101_wt,SPEEIAVKVVGEHVEVHARHEERPDEHGFVAREFHR
ENSG00000004776_101_mut,SPEEIAVKVVGEHVEVHARYEERPDEHGFVAREFHR
ENSG00000004776_116_wt,VHARHEERPDEHGFVAREFHRRYRLPPGVDPAAVTS
...,...
ENSG00000006611_675_mut,NSPVPEQSFPPTPKTFCPSSQPPRGPGVSTISKPVM
ENSG00000006611_681_wt,QSFPPTPKTFCPSPQPPRGPGVSTISKPVMVHQEPN
ENSG00000006611_681_mut,QSFPPTPKTFCPSPQPPRGSGVSTISKPVMVHQEPN
ENSG00000006611_737_wt,VVYQTAFRQDFRKYEEGFDPYSMFTPEQIMGKDVRL
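
Each `_wt`/`_mut` pair in the table above differs at a single residue. As a sanity check before prediction, the substitution can be located by comparing the two windows position by position. The following is a minimal sketch using only the standard library; the helper name `find_substitution` is hypothetical and not part of MHCflurry, and it assumes each pair has equal length and exactly one differing residue.

```python
def find_substitution(wt, mut):
    """Return (index, wt_residue, mut_residue) for a single-residue substitution.

    Hypothetical helper: assumes both windows have the same length and
    differ at exactly one position (0-based index into the window).
    """
    if len(wt) != len(mut):
        raise ValueError("wt and mut windows must have the same length")
    diffs = [(i, a, b) for i, (a, b) in enumerate(zip(wt, mut)) if a != b]
    if len(diffs) != 1:
        raise ValueError(f"expected exactly one difference, found {len(diffs)}")
    return diffs[0]


# Windows copied from the ENSG00000006611_603 records above
wt_window = "GHVSASSSPWVQRTPPPIPIPPPPSVPTQDLTPTRP"
mut_window = "GHVSASSSPWVQRTPPPILIPPPPSVPTQDLTPTRP"

index, wt_residue, mut_residue = find_substitution(wt_window, mut_window)
print(index, wt_residue, mut_residue)
```

With the `proteins` dataframe loaded, the same check could be run per pair, for example `find_substitution(proteins.sequence["ENSG00000006611_603_wt"], proteins.sequence["ENSG00000006611_603_mut"])`.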


In [None]:
# List the alleles supported by the loaded predictor
all_alleles = predictor.supported_alleles

In [None]:
# Define alleles for each sample
alleles = {
    "my-sample": [
        'HLA-A*01:01',
        'HLA-A*02:01',
        'HLA-A*02:03',
        'HLA-A*02:06',
        'HLA-A*03:01',
        'HLA-A*11:01',
        'HLA-A*23:01',
        'HLA-A*24:02',
        'HLA-A*26:01',
        'HLA-A*30:01',
        'HLA-A*30:02',
        'HLA-A*31:01',
        'HLA-A*32:01',
        'HLA-A*33:01',
        'HLA-A*68:01',
        'HLA-A*68:02',
        'HLA-B*07:02',
        'HLA-B*08:01',
        'HLA-B*15:01',
        'HLA-B*35:01',
        'HLA-B*40:01',
        'HLA-B*44:02',
        'HLA-B*44:03',
        'HLA-B*51:01',
        'HLA-B*53:01',
        'HLA-B*57:01',
        'HLA-B*58:01',
    ],
    # Compact allele names can be used instead, e.g.
    # ["A0201", "A0301", "B0702", "C0802"]
}
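
Hand-typed allele lists can pick up repeated entries. The sketch below shows one way to report and drop repeats while keeping the original order. The helper names `find_duplicates` and `dedupe_alleles` are hypothetical and not part of MHCflurry, and the example list is illustrative.

```python
def find_duplicates(alleles):
    """Return alleles that occur more than once, in first-repeat order."""
    seen = set()
    duplicates = []
    for allele in alleles:
        if allele in seen and allele not in duplicates:
            duplicates.append(allele)
        seen.add(allele)
    return duplicates


def dedupe_alleles(alleles):
    """Return alleles with repeats removed, keeping first-seen order."""
    # dict keys preserve insertion order in Python 3.7+
    return list(dict.fromkeys(alleles))


# Illustrative list with one repeated entry
example = ["HLA-B*07:02", "HLA-B*08:01", "HLA-B*08:01", "HLA-B*15:01"]
duplicates = find_duplicates(example)
cleaned = dedupe_alleles(example)
print(duplicates)
print(cleaned)
```

Applied to the dictionary above, this would look like `alleles["my-sample"] = dedupe_alleles(alleles["my-sample"])`.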

In [None]:
# Predict across protein sequences and return peptides with predicted affinity
# less than 500 nM.
results2 = predictor.predict_sequences(
    sequences=proteins.sequence.to_dict(),
    alleles=alleles,
    result="filtered",
    comparison_quantity="affinity",
    filter_value=500)
results2

Predicting processing.


100%|██████████| 1/1 [01:34<00:00, 94.54s/it]


Predicting affinities.


100%|██████████| 27/27 [16:51<00:00, 37.48s/it]


,sequence_name,pos,peptide,n_flank,c_flank,sample_name,affinity,best_allele,affinity_percentile,processing_score,presentation_score,presentation_percentile
0,ENSG00000006611_603_mut,17,ILIPPPPSV,RTPPP,PTQDL,my-sample,12.591606,HLA-A*02:03,0.039000,0.269313,0.938885,0.064321
1,ENSG00000001617_535_mut,17,ALAVGVTHL,QQLYV,SLHRC,my-sample,12.727417,HLA-A*02:03,0.042375,0.191520,0.918766,0.093152
2,ENSG00000004776_83_mut,10,VLLDVKHFL,PGHFS,PEEIA,my-sample,13.013693,HLA-A*02:03,0.047250,0.489466,0.971721,0.017663
3,ENSG00000001617_154_mut,17,TLWTQTQAV,RRAQA,RGRGS,my-sample,13.309364,HLA-A*02:03,0.052500,0.892469,0.993615,0.000489
4,ENSG00000001617_438_wt,19,LMYQAVYPL,MRSHP,QRRPL,my-sample,13.795660,HLA-A*02:01,0.031250,0.002550,0.835987,0.221250
...,...,...,...,...,...,...,...,...,...,...,...,...
9545,ENSG00000006611_394_mut,5,EQLLLPKTI,DWGSK,TAEVH,my-sample,499.825613,HLA-B*40:01,0.681000,0.484128,0.491078,0.839918
9546,ENSG00000006611_397_mut,2,EQLLLPKTI,SK,TAEVH,my-sample,499.825613,HLA-B*40:01,0.681000,0.700257,0.686999,0.459891
9547,ENSG00000006611_394_wt,5,EQLLLPKTI,DWGSK,TAEVH,my-sample,499.825613,HLA-B*40:01,0.681000,0.484128,0.491078,0.839918
9548,ENSG00000006611_395_wt,4,EQLLLPKTI,WGSK,TAEVH,my-sample,499.825613,HLA-B*40:01,0.681000,0.664050,0.656661,0.512826
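
The filtered table mixes peptides from `_wt` and `_mut` windows, and a peptide that does not span the substitution can appear under both (as `EQLLLPKTI` does above). One possible next step is to keep only the peptides that occur in mutant windows and never in a wild-type window. The sketch below assumes the `<gene>_<position>_<wt|mut>` naming used in this notebook; the helpers `split_name` and `mutant_specific` are hypothetical, and the rows are a small subset of columns copied from the table above.

```python
def split_name(sequence_name):
    """Split a name like 'ENSG00000006611_603_mut' into (gene, position, kind)."""
    gene, position, kind = sequence_name.rsplit("_", 2)
    return gene, int(position), kind


def mutant_specific(rows):
    """Keep rows from '_mut' sequences whose peptide never occurs in a '_wt' row.

    `rows` is an iterable of dicts with 'sequence_name' and 'peptide' keys.
    """
    rows = list(rows)
    wt_peptides = {
        row["peptide"]
        for row in rows
        if split_name(row["sequence_name"])[2] == "wt"
    }
    return [
        row
        for row in rows
        if split_name(row["sequence_name"])[2] == "mut"
        and row["peptide"] not in wt_peptides
    ]


# Subset of rows (two columns only) taken from the results table above
rows = [
    {"sequence_name": "ENSG00000006611_603_mut", "peptide": "ILIPPPPSV"},
    {"sequence_name": "ENSG00000006611_394_mut", "peptide": "EQLLLPKTI"},
    {"sequence_name": "ENSG00000006611_394_wt", "peptide": "EQLLLPKTI"},
]

candidates = mutant_specific(rows)
print([row["peptide"] for row in candidates])
```

On the full results, the rows could be supplied as `results2.to_dict("records")`.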


In [None]:
# Download results
results2.to_csv('mhcflurry-results.csv')
files.download('mhcflurry-results.csv')


In [None]:
# See help for more options:
help(predictor.predict_sequences)

Help on method predict_sequences in module mhcflurry.class1_presentation_predictor:

predict_sequences(sequences, alleles, result='best', comparison_quantity=None, filter_value=None, peptide_lengths=(8, 9, 10, 11), use_flanks=True, include_affinity_percentile=True, verbose=1, throw=True) method of mhcflurry.class1_presentation_predictor.Class1PresentationPredictor instance
    Predict presentation across protein sequences.
    
    Example:
    
    >>> predictor = Class1PresentationPredictor.load()
    >>> predictor.predict_sequences(
    ...    sequences={
    ...        'protein1': "MDSKGSSQKGSRLLLLLVVSNLL",
    ...        'protein2': "SSLPTPEDKEQAQQTHH",
    ...    },
    ...    alleles={
    ...        "sample1": ["A0201", "A0301", "B0702"],
    ...        "sample2": ["A0101", "C0202"],
    ...    },
    ...    result="filtered",
    ...    comparison_quantity="affinity",
    ...    filter_value=500,
    ...    verbose=0)
      sequence_name  pos     peptide n_flank c_flank sample

In [None]:
!pip install biopython


Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting biopython
  Downloading biopython-1.79-cp37-cp37m-manylinux_2_5_x86_64.manylinux1_x86_64.whl (2.3 MB)
Installing collected packages: biopython
Successfully installed biopython-1.79


In [None]:
import Bio
