<a href="https://colab.research.google.com/github/jalew188/PeptDeep-HLA/blob/master/nbs/HLA1_transfer.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Transfer learning of sample-specific HLA-I models

> To enable a GPU in Colab, click `Runtime -> Change runtime type` and select a GPU hardware accelerator.

In [1]:
%pip install -q git+https://github.com/MannLabs/PeptDeep-HLA.git

In [2]:
%pip install -q gdown

In [3]:
import torch
if torch.cuda.is_available():
  # human fasta
  fasta_url = "https://drive.google.com/file/d/1V9KxDniKwZFZnHlP58EbjkuelNJnp1Kq/view?usp=share_link"
  fasta = 'UP000005640_human_reviewed.fasta'
else:
  # No GPU runtime in Colab: use the small iRT fusion peptide fasta for testing
  fasta_url = "https://drive.google.com/file/d/1MKGRBpzvmMW0l_hdPESo3j26EWd_yi8l/view?usp=share_link"
  fasta = 'irtfusion.fasta'

In [4]:
import gdown

gdown.download(fasta_url, fasta, fuzzy=True)

Downloading...
From: https://drive.google.com/uc?id=1MKGRBpzvmMW0l_hdPESo3j26EWd_yi8l
To: /content/irtfusion.fasta
100%|██████████| 174/174 [00:00<00:00, 180kB/s]


'irtfusion.fasta'
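
The `From:` line in the output above shows that the share link was resolved to a `uc?id=...` URL before downloading. The sketch below illustrates that mapping with a regular expression. It is not gdown's actual implementation, and `drive_direct_url` is a hypothetical helper named here only for illustration.

```python
import re


def drive_direct_url(share_url: str) -> str:
    """Turn a Drive 'file/d/<id>/view' share link into a 'uc?id=<id>' link.

    Hypothetical helper for illustration; gdown handles this internally
    when fuzzy=True is passed.
    """
    match = re.search(r"/file/d/([\w-]+)", share_url)
    if match is None:
        raise ValueError(f"No Drive file id found in: {share_url}")
    return f"https://drive.google.com/uc?id={match.group(1)}"


share_url = (
    "https://drive.google.com/file/d/"
    "1MKGRBpzvmMW0l_hdPESo3j26EWd_yi8l/view?usp=share_link"
)
direct_url = drive_direct_url(share_url)
print(direct_url)
```

With the iRT fusion link used in this notebook, the printed URL matches the `From:` line shown in the download output.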

#### Load training HLA peptides

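The training data is a text file (tsv/csv/txt) with the sample-specific HLA-I peptides in a `sequence` column. The loading cell below reads the file with `sep='\t'`, so for a comma-separated file change it to `sep=','`.

To upload the file, click `Files` (the folder icon) in the left panel of Colab.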
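
Before training, it can help to sanity-check the uploaded table. The following is a minimal sketch, not part of PeptDeep-HLA: `clean_peptide_table` is a hypothetical helper, and both the 8 to 14 residue window (taken from the peptide lengths that appear in this notebook's outputs) and the restriction to the 20 standard amino acids are assumptions, not documented requirements of the classifier.

```python
import pandas as pd

# The 20 standard amino acids (assumption: modified/ambiguous residues are dropped).
STANDARD_AA = set("ACDEFGHIKLMNPQRSTVWY")


def clean_peptide_table(df: pd.DataFrame, min_len: int = 8, max_len: int = 14) -> pd.DataFrame:
    """Return unique, upper-case peptides within the assumed length window."""
    if "sequence" not in df.columns:
        raise KeyError("expected a 'sequence' column")
    # Normalize whitespace and case before filtering.
    seqs = df["sequence"].dropna().astype(str).str.strip().str.upper()
    is_standard = seqs.map(lambda s: len(s) > 0 and set(s) <= STANDARD_AA)
    in_range = seqs.str.len().between(min_len, max_len)
    cleaned = pd.DataFrame({"sequence": seqs[is_standard & in_range]})
    return cleaned.drop_duplicates().reset_index(drop=True)


raw = pd.DataFrame(
    {"sequence": ["ACDEFGHIK", " acdefghik ", "ACDEFGHI", "ACDXFGHIK", "ACD", None]}
)
checked = clean_peptide_table(raw)
print(checked)
```

In this toy input, the lower-case duplicate, the peptide containing `X`, the 3-residue peptide, and the missing value are all removed.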

In [5]:
the_file_you_uploaded = ""

In [6]:
import pandas as pd

if the_file_you_uploaded:
  train_seq_df = pd.read_csv(the_file_you_uploaded, sep='\t')
else:
  train_seq_df = pd.DataFrame({
    'sequence': [
        'ACDEFGHIKLMNPQ',
        'ACDEFGHI',
        'ACDEFGHIK',
        'EFGHIKLMNPQ',
        'AHIKLMNPQ',
    ]
  })
train_seq_df['nAA'] = train_seq_df.sequence.str.len()
train_seq_df

|   | sequence | nAA |
|---|---|---|
| 0 | ACDEFGHIKLMNPQ | 14 |
| 1 | ACDEFGHI | 8 |
| 2 | ACDEFGHIK | 9 |
| 3 | EFGHIKLMNPQ | 11 |
| 4 | AHIKLMNPQ | 9 |


#### Initialize the model and load the pretrained model

In [7]:
from peptdeep_hla.HLA_class_I import HLA_Class_I_Classifier
model = HLA_Class_I_Classifier(
    fasta_files=[fasta]
)
model.get_parameter_num()

1669697

In [8]:
from peptdeep_hla.HLA_class_I import pretrained_HLA1
model.load(pretrained_HLA1)
pretrained_HLA1

'/usr/local/lib/python3.8/dist-packages/peptdeep_hla/pretrained_models/HLA1_IEDB.pt'

#### Train on the training peptides

Non-HLA peptides are automatically sampled from the fasta file and used as negative training data.
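
To make the idea of sampled negatives concrete, here is a rough sketch of one way such sampling could work: draw random subsequences from protein sequences and discard any that coincide with a known positive. This is an assumption for illustration only and does not claim to reproduce PeptDeep-HLA's internal sampler; `sample_negative_peptides`, its parameters, and the 8 to 14 length window are all hypothetical.

```python
import random


def sample_negative_peptides(proteins, positives, n, lengths=(8, 14), seed=0):
    """Draw up to n distinct random subsequences that are not in `positives`.

    Hypothetical illustration of negative sampling, not the library's code.
    """
    rng = random.Random(seed)
    positives = set(positives)
    negatives = set()
    # Cap the number of attempts so the loop always terminates.
    for _ in range(100 * n):
        if len(negatives) >= n:
            break
        protein = rng.choice(proteins)
        length = rng.randint(*lengths)  # inclusive on both ends
        if len(protein) < length:
            continue
        start = rng.randint(0, len(protein) - length)
        peptide = protein[start:start + length]
        if peptide not in positives:
            negatives.add(peptide)
    return sorted(negatives)


toy_proteins = ["ACDEFGHIKLMNPQRSTVWY" * 3]  # toy sequence, not real data
toy_positives = ["ACDEFGHI", "ACDEFGHIK"]
toy_negatives = sample_negative_peptides(toy_proteins, toy_positives, n=5)
print(toy_negatives)
```

The fixed seed keeps the toy example reproducible. In the notebook itself none of this is needed, because `model.train` handles the sampling.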

In [9]:
model.train(
    train_seq_df, 
    epoch=20, warmup_epoch=10, 
    verbose=True
)

[Training] Epoch=1, lr=1e-05, loss=1.5713386237621307
[Training] Epoch=2, lr=2e-05, loss=1.5816382467746735
[Training] Epoch=3, lr=3e-05, loss=1.5455440878868103
[Training] Epoch=4, lr=4e-05, loss=0.8761113584041595
[Training] Epoch=5, lr=5e-05, loss=0.5743134282529354
[Training] Epoch=6, lr=6e-05, loss=0.7115397229790688
[Training] Epoch=7, lr=7e-05, loss=0.8482447639107704
[Training] Epoch=8, lr=8e-05, loss=0.45883858948946
[Training] Epoch=9, lr=9e-05, loss=0.36174119263887405
[Training] Epoch=10, lr=0.0001, loss=0.21440323255956173
[Training] Epoch=11, lr=9.755282581475769e-05, loss=0.08296080213040113
[Training] Epoch=12, lr=9.045084971874738e-05, loss=0.03662086511030793
[Training] Epoch=13, lr=7.938926261462366e-05, loss=0.10513292765244842
[Training] Epoch=14, lr=6.545084971874738e-05, loss=0.03265547938644886
[Training] Epoch=15, lr=5e-05, loss=0.08746509533375502
[Training] Epoch=16, lr=3.4549150281252636e-05, loss=0.007541837519966066
[Training] Epoch=17, lr=2.06107373853763
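
The `lr` values in the log are consistent with a linear warmup to a peak of `1e-4` over the 10 warmup epochs, followed by a cosine decay over the remaining 10 epochs. The sketch below reproduces the printed values under that reading. The formula and the peak rate are inferred from the log, not taken from the library source, and `lr_at_epoch` is a hypothetical helper.

```python
import math


def lr_at_epoch(epoch, peak_lr=1e-4, warmup_epoch=10, total_epoch=20):
    """Learning rate for a 1-based epoch: linear warmup, then cosine decay.

    Inferred from the training log above; peak_lr=1e-4 is an assumption
    read off the epoch-10 line.
    """
    if epoch <= warmup_epoch:
        return peak_lr * epoch / warmup_epoch
    progress = (epoch - warmup_epoch) / (total_epoch - warmup_epoch)
    return peak_lr * 0.5 * (1.0 + math.cos(math.pi * progress))


for epoch in (1, 10, 11, 15, 16):
    print(f"Epoch={epoch}, lr={lr_at_epoch(epoch)}")
```

Up to floating-point rounding, the values for epochs 1, 10, 11, 15, and 16 agree with the corresponding `lr=` entries in the log.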

#### Predict HLA-I peptides from fasta

In [10]:
hla_df = model.predict_from_proteins(prob_threshold=0.7)
hla_df

100%|██████████| 1/1 [00:01<00:00,  1.14s/it]


|   | start_pos | end_pos | nAA | HLA_prob_pred | sequence |
|---|---|---|---|---|---|
| 0 | 116 | 124 | 8 | 0.838455 | SEWSKLFL |
| 1 | 116 | 126 | 10 | 0.774762 | SEWSKLFLQF |
| 2 | 22 | 32 | 10 | 0.854397 | FIIDPGGVIR |
| 3 | 72 | 83 | 11 | 0.973126 | ATFGVDESNAK |
| 4 | 20 | 32 | 12 | 0.839394 | GTFIIDPGGVIR |
| 5 | 106 | 119 | 13 | 0.752223 | VRADVTPADFSEW |
| 6 | 22 | 35 | 13 | 0.841694 | FIIDPGGVIRGTF |
| 7 | 9 | 23 | 14 | 0.718809 | RYILAGVENSKGTF |
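
In the table above, `end_pos - start_pos` equals `nAA` in every row, which suggests half-open position ranges. The sketch below rebuilds the table so that it runs on its own, checks that relation, and shows two common follow-up steps: ranking by predicted probability and counting peptides per length. The stricter cutoff of 0.8 is an arbitrary value chosen for illustration, not a recommended setting.

```python
import pandas as pd

# Rebuilt from the prediction output shown above so the example is self-contained.
hla_df = pd.DataFrame({
    "start_pos": [116, 116, 22, 72, 20, 106, 22, 9],
    "end_pos": [124, 126, 32, 83, 32, 119, 35, 23],
    "nAA": [8, 10, 10, 11, 12, 13, 13, 14],
    "HLA_prob_pred": [0.838455, 0.774762, 0.854397, 0.973126,
                      0.839394, 0.752223, 0.841694, 0.718809],
    "sequence": ["SEWSKLFL", "SEWSKLFLQF", "FIIDPGGVIR", "ATFGVDESNAK",
                 "GTFIIDPGGVIR", "VRADVTPADFSEW", "FIIDPGGVIRGTF",
                 "RYILAGVENSKGTF"],
})

# Positions behave like half-open ranges: end_pos - start_pos == nAA.
positions_consistent = ((hla_df["end_pos"] - hla_df["start_pos"]) == hla_df["nAA"]).all()

# Keep confident predictions (0.8 is an illustrative cutoff) and rank them.
confident = (
    hla_df[hla_df["HLA_prob_pred"] >= 0.8]
    .sort_values("HLA_prob_pred", ascending=False)
    .reset_index(drop=True)
)

# Number of predicted peptides per length.
length_counts = hla_df.groupby("nAA").size()

print(positions_consistent)
print(confident[["sequence", "HLA_prob_pred"]])
print(length_counts)
```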


In [11]:
hla_df.to_csv('Predicted_HLA.tsv',index=False, sep="\t")

To download `Predicted_HLA.tsv` from Colab, click `Files` (the folder icon) in the left panel, then right-click the file and choose `Download`.