# MHCnuggets User Guide
This is a simple Jupyter notebook illustrating how to incorporate MHCnuggets into your workflow.

# Installation
MHCnuggets can be installed with pip:
```bash
pip install mhcnuggets
```

# Prediction
MHCnuggets is a pan-allele predictor that can make an IC50 binding affinity prediction for any MHC allele. However, its predictions are more reliable for alleles that are present in the IEDB. For a complete list of alleles, refer to the `supported_alleles.txt` file in the production data folders.

In [None]:
# import the predict function
from mhcnuggets.src.predict import predict
# predict for the newline-separated peptides listed in the peptides_path file,
# for MHC class I allele HLA-A*02:01
predict(class_='I',
        peptides_path='mhcnuggets/data/test/test_peptides.peps',
        mhc='HLA-A0201')
# run the same prediction for MHC class II allele HLA-DRB1*01:01
predict(class_='II',
        peptides_path='mhcnuggets/data/test/test_peptides.peps',
        mhc='HLA-DRB10101')
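
The calls above take allele names without the `*` and `:` separators (`HLA-A0201` for HLA-A\*02:01) and read peptides from a newline-separated file. As a minimal sketch that assumes nothing beyond those two conventions, the helpers below convert a standard allele name and prepare such a file. `to_mhcnuggets_allele`, `peptide_file_text`, and `write_peptides` are hypothetical names for illustration and are not part of MHCnuggets.

```python
from pathlib import Path


def to_mhcnuggets_allele(name):
    """Drop the '*' and ':' separators, e.g. 'HLA-A*02:01' -> 'HLA-A0201'."""
    return name.strip().replace("*", "").replace(":", "")


def peptide_file_text(peptides):
    """Build the file content: one peptide per line, blank entries skipped."""
    cleaned = [p.strip() for p in peptides if p.strip()]
    return "\n".join(cleaned) + "\n"


def write_peptides(peptides, path):
    """Write the peptides to `path` so it can be passed as peptides_path."""
    Path(path).write_text(peptide_file_text(peptides))


print(to_mhcnuggets_allele("HLA-A*02:01"))  # HLA-A0201
```

A file written this way, for example with `write_peptides(my_peptides, 'my_peptides.peps')`, could then be supplied as `peptides_path` in the `predict` calls shown above.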


The code above uses the default MHCnuggets models, which are trained on the latest pull from IEDB. To predict with your own model, pass its weights file through `weights_path`:

In [None]:
# predict using a user-trained model
predict(class_='I',
        peptides_path='mhcnuggets/data/test/test_peptides.peps',
        mhc='HLA-A0201',
        weights_path='mhcnuggets/saves/test/testI.h5')
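
Because the predictions are IC50 binding affinities, a lower value means tighter binding. As a minimal sketch, one could label peptides as binders using the 500 nM cutoff that is common in the literature. This guide does not specify that cutoff, so it is an assumption here, as is having the predictions already collected into a dictionary. `label_binders` is a hypothetical helper, and the IC50 values are placeholders rather than MHCnuggets output.

```python
def label_binders(ic50_by_peptide, threshold_nm=500.0):
    """Return {peptide: bool}, where True means the IC50 is below the cutoff."""
    return {pep: ic50 < threshold_nm for pep, ic50 in ic50_by_peptide.items()}


# Placeholder values for illustration only; these are not real predictions.
example = {"PEPTIDEA": 42.0, "PEPTIDEB": 7300.0}
print(label_binders(example))
```

The cutoff is a parameter so that a stricter or looser definition of a binder can be used without changing the function.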

# Training
MHCnuggets allows users to train their own models on their own datasets. The recommended protocol for training MHCnuggets uses the transfer learning scheme described in the publication. Briefly, one first trains a model for HLA-A\*02:01 and a model for HLA-DRB1\*01:01 for 100 epochs each, then trains all other alleles for 100 epochs with one of these two alleles as the base transfer weights, and finally fine-tunes certain alleles (refer to the `mhc_tuning.csv` file in the production data folders) for 25 epochs. Note that weights are transferred only within the same MHC class, i.e. one can't tune the weights of a class II allele with a class I allele. This process is demonstrated below for the class I alleles HLA-A\*02:01, HLA-B\*08:01, and HLA-B\*08:02.

In [None]:
# importing the train module
from mhcnuggets.src.train import train
# train MHC class I allele HLA-A*02:01 from scratch on the data in the data file
train(class_='I', data='mhcnuggets/data/production/mhcI/curated_training_data.csv',
      mhc='HLA-A0201', save_path='test_A0201.h5', n_epoch=100)
# train MHC class I allele HLA-B*08:01 using transfer weights from class I allele
# HLA-A*02:01
train(class_='I', data='mhcnuggets/data/production/mhcI/curated_training_data.csv',
      mhc='HLA-B0801', save_path='test_B0801.h5', n_epoch=100, transfer_path='test_A0201.h5')
# train MHC class I allele HLA-B*08:02 using transfer weights from class I allele
# HLA-B*08:01; note that this model is trained for only n_epoch=25
train(class_='I', data='mhcnuggets/data/production/mhcI/curated_training_data.csv',
      mhc='HLA-B0802', save_path='test_B0802.h5', n_epoch=25, transfer_path='test_B0801.h5')
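
The three calls above follow one pattern: each stage names an allele, an epoch count, and optionally the allele whose saved weights it starts from. As a minimal sketch, the staged protocol can be written as data and expanded into keyword arguments. It uses only the `train` parameters that appear in this guide. `STAGES`, `weights_file`, and `build_training_plan` are hypothetical names, and the `test_<allele>.h5` naming simply mirrors the save paths used above.

```python
DATA = "mhcnuggets/data/production/mhcI/curated_training_data.csv"

# (allele, epochs, allele whose saved weights are transferred, or None)
STAGES = [
    ("HLA-A0201", 100, None),          # base model, trained from scratch
    ("HLA-B0801", 100, "HLA-A0201"),   # transfer from the base model
    ("HLA-B0802", 25, "HLA-B0801"),    # short fine-tuning stage
]


def weights_file(mhc):
    """Hypothetical naming scheme, e.g. 'HLA-A0201' -> 'test_A0201.h5'."""
    return "test_" + mhc.split("-")[1] + ".h5"


def build_training_plan(stages, class_="I", data=DATA):
    """Expand the stages into one dict of keyword arguments per train call."""
    plan = []
    for mhc, n_epoch, base in stages:
        kwargs = dict(class_=class_, data=data, mhc=mhc,
                      save_path=weights_file(mhc), n_epoch=n_epoch)
        if base is not None:
            # Only stages that start from earlier weights get a transfer path.
            kwargs["transfer_path"] = weights_file(base)
        plan.append(kwargs)
    return plan


for kwargs in build_training_plan(STAGES):
    print(kwargs)  # each dict could be passed on as train(**kwargs)
```

Order matters in such a plan: a stage can only run after the stage that produces its `transfer_path` file has finished.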


# Evaluation
MHCnuggets allows users to evaluate trained models with three metrics: AUC, F1, and Kendall's tau. These can be computed for either user-trained models or the default MHCnuggets models found in the `saves` directory.

In [None]:
# importing the evaluation module
from mhcnuggets.src.evaluate import test
# evaluate the performance of model test.h5 on the peptides
# corresponding to class I allele HLA-B*08:01 in the dataset given by the
# data path
test(class_='I',
     data='mhcnuggets/data/production/mhcI/curated_training_data.csv',
     model_path='test.h5',mhc='HLA-B0801')
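
To make the three metrics concrete, here is a minimal pure-Python sketch of each one. It is not MHCnuggets' own implementation, which may differ in details such as tie handling or the Kendall's tau variant. Binarizing at 500 nM is an assumption, and the IC50 values are made up for illustration.

```python
def f1_score(y_true, y_pred):
    """Harmonic mean of precision and recall for boolean labels."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    denom = 2 * tp + fp + fn
    return 2 * tp / denom if denom else 0.0


def auc(y_true, scores):
    """Chance that a random positive outscores a random negative (ties count 0.5)."""
    pos = [s for t, s in zip(y_true, scores) if t]
    neg = [s for t, s in zip(y_true, scores) if not t]
    if not pos or not neg:
        raise ValueError("AUC needs at least one positive and one negative")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kendall_tau_a(x, y):
    """Tau-a: (concordant - discordant) / number of pairs; ties count as neither."""
    n = len(x)
    if n < 2:
        raise ValueError("Kendall's tau needs at least two observations")
    total = 0
    for i in range(n):
        for j in range(i + 1, n):
            prod = (x[i] - x[j]) * (y[i] - y[j])
            total += (prod > 0) - (prod < 0)
    return total / (n * (n - 1) / 2)


# Made-up IC50 values (nM) for illustration only.
true_ic50 = [50, 300, 1000, 20000]
pred_ic50 = [80, 900, 400, 15000]
is_binder = [v < 500 for v in true_ic50]       # assumed 500 nM cutoff
pred_binder = [v < 500 for v in pred_ic50]

print(f1_score(is_binder, pred_binder))            # 0.5
print(auc(is_binder, [-v for v in pred_ic50]))     # 0.75
print(kendall_tau_a(true_ic50, pred_ic50))
```

AUC and F1 need binary binder labels, while Kendall's tau compares the ranking of the measured and predicted affinities directly. The predicted IC50 is negated for AUC because a lower IC50 indicates a stronger binder.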