# Tutorial: using `ModelManager`

In [None]:
from peptdeep.pretrained_models import ModelManager

`ModelManager` is the main entry to access MS2/RT/CCS models.

In [None]:
model_mgr = ModelManager(mask_modloss=False, device='cpu')

Most default parameters and attributes of the `ModelManager` class are controlled by `peptdeep.settings.global_settings`, which is a dict.

```
from peptdeep.settings import global_settings
```

The default values of `peptdeep.settings.global_settings` are defined in [default_settings.yaml](../peptdeep/constants/default_settings.yaml).

#### `ModelManager.load_installed_models`

`ModelManager.load_installed_models(model_type)` loads the models for a given model type. `model_type` can be:
- generic: generic RT/CCS/MS2 models including HLA
- HLA: currently the same as `generic`
- phos: RT/CCS/MS2 models for Phospho@S/T/Y
- digly: RT/CCS/MS2 models for GlyGly@K

Calling `ModelManager(...)` also calls `ModelManager.load_installed_models` implicitly; the default `model_type` is `global_settings['model_mgr']['model_type']`.
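
The implicit default can be pictured with a small stand-in. The sketch below is not peptdeep code: `settings` and `TinyManager` are hypothetical names that only mimic the pattern of a constructor reading its default `model_type` from a nested settings dict, and the value `"generic"` is assumed purely for illustration.

```python
# Hypothetical stand-in for a nested settings dict (not peptdeep's real values).
settings = {"model_mgr": {"model_type": "generic"}}


class TinyManager:
    """Toy class that mimics a constructor loading a default model type."""

    def __init__(self):
        self.loaded_type = None
        # The constructor loads whatever the settings dict names as default.
        self.load_installed_models(settings["model_mgr"]["model_type"])

    def load_installed_models(self, model_type):
        # A real manager would load model weights here; we only record the name.
        self.loaded_type = model_type


mgr = TinyManager()
default_type = mgr.loaded_type        # taken from the settings dict
mgr.load_installed_models("phos")     # an explicit call overrides the default
explicit_type = mgr.loaded_type
```

In this pattern, changing the settings entry before constructing the object changes what is loaded by default, while an explicit call always wins afterwards.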

## Test the RT model

Use the 11 iRT peptides to test the RT model:

In [None]:
from peptdeep.model.rt import IRT_PEPTIDE_DF

In [None]:
df = IRT_PEPTIDE_DF.copy()
# arbitrarily add a modification; this may change the peptide's real iRT
df.loc[1,'mods'] = 'Phospho@S'
df.loc[1,'mod_sites'] = '5'
df

Unnamed: 0,sequence,pep_name,irt,mods,mod_sites,nAA
0,LGGNEQVTR,RT-pep a,-24.92,,,9
1,GAGSSEPVTGLDAK,RT-pep b,0.0,Phospho@S,5.0,14
2,VEATFGVDESNAK,RT-pep c,12.39,,,13
3,YILAGVENSK,RT-pep d,19.79,,,10
4,TPVISGGPYEYR,RT-pep e,28.71,,,12
5,TPVITGAPYEYR,RT-pep f,33.38,,,12
6,DGLDAASYYAPVR,RT-pep g,42.26,,,13
7,ADVTPADFSEWSK,RT-pep h,54.62,,,13
8,GTFIIDPGGVIR,RT-pep i,70.52,,,12
9,GTFIIDPAAVIR,RT-pep k,87.23,,,12


In [None]:
model_mgr.load_installed_models('phos')
model_mgr.predict_rt(df)
model_mgr.rt_model.add_irt_column_to_precursor_df(df)

2022-09-09 21:54:02> Predicting RT ...


100%|██████████| 5/5 [00:00<00:00, 125.27it/s]


Unnamed: 0,sequence,pep_name,irt,mods,mod_sites,nAA,rt_pred,rt_norm_pred,irt_pred
0,LGGNEQVTR,RT-pep a,-24.92,,,9,0.184235,0.184235,-26.123537
1,GAGSSEPVTGLDAK,RT-pep b,0.0,Phospho@S,5.0,14,0.266746,0.266746,11.916059
2,VEATFGVDESNAK,RT-pep c,12.39,,,13,0.266133,0.266133,11.63312
3,YILAGVENSK,RT-pep d,19.79,,,10,0.290495,0.290495,22.864811
4,TPVISGGPYEYR,RT-pep e,28.71,,,12,0.303847,0.303847,29.020259
5,TPVITGAPYEYR,RT-pep f,33.38,,,12,0.316514,0.316514,34.860122
6,DGLDAASYYAPVR,RT-pep g,42.26,,,13,0.324423,0.324423,38.506308
7,ADVTPADFSEWSK,RT-pep h,54.62,,,13,0.345197,0.345197,48.08389
8,GTFIIDPGGVIR,RT-pep i,70.52,,,12,0.394248,0.394248,70.697474
9,GTFIIDPAAVIR,RT-pep k,87.23,,,12,0.434775,0.434775,89.38115
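
The `irt_pred` values above are consistent with a straight-line mapping from `rt_pred` to the iRT scale. As a rough illustration only, assuming the conversion is an ordinary least-squares line fitted on (predicted RT, known iRT) pairs, the idea can be sketched in plain Python. `fit_line` is a hypothetical helper, not part of peptdeep, and the fit here uses only three rows copied from the table, so it will not reproduce the `irt_pred` column exactly.

```python
def fit_line(xs, ys):
    """Ordinary least-squares fit of y = slope * x + intercept."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    sxx = sum((x - mean_x) ** 2 for x in xs)
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, mean_y - slope * mean_x


# (rt_pred, irt) pairs for RT-pep a, d and i, copied from the table above
rt_pred = [0.184235, 0.290495, 0.394248]
irt = [-24.92, 19.79, 70.52]

slope, intercept = fit_line(rt_pred, irt)
# Map any predicted RT onto the iRT scale with the fitted line
irt_from_rt = [slope * x + intercept for x in rt_pred]
```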


Train the RT model on `df` using the `rt_norm` column:

In [None]:
def normalize_irt(df):
    # Min-max scale iRT to [0, 1] and store it in place as `rt_norm`
    min_rt = df.irt.min()
    df['rt_norm'] = (
        df.irt - min_rt
    ) / (df.irt.max() - min_rt)

normalize_irt(df)
model_mgr.epoch_to_train_rt_ccs = 50  # number of training epochs
model_mgr.train_rt_model(df)
model_mgr.predict_rt(df)
model_mgr.rt_model.add_irt_column_to_precursor_df(df)

2022-09-09 21:54:02> 11 PSMs for RT training/fine-tuning
2022-09-09 21:54:09> Predicting RT ...


100%|██████████| 5/5 [00:00<00:00, 151.56it/s]


Unnamed: 0,sequence,pep_name,irt,mods,mod_sites,nAA,rt_pred,rt_norm_pred,irt_pred,rt_norm
0,LGGNEQVTR,RT-pep a,-24.92,,,9,0.127189,0.127189,-18.916407,0.0
1,GAGSSEPVTGLDAK,RT-pep b,0.0,Phospho@S,5.0,14,0.199919,0.199919,-5.504272,0.199488
2,VEATFGVDESNAK,RT-pep c,12.39,,,13,0.295237,0.295237,12.073141,0.298671
3,YILAGVENSK,RT-pep d,19.79,,,10,0.357351,0.357351,23.527389,0.357909
4,TPVISGGPYEYR,RT-pep e,28.71,,,12,0.429762,0.429762,36.880596,0.429315
5,TPVITGAPYEYR,RT-pep f,33.38,,,12,0.392419,0.392419,29.994243,0.466699
6,DGLDAASYYAPVR,RT-pep g,42.26,,,13,0.387393,0.387393,29.067502,0.537784
7,ADVTPADFSEWSK,RT-pep h,54.62,,,13,0.634485,0.634485,74.633402,0.636728
8,GTFIIDPGGVIR,RT-pep i,70.52,,,12,0.67131,0.67131,81.424123,0.764009
9,GTFIIDPAAVIR,RT-pep k,87.23,,,12,0.699334,0.699334,86.592033,0.897775
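
For reference, `normalize_irt` is a min-max scaling. The `rt_norm` column above is consistent with a minimum iRT of -24.92 and a maximum of 100.0. The maximum is inferred from the output (for example, (87.23 + 24.92) / (100.0 + 24.92) ≈ 0.8978), since the corresponding row is not shown in the table. A plain-Python sketch of the same arithmetic, where `min_max_normalize` is a hypothetical helper:

```python
def min_max_normalize(values):
    """Scale values linearly so the minimum maps to 0 and the maximum to 1."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# iRT values from the table above, plus 100.0 as the inferred (assumed) maximum
irt = [-24.92, 0.0, 12.39, 19.79, 28.71, 33.38,
       42.26, 54.62, 70.52, 87.23, 100.0]
rt_norm = min_max_normalize(irt)
```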


## Test the CCS model

In [None]:
df['charge'] = 2
model_mgr.predict_mobility(df)

2022-09-09 21:54:09> Predicting mobility ...


100%|██████████| 5/5 [00:00<00:00, 117.53it/s]


Unnamed: 0,sequence,pep_name,irt,mods,mod_sites,nAA,rt_pred,rt_norm_pred,irt_pred,rt_norm,charge,ccs_pred,precursor_mz,mobility_pred
0,LGGNEQVTR,RT-pep a,-24.92,,,9,0.127189,0.127189,-18.916407,0.0,2,331.279816,487.256705,0.815533
1,GAGSSEPVTGLDAK,RT-pep b,0.0,Phospho@S,5.0,14,0.199919,0.199919,-5.504272,0.199488,2,381.067841,684.805772,0.941902
2,VEATFGVDESNAK,RT-pep c,12.39,,,13,0.295237,0.295237,12.073141,0.298671,2,394.208893,683.827889,0.974369
3,YILAGVENSK,RT-pep d,19.79,,,10,0.357351,0.357351,23.527389,0.357909,2,364.828003,547.298039,0.8995
4,TPVISGGPYEYR,RT-pep e,28.71,,,12,0.429762,0.429762,36.880596,0.429315,2,394.317596,669.838059,0.974434
5,TPVITGAPYEYR,RT-pep f,33.38,,,12,0.392419,0.392419,29.994243,0.466699,2,399.848633,683.853709,0.988309
6,DGLDAASYYAPVR,RT-pep g,42.26,,,13,0.387393,0.387393,29.067502,0.537784,2,399.736542,699.338423,0.988252
7,ADVTPADFSEWSK,RT-pep h,54.62,,,13,0.634485,0.634485,74.633402,0.636728,2,405.532562,726.835714,1.002953
8,GTFIIDPGGVIR,RT-pep i,70.52,,,12,0.67131,0.67131,81.424123,0.764009,2,379.443451,622.853512,0.936954
9,GTFIIDPAAVIR,RT-pep k,87.23,,,12,0.699334,0.699334,86.592033,0.897775,2,387.88678,636.869163,0.958034
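
The `ccs_pred` and `mobility_pred` columns are related through the precursor mass and charge. The sketch below assumes the Mason-Schamp relation with N2 as drift gas at 305 K. These constants and the helper name `ccs_to_inverse_mobility` are assumptions made for illustration, not values read from peptdeep; they were chosen because they come close to the numbers in the table.

```python
import math


def ccs_to_inverse_mobility(ccs, charge, precursor_mz,
                            temperature_k=305.0, gas_mass=28.0134):
    """Mason-Schamp relation: CCS (square angstrom) to inverse mobility 1/K0.

    temperature_k and gas_mass (N2) are assumptions, not read from peptdeep.
    """
    ion_mass = precursor_mz * charge
    reduced_mass = ion_mass * gas_mass / (ion_mass + gas_mass)
    # 18509.8632 is the commonly quoted Mason-Schamp constant for these units
    return ccs * math.sqrt(reduced_mass * temperature_k) / (18509.8632 * charge)


# Row 0 of the table above (LGGNEQVTR, charge 2)
mobility = ccs_to_inverse_mobility(331.279816, 2, 487.256705)
```

Under these assumptions, `mobility` should land close to the `mobility_pred` value reported for the same row.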


## Test the MS2 model

In [None]:
df['charge'] = 2
inten_df = model_mgr.predict_ms2(df)
inten_df

2022-09-09 21:54:10> Predicting MS2 ...


100%|██████████| 5/5 [00:00<00:00, 82.83it/s]


Unnamed: 0,b_z1,b_z2,y_z1,y_z2,b_modloss_z1,b_modloss_z2,y_modloss_z1,y_modloss_z2
0,0.000000,0.0,1.000000,0.021727,0.0,0.0,0.0,0.0
1,0.191613,0.0,0.343992,0.000000,0.0,0.0,0.0,0.0
2,0.063825,0.0,0.119938,0.015200,0.0,0.0,0.0,0.0
3,0.033420,0.0,0.257022,0.000000,0.0,0.0,0.0,0.0
4,0.027311,0.0,0.340053,0.000000,0.0,0.0,0.0,0.0
...,...,...,...,...,...,...,...,...
118,0.000000,0.0,0.101413,0.000000,0.0,0.0,0.0,0.0
119,0.000000,0.0,0.672498,0.000000,0.0,0.0,0.0,0.0
120,0.000000,0.0,0.034437,0.000000,0.0,0.0,0.0,0.0
121,0.000000,0.0,0.125430,0.000000,0.0,0.0,0.0,0.0


Note that modloss fragment intensities are enabled in this case (`ModelManager(mask_modloss=False, ...)`), so modloss intensities are not zero for phosphopeptides:

In [None]:
phos_precursor_id = 1 # the peptide we manually marked as phosphorylated
inten_df.iloc[
    df.loc[phos_precursor_id,'frag_start_idx']:
    df.loc[phos_precursor_id,'frag_stop_idx'],:
]

Unnamed: 0,b_z1,b_z2,y_z1,y_z2,b_modloss_z1,b_modloss_z2,y_modloss_z1,y_modloss_z2
8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,0.063835,0.0,0.012835,0.000606,0.0,0.0,0.0,0.0
10,0.066177,0.0,0.0,0.0,0.0,0.0,0.0,0.0
11,0.061181,0.0,0.064921,0.0,0.0,0.0,0.0,0.0
12,0.0,0.0,0.082699,0.0,0.0,0.0,0.0,0.0
13,0.0,0.0,1.0,0.080108,0.0,0.0,0.0,0.0
14,0.0,0.0,0.068587,0.0,0.0,0.0,0.0,0.0
15,0.0,0.0,0.293111,0.0,0.0,0.0,0.0,0.0
16,0.0,0.0,0.185996,0.0,0.0,0.0,0.0,0.0
17,0.0,0.0,0.024486,0.0,0.0,0.0,0.0,0.0


To disable this, use `ModelManager(mask_modloss=True, ...)`:

In [None]:
model_mgr = ModelManager(mask_modloss=True, device='cpu')
model_mgr.load_installed_models('phos')
df = IRT_PEPTIDE_DF.copy()
df.loc[1,'mods'] = 'Phospho@S'
df.loc[1,'mod_sites'] = '5'
df['charge'] = 2
inten_df = model_mgr.predict_ms2(df)
inten_df.iloc[
    df.loc[1,'frag_start_idx']:
    df.loc[1,'frag_stop_idx'],:
]

2022-09-09 21:54:13> Predicting MS2 ...


100%|██████████| 5/5 [00:00<00:00, 86.70it/s]


Unnamed: 0,b_z1,b_z2,y_z1,y_z2,b_modloss_z1,b_modloss_z2,y_modloss_z1,y_modloss_z2
8,0.0,0.0,0.0,0.0,0.0,0.0,0.0,0.0
9,0.063835,0.0,0.012835,0.000606,0.0,0.0,0.0,0.0
10,0.066177,0.0,0.0,0.0,0.0,0.0,0.0,0.0
11,0.061181,0.0,0.064921,0.0,0.0,0.0,0.0,0.0
12,0.0,0.0,0.082699,0.0,0.0,0.0,0.0,0.0
13,0.0,0.0,1.0,0.080108,0.0,0.0,0.0,0.0
14,0.0,0.0,0.068587,0.0,0.0,0.0,0.0,0.0
15,0.0,0.0,0.293111,0.0,0.0,0.0,0.0,0.0
16,0.0,0.0,0.185996,0.0,0.0,0.0,0.0,0.0
17,0.0,0.0,0.024486,0.0,0.0,0.0,0.0,0.0
