## How to change the output or the input layer in a pre-trained PrositIntensityPredictor

Imports

In [2]:
import dlomix
import tensorflow as tf
from tensorflow import keras
import os

In [3]:
%load_ext autoreload
%autoreload 2

Specify the path to the pre-trained model file; this file needs the .keras suffix

In [7]:
PATH_TO_MODEL_DIR = '/cmnfs/proj/prosit_astral/bmpc_dlomix_group/models/baseline_models/noptm_baseline_full_bs1024/'
MODEL_NAME = '85c6c918-4a2a-42e5-aab1-e666121c69a6.keras'

Loading the model \
Normally this works out of the box if custom functions and classes are decorated with @keras.saving.register_keras_serializable() \
If not, specify the custom objects of the model, here the masked spectral distance loss function

In [6]:
from dlomix.losses import masked_spectral_distance
# if loading fails, pass custom_objects={"masked_spectral_distance": masked_spectral_distance}
model = keras.models.load_model(PATH_TO_MODEL_DIR + MODEL_NAME)
model.summary()

Model: "prosit_intensity_predictor_1"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_1 (Embedding)     multiple                  432       
                                                                 
 sequential_5 (Sequential)   (None, 30, 512)           1996800   
                                                                 
 sequential_6 (Sequential)   multiple                  4608      
                                                                 
 sequential_7 (Sequential)   (None, 29, 512)           1576806   
                                                                 
 encoder_att (AttentionLaye  multiple                  542       
 r)                                                              
                                                                 
 sequential_8 (Sequential)   multiple                  0         
                                      

### Changing the output layer
This corresponds to changing the regressor of the network \
You can either keep the dimensions and replace the trained weights with randomly initialized ones (default parameter), or
change the output dimensions to fit an arbitrary number of ion types (e.g. number_of_ions=1 for b ions only)

In [8]:
from dlomix.models import PrositIntensityPredictor
def change_output_layer(model: PrositIntensityPredictor, number_of_ions: int = 2):
    """Replace the regressor in place with a freshly initialized one."""
    # 3 outputs per ion type, so the default of 2 keeps the original dimensions
    model.len_fion = 3 * number_of_ions
    model.regressor = tf.keras.Sequential(
        [
            tf.keras.layers.TimeDistributed(
                tf.keras.layers.Dense(model.len_fion), name="time_dense"
            ),
            tf.keras.layers.LeakyReLU(name="activation"),
            tf.keras.layers.Flatten(name="out"),
        ]
    )

In [9]:
change_output_layer(model, number_of_ions=1)

The model summary can only be shown after the model has been called on a tensor with the shape of the input

In [10]:
toy_dataset = dlomix.data.load_processed_dataset('/cmnfs/proj/prosit_astral/bmpc_dlomix_group/datasets/processed/transfer_learning_toy_data')

In [11]:
batch, _ = next(iter(toy_dataset.tensor_train_data))
model(batch)
model.summary()

Model: "prosit_intensity_predictor"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding (Embedding)       multiple                  928       
                                                                 
 sequential (Sequential)     (None, 30, 512)           1996800   
                                                                 
 sequential_1 (Sequential)   multiple                  4608      
                                                                 
 sequential_2 (Sequential)   (None, 29, 512)           1576806   
                                                                 
 encoder_att (AttentionLaye  multiple                  542       
 r)                                                              
                                                                 
 sequential_3 (Sequential)   multiple                  0         
                                        

The model output now has the shape (batch_size, 87)
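
The 87 follows from the shapes in the summaries, assuming the regressor is applied at each of the 29 fragment positions and the result is then flattened. A minimal sketch of that arithmetic (the helper name `expected_output_dim` is hypothetical and not part of dlomix):

```python
def expected_output_dim(number_of_ions: int, seq_length: int = 30) -> int:
    """Flattened output size of the regressor built in change_output_layer.

    Assumes seq_length - 1 fragment positions (29 for a length of 30)
    and 3 outputs per ion type, mirroring len_fion = 3 * number_of_ions.
    """
    fragment_positions = seq_length - 1
    len_fion = 3 * number_of_ions
    return fragment_positions * len_fion


print(expected_output_dim(1))  # 87, b ions only
print(expected_output_dim(2))  # 174, the default number of ions
```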

In [12]:
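# `!export` only changes a temporary subshell, so set the variable in the kernel process instead
os.environ["XLA_FLAGS"] = "--xla_gpu_cuda_data_dir=/usr/lib/cuda"
os.environ["CUDA_VISIBLE_DEVICES"] = '1'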

In [13]:
tf.config.list_physical_devices('GPU')

[]

You need to recompile the model after changing the architecture

In [21]:
# reinitialize the optimizer
optimizer = tf.keras.optimizers.Adam(learning_rate=1e-4)

In [22]:
model.compile(
    optimizer=optimizer,
    loss=dlomix.losses.masked_spectral_distance
)

In [23]:
model.fit(toy_dataset.tensor_train_data, 
          validation_data=toy_dataset.tensor_val_data,
          epochs=2)

Epoch 1/2
Epoch 2/2


<keras.src.callbacks.History at 0x7f561b1a5810>

### Changing the input layer

Changing the input layer to accommodate a new modification means updating the alphabet of the model \
After that you can reinitialize the embedding layer with the new alphabet size

First check whether the model can handle the new modification without a new embedding layer: reload the model and load a dataset that contains the new modification

In [49]:
model = keras.models.load_model(PATH_TO_MODEL_DIR + MODEL_NAME)
model.summary()

Model: "prosit_intensity_predictor_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 embedding_2 (Embedding)     multiple                  928       
                                                                 
 sequential_11 (Sequential)  (None, 30, 512)           1996800   
                                                                 
 sequential_12 (Sequential)  multiple                  4608      
                                                                 
 sequential_13 (Sequential)  (None, 29, 512)           1576806   
                                                                 
 encoder_att (AttentionLaye  multiple                  542       
 r)                                                              
                                                                 
 sequential_14 (Sequential)  multiple                  0         
                                      

In [62]:
toy_dataset_2 = dlomix.data.load_processed_dataset('/cmnfs/proj/prosit_astral/bmpc_dlomix_group/datasets/processed/new_modification')

In [67]:
def change_input_layer(model: PrositIntensityPredictor, modifications: list = None):
    """Extend the alphabet and replace the embedding with a freshly initialized one."""
    # update the alphabet given the list of the new modifications, if there are any
    if modifications:
        for new_mod in modifications:
            # skip known entries so that existing indices are not reassigned
            if new_mod not in model.alphabet:
                model.alphabet.update({new_mod: max(model.alphabet.values()) + 1})
    # replace the embedding layer; the new weights are randomly initialized
    model.embedding = tf.keras.layers.Embedding(
        input_dim=len(model.alphabet) + 2,
        output_dim=model.embedding_output_dim,
        input_length=model.seq_length,
        name='embedding'
    )

In [68]:
change_input_layer(model, ['M[UNIMOD:999]'])
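
The alphabet update can be traced on a plain dictionary. The sketch below uses a made-up three-letter alphabet (the real one lives on `model.alphabet`) and a hypothetical helper `extend_alphabet`; it skips modifications that are already present so that existing indices are never reassigned.

```python
def extend_alphabet(alphabet: dict, modifications: list) -> dict:
    """Return a copy of alphabet with each new modification given the next free index."""
    extended = dict(alphabet)
    for new_mod in modifications:
        if new_mod not in extended:  # keep existing indices untouched
            extended[new_mod] = max(extended.values()) + 1
    return extended


# Toy alphabet for illustration only, not the dlomix alphabet
toy_alphabet = {"A": 1, "C": 2, "M": 3}
extended = extend_alphabet(toy_alphabet, ["M[UNIMOD:999]", "M"])

print(extended["M[UNIMOD:999]"])  # 4
# Same rule as in change_input_layer: input_dim = len(alphabet) + 2
input_dim = len(extended) + 2
print(input_dim)  # 6
# Every token index must be a valid row of the embedding matrix
assert max(extended.values()) < input_dim
```

Returning a copy keeps the toy example free of side effects; `change_input_layer` itself updates `model.alphabet` in place.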

Build the model by calling it on a batch of the toy dataset

In [69]:
model(batch)
model.summary()

Model: "prosit_intensity_predictor_2"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 sequential_11 (Sequential)  (None, 30, 512)           1996800   
                                                                 
 sequential_12 (Sequential)  multiple                  4608      
                                                                 
 sequential_13 (Sequential)  (None, 29, 512)           1576806   
                                                                 
 encoder_att (AttentionLaye  multiple                  542       
 r)                                                              
                                                                 
 sequential_14 (Sequential)  multiple                  0         
                                                                 
 sequential_15 (Sequential)  (None, 174)               3078      
                                      

### Problems with new modifications:
* The new modification has to be processed correctly, so it needs an entry in the alphabet used in the prepare_dataset.py file (maybe specify the new modifications in a config?)


To apply transfer learning efficiently, you also need to freeze certain layers (-> Lina) \
For example, freeze the whole network except the regressor
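
A minimal sketch of such a freezing step, assuming the regressor is reachable as `model.regressor` (as in `change_output_layer`) and that `model.layers` lists the top-level layers, as Keras models do. The helper name `freeze_all_but_regressor` is hypothetical, and the stand-in objects exist only so the snippet runs without TensorFlow or the model file:

```python
from types import SimpleNamespace


def freeze_all_but_regressor(model):
    """Set trainable=False on every top-level layer except model.regressor."""
    for layer in model.layers:
        layer.trainable = layer is model.regressor
    # report which layers will still be updated during training
    return [layer.name for layer in model.layers if layer.trainable]


# Stand-ins that mimic the two attributes the helper relies on
embedding = SimpleNamespace(name="embedding", trainable=True)
encoder = SimpleNamespace(name="encoder", trainable=True)
regressor = SimpleNamespace(name="regressor", trainable=True)
toy_model = SimpleNamespace(layers=[embedding, encoder, regressor], regressor=regressor)

trainable_names = freeze_all_but_regressor(toy_model)
print(trainable_names)  # ['regressor']
```

On the real model, call the helper before `model.compile(...)` so that the changed `trainable` flags are taken into account during training.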