This notebook is used to experiment with our customizable model. We first train a single instance of it, and then run a hyperparameter sweep with the Optuna library to search for good hyperparameter values.

In [3]:
# imports
import argparse
from argparse import Namespace

from pytorch_lightning import Trainer, LightningModule, seed_everything
from pytorch_lightning.callbacks import ModelCheckpoint
from pytorch_lightning.loggers import CSVLogger, TensorBoardLogger
from torchsummary import summary

from yeastdnnexplorer.data_loaders.synthetic_data_loader import SyntheticDataLoader
from yeastdnnexplorer.ml_models.simple_model import SimpleModel
from yeastdnnexplorer.ml_models.customizable_model import CustomizableModel

import optuna

import matplotlib.pyplot as plt
import seaborn as sns

from yeastdnnexplorer.probability_models.generate_data import (
    perturbation_effect_adjustment_function_with_tf_relationships,
)

seed_everything(42)

Seed set to 42


42

Defining checkpoints and loggers for the model. Checkpoints tell PyTorch Lightning when to save snapshots of the model (which can be loaded and inspected later), and loggers tell it where and in what format to write the metrics that the model logs during training.

In [4]:
# define checkpoints for the model
# tells it when to save snapshots of the model during training
# Callback to save the best model based on validation loss

# this one works for simple model
# best_model_checkpoint = ModelCheckpoint(
#     monitor="val_loss",
#     mode="min",
#     filename="best-model-{epoch:02d}-{val_loss:.2f}",
#     save_top_k=1,
# )

best_model_checkpoint = ModelCheckpoint(
    monitor="val_mse",
    mode="min",
    filename="best-model-{epoch:02d}-{val_loss:.2f}",
    save_top_k=1,
)

# Callback to save checkpoints every 2 epochs, regardless of performance
periodic_checkpoint = ModelCheckpoint(
    filename="periodic-{epoch:02d}",
    every_n_epochs=2,
    save_top_k=-1,  # Setting -1 saves all checkpoints
)

# define loggers for the model
tb_logger = TensorBoardLogger("logs/tensorboard_logs")
csv_logger = CSVLogger("logs/csv_logs")

Training an instance of the customizable model. We are using perturbation_effect_adjustment_function_with_tf_relationships to adjust the means of the generated data based on which combinations of transcription factors are bound to the gene in question.  

In [5]:
tf_relationships_dict = {
    0: [2, 4, 7],
    1: [8],
    2: [3, 9],
    3: [1, 6],
    4: [5],
    5: [0, 2, 8],
    6: [4],
    7: [1, 4],
    8: [6],
    9: [0, 3, 8],
}

data_module = SyntheticDataLoader(
    batch_size=32,
    num_genes=4000,
    signal_mean=3.0,
    signal=[0.5] * 10,  # old: [0.1, 0.15, 0.2, 0.25, 0.3],
    n_sample=[1, 2, 2, 4, 4],  # sum of this is num of tfs
    val_size=0.1,
    test_size=0.1,
    random_state=42,
    max_mean_adjustment=3.0,
    adjustment_function=perturbation_effect_adjustment_function_with_tf_relationships,
    tf_relationships=tf_relationships_dict,
)

num_tfs = sum(data_module.n_sample)  # sum of all n_sample is the number of TFs

model = CustomizableModel(
    input_dim=num_tfs,
    output_dim=num_tfs,
    lr=0.01,
    hidden_layer_num=3,
    hidden_layer_sizes=[128, 64, 32],
    activation="ReLU",
    optimizer="Adam",
    L2_regularization_term=0.0,
    dropout_rate=0.0,
)

trainer = Trainer(
    max_epochs=10,
    deterministic=True,
    accelerator="cpu",
    callbacks=[best_model_checkpoint, periodic_checkpoint],
    logger=[tb_logger, csv_logger],
)

trainer.fit(model, data_module)

test_results = trainer.test(model, datamodule=data_module)
print("Printing test results...")
print(test_results)  # this prints all metrics that were logged during the test phase

# print summary of model
print("Printing model summary...")
summary(model, (num_tfs, num_tfs))

GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs
  X_train, Y_train = torch.tensor(X_train, dtype=torch.float32), torch.tensor(
  X_val, Y_val = torch.tensor(X_val, dtype=torch.float32), torch.tensor(
  X_test, Y_test = torch.tensor(X_test, dtype=torch.float32), torch.tensor(

  | Name          | Type       | Params
---------------------------------------------
0 | r2            | R2Score    | 0     
1 | activation    | ReLU       | 0     
2 | input_layer   | Linear     | 1.8 K 
3 | hidden_layers | ModuleList | 10.3 K
4 | output_layer  | Linear     | 429   
5 | dropout       | Dropout    | 0     
---------------------------------------------
12.6 K    Trainable params
0         Non-trainable params
12.6 K    Total params
0.050     Total estimated model params size (MB)


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

`Trainer.fit` stopped: `max_epochs=10` reached.


Testing: |          | 0/? [00:00<?, ?it/s]

────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
       Test metric             DataLoader 0
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
        test_mse            1.7528094053268433
       test_nrmse           0.20540115237236023
────────────────────────────────────────────────────────────────────────────────────────────────────────────────────────
Printing test results...
[{'test_mse': 1.7528094053268433, 'test_nrmse': 0.20540115237236023}]
Printing model summary...
----------------------------------------------------------------
        Layer (type)               Output Shape         Param #
            Linear-1              [-1, 13, 128]           1,792
              ReLU-2              [-1, 13, 128]               0
           Dropout-3              [-1, 13, 128]               0
            Linear-4               [-1, 13, 64]         
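The parameter counts in the tables above can be reproduced by hand. The sketch below assumes the layer layout suggested by the printed summary (one `Linear` input layer into the first hidden size, `Linear` layers between successive hidden sizes, and a `Linear` output layer), with each layer holding `n_in * n_out` weights plus `n_out` biases. The helper `mlp_param_counts` is hypothetical and not part of yeastdnnexplorer.

```python
def mlp_param_counts(input_dim, hidden_sizes, output_dim):
    """Return (input_layer, hidden_layers, output_layer) parameter counts.

    Assumes fully connected layers with biases, so a layer mapping
    n_in -> n_out features has n_in * n_out weights plus n_out biases.
    """

    def linear(n_in, n_out):
        return n_in * n_out + n_out

    # input features -> first hidden size
    input_layer = linear(input_dim, hidden_sizes[0])
    # one layer between each pair of consecutive hidden sizes
    hidden_layers = sum(
        linear(a, b) for a, b in zip(hidden_sizes, hidden_sizes[1:])
    )
    # last hidden size -> output features
    output_layer = linear(hidden_sizes[-1], output_dim)
    return input_layer, hidden_layers, output_layer


# 13 TFs in and out, hidden_layer_sizes=[128, 64, 32] as in the cell above
counts = mlp_param_counts(13, [128, 64, 32], 13)
print(counts, sum(counts))  # (1792, 10336, 429) 12557
```

These values agree with the 1.8 K, 10.3 K, and 429 parameters reported for `input_layer`, `hidden_layers`, and `output_layer`, and with the 12.6 K total.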

Here we define a TF relationships dictionary to be used while training models in our hyperparameter sweep. It is important that we use the same relationships for each of the models we train since the only things being varied in the hyperparameter sweep should be the hyperparameters themselves.

In [3]:
tf_relationships_dict = {
    0: [2, 4, 7],
    1: [8],
    2: [3, 9],
    3: [1, 6],
    4: [5],
    5: [0, 2, 8],
    6: [4],
    7: [1, 4],
    8: [6],
    9: [0, 3, 8],
}
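
Because the same dictionary is reused for every trial, it can be worth checking it once before the sweep starts. The sketch below only verifies that every TF index in the dictionary refers to one of the generated TFs. The helper `check_tf_relationships` is hypothetical (it is not part of yeastdnnexplorer), and `num_tfs = 13` assumes `n_sample=[1, 2, 2, 4, 4]` as in the training cell above.

```python
def check_tf_relationships(relationships, num_tfs):
    """Raise ValueError if any TF index falls outside range(num_tfs)."""
    for tf, related in relationships.items():
        for index in (tf, *related):
            if not 0 <= index < num_tfs:
                raise ValueError(
                    f"TF index {index} (entry for TF {tf}) is outside "
                    f"the valid range 0..{num_tfs - 1}"
                )


tf_relationships_dict = {
    0: [2, 4, 7],
    1: [8],
    2: [3, 9],
    3: [1, 6],
    4: [5],
    5: [0, 2, 8],
    6: [4],
    7: [1, 4],
    8: [6],
    9: [0, 3, 8],
}

num_tfs = sum([1, 2, 2, 4, 4])  # assumed n_sample; gives 13 TFs
check_tf_relationships(tf_relationships_dict, num_tfs)  # passes silently
```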

To perform our hyperparameter sweep using the Optuna library, we need to define an objective function that returns a scalar value. Optuna then tries to minimize this objective function during the hyperparameter sweep.

In [5]:
def objective(trial):
    # model hyperparameters
    lr = trial.suggest_categorical("lr", [1e-4, 1e-3, 1e-2, 1e-1])
    hidden_layer_num = trial.suggest_categorical("hidden_layer_num", [1, 2, 3, 5])
    activation = trial.suggest_categorical(
        "activation", ["ReLU", "Sigmoid", "Tanh", "LeakyReLU"]
    )
    optimizer = trial.suggest_categorical("optimizer", ["Adam", "SGD", "RMSprop"])
    L2_regularization_term = trial.suggest_categorical(
        "L2_regularization_term", [0, 0.1, 0.01]
    )
    dropout_rate = trial.suggest_categorical(
        "dropout_rate", [0, 0.3, 0.5]
    )

    # data module hyperparameters
    batch_size = trial.suggest_categorical("batch_size", [32, 128])

    # training hyperparameters
    max_epochs = trial.suggest_categorical(
        "max_epochs", [2]
    )  # can keep this low for sanity check

    # defining what to pass in for the hidden layer sizes list based on the number of hidden layers
    hidden_layer_sizes_configurations = {
        1: [[64], [256]],
        2: [[64, 32], [256, 64]],
        3: [[256, 128, 32]],
        5: [[512, 256, 128, 64, 32]],
    }
    hidden_layer_sizes = trial.suggest_categorical(
        f"hidden_layer_sizes_{hidden_layer_num}_layers",
        hidden_layer_sizes_configurations[hidden_layer_num],
    )

    print("=" * 70)
    print("About to create model with the following hyperparameters:")
    print(f"lr: {lr}")
    print(f"hidden_layer_num: {hidden_layer_num}")
    print(f"hidden_layer_sizes: {hidden_layer_sizes}")
    print(f"activation: {activation}")
    print(f"optimizer: {optimizer}")
    print(f"L2_regularization_term: {L2_regularization_term}")
    print(f"dropout_rate: {dropout_rate}")
    print(f"batch_size: {batch_size}")
    print(f"max_epochs: {max_epochs}")
    print("")

    # create data module
    data_module = SyntheticDataLoader(
        batch_size=batch_size,
        num_genes=4000,
        signal_mean=3.0,
        signal=[0.5] * 10,  # old: [0.1, 0.15, 0.2, 0.25, 0.3],
        n_sample=[1, 2, 2, 4, 4],  # sum of this is num of tfs
        val_size=0.1,
        test_size=0.1,
        random_state=42,
        max_mean_adjustment=3.0,
        adjustment_function=perturbation_effect_adjustment_function_with_tf_relationships,
        tf_relationships=tf_relationships_dict,
    )

    num_tfs = sum(data_module.n_sample)  # sum of all n_sample is the number of TFs

    # create model
    model = CustomizableModel(
        input_dim=num_tfs,
        output_dim=num_tfs,
        lr=lr,
        hidden_layer_num=hidden_layer_num,
        hidden_layer_sizes=hidden_layer_sizes,
        activation=activation,
        optimizer=optimizer,
        L2_regularization_term=L2_regularization_term,
        dropout_rate=dropout_rate,
    )

    # create trainer
    trainer = Trainer(
        max_epochs=max_epochs,
        deterministic=True,
        accelerator="cpu",
        # Only the best-model checkpoint is used here: periodic checkpoints for
        # every trial would produce far too many files. Add periodic_checkpoint
        # back when looking more closely at one specific hyperparameter config.
        callbacks=[best_model_checkpoint],
        logger=[tb_logger, csv_logger],
    )

    # train model
    trainer.fit(model, data_module)

    # return the most recently logged validation loss (from the final epoch)
    # as a plain Python float, which is what Optuna expects
    return trainer.callback_metrics["val_loss"].item()
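
Optuna's documentation says that categorical choices should be `None`, `bool`, `int`, `float`, or `str`. Passing lists, as `hidden_layer_sizes` does above, works with the default in-memory study but may trigger a warning and is not supported by persistent storage. One possible workaround, sketched here under that assumption, is to encode each configuration as a string and decode it inside the objective. The helpers `encode_sizes` and `decode_sizes` are hypothetical names.

```python
def encode_sizes(sizes):
    """Encode a list of layer sizes as a string, e.g. [256, 64] -> '256-64'."""
    return "-".join(str(size) for size in sizes)


def decode_sizes(encoded):
    """Decode a string produced by encode_sizes back into a list of ints."""
    return [int(size) for size in encoded.split("-")]


hidden_layer_sizes_configurations = {
    1: [[64], [256]],
    2: [[64, 32], [256, 64]],
    3: [[256, 128, 32]],
    5: [[512, 256, 128, 64, 32]],
}

# string choices that could be passed to trial.suggest_categorical
encoded_choices = {
    num_layers: [encode_sizes(config) for config in configs]
    for num_layers, configs in hidden_layer_sizes_configurations.items()
}

print(encoded_choices[2])                    # ['64-32', '256-64']
print(decode_sizes(encoded_choices[2][1]))   # [256, 64]
```

Inside the objective, the string returned by `trial.suggest_categorical` for `encoded_choices[hidden_layer_num]` would then be passed through `decode_sizes` before constructing the model.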

Now we can run the hyperparameter sweep and print the best set of hyperparameters it found. Note that this produces a large amount of output, since a model is trained for every trial. With `n_trials=5`, Optuna samples five configurations from the search space rather than trying every possible combination.

In [6]:
# Perform hyperparameter optimization using Optuna
study = optuna.create_study(direction="minimize")
study.optimize(objective, n_trials=5)

# Get the best hyperparameters and their corresponding values
best_params = study.best_params
best_loss = study.best_value

print("\n" * 5)
print("RESULTS" + ("=" * 70))
print(f"Best hyperparameters: {best_params}")
print(f"Best loss: {best_loss}")

[I 2024-03-18 10:30:40,965] A new study created in memory with name: no-name-a994b30b-aa3d-4830-8de9-152b9aaaab76
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


About to create model with the following hyperparameters:
lr: 0.001
hidden_layer_num: 1
hidden_layer_sizes: [64]
activation: ReLU
optimizer: Adam
L2_regularization_term: 0.01
dropout_rate: 0.5
batch_size: 32
max_epochs: 2

bm - adjustment function provided to dataLoader setup


  X_train, Y_train = torch.tensor(X_train, dtype=torch.float32), torch.tensor(
  X_val, Y_val = torch.tensor(X_val, dtype=torch.float32), torch.tensor(
  X_test, Y_test = torch.tensor(X_test, dtype=torch.float32), torch.tensor(

  | Name          | Type       | Params
---------------------------------------------
0 | activation    | ReLU       | 0     
1 | input_layer   | Linear     | 896   
2 | hidden_layers | ModuleList | 0     
3 | output_layer  | Linear     | 845   
4 | dropout       | Dropout    | 0     
---------------------------------------------
1.7 K     Trainable params
0         Non-trainable params
1.7 K     Total params
0.007     Total estimated model params size (MB)


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

`Trainer.fit` stopped: `max_epochs=2` reached.
[I 2024-03-18 10:31:04,540] Trial 0 finished with value: 4.347833156585693 and parameters: {'lr': 0.001, 'hidden_layer_num': 1, 'activation': 'ReLU', 'optimizer': 'Adam', 'L2_regularization_term': 0.01, 'dropout_rate': 0.5, 'batch_size': 32, 'max_epochs': 2, 'hidden_layer_sizes_1_layers': [64]}. Best is trial 0 with value: 4.347833156585693.
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


About to create model with the following hyperparameters:
lr: 0.1
hidden_layer_num: 3
hidden_layer_sizes: [256, 128, 32]
activation: LeakyReLU
optimizer: RMSprop
L2_regularization_term: 0.1
dropout_rate: 0.3
batch_size: 128
max_epochs: 2

bm - adjustment function provided to dataLoader setup


/Users/benmueller/2024Classes/BrentResearch/git_repos/yeastdnnexplorer/.venv/lib/python3.11/site-packages/pytorch_lightning/callbacks/model_checkpoint.py:653: Checkpoint directory logs/tensorboard_logs/lightning_logs/version_60/checkpoints exists and is not empty.

  | Name          | Type       | Params
---------------------------------------------
0 | activation    | LeakyReLU  | 0     
1 | input_layer   | Linear     | 3.6 K 
2 | hidden_layers | ModuleList | 37.0 K
3 | output_layer  | Linear     | 429   
4 | dropout       | Dropout    | 0     
---------------------------------------------
41.0 K    Trainable params
0         Non-trainable params
41.0 K    Total params
0.164     Total estimated model params size (MB)


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

/Users/benmueller/2024Classes/BrentResearch/git_repos/yeastdnnexplorer/.venv/lib/python3.11/site-packages/pytorch_lightning/loops/fit_loop.py:298: The number of training batches (25) is smaller than the logging interval Trainer(log_every_n_steps=50). Set a lower value for log_every_n_steps if you want to see logs for the training epoch.


Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

`Trainer.fit` stopped: `max_epochs=2` reached.
[I 2024-03-18 10:31:25,804] Trial 1 finished with value: 49579.48828125 and parameters: {'lr': 0.1, 'hidden_layer_num': 3, 'activation': 'LeakyReLU', 'optimizer': 'RMSprop', 'L2_regularization_term': 0.1, 'dropout_rate': 0.3, 'batch_size': 128, 'max_epochs': 2, 'hidden_layer_sizes_3_layers': [256, 128, 32]}. Best is trial 0 with value: 4.347833156585693.
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


About to create model with the following hyperparameters:
lr: 0.1
hidden_layer_num: 1
hidden_layer_sizes: [64]
activation: Tanh
optimizer: Adam
L2_regularization_term: 0.1
dropout_rate: 0.3
batch_size: 32
max_epochs: 2

bm - adjustment function provided to dataLoader setup



  | Name          | Type       | Params
---------------------------------------------
0 | activation    | Tanh       | 0     
1 | input_layer   | Linear     | 896   
2 | hidden_layers | ModuleList | 0     
3 | output_layer  | Linear     | 845   
4 | dropout       | Dropout    | 0     
---------------------------------------------
1.7 K     Trainable params
0         Non-trainable params
1.7 K     Total params
0.007     Total estimated model params size (MB)


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

`Trainer.fit` stopped: `max_epochs=2` reached.
[I 2024-03-18 10:31:48,495] Trial 2 finished with value: 2.718580961227417 and parameters: {'lr': 0.1, 'hidden_layer_num': 1, 'activation': 'Tanh', 'optimizer': 'Adam', 'L2_regularization_term': 0.1, 'dropout_rate': 0.3, 'batch_size': 32, 'max_epochs': 2, 'hidden_layer_sizes_1_layers': [64]}. Best is trial 2 with value: 2.718580961227417.
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


About to create model with the following hyperparameters:
lr: 0.1
hidden_layer_num: 5
hidden_layer_sizes: [512, 256, 128, 64, 32]
activation: Tanh
optimizer: SGD
L2_regularization_term: 0
dropout_rate: 0.3
batch_size: 128
max_epochs: 2

bm - adjustment function provided to dataLoader setup



  | Name          | Type       | Params
---------------------------------------------
0 | activation    | Tanh       | 0     
1 | input_layer   | Linear     | 7.2 K 
2 | hidden_layers | ModuleList | 174 K 
3 | output_layer  | Linear     | 429   
4 | dropout       | Dropout    | 0     
---------------------------------------------
182 K     Trainable params
0         Non-trainable params
182 K     Total params
0.729     Total estimated model params size (MB)


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

`Trainer.fit` stopped: `max_epochs=2` reached.
[I 2024-03-18 10:32:10,222] Trial 3 finished with value: 2.8252861499786377 and parameters: {'lr': 0.1, 'hidden_layer_num': 5, 'activation': 'Tanh', 'optimizer': 'SGD', 'L2_regularization_term': 0, 'dropout_rate': 0.3, 'batch_size': 128, 'max_epochs': 2, 'hidden_layer_sizes_5_layers': [512, 256, 128, 64, 32]}. Best is trial 2 with value: 2.718580961227417.
GPU available: False, used: False
TPU available: False, using: 0 TPU cores
IPU available: False, using: 0 IPUs
HPU available: False, using: 0 HPUs


About to create model with the following hyperparameters:
lr: 0.01
hidden_layer_num: 5
hidden_layer_sizes: [512, 256, 128, 64, 32]
activation: Tanh
optimizer: RMSprop
L2_regularization_term: 0.1
dropout_rate: 0.3
batch_size: 128
max_epochs: 2

bm - adjustment function provided to dataLoader setup



  | Name          | Type       | Params
---------------------------------------------
0 | activation    | Tanh       | 0     
1 | input_layer   | Linear     | 7.2 K 
2 | hidden_layers | ModuleList | 174 K 
3 | output_layer  | Linear     | 429   
4 | dropout       | Dropout    | 0     
---------------------------------------------
182 K     Trainable params
0         Non-trainable params
182 K     Total params
0.729     Total estimated model params size (MB)


Sanity Checking: |          | 0/? [00:00<?, ?it/s]

Training: |          | 0/? [00:00<?, ?it/s]

Validation: |          | 0/? [00:00<?, ?it/s]

`Trainer.fit` stopped: `max_epochs=2` reached.
[I 2024-03-18 10:32:32,349] Trial 4 finished with value: 2.3827872276306152 and parameters: {'lr': 0.01, 'hidden_layer_num': 5, 'activation': 'Tanh', 'optimizer': 'RMSprop', 'L2_regularization_term': 0.1, 'dropout_rate': 0.3, 'batch_size': 128, 'max_epochs': 2, 'hidden_layer_sizes_5_layers': [512, 256, 128, 64, 32]}. Best is trial 4 with value: 2.3827872276306152.








Best hyperparameters: {'lr': 0.01, 'hidden_layer_num': 5, 'activation': 'Tanh', 'optimizer': 'RMSprop', 'L2_regularization_term': 0.1, 'dropout_rate': 0.3, 'batch_size': 128, 'max_epochs': 2, 'hidden_layer_sizes_5_layers': [512, 256, 128, 64, 32]}
Best loss: 2.3827872276306152
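
To put the five trials above in context, the sketch below counts how many distinct configurations the search space in `objective` contains. It is plain arithmetic over the choice lists copied from that function, and treats each (number of hidden layers, layer sizes) pair as one architecture choice.

```python
from math import prod

# choices copied from the objective function above
independent_choices = {
    "lr": [1e-4, 1e-3, 1e-2, 1e-1],
    "activation": ["ReLU", "Sigmoid", "Tanh", "LeakyReLU"],
    "optimizer": ["Adam", "SGD", "RMSprop"],
    "L2_regularization_term": [0, 0.1, 0.01],
    "dropout_rate": [0, 0.3, 0.5],
    "batch_size": [32, 128],
    "max_epochs": [2],
}

hidden_layer_sizes_configurations = {
    1: [[64], [256]],
    2: [[64, 32], [256, 64]],
    3: [[256, 128, 32]],
    5: [[512, 256, 128, 64, 32]],
}

# number of (hidden_layer_num, hidden_layer_sizes) pairs
num_architectures = sum(
    len(configs) for configs in hidden_layer_sizes_configurations.values()
)
# number of combinations of all other hyperparameters
num_independent = prod(len(values) for values in independent_choices.values())

total_configurations = num_architectures * num_independent
print(total_configurations)  # 5184
```

With 5,184 possible configurations, `n_trials=5` explores only a very small sample, so the "best" hyperparameters reported above should be read as a sanity check rather than as the result of an exhaustive search.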
