# DLBS - ms-convSTAR Experiments

## Introduction

This notebook documents a series of training and tuning experiments with the ms-convSTAR model. Following [Karpathy's A Recipe for Training Neural Networks](https://karpathy.github.io/2019/04/25/recipe/), it covers the following steps:

1. **Overfit Test:** An initial overfit test on a single batch to check that the model has enough capacity to fit the data and that the training pipeline works.

2. **First Training Run:** An initial training run to establish a baseline for model performance.

3. **Regularization Techniques:**
   - **Batch Size:** Investigating the impact of different batch sizes.
   - **Dropout:** Exploring the effectiveness of dropout regularization.
   - **Weight Decay:** Analyzing the influence of weight decay.
   - **Augmentation Rate:** Examining the effects of data augmentation.

4. **Hyperparameter Tuning:**
   - **Learning Rate:** Fine-tuning the learning rate to optimize model performance and convergence.

In [None]:
# imports
from src.dlbs import train_dlbs

## Overfit Test

In this section, we run an overfit test by deliberately training the model on a single batch of data. This is enabled by setting the parameter `one_batch_training` to `True`. The goal is to check that the model can memorize this one batch and that the training pipeline works end to end. Dropout, weight decay, and augmentation are all set to zero for this test.
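
How `one_batch_training` is implemented inside `train_dlbs` is not shown in this notebook. As a minimal sketch of the idea, assuming the data loader is any iterable of batches, a wrapper can fetch the first batch once and replay it at every step. The name `repeat_first_batch` is hypothetical and not part of `src.dlbs`.

```python
def repeat_first_batch(loader, steps):
    """Yield the first batch of `loader` exactly `steps` times.

    Hypothetical helper for illustration; not part of src.dlbs.
    """
    first_batch = next(iter(loader))  # fetch one batch and keep it
    for _ in range(steps):
        yield first_batch  # the model sees the same samples at every step


# Toy "loader" with three batches of sample ids.
toy_loader = [[0, 1, 2, 3], [4, 5, 6, 7], [8, 9, 10, 11]]
replayed = list(repeat_first_batch(toy_loader, steps=5))
print(replayed[0], len(replayed))
```

Because every step sees identical samples, the training loss would be expected to fall steadily. If it does not, a bug in the model, the loss, or the data pipeline is a likely cause.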

In [None]:
output = train_dlbs.main(
    datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
    workers= 0,
    batchsize=4,
    epochs=1000,
    lr=1e-4,
    snapshot=None,
    checkpoint_dir="./checkpoint/",
    layer=3,
    lambda_1=0.0,
    lambda_2=0.0,
    weight_decay=0.0,
    hidden=64,
    lrS=2,
    name="msConvSTAR",
    dropout=0.0,
    stage=3,
    clip=5,
    seed=0,
    gt_path="./raw_data/ZueriCrop/labels.csv",
    cell="star",
    input_dim=4,
    fold_num='all',
    project= "DLBS-MS-Convstar",
    run_group="Overfit-Test",
    one_batch_training = True,
    augment_rate=0.,
)

## First Run

In this section, we run the first full training of the model without dropout, weight decay, or augmentation. This run serves as the baseline against which the later regularized runs are compared.

In [None]:
output = train_dlbs.main(
    datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
    workers= 0,
    batchsize=4,
    epochs=30,
    lr=1e-4,
    snapshot=None,
    checkpoint_dir="./checkpoint/",
    layer=6,
    lambda_1=0.1,
    lambda_2=0.5,
    weight_decay=0,
    hidden=64,
    lrS=2,
    name="msConvSTAR",
    dropout=0,
    stage=3,
    clip=5,
    seed=0,
    gt_path="./raw_data/ZueriCrop/labels.csv",
    cell="star",
    input_dim=4,
    fold_num='all',
    project= "DLBS-MS-Convstar",
    run_group="Run-Unregularized",
    augment_rate=0.,
)

## Regularization

In this section, we systematically explore the impact of various regularization techniques on our model's performance. We investigate the following regularization parameters:

### Batch Sizes:
- Batch Size 4
- Batch Size 16

### Dropout Rates:
- Dropout Rate 0.3
- Dropout Rate 0.5
- Dropout Rate 0.7

### Weight Decay Values:
- Weight Decay 0.001
- Weight Decay 0.0001

### Augmentation Rates:
- Augmentation Rate 0.33
- Augmentation Rate 0.66

Each run changes a single parameter relative to the unregularized baseline, so its effect on training and generalization can be compared directly. The aim is to find values that reduce overfitting without hurting model performance.
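
The parameter values listed above can be expanded into individual runs that each change one setting relative to the unregularized baseline. The sketch below is only an illustration: the cells in this notebook call `train_dlbs.main` with all arguments written out, and the helper name `one_factor_runs` and the generated labels are assumptions, not the run groups used here.

```python
# Settings of the unregularized baseline run (batch size 4 is the baseline value).
baseline = {"batchsize": 4, "dropout": 0.0, "weight_decay": 0.0, "augment_rate": 0.0}

# Non-baseline values explored for each parameter.
sweeps = {
    "batchsize": [16],
    "dropout": [0.3, 0.5, 0.7],
    "weight_decay": [0.001, 0.0001],
    "augment_rate": [0.33, 0.66],
}


def one_factor_runs(baseline, sweeps):
    """Return (label, config) pairs that each override exactly one baseline setting."""
    runs = []
    for name, values in sweeps.items():
        for value in values:
            config = {**baseline, name: value}  # copy baseline, change one key
            runs.append((f"{name}={value}", config))
    return runs


runs = one_factor_runs(baseline, sweeps)
for label, config in runs:
    print(label, config)
```

Keeping every other setting fixed means a difference between a run and the baseline can be attributed to the one parameter that changed.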

### Batch Size

In [None]:
output = train_dlbs.main(
    datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
    workers= 0,
    batchsize=16,
    epochs=30,
    lr=1e-4,
    snapshot=None,
    checkpoint_dir="./checkpoint/",
    layer=6,
    lambda_1=0.1,
    lambda_2=0.5,
    weight_decay=0,
    hidden=64,
    lrS=2,
    name="msConvSTAR",
    dropout=0,
    stage=3,
    clip=5,
    seed=0,
    gt_path="./raw_data/ZueriCrop/labels.csv",
    cell="star",
    input_dim=4,
    fold_num='all',
    project= "DLBS-MS-Convstar",
    run_group="Run-Batchsizes-16",
    augment_rate=0.,
)

### Dropout

In [None]:
output = train_dlbs.main(
    datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
    workers= 0,
    batchsize=4,
    epochs=30,
    lr=1e-4,
    snapshot=None,
    checkpoint_dir="./checkpoint/",
    layer=6,
    lambda_1=0.1,
    lambda_2=0.5,
    weight_decay=0,
    hidden=64,
    lrS=2,
    name="msConvSTAR",
    dropout=0.3,
    stage=3,
    clip=5,
    seed=0,
    gt_path="./raw_data/ZueriCrop/labels.csv",
    cell="star",
    input_dim=4,
    fold_num='all',
    project= "DLBS-MS-Convstar",
    run_group="Run-Dropout-0.3",
    augment_rate=0.,
)

In [None]:
output = train_dlbs.main(
    datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
    workers= 0,
    batchsize=4,
    epochs=30,
    lr=1e-4,
    snapshot=None,
    checkpoint_dir="./checkpoint/",
    layer=6,
    lambda_1=0.1,
    lambda_2=0.5,
    weight_decay=0,
    hidden=64,
    lrS=2,
    name="msConvSTAR",
    dropout=0.5,
    stage=3,
    clip=5,
    seed=0,
    gt_path="./raw_data/ZueriCrop/labels.csv",
    cell="star",
    input_dim=4,
    fold_num='all',
    project= "DLBS-MS-Convstar",
    run_group="Run-Dropout-0.5",
    augment_rate=0.,
)

In [None]:
output = train_dlbs.main(
    datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
    workers= 0,
    batchsize=4,
    epochs=30,
    lr=1e-4,
    snapshot=None,
    checkpoint_dir="./checkpoint/",
    layer=6,
    lambda_1=0.1,
    lambda_2=0.5,
    weight_decay=0,
    hidden=64,
    lrS=2,
    name="msConvSTAR",
    dropout=0.7,
    stage=3,
    clip=5,
    seed=0,
    gt_path="./raw_data/ZueriCrop/labels.csv",
    cell="star",
    input_dim=4,
    fold_num='all',
    project= "DLBS-MS-Convstar",
    run_group="Run-Dropout-0.7",
    augment_rate=0.,
)

### Weight Decay

In [None]:
output = train_dlbs.main(
    datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
    workers= 0,
    batchsize=4,
    epochs=30,
    lr=1e-4,
    snapshot=None,
    checkpoint_dir="./checkpoint/",
    layer=6,
    lambda_1=0.1,
    lambda_2=0.5,
    weight_decay=0.001,
    hidden=64,
    lrS=2,
    name="msConvSTAR",
    dropout=0.,
    stage=3,
    clip=5,
    seed=0,
    gt_path="./raw_data/ZueriCrop/labels.csv",
    cell="star",
    input_dim=4,
    fold_num='all',
    project= "DLBS-MS-Convstar",
    run_group="Run-Weigh-Decay-0.001",
    augment_rate=0.,
)

In [None]:
output = train_dlbs.main(
    datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
    workers= 0,
    batchsize=4,
    epochs=30,
    lr=1e-4,
    snapshot=None,
    checkpoint_dir="./checkpoint/",
    layer=6,
    lambda_1=0.1,
    lambda_2=0.5,
    weight_decay=0.0001,
    hidden=64,
    lrS=2,
    name="msConvSTAR",
    dropout=0.,
    stage=3,
    clip=5,
    seed=0,
    gt_path="./raw_data/ZueriCrop/labels.csv",
    cell="star",
    input_dim=4,
    fold_num='all',
    project= "DLBS-MS-Convstar",
    run_group="Run-Weigh-Decay-0.0001",
    augment_rate=0.,
)

### Augmentation Rate

In [None]:
output = train_dlbs.main(
    datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
    workers= 0,
    batchsize=4,
    epochs=30,
    lr=1e-4,
    snapshot=None,
    checkpoint_dir="./checkpoint/",
    layer=6,
    lambda_1=0.1,
    lambda_2=0.5,
    weight_decay=0.,
    hidden=64,
    lrS=2,
    name="msConvSTAR",
    dropout=0.,
    stage=3,
    clip=5,
    seed=0,
    gt_path="./raw_data/ZueriCrop/labels.csv",
    cell="star",
    input_dim=4,
    fold_num='all',
    project= "DLBS-MS-Convstar",
    run_group="Run-augment_rate-0.33",
    augment_rate=0.33,
)

In [None]:
output = train_dlbs.main(
    datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
    workers= 0,
    batchsize=4,
    epochs=30,
    lr=1e-4,
    snapshot=None,
    checkpoint_dir="./checkpoint/",
    layer=6,
    lambda_1=0.1,
    lambda_2=0.5,
    weight_decay=0.,
    hidden=64,
    lrS=2,
    name="msConvSTAR",
    dropout=0.,
    stage=3,
    clip=5,
    seed=0,
    gt_path="./raw_data/ZueriCrop/labels.csv",
    cell="star",
    input_dim=4,
    fold_num='all',
    project= "DLBS-MS-Convstar",
    run_group="Run-augment_rate-0.66",
    augment_rate=0.66,
)

## Hyperparameter Tuning

In this section, we focus on hyperparameter tuning, specifically exploring different learning rates to optimize our model's performance. We consider the following learning rates:

- Learning Rate: 5e-4
- Learning Rate: 1e-5
- Learning Rate: 1e-6

By varying the learning rate, we aim to identify the setting that gives the fastest convergence and the best model accuracy. Note that these runs use a dropout rate of 0.5; all other settings match the baseline run.
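
One detail helps when reading the logged results: the cell below builds each run group name by interpolating the float `lr` into an f-string, and Python's default float formatting switches between fixed and scientific notation depending on magnitude. The sketch below shows the names this produces, along with an optional uniform format (`:.0e`), which is a suggestion and not what this notebook uses.

```python
lrs = [5e-4, 1e-5, 1e-6]

# Default float formatting, as in f"Run-learning-rate-{lr}".
default_names = [f"Run-learning-rate-{lr}" for lr in lrs]

# Optional alternative: force scientific notation with no decimals.
uniform_names = [f"Run-learning-rate-{lr:.0e}" for lr in lrs]

for default, uniform in zip(default_names, uniform_names):
    print(f"{default:<28} {uniform}")
```

With the default formatting, the first group is named `Run-learning-rate-0.0005`, while the other two use scientific notation (`1e-05`, `1e-06`).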

In [None]:
lrs = [5e-4, 1e-5, 1e-6]
for lr in lrs:
    output = train_dlbs.main(
        datadir = "../scratch/AgroLuege/ZueriCrop/ZueriCrop.hdf5",
        workers= 0,
        batchsize=4,
        epochs=30,
        lr=lr,
        snapshot=None,
        checkpoint_dir="./checkpoint/",
        layer=6,
        lambda_1=0.1,
        lambda_2=0.5,
        weight_decay=0,
        hidden=64,
        lrS=2,
        name="msConvSTAR",
        dropout=0.5,
        stage=3,
        clip=5,
        seed=0,
        gt_path="./raw_data/ZueriCrop/labels.csv",
        cell="star",
        input_dim=4,
        fold_num='all',
        project= "DLBS-MS-Convstar",
        run_group=f"Run-learning-rate-{lr}",
        augment_rate=0.,
    )