## Imports and Configuration

In [1]:
import os
import datetime
import wandb

import torch

import pytorch_lightning as pl
from pytorch_lightning import Trainer
from pytorch_lightning.loggers import WandbLogger
from pytorch_lightning.callbacks.early_stopping import EarlyStopping
from pytorch_lightning.utilities.model_summary import ModelSummary

from config.load_configuration import load_configuration
from data.datamodule import ECG_DataModule
from model.model import UNET_1D

#### Loading Configuration

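This notebook loads its settings with the `load_configuration` function from the `config.load_configuration` module. As the output below shows, the function reports the PC name and the YAML file it loaded (here `config/config_lukas.yaml`). The result is stored in the `config` variable, and later cells read from it with dictionary-style access such as `config['batch_size']`.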

In [2]:
config = load_configuration()

PC Name: DESKTOP-LUKAS
Loaded configuration from config/config_lukas.yaml
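
The later cells read a fixed set of keys from `config`. A minimal sketch of a sanity check that could run right after loading, assuming `config` behaves like a plain dictionary; the helper name `missing_config_keys` is hypothetical and not part of the project, and the example values are made up:

```python
# Keys that the cells in this notebook read from `config`.
REQUIRED_KEYS = (
    "seed", "path_to_data", "batch_size", "num_workers", "persistent_workers",
    "feature_list", "wandb_project_name", "wandb_experiment_name",
    "max_epochs", "learning_rate", "path_to_models", "number_of_trials",
)


def missing_config_keys(config, required=REQUIRED_KEYS):
    """Return the required keys that are absent from `config`, in order."""
    return [key for key in required if key not in config]


# Example with a deliberately incomplete, made-up configuration.
example = {"seed": 42, "batch_size": 32}
missing = missing_config_keys(example)
if missing:
    print("Missing configuration keys:", ", ".join(missing))
```

Failing early on a missing key is usually cheaper than hitting a `KeyError` after the data module and model have already been built.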


#### Logging in to Weights & Biases (wandb)

Before starting any experiment tracking, ensure you are logged in to your Weights & Biases (wandb) account. This enables automatic logging of metrics, model checkpoints, and experiment configurations. The following code logs you in to wandb:

```python
wandb.login()
```
If you are running this for the first time, you may be prompted to enter your API key.

In [3]:
wandb.login()

[34m[1mwandb[0m: Currently logged in as: [33mlukas-pelz[0m ([33mHKA-EKG-Signalverarbeitung[0m) to [32mhttps://api.wandb.ai[0m. Use [1m`wandb login --relogin`[0m to force relogin


True

#### Setting Seeds for Reproducibility

To ensure comparable and reproducible results, we set the random seed using the `seed_everything` function from PyTorch Lightning. This helps in achieving consistent behavior across multiple runs of the notebook.

In [4]:
pl.seed_everything(config['seed'])
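os.environ["TF_ENABLE_ONEDNN_OPTS"] = "0"   # TensorFlow setting that disables oneDNN optimizations; only relevant if TensorFlow gets imported
os.environ["CUDA_LAUNCH_BLOCKING"] = "1"    # run CUDA kernels synchronously: easier debugging, slower execution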

Seed set to 42
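
For orientation, a rough sketch of the kind of seeding involved, assuming that Python's `random`, NumPy, and PyTorch generators are the ones that matter here. The helper `seed_all` is hypothetical and simplified; in the notebook itself, Lightning's `seed_everything` should be preferred:

```python
import random


def seed_all(seed: int) -> int:
    """Seed the common random number generators (simplified illustration)."""
    random.seed(seed)
    try:
        import numpy as np
        np.random.seed(seed)
    except ImportError:
        pass  # NumPy is not installed; nothing to seed
    try:
        import torch
        torch.manual_seed(seed)  # seeds the PyTorch generators on all devices
    except ImportError:
        pass  # PyTorch is not installed; nothing to seed
    return seed


# Seeding twice with the same value reproduces the same random draw.
seed_all(42)
first = random.random()
seed_all(42)
second = random.random()
print(first == second)  # True
```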


#### Checking for GPU Devices

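In this step, we select the GPU when CUDA is available and fall back to the CPU otherwise, then print the PyTorch version, the selected device, and, on CUDA, the CUDA version, device name, and current memory usage. On CUDA, `torch.set_float32_matmul_precision('high')` also permits faster, reduced-precision float32 matrix multiplications.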

In [5]:
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print("="*40)
print(f"Torch Version      : {torch.__version__}")
print(f"Selected Device    : {device}")
if device.type == 'cuda':
    print(f"CUDA Version       : {torch.version.cuda}")
    print(f"Device Name        : {torch.cuda.get_device_name(0)}")
    allocated = torch.cuda.memory_allocated(0) / 1024**3
    reserved = torch.cuda.memory_reserved(0) / 1024**3
    print(f"Memory Usage       : Allocated: {allocated:.2f} GB | Reserved: {reserved:.2f} GB")
    torch.set_float32_matmul_precision('high')
else:
    print("CUDA not available, using CPU.")
print("="*40)

Torch Version      : 2.7.0+cu128
Selected Device    : cuda
CUDA Version       : 12.8
Device Name        : NVIDIA GeForce RTX 5060 Ti
Memory Usage       : Allocated: 0.00 GB | Reserved: 0.00 GB


#### Initializing the Data Module

The `ECG_DataModule` is initialized with the data path, batch size, data-loader worker settings, and feature list from the configuration. Calling `setup()` prepares the datasets, and the first training sample is printed as a quick check.

In [6]:
dm = ECG_DataModule(
    data_dir=config['path_to_data'],
    batch_size=config['batch_size'],
    # num_workers=0,
    # persistent_workers=False,
    num_workers=config['num_workers'],
    persistent_workers=config['persistent_workers'],
    feature_list=config['feature_list']
)
dm.setup()
dm.train_dataset[0]  # Warm up dataset (for reproducibility when using num_workers > 0)
print(dm.train_dataset[0])

(tensor([[ 4.7929e-02,  3.8235e-02,  2.8590e-02,  1.9708e-02,  1.2195e-02,
          6.3819e-03,  2.1981e-03, -8.3775e-04, -3.4762e-03, -6.4977e-03,
         -1.0486e-02, -1.5690e-02, -2.1992e-02, -2.8956e-02, -3.5935e-02,
         -4.2190e-02, -4.7024e-02, -4.9905e-02, -5.0558e-02, -4.9023e-02,
         -4.5645e-02, -4.1028e-02, -3.5963e-02, -3.1344e-02, -2.8020e-02,
         -2.6617e-02, -2.7363e-02, -3.0024e-02, -3.3974e-02, -3.8402e-02,
         -4.2604e-02, -4.6277e-02, -4.9659e-02, -5.3412e-02, -5.8275e-02,
         -6.4655e-02, -7.2369e-02, -8.0631e-02, -8.8223e-02, -9.3761e-02,
         -9.5921e-02, -9.3597e-02, -8.5904e-02, -7.2036e-02, -5.1030e-02,
         -2.1583e-02,  1.7936e-02,  6.9222e-02,  1.3354e-01,  2.1113e-01,
          3.0069e-01,  3.9899e-01,  5.0079e-01,  5.9897e-01,  6.8501e-01,
          7.4992e-01,  7.8559e-01,  7.8630e-01,  7.5014e-01,  6.7966e-01,
          5.8162e-01,  4.6583e-01,  3.4354e-01,  2.2580e-01,  1.2191e-01,
          3.8272e-02, -2.2199e-02, -5

#### Creating the Model

In this step, we instantiate the `UNET_1D` model with one input channel and one output channel per entry in `feature_list`, then print its summary using the `ModelSummary` utility from PyTorch Lightning. The summary lists the model's layers, their parameter counts, and their input and output sizes.

In [7]:
model = UNET_1D(
    in_channels=1, 
    layer_n=512, 
    out_channels=len(config['feature_list']), 
    kernel_size=5
)
print(ModelSummary(model, max_depth=-1))  
print(type(model).__name__)

    | Name                                            | Type                  | Params | Mode  | In sizes      | Out sizes    
------------------------------------------------------------------------------------------------------------------------------------
0   | criterion                                       | BCEWithLogitsLoss     | 0      | train | ?             | ?            
1   | train_jaccard                                   | BinaryJaccardIndex    | 0      | train | ?             | ?            
2   | val_jaccard                                     | BinaryJaccardIndex    | 0      | train | ?             | ?            
3   | test_jaccard                                    | BinaryJaccardIndex    | 0      | train | ?             | ?            
4   | multi_tolerance_metrics                         | MultiToleranceWrapper | 0      | train | ?             | ?            
5   | multi_tolerance_metrics.metrics                 | ModuleDict            | 0      | train | ?       
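
The layer summaries printed during training in this notebook list `AvgPool1d` layers that map `[1, 64, 512]` to `[1, 64, 256]` and `[1, 128, 256]` to `[1, 128, 128]`. A small sketch of the standard pooling length formula reproduces that halving, assuming `kernel_size=2`, `stride=2`, and no padding. These pooling settings are an assumption, since the `UNET_1D` source is not shown here, and the helper name is hypothetical:

```python
def pool1d_out_length(length: int, kernel_size: int = 2, stride: int = 2, padding: int = 0) -> int:
    """Output length of a 1D pooling layer with ceil_mode disabled."""
    return (length + 2 * padding - kernel_size) // stride + 1


# Follow the sequence length through two consecutive pooling stages.
length = 512
for stage in range(1, 3):
    new_length = pool1d_out_length(length)
    print(f"pool {stage}: {length} -> {new_length}")
    length = new_length
```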

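#### Training the Model

The run is logged to wandb through `WandbLogger` and trained with a Lightning `Trainer` that stops early once `val_loss` has not improved for 5 consecutive validation checks. Afterwards, a checkpoint is saved under `config['path_to_models']`.

In [None]: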
current_time = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M")
print(f"Current Time       : {current_time}")

# Initialize wandb logger (https://wandb.ai/HKA-EKG-Signalverarbeitung)
wandb_logger = WandbLogger(
    project=config['wandb_project_name'],
    name=f"{config['wandb_experiment_name']}_{type(model).__name__}_{current_time}",
    config={
        'model': type(model).__name__,
        'dataset': type(dm).__name__,
        'batch_size': config['batch_size'],
        'max_epochs': config['max_epochs'],
        'learning_rate': config['learning_rate']
    }
)

# Initialize Trainer with wandb logger, using early stopping callback (https://lightning.ai/docs/pytorch/stable/common/early_stopping.html)
trainer = Trainer(
    max_epochs=config['max_epochs'],
    default_root_dir='model/checkpoint/',
    accelerator="auto",
    devices="auto",
    strategy="auto",
    callbacks=[EarlyStopping(monitor='val_loss', patience=5, mode='min')],
    logger=wandb_logger,
)

trainer.fit(model=model, datamodule=dm)

# Finish wandb
wandb.finish()

# Create a filename with date identifier
model_filename = f"{config['wandb_experiment_name']}_{type(model).__name__}_{current_time}.ckpt"

# Save a full Lightning checkpoint (model weights plus trainer state) to the path specified in config
save_path = os.path.join(config['path_to_models'], model_filename)
trainer.save_checkpoint(save_path)
print(f"Model checkpoint saved as {save_path}")

Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Current Time       : 2025-09-09_12-41


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

   | Name                    | Type                  | Params | Mode  | In sizes      | Out sizes    
-----------------------------------------------------------------------------------------------------------
0  | criterion               | BCEWithLogitsLoss     | 0      | train | ?             | ?            
1  | train_jaccard           | BinaryJaccardIndex    | 0      | train | ?             | ?            
2  | val_jaccard             | BinaryJaccardIndex    | 0      | train | ?             | ?            
3  | test_jaccard            | BinaryJaccardIndex    | 0      | train | ?             | ?            
4  | multi_tolerance_metrics | MultiToleranceWrapper | 0      | train | ?             | ?            
5  | AvgPool1D1              | AvgPool1d             | 0      | train | [1, 64, 512]  | [1, 64, 256] 
6  | AvgPool1D2              | AvgPool1d             | 0      | train | [1, 128, 256] | [1, 128, 128]
7  | AvgPool1D3              | Av

Epoch 0:  29%|██▊       | 137/477 [00:06<00:15, 21.75it/s, v_num=16db, train_loss_step=0.476]
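
The `EarlyStopping(monitor='val_loss', patience=5, mode='min')` callback used above stops training once the monitored loss has failed to improve for `patience` consecutive checks. A simplified sketch of that rule in plain Python, ignoring the callback's other options; the function name and the loss values are made up for illustration:

```python
def epochs_until_stop(val_losses, patience=5, min_delta=0.0):
    """Return the index of the epoch at which training would stop, or None."""
    best = float("inf")
    wait = 0
    for epoch, loss in enumerate(val_losses):
        if loss < best - min_delta:
            # Improvement: remember the new best value and reset the counter.
            best, wait = loss, 0
        else:
            wait += 1
            if wait >= patience:
                return epoch
    return None


# Made-up validation losses: three improving epochs, then a plateau.
losses = [0.50, 0.40, 0.35, 0.36, 0.36, 0.37, 0.36, 0.38, 0.39]
print(epochs_until_stop(losses))  # 7
```

With `patience=5`, the best value is reached at epoch 2 and the fifth non-improving check happens at epoch 7, so the remaining epochs are never run.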

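#### Hyperparameter Optimization with Optuna

An Optuna study minimizes the validation loss returned by `OptunaTrainer.run_training` over `config['number_of_trials']` trials and prints the best trial at the end.

In [None]: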
import optuna
from training.hyperparameter_optimization import OptunaTrainer

def objective(trial):
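    model = UNET_1D  # the class itself, not an instance, is handed to OptunaTrainer
    config["sweep_id"] = datetime.datetime.now().strftime("%Y-%m-%d_%H-%M")
    config["dataset_name"] = "ECG_TEST_DATASET"
    config["model_name"] = model.__name__  # type(model).__name__ would not yield "UNET_1D" for a class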
    trainer = OptunaTrainer(
        model=model,
        config=config
    )
    return trainer.run_training(trial)

# Optuna Hyperparameter Study
study = optuna.create_study(direction="minimize", study_name=f"Optuna_HPO_{datetime.datetime.now().strftime('%Y-%m-%d_%H-%M')}")

# Reduce output clutter by setting verbosity to WARNING
optuna.logging.set_verbosity(optuna.logging.WARNING)

# Start optimization
study.optimize(objective, n_trials=config['number_of_trials'], gc_after_trial=True, show_progress_bar=True)

# Best result
print("Best trial: ", study.best_trial)
print("Best value (loss): ", study.best_value)


[I 2025-09-04 10:42:04,031] A new study created in memory with name: Optuna_HPO_2025-09-04_10-42
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

   | Name                    | Type                  | Params | Mode  | In sizes      | Out sizes    
-----------------------------------------------------------------------------------------------------------
0  | criterion               | BCEWithLogitsLoss     | 0      | train | ?             | ?            
1  | train_jaccard           | BinaryJaccardIndex    | 0      | train | ?             | ?            
2  | val_jaccard             | BinaryJaccardIndex    | 0      | train | ?             | ?            
3  | test_jaccard            | BinaryJaccardIndex    | 0      | train | ?             | ?            
4  | multi_tolerance_metrics | MultiToleranceWrapper | 0      | train | ?             | ?            
5  | AvgPool1D1              | AvgPool1d             | 0      | train | [1, 64, 512]  | [1, 64, 256] 
6  | AvgPool1D2              | AvgPool1d             | 0      | train | [1, 128, 256] | [1, 128, 128]
7  | AvgPool1D3              | Av

Epoch 2: 100%|██████████| 840/840 [00:44<00:00, 19.01it/s, v_num=7cr7, train_loss_step=0.272, val_loss_step=0.270, val_loss_epoch=0.282, train_loss_epoch=0.307]

`Trainer.fit` stopped: `max_epochs=3` reached.


Epoch 2: 100%|██████████| 840/840 [00:44<00:00, 18.70it/s, v_num=7cr7, train_loss_step=0.272, val_loss_step=0.270, val_loss_epoch=0.282, train_loss_epoch=0.307]


[34m[1mwandb[0m: [32m[41mERROR[0m The nbformat package was not found. It is required to save notebook history.


Run history:
epoch,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▅▅▅▅▅▅▅▅▅▅▅▅▅▅███████████
train_loss_epoch,█▄▁
train_loss_step,██▇▇▇▆▆▆▆▅▅▅▅▅▄▄▄▄▄▄▃▃▃▃▃▃▃▂▂▂▂▂▂▁▂▁▁▁▁▁
trainer/global_step,▁▂▂▃▃▁▁▁▁▁▅▆▁▁▁▂▂▂▂▂▂▂▂▂▂▂▂▂▂▇██▂▂▂▂▂▂▂▂
val_loss_epoch,█▄▁
val_loss_step,▇▇▇█▇█▇▇▇██▇▇▇▇▇▇▇▇▄▄▄▄▄▄▄▄▃▄▃▄▄▁▁▁▂▁▁▁▂

Run summary:
epoch,2.0
train_loss_epoch,0.30652
train_loss_step,0.27188
trainer/global_step,2519.0
val_loss_epoch,0.28209
val_loss_step,0.27033


Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Optimization finished with best validation loss: 0.28209277987480164


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

   | Name                    | Type                  | Params | Mode  | In sizes      | Out sizes    
-----------------------------------------------------------------------------------------------------------
0  | criterion               | BCEWithLogitsLoss     | 0      | train | ?             | ?            
1  | train_jaccard           | BinaryJaccardIndex    | 0      | train | ?             | ?            
2  | val_jaccard             | BinaryJaccardIndex    | 0      | train | ?             | ?            
3  | test_jaccard            | BinaryJaccardIndex    | 0      | train | ?             | ?            
4  | multi_tolerance_metrics | MultiToleranceWrapper | 0      | train | ?             | ?            
5  | AvgPool1D1              | AvgPool1d             | 0      | train | [1, 64, 512]  | [1, 64, 256] 
6  | AvgPool1D2              | AvgPool1d             | 0      | train | [1, 128, 256] | [1, 128, 128]
7  | AvgPool1D3              | Av

Epoch 2: 100%|██████████| 1680/1680 [01:07<00:00, 24.72it/s, v_num=q79i, train_loss_step=0.0675, val_loss_step=0.0681, val_loss_epoch=0.0863, train_loss_epoch=0.0887]

`Trainer.fit` stopped: `max_epochs=3` reached.


Epoch 2: 100%|██████████| 1680/1680 [01:08<00:00, 24.43it/s, v_num=q79i, train_loss_step=0.0675, val_loss_step=0.0681, val_loss_epoch=0.0863, train_loss_epoch=0.0887]


[34m[1mwandb[0m: [32m[41mERROR[0m The nbformat package was not found. It is required to save notebook history.


Run history:
epoch,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▅▅▅▅▅▅▅▅▅▅▅▅███████████
train_loss_epoch,█▂▁
train_loss_step,█▇▇▆▅▅▄▄▄▃▃▃▃▃▂▂▂▂▂▂▂▂▁▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁
trainer/global_step,▁▂▃▁▁▁▁▁▂▅▆▆▆▂▂▂▂▂▂▂▂▂▂▇▇▇██▂▂▂▂▂▂▂▃▃▃▃▃
val_loss_epoch,█▃▁
val_loss_step,▅▅▄▄▄▆▆▄▄▃▅▄▄▅▅▃▄▄▂▃▂▃▃▄▂▂▃▂▄▂▂▃▂▃▁▂▃▁▁█

Run summary:
epoch,2.0
train_loss_epoch,0.08869
train_loss_step,0.06752
trainer/global_step,5039.0
val_loss_epoch,0.08629
val_loss_step,0.06815


Best trial: 1. Best value: 0.0862901:  40%|████      | 2/5 [07:37<11:40, 233.36s/it]

Optimization finished with best validation loss: 0.08629006892442703


Using 16bit Automatic Mixed Precision (AMP)
Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

   | Name                    | Type                  | Params | Mode  | In sizes      | Out sizes    
-----------------------------------------------------------------------------------------------------------
0  | criterion               | BCEWithLogitsLoss     | 0      | train | ?             | ?            
1  | train_jaccard           | BinaryJaccardIndex    | 0      | train | ?             | ?            
2  | val_jaccard             | BinaryJaccardIndex    | 0      | train | ?             | ?            
3  | test_jaccard            | BinaryJaccardIndex    | 0      | train | ?             | ?            
4  | multi_tolerance_metrics | MultiToleranceWrapper | 0      | train | ?             | ?            
5  | AvgPool1D1              | AvgPool1d             | 0      | train | [1, 64, 512]  | [1, 64, 256] 
6  | AvgPool1D2              | AvgPool1d             | 0      | train | [1, 128, 256] | [1, 128, 128]
7  | AvgPool1D3              | Av

Epoch 2: 100%|██████████| 3359/3359 [01:51<00:00, 30.23it/s, v_num=w35z, train_loss_step=0.458, val_loss_step=0.419, val_loss_epoch=0.424, train_loss_epoch=0.440]

`Trainer.fit` stopped: `max_epochs=3` reached.


Epoch 2: 100%|██████████| 3359/3359 [01:51<00:00, 30.01it/s, v_num=w35z, train_loss_step=0.458, val_loss_step=0.419, val_loss_epoch=0.424, train_loss_epoch=0.440]


[34m[1mwandb[0m: [32m[41mERROR[0m The nbformat package was not found. It is required to save notebook history.


Run history:
epoch,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▅▅▅▅▅▅▅▅▅▅▅████████████
train_loss_epoch,█▄▁
train_loss_step,█▆▅▄▅▄▄▄▄▄▄▄▄▃▄▄▃▃▃▃▃▃▄▃▂▂▃▂▂▂▂▂▂▂▂▂▂▂▁▁
trainer/global_step,▂▁▁▁▁▂▂▂▃▃▃▃▃▄▄▄▄▄▄▄▅▅▅▅▅▆▅▆▆▆▆▇▇▇▇▇████
val_loss_epoch,█▄▁
val_loss_step,▇▆▆█▇▇█▆▆▇▇▆▆▆█▄▄▄▅▄▅▅▅▄▄▅▅▄▆▃█▁▁▂▂▂▂▂▂▂

Run summary:
epoch,2.0
train_loss_epoch,0.44007
train_loss_step,0.45831
trainer/global_step,1259.0
val_loss_epoch,0.42443
val_loss_step,0.41897


Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Optimization finished with best validation loss: 0.42443081736564636


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

   | Name                    | Type                  | Params | Mode  | In sizes      | Out sizes    
-----------------------------------------------------------------------------------------------------------
0  | criterion               | BCEWithLogitsLoss     | 0      | train | ?             | ?            
1  | train_jaccard           | BinaryJaccardIndex    | 0      | train | ?             | ?            
2  | val_jaccard             | BinaryJaccardIndex    | 0      | train | ?             | ?            
3  | test_jaccard            | BinaryJaccardIndex    | 0      | train | ?             | ?            
4  | multi_tolerance_metrics | MultiToleranceWrapper | 0      | train | ?             | ?            
5  | AvgPool1D1              | AvgPool1d             | 0      | train | [1, 64, 512]  | [1, 64, 256] 
6  | AvgPool1D2              | AvgPool1d             | 0      | train | [1, 128, 256] | [1, 128, 128]
7  | AvgPool1D3              | Av

Epoch 2: 100%|██████████| 840/840 [00:42<00:00, 19.95it/s, v_num=jlpi, train_loss_step=0.351, val_loss_step=0.334, val_loss_epoch=0.346, train_loss_epoch=0.366]

`Trainer.fit` stopped: `max_epochs=3` reached.


Epoch 2: 100%|██████████| 840/840 [00:42<00:00, 19.66it/s, v_num=jlpi, train_loss_step=0.351, val_loss_step=0.334, val_loss_epoch=0.346, train_loss_epoch=0.366]


[34m[1mwandb[0m: [32m[41mERROR[0m The nbformat package was not found. It is required to save notebook history.


Run history:
epoch,▁▁▁▁▁▁▁▁▁▁▁▁▅▅▅▅▅▅▅▅▅▅▅▅▅████████████
train_loss_epoch,█▄▁
train_loss_step,█▇▆▅▅▅▄▄▄▄▄▄▃▃▃▃▃▃▃▂▂▂▂▂▂▁▁▁▁▁▁
trainer/global_step,▂▁▁▁▁▂▂▂▂▂▃▃▃▃▃▃▃▄▄▄▄▄▅▅▅▅▄▆▆▆▆▆▆▆▇▇▇▇▇█
val_loss_epoch,█▅▁
val_loss_step,▇▇▇█▆▇▇▇▇▇▇▇█▇▄▅▄▅▅▄▄▅▄▄▂▁▂▂▁▂▂▁▁▁▂▁▂▁▂▂

Run summary:
epoch,2.0
train_loss_epoch,0.36616
train_loss_step,0.34943
trainer/global_step,314.0
val_loss_epoch,0.34598
val_loss_step,0.33382


Using default `ModelCheckpoint`. Consider installing `litmodels` package to enable `LitModelCheckpoint` for automatic upload to the Lightning model registry.
GPU available: True (cuda), used: True
TPU available: False, using: 0 TPU cores
HPU available: False, using: 0 HPUs


Optimization finished with best validation loss: 0.3459818959236145


LOCAL_RANK: 0 - CUDA_VISIBLE_DEVICES: [0]

   | Name                    | Type                  | Params | Mode  | In sizes      | Out sizes    
-----------------------------------------------------------------------------------------------------------
0  | criterion               | BCEWithLogitsLoss     | 0      | train | ?             | ?            
1  | train_jaccard           | BinaryJaccardIndex    | 0      | train | ?             | ?            
2  | val_jaccard             | BinaryJaccardIndex    | 0      | train | ?             | ?            
3  | test_jaccard            | BinaryJaccardIndex    | 0      | train | ?             | ?            
4  | multi_tolerance_metrics | MultiToleranceWrapper | 0      | train | ?             | ?            
5  | AvgPool1D1              | AvgPool1d             | 0      | train | [1, 64, 512]  | [1, 64, 256] 
6  | AvgPool1D2              | AvgPool1d             | 0      | train | [1, 128, 256] | [1, 128, 128]
7  | AvgPool1D3              | Av

Epoch 2: 100%|██████████| 840/840 [00:42<00:00, 19.58it/s, v_num=wf0j, train_loss_step=0.400, val_loss_step=0.392, val_loss_epoch=0.406, train_loss_epoch=0.421]

`Trainer.fit` stopped: `max_epochs=3` reached.


Epoch 2: 100%|██████████| 840/840 [00:43<00:00, 19.19it/s, v_num=wf0j, train_loss_step=0.400, val_loss_step=0.392, val_loss_epoch=0.406, train_loss_epoch=0.421]


[34m[1mwandb[0m: [32m[41mERROR[0m The nbformat package was not found. It is required to save notebook history.


Run history:
epoch,▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▅▅▅▅▅▅▅▅▅▅▅▅█████████████
train_loss_epoch,█▄▁
train_loss_step,█▇▇▆▆▆▆▆▆▅▅▅▅▅▅▅▅▄▄▄▄▄▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▁▁
trainer/global_step,▁▁▁▂▂▂▂▂▃▃▄▃▃▃▄▄▅▅▅▅▅▅▅▇█▅▆▆▆▆▆▆▆▆▆▇▇▇▇▇
val_loss_epoch,█▄▁
val_loss_step,▇▇█▇█▇▇▇▆▇▇▇█▇▇█▇▆▅▄▄▄▄▅▅▄▄▃▂▂▂▅▁▁▁▂▁▂▁▂

Run summary:
epoch,2.0
train_loss_epoch,0.42099
train_loss_step,0.39982
trainer/global_step,629.0
val_loss_epoch,0.40584
val_loss_step,0.39163


Best trial: 1. Best value: 0.0862901: 100%|██████████| 5/5 [20:02<00:00, 240.40s/it]

Optimization finished with best validation loss: 0.4058423638343811
Best trial:  FrozenTrial(number=1, state=1, values=[0.08629006892442703], datetime_start=datetime.datetime(2025, 9, 4, 10, 45, 26, 739187), datetime_complete=datetime.datetime(2025, 9, 4, 10, 49, 41, 373948), params={'batch_size': 32, 'max_epochs': 3, 'accumulate_grad_batches': 1, 'precision': 32, 'optimizer': 'AdamW', 'learning_rate': 0.001874583495068444, 'weight_decay': 0.00015012419395533578, 'scheduler': 'CosineAnnealingLR'}, user_attrs={}, system_attrs={}, intermediate_values={}, distributions={'batch_size': CategoricalDistribution(choices=(16, 32, 64)), 'max_epochs': IntDistribution(high=3, log=False, low=3, step=10), 'accumulate_grad_batches': CategoricalDistribution(choices=(1, 2, 4, 8)), 'precision': CategoricalDistribution(choices=('16-mixed', 32)), 'optimizer': CategoricalDistribution(choices=('Adam', 'SGD', 'AdamW')), 'learning_rate': FloatDistribution(high=0.01, log=True, low=5e-05, step=None), 'weight_de
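
The `distributions` field of the best trial above lists part of the search space. As an illustration of what those distributions mean, the following sketch draws one configuration with Python's `random` module instead of Optuna's sampler. It is plain random sampling, not the algorithm Optuna uses, and it covers only the parameters whose ranges are fully visible in the output. The function name `sample_config` is hypothetical:

```python
import math
import random


def sample_config(rng: random.Random) -> dict:
    """Draw one hyperparameter set from the ranges shown in the trial output."""
    # log=True corresponds to sampling uniformly in log space between low and high.
    log_lr = rng.uniform(math.log(5e-05), math.log(0.01))
    return {
        "batch_size": rng.choice((16, 32, 64)),
        "max_epochs": 3,  # low and high are both 3 in the output, so the value is fixed
        "accumulate_grad_batches": rng.choice((1, 2, 4, 8)),
        "precision": rng.choice(("16-mixed", 32)),
        "optimizer": rng.choice(("Adam", "SGD", "AdamW")),
        "learning_rate": math.exp(log_lr),
    }


params = sample_config(random.Random(0))
print(params)
```

Sampling the learning rate in log space spreads the draws evenly across orders of magnitude, so values near `5e-05` are as likely to be tried as values near `0.01`.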


