# Excercises 
# 1. Tune the network
Run the experiment below, explore the different parameters (see suggestions below) and study the result with tensorboard. 
Make a single page (1 a4) report of your findings. Use your visualisation skills to communicate your most important findings.

In [1]:
from mads_datasets import DatasetFactoryProvider, DatasetType

from mltrainer.preprocessors import BasePreprocessor
from mltrainer import imagemodels, Trainer, TrainerSettings, ReportTypes, metrics

import torch.optim as optim
from torch import nn
from tomlserializer import TOMLSerializer

In [2]:
import torch

print("torch version:", torch.__version__)
print("torch cuda version:", torch.version.cuda)
print("cuda available:", torch.cuda.is_available())
print("cuda device count:", torch.cuda.device_count())

torch version: 2.8.0+cpu
torch cuda version: None
cuda available: False
cuda device count: 0


In [3]:
import torch
if torch.backends.mps.is_available() and torch.backends.mps.is_built():
    device = torch.device("mps")
    print("Using MPS")
elif torch.cuda.is_available():
    device = "cuda:0"
    print("using cuda")
else:
    device = "cpu"
    print("using cpu")

using cpu


We will be using `tomlserializer` to easily keep track of our experiments, and to easily save the different things we did during our experiments.
It can export things like settings and models to a simple `toml` file, which can be easily shared, checked and modified.

First, we need the data. 

In [4]:
fashionfactory = DatasetFactoryProvider.create_factory(DatasetType.FASHION)
preprocessor = BasePreprocessor()
streamers = fashionfactory.create_datastreamer(batchsize=64, preprocessor=preprocessor)
train = streamers["train"]
valid = streamers["valid"]
trainstreamer = train.stream()
validstreamer = valid.stream()

[32m2025-09-21 14:34:18.808[0m | [1mINFO    [0m | [36mmads_datasets.base[0m:[36mdownload_data[0m:[36m121[0m - [1mFolder already exists at C:\Users\mwien\.cache\mads_datasets\fashionmnist[0m
[32m2025-09-21 14:34:18.810[0m | [1mINFO    [0m | [36mmads_datasets.base[0m:[36mdownload_data[0m:[36m124[0m - [1mFile already exists at C:\Users\mwien\.cache\mads_datasets\fashionmnist\fashionmnist.pt[0m
[32m2025-09-21 14:34:18.810[0m | [1mINFO    [0m | [36mmads_datasets.base[0m:[36mdownload_data[0m:[36m124[0m - [1mFile already exists at C:\Users\mwien\.cache\mads_datasets\fashionmnist\fashionmnist.pt[0m


We need a way to determine how well our model is performing. We will use accuracy as a metric.

In [5]:
accuracy = metrics.Accuracy()

You can set up a single experiment.

- We will show the model batches of 64 images, 
- and for every epoch we will show the model 100 batches (trainsteps=100).
- then, we will test how well the model is doing on unseen data (teststeps=100).
- we will report our results during training to tensorboard, and report all configuration to a toml file.
- we will log the results into a directory called "modellogs", but you could change this to whatever you want.

In [6]:
import torch
loss_fn = torch.nn.CrossEntropyLoss()

settings = TrainerSettings(
    epochs=3,
    metrics=[accuracy],
    logdir="modellogs",
    train_steps=100,
    valid_steps=100,
    reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
)


We will use a very basic model: a model with three linear layers.

In [7]:
class NeuralNetwork(nn.Module):
    def __init__(self, num_classes: int, units1: int, units2: int) -> None:
        super().__init__()
        self.num_classes = num_classes
        self.units1 = units1
        self.units2 = units2
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, units1),
            nn.ReLU(),
            nn.Linear(units1, units2),
            nn.ReLU(),
            nn.Linear(units2, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork(
    num_classes=10, units1=256, units2=256)

I developped the `tomlserializer` package, it is a useful tool to save configs, models and settings as a tomlfile; that way it is easy to track what you changed during your experiments.

This package will 1. check if there is a `__dict__` attribute available, and if so, it will use that to extract the parameters that do not start with an underscore, like this:

In [8]:
{k: v for k, v in model.__dict__.items() if not k.startswith("_")}

{'training': True, 'num_classes': 10, 'units1': 256, 'units2': 256}

This means that if you want to add more parameters to the `.toml` file, eg `units3`, you can add them to the class like this:

```python
class NeuralNetwork(nn.Module):
    def __init__(self, num_classes: int, units1: int, units2: int, units3: int) -> None:
        super().__init__()
        self.num_classes = num_classes
        self.units1 = units1
        self.units2 = units2
        self.units3 = units3  # <-- add this line
```

And then it will be added to the `.toml` file. Check the result for yourself by using the `.save()` method of the `TomlSerializer` class like this:

In [9]:
tomlserializer = TOMLSerializer()
tomlserializer.save(settings, "settings.toml")
tomlserializer.save(model, "model.toml")

Check the `settings.toml` and `model.toml` files to see what is in there.

## Script for looping through some epochs

In [11]:
import torch

units = [64, 32, 16]
loss_fn = torch.nn.CrossEntropyLoss()

main_folder = "modellogs"
subfolder = "change_epochs"
amount_of_epochs = [5, 8, 10]

for epochs in amount_of_epochs:
    epoch_subfolder = f"{subfolder}/epochs_{epochs}"
    settings = TrainerSettings(
        epochs=epochs,
        metrics=[accuracy],
        logdir=f"{main_folder}/{epoch_subfolder}",
        train_steps=len(train),
        valid_steps=len(valid),
        reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
    )

    for unit1 in units:
        for unit2 in units:
            if unit2 <= unit1:
                print(f"Epochs: {epochs}, Units: {unit1}, {unit2}")
                model = NeuralNetwork(num_classes=10, units1=unit1, units2=unit2)

                trainer = Trainer(
                    model=model,
                    settings=settings,
                    loss_fn=loss_fn,
                    optimizer=optim.Adam,
                    traindataloader=trainstreamer,
                    validdataloader=validstreamer,
                    scheduler=optim.lr_scheduler.ReduceLROnPlateau,
                )
                trainer.loop()

[32m2025-09-21 14:36:51.313[0m | [1mINFO    [0m | [36mmltrainer.settings[0m:[36mcheck_path[0m:[36m60[0m - [1mCreated logdir c:\Master Applied Data Science\Year 2\Semester 3 (Deep Learning & Model Deployment)\Portfolio-Marcello-Wienhoven\1-hypertuning-gridsearch\modellogs\change_epochs\epochs_5[0m
[32m2025-09-21 14:36:51.315[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mdir_add_timestamp[0m:[36m24[0m - [1mLogging to modellogs\change_epochs\epochs_5\20250921-143651[0m
[32m2025-09-21 14:36:51.316[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36m__init__[0m:[36m68[0m - [1mFound earlystop_kwargs in settings.Set to None if you dont want earlystopping.[0m
[32m2025-09-21 14:36:51.315[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mdir_add_timestamp[0m:[36m24[0m - [1mLogging to modellogs\change_epochs\epochs_5\20250921-143651[0m
[32m2025-09-21 14:36:51.316[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36m__init__[0m:[36m68[0m

Epochs: 5, Units: 64, 64


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 390.34it/s]

[32m2025-09-21 14:36:53.925[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.5916 test 0.4596 metric ['0.8371'][0m
 20%|[38;2;30;71;6m██        [0m| 1/5 [00:02<00:10,  2.60s/it][32m2025-09-21 14:36:53.925[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.5916 test 0.4596 metric ['0.8371'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 381.33it/s]

[32m2025-09-21 14:36:56.644[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4127 test 0.4087 metric ['0.8556'][0m
 40%|[38;2;30;71;6m████      [0m| 2/5 [00:05<00:08,  2.67s/it][32m2025-09-21 14:36:56.644[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4127 test 0.4087 metric ['0.8556'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 5, Units: 64, 32


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 403.57it/s]

[32m2025-09-21 14:37:08.122[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6152 test 0.4715 metric ['0.8331'][0m
 20%|[38;2;30;71;6m██        [0m| 1/5 [00:02<00:10,  2.52s/it][32m2025-09-21 14:37:08.122[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6152 test 0.4715 metric ['0.8331'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 379.21it/s]

[32m2025-09-21 14:37:10.861[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4289 test 0.4586 metric ['0.8352'][0m
 40%|[38;2;30;71;6m████      [0m| 2/5 [00:05<00:07,  2.65s/it][32m2025-09-21 14:37:10.861[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4289 test 0.4586 metric ['0.8352'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 5, Units: 64, 16


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 424.50it/s]

[32m2025-09-21 14:37:22.557[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6596 test 0.4877 metric ['0.8248'][0m
 20%|[38;2;30;71;6m██        [0m| 1/5 [00:02<00:09,  2.41s/it][32m2025-09-21 14:37:22.557[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6596 test 0.4877 metric ['0.8248'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 398.00it/s]

[32m2025-09-21 14:37:25.163[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4369 test 0.4418 metric ['0.8444'][0m
 40%|[38;2;30;71;6m████      [0m| 2/5 [00:05<00:07,  2.53s/it][32m2025-09-21 14:37:25.163[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4369 test 0.4418 metric ['0.8444'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 5, Units: 32, 32


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:01<00:00, 478.73it/s]

[32m2025-09-21 14:37:36.513[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6507 test 0.4861 metric ['0.8337'][0m
 20%|[38;2;30;71;6m██        [0m| 1/5 [00:02<00:08,  2.16s/it][32m2025-09-21 14:37:36.513[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6507 test 0.4861 metric ['0.8337'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 462.65it/s]

[32m2025-09-21 14:37:38.778[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4378 test 0.4349 metric ['0.8452'][0m
 40%|[38;2;30;71;6m████      [0m| 2/5 [00:04<00:06,  2.22s/it][32m2025-09-21 14:37:38.778[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4378 test 0.4349 metric ['0.8452'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 5, Units: 32, 16


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:01<00:00, 480.16it/s]

[32m2025-09-21 14:37:48.061[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6868 test 0.5350 metric ['0.8093'][0m
 20%|[38;2;30;71;6m██        [0m| 1/5 [00:02<00:08,  2.16s/it][32m2025-09-21 14:37:48.061[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6868 test 0.5350 metric ['0.8093'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 455.74it/s]

[32m2025-09-21 14:37:50.349[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4578 test 0.4639 metric ['0.8364'][0m
 40%|[38;2;30;71;6m████      [0m| 2/5 [00:04<00:06,  2.24s/it][32m2025-09-21 14:37:50.349[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4578 test 0.4639 metric ['0.8364'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 5, Units: 16, 16


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:01<00:00, 482.77it/s]

[32m2025-09-21 14:37:59.283[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.7012 test 0.5136 metric ['0.8147'][0m
 20%|[38;2;30;71;6m██        [0m| 1/5 [00:02<00:08,  2.15s/it][32m2025-09-21 14:37:59.283[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.7012 test 0.5136 metric ['0.8147'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:01<00:00, 483.68it/s]

[32m2025-09-21 14:38:01.436[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4570 test 0.4959 metric ['0.8260'][0m
 40%|[38;2;30;71;6m████      [0m| 2/5 [00:04<00:06,  2.15s/it][32m2025-09-21 14:38:01.436[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4570 test 0.4959 metric ['0.8260'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 8, Units: 64, 64


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 385.11it/s]

[32m2025-09-21 14:38:10.618[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.5873 test 0.4929 metric ['0.8242'][0m
 12%|[38;2;30;71;6m█▎        [0m| 1/8 [00:02<00:18,  2.66s/it][32m2025-09-21 14:38:10.618[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.5873 test 0.4929 metric ['0.8242'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 344.85it/s]

[32m2025-09-21 14:38:13.659[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4054 test 0.4020 metric ['0.8573'][0m
 25%|[38;2;30;71;6m██▌       [0m| 2/8 [00:05<00:17,  2.88s/it][32m2025-09-21 14:38:13.659[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4054 test 0.4020 metric ['0.8573'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 8, Units: 64, 32


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 405.66it/s]

[32m2025-09-21 14:38:35.667[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6127 test 0.4731 metric ['0.8327'][0m
 12%|[38;2;30;71;6m█▎        [0m| 1/8 [00:02<00:17,  2.52s/it][32m2025-09-21 14:38:35.667[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6127 test 0.4731 metric ['0.8327'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 402.31it/s]

[32m2025-09-21 14:38:38.235[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4214 test 0.4509 metric ['0.8397'][0m
 25%|[38;2;30;71;6m██▌       [0m| 2/8 [00:05<00:15,  2.55s/it][32m2025-09-21 14:38:38.235[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4214 test 0.4509 metric ['0.8397'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 8, Units: 64, 16


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 430.28it/s]

[32m2025-09-21 14:38:59.040[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6313 test 0.5080 metric ['0.8224'][0m
 12%|[38;2;30;71;6m█▎        [0m| 1/8 [00:02<00:16,  2.39s/it][32m2025-09-21 14:38:59.040[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6313 test 0.5080 metric ['0.8224'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 418.91it/s]

[32m2025-09-21 14:39:01.525[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4236 test 0.4265 metric ['0.8502'][0m
 25%|[38;2;30;71;6m██▌       [0m| 2/8 [00:04<00:14,  2.45s/it][32m2025-09-21 14:39:01.525[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4236 test 0.4265 metric ['0.8502'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 8, Units: 32, 32


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 315.29it/s]

[32m2025-09-21 14:39:20.447[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6412 test 0.5010 metric ['0.8247'][0m
 12%|[38;2;30;71;6m█▎        [0m| 1/8 [00:03<00:22,  3.20s/it][32m2025-09-21 14:39:20.447[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6412 test 0.5010 metric ['0.8247'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:01<00:00, 483.42it/s]

[32m2025-09-21 14:39:22.606[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4421 test 0.4734 metric ['0.8325'][0m
 25%|[38;2;30;71;6m██▌       [0m| 2/8 [00:05<00:15,  2.59s/it][32m2025-09-21 14:39:22.606[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4421 test 0.4734 metric ['0.8325'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 8, Units: 32, 16


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:01<00:00, 483.24it/s]

[32m2025-09-21 14:39:38.821[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6716 test 0.5098 metric ['0.8184'][0m
 12%|[38;2;30;71;6m█▎        [0m| 1/8 [00:02<00:15,  2.15s/it][32m2025-09-21 14:39:38.821[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6716 test 0.5098 metric ['0.8184'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 457.75it/s]

[32m2025-09-21 14:39:41.111[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4448 test 0.5023 metric ['0.8177'][0m
 25%|[38;2;30;71;6m██▌       [0m| 2/8 [00:04<00:13,  2.23s/it][32m2025-09-21 14:39:41.111[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4448 test 0.5023 metric ['0.8177'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 8, Units: 16, 16


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:01<00:00, 477.74it/s]

[32m2025-09-21 14:39:57.934[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.7467 test 0.5311 metric ['0.8150'][0m
 12%|[38;2;30;71;6m█▎        [0m| 1/8 [00:02<00:15,  2.17s/it][32m2025-09-21 14:39:57.934[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.7467 test 0.5311 metric ['0.8150'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:01<00:00, 477.38it/s]

[32m2025-09-21 14:40:00.111[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4796 test 0.4755 metric ['0.8315'][0m
 25%|[38;2;30;71;6m██▌       [0m| 2/8 [00:04<00:13,  2.18s/it][32m2025-09-21 14:40:00.111[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4796 test 0.4755 metric ['0.8315'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [0

Epochs: 10, Units: 64, 64


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:03<00:00, 284.10it/s]

[32m2025-09-21 14:40:18.335[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6016 test 0.4713 metric ['0.8350'][0m
 10%|[38;2;30;71;6m█         [0m| 1/10 [00:03<00:32,  3.62s/it][32m2025-09-21 14:40:18.335[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6016 test 0.4713 metric ['0.8350'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:03<00:00, 245.57it/s]

[32m2025-09-21 14:40:22.649[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4214 test 0.4232 metric ['0.8480'][0m
 20%|[38;2;30;71;6m██        [0m| 2/10 [00:07<00:32,  4.03s/it][32m2025-09-21 14:40:22.649[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4214 test 0.4232 metric ['0.8480'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 

Epochs: 10, Units: 64, 32


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 341.57it/s]

[32m2025-09-21 14:40:59.694[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6179 test 0.4767 metric ['0.8341'][0m
 10%|[38;2;30;71;6m█         [0m| 1/10 [00:03<00:27,  3.04s/it][32m2025-09-21 14:40:59.694[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6179 test 0.4767 metric ['0.8341'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 312.66it/s]

[32m2025-09-21 14:41:03.002[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4209 test 0.5030 metric ['0.8161'][0m
[32m2025-09-21 14:41:03.003[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36m__call__[0m:[36m252[0m - [1mbest loss: 0.4767, current loss 0.5030.Counter 1/10.[0m
 20%|[38;2;30;71;6m██        [0m| 2/10 [00:06<00:25,  3.20s/it][32m2025-09-21 14:41:03.002[0m | [1mI

Epochs: 10, Units: 64, 16


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 416.79it/s]

[32m2025-09-21 14:41:37.491[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6512 test 0.4890 metric ['0.8312'][0m
 10%|[38;2;30;71;6m█         [0m| 1/10 [00:02<00:22,  2.46s/it][32m2025-09-21 14:41:37.491[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6512 test 0.4890 metric ['0.8312'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 380.81it/s]

[32m2025-09-21 14:41:40.242[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4417 test 0.4558 metric ['0.8410'][0m
 20%|[38;2;30;71;6m██        [0m| 2/10 [00:05<00:21,  2.63s/it][32m2025-09-21 14:41:40.242[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4417 test 0.4558 metric ['0.8410'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 

Epochs: 10, Units: 32, 32


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 453.76it/s]

[32m2025-09-21 14:42:14.099[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6376 test 0.4816 metric ['0.8325'][0m
 10%|[38;2;30;71;6m█         [0m| 1/10 [00:02<00:20,  2.29s/it][32m2025-09-21 14:42:14.099[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6376 test 0.4816 metric ['0.8325'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 455.15it/s]

[32m2025-09-21 14:42:16.381[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4357 test 0.4502 metric ['0.8406'][0m
 20%|[38;2;30;71;6m██        [0m| 2/10 [00:04<00:18,  2.29s/it][32m2025-09-21 14:42:16.381[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4357 test 0.4502 metric ['0.8406'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 

Epochs: 10, Units: 32, 16


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 391.28it/s]

[32m2025-09-21 14:42:38.762[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6755 test 0.5010 metric ['0.8278'][0m
 10%|[38;2;30;71;6m█         [0m| 1/10 [00:02<00:23,  2.64s/it][32m2025-09-21 14:42:38.762[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.6755 test 0.5010 metric ['0.8278'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 358.49it/s]

[32m2025-09-21 14:42:41.646[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4551 test 0.4563 metric ['0.8367'][0m
 20%|[38;2;30;71;6m██        [0m| 2/10 [00:05<00:22,  2.78s/it][32m2025-09-21 14:42:41.646[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4551 test 0.4563 metric ['0.8367'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 

Epochs: 10, Units: 16, 16


100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 344.78it/s]

[32m2025-09-21 14:43:08.415[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.7487 test 0.5465 metric ['0.8170'][0m
 10%|[38;2;30;71;6m█         [0m| 1/10 [00:02<00:26,  2.99s/it][32m2025-09-21 14:43:08.415[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.7487 test 0.5465 metric ['0.8170'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 371.01it/s]

[32m2025-09-21 14:43:11.192[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4829 test 0.4854 metric ['0.8285'][0m
 20%|[38;2;30;71;6m██        [0m| 2/10 [00:05<00:22,  2.87s/it][32m2025-09-21 14:43:11.192[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4829 test 0.4854 metric ['0.8285'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 

## Script for changing units

In [None]:
import torch

units = [2**i for i in range(4, 11)]  # 16, 32, 64, ..., 1024
loss_fn = torch.nn.CrossEntropyLoss()

main_folder = "modellogs"
subfolder = "change_units"

for unit1 in units:
    for unit2 in units:
        if unit2 <= unit1:
            run_subfolder = f"{subfolder}/units_{unit1}_{unit2}"
            settings = TrainerSettings(
                epochs=3,
                metrics=[accuracy],
                logdir=f"{main_folder}/{run_subfolder}",
                train_steps=len(train),
                valid_steps=len(valid),
                reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
            )
            print(f"Units: {unit1}, {unit2}")
            model = NeuralNetwork(num_classes=10, units1=unit1, units2=unit2)

            trainer = Trainer(
                model=model,
                settings=settings,
                loss_fn=loss_fn,
                optimizer=optim.Adam,
                traindataloader=trainstreamer,
                validdataloader=validstreamer,
                scheduler=optim.lr_scheduler.ReduceLROnPlateau,
            )
            trainer.loop()

## Script for changing batchsize

In [None]:
import torch

units = [64, 32, 16]
loss_fn = torch.nn.CrossEntropyLoss()

main_folder = "modellogs"
subfolder = "change_units"

settings = TrainerSettings(
    epochs=3,
    metrics=[accuracy],
    logdir=f"{main_folder}/{subfolder}",
    train_steps=len(train),
    valid_steps=len(valid),
    reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
)

for unit1 in units:
    for unit2 in units:
        if unit2 <= unit1:
            print(f"Units: {unit1}, {unit2}")
            model = NeuralNetwork(num_classes=10, units1=unit1, units2=unit2)

            trainer = Trainer(
                model=model,
                settings=settings,
                loss_fn=loss_fn,
                optimizer=optim.Adam,
                traindataloader=trainstreamer,
                validdataloader=validstreamer,
                scheduler=optim.lr_scheduler.ReduceLROnPlateau
            )
            trainer.loop()


[32m2025-09-13 18:32:40.492[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mdir_add_timestamp[0m:[36m24[0m - [1mLogging to modellogs\20250913-183240[0m
[32m2025-09-13 18:32:40.493[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36m__init__[0m:[36m68[0m - [1mFound earlystop_kwargs in settings.Set to None if you dont want earlystopping.[0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 366.04it/s]
[32m2025-09-13 18:32:43.278[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 0 train 0.5857 test 0.4655 metric ['0.8346'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 334.28it/s]
[32m2025-09-13 18:32:46.325[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:[36mreport[0m:[36m209[0m - [1mEpoch 1 train 0.4117 test 0.4120 metric ['0.8553'][0m
100%|[38;2;30;71;6m██████████[0m| 937/937 [00:02<00:00, 327.81it/s]
[32m2025-09-13 18:32:49.414[0m | [1mINFO    [0m | [36mmltrainer.trainer[0m:

Because we have set the ReportType to TOML, you will find in every log dir a model.toml and settings.toml file.

Run the experiment, and study the result with tensorboard. 

Locally, it is easy to do that with VS code itself. On the server, you have to take these steps:

- in the terminal, `cd` to the location of the repository
- activate the python environment for the shell. This can be done with `.venv\Scripts\activate`.
- run `tensorboard --logdir=1-hypertuning-gridsearch/modellogs` in the terminal
- tensorboard will launch at `localhost:6006` and vscode will notify you that the port is forwarded
- you can either press the `launch` button in VScode or open your local browser at `localhost:6006`