# Exercises
# 1. Tune the network
Run the experiment below, explore the different parameters (see suggestions below) and study the results with TensorBoard.
Write a one-page (single A4) report of your findings. Use your visualisation skills to communicate your most important findings.

In [1]:
from mads_datasets import DatasetFactoryProvider, DatasetType

from mltrainer.preprocessors import BasePreprocessor
from mltrainer import imagemodels, Trainer, TrainerSettings, ReportTypes, metrics

import torch.optim as optim
from torch import nn
from tomlserializer import TOMLSerializer

We will be using `tomlserializer` to easily keep track of our experiments, and to easily save the different things we did during our experiments.
It can export things like settings and models to a simple `toml` file, which can be easily shared, checked and modified.

First, we need the data. 

In [2]:
fashionfactory = DatasetFactoryProvider.create_factory(DatasetType.FASHION)
preprocessor = BasePreprocessor()
streamers = fashionfactory.create_datastreamer(batchsize=64, preprocessor=preprocessor)
train = streamers["train"]
valid = streamers["valid"]
trainstreamer = train.stream()
validstreamer = valid.stream()

2025-09-20 16:43:17.058 | INFO     | mads_datasets.base:download_data:121 - Folder already exists at C:\Users\tycoh\.cache\mads_datasets\fashionmnist
2025-09-20 16:43:17.058 | INFO     | mads_datasets.base:download_data:124 - File already exists at C:\Users\tycoh\.cache\mads_datasets\fashionmnist\fashionmnist.pt


We need a way to determine how well our model is performing. We will use accuracy as a metric.

In [3]:
accuracy = metrics.Accuracy()
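As a rough illustration of what an accuracy metric measures, the sketch below computes it by hand from raw class scores. This is not the `mltrainer` implementation; the function name `accuracy_from_logits` and the example scores are made up for illustration.

```python
def accuracy_from_logits(logits, labels):
    """Fraction of samples whose highest-scoring class equals the label."""
    # argmax per row: index of the largest score
    predictions = [max(range(len(row)), key=row.__getitem__) for row in logits]
    correct = sum(int(p == y) for p, y in zip(predictions, labels))
    return correct / len(labels)


# Three samples, three classes; hypothetical scores for illustration only
logits = [
    [2.0, 0.1, -1.0],  # highest score at index 0
    [0.3, 0.2, 1.5],   # highest score at index 2
    [0.0, 0.9, 0.4],   # highest score at index 1
]
labels = [0, 2, 2]

# Two of the three predictions match their label
print(accuracy_from_logits(logits, labels))
```

Accuracy only looks at the most likely class, so two models with the same accuracy can still differ in loss.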

You can set up a single experiment.

- We will show the model batches of 64 images,
- and for every epoch we will show the model 100 batches (`train_steps=100`).
- Then we will test how well the model is doing on unseen data (`valid_steps=100`).
- We will report our results during training to TensorBoard, and write all configuration to a TOML file.
- We will log the results into a directory called `modellogs`, but you can change this to whatever you want.
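To put these numbers in perspective, the short calculation below estimates how much of the training set one "epoch" covers with these settings. The figure of 60,000 training images is an assumption based on the standard Fashion-MNIST split; the split provided by `mads_datasets` may differ.

```python
batch_size = 64
train_steps = 100
n_train = 60_000  # assumption: standard Fashion-MNIST training split

# Images shown to the model in one epoch of 100 training steps
images_per_epoch = batch_size * train_steps
fraction_seen = images_per_epoch / n_train

print(images_per_epoch)        # 6400
print(f"{fraction_seen:.1%}")  # 10.7%
```

Under that assumption, an epoch here is roughly a tenth of a full pass over the data, which is worth keeping in mind when comparing the number of epochs between experiments.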

In [None]:
import torch
loss_fn = torch.nn.CrossEntropyLoss()

settings = TrainerSettings(
    epochs=3,
    metrics=[accuracy],
    logdir="modellogs",
    train_steps=100,
    valid_steps=100,
    reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
)


We will use a very basic model: a model with three linear layers.

In [29]:
class NeuralNetwork(nn.Module):
    def __init__(self, num_classes: int, units1: int, units2: int) -> None:
        super().__init__()
        self.num_classes = num_classes
        self.units1 = units1
        self.units2 = units2
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, units1),
            nn.ReLU(),
            nn.Linear(units1, units2),
            nn.ReLU(),
            nn.Linear(units2, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork(
    num_classes=10, units1=256, units2=256)
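When comparing configurations, it can help to know how many trainable parameters each one has. The sketch below counts them by hand for this three-layer architecture, assuming every linear layer has a bias term (the default for `nn.Linear`). The helper name `count_parameters` is made up for illustration.

```python
def count_parameters(units1, units2, n_inputs=28 * 28, num_classes=10):
    """Count weights and biases of a three-layer fully connected network."""
    layer_sizes = [n_inputs, units1, units2, num_classes]
    total = 0
    for fan_in, fan_out in zip(layer_sizes, layer_sizes[1:]):
        total += fan_in * fan_out + fan_out  # weight matrix + bias vector
    return total


print(count_parameters(256, 256))  # 269322
print(count_parameters(64, 64))    # 55050
```

Most of the parameters sit in the first layer, because the input has 784 features; changing `units1` therefore affects the model size much more than changing `units2`.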

I developed the `tomlserializer` package; it is a useful tool to save configs, models and settings as a TOML file, which makes it easy to track what you changed during your experiments.

As a first step, this package checks whether a `__dict__` attribute is available. If so, it uses that to extract the parameters that do not start with an underscore, like this:

In [6]:
{k: v for k, v in model.__dict__.items() if not k.startswith("_")}

{'training': True, 'num_classes': 10, 'units1': 256, 'units2': 256}
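The same filter can be tried on a plain Python object, without PyTorch. The sketch below only mimics the dictionary comprehension shown above, not the internals of `tomlserializer`; the class `Config` and the helper `public_attributes` are hypothetical.

```python
class Config:
    """Hypothetical example class with public and private attributes."""

    def __init__(self, units1, units2):
        self.units1 = units1
        self.units2 = units2
        self._cache = {}  # leading underscore: skipped by the filter


def public_attributes(obj):
    # vars(obj) returns obj.__dict__; keep only keys without a leading underscore
    return {k: v for k, v in vars(obj).items() if not k.startswith("_")}


print(public_attributes(Config(256, 128)))  # {'units1': 256, 'units2': 128}
```

Only attributes stored on the instance under a public name show up in the result.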

This means that if you want to add more parameters to the `.toml` file, e.g. `units3`, you can add them to the class like this:

```python
class NeuralNetwork(nn.Module):
    def __init__(self, num_classes: int, units1: int, units2: int, units3: int) -> None:
        super().__init__()
        self.num_classes = num_classes
        self.units1 = units1
        self.units2 = units2
        self.units3 = units3  # <-- add this line
```

It will then be added to the `.toml` file. Check the result for yourself by using the `.save()` method of the `TOMLSerializer` class like this:

In [19]:
tomlserializer = TOMLSerializer()
tomlserializer.save(settings, "settings.toml")
tomlserializer.save(model, "model.toml")

Check the `settings.toml` and `model.toml` files to see what is in there.

You can use the `Trainer` class from my `mltrainer` module to train your model. It has the `TOMLSerializer` integrated, so it will automatically save the settings and model to a TOML file if you have added `ReportTypes.TOML` to `reporttypes` in the settings.

In [8]:
trainer = Trainer(
    model=model,
    settings=settings,
    loss_fn=loss_fn,
    optimizer=optim.Adam,
    traindataloader=trainstreamer,
    validdataloader=validstreamer,
    scheduler=optim.lr_scheduler.ReduceLROnPlateau
)
trainer.loop()

2025-09-20 16:43:17.187 | INFO     | mltrainer.trainer:dir_add_timestamp:24 - Logging to modellogs\20250920-164317
2025-09-20 16:43:18.558 | INFO     | mltrainer.trainer:__init__:68 - Found earlystop_kwargs in settings.Set to None if you dont want earlystopping.
100%|██████████| 100/100 [00:00<00:00, 228.06it/s]
2025-09-20 16:43:19.366 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 0.9176 test 0.6248 metric ['0.7789']
100%|██████████| 100/100 [00:00<00:00, 172.12it/s]
2025-09-20 16:43:20.132 | INFO     | mltrainer.trainer:report:209 - Epoch 1 train 0.5804 test 0.5523 metric ['0.7948']
100%|██████████| 100/100 [00:00<00:00, 254.80it/s]
2025-09-20 16:43:20.711 | INFO     | mltrainer.trainer:

Now check the results of your experiment in the `modellogs` directory.
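A quick way to see what was written is to list the timestamped run folders and the files inside each of them. The sketch below uses only the standard library; the helper `list_runs` is hypothetical, and it makes no assumption about which files the trainer writes.

```python
from pathlib import Path


def list_runs(logdir="modellogs"):
    """Map each run folder name to the sorted names of the entries inside it."""
    root = Path(logdir)
    if not root.exists():
        return {}
    runs = {}
    for run_dir in sorted(p for p in root.iterdir() if p.is_dir()):
        runs[run_dir.name] = sorted(f.name for f in run_dir.iterdir())
    return runs


for name, files in list_runs().items():
    print(name, files)
```

Because the folder names are timestamps, sorting them alphabetically also sorts the runs chronologically.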

We can now loop this with a naive approach, called a grid search (why do you think I call it naive?).

> Because it doesn't know which settings are best, it tries all possible combinations.
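For comparison, an exhaustive grid over the candidate layer sizes can be sketched with `itertools.product`. This is an illustration of what a grid search enumerates, not code from the course material; note that the next cell draws random pairs with `random.choice` instead of walking the full grid.

```python
from itertools import product

units = [256, 128, 64]

# Every combination of units1 and units2: 3 * 3 = 9 configurations
grid = list(product(units, units))
print(len(grid))  # 9

for unit1, unit2 in grid:
    print(f"Grid Units: {unit1}, {unit2}")
```

The number of runs multiplies with every extra hyperparameter and every extra candidate value, which is the main reason a full grid becomes expensive quickly.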

In [42]:
import random

units = [256, 128, 64]
for _ in range(5):  
    unit1 = random.choice(units)
    unit2 = random.choice(units)
    print(f"Random Units: {unit1}, {unit2}")


Random Units: 64, 256
Random Units: 128, 128
Random Units: 128, 256
Random Units: 256, 128
Random Units: 64, 256


Of course, this might not be the best way to search for a model; some configurations will be better than others (can you predict up front what will be the best configuration?).

So, feel free to improve upon the grid search by adding your own logic.
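One possible improvement, offered as a sketch rather than the intended solution: drawing each size independently with `random.choice` can pick the same pair more than once, which spends a training run on a configuration that was already tried. Sampling without replacement from the full set of combinations avoids that. The seed value is arbitrary.

```python
import random
from itertools import product

units = [256, 128, 64]
rng = random.Random(42)  # fixed seed so the draw is repeatable

# All 9 possible (units1, units2) pairs, then 5 distinct ones drawn at random
candidates = list(product(units, units))
configs = rng.sample(candidates, k=5)

for unit1, unit2 in configs:
    print(f"Units: {unit1}, {unit2}")
```

Each pair in `configs` can then be passed to `NeuralNetwork(num_classes=10, units1=unit1, units2=unit2)` in the training loop.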

In [45]:
import torch

loss_fn = torch.nn.CrossEntropyLoss()

settings = TrainerSettings(
    epochs=5,
    metrics=[accuracy],
    logdir="modellogs",
    train_steps=128,
    valid_steps=128,
    reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
)

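# Candidate sizes for the two hidden layers
units = [256, 128, 64]
for _ in range(5):
    # Create fresh data streams for every run
    trainstreamer = train.stream()
    validstreamer = valid.stream()
    # Draw each layer size independently; the same pair can be drawn more than once
    unit1 = random.choice(units)
    unit2 = random.choice(units)
    print(f"Random Units: {unit1}, {unit2}")
    model = NeuralNetwork(num_classes=10, units1=unit1, units2=unit2)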
    trainer = Trainer(
        model=model,
        settings=settings,
        loss_fn=loss_fn,
        optimizer=optim.Adam,
        traindataloader=trainstreamer,
        validdataloader=validstreamer,
        scheduler=optim.lr_scheduler.ReduceLROnPlateau
    )
    trainer.loop()

2025-09-21 15:31:33.001 | INFO     | mltrainer.trainer:dir_add_timestamp:24 - Logging to modellogs\20250921-153133
2025-09-21 15:31:33.004 | INFO     | mltrainer.trainer:__init__:68 - Found earlystop_kwargs in settings.Set to None if you dont want earlystopping.


Random Units: 256, 64


100%|██████████| 128/128 [00:00<00:00, 134.28it/s]
2025-09-21 15:31:34.444 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 0.9495 test 0.6434 metric ['0.7637']
100%|██████████| 128/128 [00:01<00:00, 107.38it/s]
2025-09-21 15:31:36.026 | INFO     | mltrainer.trainer:report:209 - Epoch 1 train 0.5664 test 0.5483 metric ['0.8048']
100%|██████████| 128/128 [00:01<00:00, 124.31it/s]
2025-09-21 15:31:37.588 | INFO     | mltrainer.trainer:report:209 - Epoch 2 train 0.5070 test 0.5162 metric ['0.8206']
100%|██████████| 128/128 [00:01<00:00, 105.10it/s]
2025-09-21 15:31:39.351 | INFO     | mltrainer.trainer:report:209 - Epoch 3 train 0.4791 test 0.4813 metric ['0.8330']
100%|██████████| 

Random Units: 64, 128


100%|██████████| 128/128 [00:01<00:00, 123.62it/s]
2025-09-21 15:31:42.700 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 1.0491 test 0.6903 metric ['0.7509']
100%|██████████| 128/128 [00:00<00:00, 136.13it/s]
2025-09-21 15:31:44.053 | INFO     | mltrainer.trainer:report:209 - Epoch 1 train 0.6088 test 0.5855 metric ['0.7847']
100%|██████████| 128/128 [00:01<00:00, 103.70it/s]
2025-09-21 15:31:45.799 | INFO     | mltrainer.trainer:report:209 - Epoch 2 train 0.5226 test 0.5443 metric ['0.8052']
100%|██████████| 128/128 [00:01<00:00, 97.88it/s]
2025-09-21 15:31:47.576 | INFO     | mltrainer.trainer:report:209 - Epoch 3 train 0.4917 test 0.5064 metric ['0.8240']
100%|██████████| 1

Random Units: 64, 64


100%|██████████| 128/128 [00:01<00:00, 108.10it/s]
2025-09-21 15:31:50.835 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 1.0976 test 0.6966 metric ['0.7430']
100%|██████████| 128/128 [00:01<00:00, 111.35it/s]
2025-09-21 15:31:52.394 | INFO     | mltrainer.trainer:report:209 - Epoch 1 train 0.6425 test 0.5965 metric ['0.7793']
100%|██████████| 128/128 [00:01<00:00, 121.31it/s]
2025-09-21 15:31:53.837 | INFO     | mltrainer.trainer:report:209 - Epoch 2 train 0.5461 test 0.5295 metric ['0.8170']
100%|██████████| 128/128 [00:00<00:00, 145.35it/s]
2025-09-21 15:31:55.214 | INFO     | mltrainer.trainer:report:209 - Epoch 3 train 0.5155 test 0.5278 metric ['0.8163']
100%|██████████| 

Random Units: 64, 64


100%|██████████| 128/128 [00:01<00:00, 121.92it/s]
2025-09-21 15:31:58.443 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 1.1001 test 0.7259 metric ['0.7223']
100%|██████████| 128/128 [00:01<00:00, 104.92it/s]
2025-09-21 15:32:00.033 | INFO     | mltrainer.trainer:report:209 - Epoch 1 train 0.6333 test 0.6062 metric ['0.7875']
100%|██████████| 128/128 [00:01<00:00, 124.73it/s]
2025-09-21 15:32:01.543 | INFO     | mltrainer.trainer:report:209 - Epoch 2 train 0.5478 test 0.5392 metric ['0.8027']
100%|██████████| 128/128 [00:01<00:00, 104.20it/s]
2025-09-21 15:32:03.267 | INFO     | mltrainer.trainer:report:209 - Epoch 3 train 0.4991 test 0.5219 metric ['0.8136']
100%|██████████| 

Random Units: 64, 64


100%|██████████| 128/128 [00:01<00:00, 123.02it/s]
2025-09-21 15:32:06.376 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 1.0792 test 0.7149 metric ['0.7416']
100%|██████████| 128/128 [00:00<00:00, 130.63it/s]
2025-09-21 15:32:07.796 | INFO     | mltrainer.trainer:report:209 - Epoch 1 train 0.6204 test 0.6702 metric ['0.7667']
100%|██████████| 128/128 [00:00<00:00, 130.72it/s]
2025-09-21 15:32:09.213 | INFO     | mltrainer.trainer:report:209 - Epoch 2 train 0.5358 test 0.5376 metric ['0.8011']
100%|██████████| 128/128 [00:01<00:00, 121.18it/s]
2025-09-21 15:32:10.648 | INFO     | mltrainer.trainer:report:209 - Epoch 3 train 0.5023 test 0.5562 metric ['0.7966']
2025-09-21 15:32:10.648 | 