# Exercises
# 1. Tune the network
Run the experiment below, explore the different parameters (see the suggestions below) and study the results with TensorBoard.
Make a single-page report (one A4) of your findings. Use your visualisation skills to communicate your most important findings.

In [1]:
from mads_datasets import DatasetFactoryProvider, DatasetType

from mltrainer.preprocessors import BasePreprocessor
from mltrainer import imagemodels, Trainer, TrainerSettings, ReportTypes, metrics

import torch.optim as optim
from torch import nn
from tomlserializer import TOMLSerializer

We will use `tomlserializer` to keep track of our experiments and to save what we did along the way.
It can export things like settings and models to a simple `toml` file, which is easy to share, check and modify.

First, we need the data. 

In [2]:
fashionfactory = DatasetFactoryProvider.create_factory(DatasetType.FASHION)
preprocessor = BasePreprocessor()
streamers = fashionfactory.create_datastreamer(batchsize=64, preprocessor=preprocessor)
train = streamers["train"]
valid = streamers["valid"]
trainstreamer = train.stream()
validstreamer = valid.stream()

2025-09-14 13:00:14.357 | INFO     | mads_datasets.base:download_data:121 - Folder already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist
2025-09-14 13:00:14.363 | INFO     | mads_datasets.base:download_data:124 - File already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist\fashionmnist.pt


We need a way to determine how well our model is performing. We will use accuracy as a metric.

In [3]:
accuracy = metrics.Accuracy()

You can set up a single experiment.

- We will show the model batches of 64 images, 
- and for every epoch we will show the model 100 batches (`train_steps=100`).
- then, we will test how well the model is doing on unseen data (`valid_steps=100`).
- we will report our results during training to tensorboard, and report all configuration to a toml file.
- we will log the results into a directory called "modellogs", but you could change this to whatever you want.
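As a quick sanity check on the numbers in this list, the sketch below computes how many images the model sees in one epoch. The helper name `images_per_epoch` is illustrative, and the training-set size of 60,000 is an assumption (the standard Fashion-MNIST training split), not something read from the streamer.

```python
def images_per_epoch(batchsize: int, train_steps: int) -> int:
    """Number of images shown to the model in one epoch."""
    return batchsize * train_steps


# Assumption: the Fashion-MNIST training split holds 60,000 images.
n_train = 60_000

seen = images_per_epoch(batchsize=64, train_steps=100)
print(seen)            # 6400
print(seen / n_train)  # fraction of the training set covered per epoch
```

With these settings an epoch covers only part of the training data, which is worth keeping in mind when comparing runs that use different `train_steps`.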

In [4]:
import torch
loss_fn = torch.nn.CrossEntropyLoss()

settings = TrainerSettings(
    epochs=3,
    metrics=[accuracy],
    logdir="modellogs",
    train_steps=100,
    valid_steps=100,
    reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
)


We will use a very basic model: a model with three linear layers.

In [5]:
class NeuralNetwork(nn.Module):
    def __init__(self, num_classes: int, units1: int, units2: int) -> None:
        super().__init__()
        self.num_classes = num_classes
        self.units1 = units1
        self.units2 = units2
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, units1),
            nn.ReLU(),
            nn.Linear(units1, units2),
            nn.ReLU(),
            nn.Linear(units2, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

model = NeuralNetwork(
    num_classes=10, units1=256, units2=256)

I developed the `tomlserializer` package; it is a useful tool for saving configs, models and settings as a toml file. That way it is easy to track what you changed during your experiments.

This package first checks whether a `__dict__` attribute is available. If so, it uses that to extract the parameters that do not start with an underscore, like this:

In [6]:
{k: v for k, v in model.__dict__.items() if not k.startswith("_")}

{'training': True, 'num_classes': 10, 'units1': 256, 'units2': 256}
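To see the underscore filter in isolation, here is a minimal sketch on a plain Python class. The `Config` class and the `public_attributes` helper are hypothetical names used for illustration; this mirrors the dictionary comprehension above and is not the actual `tomlserializer` implementation. It also suggests why `training` shows up in the output while `flatten` and `linear_relu_stack` do not: `nn.Module` stores submodules in the private `_modules` dictionary.

```python
class Config:
    """Hypothetical stand-in for a model with public and private attributes."""

    def __init__(self) -> None:
        self.units1 = 256
        self.units2 = 128
        self._cache = {}  # leading underscore: skipped by the filter


def public_attributes(obj: object) -> dict:
    """Return the instance attributes whose names do not start with an underscore."""
    return {k: v for k, v in vars(obj).items() if not k.startswith("_")}


print(public_attributes(Config()))  # {'units1': 256, 'units2': 128}
```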

This means that if you want to add more parameters to the `.toml` file, e.g. `units3`, you can add them to the class like this:

```python
class NeuralNetwork(nn.Module):
    def __init__(self, num_classes: int, units1: int, units2: int, units3: int) -> None:
        super().__init__()
        self.num_classes = num_classes
        self.units1 = units1
        self.units2 = units2
        self.units3 = units3  # <-- add this line
```

It will then be added to the `.toml` file. Check the result for yourself by using the `.save()` method of the `TOMLSerializer` class like this:

In [None]:
tomlserializer = TOMLSerializer()
tomlserializer.save(settings, "settings.toml")
tomlserializer.save(model, "model.toml")

Check the `settings.toml` and `model.toml` files to see what is in there.

You can use the `Trainer` class from my `mltrainer` module to train your model. It has the `TOMLSerializer` integrated, so it will automatically save the settings and model to a toml file if you have added `TOML` as a reporttype in the settings.

In [None]:
trainer = Trainer(
    model=model,
    settings=settings,
    loss_fn=loss_fn,
    optimizer=optim.Adam,
    traindataloader=trainstreamer,
    validdataloader=validstreamer,
    scheduler=optim.lr_scheduler.ReduceLROnPlateau
)
trainer.loop()

Now, check the results of your experiment in the `modellogs` directory.

We can now loop this with a naive approach, called a grid search (why do you think I call it naive?).

In [None]:
units = [256, 128, 64]
for unit1 in units:
    for unit2 in units:
        print(f"Units: {unit1}, {unit2}")

Of course, this might not be the best way to search for a model; some configurations will be better than others (can you predict up front what will be the best configuration?).

So, feel free to improve upon the gridsearch by adding your own logic.
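One possible refinement, sketched here under the assumption that you cannot afford to train every combination, is to sample a random subset of the grid. The helper `sample_configs` and the seed are illustrative choices, not part of `mltrainer`.

```python
import itertools
import random

units = [256, 128, 64]

# The full grid grows as len(units) ** number_of_layers.
grid = list(itertools.product(units, repeat=2))
print(len(grid))  # 9


def sample_configs(grid: list, n: int, seed: int = 42) -> list:
    """Pick n distinct configurations at random, reproducibly."""
    rng = random.Random(seed)
    return rng.sample(grid, k=min(n, len(grid)))


for unit1, unit2 in sample_configs(grid, n=4):
    print(f"Units: {unit1}, {unit2}")
```

Fixing the seed keeps the selection reproducible, so you can rerun the same subset after changing something else.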

In [None]:
import torch

units = [1024, 512, 256, 128, 64, 32, 16]
loss_fn = torch.nn.CrossEntropyLoss()

settings = TrainerSettings(
    epochs=3,
    metrics=[accuracy],
    logdir="modellogs",
    train_steps=len(train),
    valid_steps=len(valid),
    reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
)

for unit1 in units:
    for unit2 in units:

        model = NeuralNetwork(num_classes=10, units1=unit1, units2=unit2)

        trainer = Trainer(
            model=model,
            settings=settings,
            loss_fn=loss_fn,
            optimizer=optim.Adam,
            traindataloader=trainstreamer,
            validdataloader=validstreamer,
            scheduler=optim.lr_scheduler.ReduceLROnPlateau
        )
        trainer.loop()


Because we have set the ReportType to TOML, you will find in every log dir a model.toml and settings.toml file.
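Since every run gets its own timestamped directory, it can help to list the runs programmatically instead of copying directory names by hand. The following is a minimal sketch; `find_runs` is a hypothetical helper, and it assumes that `model.toml` sits directly inside each run directory.

```python
from pathlib import Path


def find_runs(logdir: str = "modellogs") -> list[Path]:
    """Return run directories under logdir that contain a model.toml, oldest first."""
    root = Path(logdir)
    if not root.is_dir():
        return []
    # Timestamped names such as 20250914-130140 sort chronologically as strings.
    return sorted(p for p in root.iterdir() if (p / "model.toml").is_file())


for run in find_runs():
    print(run.name)
```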

Run the experiment, and study the result with tensorboard. 

Locally, it is easy to do that with VS Code itself. On the server, you have to take these steps:

- in the terminal, `cd` to the location of the repository
- activate the Python environment for the shell. Note how the correct environment is being activated.
- run `tensorboard --logdir=modellogs` in the terminal
- TensorBoard will launch at `localhost:6006` and VS Code will notify you that the port is forwarded
- you can either press the `launch` button in VS Code or open your local browser at `localhost:6006`

##### Experiment

In [7]:
class NeuralNetwork(nn.Module):
    def __init__(self, num_classes: int, units1: int, units2: int) -> None:
        super().__init__()
        self.num_classes = num_classes
        self.units1 = units1
        self.units2 = units2
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, units1),
            nn.ReLU(),
            nn.Linear(units1, units2),
            nn.ReLU(),
            nn.Linear(units2, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

In [28]:
from tensorboard.backend.event_processing.event_accumulator import EventAccumulator
def extract_scalar_data(log_dir, scalar_tag):
    event_acc = EventAccumulator(log_dir)
    event_acc.Reload()  # Load the logs
    scalar_data = event_acc.Scalars(scalar_tag)  
    return [(e.step, e.value) for e in scalar_data]

In [60]:
import os
from tensorboard.backend.event_processing import event_accumulator

def list_scalar_tags(log_dir):
    # Ensure the log directory exists
    if not os.path.exists(log_dir):
        print(f"The directory {log_dir} does not exist.")
        return
    
    # Load the event file using TensorBoard's EventAccumulator
    ea = event_accumulator.EventAccumulator(log_dir)
    ea.Reload()  # Read the events from the log file
    
    # Get all scalar tags (metrics that have been logged)
    scalar_tags = ea.Tags()['scalars']
    
    # Return the list of scalar tags
    return scalar_tags

# Replace with the path to your TensorBoard log directory
log_dir = 'modellogs/20250914-131641'

# List all scalar tags in the log directory
scalar_tags = list_scalar_tags(log_dir)
print(scalar_tags)

['Loss/train', 'Loss/test', 'metric/Accuracy', 'learning_rate']
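Once a tag is extracted as a list of `(step, value)` pairs, a small summary is often enough to compare runs. The sketch below is one way to do that; `summarize` is a hypothetical helper, and the example values are made up for illustration, not taken from the runs in this notebook.

```python
def summarize(series: list[tuple[int, float]]) -> dict:
    """Summarise (step, value) pairs, treating higher values as better."""
    if not series:
        return {"final": None, "best": None, "best_step": None}
    best_step, best = max(series, key=lambda pair: pair[1])
    return {"final": series[-1][1], "best": best, "best_step": best_step}


# Made-up values for illustration only.
example = [(0, 0.71), (1, 0.80), (2, 0.78)]
print(summarize(example))  # {'final': 0.78, 'best': 0.8, 'best_step': 1}
```

For a loss tag such as `Loss/test`, lower is better, so you would use `min` instead of `max`.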


Change the number of epochs, e.g. to 5 or 10.

In [9]:
number_epochs = [5, 10]

streamers = fashionfactory.create_datastreamer(batchsize=64, preprocessor=preprocessor)
train = streamers["train"]
valid = streamers["valid"]
trainstreamer = train.stream()
validstreamer = valid.stream()
loss_fn = torch.nn.CrossEntropyLoss()

for epoch in number_epochs:
    settings = TrainerSettings(
        epochs=epoch,
        metrics=[accuracy],
        logdir="modellogs",
        train_steps=len(train),
        valid_steps=len(valid),
        reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
    )
    model = NeuralNetwork(num_classes=10, units1=256, units2=256)

    trainer = Trainer(
        model=model,
        settings=settings,
        loss_fn=loss_fn,
        optimizer=optim.SGD,
        traindataloader=trainstreamer,
        validdataloader=validstreamer,
        scheduler=optim.lr_scheduler.ReduceLROnPlateau
    )
    trainer.loop()

2025-09-14 13:01:40.692 | INFO     | mads_datasets.base:download_data:121 - Folder already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist
2025-09-14 13:01:40.693 | INFO     | mads_datasets.base:download_data:124 - File already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist\fashionmnist.pt
2025-09-14 13:01:40.716 | INFO     | mltrainer.trainer:dir_add_timestamp:24 - Logging to modellogs\20250914-130140
2025-09-14 13:01:41.642 | INFO     | mltrainer.trainer:__init__:68 - Found earlystop_kwargs in settings.Set to None if you dont want earlystopping.
100%|██████████| 937/937 [00:03<00:00, 271.66it/s]
2025-09-14 13:01:45.636 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 2.2482 test 2.1

In [None]:
import os
import matplotlib.pyplot as plt

log_dir_base = 'modellogs/'
map_epochs = ['20250914-130140', '20250914-130200']
number_epochs = [5, 10]

# Map each epoch setting to its logged (step, accuracy) pairs
log_dir_per_epoch = {}
for run_name, n_epochs in zip(map_epochs, number_epochs):
    log_dir = f"{log_dir_base}{run_name}"
    log_dir_per_epoch[n_epochs] = extract_scalar_data(log_dir, 'metric/Accuracy')

os.makedirs('figures', exist_ok=True)  # savefig fails if the folder is missing
plt.figure(figsize=(10, 6))

for epoch, accuracy_data in log_dir_per_epoch.items():
    steps = [x[0] for x in accuracy_data]
    accuracies = [x[1] for x in accuracy_data]
    
    plt.plot(steps, accuracies, label=f'Epochs = {epoch}')

plt.xlabel('Steps')
plt.ylabel('Accuracy')
plt.title('Accuracy vs Steps for Different Epoch Configurations')
plt.legend()
plt.grid(True)
plt.savefig('figures/epoch_accuracy_plot.png')
plt.close()

In [10]:
import time
time.sleep(120)

Change the number of units in `units1` and `units2` to values between 16 and 1024. Use powers of 2 to easily scan the range: 16, 32, 64, etc.
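The range can be generated rather than typed out. A minimal sketch, with the exponents 4 to 10 chosen to match the bounds 16 and 1024:

```python
# Powers of two from 2**10 = 1024 down to 2**4 = 16
units = [2**k for k in range(10, 3, -1)]
print(units)  # [1024, 512, 256, 128, 64, 32, 16]

# Two layers searched over the same list give len(units) ** 2 runs.
print(len(units) ** 2)  # 49
```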

In [None]:
import torch
units = [1024, 512, 256, 128, 64, 32, 16]

streamers = fashionfactory.create_datastreamer(batchsize=64, preprocessor=preprocessor)
train = streamers["train"]
valid = streamers["valid"]
trainstreamer = train.stream()
validstreamer = valid.stream()
loss_fn = torch.nn.CrossEntropyLoss()

settings = TrainerSettings(
    epochs=3,
    metrics=[accuracy],
    logdir="modellogs",
    train_steps=len(train),
    valid_steps=len(valid),
    reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
)

for unit1 in units:
    for unit2 in units:

        model = NeuralNetwork(num_classes=10, units1=unit1, units2=unit2)

        trainer = Trainer(
            model=model,
            settings=settings,
            loss_fn=loss_fn,
            optimizer=optim.SGD,
            traindataloader=trainstreamer,
            validdataloader=validstreamer,
            scheduler=optim.lr_scheduler.ReduceLROnPlateau
        )
        trainer.loop()

2025-09-14 13:04:40.618 | INFO     | mads_datasets.base:download_data:121 - Folder already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist
2025-09-14 13:04:40.619 | INFO     | mads_datasets.base:download_data:124 - File already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist\fashionmnist.pt
2025-09-14 13:04:40.649 | INFO     | mltrainer.trainer:dir_add_timestamp:24 - Logging to modellogs\20250914-130440
2025-09-14 13:04:40.652 | INFO     | mltrainer.trainer:__init__:68 - Found earlystop_kwargs in settings.Set to None if you dont want earlystopping.
100%|██████████| 937/937 [00:06<00:00, 133.98it/s]
2025-09-14 13:04:48.175 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 2.2098 test 2.1

In [57]:
import numpy as np
log_dirs = ["20250914-130440", "20250914-130503", "20250914-130521", "20250914-130539", "20250914-130555", 
            "20250914-130611", "20250914-130626", "20250914-130641", "20250914-130657", "20250914-130711",
            "20250914-130725", "20250914-130738", "20250914-130750", "20250914-130803", "20250914-130815",
            "20250914-130828", "20250914-130840", "20250914-130852", "20250914-130903", "20250914-130914",
            "20250914-130926", "20250914-130937", "20250914-130949", "20250914-131000", "20250914-131012",
            "20250914-131023", "20250914-131034", "20250914-131044", "20250914-131055", "20250914-131106", 
            "20250914-131117", "20250914-131128", "20250914-131139", "20250914-131149", "20250914-131200",
            "20250914-131210", "20250914-131222", "20250914-131233", "20250914-131243", "20250914-131254", 
            "20250914-131305", "20250914-131316", "20250914-131326", "20250914-131337", "20250914-131348",
            "20250914-131359", "20250914-131409", "20250914-131420", "20250914-131431"]
units = [1024, 512, 256, 128, 64, 32, 16]
combinations = [(unit1, unit2) for unit1 in units for unit2 in units]
accuracy_data_per_experiment = {}

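# Runs were logged in the same order as `combinations`, so pair them up
for log_dir, combination in zip(log_dirs, combinations):
    accuracy_data = extract_scalar_data('modellogs/' + log_dir, 'metric/Accuracy')
    accuracy_data_per_experiment[combination] = accuracy_data[-1][1]  # final accuracy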

layer_1_units = sorted(set(unit1 for unit1, unit2 in accuracy_data_per_experiment.keys()))
layer_2_units = sorted(set(unit2 for unit1, unit2 in accuracy_data_per_experiment.keys()))

accuracy_grid = np.zeros((len(layer_1_units), len(layer_2_units)))

for i, layer_1 in enumerate(layer_1_units):
    for j, layer_2 in enumerate(layer_2_units):
        accuracy_grid[i, j] = accuracy_data_per_experiment.get((layer_1, layer_2), np.nan)  # Get accuracy for (layer_1, layer_2)

plt.figure(figsize=(10, 6))
cax = plt.imshow(accuracy_grid, cmap='YlGnBu', interpolation='nearest')

# Add a color bar to show the accuracy scale
plt.colorbar(cax)

# Label the axes with units
plt.xticks(np.arange(len(layer_2_units)), layer_2_units)  
plt.yticks(np.arange(len(layer_1_units)), layer_1_units)

# Rotate the x and y tick labels to make them readable
plt.xticks(rotation=45, ha='right')
plt.yticks(rotation=45, ha='right')

# Add title and axis labels
plt.title('Heatmap of Final Accuracy for Different Unit Combinations')
plt.xlabel('Units in Layer 2')
plt.ylabel('Units in Layer 1')
plt.savefig('figures/heatmap_accuracy_amount_units.png')
plt.close()

In [12]:
import time
time.sleep(120)

Change the batch size to values between 4 and 128. Again, use powers of two for convenience.
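The batch size also determines how many steps are needed for one full pass over the data. The sketch below assumes 60,000 training images (the standard Fashion-MNIST split) and that an incomplete final batch is dropped; both are assumptions, although the second is consistent with the 937 steps shown in the progress bars for batch size 64.

```python
def steps_per_epoch(n_samples: int, batchsize: int) -> int:
    """Steps for one full pass, dropping an incomplete final batch (assumption)."""
    return n_samples // batchsize


n_train = 60_000  # assumption: standard Fashion-MNIST training split

for batchsize in [4, 8, 16, 32, 64, 128]:
    print(batchsize, steps_per_epoch(n_train, batchsize))
# 4 15000
# 8 7500
# 16 3750
# 32 1875
# 64 937
# 128 468
```

If the number of steps is held constant across batch sizes, a smaller batch size means the model sees fewer images per epoch.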

In [13]:
import torch
batchsizes = [4, 8, 16, 32, 64, 128]

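loss_fn = torch.nn.CrossEntropyLoss()

# Note: train_steps and valid_steps are taken from the `train` and `valid`
# streamers that exist before the loop, so every run uses the same number
# of steps per epoch regardless of its batch size.
settings = TrainerSettings(
    epochs=3,
    metrics=[accuracy],
    logdir="modellogs",
    train_steps=len(train),
    valid_steps=len(valid),
    reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
)

for batchsize in batchsizes:
    # Create a fresh model per run; reusing one model would let later
    # batch sizes continue from already-trained weights.
    model = NeuralNetwork(num_classes=10, units1=256, units2=256)

    streamers = fashionfactory.create_datastreamer(batchsize=batchsize, preprocessor=preprocessor)
    train = streamers["train"]
    valid = streamers["valid"]
    trainstreamer = train.stream()
    validstreamer = valid.stream()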

    trainer = Trainer(
        model=model,
        settings=settings,
        loss_fn=loss_fn,
        optimizer=optim.SGD,
        traindataloader=trainstreamer,
        validdataloader=validstreamer,
        scheduler=optim.lr_scheduler.ReduceLROnPlateau
    )
    trainer.loop()

2025-09-14 13:16:41.614 | INFO     | mads_datasets.base:download_data:121 - Folder already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist
2025-09-14 13:16:41.618 | INFO     | mads_datasets.base:download_data:124 - File already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist\fashionmnist.pt
2025-09-14 13:16:41.660 | INFO     | mltrainer.trainer:dir_add_timestamp:24 - Logging to modellogs\20250914-131641
2025-09-14 13:16:41.660 | INFO     | mltrainer.trainer:__init__:68 - Found earlystop_kwargs in settings.Set to None if you dont want earlystopping.
100%|██████████| 937/937 [00:00<00:00, 1113.09it/s]
2025-09-14 13:16:42.560 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 2.2487 test 2.

In [79]:
log_dirs = ["20250914-131641", "20250914-131644", "20250914-131647","20250914-131652", "20250914-131659", "20250914-131710"]
batchsizes = [4, 8, 16, 32, 64, 128]
epochs = [0, 1, 2]
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b']
loss_train_data_per_experiment = {}
loss_test_data_per_experiment = {}

for i in range(len(log_dirs)):
    loss_train_data= extract_scalar_data('modellogs/'+ log_dirs[i], 'Loss/train')
    loss_test_data = extract_scalar_data('modellogs/'+ log_dirs[i], 'Loss/test')
    loss_train_data_per_experiment[batchsizes[i]] =[value for index, value in loss_train_data]
    loss_test_data_per_experiment[batchsizes[i]] =[value for index, value in loss_test_data]

for i, batch_size in enumerate(batchsizes):
    plt.plot(epochs, loss_train_data_per_experiment[batch_size], label=f'Train Loss Batch {batch_size}', color=colors[i])
    plt.plot(epochs, loss_test_data_per_experiment[batch_size], label=f'Test Loss Batch {batch_size}', linestyle='--', color=colors[i])

plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Test Loss vs Epochs for Different Batch Sizes')
plt.legend(loc='best')
plt.grid(True)
plt.tight_layout()
plt.savefig('figures/loss_batchsize.png')
plt.close()

In [72]:
gen_gap = {batch_size: np.array(loss_train_data_per_experiment[batch_size]) - np.array(loss_test_data_per_experiment[batch_size])
           for batch_size in batchsizes}
for i, batch_size in enumerate(batchsizes):
    plt.plot(epochs, gen_gap[batch_size], label=f'Difference in Loss Batch {batch_size}', color=colors[i])
plt.xlabel('Epochs')
plt.ylabel('Difference in Loss')
plt.title('Difference in Training and Test Loss vs Epochs for Different Batch Sizes')
plt.legend(loc='best')
plt.grid(True)
plt.tight_layout()
plt.savefig('figures/gen_loss_batchsize.png')
plt.close()

In [14]:
import time
time.sleep(120)

Change the depth of your model by adding an additional linear layer + activation function.
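Adding a layer also adds trainable parameters, which is worth knowing when you compare the two models. The sketch below counts weights and biases for a stack of fully connected layers; `count_parameters` is an illustrative helper that works on a list of layer widths, not on the `nn.Module` itself.

```python
def count_parameters(layer_sizes: list[int]) -> int:
    """Weights plus biases of a fully connected network with the given layer widths."""
    return sum(
        n_in * n_out + n_out
        for n_in, n_out in zip(layer_sizes, layer_sizes[1:])
    )


# Input 28*28, hidden layers of 256 units, 10 output classes
print(count_parameters([28 * 28, 256, 256, 10]))       # 269322
print(count_parameters([28 * 28, 256, 256, 256, 10]))  # 335114
```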

In [15]:
class NeuralNetwork(nn.Module):
    def __init__(self, num_classes: int, units1: int, units2: int, units3: int) -> None:
        super().__init__()
        self.num_classes = num_classes
        self.units1 = units1
        self.units2 = units2
        self.units3 = units3
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, units1),
            nn.ReLU(),
            nn.Linear(units1, units2),
            nn.ReLU(),
            nn.Linear(units2, units3),
            nn.ReLU(),
            nn.Linear(units3, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

In [16]:
streamers = fashionfactory.create_datastreamer(batchsize=64, preprocessor=preprocessor)
train = streamers["train"]
valid = streamers["valid"]
trainstreamer = train.stream()
validstreamer = valid.stream()
loss_fn = torch.nn.CrossEntropyLoss()
settings = TrainerSettings(
    epochs=3,
    metrics=[accuracy],
    logdir="modellogs",
    train_steps=len(train),
    valid_steps=len(valid),
    reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
)
model = NeuralNetwork(num_classes=10, units1=256, units2=256, units3=256)
trainer = Trainer(
        model=model,
        settings=settings,
        loss_fn=loss_fn,
        optimizer=optim.SGD,
        traindataloader=trainstreamer,
        validdataloader=validstreamer,
        scheduler=optim.lr_scheduler.ReduceLROnPlateau
    )
trainer.loop()

2025-09-14 13:19:31.589 | INFO     | mads_datasets.base:download_data:121 - Folder already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist
2025-09-14 13:19:31.589 | INFO     | mads_datasets.base:download_data:124 - File already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist\fashionmnist.pt
2025-09-14 13:19:31.609 | INFO     | mltrainer.trainer:dir_add_timestamp:24 - Logging to modellogs\20250914-131931
2025-09-14 13:19:31.609 | INFO     | mltrainer.trainer:__init__:68 - Found earlystop_kwargs in settings.Set to None if you dont want earlystopping.
100%|██████████| 937/937 [00:03<00:00, 260.24it/s]
2025-09-14 13:19:35.643 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 2.2960 test 2.2

In [86]:
log_dirs = ["20250914-142547", "20250914-131931"]
number_layers = ['2', '3']
epochs = [0, 1, 2]
colors = ['#1f77b4', '#ff7f0e']
accuracy_data_per_experiment = {}

for i in range(len(log_dirs)):
    accuracy_data= extract_scalar_data('modellogs/'+ log_dirs[i], 'metric/Accuracy')
    accuracy_data_per_experiment[number_layers[i]] =[value for index, value in accuracy_data]

for i, number_layer in enumerate(number_layers):
    plt.plot(epochs, accuracy_data_per_experiment[number_layer], label=f'{number_layer} layers', color=colors[i])

plt.xlabel('Epochs')
plt.ylabel('Accuracy')
plt.title('Accuracy vs Epochs for Different Number of Layers')
plt.legend(loc='best')
plt.grid(True)
plt.tight_layout()
plt.savefig('figures/accuracy_depth.png')
plt.close()

In [23]:
class NeuralNetwork(nn.Module):
    def __init__(self, num_classes: int, units1: int, units2: int) -> None:
        super().__init__()
        self.num_classes = num_classes
        self.units1 = units1
        self.units2 = units2
        self.flatten = nn.Flatten()
        self.linear_relu_stack = nn.Sequential(
            nn.Linear(28 * 28, units1),
            nn.ReLU(),
            nn.Linear(units1, units2),
            nn.ReLU(),
            nn.Linear(units2, num_classes),
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        x = self.flatten(x)
        logits = self.linear_relu_stack(x)
        return logits

In [18]:
import time
time.sleep(120)

Change the learning rate to values between 1e-2 and 1e-5.
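Learning rates are usually scanned on a logarithmic scale, one value per order of magnitude. A minimal sketch that generates the range from the exponents 2 to 5:

```python
# One learning rate per order of magnitude, from 1e-2 down to 1e-5.
# Parsing the decimal string avoids floating-point rounding surprises.
learningrates = [float(f"1e-{k}") for k in range(2, 6)]
print(learningrates)  # [0.01, 0.001, 0.0001, 1e-05]
```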

In [27]:
learningrates = [0.01, 0.001, 0.0001, 0.00001]

streamers = fashionfactory.create_datastreamer(batchsize=64, preprocessor=preprocessor)
train = streamers["train"]
valid = streamers["valid"]
trainstreamer = train.stream()
validstreamer = valid.stream()
loss_fn = torch.nn.CrossEntropyLoss()



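for learningrate in learningrates:
    # Create a fresh model per learning rate so that runs do not
    # continue from the weights trained in the previous run.
    model = NeuralNetwork(num_classes=10, units1=256, units2=256)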
    settings = TrainerSettings(
        epochs=3,
        metrics=[accuracy],
        logdir="modellogs",
        train_steps=len(train),
        valid_steps=len(valid),
        reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
        optimizer_kwargs={"lr": learningrate}
    )
    trainer = Trainer(
            model=model,
            settings=settings,
            loss_fn=loss_fn,
            optimizer=optim.SGD,
            traindataloader=trainstreamer,
            validdataloader=validstreamer,
            scheduler=optim.lr_scheduler.ReduceLROnPlateau
        )
    trainer.loop()

2025-09-14 14:25:47.179 | INFO     | mads_datasets.base:download_data:121 - Folder already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist
2025-09-14 14:25:47.179 | INFO     | mads_datasets.base:download_data:124 - File already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist\fashionmnist.pt
2025-09-14 14:25:47.222 | INFO     | mltrainer.trainer:dir_add_timestamp:24 - Logging to modellogs\20250914-142547
2025-09-14 14:25:47.223 | INFO     | mltrainer.trainer:__init__:68 - Found earlystop_kwargs in settings.Set to None if you dont want earlystopping.
100%|██████████| 937/937 [00:03<00:00, 289.02it/s]
2025-09-14 14:25:50.889 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 1.3328 test 0.8

In [81]:
log_dirs = ["20250914-142547","20250914-142558", "20250914-142609", "20250914-142620"]
learningrates = [0.01, 0.001, 0.0001, 0.00001]
epochs = [0, 1, 2]
colors = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728']
loss_train_data_per_experiment = {}
loss_test_data_per_experiment = {}

for i in range(len(log_dirs)):
    loss_train_data= extract_scalar_data('modellogs/'+ log_dirs[i], 'Loss/train')
    loss_test_data = extract_scalar_data('modellogs/'+ log_dirs[i], 'Loss/test')
    loss_train_data_per_experiment[learningrates[i]] =[value for index, value in loss_train_data]
    loss_test_data_per_experiment[learningrates[i]] =[value for index, value in loss_test_data]

for i, learningrate in enumerate(learningrates):
    plt.plot(epochs, loss_train_data_per_experiment[learningrate], label=f'Train Loss learningrate {learningrate}', color=colors[i])
    plt.plot(epochs, loss_test_data_per_experiment[learningrate], label=f'Test Loss learningrate {learningrate}', linestyle='--', color=colors[i])

plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Test Loss vs Epochs for Different learningrates')
plt.legend(loc='best')
plt.grid(True)
plt.tight_layout()
plt.savefig('figures/loss_learningrates.png')
plt.close()

In [26]:
import time
time.sleep(120)

Change the optimizer from SGD to one of the other available algorithms.

In [25]:
import torch.optim as optim
streamers = fashionfactory.create_datastreamer(batchsize=64, preprocessor=preprocessor)
train = streamers["train"]
valid = streamers["valid"]
trainstreamer = train.stream()
validstreamer = valid.stream()
loss_fn = torch.nn.CrossEntropyLoss()

settings = TrainerSettings(
    epochs=3,
    metrics=[accuracy],
    logdir="modellogs",
    train_steps=len(train),
    valid_steps=len(valid),
    reporttypes=[ReportTypes.TENSORBOARD, ReportTypes.TOML],
)

model = NeuralNetwork(num_classes=10, units1=256, units2=256)
trainer = Trainer(
        model=model,
        settings=settings,
        loss_fn=loss_fn,
        optimizer=optim.Adam,
        traindataloader=trainstreamer,
        validdataloader=validstreamer,
        scheduler=optim.lr_scheduler.ReduceLROnPlateau
    )
trainer.loop()

2025-09-14 14:23:14.351 | INFO     | mads_datasets.base:download_data:121 - Folder already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist
2025-09-14 14:23:14.352 | INFO     | mads_datasets.base:download_data:124 - File already exists at C:\Users\pikob\.cache\mads_datasets\fashionmnist\fashionmnist.pt
2025-09-14 14:23:14.382 | INFO     | mltrainer.trainer:dir_add_timestamp:24 - Logging to modellogs\20250914-142314
2025-09-14 14:23:14.382 | INFO     | mltrainer.trainer:__init__:68 - Found earlystop_kwargs in settings.Set to None if you dont want earlystopping.
100%|██████████| 937/937 [00:04<00:00, 214.64it/s]
2025-09-14 14:23:19.217 | INFO     | mltrainer.trainer:report:209 - Epoch 0 train 0.5119 test 0.4

In [87]:
log_dirs = ["20250914-142547", "20250914-142314"]
optimizers = ['SGD', 'Adam']
epochs = [0, 1, 2]
colors = ['#1f77b4', '#ff7f0e']
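# Start from empty dicts so results from earlier cells do not leak in
loss_train_data_per_experiment = {}
loss_test_data_per_experiment = {}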

for i in range(len(log_dirs)):
    loss_train_data= extract_scalar_data('modellogs/'+ log_dirs[i], 'Loss/train')
    loss_test_data = extract_scalar_data('modellogs/'+ log_dirs[i], 'Loss/test')
    loss_train_data_per_experiment[optimizers[i]] =[value for index, value in loss_train_data]
    loss_test_data_per_experiment[optimizers[i]] =[value for index, value in loss_test_data]

for i, optimizer in enumerate(optimizers):
    plt.plot(epochs, loss_train_data_per_experiment[optimizer], label=f'Train Loss {optimizer} optimizer', color=colors[i])
    plt.plot(epochs, loss_test_data_per_experiment[optimizer], label=f'Test Loss {optimizer} optimizer', linestyle='--', color=colors[i])

plt.xlabel('Epochs')
plt.ylabel('Loss')
plt.title('Training and Test Loss vs Epochs for Different optimizers')
plt.legend(loc='best')
plt.grid(True)
plt.tight_layout()
plt.savefig('figures/loss_optimizer.png')
plt.close()