<a href="https://colab.research.google.com/github/hingma/cs182fa25_public/blob/main/hw06/code/wandb.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Exploring Tooling with Weights and Biases
Similar to TensorBoard, Weights & Biases (wandb) is an application that tracks your training metrics and visualizes them for you. It lets you cleanly sort, organize, and compare your experiments. In this notebook, we walk through an example of how to use wandb.ai and then have you practice.

1. Make an account at https://wandb.ai/site

2. Install the package: `pip install wandb`

3. Log in: `wandb login`

4. When prompted after step 3, paste your wandb API key


In [1]:
!git clone https://github.com/hingma/cs182fa25_public.git

Cloning into 'cs182fa25_public'...
remote: Enumerating objects: 1071, done.
remote: Counting objects: 100% (74/74), done.
remote: Compressing objects: 100% (42/42), done.
remote: Total 1071 (delta 35), reused 50 (delta 19), pack-reused 997 (from 1)
Receiving objects: 100% (1071/1071), 196.75 MiB | 13.78 MiB/s, done.
Resolving deltas: 100% (460/460), done.
Updating files: 100% (477/477), done.


In [2]:
%cd cs182fa25_public/hw06/code/

/content/cs182fa25_public/hw06/code


In [3]:
import torch
import torch.nn as nn
import torch.optim as optim
import torchvision
import torchvision.transforms as transforms
import wandb
from architectures import BasicConvNet, ResNet18, MLP

## Organizing wandb Projects

Each run should have a set of parameters associated with it, so that runs can be compared later. Here we want to log the hyperparameters we are using, so let's list them explicitly below.

In [4]:
project = 'CS182 WANDB.AI Practice Notebook'
learning_rate = 0.01
epochs = 2
architecture = 'CNN'
dataset = 'CIFAR-10'
batch_size = 64
momentum = 0.9
log_freq = 20
print_freq = 200
cuda = torch.cuda.is_available()
device = torch.device("cuda" if cuda else "cpu")
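One way to keep these settings together, sketched below, is a small dataclass whose fields mirror the variables above. `RunConfig` is a hypothetical name, not part of wandb; `dataclasses.asdict` turns it into a plain dictionary of the kind passed as `config` to `wandb.init`.

```python
from dataclasses import dataclass, asdict


@dataclass(frozen=True)
class RunConfig:
    """Hypothetical container for the hyperparameters listed above."""
    learning_rate: float = 0.01
    epochs: int = 2
    architecture: str = "CNN"
    dataset: str = "CIFAR-10"
    batch_size: int = 64
    momentum: float = 0.9


cfg = RunConfig()
config_dict = asdict(cfg)  # plain dict, suitable as a `config` argument
print(config_dict)
```

Freezing the dataclass is a design choice: it prevents a hyperparameter from being changed by accident after the run has been configured.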

### Initializing the Run

In [5]:
wandb.init(
    # set the wandb project where this run will be logged
    project=project,

    # track hyperparameters and run metadata
    config={
    "learning_rate": learning_rate,
    "architecture": architecture,
    "dataset": dataset,
    "epochs": epochs,
    "batch_size": batch_size,
    "momentum": momentum
    }
)

wandb: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
wandb: You can find your API key in your browser here: https://wandb.ai/authorize?ref=models
wandb: Paste an API key from your profile and hit enter:

 ··········

wandb: No netrc file found, creating one.
wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc
wandb: Currently logged in as: moxintang (moxintang-ucb) to https://api.wandb.ai. Use `wandb login --relogin` to force relogin


From here on, we have some standard CIFAR training definitions.

In [6]:
transform = transforms.Compose(
    [transforms.ToTensor(),
     transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5))])

trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                        download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                          shuffle=True, num_workers=2)

testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                       download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                         shuffle=False, num_workers=2)

100%|██████████| 170M/170M [00:14<00:00, 12.1MB/s]
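`transforms.Normalize(mean, std)` computes `(x - mean) / std` per channel. Since `ToTensor` scales pixel values to [0, 1], a mean and standard deviation of 0.5 map them to [-1, 1]. The plain-Python sketch below reproduces that arithmetic for a single value; `normalize` is an illustrative helper, not a torchvision function.

```python
def normalize(x, mean=0.5, std=0.5):
    """Apply the per-channel formula used by Normalize: (x - mean) / std."""
    return (x - mean) / std


# Pixel intensities after ToTensor lie in [0, 1].
for pixel in (0.0, 0.5, 1.0):
    print(pixel, "->", normalize(pixel))
```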


In [7]:
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)
        self.pool = nn.MaxPool2d(2, 2)
        self.conv2 = nn.Conv2d(6, 16, 5)
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)
        self.relu = nn.ReLU()

    def forward(self, x):
        x = self.pool(self.relu(self.conv1(x)))
        x = self.pool(self.relu(self.conv2(x)))
        x = torch.flatten(x, 1) # flatten all dimensions except batch
        x = self.relu(self.fc1(x))
        x = self.relu(self.fc2(x))
        x = self.fc3(x)
        return x
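The `16 * 5 * 5` input size of `fc1` follows from the layer shapes. With no padding and stride 1, a convolution with kernel size k maps a side length n to n - k + 1, and the 2x2 max pool halves it. The sketch below traces a 32x32 CIFAR-10 image through the network; the helper name `conv_out` is illustrative.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output side length of a square convolution or pooling window."""
    return (size + 2 * padding - kernel) // stride + 1


side = 32                                   # CIFAR-10 images are 32x32
side = conv_out(side, kernel=5)             # conv1: 32 -> 28
side = conv_out(side, kernel=2, stride=2)   # pool:  28 -> 14
side = conv_out(side, kernel=5)             # conv2: 14 -> 10
side = conv_out(side, kernel=2, stride=2)   # pool:  10 -> 5

flat_features = 16 * side * side            # conv2 produces 16 channels
print(flat_features)                        # 400
```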

In [8]:
net = Net()

In [9]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(net.parameters(), lr=learning_rate, momentum=momentum)

### Training with wandb

As with TensorBoard, we want to log the accuracy and loss as training progresses. Here we log them every `log_freq` gradient steps rather than at every step. See below for an example.

In [10]:
for epoch in range(epochs):  # loop over the dataset multiple times
    running_loss = 0.0
    running_acc = 0.0
    for i, data in enumerate(trainloader, 0):
        # get the inputs; data is a list of [inputs, labels]
        inputs, labels = data

        # zero the parameter gradients
        optimizer.zero_grad()

        # forward + backward + optimize
        outputs = net(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()
        accuracy = torch.mean((torch.argmax(outputs, dim=1) == labels).float()).item() * 100

        # print statistics
        running_acc += accuracy
        running_loss += loss.item()
        if i % log_freq == log_freq - 1:
            wandb.log({'accuracy': accuracy, 'loss': loss.item()})

        if i % print_freq == print_freq - 1:    # print every 200 mini-batches
            print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / print_freq:.5f} accuracy: {running_acc/print_freq:.5f}')
            running_loss = 0.0
            running_acc = 0.0








[1,   200] loss: 2.25136 accuracy: 15.80469
[1,   400] loss: 1.88940 accuracy: 30.85156
[1,   600] loss: 1.64993 accuracy: 39.31250
[2,   200] loss: 1.48792 accuracy: 45.63281
[2,   400] loss: 1.44491 accuracy: 48.08594
[2,   600] loss: 1.38183 accuracy: 50.07812
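The three progress lines per epoch follow from the batch arithmetic. The CIFAR-10 training set has 50,000 images, so a batch size of 64 gives 782 mini-batches per epoch, and the checks `i % print_freq == print_freq - 1` and `i % log_freq == log_freq - 1` fire 3 and 39 times respectively. A small sketch of that count, assuming the loader keeps the final partial batch (the `DataLoader` default):

```python
import math

num_train_images = 50_000   # size of the CIFAR-10 training split
batch_size = 64
log_freq = 20
print_freq = 200

# The last batch is smaller but still counted (drop_last defaults to False).
batches_per_epoch = math.ceil(num_train_images / batch_size)

logged_steps = [i for i in range(batches_per_epoch) if i % log_freq == log_freq - 1]
printed_steps = [i + 1 for i in range(batches_per_epoch) if i % print_freq == print_freq - 1]

print(batches_per_epoch)   # 782
print(len(logged_steps))   # 39 wandb.log calls per epoch
print(printed_steps)       # [200, 400, 600]
```

Batches after the 600th still update the model and are logged to wandb, but their running averages are reset at the start of the next epoch without being printed.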


After we are done with this run, we call `wandb.finish()` to mark the run as finished and upload any remaining data.

In [11]:
wandb.finish()

Run history:
accuracy,▁▁▁▄▃▂▅▃▄▅▆▅▇▅▅▅█▆▇█▆▅▇▇█▅█▇▆██▅▇█▇▇██▇▇
loss,███▇▇▆▆▆▅▅▄▅▄▄▄▃▄▃▄▃▄▂▃▃▂▂▄▃▄▂▃▃▃▁▁▃▃▁▃▂

Run summary:
accuracy,51.5625
loss,1.31215


## Your Task
We will once again be building classifiers for CIFAR-10. Various architectures are set up for you in the `architectures.py` file. Using wandb, please search through 10 different hyperparameter configurations. Example choices include learning rate, batch size, architecture, and optimization algorithm. Please submit the hyperparameters that result in the highest accuracy on this classification task. Then explore wandb for any visualizations you may need. Feel free to run as many epochs as you like.
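One way to come up with the 10 configurations is to sample them from a grid of candidate values. The sketch below uses only the standard library; the candidate values and the helper name `sample_configs` are illustrative assumptions, and the keys match the hyperparameter names used in this notebook.

```python
import itertools
import random

# Illustrative search space; adjust the candidate values as needed.
search_space = {
    "architecture": ["BasicConvNet", "ResNet18", "MLP"],
    "learning_rate": [0.01, 0.001, 0.0005],
    "batch_size": [32, 64, 128],
    "optimizer": ["sgd", "adam", "rmsprop"],
}


def sample_configs(space, n, seed=0):
    """Draw n distinct configurations from the full grid of combinations."""
    keys = list(space)
    grid = [dict(zip(keys, values)) for values in itertools.product(*space.values())]
    return random.Random(seed).sample(grid, n)


configs = sample_configs(search_space, n=10)
for cfg in configs:
    print(cfg)
```

Fixing the seed keeps the sampled set reproducible, so the same 10 configurations can be rerun later.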

In [12]:
def run(params):
    # Extract parameters
    learning_rate = params.get('learning_rate', 0.01)
    architecture = params.get('architecture', 'BasicConvNet')
    batch_size = params.get('batch_size', 64)
    epochs = params.get('epochs', 5)
    momentum = params.get('momentum', 0.9)
    optimizer_name = params.get('optimizer', 'sgd')
    weight_decay = params.get('weight_decay', 0)
    log_freq = 20
    print_freq = 200
    resize_for_resnet = architecture == 'ResNet18'  # ResNet18 requires 224x224 images

    # Set device
    cuda = torch.cuda.is_available()
    device = torch.device("cuda" if cuda else "cpu")

    # Initialize wandb
    wandb.init(
        project='CS182 CIFAR-10 Hyperparameter Search',
        config=params,
        reinit=True  # Allow multiple runs in the same process
    )

    # Create transformations for data
    transform_list = [transforms.ToTensor()]
    if resize_for_resnet:
        transform_list.insert(0, transforms.Resize((224, 224)))
    transform_list.append(transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)))
    transform = transforms.Compose(transform_list)

    # Load datasets
    trainset = torchvision.datasets.CIFAR10(root='./data', train=True,
                                            download=True, transform=transform)
    trainloader = torch.utils.data.DataLoader(trainset, batch_size=batch_size,
                                              shuffle=True, num_workers=2)

    testset = torchvision.datasets.CIFAR10(root='./data', train=False,
                                           download=True, transform=transform)
    testloader = torch.utils.data.DataLoader(testset, batch_size=batch_size,
                                             shuffle=False, num_workers=2)

    # Create the model based on architecture parameter
    if architecture == 'BasicConvNet':
        net = BasicConvNet()
    elif architecture == 'ResNet18':
        net = ResNet18()
    elif architecture == 'MLP':
        hidden_layers = params.get('hidden_layers', 7)
        hidden_size = params.get('hidden_size', 2048)
        net = MLP(num_layers=hidden_layers, size=hidden_size)
    else:
        raise ValueError(f"Unknown architecture: {architecture}")

    # Move model to device
    net = net.to(device)

    # Define loss function
    criterion = nn.CrossEntropyLoss()

    # Choose optimizer based on parameter
    if optimizer_name.lower() == 'sgd':
        optimizer = optim.SGD(net.parameters(), lr=learning_rate,
                             momentum=momentum, weight_decay=weight_decay)
    elif optimizer_name.lower() == 'adam':
        optimizer = optim.Adam(net.parameters(), lr=learning_rate,
                             weight_decay=weight_decay)
    elif optimizer_name.lower() == 'rmsprop':
        optimizer = optim.RMSprop(net.parameters(), lr=learning_rate,
                                weight_decay=weight_decay)
    else:
        raise ValueError(f"Unknown optimizer: {optimizer_name}")

    # Optional: learning rate scheduler
    scheduler = optim.lr_scheduler.ReduceLROnPlateau(
        optimizer, 'min', patience=1, factor=0.5) if params.get('use_scheduler', False) else None

    # Training loop
    best_accuracy = 0.0
    for epoch in range(epochs):
        # Training phase
        net.train()
        running_loss = 0.0
        running_acc = 0.0
        for i, data in enumerate(trainloader):
            # Get the inputs; data is a list of [inputs, labels]
            inputs, labels = data
            inputs, labels = inputs.to(device), labels.to(device)

            # For MLP, we need to flatten the inputs
            if architecture == 'MLP':
                inputs = inputs.view(inputs.size(0), -1)

            # Zero the parameter gradients
            optimizer.zero_grad()

            # Forward + backward + optimize
            outputs = net(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()
            accuracy = torch.mean((torch.argmax(outputs, dim=1) == labels).float()).item() * 100

            # Log statistics
            running_acc += accuracy
            running_loss += loss.item()
            if i % log_freq == log_freq - 1:
                wandb.log({'train_accuracy': accuracy, 'train_loss': loss.item()})

            if i % print_freq == print_freq - 1:
                print(f'[{epoch + 1}, {i + 1:5d}] loss: {running_loss / print_freq:.5f} accuracy: {running_acc/print_freq:.5f}')
                running_loss = 0.0
                running_acc = 0.0

        # Evaluation phase
        net.eval()
        test_loss = 0.0
        correct = 0
        total = 0
        with torch.no_grad():
            for data in testloader:
                images, labels = data
                images, labels = images.to(device), labels.to(device)

                # For MLP, we need to flatten the inputs
                if architecture == 'MLP':
                    images = images.view(images.size(0), -1)

                outputs = net(images)
                loss = criterion(outputs, labels)
                test_loss += loss.item()

                predicted = outputs.argmax(dim=1)
                total += labels.size(0)
                correct += (predicted == labels).sum().item()

        epoch_test_accuracy = 100 * correct / total
        epoch_test_loss = test_loss / len(testloader)

        print(f'Epoch {epoch+1} Test Accuracy: {epoch_test_accuracy:.2f}%')

        wandb.log({
            'epoch': epoch + 1,
            'test_accuracy': epoch_test_accuracy,
            'test_loss': epoch_test_loss
        })

        # Update best accuracy
        if epoch_test_accuracy > best_accuracy:
            best_accuracy = epoch_test_accuracy

        # Update learning rate if using scheduler
        if scheduler is not None:
            scheduler.step(epoch_test_loss)

    # Log best accuracy as summary metric
    wandb.run.summary['best_accuracy'] = best_accuracy

    # Finish the wandb run
    wandb.finish()

    return best_accuracy

In [13]:
# Define hyperparameter configurations to try
hyperparameter_configs = [
    {'architecture': 'BasicConvNet', 'learning_rate': 0.01, 'batch_size': 64, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.9},
    {'architecture': 'BasicConvNet', 'learning_rate': 0.001, 'batch_size': 128, 'epochs': 10, 'optimizer': 'adam'},
    {'architecture': 'ResNet18', 'learning_rate': 0.01, 'batch_size': 64, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.9},
    {'architecture': 'ResNet18', 'learning_rate': 0.001, 'batch_size': 64, 'epochs': 10, 'optimizer': 'adam'},
    {'architecture': 'MLP', 'learning_rate': 0.01, 'batch_size': 64, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.9, 'hidden_layers': 5, 'hidden_size': 1024},
    {'architecture': 'MLP', 'learning_rate': 0.001, 'batch_size': 128, 'epochs': 10, 'optimizer': 'adam', 'hidden_layers': 3, 'hidden_size': 2048},
    {'architecture': 'BasicConvNet', 'learning_rate': 0.005, 'batch_size': 32, 'epochs': 10, 'optimizer': 'rmsprop'},
    {'architecture': 'ResNet18', 'learning_rate': 0.0005, 'batch_size': 32, 'epochs': 10, 'optimizer': 'adam', 'weight_decay': 1e-4},
    {'architecture': 'BasicConvNet', 'learning_rate': 0.01, 'batch_size': 64, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.9, 'weight_decay': 1e-4, 'use_scheduler': True},
    {'architecture': 'ResNet18', 'learning_rate': 0.01, 'batch_size': 128, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.95, 'weight_decay': 5e-4, 'use_scheduler': True}
]
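Because each run takes a while, it can help to check the configurations before starting the sweep, so a misspelled architecture or optimizer name fails immediately rather than partway through. A minimal sketch, assuming the three architectures and three optimizers handled by `run`; `validate_config` is a hypothetical helper.

```python
KNOWN_ARCHITECTURES = {"BasicConvNet", "ResNet18", "MLP"}
KNOWN_OPTIMIZERS = {"sgd", "adam", "rmsprop"}


def validate_config(config):
    """Raise ValueError if a configuration names an unsupported option."""
    architecture = config.get("architecture", "BasicConvNet")
    optimizer = config.get("optimizer", "sgd").lower()
    if architecture not in KNOWN_ARCHITECTURES:
        raise ValueError(f"Unknown architecture: {architecture}")
    if optimizer not in KNOWN_OPTIMIZERS:
        raise ValueError(f"Unknown optimizer: {optimizer}")
    if config.get("learning_rate", 0.01) <= 0:
        raise ValueError("learning_rate must be positive")


# Example with one valid and one misspelled configuration.
example_configs = [
    {"architecture": "ResNet18", "optimizer": "adam", "learning_rate": 0.001},
    {"architecture": "ResNet-18", "optimizer": "adam", "learning_rate": 0.001},
]
for cfg in example_configs:
    try:
        validate_config(cfg)
        print("ok:", cfg)
    except ValueError as err:
        print("rejected:", err)
```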


In [14]:
# Run each configuration and track results
results = []
for i, config in enumerate(hyperparameter_configs):
    print(f"Running configuration {i+1}/{len(hyperparameter_configs)}: {config}")
    accuracy = run(config)
    results.append((config, accuracy))
    print(f"Configuration {i+1} finished with best accuracy: {accuracy:.2f}%")
    print("-" * 50)

# Sort and display results
results.sort(key=lambda x: x[1], reverse=True)
print("\nBest hyperparameter configurations:")
for i, (config, accuracy) in enumerate(results[:3]):
    print(f"{i+1}. Accuracy: {accuracy:.2f}%, Config: {config}")


Running configuration 1/10: {'architecture': 'BasicConvNet', 'learning_rate': 0.01, 'batch_size': 64, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.9}




[1,   200] loss: 2.21749 accuracy: 16.75781
[1,   400] loss: 1.84322 accuracy: 31.19531
[1,   600] loss: 1.66043 accuracy: 38.53906
Epoch 1 Test Accuracy: 45.09%
[2,   200] loss: 1.44783 accuracy: 47.18750
[2,   400] loss: 1.38988 accuracy: 48.96875
[2,   600] loss: 1.34219 accuracy: 51.95312
Epoch 2 Test Accuracy: 52.46%
[3,   200] loss: 1.26201 accuracy: 54.89062
[3,   400] loss: 1.24554 accuracy: 55.42188
[3,   600] loss: 1.21554 accuracy: 56.42969
Epoch 3 Test Accuracy: 57.78%
[4,   200] loss: 1.14053 accuracy: 59.58594
[4,   400] loss: 1.12471 accuracy: 59.62500
[4,   600] loss: 1.13176 accuracy: 59.89844
Epoch 4 Test Accuracy: 60.25%
[5,   200] loss: 1.05287 accuracy: 62.19531
[5,   400] loss: 1.05480 accuracy: 62.90625
[5,   600] loss: 1.05615 accuracy: 62.71875
Epoch 5 Test Accuracy: 59.77%
[6,   200] loss: 0.98103 accuracy: 65.53125
[6,   400] loss: 0.98685 accuracy: 64.64844
[6,   600] loss: 1.01185 accuracy: 63.85156
Epoch 6 Test Accuracy: 61.64%
[7,   200] loss: 0.92874 acc

Run history:
epoch,▁▂▃▃▄▅▆▆▇█
test_accuracy,▁▄▆▇▇▇▇███
test_loss,█▅▃▂▂▁▁▂▁▁
train_accuracy,▁▂▂▃▄▅▂▅▄▄▅▃▄▆▇▅▇▇▇▆▇▆▅▇▇▇▇▆▇▇█▇▆▆█▇▆▇▇▆
train_loss,█▇▄▅▄▄▄▄▂▄▃▃▃▂▃▃▃▃▃▃▂▁▃▃▃▂▂▂▂▁▁▁▁▁▂▂▃▁▃▂

Run summary:
best_accuracy,63.13
epoch,10.0
test_accuracy,63.13
test_loss,1.08336
train_accuracy,62.5
train_loss,0.96215


Configuration 1 finished with best accuracy: 63.13%
--------------------------------------------------
Running configuration 2/10: {'architecture': 'BasicConvNet', 'learning_rate': 0.001, 'batch_size': 128, 'epochs': 10, 'optimizer': 'adam'}


[1,   200] loss: 1.86723 accuracy: 31.78125
Epoch 1 Test Accuracy: 45.37%
[2,   200] loss: 1.50027 accuracy: 45.61328
Epoch 2 Test Accuracy: 48.61%
[3,   200] loss: 1.37613 accuracy: 50.08203
Epoch 3 Test Accuracy: 51.96%
[4,   200] loss: 1.28771 accuracy: 53.30859
Epoch 4 Test Accuracy: 53.83%
[5,   200] loss: 1.22682 accuracy: 55.85156
Epoch 5 Test Accuracy: 54.65%
[6,   200] loss: 1.18192 accuracy: 57.79688
Epoch 6 Test Accuracy: 56.65%
[7,   200] loss: 1.12857 accuracy: 59.76953
Epoch 7 Test Accuracy: 57.73%
[8,   200] loss: 1.08711 accuracy: 60.89844
Epoch 8 Test Accuracy: 58.74%
[9,   200] loss: 1.06589 accuracy: 62.02734
Epoch 9 Test Accuracy: 58.95%
[10,   200] loss: 1.02536 accuracy: 63.56641
Epoch 10 Test Accuracy: 59.67%


Run history:
epoch,▁▂▃▃▄▅▆▆▇█
test_accuracy,▁▃▄▅▆▇▇███
test_loss,█▆▅▃▃▂▂▁▁▁
train_accuracy,▁▃▃▂▃▅▄▅▅▅▆▆▅▅▆▅▅▅▆▆▆▆▇▇▆▇▇▇█▆▇▆▇███▇▇▇█
train_loss,█▆▆▇▅▄▅▄▅▄▄▆▅▃▄▃▅▃▃▄▃▂▃▄▃▂▄▂▃▁▂▂▁▁▁▁▂▂▃▁

Run summary:
best_accuracy,59.67
epoch,10.0
test_accuracy,59.67
test_loss,1.14061
train_accuracy,71.875
train_loss,0.97077


Configuration 2 finished with best accuracy: 59.67%
--------------------------------------------------
Running configuration 3/10: {'architecture': 'ResNet18', 'learning_rate': 0.01, 'batch_size': 64, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.9}


[1,   200] loss: 1.82471 accuracy: 32.15625
[1,   400] loss: 1.48193 accuracy: 45.82812
[1,   600] loss: 1.26819 accuracy: 54.50000
Epoch 1 Test Accuracy: 59.37%
[2,   200] loss: 0.97097 accuracy: 65.03906
[2,   400] loss: 0.89907 accuracy: 67.82812
[2,   600] loss: 0.85237 accuracy: 70.26562
Epoch 2 Test Accuracy: 59.65%
[3,   200] loss: 0.64145 accuracy: 77.36719
[3,   400] loss: 0.63822 accuracy: 78.00000
[3,   600] loss: 0.62007 accuracy: 78.58594
Epoch 3 Test Accuracy: 72.66%
[4,   200] loss: 0.44498 accuracy: 84.14844
[4,   400] loss: 0.46976 accuracy: 83.60156
[4,   600] loss: 0.47346 accuracy: 83.28125
Epoch 4 Test Accuracy: 74.83%
[5,   200] loss: 0.29772 accuracy: 89.51562
[5,   400] loss: 0.33950 accuracy: 88.36719
[5,   600] loss: 0.36887 accuracy: 86.98438
Epoch 5 Test Accuracy: 76.18%
[6,   200] loss: 0.19345 accuracy: 93.48438
[6,   400] loss: 0.23554 accuracy: 91.74219
[6,   600] loss: 0.25330 accuracy: 91.27344
Epoch 6 Test Accuracy: 78.01%
[7,   200] loss: 0.14223 acc

Run history:
epoch,▁▂▃▃▄▅▆▆▇█
test_accuracy,▁▁▆▆▇▇▇▇▇█
test_loss,▆█▂▁▁▂▃▄▄▂
train_accuracy,▁▃▃▂▃▃▅▄▅▅▅▆▆▅▆▇▆▆▆▇▇█▇▆██▇████▇▇██▇████
train_loss,█▆▅▄▅▄▃▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▁▂▁▁▁▂▁▂▁▁▁▁▁▁

Run summary:
best_accuracy,79.89
epoch,10.0
test_accuracy,79.89
test_loss,0.82744
train_accuracy,96.875
train_loss,0.09114


Configuration 3 finished with best accuracy: 79.89%
--------------------------------------------------
Running configuration 4/10: {'architecture': 'ResNet18', 'learning_rate': 0.001, 'batch_size': 64, 'epochs': 10, 'optimizer': 'adam'}


[1,   200] loss: 1.79470 accuracy: 33.51562
[1,   400] loss: 1.44093 accuracy: 46.95312
[1,   600] loss: 1.18810 accuracy: 57.61719
Epoch 1 Test Accuracy: 57.56%
[2,   200] loss: 0.93859 accuracy: 66.79688
[2,   400] loss: 0.85998 accuracy: 70.00000
[2,   600] loss: 0.79733 accuracy: 71.78906
Epoch 2 Test Accuracy: 69.75%
[3,   200] loss: 0.67189 accuracy: 76.57031
[3,   400] loss: 0.62917 accuracy: 77.87500
[3,   600] loss: 0.61844 accuracy: 78.39844
Epoch 3 Test Accuracy: 72.50%
[4,   200] loss: 0.50138 accuracy: 82.53906
[4,   400] loss: 0.51063 accuracy: 81.77344
[4,   600] loss: 0.50694 accuracy: 82.68750
Epoch 4 Test Accuracy: 79.84%
[5,   200] loss: 0.39457 accuracy: 86.43750
[5,   400] loss: 0.40673 accuracy: 85.78125
[5,   600] loss: 0.40142 accuracy: 86.08594
Epoch 5 Test Accuracy: 81.24%
[6,   200] loss: 0.29900 accuracy: 89.53125
[6,   400] loss: 0.32850 accuracy: 88.73438
[6,   600] loss: 0.32730 accuracy: 88.26562
Epoch 6 Test Accuracy: 81.11%
[7,   200] loss: 0.23415 acc

Run history:
epoch,▁▂▃▃▄▅▆▆▇█
test_accuracy,▁▄▅▇▇▇████
test_loss,█▄▄▂▁▁▁▂▂▃
train_accuracy,▁▃▄▃▅▄▄▅▅▆▆▇▅▆▅▇▇▆▆▇▆▇▇▇▆▇▇▇▇▇█▇▇▇███▇██
train_loss,███▆▇▅▅▅▅▅▃▃▃▂▄▃▃▃▂▃▃▂▂▂▂▂▃▃▂▃▂▁▂▂▁▂▁▁▂▁

Run summary:
best_accuracy,83.55
epoch,10.0
test_accuracy,82.3
test_loss,0.70746
train_accuracy,93.75
train_loss,0.139


Configuration 4 finished with best accuracy: 83.55%
--------------------------------------------------
Running configuration 5/10: {'architecture': 'MLP', 'learning_rate': 0.01, 'batch_size': 64, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.9, 'hidden_layers': 5, 'hidden_size': 1024}


[1,   200] loss: 2.29987 accuracy: 11.21875
[1,   400] loss: 2.15575 accuracy: 19.35156
[1,   600] loss: 1.92862 accuracy: 27.35156
Epoch 1 Test Accuracy: 37.40%
[2,   200] loss: 1.71724 accuracy: 37.89062
[2,   400] loss: 1.61306 accuracy: 41.41406
[2,   600] loss: 1.57668 accuracy: 42.94531
Epoch 2 Test Accuracy: 47.07%
[3,   200] loss: 1.48302 accuracy: 46.72656
[3,   400] loss: 1.44888 accuracy: 47.73438
[3,   600] loss: 1.42025 accuracy: 48.86719
Epoch 3 Test Accuracy: 48.36%
[4,   200] loss: 1.35131 accuracy: 51.46094
[4,   400] loss: 1.32541 accuracy: 52.53125
[4,   600] loss: 1.32012 accuracy: 52.89062
Epoch 4 Test Accuracy: 52.77%
[5,   200] loss: 1.21967 accuracy: 56.38281
[5,   400] loss: 1.24061 accuracy: 55.48438
[5,   600] loss: 1.24626 accuracy: 55.63281
Epoch 5 Test Accuracy: 52.27%
[6,   200] loss: 1.13471 accuracy: 59.24219
[6,   400] loss: 1.15651 accuracy: 58.71875
[6,   600] loss: 1.17499 accuracy: 57.86719
Epoch 6 Test Accuracy: 54.27%
[7,   200] loss: 1.07319 acc

Run history:
epoch,▁▂▃▃▄▅▆▆▇█
test_accuracy,▁▅▅▇▇▇█▇▇█
test_loss,█▄▃▂▂▁▁▁▂▁
train_accuracy,▁▁▂▂▂▅▄▅▅▅▄▄▄▅▅▄▅▅▆▆▅▇▆▇▆▆▆▆▆▇▆▇█▆▇▆▇▆▇▆
train_loss,█▆▇▄▄▆▄▅▃▄▄▄▃▄▃▄▅▃▃▃▃▃▃▃▃▃▃▄▃▃▁▃▂▂▂▁▂▁▃▂

Run summary:
best_accuracy,56.0
epoch,10.0
test_accuracy,56.0
test_loss,1.28959
train_accuracy,70.3125
train_loss,0.82167


Configuration 5 finished with best accuracy: 56.00%
--------------------------------------------------
Running configuration 6/10: {'architecture': 'MLP', 'learning_rate': 0.001, 'batch_size': 128, 'epochs': 10, 'optimizer': 'adam', 'hidden_layers': 3, 'hidden_size': 2048}


[1,   200] loss: 1.86276 accuracy: 32.82422
Epoch 1 Test Accuracy: 38.96%
[2,   200] loss: 1.61963 accuracy: 42.81641
Epoch 2 Test Accuracy: 45.14%
[3,   200] loss: 1.50971 accuracy: 46.83203
Epoch 3 Test Accuracy: 46.69%
[4,   200] loss: 1.42994 accuracy: 49.31250
Epoch 4 Test Accuracy: 46.91%
[5,   200] loss: 1.38447 accuracy: 51.17969
Epoch 5 Test Accuracy: 48.57%
[6,   200] loss: 1.31566 accuracy: 53.50391
Epoch 6 Test Accuracy: 50.23%
[7,   200] loss: 1.27230 accuracy: 55.18359
Epoch 7 Test Accuracy: 50.06%
[8,   200] loss: 1.21784 accuracy: 57.07812
Epoch 8 Test Accuracy: 52.10%
[9,   200] loss: 1.17424 accuracy: 58.42969
Epoch 9 Test Accuracy: 52.17%
[10,   200] loss: 1.12082 accuracy: 60.16016
Epoch 10 Test Accuracy: 52.03%


Run history:
epoch,▁▂▃▃▄▅▆▆▇█
test_accuracy,▁▄▅▅▆▇▇███
test_loss,█▅▄▃▃▂▂▁▂▁
train_accuracy,▁▁▂▃▅▄▄▃▅▁▃▄▅▄▄▄▅▅▆▄▇▆▇▄▃▇▅▅▅▇▅▆█▇█▆▅▅▅▅
train_loss,█▇▇▇█▆▅▅▇▇█▇▅▆▄▅▃▅▄▅▄▄▄▄▄▂▄▂▃▃▂▃▃▃▃▂▂▃▁▃

Run summary:
best_accuracy,52.17
epoch,10.0
test_accuracy,52.03
test_loss,1.39151
train_accuracy,55.46875
train_loss,1.23156


Configuration 6 finished with best accuracy: 52.17%
--------------------------------------------------
Running configuration 7/10: {'architecture': 'BasicConvNet', 'learning_rate': 0.005, 'batch_size': 32, 'epochs': 10, 'optimizer': 'rmsprop'}


[1,   200] loss: 2.19267 accuracy: 20.09375
[1,   400] loss: 1.95653 accuracy: 27.12500
[1,   600] loss: 1.84637 accuracy: 31.96875
[1,   800] loss: 1.79304 accuracy: 33.85938
[1,  1000] loss: 1.74992 accuracy: 36.48438
[1,  1200] loss: 1.70046 accuracy: 37.07812
[1,  1400] loss: 1.68046 accuracy: 39.39062
Epoch 1 Test Accuracy: 39.25%
[2,   200] loss: 1.59426 accuracy: 42.37500
[2,   400] loss: 1.60086 accuracy: 41.23438
[2,   600] loss: 1.56491 accuracy: 43.39062
[2,   800] loss: 1.55803 accuracy: 43.76562
[2,  1000] loss: 1.57938 accuracy: 42.78125
[2,  1200] loss: 1.49804 accuracy: 45.04688
[2,  1400] loss: 1.51343 accuracy: 45.00000
Epoch 2 Test Accuracy: 44.72%
[3,   200] loss: 1.49477 accuracy: 46.59375
[3,   400] loss: 1.43094 accuracy: 47.82812
[3,   600] loss: 1.46887 accuracy: 47.78125
[3,   800] loss: 1.44597 accuracy: 48.43750
[3,  1000] loss: 1.45069 accuracy: 48.65625
[3,  1200] loss: 1.43753 accuracy: 48.43750
[3,  1400] loss: 1.42855 accuracy: 48.68750
Epoch 3 Test Acc

Run history:
epoch,▁▂▃▃▄▅▆▆▇█
test_accuracy,▁▃▄▆▅▆▇▇█▇
test_loss,█▅▅▃▅▃▂▂▁▂
train_accuracy,▁▂▂▂▇▆▅▅▄▅▆█▆▄▄▃▅▅▇▅▇▆▅▇▆▅▇▇█▇▅▅▇▄▆█▆▇▆▆
train_loss,▇▆▆▆▆▅▄▅▄▇▇█▃▄▄▆▃▃▄▄▂▅▁▅▃▅▃▄▃▃▂▅▃▆▃▅▅▃▃▃

Run summary:
best_accuracy,56.04
epoch,10.0
test_accuracy,54.23
test_loss,1.33972
train_accuracy,56.25
train_loss,1.39995


Configuration 7 finished with best accuracy: 56.04%
--------------------------------------------------
Running configuration 8/10: {'architecture': 'ResNet18', 'learning_rate': 0.0005, 'batch_size': 32, 'epochs': 10, 'optimizer': 'adam', 'weight_decay': 0.0001}


[1,   200] loss: 1.83146 accuracy: 33.09375
[1,   400] loss: 1.55195 accuracy: 43.10938
[1,   600] loss: 1.36633 accuracy: 49.76562
[1,   800] loss: 1.24634 accuracy: 55.14062
[1,  1000] loss: 1.13737 accuracy: 59.01562
[1,  1200] loss: 1.06803 accuracy: 61.98438
[1,  1400] loss: 1.02463 accuracy: 63.48438
Epoch 1 Test Accuracy: 63.42%
[2,   200] loss: 0.89126 accuracy: 68.42188
[2,   400] loss: 0.87598 accuracy: 69.06250
[2,   600] loss: 0.82726 accuracy: 70.95312
[2,   800] loss: 0.81739 accuracy: 71.06250
[2,  1000] loss: 0.77317 accuracy: 73.10938
[2,  1200] loss: 0.78414 accuracy: 73.17188
[2,  1400] loss: 0.73159 accuracy: 74.90625
Epoch 2 Test Accuracy: 66.03%
[3,   200] loss: 0.66513 accuracy: 77.17188
[3,   400] loss: 0.65289 accuracy: 77.79688
[3,   600] loss: 0.67286 accuracy: 76.73438
[3,   800] loss: 0.65271 accuracy: 77.57812
[3,  1000] loss: 0.64802 accuracy: 77.53125
[3,  1200] loss: 0.63405 accuracy: 78.56250
[3,  1400] loss: 0.61991 accuracy: 78.43750
Epoch 3 Test Acc

Run history:
epoch,▁▂▃▃▄▅▆▆▇█
test_accuracy,▁▂▅▅▆▅█▇██
test_loss,█▇▄▄▃▅▁▂▁▂
train_accuracy,▁▄▃▅▆▆▅▄▆▆▅▆▇▆▇▇▇▆▇▇▆▇▇▇▇▇▇▆▇▇▇█▇▇▇█▇▇▇█
train_loss,█▇▇▄▅▄▄▃▄▃▂▃▃▃▂▂▂▂▁▁▃▃▂▄▁▃▂▁▂▂▂▂▁▂▂▂▂▁▂▂

Run summary:
best_accuracy,84.21
epoch,10.0
test_accuracy,83.51
test_loss,0.54669
train_accuracy,93.75
train_loss,0.12635


Configuration 8 finished with best accuracy: 84.21%
--------------------------------------------------
Running configuration 9/10: {'architecture': 'BasicConvNet', 'learning_rate': 0.01, 'batch_size': 64, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.9, 'weight_decay': 0.0001, 'use_scheduler': True}


[1,   200] loss: 2.23501 accuracy: 16.21875
[1,   400] loss: 1.93454 accuracy: 28.64844
[1,   600] loss: 1.69823 accuracy: 37.82031
Epoch 1 Test Accuracy: 43.44%
[2,   200] loss: 1.48333 accuracy: 46.27344
[2,   400] loss: 1.44140 accuracy: 47.62500
[2,   600] loss: 1.39416 accuracy: 49.75000
Epoch 2 Test Accuracy: 52.87%
[3,   200] loss: 1.28870 accuracy: 53.89062
[3,   400] loss: 1.28386 accuracy: 54.51562
[3,   600] loss: 1.24409 accuracy: 55.82812
Epoch 3 Test Accuracy: 55.52%
[4,   200] loss: 1.17692 accuracy: 58.29688
[4,   400] loss: 1.15958 accuracy: 58.57031
[4,   600] loss: 1.15068 accuracy: 60.09375
Epoch 4 Test Accuracy: 58.43%
[5,   200] loss: 1.10522 accuracy: 61.16406
[5,   400] loss: 1.07702 accuracy: 62.47656
[5,   600] loss: 1.09105 accuracy: 61.75000
Epoch 5 Test Accuracy: 59.16%
[6,   200] loss: 1.00946 accuracy: 64.77344
[6,   400] loss: 1.01364 accuracy: 64.58594
[6,   600] loss: 1.03573 accuracy: 63.81250
Epoch 6 Test Accuracy: 60.00%
[7,   200] loss: 0.96449 acc

Run history:
epoch,▁▂▃▃▄▅▆▆▇█
test_accuracy,▁▄▅▆▆▇████
test_loss,█▅▄▃▃▂▁▁▂▂
train_accuracy,▂▂▁▄▄▅▄▅▅▆▅▇▆▆▆▆▆▇▆▇▇█▆▅▆▇▆▇▆▇▇▇▇▇▆▇▇▇▇▆
train_loss,█▅▇▆▅▆▅▄▄▃▃▃▃▅▃▅▅▄▅▃▄▆▄▂▂▂▂▄▂▂▂▂▄▂▂▂▂▂▁▂

Run summary:
best_accuracy,63.58
epoch,10.0
test_accuracy,62.4
test_loss,1.11097
train_accuracy,62.5
train_loss,1.0093


Configuration 9 finished with best accuracy: 63.58%
--------------------------------------------------
Running configuration 10/10: {'architecture': 'ResNet18', 'learning_rate': 0.01, 'batch_size': 128, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.95, 'weight_decay': 0.0005, 'use_scheduler': True}


[1,   200] loss: 1.70425 accuracy: 37.07422
Epoch 1 Test Accuracy: 53.56%
[2,   200] loss: 1.00847 accuracy: 64.25000
Epoch 2 Test Accuracy: 64.81%
[3,   200] loss: 0.67692 accuracy: 76.42578
Epoch 3 Test Accuracy: 67.89%
[4,   200] loss: 0.49640 accuracy: 82.91016
Epoch 4 Test Accuracy: 67.43%
[5,   200] loss: 0.36238 accuracy: 87.39062
Epoch 5 Test Accuracy: 76.16%
[6,   200] loss: 0.23666 accuracy: 91.83203
Epoch 6 Test Accuracy: 77.89%
[7,   200] loss: 0.16029 accuracy: 94.57031
Epoch 7 Test Accuracy: 77.68%
[8,   200] loss: 0.12126 accuracy: 95.86328
Epoch 8 Test Accuracy: 74.34%
[9,   200] loss: 0.05339 accuracy: 98.52344
Epoch 9 Test Accuracy: 83.95%
[10,   200] loss: 0.00916 accuracy: 99.92969
Epoch 10 Test Accuracy: 84.95%


Run history:
epoch,▁▂▃▃▄▅▆▆▇█
test_accuracy,▁▄▄▄▆▆▆▆██
test_loss,█▅▅▆▃▃▃▅▁▁
train_accuracy,▁▂▂▂▄▆▆▆▆▆▇▆▆▆▆▆▇▇▆▇▇▇█▇▇█▇▇▇▇███▇██████
train_loss,█▇▇▆▆▄▅▄▅▄▃▃▃▃▃▃▂▂▃▂▃▂▂▂▂▂▂▂▁▁▁▂▂▁▁▁▁▁▁▁

Run summary:
best_accuracy,84.95
epoch,10.0
test_accuracy,84.95
test_loss,0.53108
train_accuracy,100.0
train_loss,0.0102


Configuration 10 finished with best accuracy: 84.95%
--------------------------------------------------

Best hyperparameter configurations:
1. Accuracy: 84.95%, Config: {'architecture': 'ResNet18', 'learning_rate': 0.01, 'batch_size': 128, 'epochs': 10, 'optimizer': 'sgd', 'momentum': 0.95, 'weight_decay': 0.0005, 'use_scheduler': True}
2. Accuracy: 84.21%, Config: {'architecture': 'ResNet18', 'learning_rate': 0.0005, 'batch_size': 32, 'epochs': 10, 'optimizer': 'adam', 'weight_decay': 0.0001}
3. Accuracy: 83.55%, Config: {'architecture': 'ResNet18', 'learning_rate': 0.001, 'batch_size': 64, 'epochs': 10, 'optimizer': 'adam'}
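To see how much of the ranking is explained by architecture alone, the best accuracies printed above can be grouped by architecture. The sketch below copies those ten numbers by hand from the log; in practice the `results` list would be used directly.

```python
# (architecture, best test accuracy in %) copied from the run log above.
logged_results = [
    ("BasicConvNet", 63.13), ("BasicConvNet", 59.67),
    ("ResNet18", 79.89), ("ResNet18", 83.55),
    ("MLP", 56.00), ("MLP", 52.17),
    ("BasicConvNet", 56.04), ("ResNet18", 84.21),
    ("BasicConvNet", 63.58), ("ResNet18", 84.95),
]

# Keep the highest accuracy seen for each architecture.
best_by_architecture = {}
for architecture, accuracy in logged_results:
    previous = best_by_architecture.get(architecture, float("-inf"))
    best_by_architecture[architecture] = max(previous, accuracy)

ranked = sorted(best_by_architecture.items(), key=lambda kv: kv[1], reverse=True)
for architecture, accuracy in ranked:
    print(f"{architecture}: {accuracy:.2f}%")
```

In this sweep every ResNet18 run scores higher than every BasicConvNet and MLP run, so architecture accounts for most of the spread.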


This software/tutorial is based on PyTorch, an open-source project available at https://github.com/pytorch/tutorials/

That project is released under the BSD 3-Clause License: https://github.com/pytorch/tutorials/blob/main/LICENSE