# Hyperparameter sweeps with Weights & Biases

In [1]:
import os
import sys
module_path = os.path.abspath(os.path.join('../../fmriDEEP'))
if module_path not in sys.path:
    sys.path.append(module_path)

In [2]:
import torch
import numpy as np
from _utils.train_fns import standard_train
from _core.networks.LinearNets import SimpleLinearModel
from torch.utils.data import DataLoader, Dataset
from torchvision.transforms import ToTensor
from torchvision.datasets import MNIST

# NEW IMPORTANT IMPORTS
from _utils.tools import compute_accuracy, convert_wandb_config, read_config
import wandb

# this variable contains information whether a GPU can be used for training. If not, we automatically use the CPU.
DEVICE = torch.device("cuda:0" if torch.cuda.is_available() else "cpu")

Setting all random seeds for reproducibility.

In [3]:
# set the random seed for reproducibility
def set_random_seed(seed):
    import random 
    
    torch.manual_seed(seed)
    random.seed(seed)
    np.random.seed(seed)
    g = torch.Generator() # can be used in pytorch dataloaders for reproducible sample selection when shuffle=True
    g.manual_seed(seed)
    
    return g

g = set_random_seed(42)

Let us load the MNIST data now so that we do not forget it before running the parameter sweep :).

In [4]:
# get the MNIST dataset
mnist_train = MNIST('./data/', train=True, download=not os.path.exists('./data/MNIST'), transform=ToTensor())
mnist_test = MNIST('./data/', train=False, download=False, transform=ToTensor())

# create the dataloaders
dl_train = DataLoader(mnist_train, batch_size=256, shuffle=True, generator=g)
dl_test = DataLoader(mnist_test, batch_size=256, shuffle=True, generator=g)

## What are hyperparameters?

A __hyperparameter__ is a variable or setting that controls the learning process. Hyperparameters are "known" before the learning process begins and are not changed during training. They should never be confused with the __model parameters__ (e.g., weights and biases). 

Typical hyperparameters are the following:
* the learning rate
* the number of hidden layers 
* the number of neurons per hidden layer
* the kernel size per convolutional layer
* the number of channels per convolutional layer
* the cost function
* the optimization algorithm
* the activation function
* the number of epochs we use for training
* the batch size
* even the ratio of train/validation splits
* etc.

I am sure you get the idea: anything that changes the way a network learns is considered a hyperparameter. Depending on how many hyperparameters we tune, the search space of possible settings grows multiplicatively and quickly becomes too large to explore by hand. We therefore need some help in determining which settings in our search space are optimal.
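To get a feeling for how quickly the search space grows, the following minimal sketch counts the configurations a full grid search would need to train. The candidate values are borrowed from the sweep configured later in this notebook; the helper name `count_grid_combinations` is made up for this illustration.

```python
# Hypothetical search space: a list of candidate values per hyperparameter.
search_space = {
    "lin_neurons1": [512, 256, 128, 64, 32, 16, 8],
    "lin_neurons2": [512, 256, 128, 64, 32, 16, 8],
    "lin_neurons3": [512, 256, 128, 64, 32, 16, 8],
    "epochs": [5, 10, 20, 30],
}


def count_grid_combinations(space):
    """Return the number of configurations a full grid search would train."""
    total = 1
    for values in space.values():
        total *= len(values)
    return total


n_combinations = count_grid_combinations(search_space)
print(n_combinations)  # 7 * 7 * 7 * 4 = 1372
```

Adding only ten candidate learning rates on top of this would already multiply the count by ten, which is why trying every combination by hand is not realistic.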

This is where the [Weights & Biases](https://www.wandb.ai) (wandb) package comes into play. 

## Using Weights & Biases (wandb)

What you will see in this Jupyter notebook is quite condensed, and in certain cases you may need additional information that we do not provide here yet. In that case, check out the official documentation of hyperparameter sweeps with wandb [here](https://docs.wandb.ai/guides/sweeps). 

How we use wandb (adapted from [docs](https://docs.wandb.ai/guides/sweeps)):
1. Write the config: define the variables and ranges to sweep over and choose the search strategy. Wandb offers a few options:
    * grid: run all possible combinations
    * random: randomly choose a user-supplied number __n__ of parameter combinations
    * Bayesian search
2. Initialize the search: wandb hosts a controller that coordinates the agent(s) executing the sweep. The agents can be local or distributed.
3. Launch agent(s): each agent asks the sweep server which hyperparameter combination to try next and then executes the run. If we wanted to use multiple computers, we could launch an agent on each of them with the same command.
4. Visualize the results: we can do this locally on our computers or on the wandb platform.
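To make the random strategy a bit more tangible, here is a toy sketch in plain Python. It does not use wandb at all: `sample_config`, `random_search`, and `toy_objective` are hypothetical helpers, the objective is a made-up score rather than a real training run, and uniform sampling between `min` and `max` is an assumption of this sketch.

```python
import random


def sample_config(space, rng):
    """Draw one configuration from the search space.

    Entries with "values" are sampled uniformly from the list,
    entries with "min"/"max" are sampled as a uniform float.
    """
    config = {}
    for name, spec in space.items():
        if "values" in spec:
            config[name] = rng.choice(spec["values"])
        else:
            config[name] = rng.uniform(spec["min"], spec["max"])
    return config


def toy_objective(config):
    """Stand-in for a training run: a made-up score, NOT a real accuracy."""
    return -abs(config["learning_rate"] - 0.005)


def random_search(space, objective, n_runs, seed=0):
    """Evaluate n_runs randomly drawn configurations; return (score, config) pairs."""
    rng = random.Random(seed)
    results = []
    for _ in range(n_runs):
        config = sample_config(space, rng)
        results.append((objective(config), config))
    return results


space = {
    "learning_rate": {"min": 0.0001, "max": 0.1},
    "epochs": {"values": [5, 10, 20, 30]},
}
results = random_search(space, toy_objective, n_runs=20, seed=42)
best_score, best_config = max(results, key=lambda item: item[0])
print(best_config)
```

In a real sweep the controller takes over the sampling and the agents take over the evaluation, so we only have to supply the search space and the training function.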

Let's get into it then.

First, we need to set ourselves a goal. Let's say we want to use a ```SimpleLinearModel``` to classify handwritten digits. We saw that the default settings of the ```SimpleLinearModel``` worked quite well, but we cannot be sure that those parameters yield the best results. We therefore decide to run a parameter search over the number of neurons per layer, the learning rate, and the number of epochs. 

There are two ways to define the sweep config:
1. Defining a Python ```dict```
2. Loading a .yaml file

The cells below show how both ways work.

### Setting up a sweep config with a python dict

In [5]:
sweep_config = {
    "name": "linear-mnist-sweep",
    "method": "random",
    "metric": {
        "name": "test_acc"
    },
    "parameters": {
        "lin_neurons1": {
            "values": [512, 256, 128, 64, 32, 16, 8] # the possible values for the first linear layer
        },
        "lin_neurons2": {
            "values": [512, 256, 128, 64, 32, 16, 8] # the possible values for the second linear layer
        },
        "lin_neurons3": {
            "values": [512, 256, 128, 64, 32, 16, 8] # the possible values for the third linear layer
        },
        "learning_rate": { # describes the range of possible values for the learning rate
            "min": .0001,
            "max": .1
        },
        "epochs": {
            "values": [5, 10, 20, 30] # the possible values for how many epochs to train the network
        }
    }
}

### Setting up a sweep config using a yaml file

In [6]:
sweep_config = read_config("mnist_sweep_config.yaml")
print(sweep_config)

{'name': 'linear-mnist-sweep', 'method': 'random', 'metric': {'name': 'test_acc'}, 'parameters': {'lin_neurons1': {'values': [512, 256, 128, 64, 32, 16, 8]}, 'lin_neurons2': {'values': [512, 256, 128, 64, 32, 16, 8]}, 'lin_neurons3': {'values': [512, 256, 128, 64, 32, 16, 8]}, 'learning_rate': {'min': 0.0001, 'max': 0.1}, 'epochs': {'values': [5, 10, 20, 30]}}}


You probably noticed that this ```dict``` (or config) is not in the same format as the ```dict``` (or config) we need to configure our neural networks. 
Unfortunately, [wandb](https://www.wandb.ai) does not yet support nested values in hyperparameter searches (at least not to my knowledge). But do not be alarmed: I took care of this issue for now by writing a converter function called ```convert_wandb_config```. You can find it in the ```_utils.tools``` module. 

You will see this function in action in the sections below.
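The idea behind such a conversion can be sketched as follows. This is not the actual implementation of ```convert_wandb_config``` (which also takes the model's required parameters as a second argument); `group_numbered_keys` is a hypothetical helper that only illustrates how flat, numbered keys could be collected into a list.

```python
import re
from collections import defaultdict


def group_numbered_keys(flat_config):
    """Collapse keys like 'lin_neurons1', 'lin_neurons2' into {'lin_neurons': [...]}.

    Keys without a trailing number are copied unchanged.
    """
    numbered = defaultdict(dict)
    nested = {}
    for key, value in flat_config.items():
        # A base name ending in a non-digit, followed by a trailing index.
        match = re.fullmatch(r"(.*\D)(\d+)", key)
        if match:
            base, index = match.group(1), int(match.group(2))
            numbered[base][index] = value
        else:
            nested[key] = value
    # Order the collected values by their numeric suffix.
    for base, indexed in numbered.items():
        nested[base] = [indexed[i] for i in sorted(indexed)]
    return nested


flat = {
    "epochs": 5,
    "lin_neurons1": 512,
    "lin_neurons2": 8,
    "lin_neurons3": 128,
    "learning_rate": 0.00791742,
}
converted = group_numbered_keys(flat)
print(converted["lin_neurons"])  # [512, 8, 128]
```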

## Setting up the sweep

We are almost there.

Wandb also requires you to set a function for your agents to call, at least in a Jupyter notebook like this one.

In the next code sections you will find two new functions: ```train_net()``` and ```run_train()```.

We will now look in detail at what each of them does.

The new ```train_net()``` function defined below now works for any network you supply to it. Additionally, there is something new in there: the ```wandb.log()``` function, which takes a dict with loss and accuracy scores as its input. This function is part of the wandb package; it logs the values we supply to it and creates plots from them in real time. 

In [7]:
def train_net(model, n_epochs, lr, logwandb=True):

    # loop for the above set number of epochs
    for epoch in range(0, n_epochs):

        # THIS IS WHERE THE MAGIC HAPPENS
        # calling the model.fit() function will execute the 'standard_train' function imported above.
        train_loss, train_stats = model.fit(dl_train, lr=lr, device=DEVICE)
        train_acc = compute_accuracy(train_stats[:, -1], train_stats[:, -2])

        # for validating or testing set the network into evaluation mode such that layers like dropout are not active
        with torch.no_grad():
            test_loss, test_stats = model.fit(dl_test, device=DEVICE, train=False)
            test_acc = compute_accuracy(test_stats[:, -1], test_stats[:, -2])

        print('epoch=%03d, train_loss=%1.3f, train_acc=%1.3f, test_loss=%1.3f, test_acc=%1.3f' % 
             (epoch, train_loss, train_acc, test_loss, test_acc))

        # LOG PARAMETERS WITH WANDB
        # Please keep in mind that the code below might be better placed somewhere else
        # in case you want to use this function without weights and biases or use the
        # logwandb flag like here
        if logwandb:
            wandb.log({
                "train_loss": train_loss,
                "train_acc": train_acc,
                "test_loss": test_loss,
                "test_acc": test_acc,
            })

The ```run_train()``` function defined below might seem a bit redundant. Even though its code could be folded into the ```train_net()``` function implemented above, it is best to keep separate pieces of functionality as separate as possible. The way I programmed it allows me to reuse the ```train_net()``` function in many different settings, whereas the ```run_train()``` function is currently specific to the ```SimpleLinearModel``` class. It is also the function I supply to the sweep agents.

In [8]:
# define the training function with the wandb init
def run_train():
    
    # here we initialize weights&biases. 
    with wandb.init() as run:
        
        #Within this context we have access to the parameters the agent chose.
        #It would look something like this:
        #wandb.config.epochs = 5
        #wandb.config.lin_neurons1 = 512
        #wandb.config.lin_neurons2 = 8
        #wandb.config.lin_neurons3 = 128
        #wandb.config.learning_rate = 0.00791742
        
        # here's the promised conversion of the wandb.config
        # this results into a dict that contains key-value pairs that we can use to configure our network:
        # converted_config['lin_neurons'] = [512, 8, 128]
        converted_config = convert_wandb_config(wandb.config, SimpleLinearModel._REQUIRED_PARAMS)
        
        model = SimpleLinearModel(784, 10, converted_config)
        
        # We do not necessarily need this line but it is nice to update the config.
        wandb.config.update(model.config, allow_val_change=True)
        
        # now train the network, yay!
        train_net(model, wandb.config.epochs, wandb.config.learning_rate)


It is now time to create the sweep and thus the central controller:

In [9]:
%%script echo Already ran. Skipping to save time.
# set the wandb sweep config
#os.environ['WANDB_MODE'] = 'offline'
os.environ['WANDB_ENTITY'] = "philis893" # this is my wandb account name. This can also be a group name, for example
os.environ['WANDB_PROJECT'] = "test-jupytersweep" # this is simply the project name where we want to store the sweep logs and plots
sweep_id = wandb.sweep(sweep_config)

Create sweep with ID: ybwt9udl
Sweep URL: https://wandb.ai/philis893/test-jupytersweep/sweeps/ybwt9udl


In [10]:
%%script echo Already ran. Checkout: https://wandb.ai/philis893/test-jupytersweep/sweeps/ybwt9udl?workspace=user-philis893
count = 20
wandb.agent(sweep_id, function=run_train, count=count)

[34m[1mwandb[0m: Agent Starting Run: 1peget1q with config:
[34m[1mwandb[0m: 	epochs: 5
[34m[1mwandb[0m: 	learning_rate: 0.07651874690502439
[34m[1mwandb[0m: 	lin_neurons1: 512
[34m[1mwandb[0m: 	lin_neurons2: 32
[34m[1mwandb[0m: 	lin_neurons3: 256
[34m[1mwandb[0m: Currently logged in as: [33mphilis893[0m. Use [1m`wandb login --relogin`[0m to force relogin


epoch=000, train_loss=2.984, train_acc=0.108, test_loss=2.303, test_acc=0.114
epoch=001, train_loss=2.305, train_acc=0.106, test_loss=2.304, test_acc=0.114
epoch=002, train_loss=2.304, train_acc=0.106, test_loss=2.303, test_acc=0.114
epoch=003, train_loss=2.305, train_acc=0.106, test_loss=2.304, test_acc=0.103
epoch=004, train_loss=2.305, train_acc=0.107, test_loss=2.306, test_acc=0.114


| Metric | History |
|---|---|
| test_acc | ███▁█ |
| test_loss | ▁▄▂▄█ |
| train_acc | █▄▁▁▄ |
| train_loss | █▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.1135 |
| test_loss | 2.30609 |
| train_acc | 0.10667 |
| train_loss | 2.30481 |


wandb: Agent Starting Run: u08dkmha with config:
wandb: 	epochs: 20
wandb: 	learning_rate: 0.08129885815148435
wandb: 	lin_neurons1: 16
wandb: 	lin_neurons2: 64
wandb: 	lin_neurons3: 256


epoch=000, train_loss=2.371, train_acc=0.107, test_loss=2.307, test_acc=0.114
epoch=001, train_loss=2.305, train_acc=0.107, test_loss=2.310, test_acc=0.097
epoch=002, train_loss=2.305, train_acc=0.104, test_loss=2.304, test_acc=0.096
epoch=003, train_loss=2.305, train_acc=0.106, test_loss=2.302, test_acc=0.098
epoch=004, train_loss=2.305, train_acc=0.105, test_loss=2.309, test_acc=0.114
epoch=005, train_loss=2.305, train_acc=0.105, test_loss=2.303, test_acc=0.101
epoch=006, train_loss=2.305, train_acc=0.106, test_loss=2.304, test_acc=0.098
epoch=007, train_loss=2.305, train_acc=0.107, test_loss=2.305, test_acc=0.101
epoch=008, train_loss=2.305, train_acc=0.106, test_loss=2.305, test_acc=0.103
epoch=009, train_loss=2.305, train_acc=0.107, test_loss=2.303, test_acc=0.114
epoch=010, train_loss=2.305, train_acc=0.106, test_loss=2.304, test_acc=0.114
epoch=011, train_loss=2.305, train_acc=0.107, test_loss=2.306, test_acc=0.101
epoch=012, train_loss=2.304, train_acc=0.107, test_loss=2.304, t

| Metric | History |
|---|---|
| test_acc | █▂▁▂█▃▂▃▄██▃███▄██▃█ |
| test_loss | ▅█▃▁▇▂▃▃▄▂▃▅▃▃▂▁▂▃▄▃ |
| train_acc | ██▁▆▄▃▅▇▆▇▅▆█▇▇▃▂▅▇▄ |
| train_loss | █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.1135 |
| test_loss | 2.3039 |
| train_acc | 0.10557 |
| train_loss | 2.30468 |


wandb: Agent Starting Run: l7haux0t with config:
wandb: 	epochs: 10
wandb: 	learning_rate: 0.0681960703387337
wandb: 	lin_neurons1: 32
wandb: 	lin_neurons2: 8
wandb: 	lin_neurons3: 8


epoch=000, train_loss=1.912, train_acc=0.248, test_loss=1.583, test_acc=0.376
epoch=001, train_loss=1.828, train_acc=0.278, test_loss=1.693, test_acc=0.274
epoch=002, train_loss=1.815, train_acc=0.286, test_loss=1.446, test_acc=0.477
epoch=003, train_loss=1.758, train_acc=0.308, test_loss=1.471, test_acc=0.411
epoch=004, train_loss=1.760, train_acc=0.307, test_loss=1.373, test_acc=0.428
epoch=005, train_loss=1.778, train_acc=0.301, test_loss=1.332, test_acc=0.470
epoch=006, train_loss=1.751, train_acc=0.310, test_loss=1.308, test_acc=0.471
epoch=007, train_loss=1.758, train_acc=0.308, test_loss=1.298, test_acc=0.504
epoch=008, train_loss=1.744, train_acc=0.318, test_loss=2.115, test_acc=0.292
epoch=009, train_loss=1.764, train_acc=0.310, test_loss=1.374, test_acc=0.475


| Metric | History |
|---|---|
| test_acc | ▄▁▇▅▆▇▇█▂▇ |
| test_loss | ▃▄▂▂▂▁▁▁█▂ |
| train_acc | ▁▄▅▇▇▆▇▇█▇ |
| train_loss | █▄▄▂▂▂▁▂▁▂ |

| Metric | Final value |
|---|---|
| test_acc | 0.4747 |
| test_loss | 1.37402 |
| train_acc | 0.30975 |
| train_loss | 1.76403 |


wandb: Agent Starting Run: vl3u75w3 with config:
wandb: 	epochs: 10
wandb: 	learning_rate: 0.02381620275967221
wandb: 	lin_neurons1: 256
wandb: 	lin_neurons2: 16
wandb: 	lin_neurons3: 512


epoch=000, train_loss=2.314, train_acc=0.107, test_loss=2.301, test_acc=0.101
epoch=001, train_loss=2.309, train_acc=0.108, test_loss=2.303, test_acc=0.114
epoch=002, train_loss=2.303, train_acc=0.111, test_loss=2.302, test_acc=0.103
epoch=003, train_loss=2.303, train_acc=0.109, test_loss=2.303, test_acc=0.114
epoch=004, train_loss=2.302, train_acc=0.110, test_loss=2.302, test_acc=0.103
epoch=005, train_loss=2.302, train_acc=0.111, test_loss=2.301, test_acc=0.114
epoch=006, train_loss=2.302, train_acc=0.110, test_loss=2.302, test_acc=0.114
epoch=007, train_loss=2.302, train_acc=0.109, test_loss=2.302, test_acc=0.114
epoch=008, train_loss=2.302, train_acc=0.110, test_loss=2.302, test_acc=0.114
epoch=009, train_loss=2.302, train_acc=0.109, test_loss=2.302, test_acc=0.114


| Metric | History |
|---|---|
| test_acc | ▁█▂█▂█████ |
| test_loss | ▃█▇█▄▁▄▄▄▅ |
| train_acc | ▁▁▆▄▅█▅▃▆▄ |
| train_loss | █▅▁▁▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.1135 |
| test_loss | 2.30181 |
| train_acc | 0.10907 |
| train_loss | 2.30223 |


wandb: Agent Starting Run: jm1pa0lk with config:
wandb: 	epochs: 30
wandb: 	learning_rate: 0.08381828115342409
wandb: 	lin_neurons1: 64
wandb: 	lin_neurons2: 32
wandb: 	lin_neurons3: 512


epoch=000, train_loss=2.504, train_acc=0.106, test_loss=2.303, test_acc=0.114
epoch=001, train_loss=2.308, train_acc=0.105, test_loss=2.303, test_acc=0.114
epoch=002, train_loss=2.305, train_acc=0.107, test_loss=2.306, test_acc=0.114
epoch=003, train_loss=2.305, train_acc=0.107, test_loss=2.302, test_acc=0.114
epoch=004, train_loss=2.305, train_acc=0.105, test_loss=2.307, test_acc=0.098
epoch=005, train_loss=2.305, train_acc=0.106, test_loss=2.304, test_acc=0.114
epoch=006, train_loss=2.305, train_acc=0.108, test_loss=2.305, test_acc=0.103
epoch=007, train_loss=2.306, train_acc=0.107, test_loss=2.305, test_acc=0.114
epoch=008, train_loss=2.305, train_acc=0.107, test_loss=2.303, test_acc=0.114
epoch=009, train_loss=2.305, train_acc=0.104, test_loss=2.305, test_acc=0.103
epoch=010, train_loss=2.305, train_acc=0.107, test_loss=2.305, test_acc=0.103
epoch=011, train_loss=2.305, train_acc=0.104, test_loss=2.306, test_acc=0.103
epoch=012, train_loss=2.305, train_acc=0.107, test_loss=2.303, t

| Metric | History |
|---|---|
| test_acc | ████▂█▄██▄▄▄▃▂███▄█▃▂▁█▃▂▃▂▄█▄ |
| test_loss | ▂▁▃▁▄▂▃▃▂▃▃▄▂▄▂▂▁▃▃▂▂▅▃█▃▄▃▁▃▃ |
| train_acc | ▅▂▅▆▂▄█▆▆▁▆▁▆▂▃▁▃▁▅▆▃▅▅▂▅▃▂▄▁▇ |
| train_loss | █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.1028 |
| test_loss | 2.3047 |
| train_acc | 0.1078 |
| train_loss | 2.30434 |


wandb: Agent Starting Run: o3vegdsa with config:
wandb: 	epochs: 20
wandb: 	learning_rate: 0.023067025878110575
wandb: 	lin_neurons1: 16
wandb: 	lin_neurons2: 16
wandb: 	lin_neurons3: 64


epoch=000, train_loss=1.520, train_acc=0.467, test_loss=1.049, test_acc=0.680
epoch=001, train_loss=1.388, train_acc=0.533, test_loss=1.011, test_acc=0.682
epoch=002, train_loss=1.374, train_acc=0.540, test_loss=0.973, test_acc=0.704
epoch=003, train_loss=1.351, train_acc=0.550, test_loss=0.999, test_acc=0.717
epoch=004, train_loss=1.352, train_acc=0.551, test_loss=0.869, test_acc=0.769
epoch=005, train_loss=1.330, train_acc=0.562, test_loss=0.863, test_acc=0.773
epoch=006, train_loss=1.331, train_acc=0.558, test_loss=0.877, test_acc=0.757
epoch=007, train_loss=1.321, train_acc=0.562, test_loss=0.826, test_acc=0.780
epoch=008, train_loss=1.344, train_acc=0.556, test_loss=0.927, test_acc=0.755
epoch=009, train_loss=1.340, train_acc=0.558, test_loss=0.792, test_acc=0.807
epoch=010, train_loss=1.338, train_acc=0.551, test_loss=0.824, test_acc=0.784
epoch=011, train_loss=1.328, train_acc=0.558, test_loss=0.852, test_acc=0.798
epoch=012, train_loss=1.336, train_acc=0.553, test_loss=0.824, t

| Metric | History |
|---|---|
| test_acc | ▁▁▂▃▅▅▅▆▅▇▆▆▇▇▇▇█▇▆▃ |
| test_loss | █▇▆▇▄▃▄▃▅▂▂▃▂▃▂▂▁▂▂▆ |
| train_acc | ▁▆▆▇▇█████▇█▇▇█████▇ |
| train_loss | █▃▃▂▂▁▁▁▂▂▂▁▂▂▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.7296 |
| test_loss | 0.95219 |
| train_acc | 0.55465 |
| train_loss | 1.32553 |


wandb: Agent Starting Run: bxa3pvz4 with config:
wandb: 	epochs: 20
wandb: 	learning_rate: 0.00869246994257245
wandb: 	lin_neurons1: 32
wandb: 	lin_neurons2: 16
wandb: 	lin_neurons3: 512


epoch=000, train_loss=1.148, train_acc=0.604, test_loss=0.605, test_acc=0.855
epoch=001, train_loss=0.933, train_acc=0.702, test_loss=0.517, test_acc=0.896
epoch=002, train_loss=0.894, train_acc=0.721, test_loss=0.554, test_acc=0.888
epoch=003, train_loss=0.860, train_acc=0.727, test_loss=0.496, test_acc=0.892
epoch=004, train_loss=0.842, train_acc=0.738, test_loss=0.548, test_acc=0.893
epoch=005, train_loss=0.834, train_acc=0.739, test_loss=0.523, test_acc=0.901
epoch=006, train_loss=0.837, train_acc=0.738, test_loss=0.587, test_acc=0.852
epoch=007, train_loss=0.834, train_acc=0.739, test_loss=0.549, test_acc=0.882
epoch=008, train_loss=0.825, train_acc=0.743, test_loss=0.469, test_acc=0.901
epoch=009, train_loss=0.823, train_acc=0.744, test_loss=0.535, test_acc=0.898
epoch=010, train_loss=0.816, train_acc=0.749, test_loss=0.546, test_acc=0.878
epoch=011, train_loss=0.817, train_acc=0.745, test_loss=0.526, test_acc=0.889
epoch=012, train_loss=0.816, train_acc=0.749, test_loss=0.492, t

| Metric | History |
|---|---|
| test_acc | ▁▇▆▇▇█▁▅██▅▆▇▆▇▇▇▇▆▆ |
| test_loss | █▄▆▃▅▄▇▅▁▅▅▄▃▆▃▂▁▂▄▂ |
| train_acc | ▁▆▆▇▇▇▇▇▇▇██████████ |
| train_loss | █▄▃▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.8904 |
| test_loss | 0.48999 |
| train_acc | 0.7552 |
| train_loss | 0.79474 |


wandb: Agent Starting Run: 6j2up1wc with config:
wandb: 	epochs: 30
wandb: 	learning_rate: 0.025088393451725127
wandb: 	lin_neurons1: 256
wandb: 	lin_neurons2: 64
wandb: 	lin_neurons3: 256


epoch=000, train_loss=1.006, train_acc=0.687, test_loss=0.446, test_acc=0.890
epoch=001, train_loss=0.846, train_acc=0.769, test_loss=0.480, test_acc=0.885
epoch=002, train_loss=0.866, train_acc=0.768, test_loss=0.456, test_acc=0.896
epoch=003, train_loss=0.884, train_acc=0.771, test_loss=0.465, test_acc=0.897
epoch=004, train_loss=0.912, train_acc=0.756, test_loss=0.505, test_acc=0.883
epoch=005, train_loss=0.911, train_acc=0.759, test_loss=0.507, test_acc=0.889
epoch=006, train_loss=0.951, train_acc=0.759, test_loss=0.486, test_acc=0.898
epoch=007, train_loss=0.972, train_acc=0.747, test_loss=0.552, test_acc=0.866
epoch=008, train_loss=1.028, train_acc=0.741, test_loss=0.520, test_acc=0.874
epoch=009, train_loss=1.034, train_acc=0.750, test_loss=0.518, test_acc=0.877
epoch=010, train_loss=1.025, train_acc=0.742, test_loss=0.560, test_acc=0.885
epoch=011, train_loss=1.022, train_acc=0.739, test_loss=0.569, test_acc=0.867
epoch=012, train_loss=1.147, train_acc=0.725, test_loss=0.593, t

| Metric | History |
|---|---|
| test_acc | ▇▇██▆▇█▅▅▆▇▅▅▇▆▆▆▁▅▆▄▇▂▁▅▄▅▃▂▁ |
| test_loss | ▁▂▁▂▃▃▂▄▃▃▄▄▅▄▃▃▃▆▄▄▅▃▆▇▅█▆▇▇▇ |
| train_acc | ▁███▇▇▇▆▆▆▆▅▄▄▄▅▃▄▅▅▄▄▄▄▂▃▃▄▃▃ |
| train_loss | ▄▁▁▂▂▂▃▃▄▄▄▄▆▆▅▄▆▆▅▄▆▅▇▇█▆▇██▆ |

| Metric | Final value |
|---|---|
| test_acc | 0.8304 |
| test_loss | 0.66409 |
| train_acc | 0.70628 |
| train_loss | 1.12664 |


wandb: Agent Starting Run: xgqoqdcm with config:
wandb: 	epochs: 30
wandb: 	learning_rate: 0.005256195272405696
wandb: 	lin_neurons1: 64
wandb: 	lin_neurons2: 512
wandb: 	lin_neurons3: 8


epoch=000, train_loss=0.675, train_acc=0.786, test_loss=0.240, test_acc=0.931
epoch=001, train_loss=0.402, train_acc=0.881, test_loss=0.197, test_acc=0.943
epoch=002, train_loss=0.350, train_acc=0.898, test_loss=0.186, test_acc=0.950
epoch=003, train_loss=0.326, train_acc=0.904, test_loss=0.190, test_acc=0.950
epoch=004, train_loss=0.318, train_acc=0.906, test_loss=0.161, test_acc=0.958
epoch=005, train_loss=0.305, train_acc=0.911, test_loss=0.184, test_acc=0.953
epoch=006, train_loss=0.296, train_acc=0.914, test_loss=0.161, test_acc=0.958
epoch=007, train_loss=0.294, train_acc=0.917, test_loss=0.164, test_acc=0.956
epoch=008, train_loss=0.280, train_acc=0.919, test_loss=0.172, test_acc=0.955
epoch=009, train_loss=0.278, train_acc=0.920, test_loss=0.160, test_acc=0.958
epoch=010, train_loss=0.270, train_acc=0.922, test_loss=0.154, test_acc=0.959
epoch=011, train_loss=0.265, train_acc=0.925, test_loss=0.150, test_acc=0.961
epoch=012, train_loss=0.266, train_acc=0.924, test_loss=0.168, t

| Metric | History |
|---|---|
| test_acc | ▁▃▅▅▇▆▇▆▆▇▇▇▇▇▇█▇▇▇▇▇▇▇▇▇▇████ |
| test_loss | █▅▄▄▂▄▂▂▃▂▂▁▃▁▂▂▃▂▃▃▃▂▂▂▁▂▁▁▁▂ |
| train_acc | ▁▆▆▇▇▇▇▇▇▇▇███████████████████ |
| train_loss | █▄▃▂▂▂▂▂▂▂▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.9634 |
| test_loss | 0.16053 |
| train_acc | 0.93348 |
| train_loss | 0.23925 |


wandb: Sweep Agent: Waiting for job.
wandb: Job received.
wandb: Agent Starting Run: zo8t64hk with config:
wandb: 	epochs: 5
wandb: 	learning_rate: 0.007916330440962894
wandb: 	lin_neurons1: 128
wandb: 	lin_neurons2: 256
wandb: 	lin_neurons3: 16


epoch=000, train_loss=0.526, train_acc=0.834, test_loss=0.191, test_acc=0.942
epoch=001, train_loss=0.331, train_acc=0.904, test_loss=0.163, test_acc=0.952
epoch=002, train_loss=0.296, train_acc=0.916, test_loss=0.149, test_acc=0.959
epoch=003, train_loss=0.277, train_acc=0.922, test_loss=0.139, test_acc=0.962
epoch=004, train_loss=0.263, train_acc=0.928, test_loss=0.136, test_acc=0.964


| Metric | History |
|---|---|
| test_acc | ▁▄▆▇█ |
| test_loss | █▄▃▁▁ |
| train_acc | ▁▆▇██ |
| train_loss | █▃▂▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.9639 |
| test_loss | 0.13556 |
| train_acc | 0.92817 |
| train_loss | 0.26264 |


wandb: Agent Starting Run: ahoak6xt with config:
wandb: 	epochs: 10
wandb: 	learning_rate: 0.006867094940682864
wandb: 	lin_neurons1: 64
wandb: 	lin_neurons2: 512
wandb: 	lin_neurons3: 256


epoch=000, train_loss=0.593, train_acc=0.808, test_loss=0.249, test_acc=0.930
epoch=001, train_loss=0.436, train_acc=0.869, test_loss=0.245, test_acc=0.926
epoch=002, train_loss=0.401, train_acc=0.883, test_loss=0.224, test_acc=0.941
epoch=003, train_loss=0.392, train_acc=0.888, test_loss=0.224, test_acc=0.947
epoch=004, train_loss=0.383, train_acc=0.892, test_loss=0.213, test_acc=0.945
epoch=005, train_loss=0.376, train_acc=0.896, test_loss=0.215, test_acc=0.945
epoch=006, train_loss=0.381, train_acc=0.893, test_loss=0.208, test_acc=0.948
epoch=007, train_loss=0.376, train_acc=0.896, test_loss=0.206, test_acc=0.948
epoch=008, train_loss=0.384, train_acc=0.897, test_loss=0.216, test_acc=0.943
epoch=009, train_loss=0.383, train_acc=0.897, test_loss=0.195, test_acc=0.946


| Metric | History |
|---|---|
| test_acc | ▂▁▆█▇▇██▆▇ |
| test_loss | ██▅▅▃▄▃▂▄▁ |
| train_acc | ▁▆▇▇██████ |
| train_loss | █▃▂▂▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.9456 |
| test_loss | 0.19507 |
| train_acc | 0.89738 |
| train_loss | 0.38341 |


wandb: Agent Starting Run: wre54fil with config:
wandb: 	epochs: 20
wandb: 	learning_rate: 0.05118006483301388
wandb: 	lin_neurons1: 128
wandb: 	lin_neurons2: 128
wandb: 	lin_neurons3: 64


epoch=000, train_loss=1.715, train_acc=0.367, test_loss=1.233, test_acc=0.528
epoch=001, train_loss=1.717, train_acc=0.362, test_loss=1.382, test_acc=0.488
epoch=002, train_loss=1.797, train_acc=0.331, test_loss=1.354, test_acc=0.480
epoch=003, train_loss=1.820, train_acc=0.334, test_loss=1.364, test_acc=0.470
epoch=004, train_loss=1.819, train_acc=0.334, test_loss=1.453, test_acc=0.436
epoch=005, train_loss=1.859, train_acc=0.325, test_loss=1.512, test_acc=0.435
epoch=006, train_loss=1.858, train_acc=0.334, test_loss=1.455, test_acc=0.522
epoch=007, train_loss=1.856, train_acc=0.340, test_loss=1.432, test_acc=0.402
epoch=008, train_loss=1.919, train_acc=0.321, test_loss=1.467, test_acc=0.452
epoch=009, train_loss=1.844, train_acc=0.328, test_loss=1.468, test_acc=0.435
epoch=010, train_loss=1.873, train_acc=0.315, test_loss=1.447, test_acc=0.440
epoch=011, train_loss=1.876, train_acc=0.312, test_loss=1.459, test_acc=0.457
epoch=012, train_loss=1.848, train_acc=0.317, test_loss=1.390, t

| Metric | History |
|---|---|
| test_acc | █▆▆▆▄▄█▃▅▄▄▅▅▇▅▄▃▁▄▄ |
| test_loss | ▁▄▃▃▅▆▅▅▅▅▅▅▄▄▅▅▇▇▆█ |
| train_acc | ██▅▅▅▄▅▆▄▅▄▃▄▃▃▁▁▁▂▂ |
| train_loss | ▁▁▃▃▃▄▄▄▅▃▄▄▄▇▅▅█▅▄▄ |

| Metric | Final value |
|---|---|
| test_acc | 0.4361 |
| test_loss | 1.6076 |
| train_acc | 0.29748 |
| train_loss | 1.86577 |


wandb: Sweep Agent: Waiting for job.
wandb: Job received.
wandb: Agent Starting Run: horg29d5 with config:
wandb: 	epochs: 20
wandb: 	learning_rate: 0.05722151642949417
wandb: 	lin_neurons1: 32
wandb: 	lin_neurons2: 256
wandb: 	lin_neurons3: 512


epoch=000, train_loss=2.616, train_acc=0.110, test_loss=2.303, test_acc=0.114
epoch=001, train_loss=2.304, train_acc=0.107, test_loss=2.307, test_acc=0.097
epoch=002, train_loss=2.304, train_acc=0.108, test_loss=2.303, test_acc=0.103
epoch=003, train_loss=2.304, train_acc=0.106, test_loss=2.303, test_acc=0.114
epoch=004, train_loss=2.304, train_acc=0.110, test_loss=2.302, test_acc=0.114
epoch=005, train_loss=2.304, train_acc=0.107, test_loss=2.303, test_acc=0.101
epoch=006, train_loss=2.304, train_acc=0.106, test_loss=2.303, test_acc=0.103
epoch=007, train_loss=2.304, train_acc=0.107, test_loss=2.302, test_acc=0.114
epoch=008, train_loss=2.303, train_acc=0.107, test_loss=2.303, test_acc=0.114
epoch=009, train_loss=2.304, train_acc=0.107, test_loss=2.305, test_acc=0.103
epoch=010, train_loss=2.304, train_acc=0.106, test_loss=2.302, test_acc=0.114
epoch=011, train_loss=2.304, train_acc=0.107, test_loss=2.303, test_acc=0.103
epoch=012, train_loss=2.304, train_acc=0.108, test_loss=2.306, t

| Metric | History |
|---|---|
| test_acc | █▁▃██▃▃██▃█▃▃▁█▃▄▁▃█ |
| test_loss | ▃█▂▃▁▂▂▁▂▅▁▂▇▅▆▂▂▅▃▂ |
| train_acc | █▄▅▃█▅▃▄▄▄▃▄▆▃▁▄▃▄▄▅ |
| train_loss | █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.1135 |
| test_loss | 2.30288 |
| train_acc | 0.10785 |
| train_loss | 2.30401 |


wandb: Agent Starting Run: ryo3v4ra with config:
wandb: 	epochs: 5
wandb: 	learning_rate: 0.041806213244571365
wandb: 	lin_neurons1: 16
wandb: 	lin_neurons2: 256
wandb: 	lin_neurons3: 8


epoch=000, train_loss=2.032, train_acc=0.203, test_loss=1.724, test_acc=0.296
epoch=001, train_loss=2.022, train_acc=0.208, test_loss=1.710, test_acc=0.300
epoch=002, train_loss=2.032, train_acc=0.207, test_loss=1.669, test_acc=0.333
epoch=003, train_loss=2.039, train_acc=0.199, test_loss=1.725, test_acc=0.287
epoch=004, train_loss=2.024, train_acc=0.202, test_loss=1.673, test_acc=0.297


| Metric | History |
|---|---|
| test_acc | ▂▃█▁▂ |
| test_loss | █▆▁█▂ |
| train_acc | ▄█▇▁▃ |
| train_loss | ▅▁▅█▂ |

| Metric | Final value |
|---|---|
| test_acc | 0.2966 |
| test_loss | 1.67309 |
| train_acc | 0.20208 |
| train_loss | 2.02417 |


wandb: Agent Starting Run: warxm32e with config:
wandb: 	epochs: 30
wandb: 	learning_rate: 0.04582628671500435
wandb: 	lin_neurons1: 16
wandb: 	lin_neurons2: 256
wandb: 	lin_neurons3: 64


epoch=000, train_loss=2.316, train_acc=0.109, test_loss=2.302, test_acc=0.114
epoch=001, train_loss=2.303, train_acc=0.109, test_loss=2.303, test_acc=0.114
epoch=002, train_loss=2.303, train_acc=0.110, test_loss=2.305, test_acc=0.114
epoch=003, train_loss=2.303, train_acc=0.108, test_loss=2.306, test_acc=0.089
epoch=004, train_loss=2.303, train_acc=0.108, test_loss=2.303, test_acc=0.114
epoch=005, train_loss=2.303, train_acc=0.108, test_loss=2.303, test_acc=0.101
epoch=006, train_loss=2.303, train_acc=0.107, test_loss=2.303, test_acc=0.103
epoch=007, train_loss=2.304, train_acc=0.107, test_loss=2.304, test_acc=0.114
epoch=008, train_loss=2.303, train_acc=0.108, test_loss=2.305, test_acc=0.097
epoch=009, train_loss=2.303, train_acc=0.107, test_loss=2.306, test_acc=0.101
epoch=010, train_loss=2.303, train_acc=0.107, test_loss=2.305, test_acc=0.114
epoch=011, train_loss=2.304, train_acc=0.108, test_loss=2.303, test_acc=0.103
epoch=012, train_loss=2.303, train_acc=0.108, test_loss=2.303, t


| Metric | History |
|---|---|
| test_acc | ███▁█▄▅█▃▄█▅▅▄▄██▅▃▅▅█▄██▄████ |
| test_loss | ▂▃▄▅▃▃▃▄▅▆▅▃▃▃▄▃▅█▄▃▄▄▃▃▂▃▃▂▁▂ |
| train_acc | ▅▅▆▄▃▄▁▂▃▂▂▃▄▆▅▁█▄▃▃▃▂▄▂▇▆▃▂▆▁ |
| train_loss | █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.1135 |
| test_loss | 2.30213 |
| train_acc | 0.10663 |
| train_loss | 2.30364 |


[34m[1mwandb[0m: Agent Starting Run: bl7wmb77 with config:
[34m[1mwandb[0m: 	epochs: 20
[34m[1mwandb[0m: 	learning_rate: 0.028360026816680692
[34m[1mwandb[0m: 	lin_neurons1: 512
[34m[1mwandb[0m: 	lin_neurons2: 512
[34m[1mwandb[0m: 	lin_neurons3: 512


epoch=000, train_loss=1.913, train_acc=0.441, test_loss=0.886, test_acc=0.760
epoch=001, train_loss=1.472, train_acc=0.490, test_loss=0.887, test_acc=0.742
epoch=002, train_loss=1.556, train_acc=0.472, test_loss=0.814, test_acc=0.798
epoch=003, train_loss=1.600, train_acc=0.447, test_loss=1.067, test_acc=0.633
epoch=004, train_loss=1.584, train_acc=0.460, test_loss=0.974, test_acc=0.650
epoch=005, train_loss=1.640, train_acc=0.441, test_loss=1.013, test_acc=0.675
epoch=006, train_loss=1.649, train_acc=0.433, test_loss=1.023, test_acc=0.617
epoch=007, train_loss=1.700, train_acc=0.426, test_loss=1.192, test_acc=0.635
epoch=008, train_loss=1.729, train_acc=0.423, test_loss=1.174, test_acc=0.588
epoch=009, train_loss=1.806, train_acc=0.390, test_loss=1.251, test_acc=0.610
epoch=010, train_loss=1.901, train_acc=0.362, test_loss=1.435, test_acc=0.476
epoch=011, train_loss=1.884, train_acc=0.347, test_loss=1.347, test_acc=0.507
epoch=012, train_loss=1.963, train_acc=0.334, test_loss=1.230, t


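| Metric | History |
|---|---|
| test_acc | ▇▇█▅▆▆▅▅▅▅▃▃▄▂▃▃▂▂▁▂ |
| test_loss | ▂▂▁▃▂▃▃▄▄▄▆▅▄▆▆▅█▅▇▅ |
| train_acc | ▆█▇▇▇▆▆▆▆▅▄▃▃▃▂▂▁▁▂▁ |
| train_loss | ▇▁▂▃▂▃▃▄▄▅▆▆▇▇▇▆▇█▇█ |

| Metric | Final value |
|---|---|
| test_acc | 0.4227 |
| test_loss | 1.3825 |
| train_acc | 0.30138 |
| train_loss | 2.02885 |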


[34m[1mwandb[0m: Agent Starting Run: yuafiup4 with config:
[34m[1mwandb[0m: 	epochs: 5
[34m[1mwandb[0m: 	learning_rate: 0.025519390701341717
[34m[1mwandb[0m: 	lin_neurons1: 256
[34m[1mwandb[0m: 	lin_neurons2: 8
[34m[1mwandb[0m: 	lin_neurons3: 64


epoch=000, train_loss=1.841, train_acc=0.298, test_loss=1.185, test_acc=0.523
epoch=001, train_loss=1.460, train_acc=0.482, test_loss=0.902, test_acc=0.700
epoch=002, train_loss=1.367, train_acc=0.522, test_loss=0.884, test_acc=0.721
epoch=003, train_loss=1.311, train_acc=0.542, test_loss=0.846, test_acc=0.714
epoch=004, train_loss=1.252, train_acc=0.565, test_loss=0.826, test_acc=0.756



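| Metric | History |
|---|---|
| test_acc | ▁▆▇▇█ |
| test_loss | █▂▂▁▁ |
| train_acc | ▁▆▇▇█ |
| train_loss | █▃▂▂▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.7563 |
| test_loss | 0.82592 |
| train_acc | 0.56508 |
| train_loss | 1.25215 |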


[34m[1mwandb[0m: Agent Starting Run: cyky78ph with config:
[34m[1mwandb[0m: 	epochs: 10
[34m[1mwandb[0m: 	learning_rate: 0.06558277936961326
[34m[1mwandb[0m: 	lin_neurons1: 64
[34m[1mwandb[0m: 	lin_neurons2: 512
[34m[1mwandb[0m: 	lin_neurons3: 16


epoch=000, train_loss=2.456, train_acc=0.108, test_loss=2.304, test_acc=0.114
epoch=001, train_loss=2.304, train_acc=0.106, test_loss=2.303, test_acc=0.103
epoch=002, train_loss=2.304, train_acc=0.105, test_loss=2.304, test_acc=0.114
epoch=003, train_loss=2.305, train_acc=0.105, test_loss=2.307, test_acc=0.103
epoch=004, train_loss=2.304, train_acc=0.108, test_loss=2.305, test_acc=0.114
epoch=005, train_loss=2.304, train_acc=0.106, test_loss=2.304, test_acc=0.103
epoch=006, train_loss=2.304, train_acc=0.108, test_loss=2.301, test_acc=0.114
epoch=007, train_loss=2.304, train_acc=0.107, test_loss=2.304, test_acc=0.114
epoch=008, train_loss=2.304, train_acc=0.107, test_loss=2.306, test_acc=0.114
epoch=009, train_loss=2.304, train_acc=0.108, test_loss=2.304, test_acc=0.114



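| Metric | History |
|---|---|
| test_acc | █▁█▁█▁████ |
| test_loss | ▄▄▄█▆▅▁▄▆▄ |
| train_acc | ▇▃▁▁█▃▇▅▅█ |
| train_loss | █▁▁▁▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.1135 |
| test_loss | 2.30387 |
| train_acc | 0.10802 |
| train_loss | 2.30411 |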


[34m[1mwandb[0m: Agent Starting Run: mxx8cpk1 with config:
[34m[1mwandb[0m: 	epochs: 30
[34m[1mwandb[0m: 	learning_rate: 0.07101988863639593
[34m[1mwandb[0m: 	lin_neurons1: 128
[34m[1mwandb[0m: 	lin_neurons2: 16
[34m[1mwandb[0m: 	lin_neurons3: 128


epoch=000, train_loss=2.346, train_acc=0.106, test_loss=2.302, test_acc=0.114
epoch=001, train_loss=2.304, train_acc=0.104, test_loss=2.307, test_acc=0.103
epoch=002, train_loss=2.304, train_acc=0.107, test_loss=2.304, test_acc=0.103
epoch=003, train_loss=2.304, train_acc=0.106, test_loss=2.303, test_acc=0.097
epoch=004, train_loss=2.304, train_acc=0.107, test_loss=2.304, test_acc=0.097
epoch=005, train_loss=2.304, train_acc=0.106, test_loss=2.303, test_acc=0.103
epoch=006, train_loss=2.305, train_acc=0.106, test_loss=2.303, test_acc=0.114
epoch=007, train_loss=2.305, train_acc=0.106, test_loss=2.304, test_acc=0.103
epoch=008, train_loss=2.304, train_acc=0.104, test_loss=2.303, test_acc=0.103
epoch=009, train_loss=2.305, train_acc=0.106, test_loss=2.307, test_acc=0.103
epoch=010, train_loss=2.305, train_acc=0.107, test_loss=2.303, test_acc=0.114
epoch=011, train_loss=2.305, train_acc=0.107, test_loss=2.303, test_acc=0.103
epoch=012, train_loss=2.304, train_acc=0.104, test_loss=2.305, t


| Metric | History |
|---|---|
| test_acc | █▄▄▂▂▄█▄▄▄█▄▁█▃▃██████▂█▂▄███▄ |
| test_loss | ▂▇▄▃▄▃▃▄▂▇▃▃▅▅▃▄▁▄▂▂█▄▆▁▅▇▆▄▂▃ |
| train_acc | ▅▃▆▅▆▅▄▅▂▅▆▆▃▅▆█▄▆▇▅█▅▆▆▄▁▄▇▆▇ |
| train_loss | █▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▁▂▁▁▁▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.1028 |
| test_loss | 2.30277 |
| train_acc | 0.1073 |
| train_loss | 2.30398 |


[34m[1mwandb[0m: Agent Starting Run: zkbu8si3 with config:
[34m[1mwandb[0m: 	epochs: 5
[34m[1mwandb[0m: 	learning_rate: 0.0099700288356997
[34m[1mwandb[0m: 	lin_neurons1: 32
[34m[1mwandb[0m: 	lin_neurons2: 256
[34m[1mwandb[0m: 	lin_neurons3: 128


epoch=000, train_loss=0.828, train_acc=0.723, test_loss=0.360, test_acc=0.910
epoch=001, train_loss=0.655, train_acc=0.791, test_loss=0.388, test_acc=0.898
epoch=002, train_loss=0.629, train_acc=0.799, test_loss=0.390, test_acc=0.918
epoch=003, train_loss=0.622, train_acc=0.805, test_loss=0.349, test_acc=0.919
epoch=004, train_loss=0.617, train_acc=0.807, test_loss=0.349, test_acc=0.916



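| Metric | History |
|---|---|
| test_acc | ▅▁██▇ |
| test_loss | ▃██▁▁ |
| train_acc | ▁▇▇██ |
| train_loss | █▂▁▁▁ |

| Metric | Final value |
|---|---|
| test_acc | 0.9165 |
| test_loss | 0.34857 |
| train_acc | 0.80725 |
| train_loss | 0.61713 |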
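
Several of the runs above end with a training loss of about 2.303 and an accuracy near 0.11. The sketch below is one way to read those numbers. It assumes `standard_train` minimizes a 10-class cross-entropy loss, which this output does not show, so treat that as an assumption. Under it, a model that predicts uniform probabilities has a loss of ln(10) ≈ 2.303, which means those runs did no better than guessing. The run IDs, learning rates, and final metrics are copied from the run summaries printed above; `at_chance` and its tolerance are illustrative helpers, not part of the notebook.

```python
import math

# (run id, learning_rate, final test_acc, final train_loss), copied from the output above
runs = [
    ("ryo3v4ra", 0.041806213244571365, 0.2966, 2.02417),
    ("warxm32e", 0.04582628671500435, 0.1135, 2.30364),
    ("bl7wmb77", 0.028360026816680692, 0.4227, 2.02885),
    ("yuafiup4", 0.025519390701341717, 0.7563, 1.25215),
    ("cyky78ph", 0.06558277936961326, 0.1135, 2.30411),
    ("mxx8cpk1", 0.07101988863639593, 0.1028, 2.30398),
    ("zkbu8si3", 0.0099700288356997, 0.9165, 0.61713),
]

# Cross-entropy of a uniform guess over 10 classes: -log(1/10) = log(10)
# (assumption: the training loss is a 10-class cross-entropy)
chance_loss = math.log(10)


def at_chance(train_loss, tol=0.01):
    """Hypothetical helper: True if the loss sits within tol of the uniform-guess loss."""
    return abs(train_loss - chance_loss) < tol


# List the runs by increasing learning rate and mark the ones stuck at chance level
for run_id, lr, test_acc, train_loss in sorted(runs, key=lambda r: r[1]):
    note = "  <- chance level" if at_chance(train_loss) else ""
    print(f"{run_id}  lr={lr:.4f}  test_acc={test_acc:.4f}  train_loss={train_loss:.3f}{note}")

stuck = [run_id for run_id, _, _, train_loss in runs if at_chance(train_loss)]
best = max(runs, key=lambda r: r[2])
```

In this small sample the three runs flagged as chance level are also the ones with the largest learning rates, and the run with the smallest learning rate reaches the highest test accuracy. Seven runs are too few to draw firm conclusions, but it suggests that the upper end of the learning-rate range may be too high for this model.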


## Exercises

Try running a hyperparameter sweep over different hyperparameters for the ```Simple2dCnnClassifier```.
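
As a possible starting point for this exercise, here is a minimal sketch of a sweep configuration written as a plain dictionary in the W&B sweep format. The hyperparameter names `conv_channels1`, `conv_channels2`, and `kernel_size` are hypothetical, since the constructor of `Simple2dCnnClassifier` is not shown here. Replace them with the arguments the class actually accepts. The `random` search method, the value ranges, and the project name are likewise assumptions; `test_acc` is the metric name that appears in the run summaries above.

```python
def build_cnn_sweep_config():
    """Return a W&B sweep configuration as a plain dict (illustrative values only)."""
    return {
        "method": "random",  # assumption: random search
        "metric": {"name": "test_acc", "goal": "maximize"},
        "parameters": {
            "epochs": {"values": [5, 10, 20]},
            # continuous range sampled between min and max
            "learning_rate": {"min": 0.0001, "max": 0.03},
            # hypothetical names: adapt to the real Simple2dCnnClassifier arguments
            "conv_channels1": {"values": [8, 16, 32]},
            "conv_channels2": {"values": [16, 32, 64]},
            "kernel_size": {"values": [3, 5]},
        },
    }


sweep_config = build_cnn_sweep_config()

# With wandb installed and logged in, the sweep could then be started like this.
# train_fn stands for your own training function and the project name is a placeholder:
#
#   import wandb
#   sweep_id = wandb.sweep(sweep=sweep_config, project="cnn-sweep-exercise")
#   wandb.agent(sweep_id, function=train_fn, count=10)
```

The training function passed to `wandb.agent` would read the sampled values from `wandb.config`, build the model from them, and log `test_acc` so that the sweep can compare runs.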