# Introduction to WanDB

## A. What is WanDB?

Weights & Biases (W&B, referred to as WanDB throughout this guide) is a developer-oriented toolset designed specifically for machine learning. It helps you monitor and visualize a model's training process and performance. It also provides a centralized platform where teams can log, share, and collaborate on machine learning projects, making it easier to compare runs and model performance.

## B. Why use WanDB?

There are several reasons why WanDB stands out as a preferred tool for machine learning projects:

1. **Track and Visualize Models**: WanDB provides a simple way to track every detail of your experiment, providing real-time visualization of your models' training and results.

2. **Hyperparameter Optimization**: With WanDB's Sweeps, you can automate hyperparameter tuning and explore the parameter space more efficiently to optimize your model's performance.

3. **Collaboration**: WanDB makes it easy to share your experiment results with colleagues and the community, fostering collaboration and knowledge sharing.

4. **Reproducibility**: By logging all the metadata from your runs, WanDB helps maintain the reproducibility of your experiments, which is crucial in machine learning projects.

## C. Understanding the Importance of Monitoring and Improving Model Training

Monitoring and improving model training is an essential part of the machine learning workflow.

- **Monitoring**: By keeping track of various metrics such as loss and accuracy during the training process, you can understand how well your model is learning and whether it's improving over time. This can help you detect issues like overfitting or underfitting early on and take corrective actions.

- **Improving**: Once you are monitoring your model's performance, the next step is to improve it. This could involve tweaking the model's architecture, optimizing hyperparameters, or training on more or different data. Tools like WanDB make it easier to experiment with these aspects and track the impact of each change, thereby helping you build better models.
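
As a rough illustration of the monitoring idea above, the sketch below flags the first epoch at which validation loss has risen for several consecutive epochs while training loss keeps falling, which is one common symptom of overfitting. The function name `first_overfit_epoch`, the `patience` parameter, and the loss values are all hypothetical and chosen only for this example.

```python
def first_overfit_epoch(train_losses, val_losses, patience=2):
    """Return the 1-based epoch at which validation loss has risen for
    `patience` consecutive epochs while training loss kept falling,
    or None if that never happens."""
    streak = 0
    for epoch in range(1, len(val_losses)):
        val_up = val_losses[epoch] > val_losses[epoch - 1]
        train_down = train_losses[epoch] < train_losses[epoch - 1]
        streak = streak + 1 if (val_up and train_down) else 0
        if streak >= patience:
            return epoch + 1  # convert 0-based index to 1-based epoch
    return None


# Hypothetical loss curves, not taken from a real run.
train = [2.0, 1.5, 1.2, 1.0, 0.8, 0.6]
val = [2.1, 1.7, 1.5, 1.6, 1.7, 1.9]
print(first_overfit_epoch(train, val))
```

In practice these curves would come from the metrics logged during training, and the threshold for "several epochs" is a judgment call.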


# Setting up WanDB

## A. Account Setup

To get started with WanDB, you need to create a free account:

1. Visit the official Weights & Biases website: [wandb.ai](https://wandb.ai/site)
2. Click on `Sign Up` at the top right corner of the home page.
3. You can opt to sign up using a GitHub, Google, or LinkedIn account. Alternatively, sign up using your email address and a password.
4. After signing up, you will be shown your API key. Save it, because you will need it to log in from your code. If you lose it, you can look it up again in `User Settings`.

## B. Installing the Wandb library

Once you've set up your account, you need to install the Wandb library in your Python environment. It can be installed using pip:

```bash
pip install wandb
```

Or with conda:

```bash
conda install -c conda-forge wandb
```

Ensure you have the latest version of the library for optimal functionality.
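
If you want to confirm which version is installed before upgrading, a small check like the one below can help. It is a minimal sketch that relies only on the standard library; the helper name `installed_version` is made up for this example.

```python
from importlib.metadata import PackageNotFoundError, version


def installed_version(package):
    """Return the installed version string of `package`, or None if it is absent."""
    try:
        return version(package)
    except PackageNotFoundError:
        return None


# Prints a version string when wandb is installed, otherwise None.
print(installed_version("wandb"))
```

Running `pip install --upgrade wandb` afterwards brings the package up to the latest release.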

## C. Initializing WanDB in Your Code

After installing the Wandb library, you need to import it and initialize it within your project. Start by importing wandb:

```python
import wandb
```

Then, initialize wandb with `wandb.init()`. You can pass several optional parameters to `wandb.init()`, such as:

- `project`: The name of the project where you're logging runs. This could be any string, and a new project will be created if it doesn't already exist.
- `entity`: The username or team name under which the project is to be created.

An example of initializing Wandb for a project named 'my_project' under username 'my_username' would be:

```python
wandb.init(project='my_project', entity='my_username')
```

After running `wandb.init()`, a new run will be created on the WanDB website, where you can track your model's progress, visualize results, and more.
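
Besides `project` and `entity`, `wandb.init()` also accepts a `config` dictionary, which stores hyperparameters alongside the run, and a `mode` argument (`"online"`, `"offline"`, or `"disabled"`). The sketch below shows one way to use them. The helper names `build_run_config` and `start_run` are hypothetical, and the default values simply mirror the ones used in the project later in this guide.

```python
def build_run_config(learning_rate=0.001, momentum=0.9, batch_size=64, epochs=10):
    """Collect hyperparameters in one dictionary so they are stored with the run."""
    return {
        "learning_rate": learning_rate,
        "momentum": momentum,
        "batch_size": batch_size,
        "epochs": epochs,
    }


def start_run(project, entity=None, config=None, mode="online"):
    """Start a run and return it.

    wandb is imported here so that the helper above stays usable
    in environments where wandb is not installed.
    """
    import wandb
    return wandb.init(project=project, entity=entity, config=config, mode=mode)


config = build_run_config()
print(config)

# Example call (needs wandb installed, and a logged-in account unless mode="offline"):
# run = start_run("my_project", entity="my_username", config=config)
```

Values passed through `config` appear on the run's page, which makes it easier to tell later which settings produced which results.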


Let's create a simple project that uses WandB (Weights & Biases) for experiment tracking. The project classifies the CIFAR-10 dataset using a Convolutional Neural Network (CNN) implemented in PyTorch.

# Project: CIFAR-10 Image Classification with PyTorch and WandB

## Overview

The CIFAR-10 dataset consists of 60,000 32x32 color images in 10 classes, with 6,000 images per class, split into 50,000 training images and 10,000 test images. The goal of this project is to classify the images into these classes.

We will use a Convolutional Neural Network (CNN) in PyTorch to perform this classification task. The model's performance will be logged and visualized using WandB, a tool for machine learning experiment tracking.

## Steps

### 1. Setting Up the Environment

Install the necessary libraries.


```python
!pip install torch torchvision wandb
```


### 2. Import the Libraries


```python
import torch
from torch import nn, optim
import torchvision
from torchvision import datasets, transforms
import wandb
import torch.nn.functional as F
```


### 3. Initialize WandB


```python
wandb.login()
run = wandb.init(project='cifar10-classification', entity='ricky-kurniawan')
```

wandb: W&B API key is configured. Use `wandb login --relogin` to force relogin

### 4. Prepare the Data

Load the CIFAR-10 dataset. Normalize the data and create dataloaders.


```python
transform = transforms.Compose([
    transforms.ToTensor(),
    transforms.Normalize((0.5, 0.5, 0.5), (0.5, 0.5, 0.5)),
])

trainset = datasets.CIFAR10(root='./data', train=True, download=True, transform=transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

testset = datasets.CIFAR10(root='./data', train=False, download=True, transform=transform)
testloader = torch.utils.data.DataLoader(testset, batch_size=64, shuffle=False)
```


Files already downloaded and verified
Files already downloaded and verified
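
To make the data pipeline above more concrete, the following sketch reproduces two pieces of its arithmetic in plain Python: the per-channel normalization formula `(x - mean) / std`, and the number of mini-batches per epoch for a batch size of 64. The split sizes are the standard CIFAR-10 ones, and the helper `normalize` is written only for this illustration.

```python
import math


def normalize(pixel, mean=0.5, std=0.5):
    """Apply the same per-channel formula as transforms.Normalize: (x - mean) / std."""
    return (pixel - mean) / std


# ToTensor scales pixel values to [0, 1]; Normalize then maps that range to [-1, 1].
print(normalize(0.0), normalize(0.5), normalize(1.0))

TRAIN_IMAGES = 50_000  # standard CIFAR-10 training split
TEST_IMAGES = 10_000   # standard CIFAR-10 test split
BATCH_SIZE = 64

# The DataLoader keeps the final, smaller batch by default, hence the ceiling.
train_batches = math.ceil(TRAIN_IMAGES / BATCH_SIZE)
test_batches = math.ceil(TEST_IMAGES / BATCH_SIZE)
print(train_batches, test_batches)
```

With 782 training batches per epoch, a training loop that reports every 200 mini-batches produces three reports per epoch, at batches 200, 400, and 600.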


### 5. Define the Model

Define a simple CNN model.


```python
class Net(nn.Module):
    def __init__(self):
        super().__init__()
        self.conv1 = nn.Conv2d(3, 6, 5)    # 3 input channels -> 6 feature maps, 5x5 kernel
        self.pool = nn.MaxPool2d(2, 2)     # halves height and width
        self.conv2 = nn.Conv2d(6, 16, 5)   # 6 -> 16 feature maps, 5x5 kernel
        self.fc1 = nn.Linear(16 * 5 * 5, 120)
        self.fc2 = nn.Linear(120, 84)
        self.fc3 = nn.Linear(84, 10)       # one output per CIFAR-10 class

    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.pool(F.relu(self.conv2(x)))
        x = x.view(-1, 16 * 5 * 5)         # flatten to (batch, 400)
        x = F.relu(self.fc1(x))
        x = F.relu(self.fc2(x))
        x = self.fc3(x)                    # raw logits; CrossEntropyLoss applies softmax
        return x

model = Net()
```
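
The input size of `fc1`, `16 * 5 * 5`, follows from how each layer shrinks the 32x32 input. The sketch below traces that arithmetic with two small helper functions (`conv_out` and `pool_out`, both written for this example), using the standard output-size formula for convolutions and pooling without padding.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Spatial output size of a convolution along one dimension."""
    return (size + 2 * padding - kernel) // stride + 1


def pool_out(size, kernel=2, stride=2):
    """Spatial output size of a max-pooling layer along one dimension."""
    return (size - kernel) // stride + 1


size = 32                  # CIFAR-10 images are 32x32
size = conv_out(size, 5)   # after conv1 (5x5 kernel)
size = pool_out(size)      # after the first 2x2 pooling
size = conv_out(size, 5)   # after conv2 (5x5 kernel)
size = pool_out(size)      # after the second 2x2 pooling

flattened = 16 * size * size  # 16 feature maps from conv2
print(size, flattened)
```

If you change a kernel size or add a layer, rerunning this kind of calculation tells you what the first `nn.Linear` layer must expect.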


### 6. Set Up the Loss Function and Optimizer


```python
criterion = nn.CrossEntropyLoss()
optimizer = optim.SGD(model.parameters(), lr=0.001, momentum=0.9)
```
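
To give a feel for what the `momentum=0.9` setting does, here is a minimal pure-Python sketch of an SGD-with-momentum update on a single scalar parameter. It follows the form of the update used by PyTorch's `SGD` when dampening and Nesterov momentum are left at their defaults. The toy objective `w ** 2` and the helper `sgd_momentum_step` are assumptions made for this illustration.

```python
def sgd_momentum_step(param, grad, velocity, lr=0.001, momentum=0.9):
    """One update: accumulate the gradient into a velocity, then step along it."""
    velocity = momentum * velocity + grad
    param = param - lr * velocity
    return param, velocity


# Toy problem: minimize f(w) = w ** 2, whose gradient is 2 * w.
w, v = 1.0, 0.0
for _ in range(3):
    grad = 2 * w
    w, v = sgd_momentum_step(w, grad, v)
print(w)
```

Because the velocity carries over part of the previous steps, consecutive gradients that point the same way produce progressively larger updates than plain SGD would.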


### 7. Train the Model

Train the model and log the loss and accuracy to WandB.


```python
def train_model(run, model, criterion, optimizer, trainloader):
    for epoch in range(10):
        running_loss = 0.0
        correct_predictions = 0
        total_predictions = 0

        for i, (inputs, labels) in enumerate(trainloader):
            optimizer.zero_grad()

            outputs = model(inputs)
            loss = criterion(outputs, labels)
            loss.backward()
            optimizer.step()

            running_loss += loss.item()

            # The predicted class is the index of the largest logit
            _, predicted = torch.max(outputs.data, 1)
            total_predictions += labels.size(0)
            correct_predictions += (predicted == labels).sum().item()

            if i % 200 == 199:    # Every 200 mini-batches
                # Mean loss over the last 200 mini-batches. The recorded outputs in
                # this notebook divided by 2000 instead, so they read 10x smaller.
                avg_loss = running_loss / 200
                accuracy = correct_predictions / total_predictions * 100
                print('[Epoch %d, Mini-batch %5d] Loss: %.3f' % (epoch + 1, i + 1, avg_loss))
                # Log all metrics in a single call so they share one step
                wandb.log({'Epoch': epoch + 1, 'Loss': avg_loss, 'Accuracy': accuracy})
                running_loss = 0.0
                correct_predictions = 0
                total_predictions = 0

train_model(run, model, criterion, optimizer, trainloader)
```


[Epoch 1, Mini-batch   200] Loss: 0.230
[Epoch 1, Mini-batch   400] Loss: 0.230
[Epoch 1, Mini-batch   600] Loss: 0.229
[Epoch 2, Mini-batch   200] Loss: 0.218
[Epoch 2, Mini-batch   400] Loss: 0.209
[Epoch 2, Mini-batch   600] Loss: 0.200
[Epoch 3, Mini-batch   200] Loss: 0.193
[Epoch 3, Mini-batch   400] Loss: 0.189
[Epoch 3, Mini-batch   600] Loss: 0.183
[Epoch 4, Mini-batch   200] Loss: 0.179
[Epoch 4, Mini-batch   400] Loss: 0.175
[Epoch 4, Mini-batch   600] Loss: 0.171
[Epoch 5, Mini-batch   200] Loss: 0.166
[Epoch 5, Mini-batch   400] Loss: 0.163
[Epoch 5, Mini-batch   600] Loss: 0.161
[Epoch 6, Mini-batch   200] Loss: 0.155
[Epoch 6, Mini-batch   400] Loss: 0.153
[Epoch 6, Mini-batch   600] Loss: 0.151
[Epoch 7, Mini-batch   200] Loss: 0.148
[Epoch 7, Mini-batch   400] Loss: 0.145
[Epoch 7, Mini-batch   600] Loss: 0.145
[Epoch 8, Mini-batch   200] Loss: 0.141
[Epoch 8, Mini-batch   400] Loss: 0.139
[Epoch 8, Mini-batch   600] Loss: 0.142
[Epoch 9, Mini-batch   200] Loss: 0.137
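
The first values printed above (0.230) are worth a quick sanity check. A 10-class classifier that assigns equal probability to every class has a cross-entropy of ln(10) ≈ 2.303, so a running sum over 200 such batch losses divided by 2000 gives about 0.230. In other words, the recorded numbers appear to be one tenth of the mean batch loss. The sketch below reproduces that arithmetic in plain Python; the helper `cross_entropy` is written for this illustration and is not the PyTorch implementation.

```python
import math


def cross_entropy(logits, target):
    """Cross-entropy for one sample: -log(softmax(logits)[target])."""
    m = max(logits)  # subtract the maximum for numerical stability
    log_sum_exp = m + math.log(sum(math.exp(z - m) for z in logits))
    return log_sum_exp - logits[target]


# Equal logits mean every one of the 10 classes gets probability 0.1.
uniform_loss = cross_entropy([0.0] * 10, target=3)
print(round(uniform_loss, 3))

# A sum over 200 such batches, divided by 2000 as in the recorded output.
print(round(uniform_loss * 200 / 2000, 3))
```

A starting loss close to ln(10) is what you would expect from an untrained model on a balanced 10-class problem, so the curve begins where it should.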


### 8. Evaluate the Model

Evaluate the model on the test data and log the test accuracy to WandB.


```python
correct = 0
total = 0
# Gradients are not needed for evaluation
with torch.no_grad():
    for images, labels in testloader:
        outputs = model(images)
        # The predicted class is the index of the largest logit
        _, predicted = torch.max(outputs.data, 1)
        total += labels.size(0)
        correct += (predicted == labels).sum().item()

print('Accuracy of the network on the test images: %d %%' % (100 * correct / total))

wandb.log({'Test Accuracy': 100.0 * correct / total})
```


Accuracy of the network on the test images: 52 %


This section of the code calculates the accuracy of the model on the test set and logs this test accuracy to WandB for visualization and tracking.
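
Note that the printed value (52 %) and the value logged to WandB (52.62) differ only because `%d` formatting drops the fractional part. The short sketch below shows the difference. The count of 5,262 correct predictions is inferred from the logged 52.62 on 10,000 test images rather than taken from the notebook, and `accuracy_percent` is a helper written for this example.

```python
def accuracy_percent(correct, total):
    """Return accuracy as a percentage."""
    if total == 0:
        raise ValueError("total must be positive")
    return 100.0 * correct / total


# 5262 is inferred from the logged test accuracy, not read from the notebook.
acc = accuracy_percent(5262, 10_000)
print('%d %%' % acc)     # integer formatting truncates the fraction
print('%.2f %%' % acc)   # two decimals keep it
```

Using `%.2f` in the print statement would make the console output agree with the dashboard.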

At this point, you can go to the WandB website, navigate to your project, and see a live visualization of your model's loss and accuracy throughout the training process. This helps to understand how well the model is learning and provides insights for further improvements.

Finally, don't forget to close your WandB run after you're done:


```python
run.finish()
```


Run history:

| Metric | Trend |
| --- | --- |
| Accuracy | ▁▂▂▃▃▄▄▅▅▅▅▆▆▆▆▆▇▇▇▇▇▇▇▇██████ |
| Epoch | ▁▁▁▁▂▂▂▂▃▃▃▃▃▃▃▃▄▄▄▄▅▅▅▅▆▆▆▆▆▆▆▆▇▇▇▇████ |
| Loss | ████▇▇▇▆▅▅▅▅▄▄▄▄▄▄▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▁▁▁▁▁▁ |
| Test Accuracy | ▁ |

Run summary:

| Metric | Value |
| --- | --- |
| Accuracy | 53.6875 |
| Epoch | 10.0 |
| Loss | 0.12919 |
| Test Accuracy | 52.62 |


This ensures all resources are properly freed and all logs are uploaded to the WandB server. This step is crucial to make sure all your model training progress and metrics are properly saved and can be reviewed later in the WandB dashboard.
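
If the training code raises an exception, a trailing `run.finish()` call is never reached. One way to guard against that is a `try`/`finally` block, sketched below. The function `log_and_finish` and the `FakeRun` stand-in are hypothetical and exist only so the example can be tried without an account; a real run object returned by `wandb.init()` provides the same `log()` and `finish()` methods.

```python
def log_and_finish(run, metric_rows):
    """Log each row of metrics, then close the run even if logging fails."""
    try:
        for row in metric_rows:
            run.log(row)
    finally:
        run.finish()


class FakeRun:
    """Stand-in for a W&B run so the sketch can be tried without an account."""

    def __init__(self):
        self.logged = []
        self.finished = False

    def log(self, row):
        self.logged.append(row)

    def finish(self):
        self.finished = True


fake = FakeRun()
log_and_finish(fake, [{'Loss': 0.23}, {'Loss': 0.21}])
print(len(fake.logged), fake.finished)
```

Opening the run with `with wandb.init(...) as run:` has a similar effect, because the run is finished when the block exits.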


## Basic Concepts of WanDB

Here's what the WanDB dashboard looks like:

<img src=https://storage.googleapis.com/rg-ai-bootcamp/mlops/wandb-project.png width="800" height="400">


### A. Projects

In Wandb, a **project** is a collection of related machine learning experiments (known as "runs"). It provides a shared space where you and your team can compare results, share insights, and discuss potential improvements. Each project has a dedicated page on the Wandb web application, showcasing visualizations, comparisons, and other useful metrics. We set the project name when calling `wandb.init()` in step 3 above.

### B. Runs

A **run** in Wandb is a single execution of your machine learning script. During a run, you can log various metrics, such as loss and accuracy, system performance data, and even media like images or 3D objects. Each run gets its own page in the Wandb web application, where you can view and analyze logged data.

As you can see from the screenshot of the WanDB dashboard, on the left we have 5 runs in our project.

### C. Artifacts

**Artifacts** in Wandb are used to handle version control of datasets, models, and other result files from runs. They help to track the inputs and outputs of your runs, providing a clear and useful lineage of your models and data. For example, an input artifact could be your training dataset, while output artifacts could be your trained model or predictions.

On the left you can click on the Artifacts navigation icon which will take you to:

<img src=https://storage.googleapis.com/rg-ai-bootcamp/mlops/wandb-artifacts.png width="800" height="400">

In this case, the artifact is a Jupyter notebook file (`job-git_github.com_ruang-guru_ai-bootcamp.git_09_mlops_01_wandb_00_wandb_intro_setup.ipynb`) from the project `cifar10-classification` owned by the user `ricky-kurniawan`. The version of the artifact is specified after the colon - in this case, `v1` indicates it's the first version.

The `artifact.download()` command is used to download the artifact to the local machine for use in the current run.

In the "Used By" section, the listed items are the runs that have used this artifact. For example, the run `proud-salad-5` used this artifact. Information such as the run's performance metrics, the project, the user, the artifact used, and the time of artifact creation is displayed.

In this case, `run-yqpg9z5p-history:v0` is an output artifact of the run. This run history artifact contains information about the run, such as the logged metrics. This artifact is created automatically by W&B when you log metrics or other information during a run. This allows you to revisit the specifics of a run, analyze the performance, and potentially identify areas for improvement or further exploration.
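
Artifact references follow the pattern `entity/project/name:version`. The sketch below splits such a reference into its parts and shows how a run could declare an artifact as an input and download it. The full reference string is assembled here from the names mentioned above and is an assumption, as are the helper names `parse_artifact_ref` and `download_artifact`. The calls `run.use_artifact()` and `artifact.download()` are part of the wandb API.

```python
def parse_artifact_ref(ref):
    """Split 'entity/project/name:version' into its parts."""
    path, _, version = ref.rpartition(':')
    if not path:
        raise ValueError("expected a reference of the form entity/project/name:version")
    entity, project, name = path.split('/', 2)
    return {'entity': entity, 'project': project, 'name': name, 'version': version}


def download_artifact(run, ref):
    """Declare the artifact as an input of `run` and download its files.

    `run` is the object returned by wandb.init(); the return value is the
    local directory the files were downloaded to.
    """
    return run.use_artifact(ref).download()


# Reference assembled from the names in the text above (an assumption).
ref = 'ricky-kurniawan/cifar10-classification/run-yqpg9z5p-history:v0'
print(parse_artifact_ref(ref))
```

Calling `use_artifact()` before downloading is what records the artifact under the run's inputs, which in turn populates the "Used By" view.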

### D. Sweep

**Sweep** is a feature in Wandb for hyperparameter optimization. A sweep involves a set of runs, each with different hyperparameters, allowing you to explore a range of possibilities and identify the best parameters for your model. Wandb automates this process, generating a set of permutations of hyperparameters (based on a configuration file you create), running them, and logging the results. This makes it easier to optimize your model's performance.


Let's try doing a Sweep using our CIFAR-10 Project.

### 1. Setting Up the Configuration

First, we need to create a configuration for our sweep. This configuration will specify the range and distribution of hyperparameters for the sweep. Here's a basic example:


```python
sweep_config = {
    'method': 'random',  # other options: 'grid', 'bayes'
    'metric': {
        'name': 'Accuracy',  # must match the key passed to wandb.log()
        'goal': 'maximize'
    },
    'parameters': {
        'epochs': {
            'values': [5, 10, 15]
        },
        'batch_size': {
            'values': [16, 32, 64]
        },
        'learning_rate': {
            'min': 1e-5,
            'max': 0.1
        },
    }
}
```


In this configuration, we're specifying that we want to use a random search method (other options are `grid` for grid search and `bayes` for Bayesian optimization), and that our goal is to maximize accuracy. We're also specifying the range of values for the hyperparameters that we want to optimize: epochs, batch size, and learning rate.
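
Conceptually, random search draws one value per hyperparameter for every run. The sketch below imitates that for the parameters above in plain Python, assuming a uniform draw between `min` and `max` for the learning rate. It is an illustration of the idea only, not how WandB implements sweeps, and the helper `sample_params` is hypothetical.

```python
import random


def sample_params(parameters, rng):
    """Draw one value per hyperparameter from a sweep-style parameter spec."""
    sampled = {}
    for name, spec in parameters.items():
        if 'values' in spec:
            # Discrete choice among the listed values
            sampled[name] = rng.choice(spec['values'])
        else:
            # Continuous range; a uniform draw is assumed here
            sampled[name] = rng.uniform(spec['min'], spec['max'])
    return sampled


parameters = {
    'epochs': {'values': [5, 10, 15]},
    'batch_size': {'values': [16, 32, 64]},
    'learning_rate': {'min': 1e-5, 'max': 0.1},
}

rng = random.Random(0)  # fixed seed so repeated executions draw the same values
for _ in range(3):
    print(sample_params(parameters, rng))
```

Because the learning rate is a continuous range, there is no finite list of combinations to exhaust, which is why the number of runs has to be capped in some way.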

### 2. Initialize the Sweep


```python
sweep_id = wandb.sweep(sweep_config, project="cifar10-classification")
```


Create sweep with ID: kqyufn28
Sweep URL: https://wandb.ai/ricky-kurniawan/cifar10-classification/sweeps/kqyufn28


This command initializes the sweep and returns a sweep ID. This ID uniquely identifies the sweep in WandB.

### 3. Define the Train Function

Next, we wrap the training code in a function that the sweep agent can call once per run. The function starts a run, reads the sampled hyperparameters from the run's config, and passes them to the training loop:

```python
# Limit how many runs this notebook executes
global_counter = 0
max_runs = 5

def train():
    global global_counter
    if global_counter >= max_runs:
        return
    global_counter += 1
    # The sweep agent supplies the sampled hyperparameters,
    # so no config argument is needed here.
    with wandb.init() as run:
        config = run.config
        model = Net()
        criterion = nn.CrossEntropyLoss()
        optimizer = optim.SGD(model.parameters(), lr=config.learning_rate, momentum=0.9)
        # Only learning_rate takes effect: train_model always runs 10 epochs
        # and the shared trainloader uses a batch size of 64.
        train_model(run, model, criterion, optimizer, trainloader)
```

In this function, `wandb.init()` starts a new run, and the sweep agent fills the run's config with the hyperparameters sampled for it. `run.config` is then used to access those values. As written, only the learning rate is applied; the sampled `epochs` and `batch_size` are recorded with the run but do not change training, which is why every run below reports 10 epochs.

For teaching purposes, we limit the sweep to a maximum of 5 runs. In a real study you would let the sweep run longer to cover more of the search space, which may take a long time. In that case you can safely remove all lines containing `global_counter`.
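
As an alternative to a global counter, `wandb.agent()` accepts a `count` argument that caps the number of runs the agent executes. The sketch below wraps that call in a small helper. The name `run_sweep` is hypothetical, and the function is not executed here because it needs the wandb package and a logged-in account.

```python
def run_sweep(sweep_id, train_fn, max_runs=5):
    """Run at most `max_runs` sweep trials in this process."""
    import wandb  # imported lazily; needs the wandb package and a logged-in account
    wandb.agent(sweep_id, function=train_fn, count=max_runs)


# Example call, using the `sweep_id` and `train` defined in this notebook:
# run_sweep(sweep_id, train, max_runs=5)
```

With `count`, the agent stops asking for new jobs once the limit is reached, so no extra runs are started only to return immediately.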

### 4. Run the Sweep

```python
wandb.agent(sweep_id, train)
```

wandb: Agent Starting Run: jbxkaalg with config:
wandb:     batch_size: 64
wandb:     epochs: 10
wandb:     learning_rate: 0.052303023093622406
wandb: Currently logged in as: ricky-kurniawan. Use `wandb login --relogin` to force relogin


[Epoch 1, Mini-batch   200] Loss: 0.205
[Epoch 1, Mini-batch   400] Loss: 0.171
[Epoch 1, Mini-batch   600] Loss: 0.161
[Epoch 2, Mini-batch   200] Loss: 0.152
[Epoch 2, Mini-batch   400] Loss: 0.148
[Epoch 2, Mini-batch   600] Loss: 0.145
[Epoch 3, Mini-batch   200] Loss: 0.141
[Epoch 3, Mini-batch   400] Loss: 0.141
[Epoch 3, Mini-batch   600] Loss: 0.140
[Epoch 4, Mini-batch   200] Loss: 0.137
[Epoch 4, Mini-batch   400] Loss: 0.136
[Epoch 4, Mini-batch   600] Loss: 0.137
[Epoch 5, Mini-batch   200] Loss: 0.131
[Epoch 5, Mini-batch   400] Loss: 0.134
[Epoch 5, Mini-batch   600] Loss: 0.132
[Epoch 6, Mini-batch   200] Loss: 0.130
[Epoch 6, Mini-batch   400] Loss: 0.132
[Epoch 6, Mini-batch   600] Loss: 0.136
[Epoch 7, Mini-batch   200] Loss: 0.127
[Epoch 7, Mini-batch   400] Loss: 0.134
[Epoch 7, Mini-batch   600] Loss: 0.135
[Epoch 8, Mini-batch   200] Loss: 0.129
[Epoch 8, Mini-batch   400] Loss: 0.132
[Epoch 8, Mini-batch   600] Loss: 0.129
[Epoch 9, Mini-batch   200] Loss: 0.125


Run history:

| Metric | Trend |
| --- | --- |
| Accuracy | ▁▄▅▆▆▆▆▇▇▇▇▇▇▇▇█▇▇█▇▇█▇██████▇ |
| Epoch | ▁▁▁▁▂▂▂▂▃▃▃▃▃▃▃▃▄▄▄▄▅▅▅▅▆▆▆▆▆▆▆▆▇▇▇▇████ |
| Loss | ██▅▄▃▃▃▃▂▂▂▂▂▂▂▂▁▁▂▂▁▁▂▂▁▁▂▂▁▁▂▁▁▁▁▂▁▁▁▂ |

Run summary:

| Metric | Value |
| --- | --- |
| Accuracy | 54.84375 |
| Epoch | 10.0 |
| Loss | 0.13498 |


wandb: Agent Starting Run: hil12twe with config:
wandb:     batch_size: 64
wandb:     epochs: 15
wandb:     learning_rate: 0.0778152359691495


[Epoch 1, Mini-batch   200] Loss: 0.206
[Epoch 1, Mini-batch   400] Loss: 0.184
[Epoch 1, Mini-batch   600] Loss: 0.173
[Epoch 2, Mini-batch   200] Loss: 0.162
[Epoch 2, Mini-batch   400] Loss: 0.164
[Epoch 2, Mini-batch   600] Loss: 0.164
[Epoch 3, Mini-batch   200] Loss: 0.168
[Epoch 3, Mini-batch   400] Loss: 0.161
[Epoch 3, Mini-batch   600] Loss: 0.163
[Epoch 4, Mini-batch   200] Loss: 0.160
[Epoch 4, Mini-batch   400] Loss: 0.161
[Epoch 4, Mini-batch   600] Loss: 0.161
[Epoch 5, Mini-batch   200] Loss: 0.159
[Epoch 5, Mini-batch   400] Loss: 0.162
[Epoch 5, Mini-batch   600] Loss: 0.166
[Epoch 6, Mini-batch   200] Loss: 0.164
[Epoch 6, Mini-batch   400] Loss: 0.166
[Epoch 6, Mini-batch   600] Loss: 0.169
[Epoch 7, Mini-batch   200] Loss: 0.164
[Epoch 7, Mini-batch   400] Loss: 0.166
[Epoch 7, Mini-batch   600] Loss: 0.167
[Epoch 8, Mini-batch   200] Loss: 0.171
[Epoch 8, Mini-batch   400] Loss: 0.169
[Epoch 8, Mini-batch   600] Loss: 0.169
[Epoch 9, Mini-batch   200] Loss: 0.166


Run history:

| Metric | Trend |
| --- | --- |
| Accuracy | ▁▄▆▇▇▇▇▇▇███████▇▇██▇▇▇▇█▇▇▇▇▇ |
| Epoch | ▁▁▁▁▂▂▂▂▃▃▃▃▃▃▃▃▄▄▄▄▅▅▅▅▆▆▆▆▆▆▆▆▇▇▇▇████ |
| Loss | ██▅▃▁▁▂▂▂▂▁▂▁▁▁▁▁▁▁▂▂▂▂▂▂▂▂▂▃▃▂▂▂▂▃▃▂▂▃▃ |

Run summary:

| Metric | Value |
| --- | --- |
| Accuracy | 40.67188 |
| Epoch | 10.0 |
| Loss | 0.17274 |


wandb: Sweep Agent: Waiting for job.
wandb: Job received.
wandb: Agent Starting Run: wtl4q1ha with config:
wandb:     batch_size: 16
wandb:     epochs: 15
wandb:     learning_rate: 0.010806903248724136


[Epoch 1, Mini-batch   200] Loss: 0.224
[Epoch 1, Mini-batch   400] Loss: 0.190
[Epoch 1, Mini-batch   600] Loss: 0.167
[Epoch 2, Mini-batch   200] Loss: 0.146
[Epoch 2, Mini-batch   400] Loss: 0.141
[Epoch 2, Mini-batch   600] Loss: 0.136
[Epoch 3, Mini-batch   200] Loss: 0.126
[Epoch 3, Mini-batch   400] Loss: 0.124
[Epoch 3, Mini-batch   600] Loss: 0.120
[Epoch 4, Mini-batch   200] Loss: 0.114
[Epoch 4, Mini-batch   400] Loss: 0.111
[Epoch 4, Mini-batch   600] Loss: 0.113
[Epoch 5, Mini-batch   200] Loss: 0.105
[Epoch 5, Mini-batch   400] Loss: 0.104
[Epoch 5, Mini-batch   600] Loss: 0.104
[Epoch 6, Mini-batch   200] Loss: 0.099
[Epoch 6, Mini-batch   400] Loss: 0.099
[Epoch 6, Mini-batch   600] Loss: 0.099
[Epoch 7, Mini-batch   200] Loss: 0.091
[Epoch 7, Mini-batch   400] Loss: 0.092
[Epoch 7, Mini-batch   600] Loss: 0.095
[Epoch 8, Mini-batch   200] Loss: 0.087
[Epoch 8, Mini-batch   400] Loss: 0.088
[Epoch 8, Mini-batch   600] Loss: 0.091
[Epoch 9, Mini-batch   200] Loss: 0.081


Run history:

| Metric | Trend |
| --- | --- |
| Accuracy | ▁▃▄▅▅▅▆▆▆▆▇▆▇▇▇▇▇▇▇█▇██▇██████ |
| Epoch | ▁▁▁▁▂▂▂▂▃▃▃▃▃▃▃▃▄▄▄▄▅▅▅▅▆▆▆▆▆▆▆▆▇▇▇▇████ |
| Loss | ██▆▅▄▄▄▄▃▃▃▃▃▃▃▃▂▂▂▂▂▂▂▂▂▂▂▂▁▁▁▂▁▁▁▁▁▁▁▁ |

Run summary:

| Metric | Value |
| --- | --- |
| Accuracy | 70.96094 |
| Epoch | 10.0 |
| Loss | 0.083 |


wandb: Agent Starting Run: ettfrm5h with config:
wandb:     batch_size: 64
wandb:     epochs: 5
wandb:     learning_rate: 0.0795927437531589

[Epoch 1, Mini-batch   200] Loss: 0.201
[Epoch 1, Mini-batch   400] Loss: 0.178
[Epoch 1, Mini-batch   600] Loss: 0.174
[Epoch 2, Mini-batch   200] Loss: 0.166
[Epoch 2, Mini-batch   400] Loss: 0.168
[Epoch 2, Mini-batch   600] Loss: 0.166
[Epoch 3, Mini-batch   200] Loss: 0.166
[Epoch 3, Mini-batch   400] Loss: 0.162
[Epoch 3, Mini-batch   600] Loss: 0.166
[Epoch 4, Mini-batch   200] Loss: 0.161
[Epoch 4, Mini-batch   400] Loss: 0.162
[Epoch 4, Mini-batch   600] Loss: 0.165
[Epoch 5, Mini-batch   200] Loss: 0.163
[Epoch 5, Mini-batch   400] Loss: 0.161
[Epoch 5, Mini-batch   600] Loss: 0.163
[Epoch 6, Mini-batch   200] Loss: 0.164
[Epoch 6, Mini-batch   400] Loss: 0.163
[Epoch 6, Mini-batch   600] Loss: 0.162
[Epoch 7, Mini-batch   200] Loss: 0.163
[Epoch 7, Mini-batch   400] Loss: 0.168
[Epoch 7, Mini-batch   600] Loss: 0.162
[Epoch 8, Mini-batch   200] Loss: 0.164
[Epoch 8, Mini-batch   400] Loss: 0.164
[Epoch 8, Mini-batch   600] Loss: 0.165
[Epoch 9, Mini-batch   200] Loss: 0.168


Run history:

| Metric | Trend |
| --- | --- |
| Accuracy | ▁▅▆▆▆▇▇▇▇█▇▇███▇███▇████▇▇▇▇▇█ |
| Epoch | ▁▁▁▁▂▂▂▂▃▃▃▃▃▃▃▃▄▄▄▄▅▅▅▅▆▆▆▆▆▆▆▆▇▇▇▇████ |
| Loss | ██▄▃▂▂▂▂▂▂▁▂▁▁▁▂▁▁▁▁▂▂▁▁▁▁▂▁▂▂▂▂▂▂▂▃▃▃▃▂ |

Run summary:

| Metric | Value |
| --- | --- |
| Accuracy | 42.27344 |
| Epoch | 10.0 |
| Loss | 0.168 |


wandb: Agent Starting Run: sli7lpw0 with config:
wandb:     batch_size: 64
wandb:     epochs: 10
wandb:     learning_rate: 0.02990947205127064

[Epoch 1, Mini-batch   200] Loss: 0.207
[Epoch 1, Mini-batch   400] Loss: 0.169
[Epoch 1, Mini-batch   600] Loss: 0.157
[Epoch 2, Mini-batch   200] Loss: 0.139
[Epoch 2, Mini-batch   400] Loss: 0.137
[Epoch 2, Mini-batch   600] Loss: 0.134
[Epoch 3, Mini-batch   200] Loss: 0.125
[Epoch 3, Mini-batch   400] Loss: 0.123
[Epoch 3, Mini-batch   600] Loss: 0.125
[Epoch 4, Mini-batch   200] Loss: 0.116
[Epoch 4, Mini-batch   400] Loss: 0.118
[Epoch 4, Mini-batch   600] Loss: 0.117
[Epoch 5, Mini-batch   200] Loss: 0.110
[Epoch 5, Mini-batch   400] Loss: 0.113
[Epoch 5, Mini-batch   600] Loss: 0.113
[Epoch 6, Mini-batch   200] Loss: 0.105
[Epoch 6, Mini-batch   400] Loss: 0.109
[Epoch 6, Mini-batch   600] Loss: 0.110
[Epoch 7, Mini-batch   200] Loss: 0.101
[Epoch 7, Mini-batch   400] Loss: 0.104
[Epoch 7, Mini-batch   600] Loss: 0.109
[Epoch 8, Mini-batch   200] Loss: 0.099
[Epoch 8, Mini-batch   400] Loss: 0.105
[Epoch 8, Mini-batch   600] Loss: 0.101
[Epoch 9, Mini-batch   200] Loss: 0.096


Run history:

| Metric | Trend |
| --- | --- |
| Accuracy | ▁▄▄▅▅▆▆▆▆▇▇▇▇▇▇▇▇▇██▇█▇██████▇ |
| Epoch | ▁▁▁▁▂▂▂▂▃▃▃▃▃▃▃▃▄▄▄▄▅▅▅▅▆▆▆▆▆▆▆▆▇▇▇▇████ |
| Loss | ██▆▅▄▄▄▄▃▃▃▃▂▂▃▂▂▂▂▂▂▂▂▂▂▂▂▂▁▁▂▂▁▁▂▂▁▁▂▂ |

Run summary:

| Metric | Value |
| --- | --- |
| Accuracy | 64.03125 |
| Epoch | 10.0 |
| Loss | 0.10403 |


wandb: Sweep Agent: Waiting for job.
wandb: Job received.
wandb: Agent Starting Run: rtnk6ct2 with config:
wandb:     batch_size: 64
wandb:     epochs: 15
wandb:     learning_rate: 0.04013296976488706
wandb: Agent Starting Run: bqo62g9l with config:
wandb:     batch_size: 64
wandb:     epochs: 15
wandb:     learning_rate: 0.019177756444346203
wandb: Sweep Agent: Waiting for job.
wandb: Job received.
wandb: Agent Starting Run: 4xg5m9q5 with config:
wandb:     batch_size: 16
wandb:     epochs: 10
wandb:     learning_rate: 0.08786538663827181
wandb: Agent Starting Run: hccvb2aj with config:
wandb:     batch_size: 32
wandb:     epochs: 15
wandb:     learning_rate: 0.09598692806072671
wandb: Sweep Agent: Waiting for job.
wandb: Job received.

This command will start the sweep we just defined. WandB will call the `train` function with the different combinations of hyperparameters defined in the sweep configuration.

Here's what the Sweep Dashboard looks like:

<img src=https://storage.googleapis.com/rg-ai-bootcamp/mlops/wandb-sweep.png width="800" height="400">

By looking at the charts, we can clearly see which combinations are more effective, and then dive in to fine-tune our hyperparameters.

WandB's sweeps are a powerful tool for optimizing your model's hyperparameters. By integrating WandB with your model training code, you can automate the process of training many models with different hyperparameters, and then easily compare their performance on the WandB dashboard.
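
The same comparison can be made in code once the results are collected. The sketch below ranks the five completed runs from the sweep output above by their final logged accuracy. The numbers are copied from the run summaries in this notebook, where `Accuracy` is the training accuracy over the last logging window; the list layout itself is just one convenient way to hold them.

```python
# Run IDs, learning rates, and final logged accuracies from the sweep output above.
runs = [
    {'run': 'jbxkaalg', 'learning_rate': 0.052303023093622406, 'accuracy': 54.84375},
    {'run': 'hil12twe', 'learning_rate': 0.0778152359691495, 'accuracy': 40.67188},
    {'run': 'wtl4q1ha', 'learning_rate': 0.010806903248724136, 'accuracy': 70.96094},
    {'run': 'ettfrm5h', 'learning_rate': 0.0795927437531589, 'accuracy': 42.27344},
    {'run': 'sli7lpw0', 'learning_rate': 0.02990947205127064, 'accuracy': 64.03125},
]

# Rank the runs from best to worst accuracy
ranked = sorted(runs, key=lambda r: r['accuracy'], reverse=True)
for r in ranked:
    print(f"{r['run']}  lr={r['learning_rate']:.4f}  accuracy={r['accuracy']:.2f}")

best = ranked[0]
```

In this small sample, the run with the smallest learning rate (about 0.011) ends with the highest accuracy, while the two runs with learning rates near 0.08 end with the lowest, which suggests narrowing the search toward smaller learning rates.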

You can view the full project at this link: [CIFAR10 WanDB Dashboard](https://wandb.ai/ricky-kurniawan/cifar10-classification/overview?workspace=user-ricky-kurniawan)