# 0. Introduction

<center>
<img src='https://i.postimg.cc/dQhm7c9b/wandb-dash.jpg' width=700>
</center>

Weights & Biases (wandb) is a **free MLOps platform** that helps developers 'build better models faster'. It supports:

* *Experiment tracking*
* *Dataset versioning*
* *Model management*

In this notebook, we show how to create an account and **get started** with Weights & Biases.

In [1]:
import numpy as np
import pandas as pd

import tensorflow as tf
from tensorflow import keras
from tensorflow.keras import layers

import os
os.environ['TF_CPP_MIN_LOG_LEVEL'] = '3'

# Uncomment to silence wandb's console output
#os.environ['WANDB_SILENT'] = 'true'

# 1. Import wandb

If you are working locally, first **install** the library with `pip install wandb`.

In [2]:
import wandb

# 2. Create a free account

Go to https://wandb.ai/ and create a free account.

<center>
<img src='https://i.postimg.cc/D0WDxph9/wandb-site.png' width=600>
</center>

Sign up using GitHub, Gmail, Microsoft or just an email and password. Fill in your details and select a **personal account** (this is **free** and you get **unlimited tracking hours**).

<center>
<img src='https://i.postimg.cc/QNyVVFyC/wandb-signup.png' width=600>
</center>

# 3. Get API key

An API key is a **unique identifier** used to authenticate a user to an API. Go to https://wandb.ai/authorize to get your unique key.

<center>
<img src='https://i.postimg.cc/HnvthbHF/wandb-apikey.png' width=600>
</center>

# 4. Create secret

Next, save your API key as a **secret** in Kaggle. Click "Add-ons" at the top of the screen and then "Secrets".

<center>
<img src='https://i.postimg.cc/SxR6JgC9/wandb-secrets.png' width=600>
</center>

Then click "Add a new secret".

<center>
<img src='https://i.postimg.cc/gJ3hh5gM/wandb-addsecret.png' width=600>
</center>

Paste the **API key** you copied earlier into the "value" box and add a name in the "label" box (this notebook uses `wandb_api_key`). Then click "Save".

<center>
<img src='https://i.postimg.cc/MTJCtMYW/wandb-labelvalue.png' width=600>
</center>

Make sure the box "Attach to Notebook" is ticked for the new secret you created. Then press "Done".

<center>
<img src='https://i.postimg.cc/LXLppdT3/wandb-attach.png' width=600>
</center>

# 5. Login

The `kaggle_secrets` library allows us to retrieve the secret we created and use it to log in to Weights & Biases.

Note: if you gave your secret a different **label** (i.e. not `"wandb_api_key"`), then you have to replace it in the `get_secret` call below.

In [3]:
from kaggle_secrets import UserSecretsClient

user_secrets = UserSecretsClient()

my_secret = user_secrets.get_secret("wandb_api_key") 

wandb.login(key=my_secret)

[34m[1mwandb[0m: W&B API key is configured. Use [1m`wandb login --relogin`[0m to force relogin
[34m[1mwandb[0m: Appending key for api.wandb.ai to your netrc file: /root/.netrc


True
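
Outside Kaggle, `kaggle_secrets` is not available. A minimal sketch of a fallback, assuming you have exported your key in the `WANDB_API_KEY` environment variable (which wandb also reads on its own). The helper name `get_wandb_key` is hypothetical and not part of wandb or Kaggle:

```python
import os


def get_wandb_key(label="wandb_api_key"):
    """Return the W&B API key from Kaggle secrets, else from the environment.

    Hypothetical helper for illustration; not part of wandb or kaggle_secrets.
    """
    try:
        # Only importable inside a Kaggle notebook
        from kaggle_secrets import UserSecretsClient
        return UserSecretsClient().get_secret(label)
    except Exception:
        # Fall back to the environment variable (None if it is not set)
        return os.environ.get("WANDB_API_KEY")


# Usage (requires wandb to be installed):
# import wandb
# wandb.login(key=get_wandb_key())
```

Keeping the key in a secret or an environment variable means it never appears in the notebook itself, which matters if you share the notebook publicly.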

# 6. Build model

Now we build the model. It reads the **hyperparameters** you want to keep track of from a dictionary, `CFG`, which is defined in the next section.

In [4]:
# Load MNIST dataset
(X_train, y_train), (X_test, y_test) = tf.keras.datasets.mnist.load_data()

# Normalise and reshape arrays
X_train, X_test = X_train.reshape((-1, 784))/255.0, X_test.reshape((-1, 784))/255.0

# Print shapes
X_train.shape, X_test.shape

Downloading data from https://storage.googleapis.com/tensorflow/tf-keras-datasets/mnist.npz


((60000, 784), (10000, 784))

In [5]:
def build_model():
    # Define model
    model = keras.Sequential([

        # hidden layer 1
        layers.Dense(units=CFG['layer1_units'], activation=CFG['layer1_act'], input_shape=[X_train.shape[1]]),
        layers.Dropout(rate=CFG['layer1_drop']),

        # hidden layer 2
        layers.Dense(units=CFG['layer2_units'], activation=CFG['layer2_act']),
        layers.Dropout(rate=CFG['layer2_drop']),

        # output layer
        layers.Dense(units=10, activation='softmax')
    ])

    # Define loss, optimizer and metric
    model.compile(optimizer=keras.optimizers.Adam(learning_rate=CFG['learning_rate']),
                  loss='sparse_categorical_crossentropy',
                  metrics=['sparse_categorical_accuracy'])
    
    return model

# 7. Create config file

The config needs to be a **dictionary**. Collect all the hyperparameters here.

In [6]:
CFG = dict(
    layer1_units = 256,
    layer1_act = 'relu',
    layer1_drop = 0.25,
    layer2_units = 128,
    layer2_act = 'relu',
    layer2_drop = 0.25,
    optimiser = 'Adam',
    learning_rate = 0.01,
    batch_size = 128,
    epochs = 20,
)
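
When you run several experiments, it helps to change one or two hyperparameters without editing the dictionary by hand. A minimal sketch, assuming a hypothetical helper `make_config` (not part of wandb) that copies a set of defaults and rejects misspelled keys:

```python
# Defaults mirror the CFG dictionary above
DEFAULTS = dict(
    layer1_units=256, layer1_act='relu', layer1_drop=0.25,
    layer2_units=128, layer2_act='relu', layer2_drop=0.25,
    optimiser='Adam', learning_rate=0.01, batch_size=128, epochs=20,
)


def make_config(**overrides):
    """Return a copy of DEFAULTS with the given hyperparameters replaced."""
    unknown = set(overrides) - set(DEFAULTS)
    if unknown:
        # Catch typos such as 'learnin_rate' before they reach a run
        raise KeyError(f"Unknown hyperparameters: {sorted(unknown)}")
    return {**DEFAULTS, **overrides}


cfg_small_lr = make_config(learning_rate=0.001)
print(cfg_small_lr['learning_rate'])  # prints 0.001
```

The defaults are left untouched, so each experiment starts from the same baseline.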

# 8. Run experiment

Use `run = wandb.init()` to start a new run and `run.finish()` to end one. The init method takes the following **parameters**:

* `entity`: An entity is a username or team name where you're sending runs. 
* `project`: The name of the project where you're sending the new run. If the project is not specified, the run is put in an "Uncategorized" project.
* `config`: This sets `wandb.config`, a dictionary-like object for saving inputs to your job, like hyperparameters for a model or settings for a data preprocessing job.
* `save_code`: Turn this on to save the main script or notebook to W&B. This is valuable for improving experiment reproducibility and to diff code across experiments in the UI.
* `group`: Specify a group to organize individual runs into a larger experiment. For example, you can use this to create groups for different model architectures.
* `job_type`: Specify the type of run, which is useful when you're grouping runs together into larger experiments using `group`. Typical job types are "train", "evaluate", etc.
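
To illustrate how `group` and `job_type` fit together, here is a minimal sketch that prepares the arguments for a training run and an evaluation run belonging to the same experiment. The helper `init_kwargs` is hypothetical; only the keyword names come from `wandb.init`:

```python
def init_kwargs(cfg, group, job_type, project='mnist-tutorial'):
    """Collect keyword arguments for wandb.init (hypothetical helper)."""
    return dict(
        project=project,    # project that receives the run
        config=cfg,         # hyperparameters stored in wandb.config
        group=group,        # e.g. one group per model architecture
        job_type=job_type,  # e.g. 'train' or 'evaluate'
    )


cfg = {'learning_rate': 0.01}
train_kwargs = init_kwargs(cfg, group='ANN', job_type='train')
eval_kwargs = init_kwargs(cfg, group='ANN', job_type='evaluate')

# Usage (requires wandb and a logged-in session):
# run = wandb.init(**train_kwargs)
```

Both runs share the group `'ANN'`, so they can be collected under one experiment in the workspace while remaining distinguishable by job type.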

In [7]:
# Build model
model = build_model()

# Initialise run
run = wandb.init(entity = 'scortinhas',
                 project = 'mnist-tutorial',
                 config = CFG,
                 save_code = True,
                 #group = 'ANN',
                 #job_type = 'train'
)

[34m[1mwandb[0m: Currently logged in as: [33mscortinhas[0m. Use [1m`wandb login --relogin`[0m to force relogin


Some frameworks, like Keras, have a **callback** that tracks all the metrics during training. Alternatively, you can use the **log method**, for example: `wandb.log({'accuracy': train_acc, 'loss': train_loss})`.
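
In a custom training loop, the log method is typically called once per step or epoch. A minimal sketch, assuming a hypothetical helper `log_history` and a `run` object returned by `wandb.init()`. Here it replays the per-epoch metrics that Keras stores in `History.history`, as an alternative to the callback rather than in addition to it:

```python
def log_history(run, history):
    """Log one dictionary of metrics per epoch.

    `history` is a Keras-style dict: {metric_name: [value for each epoch]}.
    `run` is any object with a wandb-style `log(data, step=...)` method.
    """
    n_epochs = len(next(iter(history.values())))
    for epoch in range(n_epochs):
        metrics = {name: values[epoch] for name, values in history.items()}
        run.log(metrics, step=epoch)


# Usage (requires an active wandb run):
# history = model.fit(X_train, y_train, epochs=CFG['epochs'])
# log_history(run, history.history)
```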

In [8]:
from wandb.keras import WandbCallback

In [9]:
# Train model
model.fit(X_train, y_train,
    validation_data = (X_test, y_test),
    batch_size = CFG['batch_size'],
    epochs = CFG['epochs'],
    callbacks = [WandbCallback()],
    verbose = True)



Epoch 1/20
Epoch 2/20
Epoch 3/20
Epoch 4/20
Epoch 5/20
Epoch 6/20
Epoch 7/20
Epoch 8/20
Epoch 9/20
Epoch 10/20
Epoch 11/20
Epoch 12/20
Epoch 13/20
Epoch 14/20
Epoch 15/20
Epoch 16/20
Epoch 17/20
Epoch 18/20
Epoch 19/20
Epoch 20/20


<keras.callbacks.History at 0x7f0f9817b5d0>

# 9. Save model as an Artifact

Artifacts are a way to **save and version your data**. For example, this could be the **datasets** you use or the **model weights** after training. It is a very useful feature to ensure your work is **reproducible**.

In [10]:
# Save model
model.save('neural_network.h5')

# Save model as an Artifact
artifact = wandb.Artifact(name='neural_network', type='model')
artifact.add_file('neural_network.h5')
run.log_artifact(artifact)

<wandb.sdk.wandb_artifacts.Artifact at 0x7f0f981b7b10>
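
To reuse the saved model later, a run can declare the artifact as an input and download it. A minimal sketch, assuming the artifact name and file name used above. `fetch_model_path` is a hypothetical helper, while `use_artifact` and `download` are the wandb calls:

```python
import os


def fetch_model_path(run, name='neural_network', alias='latest',
                     filename='neural_network.h5'):
    """Download a model artifact and return the local path to its file."""
    # 'latest' is the alias W&B gives to the newest version of an artifact
    artifact = run.use_artifact(f'{name}:{alias}')
    # download() returns the local directory holding the artifact's files
    model_dir = artifact.download()
    return os.path.join(model_dir, filename)


# Usage (requires an active wandb run and TensorFlow):
# model = keras.models.load_model(fetch_model_path(run))
```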

Don't forget to **finish the run**. This matters if you want to do multiple runs in one notebook, because each run has to end before the next one starts.

In [11]:
# Complete W&B run
run.finish()


| Metric | History |
|---|---|
| epoch | ▁▁▂▂▂▃▃▄▄▄▅▅▅▆▆▇▇▇██ |
| loss | █▄▄▃▃▃▂▂▂▂▂▂▂▂▂▂▁▁▁▁ |
| sparse_categorical_accuracy | ▁▅▅▆▆▆▇▇▇▇▇▇▇▇██████ |
| val_loss | ▆▆▅▃▅▁▁▄▃▃▅▇▅▆██▄▃▆▄ |
| val_sparse_categorical_accuracy | ▁▂▂▄▄▇▆▇█▆▇▆█▇▅█▇██▆ |

| Metric | Value |
|---|---|
| GFLOPS | 0.00023 |
| best_epoch | 6.0 |
| best_val_loss | 0.12353 |
| epoch | 19.0 |
| loss | 0.15039 |
| sparse_categorical_accuracy | 0.96363 |
| val_loss | 0.14014 |
| val_sparse_categorical_accuracy | 0.9672 |


**Notes:**
* You can follow the results in **real time** by clicking the run **link** that wandb prints in the cell output above.
* You can **silence wandb's console output** by setting `os.environ['WANDB_SILENT'] = 'true'` at the start of the notebook.
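
When doing multiple runs in one notebook, a `with` block is one way to make sure each run is finished even if training raises an error, since the object returned by `wandb.init()` can be used as a context manager. A minimal sketch, assuming a hypothetical helper `run_experiments` and a hypothetical `train` function that returns a dictionary of final metrics:

```python
def run_experiments(configs, init, train, project='mnist-tutorial'):
    """Start one run per config; each run ends when its `with` block exits.

    `init` is expected to behave like wandb.init, and `train` is a
    hypothetical function mapping a config dict to a dict of metrics.
    """
    for cfg in configs:
        with init(project=project, config=cfg) as run:
            run.log(train(cfg))


# Usage (requires wandb and a logged-in session):
# configs = [CFG, {**CFG, 'learning_rate': 0.001}]
# run_experiments(configs, wandb.init, train)
```

Passing `init` in as an argument keeps the helper easy to try out with a stand-in before sending real runs to your project.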

# 10. Explore workspace

To view my **workspace** go to https://wandb.ai/scortinhas/mnist-tutorial?workspace=user-scortinhas (I have made this project public).

<center>
<img src='https://i.postimg.cc/XYjQ8wDR/wandb-run3.png' width=600>
</center>

On the left, you can also switch to a **table view** of the experiments.

<center>
<img src='https://i.postimg.cc/dV9d5pT6/wandb-table.png' width=600>
</center>

You can also see the **lineage** of our saved model in the **artifacts** tab. 

<center>
<img src='https://i.postimg.cc/Gp8Db257/wandb-lineage.png' width=600>
</center>

# 11. Smoothing

A nice additional feature is that you can apply **smoothing** to the plots by clicking the **iron symbol**. 

<center>
<img src='https://i.postimg.cc/G2vqhz55/wandb-smooth.png' width=600>
</center>

# 12. Parallel plots

Add a cool **parallel coordinates plot** by clicking on 'Add panel'.

<center>
<img src='https://i.postimg.cc/KjwzqrZm/wandb-parallel.png' width=600>
</center>

Then select the **columns** you want to visualise.

<center>
<img src='https://i.postimg.cc/DZWmqMTx/wandb-addcols.png' width=600>
</center>

Select 'switch to custom layout' for greater **flexibility** of the dashboard.

<center>
<img src='https://i.postimg.cc/0Q9cj715/wandb-custom.png' width=600>
</center>

**Organise** the panels by dragging the **bottom right corners**.

<center>
<img src='https://i.postimg.cc/QMKF2d4X/wandb-organise2.png' width=600>
</center>

Run some **more experiments** and you get something looking like this!

<center>
<img src='https://i.postimg.cc/GmxZQFJY/wandb-final.png' width=600>
</center>

**Part 2:**

* [🐝 Advanced WandB: Hyper-parameter tuning (sweeps)](https://www.kaggle.com/code/samuelcortinhas/advanced-wandb-hyper-parameter-tuning-sweeps)

**References:**
    
* [Experiment Tracking with Weights and Biases](https://www.kaggle.com/code/ayuraj/experiment-tracking-with-weights-and-biases) 