# TUTORIAL: wandb.ai Weights & Biases integration in notebooks

This tutorial shows how to use Weights & Biases with AI Training.

### **USE CASE:** Train YOLOv5 models and compare their performance with Weights & Biases

*If you would like to see in more detail how to train YOLOv5 to recognise objects, please refer to the full tutorial in the AI Training documentation.*

<img src="attachment:97b8c0b9-768a-4e12-bf8e-10f2f3fe0efc.jpg" width="600">

## Introduction

**What is Weights & Biases?**

"*Weights & Biases helps you build better models faster with a central dashboard for machine learning projects. Use our tools to log hyperparameters and output metrics from your runs, then visualize and compare results and quickly share findings with your colleagues.*"

We will show how Weights & Biases can be used with the YOLOv5 real-time object detection framework on AI Training. This tutorial is based on the YOLOv5 repository by Ultralytics (https://github.com/ultralytics/yolov5).

To do so, we will compare the performance of the YOLOv5 s, m, l and x models on the COCO dataset.

```
OVHcloud disclaims to the fullest extent authorized by law all warranties, whether express or implied, including any implied warranties of title, non-infringement, quiet enjoyment, integration, merchantability or fitness for a particular purpose regarding the use of the COCO dataset in the context of this notebook. The user shall fully comply with the terms of use that appear on the database website (https://cocodataset.org/).
```

## Requirements

First, create a Weights & Biases account: https://wandb.ai/site.

Then, to use Weights & Biases on AI Training, create a new job in which you will train your model on your dataset.

With wandb.ai, you can display your metrics while the model trains.
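To give an idea of what "displaying metrics" involves, here is a minimal sketch built on the public `wandb.init`, `wandb.log` and `wandb.finish` calls. It is not the code YOLOv5 uses internally. The project name, run name, config values and metric values are hypothetical placeholders, and `log_history` is a helper introduced only for this illustration.

```python
def log_history(log_fn, history):
    """Send one metrics dict per epoch to log_fn(metrics, step=epoch)."""
    for epoch, metrics in enumerate(history):
        log_fn(metrics, step=epoch)
    return len(history)


def main():
    # imported here so that the helper above stays usable without wandb
    import wandb

    # hypothetical project name, run name and config values
    wandb.init(project="wandb_demo", name="demo_run",
               config={"epochs": 2, "batch_size": 16})
    # placeholder numbers, only there to show the shape of the call
    history = [{"train/loss": 1.0}, {"train/loss": 0.5}]
    log_history(wandb.log, history)
    wandb.finish()


# call main() from a notebook cell once wandb is installed and you are logged in
```

Each call to `wandb.log` adds one point per metric to the charts of the run, which is what makes the dashboard update while training is still in progress.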

## Code

The different steps are as follows:

- Install wandb and login
- Install YOLOv5 dependencies
- Download the COCO dataset
- Define YOLOv5 model
- Install packages for running
- Run YOLOv5 training
- Overview of dynamic display with Weights & Biases
- Use of computing resources

### Install wandb and login

In [None]:
# install wandb
!pip install wandb

⚠️ Remember to restart the kernel after installation.

In [20]:
# login (to get your API key: https://wandb.ai/authorize)
import wandb
wandb.login()

wandb: You can find your API key in your browser here: https://wandb.ai/authorize

wandb: Paste an API key from your profile and hit enter:  ········································

wandb: Appending key for api.wandb.ai to your netrc file: /workspace/.netrc


True

**Note:** If you prefer to log in from a terminal, use the following command => *wandb login*
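As the login output shows, the API key is stored in a netrc file under the host `api.wandb.ai`. The sketch below uses only the standard library to check whether credentials are already available, either in such a file or in the `WANDB_API_KEY` environment variable that wandb also reads. The helper name `has_wandb_credentials` is hypothetical, and the netrc path has to be adapted to your environment.

```python
import netrc
import os


def has_wandb_credentials(netrc_path, env=None, host="api.wandb.ai"):
    """Return True if an API key is available via WANDB_API_KEY or a netrc entry."""
    env = os.environ if env is None else env
    if env.get("WANDB_API_KEY"):
        return True
    try:
        entry = netrc.netrc(netrc_path).authenticators(host)
    except (FileNotFoundError, netrc.NetrcParseError):
        # no readable netrc file: no stored credentials
        return False
    return entry is not None
```

For example, `has_wandb_credentials("/workspace/.netrc")` would tell you whether the login step can be skipped in a fresh kernel.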


### Install YOLOv5 dependencies

In [4]:
# clone YOLOv5 repository
!git clone https://github.com/ultralytics/yolov5  # clone repo

Cloning into 'yolov5'...
remote: Enumerating objects: 7098, done.
remote: Counting objects: 100% (204/204), done.
remote: Compressing objects: 100% (118/118), done.
remote: Total 7098 (delta 119), reused 151 (delta 86), pack-reused 6894
Receiving objects: 100% (7098/7098), 9.16 MiB | 12.04 MiB/s, done.
Resolving deltas: 100% (4857/4857), done.


In [5]:
# YOLOv5 path
%cd yolov5

/workspace/notebook_wandbai_yolov5/yolov5


In [6]:
!git reset --hard 886f1c03d839575afecb059accf74296fad395b6

HEAD is now at 886f1c0 DDP after autoanchor reorder (#2421)


In [None]:
# install dependencies as necessary
!pip install -r requirements.txt

In [9]:
import torch
import wandb

# to display images
from IPython.display import Image, clear_output

# to download models
from utils.google_utils import gdrive_download 

In [10]:
print('Setup complete. Using torch %s %s' % (torch.__version__, torch.cuda.get_device_properties(0) if torch.cuda.is_available() else 'CPU'))

Setup complete. Using torch 1.8.1 _CudaDeviceProperties(name='Tesla V100S-PCIE-32GB', major=7, minor=0, total_memory=32510MB, multi_processor_count=80)


### Download the COCO dataset

In [None]:
# copy and paste the code extract
%cd /workspace/notebook_wandbai_yolov5/
!curl -L "https://public.roboflow.com/ds/IGiAtRcab0?key=teTb0bdAo3" > roboflow.zip; unzip roboflow.zip; rm roboflow.zip

In [42]:
# the yaml file is written by Roboflow and contains information about our data
%cat data.yaml

train: ../train/images
val: ../valid/images

nc: 80
names: ['aeroplane', 'apple', 'backpack', 'banana', 'baseball bat', 'baseball glove', 'bear', 'bed', 'bench', 'bicycle', 'bird', 'boat', 'book', 'bottle', 'bowl', 'broccoli', 'bus', 'cake', 'car', 'carrot', 'cat', 'cell phone', 'chair', 'clock', 'cow', 'cup', 'diningtable', 'dog', 'donut', 'elephant', 'fire hydrant', 'fork', 'frisbee', 'giraffe', 'hair drier', 'handbag', 'horse', 'hot dog', 'keyboard', 'kite', 'knife', 'laptop', 'microwave', 'motorbike', 'mouse', 'orange', 'oven', 'parking meter', 'person', 'pizza', 'pottedplant', 'refrigerator', 'remote', 'sandwich', 'scissors', 'sheep', 'sink', 'skateboard', 'skis', 'snowboard', 'sofa', 'spoon', 'sports ball', 'stop sign', 'suitcase', 'surfboard', 'teddy bear', 'tennis racket', 'tie', 'toaster', 'toilet', 'toothbrush', 'traffic light', 'train', 'truck', 'tvmonitor', 'umbrella', 'vase', 'wine glass', 'zebra']

### Define YOLOv5 model

Define the model you want to train: YOLOv5 s, m, l or x.

Our aim here is to compare the four models, so we will train them one after the other on the same dataset.

In [12]:
# define the number of classes based on data.yaml (here we have 80)
import yaml

# go to the directory where the data.yaml file is located
%cd /workspace/notebook_wandbai_yolov5
with open("data.yaml", 'r') as stream:
    num_classes = str(yaml.safe_load(stream)['nc'])

/workspace/notebook_wandbai_yolov5


In [48]:
# model configuration used
%cat /workspace/notebook_wandbai_yolov5/yolov5/models/yolov5s.yaml

# parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, C3, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C

In [44]:
# customize iPython writefile
from IPython.core.magic import register_line_cell_magic

@register_line_cell_magic
def writetemplate(line, cell):
    with open(line, 'w') as f:
        f.write(cell.format(**globals()))
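The `writetemplate` magic passes the cell body through `str.format(**globals())`, so any `{name}` placeholder in the cell is replaced by the notebook variable of that name before the file is written. The following sketch shows the substitution on a plain string. The template text is a hypothetical, shortened example and not one of the YOLOv5 configuration files.

```python
# hypothetical, shortened template: {num_classes} is a placeholder,
# while {{ and }} produce literal braces in the output
template = "nc: {num_classes}  # number of classes\nextra: {{}}\n"

num_classes = "80"  # a string, as in the cell that reads data.yaml

rendered = template.format(num_classes=num_classes)
print(rendered)
```

One consequence of this design is that literal braces in a template must be doubled, otherwise `format` treats them as placeholders and raises an error.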

In [49]:
%%writetemplate /workspace/notebook_wandbai_yolov5/yolov5/models/custom_yolov5s.yaml

# parameters
nc: 80  # number of classes
depth_multiple: 0.33  # model depth multiple
width_multiple: 0.50  # layer channel multiple

# anchors
anchors:
  - [10,13, 16,30, 33,23]  # P3/8
  - [30,61, 62,45, 59,119]  # P4/16
  - [116,90, 156,198, 373,326]  # P5/32

# YOLOv5 backbone
backbone:
  # [from, number, module, args]
  [[-1, 1, Focus, [64, 3]],  # 0-P1/2
   [-1, 1, Conv, [128, 3, 2]],  # 1-P2/4
   [-1, 3, C3, [128]],
   [-1, 1, Conv, [256, 3, 2]],  # 3-P3/8
   [-1, 9, C3, [256]],
   [-1, 1, Conv, [512, 3, 2]],  # 5-P4/16
   [-1, 9, C3, [512]],
   [-1, 1, Conv, [1024, 3, 2]],  # 7-P5/32
   [-1, 1, SPP, [1024, [5, 9, 13]]],
   [-1, 3, C3, [1024, False]],  # 9
  ]

# YOLOv5 head
head:
  [[-1, 1, Conv, [512, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 6], 1, Concat, [1]],  # cat backbone P4
   [-1, 3, C3, [512, False]],  # 13

   [-1, 1, Conv, [256, 1, 1]],
   [-1, 1, nn.Upsample, [None, 2, 'nearest']],
   [[-1, 4], 1, Concat, [1]],  # cat backbone P3
   [-1, 3, C3, [256, False]],  # 17 (P3/8-small)

   [-1, 1, Conv, [256, 3, 2]],
   [[-1, 14], 1, Concat, [1]],  # cat head P4
   [-1, 3, C3, [512, False]],  # 20 (P4/16-medium)

   [-1, 1, Conv, [512, 3, 2]],
   [[-1, 10], 1, Concat, [1]],  # cat head P5
   [-1, 3, C3, [1024, False]],  # 23 (P5/32-large)

   [[17, 20, 23], 1, Detect, [nc, anchors]],  # Detect(P3, P4, P5)
  ]

### Install packages for running

The following dependencies are necessary to train your YOLOv5 model.

In [None]:
!pip install --upgrade pip

In [None]:
!pip install opencv-python

In [None]:
!pip install opencv-python-headless

In [None]:
conda install -c conda-forge pycocotools

In [None]:
conda install -c conda-forge/label/qcc7 pycocotools

In [None]:
conda install -c conda-forge/label/cf201901 pycocotools

In [None]:
conda install -c conda-forge/label/cf202003 pycocotools

### Run YOLOv5 training

⚠️ Every model is trained for 10 epochs.

**Parameters definitions:**

- img: refers to the input image size.
- batch: refers to the batch size (number of training examples used in one iteration).
- epochs: refers to the number of training epochs. An epoch corresponds to one cycle through the full training dataset.
- project: name of the folder in which your results will be stored.
- data: refers to the path to the dataset yaml file (here, data.yaml).
- cfg: defines the model configuration (here, YOLOv5 s, m, l and x).
- weights: path to the initial weights. An empty string means training from scratch.
- name: training name (yolov5s_results, yolov5m_results, yolov5l_results and yolov5x_results).
- cache: caches the images to speed up training.
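Since the four models are trained one after the other with the same parameters, the commands can be generated instead of typed by hand. The sketch below builds one argument list per model size. It assumes that a `custom_yolov5m.yaml`, `custom_yolov5l.yaml` and `custom_yolov5x.yaml` file has been written in the same way as `custom_yolov5s.yaml`, which this notebook does not show, and `build_train_command` is a hypothetical helper name.

```python
import shlex


def build_train_command(size, epochs=10, img=416, batch=16):
    """Return the train.py argument list for one YOLOv5 size (s, m, l or x)."""
    return [
        "python", "train.py",
        "--img", str(img),
        "--batch", str(batch),
        "--epochs", str(epochs),
        "--project", "wandb_yolov5",
        "--data", "../data.yaml",
        # assumption: one custom config file exists per model size
        "--cfg", f"./models/custom_yolov5{size}.yaml",
        "--weights", "",
        "--name", f"yolov5{size}_results",
        "--cache",
    ]


commands = [build_train_command(size) for size in ["s", "m", "l", "x"]]
for command in commands:
    # shlex.join quotes the empty --weights value so the line can be pasted in a shell
    print(shlex.join(command))
```

Each list can also be handed to `subprocess.run(command, check=True)` from the `yolov5` directory to launch the runs in sequence.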

In [51]:
# train yolov5 on custom data for 10 epochs
# time its performance
%time
%cd /workspace/notebook_wandbai_yolov5/yolov5/
!python train.py --img 416 --batch 16 --epochs 10 --project wandb_yolov5 --data '../data.yaml' --cfg ./models/custom_yolov5s.yaml --weights '' --name yolov5s_results  --cache

CPU times: user 4 µs, sys: 2 µs, total: 6 µs
Wall time: 15.3 µs
/workspace/notebook_wandbai_yolov5/yolov5
github: skipping check (Docker image)
YOLOv5 v4.0-126-g886f1c0 torch 1.8.1 CUDA:0 (Tesla V100S-PCIE-32GB, 32510.5MB)
                                     CUDA:1 (Tesla V100S-PCIE-32GB, 32510.5MB)

Namespace(adam=False, batch_size=16, bucket='', cache_images=True, cfg='./models/custom_yolov5s.yaml', data='../data.yaml', device='', entity=None, epochs=1, evolve=False, exist_ok=False, global_rank=-1, hyp='data/hyp.scratch.yaml', image_weights=False, img_size=[416, 416], linear_lr=False, local_rank=-1, log_artifacts=False, log_imgs=16, multi_scale=False, name='yolov5s_results_1epoch', noautoanchor=False, nosave=False, notest=False, project='wandb_yolov5', quad=False, rect=False, resume=False, save_dir='wandb_yolov5/yolov5s_results_1epoch', single_cls=False, sync_bn=False, total_batch_size=16, weights='', workers=8, world_size=1)
Start Tensorboard with "tensorboard --logdir

The output above comes from an example run of the YOLOv5s model over 1 epoch (see `epochs=1` in the `Namespace` line).

### Overview of dynamic display with Weights & Biases

You can display several metrics with Weights & Biases:

**Loss functions for the training and the validation sets:**

- Box: loss due to a box prediction not exactly covering an object.
- Objectness: loss due to a wrong box-object IoU prediction.
- Classification: loss due to deviations from predicting ‘1’ for the correct classes and ‘0’ for all the other classes for the object in that box.

<img src="attachment:2c993174-bc8f-41b2-a5e9-e8f1a454436e.png">

**Precision & Recall:**
- Precision: measures how accurate the predictions are. It is the share of predicted detections that are correct.
- Recall: measures how well the model finds all the positives. It is the share of ground-truth objects that are detected.
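As a small worked example, both metrics can be computed from the number of true positives (correct detections), false positives (wrong detections) and false negatives (missed objects). The counts below are hypothetical and only serve to illustrate the formulas.

```python
def precision_recall(tp, fp, fn):
    """Return (precision, recall) from detection counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    return precision, recall


# hypothetical counts: 8 correct detections, 2 false alarms, 4 missed objects
p, r = precision_recall(tp=8, fp=2, fn=4)
print(f"precision={p:.2f} recall={r:.2f}")
```

With these counts, 8 of the 10 predictions are correct (precision 0.80) and 8 of the 12 objects are found (recall 0.67).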

**mAP (mean Average Precision):** compares the ground-truth bounding box to the detected box and returns a score. The higher the score, the more accurate the model is in its detections.
- mAP@0.5: the AP of each category is computed with the IoU threshold set to 0.5, then averaged over all categories.
- mAP@0.5:0.95: the average mAP over different IoU thresholds (from 0.5 to 0.95 in steps of 0.05).
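Both variants rely on the IoU (Intersection over Union) between a ground-truth box and a detected box. The sketch below computes it for boxes given as `(x1, y1, x2, y2)` corner coordinates, which is an assumed format for this illustration, and lists the ten thresholds used by mAP@0.5:0.95. The two boxes are hypothetical.

```python
def iou(box_a, box_b):
    """IoU of two boxes given as (x1, y1, x2, y2) corner coordinates."""
    ix1 = max(box_a[0], box_b[0])
    iy1 = max(box_a[1], box_b[1])
    ix2 = min(box_a[2], box_b[2])
    iy2 = min(box_a[3], box_b[3])
    # the intersection is empty when the boxes do not overlap
    inter = max(0.0, ix2 - ix1) * max(0.0, iy2 - iy1)
    area_a = (box_a[2] - box_a[0]) * (box_a[3] - box_a[1])
    area_b = (box_b[2] - box_b[0]) * (box_b[3] - box_b[1])
    union = area_a + area_b - inter
    return inter / union if union > 0 else 0.0


# IoU thresholds from 0.5 to 0.95 in steps of 0.05
thresholds = [round(0.5 + 0.05 * i, 2) for i in range(10)]

ground_truth = (0, 0, 10, 10)  # hypothetical ground-truth box
detection = (0, 0, 10, 5)      # hypothetical detected box
score = iou(ground_truth, detection)

# thresholds at which this detection still counts as a match
matched_at = [t for t in thresholds if score >= t]
print(score, matched_at)
```

Here the detection covers half of the ground-truth box, so it counts as a match at the 0.5 threshold only. This is why mAP@0.5:0.95 is a stricter score than mAP@0.5.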

<img src="attachment:863cc0a2-d112-486d-866c-b2afd2af4d3f.png">

You can compare your training runs by creating a "Parallel coordinates" graph on Weights & Biases, which lets you evaluate the performance of each run.

<img src="attachment:9a363a3a-9fd2-4742-afd4-9e1106aa4a27.png">

Here, YOLOv5x appears to be the best model after 10 epochs.

You can also see some images from the training:

<img src="attachment:60f448f3-3805-4c09-ba3b-b0ffd7de5515.png">

### Use of computing resources

Weights & Biases also lets you see, in a simple way, how your models actually use their computational resources.

Overview:

<img src="attachment:dc688892-52f6-4d08-bfe0-2b95fbd9df06.png">