[![Fixel Algorithms](https://fixelalgorithms.co/images/CCExt.png)](https://fixelalgorithms.gitlab.io)

# Deep Learning Methods

## Deep Learning for Computer Vision - Object Detection - Object Localization

> Notebook by:
> - Royi Avital RoyiAvital@fixelalgorithms.com

## Revision History

| Version | Date       | User        | Content / Changes                                                  |
|---------|------------|-------------|--------------------------------------------------------------------|
| 1.1.000 | 27/01/2026 | Royi Avital | Added the use of Depth Wise Separable Convolution                  |
| 1.1.000 | 27/01/2026 | Royi Avital | The dataset returns a tuple of the label and the box               |
| 1.0.001 | 13/06/2024 | Royi Avital | Fixed issue with the class label of the results                    |
| 1.0.000 | 08/06/2024 | Royi Avital | First version                                                      |

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/FixelAlgorithmsTeam/FixelCourses/blob/master/AIProgram/2024_02/0098DeepLearningObjectLocalization.ipynb)

In [None]:
# Import Packages

# General Tools
import numpy as np
import scipy as sp
import pandas as pd

# Machine Learning
from sklearn.metrics import confusion_matrix, ConfusionMatrixDisplay
from sklearn.model_selection import ParameterGrid

# Deep Learning
import torch
import torch.nn            as nn
import torch.nn.functional as F
import torchinfo
from torchmetrics.classification import MulticlassAccuracy
from torchmetrics.detection.iou import IntersectionOverUnion

import torchvision
from torchvision.transforms import v2 as TorchVisionTrns

# Miscellaneous
import os
from platform import python_version
import random

# Typing
from typing import Callable, Dict, Generator, List, Optional, Self, Set, Tuple, Union
from numpy.typing import NDArray
from torch import Tensor

# Visualization
import matplotlib.pyplot as plt

# Jupyter
from IPython import get_ipython

## Notations

* <font color='red'>(**?**)</font> Question to answer interactively.
* <font color='blue'>(**!**)</font> Simple task to add code for the notebook.
* <font color='green'>(**@**)</font> Optional / Extra self practice.
* <font color='brown'>(**#**)</font> Note / Useful resource / Food for thought.

Code Notations:

```python
someVar    = 2; #<! Notation for a variable
vVector    = np.random.rand(4) #<! Notation for 1D array
mMatrix    = np.random.rand(4, 3) #<! Notation for 2D array
tTensor    = np.random.rand(4, 3, 2, 3) #<! Notation for nD array (Tensor)
tuTuple    = (1, 2, 3) #<! Notation for a tuple
lList      = [1, 2, 3] #<! Notation for a list
dDict      = {1: 3, 2: 2, 3: 1} #<! Notation for a dictionary
oObj       = MyClass() #<! Notation for an object
dfData     = pd.DataFrame() #<! Notation for a data frame
dsData     = pd.Series() #<! Notation for a series
hObj       = plt.Axes() #<! Notation for an object / handler / function handler
```

### Code Exercise

 - Single line fill

```python
valToFill = ???
```

 - Multi Line to Fill (At least one)

```python
# You need to start writing
?????
```

 - Section to Fill

```python
#===========================Fill This===========================#
# 1. Explanation about what to do.
# !! Remarks to follow / take under consideration.
mX = ???

?????
#===============================================================#
```

In [None]:
# Configuration
# %matplotlib inline

seedNum = 512
np.random.seed(seedNum)
random.seed(seedNum)

# Matplotlib default color palette
lMatPltLibclr = ['#1f77b4', '#ff7f0e', '#2ca02c', '#d62728', '#9467bd', '#8c564b', '#e377c2', '#7f7f7f', '#bcbd22', '#17becf']
# sns.set_theme() #<! Apply SeaBorn theme

runInGoogleColab = 'google.colab' in str(get_ipython())

# Improve performance by benchmarking
torch.backends.cudnn.benchmark = True

# Reproducibility (Per PyTorch Version on the same device)
# torch.manual_seed(seedNum)
# torch.backends.cudnn.deterministic = True
# torch.backends.cudnn.benchmark     = False #<! Makes things slower

In [None]:
# Constants

FIG_SIZE_DEF    = (8, 8)
ELM_SIZE_DEF    = 50
CLASS_COLOR     = ('b', 'r')
EDGE_COLOR      = 'k'
MARKER_SIZE_DEF = 10
LINE_WIDTH_DEF  = 2

PROJECT_NAME     = 'FixelCourses'
DATA_FOLDER_NAME = 'DataSets'
BASE_FOLDER_PATH = os.getcwd()[:(len(os.getcwd()) - (os.getcwd()[::-1].lower().find(PROJECT_NAME.lower()[::-1])))]
DATA_FOLDER_PATH = os.path.join(BASE_FOLDER_PATH, DATA_FOLDER_NAME)

TENSOR_BOARD_BASE = 'TB'

D_CLASSES  = {0: 'Red', 1: 'Green', 2: 'Blue'}
L_CLASSES  = ['R', 'G', 'B']
T_IMG_SIZE = (100, 100, 3)

In [None]:
# Download Auxiliary Modules for Google Colab
if runInGoogleColab:
    !wget https://raw.githubusercontent.com/FixelAlgorithmsTeam/FixelCourses/master/AIProgram/2024_02/DataManipulation.py
    !wget https://raw.githubusercontent.com/FixelAlgorithmsTeam/FixelCourses/master/AIProgram/2024_02/DataVisualization.py
    !wget https://raw.githubusercontent.com/FixelAlgorithmsTeam/FixelCourses/master/AIProgram/2024_02/DeepLearningPyTorch.py

In [None]:
# Courses Packages

from DataManipulation import BBoxFormat
from DataManipulation import GenLabeldEllipseImg
from DataVisualization import PlotBox, PlotBBox, PlotLabelsHistogram
from DeepLearningPyTorch import ObjectLocalizationDataset
from DeepLearningPyTorch import GenDataLoaders, GetBatch, InitWeightsKaiNorm, TrainModel, TrainModelSch

* <font color='blue'>(**!**)</font> Go through `GenLabeldEllipseImg()`.
* <font color='blue'>(**!**)</font> Go through `ObjectLocalizationDataset`.

In [None]:
# General Auxiliary Functions

def GenData( numSamples: int, tuImgSize: Tuple[int, int, int], boxFormat: BBoxFormat = BBoxFormat.YOLO ) -> Tuple[NDArray, NDArray, NDArray]:
    """
    Generate synthetic data for object localization.
    Parameters
    ----------
    numSamples : int
        Number of samples to generate.
    tuImgSize : Tuple[int, int, int]
        Image size as (Height, Width, Channels).
    boxFormat : BBoxFormat, optional
        Bounding box format, by default BBoxFormat.YOLO.
    Returns
    -------
    Tuple[np.ndarray, np.ndarray, np.ndarray]
        mX : np.ndarray
            Feature matrix of shape (numSamples, Channels, Height, Width).
        vY : np.ndarray
            Label vector of shape (numSamples,).
        mB : np.ndarray
            Bounding box matrix of shape (numSamples, 4).
    """

    numObj = 1
    
    mX = np.empty(shape = (numSamples, tuImgSize[2], tuImgSize[0], tuImgSize[1])) #<! (N, C, H, W), valid for non square images as well
    vY = np.empty(shape = numSamples, dtype = np.int_)
    mB = np.empty(shape = (numSamples, 4))

    for ii in range(numSamples):
        mI, vLbl, mBB = GenLabeldEllipseImg(tuImgSize[:2], numObj, boxFormat = boxFormat)
        mX[ii]  = np.transpose(mI, (2, 0, 1)) #<! (C, H, W)
        vY[ii]  = vLbl[0]
        mB[ii]  = mBB[0]
    
    return mX, vY, mB

## Object Localization

The composability of _Deep Learning_ loss functions allows combining 2 tasks into a _single model_.  
_Object Localization_ is a composition of 2 tasks:

 - Classification: Identify the object class.
 - Regression: Localize the object by a _Bounding Box_ (BB).

This notebook demonstrates:
 - Generating a synthetic data set.
 - Building a model for _object localization_.
 - Training a model with a composed objective.

<br>

* <font color='brown'>(**#**)</font> In this notebook, _Object Localization_ assumes each image contains exactly one object.
* <font color='brown'>(**#**)</font> In this notebook, _Object Detection_ generalizes the task to images with no objects or with several objects.
* <font color='brown'>(**#**)</font> The motivation for a synthetic dataset is being able to run the whole training process within the notebook (Existing datasets are huge).  
  In addition, the ability to create a synthetic dataset is a useful skill.
* <font color='brown'>(**#**)</font> There are known datasets for object detection: [COCO Dataset](https://cocodataset.org), [PASCAL VOC](http://host.robots.ox.ac.uk/pascal/VOC/).  
  They also define standards for the labeling system.  
  Training a model on them is on the scale of days.
* <font color='brown'>(**#**)</font> [Object Detection Annotation Formats](https://albumentations.ai/docs/3-basic-usage/bounding-boxes-augmentations).

In [None]:
# Parameters

# Data
numSamplesTrain = 30_000
numSamplesVal   = 10_000
boxFormat       = BBoxFormat.YOLO
numCls          = len(L_CLASSES) #<! Number of classes

# Model
dropP = 0.5 #<! Dropout Layer

# Training
batchSize  = 256
numWorkers = 2 #<! Number of workers
numEpochs  = 35
λ          = 20.0 #<! Localization Loss
ϵ          = 0.1  #<! Label Smoothing

# Visualization
numImg = 3

## Generate / Load Data

The data set is synthetic.  
Each image includes a single ellipse. The color of the ellipse sets its class (`R`, `G`, `B`) and its bounding rectangle sets the bounding box.

* <font color='brown'>(**#**)</font> The label is a vector of `5`: `[Class, xCenter, yCenter, boxWidth, boxHeight]`.  
* <font color='brown'>(**#**)</font> The label is in `YOLO` format, hence it is normalized to `[0, 1]`.
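To make the `YOLO` convention concrete, the following minimal sketch maps a normalized `[xCenter, yCenter, boxWidth, boxHeight]` box to pixel corner coordinates. The helper `YoloToCorners()` is hypothetical (it is not part of the course modules). It assumes the horizontal values are normalized by the image width and the vertical values by the image height.

```python
from typing import Sequence, Tuple

def YoloToCorners( vBox: Sequence[float], imgHeight: int, imgWidth: int ) -> Tuple[float, float, float, float]:
    # Hypothetical helper (Not part of the course modules).
    # Assumes `vBox` is [xCenter, yCenter, boxWidth, boxHeight] in [0, 1].
    xCenter, yCenter, boxWidth, boxHeight = vBox

    xMin = (xCenter - boxWidth / 2) * imgWidth   #<! Left edge [Pixels]
    yMin = (yCenter - boxHeight / 2) * imgHeight #<! Top edge [Pixels]
    xMax = (xCenter + boxWidth / 2) * imgWidth   #<! Right edge [Pixels]
    yMax = (yCenter + boxHeight / 2) * imgHeight #<! Bottom edge [Pixels]

    return xMin, yMin, xMax, yMax

vBox      = [0.5, 0.25, 0.25, 0.5] #<! Normalized box
tuCorners = YoloToCorners(vBox, imgHeight = 100, imgWidth = 100)
print(f'The box corners (xMin, yMin, xMax, yMax): {tuCorners}')
```

The normalized form is independent of the image resolution, which is why the same target can be used when the image is resized.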


In [None]:
# Image Sample

mI, vY, mBB = GenLabeldEllipseImg(T_IMG_SIZE[:2], 1, boxFormat = boxFormat)
vBox = mBB[0] #<! Matrix to support multiple objects in a single image
clsIdx = vY[0]
hA = PlotBox(mI, L_CLASSES[clsIdx], vBox)

* <font color='brown'>(**#**)</font> One could use negative values for the bounding box. The model will extrapolate the object dimensions.

In [None]:
# Generate Data

tXTrain, vYTrain, mBBTrain = GenData(numSamplesTrain, T_IMG_SIZE, boxFormat = boxFormat)
tXVal,   vYVal,   mBBVal   = GenData(numSamplesVal, T_IMG_SIZE, boxFormat = boxFormat)

print(f'The training data set data shape: {tXTrain.shape}')
print(f'The training data set labels shape: {vYTrain.shape}')
print(f'The training data set box shape: {mBBTrain.shape}')
print(f'The validation data set data shape: {tXVal.shape}')
print(f'The validation data set labels shape: {vYVal.shape}')
print(f'The validation data set box shape: {mBBVal.shape}')

In [None]:
# Data Sets

dsTrain = ObjectLocalizationDataset(tXTrain, vYTrain, mBBTrain)
dsVal   = ObjectLocalizationDataset(tXVal, vYVal, mBBVal)
lClass  = dsTrain.GetLabels(uniqueCls = False)

print(f'The training data set data shape: {dsTrain._tX.shape}')
print(f'The validation data set data shape: {dsVal._tX.shape}')
print(f'The unique values of the labels: {np.unique(lClass)}')

* <font color='brown'>(**#**)</font> The `v2` transforms of TorchVision handle bounding boxes using a dedicated type: `BoundingBoxes`.
* <font color='brown'>(**#**)</font> For _data augmentation_ see:
    - [Transforming and Augmenting Images](https://pytorch.org/vision/stable/transforms.html).
    - [Getting Started with Transforms v2](https://pytorch.org/vision/stable/auto_examples/transforms/plot_transforms_getting_started.html).
    - [Transforms v2: End to End Object Detection / Segmentation Example](https://pytorch.org/vision/stable/auto_examples/transforms/plot_transforms_e2e.html).
    - [How to Write Your Own v2 Transforms](https://pytorch.org/vision/stable/auto_examples/transforms/plot_custom_transforms.html).

In [None]:
# Element of the Data Set

tX, (valY, vB) = dsTrain[0]

print(f'The features shape: {tX.shape}')
print(f'The label value: {valY}')
print(f'The bounding box value: {vB}')

* <font color='brown'>(**#**)</font> Since the labels are in the same contiguous container as the bounding box parameters, their type is `Float`.
* <font color='brown'>(**#**)</font> The bounding box is using absolute values. In practice it is commonly normalized to the image dimensions.

### Plot the Data

In [None]:
# Plot the Data

hA = PlotBox(np.transpose(tX, (1, 2, 0)), L_CLASSES[valY], vB)

In [None]:
# Histogram of Labels

hA = PlotLabelsHistogram(lClass, lClass = L_CLASSES);

### Data Loaders

This section defines the data loaders.



In [None]:
# Data Loader

# dlTrain = torch.utils.data.DataLoader(dsTrain, shuffle = True, batch_size = 1 * batchSize, num_workers = numWorkers, drop_last = True, persistent_workers = True)
# dlVal   = torch.utils.data.DataLoader(dsVal, shuffle = False, batch_size = 2 * batchSize, num_workers = numWorkers, persistent_workers = True)

# In case of errors with the workers, use the following code to disable them
dlTrain = torch.utils.data.DataLoader(dsTrain, shuffle = True, batch_size = 1 * batchSize, num_workers = 0, drop_last = True, persistent_workers = False)
dlVal   = torch.utils.data.DataLoader(dsVal, shuffle = False, batch_size = 2 * batchSize, num_workers = 0, persistent_workers = False)

In [None]:
# Iterate on the Loader
# The first batch.

tX, (vY, mB) = GetBatch(dlTrain)

print(f'The batch features dimensions: {tX.shape} with dtype {tX.dtype}')
print(f'The batch labels dimensions: {vY.shape} with dtype {vY.dtype}')
print(f'The batch bounding box dimensions: {mB.shape} with dtype {mB.dtype}')

## The Model

This section defines the model.  

* <font color='brown'>(**#**)</font> The following implementation has a model with a single output, both for the regression and the classification.
* <font color='brown'>(**#**)</font> One could create 2 different outputs (_Heads_) for each task.

### Depth Wise Separable Convolution

The Depth Wise Separable Convolution merges 2 concepts:
 - Spatial Convolutions in Groups.
 - Projection by `1x1` Convolution.

* <font color='brown'>(**#**)</font> See [Animated AI - Groups, Depthwise and Depthwise Separable Convolution (Neural Networks)](https://www.youtube.com/watch?v=vVaRhZXovbw).
* <font color='brown'>(**#**)</font> The concept was made popular by models such as EfficientNet ([EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks](https://arxiv.org/abs/1905.11946)) and [MobileNet](https://en.wikipedia.org/wiki/MobileNet).
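The saving in parameters can be estimated with simple arithmetic. The sketch below counts the weights of a single layer, without bias, for a regular convolution and for a depth wise separable one. The function names are hypothetical and only illustrate the counting. The `64 -> 64` layer with a `3x3` kernel is an arbitrary example.

```python
def CountConvParams( inChannels: int, outChannels: int, kernelSize: int ) -> int:
    # Regular convolution (No bias):
    # Each output channel has a kernel of size (inChannels, kernelSize, kernelSize).
    return outChannels * inChannels * kernelSize * kernelSize

def CountDepthWiseSeparableParams( inChannels: int, outChannels: int, kernelSize: int ) -> int:
    # Depth wise stage: A single spatial kernel per input channel (groups = inChannels).
    numDepthWise = inChannels * kernelSize * kernelSize
    # Point wise stage: A 1x1 convolution which projects inChannels -> outChannels.
    numPointWise = inChannels * outChannels
    return numDepthWise + numPointWise

numRegular   = CountConvParams(64, 64, 3)               #<! 64 * 64 * 9 = 36864
numSeparable = CountDepthWiseSeparableParams(64, 64, 3) #<! 64 * 9 + 64 * 64 = 4672

print(f'Regular convolution parameters             : {numRegular}')
print(f'Depth wise separable convolution parameters: {numSeparable}')
print(f'Ratio                                      : {numRegular / numSeparable:0.2f}')
```

The ratio approaches `kernelSize ** 2` as the number of output channels grows, since the `1x1` projection dominates the count.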

In [None]:
# Depth Wise Separable Convolutional Layer

class DepthWiseSeparableConv2D( nn.Module ):
    def __init__( self, inChannels: int, outChannels: int, kernelSize: int, stride: int = 1, padding: Union[int, str] = 0, bias: bool = True ) -> None:
        super(DepthWiseSeparableConv2D, self).__init__()

        self.oDepthWiseSeparableConv2D = nn.Sequential(
            nn.Conv2d(inChannels, inChannels, kernel_size = kernelSize, stride = stride, padding = padding, groups = inChannels, bias = bias),
            nn.Conv2d(inChannels, outChannels, kernel_size = 1, stride = 1, padding = 0, bias = bias)
        )
    
    def forward( self, tX: Tensor ) -> Tensor:
        
        tX = self.oDepthWiseSeparableConv2D(tX)
        
        return tX

In [None]:
# Model
# Model generating function.

def BuildModel( numCls: int, useDepthWiseSeparable: bool = False ) -> nn.Module:

    if useDepthWiseSeparable:
        oConvLayer = DepthWiseSeparableConv2D
    else:
        oConvLayer = nn.Conv2d

    oModel = nn.Sequential(
        nn.Identity(),
        nn.Conv2d(3,   32,  3, stride = 2, padding = 0, bias = False), nn.BatchNorm2d(32 ), nn.ReLU(),
        oConvLayer(32,  32,  3, stride = 1, padding = 1, bias = False), nn.BatchNorm2d(32 ), nn.ReLU(),
        oConvLayer(32,  32,  3, stride = 2, padding = 0, bias = False), nn.BatchNorm2d(32 ), nn.ReLU(),
        oConvLayer(32,  32,  3, stride = 1, padding = 1, bias = False), nn.BatchNorm2d(32 ), nn.ReLU(),
        oConvLayer(32,  32,  3, stride = 1, padding = 1, bias = False), nn.BatchNorm2d(32 ), nn.ReLU(),
        oConvLayer(32,  64,  3, stride = 2, padding = 1, bias = False), nn.BatchNorm2d(64 ), nn.ReLU(),
        oConvLayer(64,  64,  3, stride = 1, padding = 1, bias = False), nn.BatchNorm2d(64 ), nn.ReLU(),
        oConvLayer(64,  64,  3, stride = 1, padding = 1, bias = False), nn.BatchNorm2d(64 ), nn.ReLU(),
        oConvLayer(64,  64,  3, stride = 1, padding = 1, bias = False), nn.BatchNorm2d(64 ), nn.ReLU(),
        oConvLayer(64,  64,  3, stride = 1, padding = 1, bias = False), nn.BatchNorm2d(64 ), nn.ReLU(),
        oConvLayer(64,  64,  3, stride = 2, padding = 1, bias = False), nn.BatchNorm2d(64 ), nn.ReLU(),
        oConvLayer(64,  128, 3, stride = 1, padding = 0, bias = False), nn.BatchNorm2d(128), nn.ReLU(),
        oConvLayer(128, 256, 3, stride = 1, padding = 0, bias = False), nn.BatchNorm2d(256), nn.ReLU(),
        oConvLayer(256, 512, 2, stride = 1, padding = 0, bias = False), nn.BatchNorm2d(512), nn.ReLU(),
        nn.Conv2d(512, numCls + 4, 1, stride = 1, padding = 0, bias = True),
        nn.Flatten()
    )

    return oModel 

* <font color='red'>(**?**)</font> What's the motivation for the depth of the model (Relatively deep)?
* <font color='red'>(**?**)</font> Explain the actual operation of the last `Conv2D` layer. Can it be replaced with a `Linear` layer?
* <font color='brown'>(**#**)</font> One could set the image size to a power of 2. Then all convolution layers could use padding and a stride of 2 until the spatial size is `1x1`.
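The spatial size along the model can be traced with the standard convolution output size formula, `floor((inSize + 2 * padding - kernelSize) / stride) + 1`. The sketch below applies it to the `(kernelSize, stride, padding)` values of the layers in `BuildModel()` for a `100x100` input. The helper `ConvOutSize()` and the list `lLayers` are written for this illustration only. It assumes a dilation of 1, and that the depth wise separable layer keeps the spatial behavior of the regular layer (its `1x1` stage does not change the size).

```python
from typing import List, Tuple

def ConvOutSize( inSize: int, kernelSize: int, stride: int, padding: int ) -> int:
    # Output size of a convolution along a single spatial axis (Dilation of 1).
    return ((inSize + 2 * padding - kernelSize) // stride) + 1

# (kernelSize, stride, padding) per convolution layer, in the order of `BuildModel()`
lLayers: List[Tuple[int, int, int]] = [
    (3, 2, 0), (3, 1, 1), (3, 2, 0), (3, 1, 1), (3, 1, 1),
    (3, 2, 1), (3, 1, 1), (3, 1, 1), (3, 1, 1), (3, 1, 1),
    (3, 2, 1), (3, 1, 0), (3, 1, 0), (2, 1, 0), (1, 1, 0),
]

lSize = [100] #<! Input size (Square image)
for kernelSize, stride, padding in lLayers:
    lSize.append(ConvOutSize(lSize[-1], kernelSize, stride, padding))

print(f'The spatial size per layer: {lSize}')
```

For this input the trace ends with a spatial size of `1`, so after `nn.Flatten()` each sample is a vector of `numCls + 4` values.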

In [None]:
# Build the Model

oModel = BuildModel(len(L_CLASSES), useDepthWiseSeparable = True)

In [None]:
# Model Information
# Pay attention to the layers name.
torchinfo.summary(oModel, (batchSize, *(T_IMG_SIZE[::-1])), col_names = ['kernel_size', 'output_size', 'num_params'], device = 'cpu', row_settings = ['depth', 'var_names'])

* <font color='red'>(**?**)</font> Explain the dimensions of the last layer.
* <font color='red'>(**?**)</font> Will the model work with smaller images?
* <font color='blue'>(**!**)</font> Compare the number of parameters when using _Depth Wise Separable Convolution_ and using regular convolution layer. 

## Train the Model

This section trains the model.  

* <font color='brown'>(**#**)</font> The training loop must be adapted to the new loss function.

### Image Localization Loss

The loss is a composite of 2 loss functions:

$$\ell\left(\hat{\boldsymbol{y}},\boldsymbol{y}\right)=\lambda_{\text{MSE}}\cdot\ell_{\text{MSE}}\left(\hat{\boldsymbol{y}}_{\text{bbox}},\boldsymbol{y}_{\text{bbox}}\right)+\lambda_{\text{CE}}\cdot\ell_{\text{CE}}\left(\hat{\boldsymbol{y}}_{\text{label}},\boldsymbol{y}_{\text{label}}\right)$$

Where $\lambda_{\text{MSE}}$ and $\lambda_{\text{CE}}$ are the weights of each loss.

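* <font color='brown'>(**#**)</font> In practice a single $\lambda$ is sufficient, since only the ratio between the 2 weights sets the balance between the tasks.
* <font color='brown'>(**#**)</font> The MSE is not an optimal loss function for bounding boxes. It will be replaced by the _Log Euclidean_ loss.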
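As a worked example of the composite loss, the sketch below evaluates it for a single sample in plain Python, using a single weight on the regression term. All function names are hypothetical. Label smoothing is modeled by the common definition where the target distribution is `(1 - ϵ)` on the correct class plus `ϵ / numCls` on every class. The logits and boxes are made up values.

```python
import math
from typing import Sequence

def SmoothedCrossEntropy( vLogits: Sequence[float], clsIdx: int, smoothVal: float = 0.0 ) -> float:
    # Log Softmax (Stabilized by subtracting the maximum)
    maxVal    = max(vLogits)
    logSumExp = maxVal + math.log(sum(math.exp(val - maxVal) for val in vLogits))
    vLogP     = [val - logSumExp for val in vLogits]
    # Smoothed target distribution
    numCls = len(vLogits)
    vQ     = [((1.0 - smoothVal) * float(kk == clsIdx)) + (smoothVal / numCls) for kk in range(numCls)]
    return -sum(q * logP for q, logP in zip(vQ, vLogP))

def MeanSquaredError( vBHat: Sequence[float], vB: Sequence[float] ) -> float:
    return sum((bHat - b) ** 2 for bHat, b in zip(vBHat, vB)) / len(vB)

def LocalizationLoss( vLogits, vBHat, clsIdx, vB, regWeight: float, smoothVal: float = 0.0 ) -> float:
    # Composite loss: Weighted regression loss + Classification loss
    return (regWeight * MeanSquaredError(vBHat, vB)) + SmoothedCrossEntropy(vLogits, clsIdx, smoothVal)

vLogits = [2.0, 0.5, -1.0]         #<! Model output for the classes (Made up)
vBHat   = [0.5, 0.5, 0.25, 0.25]   #<! Predicted box (Made up)
vB      = [0.5, 0.5, 0.5, 0.5]     #<! Reference box (Made up)

regLoss = MeanSquaredError(vBHat, vB)
clsLoss = SmoothedCrossEntropy(vLogits, 0, smoothVal = 0.1)
lossVal = LocalizationLoss(vLogits, vBHat, 0, vB, regWeight = 20.0, smoothVal = 0.1)

print(f'Regression loss: {regLoss:0.4f}, Classification loss: {clsLoss:0.4f}, Total loss: {lossVal:0.4f}')
```

The box coordinates are in `[0, 1]`, so the MSE term tends to be small compared to the cross entropy term. This is one motivation for a regression weight larger than 1.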

In [None]:
# Object Localization Loss
class ObjLocLoss( nn.Module ):
    def __init__( self, numCls: int, λ: float, ϵ: float = 0.0 ) -> None:
        super(ObjLocLoss, self).__init__()

        self.numCls   = numCls
        self.λ        = λ
        self.ϵ        = ϵ
        self.oRegLoss = nn.MSELoss()
        self.oClsLoss = nn.CrossEntropyLoss(label_smoothing = ϵ)
    
    def forward( self: Self, mYHat: Tensor, tuY: Tuple[Tensor, Tensor] ) -> Tensor:

        regLoss = self.oRegLoss(mYHat[:, self.numCls:], tuY[1])
        clsLoss = self.oClsLoss(mYHat[:, :self.numCls], tuY[0])

        lossVal = (self.λ * regLoss) + clsLoss
		
        return lossVal

### Image Localization Score

The score is defined by the _IoU_ of a valid classification:

$$\text{Score}=\frac{1}{N}\sum_{i=1}^{N}\mathbb{I}\left\{ \hat{y}_{i}=y_{i}\right\} \cdot\text{IoU}\left(\hat{B}_{i},B_{i}\right)$$

Where:
- $\hat{y}_{i}$ is the predicted label
- $y_{i}$ is the correct label
- $\hat{B}_{i}$ is the predicted bounding box
- $B_{i}$ is the correct bounding box

In other words, the score is the average IoU where only samples with a correct label prediction contribute.

* <font color='red'>(**?**)</font> What are the bounds of the values of the score function?
* <font color='red'>(**?**)</font> Is a higher or a lower value better for the score?
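The score can be illustrated on a tiny batch in plain Python. The sketch below computes the IoU of boxes given as `[xCenter, yCenter, boxWidth, boxHeight]` and averages it over the samples, counting only those with a correct label. The function names and the 2 samples are hypothetical and are not part of the course modules.

```python
from typing import List, Sequence, Tuple

def CxCyWhToCorners( vBox: Sequence[float] ) -> Tuple[float, float, float, float]:
    xCenter, yCenter, boxWidth, boxHeight = vBox
    return (xCenter - boxWidth / 2, yCenter - boxHeight / 2, xCenter + boxWidth / 2, yCenter + boxHeight / 2)

def BoxIoU( vBoxA: Sequence[float], vBoxB: Sequence[float] ) -> float:
    aX1, aY1, aX2, aY2 = CxCyWhToCorners(vBoxA)
    bX1, bY1, bX2, bY2 = CxCyWhToCorners(vBoxB)
    # Intersection rectangle (Zero if the boxes do not overlap)
    interW    = max(0.0, min(aX2, bX2) - max(aX1, bX1))
    interH    = max(0.0, min(aY2, bY2) - max(aY1, bY1))
    interArea = interW * interH
    unionArea = ((aX2 - aX1) * (aY2 - aY1)) + ((bX2 - bX1) * (bY2 - bY1)) - interArea
    return (interArea / unionArea) if (unionArea > 0.0) else 0.0

def LocalizationScore( lYHat: List[int], lY: List[int], lBHat: List[Sequence[float]], lB: List[Sequence[float]] ) -> float:
    # Average IoU where only correctly classified samples contribute
    sumIoU = sum(BoxIoU(vBHat, vB) for yHat, y, vBHat, vB in zip(lYHat, lY, lBHat, lB) if yHat == y)
    return sumIoU / len(lY)

vBoxRef   = [0.5, 0.5, 0.5, 0.5]   #<! Reference box
vBoxSmall = [0.5, 0.5, 0.25, 0.25] #<! Concentric box with half the width and height

# Sample 0: Correct label, partial overlap. Sample 1: Wrong label, perfect overlap.
valScore = LocalizationScore([0, 2], [0, 1], [vBoxSmall, vBoxRef], [vBoxRef, vBoxRef])
print(f'IoU of the first sample: {BoxIoU(vBoxSmall, vBoxRef)}')
print(f'The score of the batch : {valScore}')
```

The second sample has a perfect box yet contributes nothing, since its label is wrong.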

In [None]:
# Object Localization Score
class ObjLocScore( nn.Module ):
    def __init__( self, numCls: int ) -> None:
        super(ObjLocScore, self).__init__()

        self.numCls = numCls
    
    def forward( self: Self, mYHat: Tensor, tuY: Tuple[Tensor, Tensor] ) -> Tensor:

        batchSize = mYHat.shape[0]
        
        vY, mBox = tuY[0], tuY[1]
        vIoU = torch.diag(torchvision.ops.box_iou(torchvision.ops.box_convert(mYHat[:, self.numCls:], 'cxcywh', 'xyxy'), torchvision.ops.box_convert(mBox, 'cxcywh', 'xyxy')))
        vCor = (vY == torch.argmax(mYHat[:, :self.numCls], dim = 1)).to(torch.float32) #<! Correct labels

        # valIoU      = torch.mean(vIoU).item()
        # valAcc      = torch.mean(vCor).item()
        valScore    = torch.inner(vIoU, vCor) / batchSize #<! Only correct predictions contribute to the score
		
        return valScore

In [None]:
# Run Device

runDevice = torch.device('cuda:0' if torch.cuda.is_available() else 'cpu') #<! The 1st CUDA device

In [None]:
# Loss and Score Function

hL = ObjLocLoss(numCls = numCls, λ = λ, ϵ = ϵ)
hS = ObjLocScore(numCls = numCls)

hL = hL.to(runDevice)
hS = hS.to(runDevice)

In [None]:
# Training Loop

oModel = oModel.to(runDevice)
oOpt = torch.optim.AdamW(oModel.parameters(), lr = 1e-5, betas = (0.9, 0.99), weight_decay = 1e-5) #<! Define optimizer
oSch = torch.optim.lr_scheduler.OneCycleLR(oOpt, max_lr = 5e-4, total_steps = numEpochs)
_, lTrainLoss, lTrainScore, lValLoss, lValScore, lLearnRate = TrainModel(oModel, dlTrain, dlVal, oOpt, numEpochs, hL, hS, oSch = oSch)

In [None]:
# Plot Training Phase

hF, vHa = plt.subplots(nrows = 1, ncols = 3, figsize = (12, 5))
vHa = np.ravel(vHa)

hA = vHa[0]
hA.plot(lTrainLoss, lw = 2, label = 'Train')
hA.plot(lValLoss, lw = 2, label = 'Validation')
hA.set_title(f'Object Localization Loss (λ = {λ:0.1f})')
hA.set_xlabel('Epoch')
hA.set_ylabel('Loss')
hA.legend()

hA = vHa[1]
hA.plot(lTrainScore, lw = 2, label = 'Train')
hA.plot(lValScore, lw = 2, label = 'Validation')
hA.set_title('Object Localization Score')
hA.set_xlabel('Epoch')
hA.set_ylabel('Score')
hA.legend()

hA = vHa[2]
hA.plot(lLearnRate, lw = 2)
hA.set_title('Learn Rate Scheduler')
hA.set_xlabel('Epoch')
hA.set_ylabel('Learn Rate');

In [None]:
# Plot Prediction
# TODO: Check classification

rndIdx = np.random.randint(numSamplesVal)

tX, (valY, vB) = dsVal[rndIdx]

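oModel.eval() #<! Inference mode (Batch Norm uses the running statistics)
with torch.no_grad():
    tXIn = torch.tensor(tX) #<! Keep `tX` intact for the plot below
    tXIn = torch.unsqueeze(tXIn, 0) #<! Add the batch dimension
    tXIn = tXIn.to(runDevice)
    mYHat = oModel(tXIn).detach().cpu().numpy()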

vYHat   = mYHat[0]
valYHat = np.argmax(vYHat[:numCls])
vBHat   = vYHat[numCls:]

hA = PlotBox(np.transpose(tX, (1, 2, 0)), L_CLASSES[valY], vB)
hA = PlotBBox(hA, L_CLASSES[valYHat], vBHat)

* <font color='red'>(**?**)</font> What would be the results if the generated data had more small ellipses?
* <font color='green'>(**@**)</font> Display the _accuracy_ and _IoU_ scores and _MSE_ and _CE_ loss over the epochs.   
  It will require updating the Loss, Score classes and the training function.