# ML-Assisted Data Annotation toolkit
In this demo, we show a use case of our ML-assisted data annotation pipeline, specifically for the 2D object detection task.

## Pipeline

Figure 1 provides an overview of an annotation pipeline that the toolkit supports. The pipeline takes two inputs, runs the following six major steps in a loop, and produces labelled data as its output:

- **Unlabelled data (input):** The toolkit allows the user to load unlabelled data in supported data formats.

- **Selected pre-trained model (input):** The user selects a pre-trained model from an existing model pool or supplies their own proprietary pre-trained model.

- **Select a subset of data:** A subset of the dataset is selected for the current round of labelling. The selection order can be random, follow the user's preference (e.g., to prioritise images with new labels), or follow an active-learning prioritisation.

- **Run the ML model to label the data:** The ML model processes the subset of data and produces pre-annotation.  

- **Create/Refine/Review labels by user:** The user can manually create, refine, or review the produced labels resulting in the final labels for the current subset.  

- **Evaluate the model:** The ML model performance is evaluated on this subset by comparing the produced labels with the manually refined ones.  

- **Check the performance:** The performance score of the model is checked against a threshold. If the score surpasses the threshold, the next round of labelling starts again from step 1 (subset selection); otherwise, the model is re-trained in the following step.

- **Retrain the model:** The model is re-trained on this subset using the hyper-parameters defined by the user. The next round of labelling will be initiated when the training is finished. 

- **Labelled data:** The toolkit allows exporting labelled data in supported data formats.

![pipeline](assets/pipeline.png)

*Figure 1. An overview of the ML-assisted data annotation pipeline.*
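
The control flow of the pipeline above can be sketched as a plain loop. This is a minimal illustration rather than the toolkit's implementation: `run_pipeline` and the callables `select`, `pre_annotate`, `refine`, `evaluate` and `retrain` are hypothetical placeholders for the steps listed above.

```python
from typing import Callable, List, Sequence, Tuple


def run_pipeline(
    unlabelled: Sequence[str],
    subset_size: int,
    threshold: float,
    select: Callable[[List[str], int], List[str]],
    pre_annotate: Callable[[List[str]], list],
    refine: Callable[[List[str], list], list],
    evaluate: Callable[[list, list], float],
    retrain: Callable[[List[str], list], None],
) -> Tuple[list, int]:
    """Hypothetical sketch of the annotation loop.

    Returns the (item, label) pairs and the number of re-training rounds.
    """
    remaining = list(unlabelled)
    labelled = []
    n_retrain = 0
    while remaining:
        subset = select(remaining, subset_size)        # step 1: select a subset
        remaining = [x for x in remaining if x not in subset]
        proposals = pre_annotate(subset)                # step 2: model pre-annotation
        final = refine(subset, proposals)               # step 3: user refines labels
        score = evaluate(proposals, final)              # step 4: evaluate the model
        labelled.extend(zip(subset, final))
        if score <= threshold:                          # step 5: check the performance
            retrain(subset, final)                      # step 6: re-train the model
            n_retrain += 1
    return labelled, n_retrain
```

Passing the steps in as callables keeps the loop independent of any particular model or labelling interface.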

### Importing the packages

In [1]:
from utils.utils import non_max_suppression, bbox_iou_numpy, compute_ap, Visualizer, weights_init_normal, draw_bbox, convert_target_to_detection
from utils.datasets import Image2DAnnotationDataset
from torch.utils.data import Subset, ConcatDataset
from collections import defaultdict
import torch
import pickle
from random import sample
import os
import argparse
import tqdm
import numpy as np
import random
from models import Darknet


### Subset selection

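The next code snippet defines a function that selects a subset of the data using one of two query methods: random (`query_mode='random'`) or confidence-based (`query_mode='conf'`), which picks the images with the lowest mean detection confidence.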

In [2]:
def subset_selection(model, train_dataset, subset_size, query_mode, opt):
    model.eval()
    selection_size = min(len(train_dataset), subset_size)
    if query_mode == 'random':
        subset_indices = np.random.choice(np.arange(len(train_dataset)), selection_size, replace=False)
        
    elif query_mode == 'conf':
        sample_scores = np.ones(len(train_dataset), dtype=np.float32)
        dataloader = torch.utils.data.DataLoader(train_dataset, batch_size=opt.batch_size, shuffle=False, num_workers=opt.n_cpu)
        output_list = []
        with torch.no_grad():
            for batch_i, (_, imgs, _) in enumerate(tqdm.tqdm(dataloader, desc="Selecting the most informative samples")):
                imgs = imgs.to('cuda')
                outputs = model(imgs)
                outputs = non_max_suppression(outputs, 80, conf_thres=opt.conf_thres, nms_thres=opt.nms_thres)
                output_list.extend(outputs)
        for i, output in enumerate(output_list):
            if  output is not None:
                box_scores = output[:, 4].cpu().numpy()
                sample_scores[i] = box_scores.mean()
        subset_indices = np.argsort(sample_scores)[:selection_size]
    labelled_sub_dataset = Subset(train_dataset, subset_indices)

    return labelled_sub_dataset, subset_indices
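
The confidence-based branch can be illustrated without a model. The sketch below assumes each image already has a list of detection confidences (the hypothetical `confidences` argument). Like the function above, it scores an image by its mean box confidence, leaves images with no detections at 1.0, and returns the indices with the lowest scores.

```python
from typing import List, Sequence


def select_least_confident(confidences: Sequence[Sequence[float]], subset_size: int) -> List[int]:
    """Return indices of the images with the lowest mean detection confidence."""
    scores = []
    for boxes in confidences:
        # Images without detections keep a score of 1.0, mirroring the default above.
        scores.append(sum(boxes) / len(boxes) if boxes else 1.0)
    order = sorted(range(len(scores)), key=lambda i: scores[i])  # ascending, stable
    return order[: min(subset_size, len(scores))]


picked = select_least_confident([[0.9, 0.95], [0.6, 0.5], [], [0.85]], subset_size=2)
print(picked)  # [1, 3]
```

One consequence worth noting: images on which the model detects nothing are treated as the most confident and are therefore selected last.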

### Visualisation

The following code snippet defines a function for visualising an image together with its predicted annotation and, optionally, its ground-truth annotation.

In [3]:
def visualise_annotation(model, img_path, img, target, classes_to_labels, opt, labels, img_size=416, resize_tuple=None, show_target=False):
    model.eval()
    img = img.to('cuda').unsqueeze(dim=0)
    with torch.no_grad():
        output = model(img)
        output = non_max_suppression(output, 80, classes_to_labels, conf_thres=opt.conf_thres, nms_thres=opt.nms_thres)
    if show_target:
        output_ = convert_target_to_detection(target, img_size)
        if output_ is not None:
            draw_bbox(img_path, output_, labels, img_size, resize_tuple)
    if output[0] is not None:
        draw_bbox(img_path, output[0], labels, img_size, resize_tuple)

### Evaluation
The next code snippet defines a function for evaluating the model based on mean average precision (mAP).

In [4]:
def evaluate(model, data_loader, num_classes, opt, best_mAP, classes_to_labels, labels_to_classes, avg_precision, labels, vis, iteration, is_val=True):
    model.eval()
    all_detections = []
    all_annotations = []
    for batch_i, (imgs_path, imgs, targets) in enumerate(tqdm.tqdm(data_loader, desc="mAP_evaluation")):

        imgs = imgs.to('cuda')
        with torch.no_grad():
            outputs = model(imgs)
            outputs = non_max_suppression(outputs, 80, classes_to_labels, conf_thres=opt.conf_thres, nms_thres=opt.nms_thres)

        for output, annotations in zip(outputs, targets):
            all_detections.append([np.array([]) for _ in range(num_classes)])
            if output is not None:
                # Get predicted boxes, confidence scores and labels
                pred_boxes = output[:, :5].cpu().numpy()
                scores = output[:, 4].cpu().numpy()
                pred_labels = output[:, -1].cpu().numpy()

                # Order by confidence (ascending here; detections are re-sorted by descending score before AP is computed)
                sort_i = np.argsort(scores)
                pred_labels = pred_labels[sort_i]
                pred_boxes = pred_boxes[sort_i]

                for label in range(num_classes):
                    all_detections[-1][label] = pred_boxes[pred_labels == label]

            all_annotations.append([np.array([]) for _ in range(num_classes)])
            if any(annotations[:, -1] > 0):

                annotation_labels = annotations[annotations[:, -1] > 0, 0].numpy()
                _annotation_boxes = annotations[annotations[:, -1] > 0, 1:]

                # Reformat to x1, y1, x2, y2 and rescale to image dimensions
                annotation_boxes = np.empty_like(_annotation_boxes)
                annotation_boxes[:, 0] = _annotation_boxes[:, 0] - _annotation_boxes[:, 2] / 2
                annotation_boxes[:, 1] = _annotation_boxes[:, 1] - _annotation_boxes[:, 3] / 2
                annotation_boxes[:, 2] = _annotation_boxes[:, 0] + _annotation_boxes[:, 2] / 2
                annotation_boxes[:, 3] = _annotation_boxes[:, 1] + _annotation_boxes[:, 3] / 2
                annotation_boxes *= opt.img_size

                for label in range(num_classes):
                    all_annotations[-1][label] = annotation_boxes[annotation_labels == label, :]

    average_precisions = {}
    for label in range(num_classes):
        if labels_to_classes[label] != -1:
            true_positives = []
            scores = []
            num_annotations = 0

            for i in range(len(all_annotations)):
                detections = all_detections[i][label]
                annotations = all_annotations[i][label]

                num_annotations += annotations.shape[0]
                detected_annotations = []

                for *bbox, score in detections:
                    scores.append(score)

                    if annotations.shape[0] == 0:
                        true_positives.append(0)
                        continue

                    overlaps = bbox_iou_numpy(np.expand_dims(bbox, axis=0), annotations)
                    assigned_annotation = np.argmax(overlaps, axis=1)
                    max_overlap = overlaps[0, assigned_annotation]

                    if max_overlap >= opt.iou_thres and assigned_annotation not in detected_annotations:
                        true_positives.append(1)
                        detected_annotations.append(assigned_annotation)
                    else:
                        true_positives.append(0)

            # no annotations -> AP for this class is 0
            if num_annotations == 0:
                average_precisions[label] = 0
                continue

            true_positives = np.array(true_positives)
            false_positives = np.ones_like(true_positives) - true_positives
            # sort by score
            indices = np.argsort(-np.array(scores))
            false_positives = false_positives[indices]
            true_positives = true_positives[indices]
            # compute false positives and true positives
            false_positives = np.cumsum(false_positives)
            true_positives = np.cumsum(true_positives)

            # compute recall and precision
            recall = true_positives / num_annotations
            precision = true_positives / np.maximum(true_positives + false_positives, np.finfo(np.float64).eps)

            # compute average precision
            average_precision = compute_ap(recall, precision)
            average_precisions[label] = average_precision

    mAP = np.mean(list(average_precisions.values()))

    if(mAP > best_mAP):
        best_mAP = mAP
        model.save_weights("%s/kitti_best.weights" % (opt.checkpoint_dir))
        print("New Best AP appear !!! %f" % best_mAP)

    batch_i += 1
    tag = 'val' if is_val else 'subset'
    for k , v in average_precisions.items():
        avg_precision[tag +'_mAP_' + labels[k]].append(v) 
    avg_precision[tag + '_mAP'].append(mAP)
    print_str = 'mAP on validation' if is_val else 'mAP on subset'
    print(print_str)
    for k , v in avg_precision.items():
        print('Iteration:', iteration, k,': ', v[-1])
        vis.plot(k, v[-1], iteration)

    with open(os.path.join(opt.checkpoint_dir, opt.query_mode + '_' + tag + '_avg_p_dict.pkl'), 'wb') as f:
        pickle.dump(avg_precision, f)

    return best_mAP, avg_precision
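
To make the precision and recall bookkeeping concrete, the sketch below computes the average precision for one class from per-detection true-positive flags. The `average_precision` helper is an illustrative assumption: it uses all-point interpolation (the area under the monotone precision envelope), which may differ from what `compute_ap` in `utils` does.

```python
from typing import List, Sequence


def average_precision(scores: Sequence[float], is_tp: Sequence[int], num_annotations: int) -> float:
    """AP with all-point interpolation; an assumed stand-in for compute_ap."""
    if num_annotations == 0:
        return 0.0
    order = sorted(range(len(scores)), key=lambda i: -scores[i])  # descending score
    tp_cum, fp_cum = 0, 0
    recall: List[float] = [0.0]
    precision: List[float] = [0.0]
    for i in order:
        tp_cum += is_tp[i]
        fp_cum += 1 - is_tp[i]
        recall.append(tp_cum / num_annotations)
        precision.append(tp_cum / (tp_cum + fp_cum))
    recall.append(1.0)
    precision.append(0.0)
    # Build the precision envelope: non-increasing as recall grows.
    for k in range(len(precision) - 2, -1, -1):
        precision[k] = max(precision[k], precision[k + 1])
    # Sum the rectangle areas wherever recall changes.
    return sum(
        (recall[k + 1] - recall[k]) * precision[k + 1]
        for k in range(len(recall) - 1)
    )


ap = average_precision(scores=[0.9, 0.8, 0.7], is_tp=[1, 0, 1], num_annotations=2)
print(round(ap, 4))  # 0.8333
```

In this example the false positive ranked second lowers the precision at full recall to 2/3, so the AP is 0.5 × 1 + 0.5 × 2/3 = 5/6.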

### Train
The next code snippet defines a function for re-training the model. With `is_training=False`, the same function can also be used to evaluate the model according to the loss function.

In [5]:
def calc_loss(model, dataloader, epochs, optimizer, vis, total_steps, checkpoint_dir, iteration, labels_to_classes =None, is_training=True):
    model.train(is_training)
    freeze_backbone = 1 if iteration == 0 or iteration==1 else 0
    accumulated_batches = 4
    losses_list = defaultdict(list)
    for epoch in range(epochs if is_training else 1):
        # In the first two iterations, freeze the darknet53.conv.74 backbone (layers < 75) for the first 60 epochs
        if freeze_backbone and is_training:
            if epoch < 60:
                for i, (name, p) in enumerate(model.named_parameters()):
                    if int(name.split('.')[1]) < 75:  # if layer < 75
                        p.requires_grad = False
            elif epoch >= 60:
                for i, (name, p) in enumerate(model.named_parameters()):
                    if int(name.split('.')[1]) < 75:  # if layer < 75
                        p.requires_grad = True
        if is_training:                
            optimizer.zero_grad()   

        for batch_i, (_, imgs, targets) in enumerate(tqdm.tqdm(dataloader, desc='training' if is_training else 'Loss_evaluation')): 
            imgs = imgs.to('cuda')
            targets = targets.to('cuda')
            if labels_to_classes is not None:
                no_annotation_ind = targets.sum(-1) != 0.0
                tensor_labels_to_class = torch.tensor(labels_to_classes).to('cuda')
                x, y = torch.nonzero(no_annotation_ind, as_tuple=True)
                targets[x, y, 0] = tensor_labels_to_class[targets[x, y, 0].long()].double()

            if is_training:
                loss = model(imgs, targets)
                loss.backward()
                if ((batch_i + 1) % accumulated_batches == 0) or (batch_i == len(dataloader) - 1):
                    optimizer.step()
                    optimizer.zero_grad()
                total_steps += 1   
                
                if (batch_i+1) % 1 == 0:
                    for tag, value in model.losses.items():
                        vis.plot('losses_' + tag, value, total_steps)
                    vis.plot('total_loss', loss.item(), total_steps)
                    print(
                        "[Epoch %d/%d, Batch %d/%d] [Losses: x %f, y %f, w %f, h %f, conf %f, cls %f, total %f, recall: %.5f, precision: %.5f]"
                        % (
                            epoch,
                            epochs,
                            batch_i,
                            len(dataloader),
                            model.losses["x"],
                            model.losses["y"],
                            model.losses["w"],
                            model.losses["h"],
                            model.losses["conf"],
                            model.losses["cls"],
                            loss.item(),
                            model.losses["recall"],
                            model.losses["precision"],
                        )
                    )
            else:
                with torch.no_grad():
                    loss = model(imgs, targets)
                    for k , v in model.losses.items():
                        losses_list[k].append(v)
                    losses_list['total_loss'].append(loss.item())
            # model.seen += imgs.size(0)
        losses_list = None if is_training else {k: np.array(v).mean() for k , v in losses_list.items()}   
    return total_steps, losses_list
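
The function above accumulates gradients over `accumulated_batches` mini-batches before each optimiser step, and it also steps on the last batch of an epoch so that no gradients are left over. The pure-Python sketch below only reproduces that schedule; `optimizer_step_batches` is a hypothetical helper, not part of the toolkit.

```python
from typing import List


def optimizer_step_batches(num_batches: int, accumulated_batches: int = 4) -> List[int]:
    """Return the batch indices after which optimizer.step() would be called."""
    steps = []
    for batch_i in range(num_batches):
        is_full_group = (batch_i + 1) % accumulated_batches == 0
        is_last_batch = batch_i == num_batches - 1
        if is_full_group or is_last_batch:
            steps.append(batch_i)
    return steps


print(optimizer_step_batches(7))  # [3, 6]
print(optimizer_step_batches(8))  # [3, 7]
```

With a batch size of 8 and `accumulated_batches = 4`, a full group covers 32 images; the final group of an epoch may be smaller.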


### Main
The next code snippet uses the functions defined above to run the iterations of ML-assisted data annotation.

First, we set the parameters regarding the pre-trained model, dataset for the annotation, and training hyper-parameters.

In [6]:
###### Argparse class
class MargParse():
    def __init__(self, opt):
        for k , v in opt.items():
            setattr(self, k, v)
# Initialise Parameters
opt = {}
####### dataset & evaluation settings ############################################
opt['model_name'] = "yolov3-coco" # model name; resolves to config/<model_name>.cfg and checkpoints/<model_name>.weights
opt['data_dir'] = "./data/kitti_tiny/training" # path to the data directory
opt['data_format'] = "kitti" # data format for loading data; choose from ['kitti', 'coco', 'nuimage']
opt['iou_thres'] = 0.5 # IoU threshold required to qualify as detected
opt['conf_thres'] = 0.8 # object confidence threshold
opt['nms_thres'] = 0.4 # IoU threshold for non-maximum suppression
####### training settings ########################################
opt['epochs'] = 3 # Number of epochs
opt['batch_size'] = 8 # size of each image batch
opt['subset_size'] = 25 # size of the subset
opt['performance_thres'] = 0.5 # performance threshold for continuing the model refinement
opt['img_size'] = 416 # size of each image dimension
opt['checkpoint_dir'] = 'checkpoints' # directory where model checkpoints are saved
opt['query_mode'] = 'random' # mode of active learning subset selection
opt['exp_name'] = 'random'  # name of this experiment
####### general settings
opt['use_cuda'] = True ## use cuda for processings
opt['visualise_detection'] = False ## visualise detection for debugging
opt['n_cpu'] = 4 # number of cpu threads to use during batch generation

# Wrap the dict so that settings are accessible as attributes (e.g. opt.img_size)
opt = MargParse(opt)
print(opt)
img_size = opt.img_size


<__main__.MargParse object at 0x000001876EEC8D48>
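
As the output above shows, printing a `MargParse` instance reveals only its memory address. One possible alternative, assuming nothing beyond the standard library, is `types.SimpleNamespace`, which offers the same attribute access plus a readable `repr`; `vars()` returns the settings as a dict. The two settings below are copied from the cell above, and `settings` is an illustrative name.

```python
from types import SimpleNamespace

# Two of the settings from the cell above, wrapped in a namespace.
settings = SimpleNamespace(**{"img_size": 416, "query_mode": "random"})

print(settings)        # namespace(img_size=416, query_mode='random')
print(vars(settings))  # {'img_size': 416, 'query_mode': 'random'}
```

`vars(opt)` works in the same way on the `MargParse` object, since its settings are ordinary instance attributes.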


Next, we fix the random seed and instantiate the model.

In [7]:
seed = 0
random.seed(seed)
np.random.seed(seed)
torch.manual_seed(seed)
torch.backends.cudnn.deterministic = True
torch.backends.cudnn.benchmark = False
cfg = f'config/{opt.model_name}.cfg'
weights_path = f'checkpoints/{opt.model_name}.weights'
###########################  instantiating the model and loading the pre-trained weights
model = Darknet(cfg, img_size=img_size)
classes = model.hyperparams['classes'].split(',')
model.apply(weights_init_normal)
model.load_weights(weights_path)
# resize_tuple = (1224, 370)
resize_tuple = None
########################





Here, we list the classes of the pre-trained model. The user must define their own labels and map each label to one of these classes, or to -1 if no class matches.

In [8]:
classes_dict = {c:i for i, c in enumerate(classes)}
print('Classes of the pretrained model:\n',classes)
###### The user should define their own labels and the labels-to-classes map. If a label doesn't match any class, put -1 in the list.
labels = ['Car','Truck', 'Pedestrian', 'static_object']
labels_to_classes = ['car', 'truck', 'person', -1]
#######################################################################
classes_to_labels = -1 * np.ones(len(classes_dict))

for i, l in enumerate(labels_to_classes):
    if l != -1:
        labels_to_classes[i] = classes_dict[l]
        classes_to_labels[classes_dict[l]] = i

num_labels = len(labels)


Classes of the pretrained model:
 ['person', 'bicycle', 'car', 'motorbike', 'aeroplane', 'bus', 'train', 'truck', 'boat', 'traffic_light', 'fire hydrant', 'stop_sign', 'parking_meter', 'bench', 'bird', 'cat', 'dog', 'horse', 'sheep', 'cow', 'elephant', 'bear', 'zebra', 'giraffe', 'backpack', 'umbrella', 'handbag', 'tie', 'suitcase', 'frisbee', 'skis', 'snowboard', 'sports ball', 'kite', 'baseball bat', 'baseball glove', 'skateboard', 'surfboard', 'tennis_racket', 'bottle', 'wine glass', 'cup', 'fork', 'knife', 'spoon', 'bowl', 'banana', 'apple', 'sandwich', 'orange', 'broccoli', 'carrot', 'hot dog', 'pizza', 'donut', 'cake', 'chair', 'sofa', 'pottedplant', 'bed', 'diningtable', 'toilet', 'tvmonitor', 'laptop', 'mouse', 'remote', 'keyboard', 'cell phone', 'microwave', 'oven', 'toaster', 'sink', 'refrigerator', 'book', 'clock', 'vase', 'scissors', 'teddy bear', 'hair drier', 'toothbrush']
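
To see what the two maps contain after this cell, the sketch below repeats the same logic on a shortened class list. The three-class list is a made-up stand-in for the 80 classes printed above, and a plain list replaces the NumPy array.

```python
classes = ["person", "bicycle", "car"]          # shortened, illustrative class list
labels = ["Car", "Pedestrian", "static_object"]
labels_to_classes = ["car", "person", -1]       # -1: no matching pre-trained class

classes_dict = {c: i for i, c in enumerate(classes)}
classes_to_labels = [-1] * len(classes)

for i, name in enumerate(labels_to_classes):
    if name != -1:
        labels_to_classes[i] = classes_dict[name]   # label index -> class index
        classes_to_labels[classes_dict[name]] = i   # class index -> label index

print(labels_to_classes)  # [2, 0, -1]
print(classes_to_labels)  # [1, -1, 0]
```

Classes that no label maps to (here `bicycle`) keep the value -1 in `classes_to_labels`.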


In [9]:



############################ Instantiating visualiser
opt.checkpoint_dir = opt.checkpoint_dir + '/' + opt.exp_name
vis = Visualizer(opt.checkpoint_dir)
os.makedirs(opt.checkpoint_dir, exist_ok=True)
############################ Instantiating and loading the model
cuda = torch.cuda.is_available() and opt.use_cuda
if cuda:
    model = model.cuda()
    classes_to_labels = torch.from_numpy(classes_to_labels).to('cuda')
    print("CUDA is ready")

unlabelled_set = Image2DAnnotationDataset(opt.data_dir, labels, labels_to_classes, opt.data_format, img_size, resize_tuple)
optimizer = torch.optim.Adam(filter(lambda p: p.requires_grad, model.parameters()))

val_best_mAP = 0.0
subset_best_mAP = 0.0

subset_size = opt.subset_size
print("start training")

#### Start the annotation ... 
total_steps = 0
random_ordered_indices = np.arange(len(unlabelled_set))
random.shuffle(random_ordered_indices)
len_val_set = int(0.2 * len(unlabelled_set))
val_set = Subset(unlabelled_set, random_ordered_indices[ :len_val_set])
val_loader = torch.utils.data.DataLoader(val_set, batch_size=opt.batch_size, shuffle=False, num_workers=opt.n_cpu)
unlabelled_set = Subset(unlabelled_set, random_ordered_indices[len_val_set:])

val_avg_p_dict = defaultdict(list)
subset_avg_p_dict = defaultdict(list)
len_test_dataset = len(unlabelled_set)
labelled_subset_list = []
avg_losses_list = defaultdict(list)
for i in tqdm.tqdm(range(int(np.ceil(len_test_dataset / subset_size))), desc='model refining iterations'):
    ################### Subset selection ###################################
    labelled_sub_dataset, subset_indices = subset_selection(model, unlabelled_set, subset_size, opt.query_mode, opt)
    
    remaining_indices = set(np.arange(len(unlabelled_set))).difference(subset_indices)
    unlabelled_set = Subset(unlabelled_set, list(remaining_indices))
    ################### Visualise Detection #################################
    # np.random.randint(0, len(labelled_sub_dataset))
    if opt.visualise_detection:
        img_path, img, target = unlabelled_set[0]
        visualise_annotation(model, img_path, img, target,classes_to_labels, opt, labels, img_size, resize_tuple, show_target=True)
    # ################### Evaluation ########################################
    labelled_sub_set_loader = torch.utils.data.DataLoader(labelled_sub_dataset, batch_size=opt.batch_size, shuffle=False, num_workers=opt.n_cpu)
    ### Calculate MAP
    val_best_mAP, val_avg_p_dict = evaluate(model, val_loader, num_labels, opt, val_best_mAP, classes_to_labels, labels_to_classes, val_avg_p_dict, labels, vis, i, is_val=True)

    subset_best_mAP, subset_avg_p_dict = evaluate(model, labelled_sub_set_loader, num_labels, opt, subset_best_mAP, classes_to_labels, labels_to_classes, subset_avg_p_dict, labels, vis, i, is_val=False)

    # _, losses_list = calc_loss(model, val_loader, opt.epochs, optimizer, vis, total_steps, opt.checkpoint_dir, i, labels_to_classes, is_training=False)
    # for k , v in losses_list.items():
    #     avg_losses_list[k].append(v)
    # for tag, loss in avg_losses_list.items():
    #     vis.plot('losses_' + tag, loss[-1], i)
    
    ###### check performance #######################################################
    if val_best_mAP > opt.performance_thres:
        break
    ##################### Training #############################################
    labelled_subset_list.append(labelled_sub_dataset)
    train_set = ConcatDataset(labelled_subset_list)
    train_loader = torch.utils.data.DataLoader(train_set, batch_size=opt.batch_size, shuffle=True, num_workers=opt.n_cpu)
    total_steps, _ = calc_loss(model, train_loader, opt.epochs, optimizer, vis, total_steps, opt.checkpoint_dir, i, labels_to_classes)
            
    ################### Saving the Model ####################################
    model.save_weights("%s/best.weights" % (opt.checkpoint_dir))
    
### Calculate mAP after the last training
_, val_avg_p_dict = evaluate(model, val_loader, num_labels, opt, val_best_mAP, classes_to_labels, labels_to_classes, val_avg_p_dict, labels, vis, i, is_val=True)

CUDA is ready
start training


mAP_evaluation: 100%|██████████| 2/2 [00:05<00:00,  2.82s/it]


New Best AP appear !!! 0.429522
mAP on validation
Iteration: 0 val_mAP_Car :  0.4804861111111112
Iteration: 0 val_mAP_Truck :  0.4444444444444444
Iteration: 0 val_mAP_Pedestrian :  0.36363636363636365
Iteration: 0 val_mAP :  0.42952230639730643


mAP_evaluation: 100%|██████████| 4/4 [00:03<00:00,  1.18it/s]


New Best AP appear !!! 0.473806
mAP on subset
Iteration: 0 subset_mAP_Car :  0.5785616963869359
Iteration: 0 subset_mAP_Truck :  0.5928571428571427
Iteration: 0 subset_mAP_Pedestrian :  0.25
Iteration: 0 subset_mAP :  0.47380627974802625




[Epoch 0/3, Batch 0/4] [Losses: x 0.213886, y 0.099815, w 5.332868, h 2.312359, conf 18.083305, cls 1.375719, total 27.417952, recall: 0.26667, precision: 0.39444]




[Epoch 0/3, Batch 1/4] [Losses: x 0.252892, y 0.086150, w 5.951696, h 2.691932, conf 20.822782, cls 1.376088, total 31.181541, recall: 0.30108, precision: 0.43684]




[Epoch 0/3, Batch 2/4] [Losses: x 0.224635, y 0.155890, w 6.150463, h 3.010690, conf 18.333493, cls 1.376956, total 29.252127, recall: 0.22807, precision: 0.23864]
[Epoch 0/3, Batch 3/4] [Losses: x 0.112565, y 0.137960, w 5.379817, h 3.041387, conf 18.146183, cls 10.705959, total 37.523869, recall: 0.40000, precision: 0.47619]


training: 100%|██████████| 4/4 [00:04<00:00,  1.10s/it]


[Epoch 1/3, Batch 0/4] [Losses: x 0.216084, y 0.112215, w 0.992746, h 1.702456, conf 0.893667, cls 1.372383, total 5.289552, recall: 0.66667, precision: 0.12229]




[Epoch 1/3, Batch 1/4] [Losses: x 0.170177, y 0.091810, w 0.877093, h 1.749662, conf 0.953945, cls 1.358816, total 5.201501, recall: 0.65714, precision: 0.12943]




[Epoch 1/3, Batch 2/4] [Losses: x 0.242648, y 0.068673, w 0.932186, h 1.348275, conf 0.642783, cls 1.397177, total 4.631743, recall: 0.63333, precision: 0.11547]
[Epoch 1/3, Batch 3/4] [Losses: x 0.221220, y 0.080360, w 0.274331, h 1.027227, conf 1.118091, cls 10.941295, total 13.662525, recall: 0.70370, precision: 0.20667]


training: 100%|██████████| 4/4 [00:03<00:00,  1.06it/s]


[Epoch 2/3, Batch 0/4] [Losses: x 0.198397, y 0.088250, w 8.994543, h 12.400980, conf 0.902146, cls 1.315388, total 23.899706, recall: 0.53659, precision: 0.03627]




[Epoch 2/3, Batch 1/4] [Losses: x 0.216735, y 0.087507, w 9.049634, h 10.557042, conf 1.073534, cls 1.299259, total 22.283710, recall: 0.55556, precision: 0.01597]




[Epoch 2/3, Batch 2/4] [Losses: x 0.266999, y 0.105179, w 9.679086, h 12.457331, conf 0.834132, cls 1.301159, total 24.643887, recall: 0.49630, precision: 0.03279]
[Epoch 2/3, Batch 3/4] [Losses: x 0.149421, y 0.110406, w 9.338990, h 12.038698, conf 0.973941, cls 10.339824, total 32.951283, recall: 0.46667, precision: 0.02626]


training: 100%|██████████| 4/4 [00:03<00:00,  1.02it/s]
mAP_evaluation: 100%|██████████| 2/2 [00:03<00:00,  1.98s/it]


mAP on validation
Iteration: 1 val_mAP_Car :  0.020066781134745942
Iteration: 1 val_mAP_Truck :  0.0
Iteration: 1 val_mAP_Pedestrian :  0.30303030303030304
Iteration: 1 val_mAP :  0.10769902805501634


mAP_evaluation: 100%|██████████| 4/4 [00:09<00:00,  2.29s/it]


mAP on subset
Iteration: 1 subset_mAP_Car :  0.017678033862670497
Iteration: 1 subset_mAP_Truck :  0.0
Iteration: 1 subset_mAP_Pedestrian :  0.09564777327935223
Iteration: 1 subset_mAP :  0.03777526904734091




[Epoch 0/3, Batch 0/7] [Losses: x 0.251542, y 0.087521, w 1.233335, h 0.742251, conf 0.738591, cls 1.353442, total 4.406682, recall: 0.60000, precision: 0.02279]




[Epoch 0/3, Batch 1/7] [Losses: x 0.266293, y 0.097094, w 1.235161, h 0.474597, conf 0.653649, cls 1.306750, total 4.033544, recall: 0.75000, precision: 0.03102]




[Epoch 0/3, Batch 2/7] [Losses: x 0.252858, y 0.100730, w 0.911937, h 0.890914, conf 0.804977, cls 1.331387, total 4.292803, recall: 0.60256, precision: 0.01880]




[Epoch 0/3, Batch 3/7] [Losses: x 0.244066, y 0.116481, w 1.062871, h 0.693380, conf 0.685960, cls 1.319428, total 4.122186, recall: 0.69697, precision: 0.02906]




[Epoch 0/3, Batch 4/7] [Losses: x 0.203612, y 0.145949, w 1.799565, h 2.321453, conf 0.866822, cls 1.356935, total 6.694335, recall: 0.43333, precision: 0.00757]




[Epoch 0/3, Batch 5/7] [Losses: x 0.206619, y 0.102362, w 1.963921, h 1.931263, conf 0.667683, cls 1.318665, total 6.190511, recall: 0.53086, precision: 0.01130]
[Epoch 0/3, Batch 6/7] [Losses: x 0.161115, y 0.128571, w 2.280692, h 2.456379, conf 0.766230, cls 5.304130, total 11.097119, recall: 0.66667, precision: 0.00655]


training: 100%|██████████| 7/7 [00:09<00:00,  1.34s/it]


[Epoch 1/3, Batch 0/7] [Losses: x 0.167877, y 0.087706, w 1.146786, h 1.826270, conf 0.620702, cls 1.295142, total 5.144484, recall: 0.60000, precision: 0.02252]




[Epoch 1/3, Batch 1/7] [Losses: x 0.221629, y 0.100538, w 1.639547, h 1.915967, conf 0.601553, cls 1.319939, total 5.799173, recall: 0.60317, precision: 0.02331]




[Epoch 1/3, Batch 2/7] [Losses: x 0.203060, y 0.097439, w 1.939681, h 2.481500, conf 0.625762, cls 1.329924, total 6.677365, recall: 0.46667, precision: 0.00942]




[Epoch 1/3, Batch 3/7] [Losses: x 0.192088, y 0.111622, w 2.050146, h 2.339673, conf 0.628411, cls 1.304491, total 6.626431, recall: 0.52222, precision: 0.01429]





[Epoch 1/3, Batch 4/7] [Losses: x 0.174728, y 0.152425, w 1.942549, h 2.470106, conf 0.566234, cls 1.356633, total 6.662675, recall: 0.51111, precision: 0.00732]
[Epoch 1/3, Batch 5/7] [Losses: x 0.193348, y 0.107483, w 1.681719, h 0.952852, conf 0.599990, cls 1.413838, total 4.949229, recall: 0.50000, precision: 0.00671]




[Epoch 1/3, Batch 6/7] [Losses: x 0.157235, y 0.163720, w 0.492739, h 0.313991, conf 0.594899, cls 5.437663, total 7.160246, recall: 0.66667, precision: 0.01152]


training: 100%|██████████| 7/7 [00:08<00:00,  1.28s/it]


[Epoch 2/3, Batch 0/7] [Losses: x 0.140944, y 0.108797, w 1.145240, h 1.801347, conf 0.473326, cls 1.293779, total 4.963431, recall: 0.68056, precision: 0.02195]




[Epoch 2/3, Batch 1/7] [Losses: x 0.225392, y 0.129332, w 1.014208, h 1.180577, conf 0.498544, cls 1.308389, total 4.356441, recall: 0.64583, precision: 0.04583]




[Epoch 2/3, Batch 2/7] [Losses: x 0.234929, y 0.116277, w 1.100287, h 1.599968, conf 0.491621, cls 1.324650, total 4.867731, recall: 0.57333, precision: 0.01866]




[Epoch 2/3, Batch 3/7] [Losses: x 0.190304, y 0.093081, w 1.279493, h 0.970975, conf 0.499528, cls 1.351460, total 4.384840, recall: 0.61728, precision: 0.02158]




[Epoch 2/3, Batch 4/7] [Losses: x 0.134906, y 0.141532, w 0.982407, h 1.353345, conf 0.454776, cls 1.316079, total 4.383045, recall: 0.66667, precision: 0.01665]




[Epoch 2/3, Batch 5/7] [Losses: x 0.224885, y 0.099081, w 0.851179, h 0.811205, conf 0.399696, cls 1.359652, total 3.745697, recall: 0.60714, precision: 0.02403]
[Epoch 2/3, Batch 6/7] [Losses: x 0.273313, y 0.027740, w 0.294893, h 0.037206, conf 0.504498, cls 5.105692, total 6.243343, recall: 1.00000, precision: 0.02211]


training: 100%|██████████| 7/7 [00:09<00:00,  1.29s/it]
mAP_evaluation: 100%|██████████| 2/2 [00:03<00:00,  1.73s/it]


mAP on validation
Iteration: 2 val_mAP_Car :  0.29752870275193893
Iteration: 2 val_mAP_Truck :  0.0
Iteration: 2 val_mAP_Pedestrian :  0.4568181818181818
Iteration: 2 val_mAP :  0.25144896152337354


mAP_evaluation: 100%|██████████| 2/2 [00:08<00:00,  4.03s/it]


mAP on subset
Iteration: 2 subset_mAP_Car :  0.6275967583281106
Iteration: 2 subset_mAP_Truck :  0.0
Iteration: 2 subset_mAP_Pedestrian :  0.6666666666666666
Iteration: 2 subset_mAP :  0.4314211416649257




[Epoch 0/3, Batch 0/8] [Losses: x 0.144126, y 0.125506, w 0.495413, h 0.778619, conf 0.375145, cls 1.316189, total 3.234998, recall: 0.81481, precision: 0.04808]




[Epoch 0/3, Batch 1/8] [Losses: x 0.166302, y 0.123787, w 0.745127, h 1.006504, conf 0.420249, cls 1.377044, total 3.839013, recall: 0.60870, precision: 0.02565]




[Epoch 0/3, Batch 2/8] [Losses: x 0.210604, y 0.083907, w 0.428952, h 0.573253, conf 0.353687, cls 1.312813, total 2.963216, recall: 0.76543, precision: 0.04235]




[Epoch 0/3, Batch 3/8] [Losses: x 0.241386, y 0.104334, w 0.732711, h 0.597403, conf 0.376146, cls 1.309541, total 3.361521, recall: 0.80392, precision: 0.05995]




[Epoch 0/3, Batch 4/8] [Losses: x 0.186461, y 0.091185, w 0.637446, h 0.650490, conf 0.348792, cls 1.336110, total 3.250483, recall: 0.74074, precision: 0.03024]




[Epoch 0/3, Batch 5/8] [Losses: x 0.189249, y 0.071275, w 0.694201, h 0.623760, conf 0.369457, cls 1.306233, total 3.254175, recall: 0.83333, precision: 0.04442]




[Epoch 0/3, Batch 6/8] [Losses: x 0.221516, y 0.099868, w 0.483521, h 0.380591, conf 0.341520, cls 1.292165, total 2.819181, recall: 0.85556, precision: 0.06117]
[Epoch 0/3, Batch 7/8] [Losses: x 0.150293, y 0.048694, w 0.522420, h 0.656290, conf 0.392670, cls 2.630042, total 4.400410, recall: 0.80000, precision: 0.07643]


training: 100%|██████████| 8/8 [00:09<00:00,  1.20s/it]


[Epoch 1/3, Batch 0/8] [Losses: x 0.168846, y 0.081569, w 0.277309, h 0.385341, conf 0.387788, cls 1.340274, total 2.641127, recall: 0.73611, precision: 0.04190]




[Epoch 1/3, Batch 1/8] [Losses: x 0.199104, y 0.076670, w 0.458319, h 0.382543, conf 0.310090, cls 1.301562, total 2.728288, recall: 0.79310, precision: 0.05564]




[Epoch 1/3, Batch 2/8] [Losses: x 0.157177, y 0.086795, w 0.482217, h 0.380354, conf 0.318600, cls 1.319536, total 2.744679, recall: 0.81159, precision: 0.04707]




[Epoch 1/3, Batch 3/8] [Losses: x 0.193747, y 0.041522, w 0.569772, h 0.385189, conf 0.355332, cls 1.335017, total 2.880579, recall: 0.77273, precision: 0.04669]




[Epoch 1/3, Batch 4/8] [Losses: x 0.165885, y 0.061159, w 0.513723, h 0.415704, conf 0.251728, cls 1.299265, total 2.707463, recall: 0.88889, precision: 0.06309]




[Epoch 1/3, Batch 5/8] [Losses: x 0.144488, y 0.065823, w 0.915573, h 0.677056, conf 0.350280, cls 1.388649, total 3.541868, recall: 0.54762, precision: 0.01654]




[Epoch 1/3, Batch 6/8] [Losses: x 0.199086, y 0.087420, w 0.439580, h 0.299684, conf 0.346093, cls 1.294404, total 2.666267, recall: 0.88667, precision: 0.11233]
[Epoch 1/3, Batch 7/8] [Losses: x 0.215999, y 0.061627, w 1.030073, h 0.292084, conf 0.231578, cls 2.595105, total 4.426465, recall: 0.75556, precision: 0.06445]


training: 100%|██████████| 8/8 [00:09<00:00,  1.16s/it]


[Epoch 2/3, Batch 0/8] [Losses: x 0.200021, y 0.068315, w 0.421213, h 0.351939, conf 0.322740, cls 1.298150, total 2.662378, recall: 0.88596, precision: 0.09439]




[Epoch 2/3, Batch 1/8] [Losses: x 0.159162, y 0.060847, w 0.727462, h 0.494973, conf 0.269082, cls 1.317893, total 3.029419, recall: 0.84444, precision: 0.03932]




[Epoch 2/3, Batch 2/8] [Losses: x 0.177791, y 0.087217, w 0.399082, h 0.303450, conf 0.287584, cls 1.311349, total 2.566473, recall: 0.81481, precision: 0.08127]




[Epoch 2/3, Batch 3/8] [Losses: x 0.179924, y 0.073809, w 0.498677, h 0.371128, conf 0.285906, cls 1.341311, total 2.750754, recall: 0.69048, precision: 0.05090]




[Epoch 2/3, Batch 4/8] [Losses: x 0.193364, y 0.090550, w 0.503781, h 0.579965, conf 0.304267, cls 1.322467, total 2.994393, recall: 0.71930, precision: 0.03868]




[Epoch 2/3, Batch 5/8] [Losses: x 0.213649, y 0.135141, w 0.187920, h 0.298896, conf 0.277077, cls 1.304579, total 2.417262, recall: 0.73913, precision: 0.04609]




[Epoch 2/3, Batch 6/8] [Losses: x 0.233165, y 0.094271, w 0.351770, h 0.403305, conf 0.271071, cls 1.298930, total 2.652513, recall: 0.78161, precision: 0.06445]
[Epoch 2/3, Batch 7/8] [Losses: x 0.128279, y 0.099183, w 0.360044, h 0.475192, conf 0.331503, cls 2.575007, total 3.969209, recall: 0.84615, precision: 0.06266]


training: 100%|██████████| 8/8 [00:09<00:00,  1.17s/it]
model refining iterations: 100%|██████████| 3/3 [01:42<00:00, 34.19s/it]
mAP_evaluation: 100%|██████████| 2/2 [00:03<00:00,  1.69s/it]

mAP on validation
Iteration: 2 val_mAP_Car :  0.22690266529609884
Iteration: 2 val_mAP_Truck :  0.0
Iteration: 2 val_mAP_Pedestrian :  0.5393939393939394
Iteration: 2 val_mAP :  0.25543220156334606



