<p> <center> <a href="../Start_here.ipynb">Home Page</a> </center> </p>

<div>
    <span style="float: left; width: 52%; text-align: right;">
        <a >1</a>
        <a href="2.Object_detection_using_TAO_YOLOv4.ipynb">2</a>
        <a href="3.Model_deployment_with_Triton_Inference_Server.ipynb">3</a>
        <a href="4.Model_deployment_with_DeepStream.ipynb">4</a>
        <a href="5.Measure_object_size_using_OpenCV.ipynb">5</a>
        <a href="6.Challenge_DeepStream.ipynb">6</a>
        <a href="7.Challenge_Triton.ipynb">7</a>
    </span>
    <span style="float: left; width: 48%; text-align: right;"><a href="2.Object_detection_using_TAO_YOLOv4.ipynb">Next Notebook</a></span>
</div>

# Data labeling and preprocessing

***

**The goal of this notebook is to make you understand how to:**

- Label data for object detection applications
- Convert a dataset into KITTI format

**Contents of this notebook:**

- [Custom data labeling](#Custom-data-labeling)
    - [Labeling with Label Studio](#Labeling-with-Label-Studio)
    - [Labeling with Yolo Mark](#Labeling-with-Yolo-Mark)
- [Download data for the lab](#Download-data-for-the-lab)
- [Conversion to KITTI format](#Conversion-to-KITTI-format)
    - [Load the dataset](#Load-the-dataset)
    - [Export to KITTI](#Export-to-KITTI)

## Custom data labeling

Training a deep learning model for an object detection task requires a meaningful amount of annotated data. A dataset for a specific domain application is often unavailable, and when one exists, it may be unlabeled or too small. In this notebook, we show how to annotate a custom dataset with bounding boxes and convert it into the KITTI file format, which is useful for expanding the number of samples with offline data augmentation or for training a model with transfer learning.

<img src="images/prep_pipeline.png" width="720">

We present two tools for data labeling operations:
- Label Studio
- Yolo Mark

We recommend Label Studio because of its more intuitive user interface and better overall labeling experience.

### Labeling with Label Studio

[Label Studio](https://labelstud.io/) is an open-source, flexible data labeling tool that is quick to install and has a very convenient user interface. It ships as a Python module that can be installed with the pip package manager, but it can also be installed in other ways, all documented [here](https://labelstud.io/guide/install.html), so feel free to pick the one you are most comfortable with.

To get started with the Python module, open a terminal window in your preferred environment (ideally, create a fresh virtual one) and run the command `pip install -U label-studio`. Once installed, start the server with the command `label-studio`. This automatically opens the user interface in your default web browser. Unless another port is specified, the server listens on port 8080 and is accessible at `http://localhost:8080` if you are working on your local machine.

To proceed, follow these steps and visual explanations:
- Sign up with an email address and create a password (note that these credentials are stored locally on the Label Studio server and can be whatever you prefer).
<img src="images/label_studio_1.png" width="720">

- Create a new project.
<img src="images/label_studio_2.png" width="720">

- Give it a title and optionally a brief description.
<img src="images/label_studio_3.png" width="720">

- Drag and drop images to upload.
<img src="images/label_studio_4.png" width="720">

- Select an object detection task with bounding boxes.
<img src="images/label_studio_5.png" width="720">

- Set the class names.
<img src="images/label_studio_6.png" width="720">

If you plan on tagging a significant amount of data, you will likely need to separate it into multiple chunks to avoid hitting the per-project memory limit.

Once the previous steps are completed, you can start with the labeling process. From the project menu, click on `Label All Tasks` at the top.

<img src="images/label_studio_7.png" width="720">

Then, for every image, do the following operations:
- Select an appropriate class.
- Draw all the bounding boxes for that class.
- Repeat for other classes.
- Click `Submit`.

<img src="images/label_studio_8.png" width="720">

This will automatically load the next image until there are no images left. While labeling, you can stop at any time and when you resume, you will continue exactly where you left off.

<img src="images/label_studio_9.png" width="720">

As soon as you have completed the labeling activity, either because you have run out of images or because you are satisfied with how many you have, you can go back to the home page of the project, apply filters to the annotations, and export them by clicking on `Export`. Make sure to scroll down and select the YOLO format when you do so.

<img src="images/label_studio_10.png" width="720">

For more in-depth information and an additional visual explanation of the previous steps, explore this [dedicated tutorial](https://labelstud.io/blog/Quickly-Create-Datasets-for-Training-YOLO-Object-Detection.html) on how to label images for YOLO applications on the Label Studio blog.

After unzipping the downloaded file, the exported data has a structure similar to the following by default:
```
project-1-at-2022-09-20-15-20-f6c05363.zip
    notes.json
    classes.txt
    labels
        image_filename1.txt
        image_filename2.txt
        image_filename3.txt
        ...
    images
        image_filename1.<ext>
        image_filename2.<ext>
        image_filename3.<ext>
        ...
```
<img src="images/label_studio_11.png" width="720">
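
Before going further, it can be useful to confirm that every image in the export has a matching label file. The following is a minimal sketch, assuming the `images` and `labels` layout shown above; the function name `find_unpaired_files` is a hypothetical helper and not part of Label Studio.

```python
from pathlib import Path


def find_unpaired_files(export_dir):
    """Return (images without labels, labels without images) as sorted lists of file stems."""
    export_dir = Path(export_dir)
    # Compare file names without extension, since images and labels share the same stem
    image_stems = {p.stem for p in (export_dir / "images").iterdir() if p.is_file()}
    label_stems = {p.stem for p in (export_dir / "labels").glob("*.txt")}
    return sorted(image_stems - label_stems), sorted(label_stems - image_stems)


# Hypothetical usage on the unzipped export folder:
# missing_labels, missing_images = find_unpaired_files("../data/label-studio")
```

Two empty lists indicate that images and labels are consistently paired.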

The TXT files in the `labels` folder are space-delimited files where each row corresponds to an object in the image with the same name in the `images` folder, in the standard YOLO format:
```
<target> <x-center> <y-center> <width> <height> <confidence>
```
<img src="images/yolo_label.png" width="720">

where `<target>` is the zero-based integer index of the object class label from `classes.txt`, the bounding box coordinates are expressed as relative coordinates in `[0, 1] x [0, 1]`, and `<confidence>` is an optional detection confidence in `[0, 1]`, left blank by Label Studio.
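
To make the row layout concrete, here is a small sketch that parses one row of a YOLO label file. The `YoloBox` type and the `parse_yolo_line` function are illustrative names introduced here, not part of any library.

```python
from typing import NamedTuple, Optional


class YoloBox(NamedTuple):
    target: int                         # zero-based class index from classes.txt
    x_center: float                     # relative coordinates in [0, 1]
    y_center: float
    width: float
    height: float
    confidence: Optional[float] = None  # optional, left blank by Label Studio


def parse_yolo_line(line):
    """Parse one space-delimited row of a YOLO label file into a YoloBox."""
    fields = line.split()
    if len(fields) not in (5, 6):
        raise ValueError(f"expected 5 or 6 fields, got {len(fields)}: {line!r}")
    values = [float(v) for v in fields[1:]]
    if not all(0.0 <= v <= 1.0 for v in values):
        raise ValueError(f"values outside [0, 1]: {line!r}")
    return YoloBox(int(fields[0]), *values)


# Example row: class index 2, box centered at (0.5, 0.25), 20% wide and 10% tall
box = parse_yolo_line("2 0.5 0.25 0.2 0.1")
```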

### Labeling with Yolo Mark

Another popular data labeling tool is [Yolo Mark](https://github.com/AlexeyAB/Yolo_mark), a Windows and Linux GUI for marking bounding boxes of objects in images for training YOLO models. It is not as straightforward to use as Label Studio, since it must be compiled from source and does not come with a Python module, but it is still an option to consider for a project.

To use Yolo Mark, [download](https://github.com/AlexeyAB/Yolo_mark) the repository from GitHub and follow the instructions in the README file to build the executable for your operating system. Note that a working installation of [OpenCV](https://opencv.org/) is required to run the program successfully. On Windows, you might consider a tool like [MS Visual Studio](https://visualstudio.microsoft.com/vs/) to compile the project; on Linux, you only need to run `cmake .` and then `make` from the project directory.

At this point, to use the tool to label your custom images, place them in the `x64/Release/data/img` directory, change the number of classes in `x64/Release/data/obj.data` as well as the class names in `x64/Release/data/obj.names`, and run `x64/Release/yolo_mark.cmd` on Windows or `./linux_mark.sh` on Linux to start labeling.

<img src="images/yolo_mark.png" width="720">

The resulting YOLO dataset in `x64/Release/data` will have the following structure:
```
data
    obj.data
    obj.names
    train.txt
    img
        image_filename1.<ext>
        image_filename1.txt
        image_filename2.<ext>
        image_filename2.txt
        image_filename3.<ext>
        image_filename3.txt
        ...
```            
with images and corresponding labels in the same folder, `obj.names` with the class names, and a `train.txt` file with the paths to the labeled images. The format of the TXT annotation files in the `img` folder is the same YOLO format as described before.
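
A quick sanity check after labeling is to count how many objects were annotated per class, for example to spot a class that was never labeled. The sketch below assumes the Yolo Mark folder layout shown above; `count_objects_per_class` is a hypothetical helper name.

```python
from collections import Counter
from pathlib import Path


def count_objects_per_class(data_dir):
    """Count labeled objects per class name in a Yolo Mark `data` folder."""
    data_dir = Path(data_dir)
    # obj.names holds one class name per line; the row index is the class index
    names = [n.strip() for n in (data_dir / "obj.names").read_text().splitlines() if n.strip()]
    counts = Counter({name: 0 for name in names})
    for label_file in (data_dir / "img").glob("*.txt"):
        for row in label_file.read_text().splitlines():
            if row.strip():
                counts[names[int(row.split()[0])]] += 1
    return dict(counts)


# Hypothetical usage:
# print(count_objects_per_class("x64/Release/data"))
```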

## Download data for the lab

In this lab, we will provide you with a labeled version of a dataset containing three types of fruit - `apples`, `bananas`, and `oranges` - each fresh or rotten, for a total of six classes. The dataset was labeled using Label Studio, as explained above. The project folder has been renamed to `label-studio`. Running the following cell will make the data available in the `/workspace/data` directory.

In [None]:
!python3 ../source_code/dataset.py

## Conversion to KITTI format

Regardless of whether Label Studio or Yolo Mark was used, or a dataset already labeled in YOLO format was provided, conversion to KITTI format is required to experiment with the NVIDIA® TAO Toolkit in the next notebook. The KITTI format lets you use the transfer learning workflows and pre-trained models available within the TAO Toolkit, and it is also the format used to perform offline data augmentation, which can substantially increase the size of the dataset.

The KITTI format organizes the data directories of images and corresponding labels into a structure similar to the Label Studio export, namely:
```
dataset_dir
    data
        image_filename1.<ext>
        image_filename2.<ext>
        image_filename3.<ext>
        ...
    labels
        image_filename1.txt
        image_filename2.txt
        image_filename3.txt
        ...
```  
The main difference is that in the KITTI format the label TXT files are space-delimited files where each row corresponds to an object and **the bounding box is stored using 15 columns (plus an optional 16th confidence column)**. The meaning of each of the 15 required columns is described [here](https://docs.nvidia.com/tao/tao-toolkit/text/data_annotation_format.html#label-files). In particular, the first item is the object label, and the fifth to eighth positions hold the bounding box coordinates expressed in pixels as **[x-top-left, y-top-left, x-bottom-right, y-bottom-right]**. Note that this differs from the YOLO format: the box is now identified by its corners, and because the coordinates are in pixels, they are not invariant to image resizing.

<img src="images/yolo_kitti.png" width="720">
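
The relationship between the two box encodings can be sketched in a few lines. This is only an illustration of the arithmetic, not the code FiftyOne uses internally; the function names and the `fresh_apple` class name are hypothetical, and the KITTI fields that are not needed for 2D detection are simply set to zero here.

```python
def yolo_to_kitti_bbox(x_center, y_center, width, height, img_width, img_height):
    """Convert a relative YOLO box to KITTI pixel corners (x1, y1, x2, y2)."""
    x1 = (x_center - width / 2) * img_width
    y1 = (y_center - height / 2) * img_height
    x2 = (x_center + width / 2) * img_width
    y2 = (y_center + height / 2) * img_height
    return x1, y1, x2, y2


def kitti_row(label, bbox):
    """Format a 15-column KITTI row for a 2D box; unused fields are written as 0."""
    x1, y1, x2, y2 = bbox
    # Columns: type, truncated, occluded, alpha, bbox (4), dimensions (3), location (3), rotation_y
    return (f"{label} 0.00 0 0.00 {x1:.2f} {y1:.2f} {x2:.2f} {y2:.2f} "
            "0.00 0.00 0.00 0.00 0.00 0.00 0.00")


# A box covering the central half of a 640x480 image
bbox = yolo_to_kitti_bbox(0.5, 0.5, 0.5, 0.5, img_width=640, img_height=480)
row = kitti_row("fresh_apple", bbox)
```

Because the pixel corners depend on `img_width` and `img_height`, the same YOLO row yields different KITTI rows for differently sized images.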

To perform the conversion between dataset formats, we will use [FiftyOne](https://voxel51.com/docs/fiftyone/), an open-source Python tool for handling computer vision datasets. FiftyOne allows loading a YOLO dataset and exporting it as KITTI in a few lines of code.

### Load the dataset

The generic `Dataset.from_dir()` method (documentation available [here](https://voxel51.com/docs/fiftyone/api/fiftyone.core.dataset.html#fiftyone.core.dataset.Dataset.from_dir)) loads a dataset from disk and depending on the format, additional parameters can be passed to customize the data import. When dealing with a YOLO data format like in our case, these parameters are inherited from the [YOLOv4DatasetImporter](https://voxel51.com/docs/fiftyone/api/fiftyone.utils.yolo.html#fiftyone.utils.yolo.YOLOv4DatasetImporter) class and a customized import would require the following arguments:
- `dataset_dir`: the dataset directory.
- `dataset_type`: the `fiftyone.types.dataset_types.Dataset` type of the dataset.
- `data_path`: to enable explicit control over the location of the media.
- `labels_path`: to enable explicit control over the location of the labels.
- `images_path`: to enable explicit control over the location of the image listing file.
- `objects_path`: to enable explicit control over the location of the object names file.

If your data stored on disk is not in YOLO format but in one of the [many common formats](https://voxel51.com/docs/fiftyone/user_guide/dataset_creation/datasets.html#supported-import-formats) supported natively by FiftyOne, then you can automatically load your data with minimal code changes in terms of additional parameters.
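
As an illustration of how little changes for another format, the sketch below loads a COCO detection dataset. It assumes the data is stored in FiftyOne's default on-disk layout for that format; `load_coco_dataset` is a hypothetical wrapper, and FiftyOne is imported inside the function only so that the definition does not require the package to be installed.

```python
def load_coco_dataset(dataset_dir):
    """Load a COCO detection dataset stored in FiftyOne's default layout."""
    import fiftyone as fo  # imported lazily; requires `pip install fiftyone`

    # Only the dataset type changes with respect to a YOLO import
    return fo.Dataset.from_dir(
        dataset_dir=dataset_dir,
        dataset_type=fo.types.COCODetectionDataset,
    )


# Hypothetical usage, assuming a COCO-style dataset at this path:
# dataset = load_coco_dataset("../data/coco-example/")
```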

To install the FiftyOne Python module, run `pip install fiftyone` in your preferred environment (ideally, a virtual one). In this lab, we have already installed it for you.

Let's now load a YOLO dataset generated with Label Studio into FiftyOne. In this case, we have an object names file but we don't have an image listing file, so we just ignore the `images_path` argument and let FiftyOne list the data directory for us.

In [None]:
import fiftyone as fo

dataset_dir = "../data/label-studio/"
data_path = "images/"
labels_path = "labels/"
objects_path = "classes.txt"

# Create the dataset
dataset = fo.Dataset.from_dir(
    dataset_dir=dataset_dir,
    data_path=data_path,
    labels_path=labels_path,
    objects_path=objects_path,
    dataset_type=fo.types.YOLOv4Dataset
)

# View summary info about the dataset
print(dataset)

# Print the first few samples in the dataset
print(dataset.head(2))

If we were instead loading a dataset generated with Yolo Mark, saved in a folder named `yolo-mark` (not provided for this lab), images and labels would be in the same folder, and we would have both an object names file and an image listing file. However, the `train.txt` image listing file contains paths relative to the directory of the executable rather than to the dataset home directory, so FiftyOne will not find the images unless we replace them with relative paths of the form `img/image_filename.<ext>`. We can do that with some simple code that generates a new `images.txt` file with the right paths.
```python
# Read the original image listing file
with open("../data/yolo-mark/train.txt", "r") as file:
    filedata = file.read()

# Replace the path prefix so that paths are relative to the dataset directory
# On Linux
filedata = filedata.replace("x64/Release/data/img/", "img/")
# On Windows
#filedata = filedata.replace("data/img/", "img/")

# Write the corrected listing to a new file
with open("../data/yolo-mark/images.txt", "w") as file:
    file.write(filedata)
```

Alternatively, we can again ignore the `images_path` argument and let FiftyOne list the data directory for us.

In [None]:
# If you use a dataset labeled with Yolo Mark, you will need a yolo-mark folder to run the code below to load it into FiftyOne

# dataset_dir = "../data/yolo-mark/"
# data_path = "img/"
# images_path = "images.txt"
# objects_path = "obj.names"

# Create the dataset
# dataset = fo.Dataset.from_dir(
#     dataset_dir=dataset_dir,
#     data_path=data_path,
#     images_path=images_path,
#     objects_path=objects_path,
#     dataset_type=fo.types.YOLOv4Dataset
# )

# View summary info about the dataset
# print(dataset)

# Print the first few samples in the dataset
# print(dataset.head(2))

### Export to KITTI

Once the dataset is loaded into FiftyOne, conversion to KITTI format is immediate with an export command. The `Dataset.export()` method (documentation available [here](https://voxel51.com/docs/fiftyone/api/fiftyone.core.dataset.html#fiftyone.core.dataset.Dataset.export)) writes the samples to disk and a customized export to KITTI format would require the following arguments:
- `export_dir`: the dataset export directory.
- `dataset_type`: the `fiftyone.types.dataset_types.Dataset` type of the dataset.
- `data_path`: to enable explicit control over the location of the exported media.
- `labels_path`: to enable explicit control over the location of the exported labels.

Providing only `export_dir` and `dataset_type` would result in an export of the content to a directory following the default layout for the specified format.

In [None]:
export_dir = "../data/training/"
data_path = "image_2/"
labels_path = "label_2/"

# Export the dataset
dataset.export(
    export_dir=export_dir,
    data_path=data_path,
    labels_path=labels_path,
    dataset_type=fo.types.KITTIDetectionDataset
)
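
After the export, a quick check of the generated label files can catch malformed rows before training. The following is a minimal sketch based only on the column layout described earlier (15 columns, optionally 16, with the box corners in positions five to eight); `check_kitti_labels` is a hypothetical helper, not a FiftyOne or TAO Toolkit function.

```python
from pathlib import Path


def check_kitti_labels(labels_dir):
    """Return a list of (file name, row number, problem) for malformed KITTI rows."""
    problems = []
    for label_file in sorted(Path(labels_dir).glob("*.txt")):
        for row_number, row in enumerate(label_file.read_text().splitlines(), start=1):
            fields = row.split()
            if not fields:
                continue  # skip blank lines
            if len(fields) not in (15, 16):
                problems.append((label_file.name, row_number, f"{len(fields)} columns"))
                continue
            # Box corners in pixels: x-top-left, y-top-left, x-bottom-right, y-bottom-right
            x1, y1, x2, y2 = (float(v) for v in fields[4:8])
            if x2 <= x1 or y2 <= y1:
                problems.append((label_file.name, row_number, "degenerate box"))
    return problems


# Hypothetical usage on the export directory used above:
# print(check_kitti_labels("../data/training/label_2"))
```

An empty list means every row has the expected number of columns and a box with positive width and height.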

We can now view some images of our dataset before moving on to the next notebook.

In [None]:
# Simple grid visualizer
import matplotlib.pyplot as plt
import os
from math import ceil
valid_image_ext = ['.jpg', '.png', '.jpeg', '.ppm']

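def visualize_images(img_path, num_cols=4, num_images=10):
    num_rows = int(ceil(float(num_images) / float(num_cols)))
    # squeeze=False keeps axarr two-dimensional even with a single row or column
    f, axarr = plt.subplots(num_rows, num_cols, figsize=[80,30], squeeze=False)
    f.tight_layout()
    # Collect the paths of all files with a supported image extension
    image_paths = [os.path.join(img_path, image) for image in os.listdir(img_path)
                   if os.path.splitext(image)[1].lower() in valid_image_ext]
    # Use a separate loop variable so the img_path argument is not overwritten
    for idx, image_path in enumerate(image_paths[:num_images]):
        col_id = idx % num_cols
        row_id = idx // num_cols
        img = plt.imread(image_path)
        axarr[row_id, col_id].imshow(img)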

In [None]:
# Visualizing the sample images
IMG_PATH = '../data/training/image_2'
COLS = 3 # number of columns in the visualizer grid
IMAGES = 9 # number of images to visualize

visualize_images(IMG_PATH, num_cols=COLS, num_images=IMAGES)

In this notebook, we have seen how to label a raw dataset and export it into KITTI format. Next, we will train an object detection model using the TAO Toolkit. Please go to the next notebook by clicking on the `Next Notebook` button below.

***

## Licensing

Copyright © 2022 OpenACC-Standard.org. This material is released by OpenACC-Standard.org, in collaboration with NVIDIA Corporation, under the Creative Commons Attribution 4.0 International (CC BY 4.0). These materials include references to hardware and software developed by other entities; all applicable licensing and copyrights apply.

<br>
<div>
    <span style="float: left; width: 52%; text-align: right;">
        <a >1</a>
        <a href="2.Object_detection_using_TAO_YOLOv4.ipynb">2</a>
        <a href="3.Model_deployment_with_Triton_Inference_Server.ipynb">3</a>
        <a href="4.Model_deployment_with_DeepStream.ipynb">4</a>
        <a href="5.Measure_object_size_using_OpenCV.ipynb">5</a>
        <a href="6.Challenge_DeepStream.ipynb">6</a>
        <a href="7.Challenge_Triton.ipynb">7</a>
    </span>
    <span style="float: left; width: 48%; text-align: right;"><a href="2.Object_detection_using_TAO_YOLOv4.ipynb">Next Notebook</a></span>
</div>

<br>
<p> <center> <a href="../Start_here.ipynb">Home Page</a> </center> </p>