<a href="https://colab.research.google.com/github/valentindbdg/Improve-Yolo-Perfomance-Data-Centric-Approach/blob/main/Model_1_P2_Prepare_Dataset_for_analysis.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Training Yolo on Custom Data and Improving the Model's Performance using a Data-centric Approach
## Part 2: Prepare dataset for analysis using Fiftyone


*Summary*:

* Part 1: Training a Yolo Model with a custom dataset
* **Part 2: Convert predictions**
* Part 3: Improving the dataset to improve performance

## 1) Copy files from Google Drive to this notebook


### 1.1 Mount Drive:

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


### 1.2 Import model predictions output file:
The file with the predictions is copied from the drive to the local directory 'yolov3'.

In [None]:
%cp -av /content/drive/MyDrive/yolov3/ /content/yolov3

'/content/drive/MyDrive/yolov3/' -> '/content/yolov3'
'/content/drive/MyDrive/yolov3/yolov3-tiny.cfg' -> '/content/yolov3/yolov3-tiny.cfg'
'/content/drive/MyDrive/yolov3/obj.names' -> '/content/yolov3/obj.names'
'/content/drive/MyDrive/yolov3/result2_2000_608_608.txt' -> '/content/yolov3/result2_2000_608_608.txt'


### 1.3 Import the dataset with ground_truth labels
The ground truth labels are copied from the drive to the local directory 'yolodataset'.

In [None]:
%cp -av /content/drive/MyDrive/yolodataset/ /content/yolodataset

Streaming output truncated to the last 5000 lines.
'/content/drive/MyDrive/yolodataset/data/000000441491.txt' -> '/content/yolodataset/data/000000441491.txt'
'/content/drive/MyDrive/yolodataset/data/000000441286.txt' -> '/content/yolodataset/data/000000441286.txt'
'/content/drive/MyDrive/yolodataset/data/000000450686.txt' -> '/content/yolodataset/data/000000450686.txt'
'/content/drive/MyDrive/yolodataset/data/000000441543.txt' -> '/content/yolodataset/data/000000441543.txt'
'/content/drive/MyDrive/yolodataset/data/000000434230.txt' -> '/content/yolodataset/data/000000434230.txt'
'/content/drive/MyDrive/yolodataset/data/000000439180.txt' -> '/content/yolodataset/data/000000439180.txt'
'/content/drive/MyDrive/yolodataset/data/000000449579.txt' -> '/content/yolodataset/data/000000449579.txt'
'/content/drive/MyDrive/yolodataset/data/000000428454.txt' -> '/content/yolodataset/data/000000428454.txt'
'/content/drive/MyDrive/yolodataset/data/000000445999.txt' -> '/content/yolodat

### 1.4 Prepare dataset ground truth and prediction folders
The ground truth and prediction label files are moved into the folder layout that Fiftyone expects, so that both sets of labels can be added to the same dataset:



```
/path/to/images
    image1.ext
    image2.ext
    ...

/path/to/ground_truth
    image1.txt
    image2.txt
    ...

/path/to/predictions
    image1.txt
    image2.txt
    ...
    
```



Two folders are created to contain predictions and ground truth labels:

In [None]:
%cd "/content/yolodataset/"
!mkdir predictions
!mkdir groundtruth
!ls

/content/yolodataset
mkdir: cannot create directory ‘predictions’: File exists
data  groundtruth  images.txt  obj.names  predictions


In [None]:
%cd /content

/content


Copy all txt ground truth files contained in data to the new location:

In [None]:
%cp /content/yolodataset/data/*.txt /content/yolodataset/groundtruth

Delete the .txt ground truth label files that remain in the image (.jpg) directory:

In [None]:
!rm -rf /content/yolodataset/data/*.txt
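
The same reorganization can be sketched in Python instead of shell commands. This is a minimal sketch, assuming the folder names used above; `split_labels` is a hypothetical helper and not part of the notebook. Moving the files covers the copy and the delete in one step, and `exist_ok=True` avoids the "File exists" error when the cell is run a second time.

```python
from pathlib import Path
import shutil


def split_labels(data_dir: str, labels_dir: str) -> int:
    """Move every .txt label file from data_dir into labels_dir.

    Hypothetical helper for illustration. Returns the number of files moved.
    """
    src = Path(data_dir)
    dst = Path(labels_dir)
    # exist_ok=True makes the call safe to repeat, unlike a bare `mkdir`.
    dst.mkdir(parents=True, exist_ok=True)

    moved = 0
    for label_file in sorted(src.glob("*.txt")):
        # Images (.jpg) are not matched by the glob and stay where they are.
        shutil.move(str(label_file), str(dst / label_file.name))
        moved += 1
    return moved
```

For the layout in this notebook, the call would be `split_labels("/content/yolodataset/data", "/content/yolodataset/groundtruth")`.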

## 2) Parse the file 'result2_2000_608_608.txt' in the folder yolov3
The output file from the model predictions on the test set is parsed in this section


In [None]:
from typing import List, Tuple

In [None]:
# The file copied from the drive in section 1.2 is 'result2_2000_608_608.txt'
with open('/content/yolov3/result2_2000_608_608.txt', encoding="utf8") as f:
  text_content = f.read().replace("\t", " ")

In [None]:
text_lines = text_content.split("\n")

In [None]:
text_lines[0]

' CUDNN_HALF=1 '

In [None]:
from dataclasses import dataclass

@dataclass
class DetectionPrediction:
  frame: str
  prediction_class: str
  confidence: float
  left_x: int
  top_y: int
  width: int
  height: int

In [None]:
import re

PATH_MARK = "/content"
END_PREDICTION_MARK = "Enter"
FRAME_REGEX = r"\/content\/val2017\/(?P<frame>[0-9]*?\.jpg)"
PREDICTION_REGEX = (
    r"(?P<prediction_class>[a-z]*?): +"
    r"(?P<confidence>[0-9]{1,3})% +"  # up to three digits, so that 100% also matches
    r"\(left_x: +(?P<left_x>-?[0-9]*) +"
    r"top_y: +(?P<top_y>-?[0-9]*) +"
    r"width: +(?P<width>[0-9]*) +"
    r"height: +(?P<height>[0-9]*)\)"
)
prediction_list = []
i = 0

In [None]:
while i < len(text_lines):
  line = text_lines[i]
  # Skip every line that does not announce a new image
  if not line.startswith(PATH_MARK):
    i += 1
    continue

  frame_match = re.search(FRAME_REGEX, line)
  if not frame_match:
    raise Exception(f"Impossible to find frame on line {i}: {line}")
  
  frame = frame_match.group('frame')
  
  # Move to the first prediction line of this frame
  i += 1

  # Read prediction lines until the end marker or the end of the file.
  # text_lines[i] is only indexed after the bounds check, which avoids an
  # IndexError when the file ends without a closing marker.
  while i < len(text_lines) and not text_lines[i].startswith(END_PREDICTION_MARK):
    line = text_lines[i]
  
    prediction_match = re.search(PREDICTION_REGEX, line)
    if not prediction_match:
      raise Exception(f"Impossible to find prediction on line {i}: {line}")
    
    prediction_description = DetectionPrediction(
        frame=frame,
        prediction_class=prediction_match.group('prediction_class'),
        confidence=float(prediction_match.group('confidence'))/100,
        left_x=int(prediction_match.group('left_x')),
        top_y=int(prediction_match.group('top_y')),
        width=int(prediction_match.group('width')),
        height=int(prediction_match.group('height')),
    )
    
    prediction_list.append(prediction_description)

    i += 1
  


In [None]:
prediction_list[0:5]

[DetectionPrediction(frame='000000086755.jpg', prediction_class='person', confidence=0.7, left_x=320, top_y=211, width=76, height=98),
 DetectionPrediction(frame='000000441468.jpg', prediction_class='person', confidence=0.54, left_x=240, top_y=388, width=122, height=198),
 DetectionPrediction(frame='000000441468.jpg', prediction_class='person', confidence=0.57, left_x=373, top_y=124, width=11, height=45),
 DetectionPrediction(frame='000000133244.jpg', prediction_class='person', confidence=0.37, left_x=6, top_y=44, width=29, height=39),
 DetectionPrediction(frame='000000133244.jpg', prediction_class='person', confidence=0.36, left_x=56, top_y=49, width=43, height=30)]
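
To see what the prediction pattern captures, here is a self-contained sketch on two sample lines. The sample lines are assumptions: they are rebuilt from the fields that `PREDICTION_REGEX` names and from the first parsed prediction above, not copied from the real output file. The pattern mirrors `PREDICTION_REGEX`, with the confidence group accepting up to three digits so that `100%` also matches; `parse_prediction_line` is a hypothetical helper name.

```python
import re
from typing import Dict, Optional, Union

PATTERN = re.compile(
    r"(?P<prediction_class>[a-z]*?): +"
    r"(?P<confidence>[0-9]{1,3})% +"
    r"\(left_x: +(?P<left_x>-?[0-9]*) +"
    r"top_y: +(?P<top_y>-?[0-9]*) +"
    r"width: +(?P<width>[0-9]*) +"
    r"height: +(?P<height>[0-9]*)\)"
)


def parse_prediction_line(line: str) -> Optional[Dict[str, Union[str, float, int]]]:
    """Return the parsed fields of one prediction line, or None if it does not match."""
    match = PATTERN.search(line)
    if match is None:
        return None
    return {
        "prediction_class": match.group("prediction_class"),
        # The file stores a percentage; the notebook keeps a fraction in [0, 1].
        "confidence": int(match.group("confidence")) / 100,
        "left_x": int(match.group("left_x")),
        "top_y": int(match.group("top_y")),
        "width": int(match.group("width")),
        "height": int(match.group("height")),
    }


# Assumed sample lines (tabs already replaced by spaces, as in the notebook).
sample = "person: 70% (left_x:  320   top_y:  211   width:   76   height:   98)"
negative_corner = "person: 100% (left_x:   -6   top_y:   44   width:   29   height:   39)"

print(parse_prediction_line(sample))
print(parse_prediction_line(negative_corner))
print(parse_prediction_line("Enter Image Path:"))
```

The second sample shows why `left_x` and `top_y` allow a leading minus sign: a box whose corner lies outside the image is still parsed.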

## 3) Convert predictions from .txt to .csv


In [None]:
def prediction_to_csv(prediction_list: List[DetectionPrediction], filepath: str):
  with open(filepath, 'w', encoding="utf8") as csv_file:
    csv_file.write("frame,prediction_class,confidence,left_x,top_y,width,height\n")

    for p in prediction_list:
      csv_file.write(f"{p.frame},{p.prediction_class},{p.confidence},{p.left_x},{p.top_y},{p.width},{p.height}\n")

The file is converted and saved:

In [None]:
%cd /content

/content


In [None]:
prediction_to_csv(prediction_list=prediction_list, filepath="predictions.csv")
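
As an alternative to formatting each row by hand, the standard `csv` module can write the same file. The following is a sketch under assumptions, not the notebook's method: `write_predictions` is a hypothetical name, and the dataclass is repeated only so that the block runs on its own. Deriving the header from the dataclass fields keeps the header and the rows in sync, and the writer quotes any value that happens to contain a comma.

```python
import csv
import io
from dataclasses import asdict, dataclass, fields
from typing import Iterable, TextIO


@dataclass
class DetectionPrediction:
    frame: str
    prediction_class: str
    confidence: float
    left_x: int
    top_y: int
    width: int
    height: int


def write_predictions(predictions: Iterable[DetectionPrediction], stream: TextIO) -> None:
    """Write predictions as CSV rows, with a header taken from the dataclass fields."""
    header = [f.name for f in fields(DetectionPrediction)]
    writer = csv.DictWriter(stream, fieldnames=header, lineterminator="\n")
    writer.writeheader()
    for prediction in predictions:
        writer.writerow(asdict(prediction))


# Write to an in-memory buffer here; a real file opened with newline="" works the same way.
buffer = io.StringIO()
write_predictions(
    [DetectionPrediction("000000086755.jpg", "person", 0.7, 320, 211, 76, 98)],
    buffer,
)
print(buffer.getvalue())
```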

In [None]:
import pandas as pd
df_predictions = pd.read_csv('predictions.csv', sep=",", encoding="utf8")
df_predictions.head(10)

Unnamed: 0,frame,prediction_class,confidence,left_x,top_y,width,height
0,000000086755.jpg,person,0.7,320,211,76,98
1,000000441468.jpg,person,0.54,240,388,122,198
2,000000441468.jpg,person,0.57,373,124,11,45
3,000000133244.jpg,person,0.37,6,44,29,39
4,000000133244.jpg,person,0.36,56,49,43,30
5,000000133244.jpg,person,0.47,89,49,49,29
6,000000133244.jpg,person,0.62,121,45,42,34
7,000000133244.jpg,person,0.29,190,44,33,38
8,000000133244.jpg,person,0.55,240,109,70,186
9,000000133244.jpg,person,0.42,251,44,41,37


Check the value type of each column:

In [None]:
df_predictions.dtypes

frame                object
prediction_class     object
confidence          float64
left_x                int64
top_y                 int64
width                 int64
height                int64
dtype: object

## 4) Transform the data
The data are converted into the yolo format, which the fiftyone package can read:


### 4.1 Get image size
First, a function is created to get the size (width and height) of each image

In [None]:
from PIL import Image

def get_image_size(image_path: str) -> Tuple[int, int]:
  # The context manager closes the file handle once the size has been read
  with Image.open(image_path) as img:
    width, height = img.size

  return (width, height)

For example:

In [None]:
filepath = "/content/yolodataset/data/000000002431.jpg"
get_image_size(filepath)

(457, 640)

### 4.2 Convert predictions to yolo format + add confidence function:
Then, a function is created to convert the data into yolo format based on image size:


`<target> <x-center> <y-center> <width> <height>`

In this function, the confidence is also added in an extra column that can be processed and used by Fiftyone:

`<target> <x-center> <y-center> <width> <height> <confidence>`

In [None]:
BASE_PATH = "/content/yolodataset/data/"  # directory that contains the .jpg images

def get_yolo_labels(row):
  # Image size in pixels, needed to normalize the box to the [0, 1] range
  (w, h) = get_image_size(BASE_PATH + row["frame"])

  ratio_w = 1. / w
  ratio_h = 1. / h

  # The yolo format stores the box center, not the top-left corner
  x_center = (row["left_x"] + row["width"] / 2) * ratio_w
  y_center = (row["top_y"] + row["height"] / 2) * ratio_h
  normalized_width = row["width"] * ratio_w
  normalized_height = row["height"] * ratio_h
  confidence = row['confidence']

  # The class index 0 is hard-coded because 'person' is the only class
  return (0, x_center, y_center, normalized_width, normalized_height, confidence)

For example:

In [None]:
get_yolo_labels(df_predictions.iloc[0])

(0,
 0.5593750000000001,
 0.5416666666666666,
 0.11875000000000001,
 0.20416666666666666,
 0.7)
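
The arithmetic can be checked by hand without opening the image. The sketch below is a pure-function version of the same conversion; `pixel_box_to_yolo` is a hypothetical name, and the 640 x 480 image size is an assumption inferred from the output above (76 / 0.11875 = 640 and 98 / 0.204166... = 480).

```python
from typing import Tuple


def pixel_box_to_yolo(
    left_x: float,
    top_y: float,
    width: float,
    height: float,
    image_width: int,
    image_height: int,
) -> Tuple[float, float, float, float]:
    """Convert a top-left pixel box to a normalized, center-based yolo box."""
    x_center = (left_x + width / 2) / image_width
    y_center = (top_y + height / 2) / image_height
    return (x_center, y_center, width / image_width, height / image_height)


# First prediction of the CSV: left_x=320, top_y=211, width=76, height=98.
# x_center = (320 + 38) / 640 and y_center = (211 + 49) / 480.
print(pixel_box_to_yolo(320, 211, 76, 98, image_width=640, image_height=480))
```

Up to floating-point rounding, the four values agree with the tuple returned by `get_yolo_labels` for the first row.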

Here, the class list (obj.names) is handled manually since there is a single class (person), as the check below confirms:

In [None]:
df_predictions.groupby("prediction_class").groups.keys()


dict_keys(['person'])

In [None]:
df_predictions.head()

Unnamed: 0,frame,prediction_class,confidence,left_x,top_y,width,height
0,000000086755.jpg,person,0.7,320,211,76,98
1,000000441468.jpg,person,0.54,240,388,122,198
2,000000441468.jpg,person,0.57,373,124,11,45
3,000000133244.jpg,person,0.37,6,44,29,39
4,000000133244.jpg,person,0.36,56,49,43,30


## 5) Create the .txt files for the predictions

### 5.1 Create a function to write a .txt file for each image
A function is created to write the yolo labels into a new file for each frame:

In [None]:
import os

def write_yolo_labels(frame_group):
  frame = frame_group.frame.iloc[0]

  # One label line per prediction row of this frame
  label_lines = [" ".join(str(v) for v in get_yolo_labels(row)) for _, row in frame_group.iterrows()]
  label_content = "\n".join(label_lines)

  # Swap only the file extension (.jpg -> .txt)
  filepath = BASE_PATH + os.path.splitext(frame)[0] + ".txt"

  with open(filepath, "w", encoding="utf8") as label_file:
    label_file.write(label_content)

  return filepath

Frames are identified as unique values of the "frame" column

In [None]:
frames = df_predictions["frame"].unique()
frame_groups = df_predictions.groupby("frame")

### 5.2 Data conversion and txt files creation
Then the function is called for each frame: the predictions are converted and written to the .txt file that corresponds to each .jpg image:

In [None]:
for f in frames:
  write_yolo_labels(frame_groups.get_group(f))

For example:

In [None]:
!cat /content/yolodataset/data/000000007816.txt

0 0.553125 0.4414519906323185 0.225 0.42857142857142855 0.32
0 0.6351562500000001 0.4519906323185012 0.20468750000000002 0.477751756440281 0.3
0 0.6726562500000001 0.33840749414519905 0.060937500000000006 0.11943793911007025 0.26
0 0.7898437500000001 0.2786885245901639 0.045312500000000006 0.26229508196721313 0.42
0 0.815625 0.28337236533957844 0.059375000000000004 0.2716627634660422 0.78
0 0.9226562500000001 0.28337236533957844 0.045312500000000006 0.2903981264637002 0.76
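
Before moving the files, each label line can be sanity-checked. The following is a minimal sketch; `check_label_line` is a hypothetical helper, and the rules it applies (six fields, confidence in [0, 1], box inside the image) are assumptions about what a well-formed prediction line looks like. Because the parser accepts negative `left_x` and `top_y` values, a box can extend past the image edge; the check only reports this and does not clip anything.

```python
from typing import List

EPSILON = 1e-9  # tolerance for floating-point rounding at the image border


def check_label_line(line: str) -> List[str]:
    """Return a list of problems found in one prediction label line (empty if none)."""
    parts = line.split()
    if len(parts) != 6:
        return [f"expected 6 fields, found {len(parts)}"]
    try:
        class_id = int(parts[0])
        x_center, y_center, width, height, confidence = (float(v) for v in parts[1:])
    except ValueError:
        return ["non-numeric field"]

    problems = []
    if class_id < 0:
        problems.append("negative class index")
    if not 0.0 <= confidence <= 1.0:
        problems.append("confidence outside [0, 1]")
    if x_center - width / 2 < -EPSILON or x_center + width / 2 > 1 + EPSILON:
        problems.append("box extends past the left or right image edge")
    if y_center - height / 2 < -EPSILON or y_center + height / 2 > 1 + EPSILON:
        problems.append("box extends past the top or bottom image edge")
    return problems


# First line of the example file shown above.
print(check_label_line("0 0.553125 0.4414519906323185 0.225 0.42857142857142855 0.32"))
```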

### 5.3 Change predictions txt files location
All the prediction .txt files are copied to the newly created folder "predictions", and the copies left in the image directory are deleted:

In [None]:
!cp /content/yolodataset/data/*.txt /content/yolodataset/predictions/

In [None]:
!rm -rf /content/yolodataset/data/*.txt

The folder yolodataset now contains one folder with the images (.jpg), one folder with the ground truth labels (.txt), and one folder with the prediction labels (.txt).

Then, the folder 'yolodataset', containing the images and both the ground truth and the prediction labels, is copied to the drive as the folder 'finaldataset':

In [None]:
%cp -av /content/yolodataset/ /content/drive/MyDrive/finaldataset

Streaming output truncated to the last 5000 lines.
'/content/yolodataset/data/000000365095.jpg' -> '/content/drive/MyDrive/finaldataset/data/000000365095.jpg'
'/content/yolodataset/data/000000365208.jpg' -> '/content/drive/MyDrive/finaldataset/data/000000365208.jpg'
'/content/yolodataset/data/000000365521.jpg' -> '/content/drive/MyDrive/finaldataset/data/000000365521.jpg'
'/content/yolodataset/data/000000365642.jpg' -> '/content/drive/MyDrive/finaldataset/data/000000365642.jpg'
'/content/yolodataset/data/000000365655.jpg' -> '/content/drive/MyDrive/finaldataset/data/000000365655.jpg'
'/content/yolodataset/data/000000365745.jpg' -> '/content/drive/MyDrive/finaldataset/data/000000365745.jpg'
'/content/yolodataset/data/000000365886.jpg' -> '/content/drive/MyDrive/finaldataset/data/000000365886.jpg'
'/content/yolodataset/data/000000366711.jpg' -> '/content/drive/MyDrive/finaldataset/data/000000366711.jpg'
'/content/yolodataset/data/000000366884.jpg' -> '/content/drive/MyDrive

## 6) Load the dataset into Fiftyone
In this section, a dataset is created with both ground truth and predictions to make sure that everything is there before the next step of the project.

### 6.1 Install (from source) and import Fiftyone 


In [None]:
!pip uninstall -y opencv_python_headless



In [None]:
!pip install opencv-python-headless==4.5.4.60

Collecting opencv-python-headless==4.5.4.60
  Downloading opencv_python_headless-4.5.4.60-cp37-cp37m-manylinux_2_17_x86_64.manylinux2014_x86_64.whl (47.6 MB)
     |████████████████████████████████| 47.6 MB 128 kB/s 
Installing collected packages: opencv-python-headless
Successfully installed opencv-python-headless-4.5.4.60


Support for loading confidence values from YOLO TXT files had only just been added to Fiftyone and was not yet part of a release. A source install of Fiftyone is therefore performed so that the confidence column in the prediction .txt files can be read.

https://github.com/voxel51/fiftyone#source-installs-in-google-colab

Note: the runtime has to be restarted after running the following cell:

In [None]:
%%shell

git clone --depth 1 https://github.com/voxel51/fiftyone.git
cd fiftyone
bash install.bash

Cloning into 'fiftyone'...
remote: Enumerating objects: 916, done.
remote: Counting objects: 100% (916/916), done.
remote: Compressing objects: 100% (852/852), done.
remote: Total 916 (delta 57), reused 480 (delta 27), pack-reused 0
Receiving objects: 100% (916/916), 267.15 MiB | 27.93 MiB/s, done.
Resolving deltas: 100% (57/57), done.
Checking out files: 100% (838/838), done.
***** INSTALLING FIFTYONE-DB *****
Collecting fiftyone-db
  Downloading fiftyone_db-0.3.0-py3-none-manylinux1_x86_64.whl (29.2 MB)
     |████████████████████████████████| 29.2 MB 1.4 MB/s 
Installing collected packages: fiftyone-db
Successfully installed fiftyone-db-0.3.0
***** INSTALLING FIFTYONE-APP *****
  % Total    % Received % Xferd  Average Speed   Time    Time     Time  Current
                                 Dload  Upload   Total   Spent    Left  Speed
100 13527  100 13527    0     0   357k      0 --:--:-- --:--:-- --:--:--  357k
=> Downloading nvm from git to '/root/.nvm'
=> Clonin



### 6.2 Load labels into fiftyone:
Both ground truth and prediction labels are added to the label field of a dataset created in fiftyone:

In [None]:
import fiftyone as fo
import fiftyone.utils.yolo as fouy

dataset = fo.Dataset.from_dir(
    data_path="/content/yolodataset/data",
    labels_path="/content/yolodataset/groundtruth",
    dataset_type=fo.types.YOLOv4Dataset,
    label_field="ground_truth",
    classes=["person"],  # a list of names; a bare string would be indexed character by character
)

fouy.add_yolo_labels(dataset, "predictions", "/content/yolodataset/predictions", classes=["person"])

  defaults = yaml.load(f)


Images file 'None' not found. Listing data directory '/content/yolodataset/data' instead
 100% |███████████████| 2693/2693 [17.0s elapsed, 0s remaining, 231.1 samples/s]      
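
Fiftyone stores a bounding box as `[top-left-x, top-left-y, width, height]` in relative coordinates, whereas a yolo label line stores the box center. The sketch below shows that conversion in plain Python. It is an illustration under assumptions, not Fiftyone's loader code: `yolo_line_to_detection` is a hypothetical name and the returned dictionary only imitates the fields printed for a detection.

```python
from typing import Dict, Sequence


def yolo_line_to_detection(line: str, classes: Sequence[str]) -> Dict[str, object]:
    """Turn one yolo label line into a dict with label, bounding_box and confidence."""
    parts = line.split()
    class_id = int(parts[0])
    x_center, y_center, width, height = (float(v) for v in parts[1:5])
    # Ground truth lines have five fields; prediction lines carry a sixth (confidence).
    confidence = float(parts[5]) if len(parts) > 5 else None
    return {
        "label": classes[class_id],
        "bounding_box": [x_center - width / 2, y_center - height / 2, width, height],
        "confidence": confidence,
    }


prediction_line = "0 0.553125 0.4414519906323185 0.225 0.42857142857142855 0.32"
print(yolo_line_to_detection(prediction_line, ["person"]))

# Indexing a bare string returns a single character, not the class name.
print(yolo_line_to_detection(prediction_line, "person")["label"])
```

The last call illustrates why the class names are best passed as a list: in this sketch, the string `"person"` yields the label `'p'`, which would be consistent with the `'label': 'p'` entries in the sample printout of section 7.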


## 7) Check
In this part, the import in fiftyone is checked to make sure everything is there

First, a summary info about the dataset is shown to make sure: 
* "ground_truth" and "predictions" are added to the sample fields; 
* "Num samples" matches the number of images in the dataset.

In [None]:
print(dataset)

Name:        2022.01.04.11.27.34
Media type:  image
Num samples: 2693
Persistent:  False
Tags:        []
Sample fields:
    id:           fiftyone.core.fields.ObjectIdField
    filepath:     fiftyone.core.fields.StringField
    tags:         fiftyone.core.fields.ListField(fiftyone.core.fields.StringField)
    metadata:     fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.metadata.Metadata)
    ground_truth: fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
    predictions:  fiftyone.core.fields.EmbeddedDocumentField(fiftyone.core.labels.Detections)
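
The sample count can also be cross-checked against the files on disk. This is a minimal sketch, assuming the folder layout built in section 1.4; `label_coverage` is a hypothetical helper. Images without a prediction file are not necessarily an error: `write_yolo_labels` only writes a file for frames that appear in the predictions CSV, so an image with no detections has no prediction file.

```python
from pathlib import Path
from typing import Dict, Set


def label_coverage(images_dir: str, labels_dir: str) -> Dict[str, Set[str]]:
    """Compare image and label file names (without extension) between two folders."""
    image_stems = {p.stem for p in Path(images_dir).glob("*.jpg")}
    label_stems = {p.stem for p in Path(labels_dir).glob("*.txt")}
    return {
        "images_without_labels": image_stems - label_stems,
        "labels_without_images": label_stems - image_stems,
    }
```

For example, `label_coverage("/content/yolodataset/data", "/content/yolodataset/predictions")` lists the images for which the model produced no prediction file, and any label file whose image is missing.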


Then, a few samples are shown: 

In [None]:
print(dataset.head())

[<Sample: {
    'id': '61d42f26e8b6261c49d09ebf',
    'media_type': 'image',
    'filepath': '/content/yolodataset/data/000000000139.jpg',
    'tags': BaseList([]),
    'metadata': None,
    'ground_truth': <Detections: {
        'detections': BaseList([
            <Detection: {
                'id': '61d42f26e8b6261c49d09ebd',
                'attributes': BaseDict({}),
                'tags': BaseList([]),
                'label': 'p',
                'bounding_box': BaseList([0.6449995, 0.3699765, 0.082891, 0.323967]),
                'mask': None,
                'confidence': None,
                'index': None,
            }>,
            <Detection: {
                'id': '61d42f26e8b6261c49d09ebe',
                'attributes': BaseDict({}),
                'tags': BaseList([]),
                'label': 'p',
                'bounding_box': BaseList([0.6006715, 0.4042485, 0.023625, 0.083897]),
                'mask': None,
                'confidence': None,
                'i

Finally, the app is launched to visualize the dataset with both ground truth and predictions of the yolo model. On the samples with predictions, the confidence should be shown as well.

In [None]:
session = fo.launch_app(dataset)

## 8) Save dataset with ground truth and predictions to disk
The dataset with both predictions and ground truth, which is now ready to be used in fiftyone, is saved to disk (drive) to be used in the next part of this project:

First, the local folder is copied and renamed 'dataset_prepared' to avoid replacing the previously downloaded file 'yolodataset' located in the drive: