## Converting a CSV file to YOLO format
The data is in .csv format, where each row describes one object in a scene. No sequence information is used in this scenario, so the data can be converted directly to the YOLO format, in which each training image has a corresponding label file. The label file is found automatically by replacing "images" with "labels" in the image path and the image extension with .txt. The directory layout we are aiming for therefore looks like this:

Training set
* Images
    + P3_Female_casual
        + Random Identifier 1
            + cam000001.png
            + cam000002.png
        + Random Identifier 2
            + ...
    + ...
* Labels
    + P3_Female_casual
        + Random Identifier 1
            + cam000001.txt
            + cam000002.txt
        + Random Identifier 2
            + ...
    + ...

The images are already provided in this layout, but the labels are not. The cells below convert each .csv file into one .txt file per image.
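The path convention described above can be sketched as a small helper. This is only an illustration: `image_to_label_path` is a hypothetical name, `run_1` is a placeholder for a random identifier, and the directory names `Images` and `Labels` are taken from the tree above. YOLO implementations usually apply their own variant of this substitution internally.

```python
from pathlib import PurePosixPath


def image_to_label_path(image_path, image_dir="Images", label_dir="Labels"):
    """Map an image path to its label path (hypothetical helper).

    Swaps the first path component equal to `image_dir` for `label_dir`
    and changes the file extension to .txt.
    """
    # Normalise Windows separators so the path splits consistently
    path = PurePosixPath(image_path.replace("\\", "/"))
    parts = list(path.parts)
    if image_dir not in parts:
        raise ValueError(f"'{image_dir}' is not a directory in {image_path!r}")
    parts[parts.index(image_dir)] = label_dir
    return str(PurePosixPath(*parts).with_suffix(".txt"))


# 'run_1' stands in for a random identifier
print(image_to_label_path("Images/P3_Female_casual/run_1/cam000001.png"))
# Labels/P3_Female_casual/run_1/cam000001.txt
```

Only the directory name and the extension change, so the image and label trees must mirror each other exactly.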

In [1]:
import pandas as pd 
import numpy as np 
import matplotlib.pyplot as plt 
import os 

In [2]:
# Location of the labels.csv file

# INLIERS
# label_file = '../Data/Development/male_casual/labels.csv'
# label_file = '../Data/Development/female_business/labels.csv'

# OUTLIERS
label_file = '../Data/non-pedestrian/cylinder/labels.csv'

# Read in the data and print the first few rows

df = pd.read_csv(label_file)
df.head()

Unnamed: 0,file,frame_index,image_width,image_height,x_min,x_max,y_min,y_max,is_occluded,current_distance,class_text,run_id,scenario_type,object_type,start_distance_from_car,speed,angle,offset_from_road_center
0,qFaN520H78j9p7FUN0gv5/cam000001.png,1,752,480,,,,,False,12.68,background,qFaN520H78j9p7FUN0gv5,left,cylinder,10,4,90,
1,qFaN520H78j9p7FUN0gv5/cam000002.png,2,752,480,,,,,False,12.63,background,qFaN520H78j9p7FUN0gv5,left,cylinder,10,4,90,
2,qFaN520H78j9p7FUN0gv5/cam000003.png,3,752,480,,,,,False,12.54,background,qFaN520H78j9p7FUN0gv5,left,cylinder,10,4,90,
3,qFaN520H78j9p7FUN0gv5/cam000004.png,4,752,480,,,,,False,12.42,background,qFaN520H78j9p7FUN0gv5,left,cylinder,10,4,90,
4,qFaN520H78j9p7FUN0gv5/cam000005.png,5,752,480,,,,,False,12.28,background,qFaN520H78j9p7FUN0gv5,left,cylinder,10,4,90,


In [3]:
for ctr, (_, row) in enumerate(df.iterrows(), start=1):
    # Build the output path from the 'file' column (handles Windows separators)
    parts = row['file'].replace('\\', '/').split('/')
    path = os.path.join(*parts)
    label_path = os.path.splitext(path)[0] + '.txt'

    # Create the run directory (relative to the working directory) if needed
    label_dir = os.path.dirname(label_path)
    if label_dir:
        os.makedirs(label_dir, exist_ok=True)

    # Extract bounding-box information
    bounding_box = row.loc[['x_min', 'x_max', 'y_min', 'y_max']]
    xmin, xmax, ymin, ymax = bounding_box.values
    width, height = row['image_width'], row['image_height']

    # Open in append mode: the file is created if missing, and a further
    # object in the same image is added to the existing file.
    with open(label_path, 'a') as file_object:
        # Rows with NaN coordinates contain no object: leave the label file empty
        if not bounding_box.isnull().values.any():
            # YOLO formatting: class id, then centre and size normalised by image size
            yolo_x_centre = ((xmin + xmax) / 2) / width
            yolo_y_centre = ((ymin + ymax) / 2) / height
            yolo_width = (xmax - xmin) / width
            yolo_height = (ymax - ymin) / height

            values = [0, yolo_x_centre, yolo_y_centre, yolo_width, yolo_height]
            # One object per line, so appended objects do not run together
            file_object.write(' '.join(str(x) for x in values) + '\n')

    # Status update every 500 rows (counts all rows, including background frames)
    if ctr % 500 == 0:
        print('Done {} of {}'.format(ctr, len(df)), end='\r', flush=True)
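
As a worked check of the conversion above, the sketch below wraps the same arithmetic in a function and applies it to a made-up box in a 752×480 image (the image size shown in the `df.head()` output). The function name `to_yolo` and the box coordinates are illustrative assumptions, not values from the dataset.

```python
def to_yolo(xmin, xmax, ymin, ymax, img_width, img_height):
    """Convert pixel corner coordinates to normalised YOLO values:
    (x_centre, y_centre, width, height), each relative to the image size."""
    x_centre = (xmin + xmax) / 2 / img_width
    y_centre = (ymin + ymax) / 2 / img_height
    box_width = (xmax - xmin) / img_width
    box_height = (ymax - ymin) / img_height
    return x_centre, y_centre, box_width, box_height


# Made-up box: x from 188 to 564, y from 120 to 360, in a 752x480 image
x_c, y_c, w, h = to_yolo(188, 564, 120, 360, 752, 480)
print(f"0 {x_c} {y_c} {w} {h}")  # 0 0.5 0.5 0.5 0.5
```

A box centred in the image and covering half of each dimension maps to 0.5 for all four values, which makes it a convenient sanity check.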


**Repeat the read-in step and the conversion step for every label file.**
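
One way to avoid editing `label_file` by hand is to collect all label files first and loop over them. The sketch below only covers the discovery step. `find_label_files` is a hypothetical helper, and it assumes that every label file is named `labels.csv` and lives somewhere below `../Data`, as the paths in cell `In [2]` suggest.

```python
import glob
import os


def find_label_files(data_root):
    """Return every labels.csv below `data_root`, sorted for reproducibility."""
    pattern = os.path.join(data_root, '**', 'labels.csv')
    return sorted(glob.glob(pattern, recursive=True))


# Assumed data root; adjust to the actual location of the dataset
for label_file in find_label_files('../Data'):
    print(label_file)
    # Here: df = pd.read_csv(label_file), followed by the conversion loop
```

Because the conversion opens label files in append mode, running it twice over the same label file would duplicate entries, so existing .txt files should be removed before a re-run.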