## Traffic Light Detection Sub-Project

Project Description:

* Data exploration and preparation of a set of traffic light images for training
* Faster R-CNN Inception v2 trained on the COCO dataset
* LISA Traffic Light Database by Mark P. Philipsen, Morten B. Jensen, Andreas Møgelmose, Thomas B. Moeslund and Mohan M. Trivedi
* Image folder dayClip3: 643 frames, 1280 x 960
* TensorFlow 1.12, CUDA 9.0, cuDNN 7.3.1, Anaconda environment
* Intel Core i7, Nvidia GeForce 845M (my trusty laptop)
* The filename cannot be used as the image name because it is not unique; the pandas index is used instead
* Images are small when converted to JPG

TODO:
- gather simulator data, annotate and train
- gather Carla data, annotate and train

In [1]:
import matplotlib.image as mpimg
import matplotlib.pyplot as plt
import numpy as np
import pandas as pd
import random
import csv
import cv2

%matplotlib inline

### DATA EXPLORATION

In [2]:
basedir = './images/'

In [3]:
traffic_df = pd.read_csv("./images/dayClip3/trainAnnotationsBOX.csv", delimiter=';')
#traffic_df = pd.read_csv("./images/dayClip3/trainAnnotationsBULB.csv", delimiter=';')
traffic_df.head()

Unnamed: 0,Filename,Annotation tag,Upper left corner X,Upper left corner Y,Lower right corner X,Lower right corner Y,Origin file,Origin frame number,Origin track,Origin track frame number
0,dayTraining/dayClip3--00001.png,warning,622,361,637,379,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,1,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,1
1,dayTraining/dayClip3--00002.png,warning,622,358,637,380,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,2,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,2
2,dayTraining/dayClip3--00003.png,warning,506,382,518,404,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,3,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,3
3,dayTraining/dayClip3--00004.png,warning,620,358,638,380,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4
4,dayTraining/dayClip3--00004.png,warning,504,382,519,404,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4


In [4]:
traffic_df["Upper left corner X"] = traffic_df["Upper left corner X"].apply(str)
traffic_df["Upper left corner Y"] = traffic_df["Upper left corner Y"].apply(str)
traffic_df["Lower right corner X"] = traffic_df["Lower right corner X"].apply(str)
traffic_df["Lower right corner Y"] = traffic_df["Lower right corner Y"].apply(str)
traffic_df["Lower right corner Y"].describe()

count     1131
unique     226
top        304
freq       175
Name: Lower right corner Y, dtype: object

Filenames are not unique: an image that contains several traffic lights has one annotation row per light.

In [6]:
print('Number of images: ', traffic_df.Filename.size)
print('Number of unique images: ', traffic_df.Filename.nunique())

Number of images:  1131
Number of unique images:  547
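
The gap between the row count and the number of unique filenames comes from images that carry more than one annotation. A minimal sketch of how to list those images, using a toy frame (`demo_df`, a hypothetical stand-in for `traffic_df`) built from the first rows shown earlier:

```python
import pandas as pd

# Toy stand-in for traffic_df; the values mirror the first rows shown above.
demo_df = pd.DataFrame({
    "Filename": [
        "dayTraining/dayClip3--00003.png",
        "dayTraining/dayClip3--00004.png",
        "dayTraining/dayClip3--00004.png",
    ],
    "Annotation tag": ["warning", "warning", "warning"],
})

# Number of annotation rows per image file.
boxes_per_image = demo_df.groupby("Filename").size()

# Images that carry more than one annotated traffic light.
multi_box = boxes_per_image[boxes_per_image > 1]
print(multi_box)
```

Running the same two lines on `traffic_df` should show which frames account for the duplicated filenames.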


In [7]:
print('Number of unique annotation tags: ', traffic_df['Annotation tag'].nunique())

Number of unique annotation tags:  3


In [8]:
traffic_df.Filename[traffic_df.Filename == "dayTraining/dayClip3--00004.png"]

3    dayTraining/dayClip3--00004.png
4    dayTraining/dayClip3--00004.png
Name: Filename, dtype: object

In [9]:
same_filenames = traffic_df.index[traffic_df['Filename'] == 'dayTraining/dayClip3--00004.png'].tolist()
same_filenames

[3, 4]
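
The `Labels` column queried in the next cell is not created by any cell shown here. A minimal sketch of how it could be derived from `Annotation tag` follows. The mapping is an assumption: only `warning` → 1 is confirmed by the tables in this notebook, while `stop` → 0 and `go` → 2 are inferred from the red and green counts. `TAG_TO_LABEL` and `add_labels` are hypothetical names.

```python
import pandas as pd

# Assumed mapping: warning -> 1 appears in the notebook's tables;
# stop -> 0 and go -> 2 are inferred, not confirmed.
TAG_TO_LABEL = {"stop": 0, "warning": 1, "go": 2}

def add_labels(df, tag_column="Annotation tag"):
    """Return a copy of df with an integer 'Labels' column derived from the tag."""
    out = df.copy()
    out["Labels"] = out[tag_column].map(TAG_TO_LABEL)
    # Fail loudly if a tag outside the assumed mapping shows up.
    if out["Labels"].isna().any():
        unknown = out.loc[out["Labels"].isna(), tag_column].unique()
        raise ValueError(f"Unmapped annotation tags: {list(unknown)}")
    out["Labels"] = out["Labels"].astype(int)
    return out

demo = add_labels(pd.DataFrame({"Annotation tag": ["warning", "stop", "go"]}))
print(demo)
```

With this in place, `traffic_df = add_labels(traffic_df)` would provide the column the counting cell expects.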

In [10]:
print('Number of images with green light: ', traffic_df.Filename[traffic_df['Labels'] == 2].size)
print('Number of images with yellow light: ', traffic_df.Filename[traffic_df['Labels'] == 1].size)
print('Number of images with red light: ', traffic_df.Filename[traffic_df['Labels'] == 0].size)

Number of images with green light:  118
Number of images with yellow light:  60
Number of images with red light:  953


Convert the dayClip3 images from .png to .jpg

In [None]:
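import os
from glob import glob
from PIL import Image

def png_to_jpeg(src_dir="./images/dayTraining", dst_dir="./images/dayTrain"):
    # Convert every dayClip3 PNG frame in src_dir to a JPG in dst_dir.
    # dst_dir is the folder the original string slicing produced; it must exist.
    # Later cells read from basedir + Filename (./images/dayTraining/),
    # so pass that as dst_dir if the JPGs should live there.
    for png_path in glob(os.path.join(src_dir, "dayClip3--*.png")):
        stem = os.path.splitext(os.path.basename(png_path))[0]
        img = Image.open(png_path).convert("RGB")  # JPEG has no alpha channel
        img.save(os.path.join(dst_dir, stem + ".jpg"))

png_to_jpeg()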

Change the image extensions in traffic_df.Filename to .jpg

In [11]:
# Swap the three-character extension on every row at once
traffic_df['Filename'] = traffic_df['Filename'].str[:-3] + "jpg"

In [12]:
traffic_df.head()

Unnamed: 0,Filename,Annotation tag,Upper left corner X,Upper left corner Y,Lower right corner X,Lower right corner Y,Origin file,Origin frame number,Origin track,Origin track frame number,Labels
0,dayTraining/dayClip3--00001.jpg,warning,622,361,637,379,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,1,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,1,1
1,dayTraining/dayClip3--00002.jpg,warning,622,358,637,380,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,2,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,2,1
2,dayTraining/dayClip3--00003.jpg,warning,506,382,518,404,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,3,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,3,1
3,dayTraining/dayClip3--00004.jpg,warning,620,358,638,380,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4,1
4,dayTraining/dayClip3--00004.jpg,warning,504,382,519,404,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4,1


Several annotation rows share a filename, so give each row a unique image name for use in the train and test folders.

In [13]:
traffic_df["file_name"] = traffic_df.index
traffic_df["file_name"] = traffic_df["file_name"].apply(str)
traffic_df["file_name"] = 'img' + traffic_df["file_name"] + '.jpg'
traffic_df.head()

Unnamed: 0,Filename,Annotation tag,Upper left corner X,Upper left corner Y,Lower right corner X,Lower right corner Y,Origin file,Origin frame number,Origin track,Origin track frame number,Labels,file_name
0,dayTraining/dayClip3--00001.jpg,warning,622,361,637,379,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,1,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,1,1,img0.jpg
1,dayTraining/dayClip3--00002.jpg,warning,622,358,637,380,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,2,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,2,1,img1.jpg
2,dayTraining/dayClip3--00003.jpg,warning,506,382,518,404,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,3,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,3,1,img2.jpg
3,dayTraining/dayClip3--00004.jpg,warning,620,358,638,380,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4,1,img3.jpg
4,dayTraining/dayClip3--00004.jpg,warning,504,382,519,404,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4,dayTraining/dayClip3/dayClip3Shutter0.000800-G...,4,1,img4.jpg


Copy the old filenames for reference when reading the images from the original folder.

Copy the new filenames to be used in identifying images in the train and test folders.

Create the labels csv file for both train and test folder images.

In [14]:
samples = traffic_df.Filename
samples = np.array(samples)
samples[0]

'dayTraining/dayClip3--00001.jpg'

In [15]:
samples_newname = traffic_df.file_name
samples_newname = np.array(samples_newname)
samples_newname[0]

'img0.jpg'

In [18]:
labels_df = traffic_df[['file_name', 'Annotation tag', 'Upper left corner X', 'Upper left corner Y', 'Lower right corner X', 'Lower right corner Y']].copy()

In [19]:
labels_df.columns = ['file_name', 'class', 'xmin', 'ymin', 'xmax', 'ymax']

In [20]:
labels_df.head()

Unnamed: 0,file_name,class,xmin,ymin,xmax,ymax
0,img0.jpg,warning,622,361,637,379
1,img1.jpg,warning,622,358,637,380
2,img2.jpg,warning,506,382,518,404
3,img3.jpg,warning,620,358,638,380
4,img4.jpg,warning,504,382,519,404


In [21]:
labels_df.to_csv(r'./images/labels.csv', index = False, header=True)

Reminder: 

The LISA images need no resizing because they are quite small (under roughly 200 KB) once converted to JPG. Data added from other sources may need resizing and conversion.
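
If images from another source do get resized, their bounding boxes have to be scaled by the same factors. A minimal sketch of that arithmetic, assuming boxes are stored as `(xmin, ymin, xmax, ymax)` in pixels; `scale_box` is a hypothetical helper, not part of this notebook:

```python
def scale_box(box, old_size, new_size):
    """Scale an (xmin, ymin, xmax, ymax) box from old_size to new_size.

    Sizes are (width, height) tuples in pixels.
    """
    xmin, ymin, xmax, ymax = box
    sx = new_size[0] / old_size[0]  # horizontal scale factor
    sy = new_size[1] / old_size[1]  # vertical scale factor
    return (round(xmin * sx), round(ymin * sy), round(xmax * sx), round(ymax * sy))

# Halving a 1280 x 960 frame halves every coordinate.
print(scale_box((620, 358, 638, 380), (1280, 960), (640, 480)))
```

Rounding to whole pixels can shift a box edge by up to half a pixel, which matters most for the very small boxes in this dataset.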

In [22]:
from sklearn.model_selection import train_test_split
from sklearn.utils import shuffle

In [23]:
# rand_state = np.random.randint(0,100)
rand_state = 100
FN_train, FN_test, X_train, X_test, y_train_df, y_test_df = train_test_split(samples, samples_newname, labels_df, test_size=0.2, random_state=rand_state)

In [24]:
y_train_df = y_train_df.copy()
y_test_df = y_test_df.copy()
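
Because the split above works on annotation rows, two rows of the same image can end up on different sides of the split. This is not what the notebook does, but if that overlap is unwanted, the split can be made per image instead. A minimal sketch, assuming the grouping key is the original `Filename` column; `split_by_image` is a hypothetical helper, and scikit-learn's `GroupShuffleSplit` serves a similar purpose:

```python
import numpy as np
import pandas as pd

def split_by_image(df, group_column="Filename", test_size=0.2, seed=100):
    """Split rows so that all annotations of one image land on the same side."""
    rng = np.random.default_rng(seed)
    images = df[group_column].drop_duplicates().tolist()
    order = rng.permutation(len(images))           # shuffled image positions
    n_test = max(1, int(round(len(images) * test_size)))
    test_images = {images[i] for i in order[:n_test]}
    is_test = df[group_column].isin(test_images)   # row-level mask
    return df[~is_test], df[is_test]

demo = pd.DataFrame({"Filename": ["a", "a", "b", "c", "d", "e"],
                     "class": ["stop", "stop", "go", "go", "warning", "stop"]})
train_rows, test_rows = split_by_image(demo)
print(len(train_rows), len(test_rows))
```

The test share is then measured in images rather than rows, so the row counts on each side will only approximate the requested ratio.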

In [25]:
print('FN_train size: ', FN_train.size)
print('FN_test size: ', FN_test.size)
print('X_train size: ', X_train.size)
print('X_test size: ', X_test.size)
print('y_train_df size: ', y_train_df.size)
print('y_test_df size: ', y_test_df.size)

FN_train size:  904
FN_test size:  227
X_train size:  904
X_test size:  227
y_train_df size:  5424
y_test_df size:  1362
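
The last two numbers are larger because `DataFrame.size` counts cells (rows times columns), while the arrays above count rows. A short check of the arithmetic, assuming the test share is rounded up, which matches the printed sizes:

```python
import math

n_rows = 1131          # annotation rows in traffic_df
n_label_columns = 6    # file_name, class, xmin, ymin, xmax, ymax

n_test = math.ceil(n_rows * 0.2)   # test share of 20 %, rounded up
n_train = n_rows - n_test

print(n_train, n_test)                                       # rows per split
print(n_train * n_label_columns, n_test * n_label_columns)   # cells per split
```

Using `len(y_train_df)` or `y_train_df.shape[0]` instead of `.size` would report rows directly.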


Copy images to train and test folders

Create the train and test folders before proceeding; cv2.imwrite does not create missing directories.
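
The folders can also be created from the notebook itself. A minimal sketch, assuming the `./images/train` and `./images/test` layout used in the following cells; `ensure_dirs` is a hypothetical helper:

```python
import os

def ensure_dirs(base="./images", names=("train", "test")):
    """Create base/train and base/test if they do not exist yet."""
    paths = [os.path.join(base, name) for name in names]
    for path in paths:
        os.makedirs(path, exist_ok=True)  # no error if the folder already exists
    return paths
```

Calling `ensure_dirs()` once before the copy loops is enough; repeated calls are harmless.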

In [26]:
new_basedir = './images/train/'
height_train = []
width_train = []
for index in range(X_train.size):
    name = basedir + FN_train[index]  # original (non-unique) path
    X_train[index] = new_basedir + X_train[index]  # new unique path
    image = cv2.imread(name)  # returns None if the file cannot be read
    if image is None:
        raise FileNotFoundError(name)
    cv2.imwrite(X_train[index], image)  # save under the unique name
    height_train.append(image.shape[0])
    width_train.append(image.shape[1])

In [27]:
new_basedir = './images/test/'
height_test = []
width_test = []
for index in range(X_test.size):
    name = basedir + FN_test[index]  # original (non-unique) path
    X_test[index] = new_basedir + X_test[index]  # new unique path
    image = cv2.imread(name)  # returns None if the file cannot be read
    if image is None:
        raise FileNotFoundError(name)
    cv2.imwrite(X_test[index], image)  # save under the unique name
    height_test.append(image.shape[0])
    width_test.append(image.shape[1])

In [37]:
y_train_df['filename'] = y_train_df['file_name']
y_train_df['height'] = height_train
y_train_df['width'] = width_train
y_train_df['height'] = y_train_df['height'].apply(str)
y_train_df['width'] = y_train_df['width'].apply(str)
y_train = y_train_df[['filename', 'width', 'height','class', 'xmin', 'ymin', 'xmax', 'ymax']].copy()

In [38]:
y_test_df['filename'] = y_test_df['file_name']
y_test_df['height'] = height_test
y_test_df['width'] = width_test
y_test_df['height'] = y_test_df['height'].apply(str)
y_test_df['width'] = y_test_df['width'].apply(str)
y_test = y_test_df[['filename', 'width', 'height', 'class', 'xmin', 'ymin', 'xmax', 'ymax']].copy()

Check array values

In [39]:
for i in range(5):
    print(X_train[i])

./images/train/img580.jpg
./images/train/img515.jpg
./images/train/img46.jpg
./images/train/img424.jpg
./images/train/img826.jpg


In [40]:
for i in range(5):
    print(X_test[i])

./images/test/img1074.jpg
./images/test/img816.jpg
./images/test/img954.jpg
./images/test/img241.jpg
./images/test/img239.jpg


In [41]:
y_train.head()

Unnamed: 0,filename,width,height,class,xmin,ymin,xmax,ymax
580,img580.jpg,1280,960,stop,619,77,661,140
515,img515.jpg,1280,960,stop,893,256,932,328
46,img46.jpg,1280,960,warning,780,356,792,374
424,img424.jpg,1280,960,stop,628,202,661,251
826,img826.jpg,1280,960,stop,960,205,1020,304


In [42]:
y_test.head()

Unnamed: 0,filename,width,height,class,xmin,ymin,xmax,ymax
1074,img1074.jpg,1280,960,go,975,210,1023,290
816,img816.jpg,1280,960,stop,960,205,1020,304
954,img954.jpg,1280,960,stop,960,205,1020,304
241,img241.jpg,1280,960,stop,888,313,903,344
239,img239.jpg,1280,960,stop,779,266,794,284


In [43]:
y_test.describe()

Unnamed: 0,filename,width,height,class,xmin,ymin,xmax,ymax
count,227,227,227,227,227,227,227,227
unique,227,1,1,3,143,124,134,105
top,img532.jpg,1280,960,stop,614,42,662,114
freq,1,227,227,184,29,28,28,28
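
Since `describe()` on string columns only reports counts and most frequent values, it does not reveal malformed boxes. A minimal sketch of a sanity check, assuming the column names of `y_train` and `y_test` and that coordinates are stored as strings as in this notebook; `find_bad_boxes` is a hypothetical helper:

```python
import pandas as pd

def find_bad_boxes(labels):
    """Return rows whose box is empty, inverted, or outside the image."""
    cols = ["width", "height", "xmin", "ymin", "xmax", "ymax"]
    nums = labels[cols].astype(int)  # the notebook stores these as strings
    ok = (
        (nums["xmin"] >= 0) & (nums["ymin"] >= 0)
        & (nums["xmin"] < nums["xmax"]) & (nums["ymin"] < nums["ymax"])
        & (nums["xmax"] <= nums["width"]) & (nums["ymax"] <= nums["height"])
    )
    return labels[~ok]

demo = pd.DataFrame({
    "filename": ["img0.jpg", "img1.jpg"],
    "width": ["1280", "1280"], "height": ["960", "960"],
    "xmin": ["622", "700"], "ymin": ["361", "100"],
    "xmax": ["637", "690"], "ymax": ["379", "120"],  # second box is inverted
})
bad = find_bad_boxes(demo)
print(bad)
```

An empty result from `find_bad_boxes(y_train)` and `find_bad_boxes(y_test)` would suggest the boxes are at least geometrically valid.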


Write the labels to the train and test CSV files

In [44]:
y_train.to_csv(r'./images/train_labels.csv', index = False, header=True)

In [45]:
y_test.to_csv(r'./images/test_labels.csv', index = False, header=True)

### THE END

### HELPER SCRIPT

Convert an image from .png to .jpg

In [None]:
from PIL import Image

ima = Image.open("./images/dayTraining/dayClip3--00000.png")
ima.save("./images/dayTraining/ima.jpg")