# Data challenge

Your task is to build an algorithm for prohibitory traffic sign recognition. We have
created a training data set (20 images) that can be used for building and fine-tuning a model.
However, you are encouraged to collect more images, since it will allow you to improve your
model’s performance.

There are 6 traffic sign categories your model should be able to recognize:
* Category A - no right, left, or U-turn
* Category B - speed limit (regardless of the indicated value)
* Category C - road closed
* Category D - no entry
* Category E - no stopping, no parking
* Category F - other types of prohibitory traffic signs

## Now I'll try to implement [YOLOv8][1] in KerasCV, an extension of Keras for computer vision

PyTorch is still a bit new to me, and the original docs more or less forced me to upload our traffic sign images to the Roboflow website, where the data was augmented without any code. I felt that this defeated the purpose of the data challenge, so now I'll follow this tutorial instead:

https://keras.io/examples/vision/yolov8/

[1]: https://keras.io/api/keras_cv/models/backbones/yolo_v8/


In [66]:
!pip install keras-cv



In [67]:
# Import packages
import os
import pandas as pd
import tensorflow as tf
from tensorflow import keras
import keras_cv
from keras_cv import bounding_box
from keras_cv import visualization

# Define hyperparameters
SPLIT_RATIO = 0.5
BATCH_SIZE = 4
LEARNING_RATE = 0.001
EPOCH = 5
GLOBAL_CLIPNORM = 10.0

# Load in the Traffic Sign Data

In [68]:
class_ids = [
    'no-right-left-or-u-turn',
    'speed-limit',
    'road-closed',
    'no-entry',
    'no-stopping-no-parking',
    'other'
]

class_mapping = {class_name: class_id for class_id, class_name in enumerate(class_ids)}
print(class_mapping)

# Path to images
path_images = '/kaggle/input/traffic-signs-data-challenge/train'

# Get all JPEG image file paths in path_images and sort them
jpg_files = sorted(
    [
        os.path.join(path_images, file_name)
        for file_name in os.listdir(path_images)
        if file_name.endswith(".jpg")
    ]
)

{'no-right-left-or-u-turn': 0, 'speed-limit': 1, 'road-closed': 2, 'no-entry': 3, 'no-stopping-no-parking': 4, 'other': 5}


In [69]:
# Read the annotation file into a pandas DataFrame
annot_file = '/kaggle/input/traffic-signs-data-challenge/train/_annotations.csv'
annot_df = pd.read_csv(annot_file)

annot_df.head()

| | filename | width | height | class | xmin | ymin | xmax | ymax |
|---|---|---|---|---|---|---|---|---|
| 0 | 17_jpg.rf.06cf807529216c05e2421a97e9a3c9aa.jpg | 600 | 300 | no-entry | 22 | 149 | 60 | 177 |
| 1 | 17_jpg.rf.06cf807529216c05e2421a97e9a3c9aa.jpg | 600 | 300 | no-stopping-no-parking | 520 | 160 | 570 | 198 |
| 2 | 17_jpg.rf.06cf807529216c05e2421a97e9a3c9aa.jpg | 600 | 300 | other | 522 | 236 | 571 | 274 |
| 3 | 6_jpg.rf.8249332fe6bb156f9730dd96d112223f.jpg | 600 | 300 | road-closed | 506 | 121 | 542 | 153 |
| 4 | 6_jpg.rf.8249332fe6bb156f9730dd96d112223f.jpg | 600 | 300 | other | 555 | 97 | 597 | 131 |


In [70]:
grouped_df = annot_df.groupby(['filename']).agg({col:lambda x: list(x) for col in annot_df.columns[1:]}).reset_index()

grouped_df.head()

| | filename | width | height | class | xmin | ymin | xmax | ymax |
|---|---|---|---|---|---|---|---|---|
| 0 | 10_jpg.rf.73da5bf11a250d18d212bd276be2a02e.jpg | [600, 600, 600] | [300, 300, 300] | [road-closed, no-stopping-no-parking, speed-li... | [211, 515, 515] | [175, 107, 68] | [240, 557, 555] | [203, 143, 104] |
| 1 | 11_jpg.rf.788b737ab908f716125157158a741b1d.jpg | [600, 600] | [300, 300] | [no-stopping-no-parking, no-entry] | [543, 6] | [67, 112] | [600, 42] | [142, 146] |
| 2 | 12_jpg.rf.f2fd7c2e462831e1225e480312289ad9.jpg | [600, 600] | [300, 300] | [speed-limit, no-stopping-no-parking] | [521, 524] | [147, 186] | [556, 556] | [177, 214] |
| 3 | 13_jpg.rf.d5301f84fec055b793adabfb5e4ef329.jpg | [600, 600, 600] | [300, 300, 300] | [speed-limit, other, no-stopping-no-parking] | [431, 432, 430] | [149, 120, 91] | [459, 460, 460] | [176, 149, 120] |
| 4 | 14_jpg.rf.b69dd2f82bdc031b0ccd416aaa8ffaf5.jpg | [600, 600, 600] | [300, 300, 300] | [other, other, speed-limit] | [463, 456, 319] | [104, 158, 124] | [487, 496, 327] | [158, 213, 134] |


In [71]:
# Initialize lists to store information
image_paths = []
bbox = []
classes = []


def parse_annotation_csv(row):
    # Get the path of the image
    image_path = os.path.join(path_images, row['filename'])

    # Build one [xmin, ymin, xmax, ymax] box per traffic sign, as floats
    boxes = [
        [float(coord) for coord in box]
        for box in zip(row['xmin'], row['ymin'], row['xmax'], row['ymax'])
    ]

    # Map each class name to its integer id; a separate name avoids
    # shadowing the global class_ids list
    sign_class_ids = [class_mapping[class_name] for class_name in row['class']]

    return image_path, boxes, sign_class_ids

# Process annotations for each row in the DataFrame
for index, row in grouped_df.iterrows():
    image_path, boxes, sign_class_ids = parse_annotation_csv(row)

    image_paths.append(image_path)
    bbox.append(boxes)
    classes.append(sign_class_ids)

Each image contains a different number of traffic signs, so the per-image lists of bounding boxes and classes have different lengths and cannot be stored in a regular, rectangular tensor. Ragged tensors can hold such variable-length data. The `tf.data.Dataset.from_tensor_slices` method then builds a dataset by slicing the input tensors along the first dimension, which yields one element per image and gives a flexible input pipeline for further processing.
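To make the slicing behavior concrete, here is a minimal sketch on toy data. The values and the names `toy_boxes`, `toy_classes`, and `toy_data` are made up for illustration and are not part of the challenge data. It also assumes each image's boxes are stored as a list of `[xmin, ymin, xmax, ymax]` rows, and it passes `ragged_rank=1` so that only the boxes-per-image dimension is ragged, which is a choice of this sketch rather than something the notebook does.

```python
import tensorflow as tf

# Hypothetical toy data: the first image has one box, the second has two.
toy_boxes = [
    [[22.0, 149.0, 60.0, 177.0]],
    [[506.0, 121.0, 542.0, 153.0], [555.0, 97.0, 597.0, 131.0]],
]
toy_classes = [[3], [2, 5]]

# ragged_rank=1 keeps only the "boxes per image" dimension ragged;
# the innermost dimension (the 4 coordinates) stays uniform.
boxes_rt = tf.ragged.constant(toy_boxes, ragged_rank=1)
classes_rt = tf.ragged.constant(toy_classes)

# Slicing along the first dimension yields one (classes, boxes) pair per image.
toy_data = tf.data.Dataset.from_tensor_slices((classes_rt, boxes_rt))

boxes_per_image = [int(b.shape[0]) for _, b in toy_data]
print(boxes_rt.shape)
print(boxes_per_image)
```

Each dataset element keeps its own number of boxes, so no padding is needed at this stage.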

In [72]:
bbox = tf.ragged.constant(bbox)
classes = tf.ragged.constant(classes)
image_paths = tf.ragged.constant(image_paths)

data = tf.data.Dataset.from_tensor_slices((image_paths, classes, bbox))

## Train Validation split

In [73]:
# Determine the number of validation samples
num_val = int(len(grouped_df) * SPLIT_RATIO)

# Split the dataset into train and validation sets
val_data = data.take(num_val)
train_data = data.skip(num_val)
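
Note that `take` and `skip` split the dataset in its existing order, which here follows the filename order produced by the groupby. If a random split is preferred, one option is to shuffle the sample indices first. The following is a plain-Python sketch of that idea, not part of the original notebook; the helper `split_indices` and the seed value are hypothetical, and 20 is used as the sample count only because the training set is described as 20 images.

```python
import random

def split_indices(num_samples, split_ratio, seed=42):
    """Return (val_indices, train_indices) after a seeded shuffle."""
    indices = list(range(num_samples))
    # A dedicated Random instance keeps the split reproducible
    # without touching the global random state.
    random.Random(seed).shuffle(indices)
    num_val = int(num_samples * split_ratio)
    return indices[:num_val], indices[num_val:]

val_idx, train_idx = split_indices(num_samples=20, split_ratio=0.5)
print(len(val_idx), len(train_idx))  # 10 10
```

The resulting index lists could then be used to reorder `image_paths`, `bbox`, and `classes` before they are converted to ragged tensors.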