# Traffic Signs Detection using YOLOv3

Important: The following publicly available datasets on kaggle.com are required to run this notebook:
- [Models, YOLOv3 weights and config, videos and utils](https://www.kaggle.com/eiriksteira/gtsrb-models-idatt-2502)

Resources on getting started with Kaggle datasets can be found [here](https://www.kaggle.com/docs/datasets).


In this notebook, a model trained with the Darknet framework first detects traffic signs and assigns each one to one of 4 categories. A model trained in Keras then classifies each detected sign into one of 43 classes.

The YOLOv3 detector was trained beforehand with the Darknet framework, using the following command:

<code>>> ./darknet detector train ts_data.data yolov3_ts_train.cfg darknet53.conv.74 yolov3.weights -gpus 0</code>

This command generates the model weight files.

# Importing libraries

In [1]:
import numpy as np 
import pandas as pd
import cv2
import time
from timeit import default_timer as timer
import matplotlib.pyplot as plt
import pickle

from keras.models import load_model

import os


# Loading labels

In [2]:
labels = pd.read_csv('../input/gtsrb-models-idatt-2502/label_names_yolo_v3.csv')
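
The labels table is later indexed with the classifier's predicted class number to get a readable sign name. The sketch below shows that lookup on a small in-memory table. Only the `SignName` column is taken from this notebook; the `ClassId` column, the three rows and the name `labels_demo` are illustrative assumptions, not the contents of the real CSV file.

```python
import io

import pandas as pd

# Hypothetical stand-in for the CSV file; the real file may differ
csv_text = (
    "ClassId,SignName\n"
    "0,Speed limit (20km/h)\n"
    "1,Speed limit (30km/h)\n"
    "2,Speed limit (50km/h)\n"
)
labels_demo = pd.read_csv(io.StringIO(csv_text))

# A predicted class number, as np.argmax(predictions) would return it
prediction = 1

# Label lookup on the default integer index, as done in the video loop
sign_name = labels_demo['SignName'][prediction]
print(sign_name)
```

This lookup relies on the rows being ordered by class number, so that row `k` holds the name of class `k`.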

# Loading the trained CNN model for Classification

In [3]:
from keras.models import load_model

model_path =  '../input/gtsrb-models-idatt-2502/gtsrb-final.h5'
model = load_model(model_path)

model.summary()



Model: "model"
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_1 (InputLayer)            [(None, 48, 48, 3)]  0                                            
__________________________________________________________________________________________________
conv_1_3x3 (Conv2D)             (None, 48, 48, 32)   864         input_1[0][0]                    
__________________________________________________________________________________________________
batch_normalization (BatchNorma (None, 48, 48, 32)   128         conv_1_3x3[0][0]                 
__________________________________________________________________________________________________
activation (Activation)         (None, 48, 48, 32)   0           batch_normalization[0][0]        
______________________________________________________________________________________________

# Loading YOLOv3 network with OpenCV dnn library

## Loading the trained weights and cfg file into the Network

In [4]:
path_to_weights = '../input/gtsrb-models-idatt-2502/yolov3_ts_train_best.weights'
path_to_cfg = '../input/gtsrb-models-idatt-2502/yolov3_ts_train.cfg'

network = cv2.dnn.readNetFromDarknet(path_to_cfg, path_to_weights)

# Prefer the OpenCL FP16 target (GPU); OpenCV falls back to the CPU
# when no supported OpenCL device is found
network.setPreferableBackend(cv2.dnn.DNN_BACKEND_OPENCV)
network.setPreferableTarget(cv2.dnn.DNN_TARGET_OPENCL_FP16)


## Getting output layers where detections are made

In [5]:
# Getting names of all YOLOv3 layers
layers_all = network.getLayerNames()

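# YOLOv3 makes its detections in layers 82, 94 and 106.
# getUnconnectedOutLayers() returns their 1-based indices; flattening handles
# both the Nx1 and the 1-D shapes returned by different OpenCV versions
layers_names_output = [layers_all[int(i) - 1]
                       for i in np.array(network.getUnconnectedOutLayers()).flatten()]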

print(layers_names_output)


['yolo_82', 'yolo_94', 'yolo_106']
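
Depending on the OpenCV version, `getUnconnectedOutLayers()` may return the 1-based layer indices either as an Nx1 array or as a flat 1-D array. The following sketch shows one way to map both shapes to layer names using NumPy only. The helper `output_layer_names` and the short list of layer names are hypothetical and serve only as an illustration.

```python
import numpy as np


def output_layer_names(layer_names, out_layer_ids):
    """Map 1-based output layer ids to names for either return shape."""
    ids = np.asarray(out_layer_ids).flatten()
    return [layer_names[int(i) - 1] for i in ids]


# Hypothetical, shortened list of layer names
demo_names = ['conv_0', 'yolo_82', 'conv_83', 'yolo_94']

# Nx1 shape and flat shape give the same result
from_column = output_layer_names(demo_names, np.array([[2], [4]]))
from_flat = output_layer_names(demo_names, np.array([2, 4]))
print(from_column, from_flat)
```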


## Setting probability, threshold and colour for the bounding boxes

In [6]:
# Minimum probability to eliminate weak detections
probability_minimum = 0.2

# Overlap threshold used by non-maximum suppression to drop redundant bounding boxes
threshold = 0.2

# Generating colours for bounding boxes
colours = np.random.randint(0, 255, size=(len(labels), 3), dtype='uint8')
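
To make the role of the two thresholds concrete, here is a simplified sketch of greedy non-maximum suppression on boxes in `[x_min, y_min, width, height]` format. It illustrates the general idea only and is not OpenCV's implementation of `cv2.dnn.NMSBoxes`. The function names and the example boxes are assumptions made for this illustration.

```python
import numpy as np


def iou(box_a, box_b):
    """Intersection over union of two [x_min, y_min, width, height] boxes."""
    ax, ay, aw, ah = box_a
    bx, by, bw, bh = box_b
    inter_w = max(0, min(ax + aw, bx + bw) - max(ax, bx))
    inter_h = max(0, min(ay + ah, by + bh) - max(ay, by))
    inter = inter_w * inter_h
    union = aw * ah + bw * bh - inter
    return inter / union if union > 0 else 0.0


def greedy_nms(boxes, scores, score_min, iou_max):
    """Keep the best-scoring boxes, dropping boxes that overlap a kept one."""
    order = [i for i in np.argsort(scores)[::-1] if scores[i] > score_min]
    keep = []
    for i in order:
        if all(iou(boxes[i], boxes[j]) <= iou_max for j in keep):
            keep.append(int(i))
    return keep


# Two heavily overlapping boxes and one separate box (made-up values)
demo_boxes = [[100, 100, 50, 50], [105, 105, 50, 50], [300, 200, 40, 40]]
demo_scores = [0.9, 0.6, 0.8]

kept = greedy_nms(demo_boxes, demo_scores, score_min=0.2, iou_max=0.2)
print(kept)
```

With a low overlap threshold such as 0.2, the second box is dropped because it overlaps the higher-scoring first box, while the third box is kept.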

# Reading input video

In [7]:
# Reading video from a file
video = cv2.VideoCapture('../input/gtsrb-models-idatt-2502/traffic-sign-test-video.mp4')

# Writer that will be used to write processed frames
writer = None

# Variables for spatial dimensions of the frames
h, w = None, None


# Processing the video frames
The `forward()` method of the `cv2.dnn` network returns one array per output layer. Each row of an array describes one candidate detection: the x and y coordinates of the box centre, the width and height of the bounding box (all relative to the frame size), an objectness confidence, and one score for each class the detector was trained on (here the 4 traffic sign categories, not the classes listed in coco.names). The class with the highest score is taken as the predicted class.
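
As a minimal sketch of how a single detection row is decoded, the example below applies the same steps as the loop in the next cell to one made-up row. The helper `decode_detection`, the row values and the 640x480 frame size are assumptions for illustration.

```python
import numpy as np


def decode_detection(row, frame_w, frame_h):
    """Turn one detection row into a pixel box, class id and confidence."""
    # Class scores follow the four box values and the objectness value
    scores = row[5:]
    class_id = int(np.argmax(scores))
    confidence = float(scores[class_id])

    # Scale the relative box to pixel coordinates
    x_center, y_center, box_w, box_h = row[0:4] * np.array(
        [frame_w, frame_h, frame_w, frame_h])

    # Convert the centre to the top-left corner
    x_min = int(x_center - box_w / 2)
    y_min = int(y_center - box_h / 2)
    return [x_min, y_min, int(box_w), int(box_h)], class_id, confidence


# Made-up row: centre (0.5, 0.5), size (0.25, 0.5), objectness 0.9, 4 scores
demo_row = np.array([0.5, 0.5, 0.25, 0.5, 0.9, 0.1, 0.7, 0.1, 0.1])

box, class_id, confidence = decode_detection(demo_row, 640, 480)
print(box, class_id, confidence)
```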

In [8]:
%matplotlib inline

# Set default size of plots
plt.rcParams['figure.figsize'] = (3, 3)

# Variable for counting total amount of frames
f = 0

# Variable for counting total processing time
t = 0

# Catch frames in the loop
while True:
    # Capture frames one-by-one
    ret, frame = video.read()

    # If the frame was not retrieved
    if not ret:
        break
       
    # Get spatial dimensions of the frame for the first time
    if w is None or h is None:
        h, w = frame.shape[:2]

    # Blob from current frame
    blob = cv2.dnn.blobFromImage(frame, 1 / 255, (416, 416), swapRB=True, crop=False)

    # Forward pass with blob through output layers
    network.setInput(blob)
    start = time.time()
    output_from_network = network.forward(layers_names_output)
    end = time.time()

    # Increase counters
    f += 1
    t += end - start

    print('Frame number {0} took {1:.5f} seconds'.format(f, end - start))

    bounding_boxes = []
    confidences = []
    class_numbers = []

    # Go through all output layers after feed forward pass
    for result in output_from_network:
        
        # Go through all detections from current output layer
        for detected_objects in result:
            
            # Class scores for the current detection (one per detector category)
            scores = detected_objects[5:]
            
            # Index of the class with the maximum value of probability
            class_current = np.argmax(scores)
            
            # Get value of probability for defined class
            confidence_current = scores[class_current]

            # Eliminate weak predictions by minimum probability
            if confidence_current > probability_minimum:
                
                # Scale bounding box coordinates to the initial frame size
                box_current = detected_objects[0:4] * np.array([w, h, w, h])

                # Top left corner coordinates
                x_center, y_center, box_width, box_height = box_current
                x_min = int(x_center - (box_width / 2))
                y_min = int(y_center - (box_height / 2))

                # Add results into prepared lists
                bounding_boxes.append([x_min, y_min, int(box_width), int(box_height)])
                confidences.append(float(confidence_current))
                class_numbers.append(class_current)
                

    # Non-maximum suppression of given bounding boxes
    # which removes redundant overlapping bounding boxes.
    results = cv2.dnn.NMSBoxes(bounding_boxes, confidences, probability_minimum, threshold)

    is_any_detected_objects_left = len(results) > 0

    if is_any_detected_objects_left:
        
        for i in results.flatten():
            
            # Top-left corner, width and height of the bounding box
            x_min, y_min = bounding_boxes[i][0], bounding_boxes[i][1]
            box_width, box_height = bounding_boxes[i][2], bounding_boxes[i][3]
            
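            # Cut out the traffic sign, clamping the box to non-negative
            # coordinates so that negative values do not wrap around when slicing
            x0, y0 = max(x_min, 0), max(y_min, 0)
            x1, y1 = max(x_min + box_width, 0), max(y_min + box_height, 0)
            c_ts = frame[y0:y1, x0:x1, :]
            
            # Skip empty crops (box entirely outside the frame)
            if c_ts.shape[0] == 0 or c_ts.shape[1] == 0:
                pass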
            else:
                # Get preprocessed blob with Traffic Sign of required shape
                blob_ts = cv2.dnn.blobFromImage(c_ts, scalefactor=1, size=(48, 48), swapRB=True, crop=False)
                blob_ts = blob_ts.transpose(0, 2, 3, 1)
      
                # Feed the CNN model to get predicted label among 43 classes
                predictions = model.predict(blob_ts)

                # Get the class with the maximum value
                prediction = np.argmax(predictions)

                # Colour for current bounding box
                colour_box_current = colours[class_numbers[i]].tolist()

                # Draw bounding box on the original current frame
                cv2.rectangle(frame, (x_min, y_min),
                              (x_min + box_width, y_min + box_height),
                              colour_box_current, 2)

                # Prepare text with label and confidence for current bounding box
                text_box_current = '{}: {:.4f}'.format(labels['SignName'][prediction],
                                                       confidences[i])

                # Put text with label and confidence on the original image
                cv2.putText(frame, text_box_current, (x_min, y_min - 5),
                            cv2.FONT_HERSHEY_SIMPLEX, 0.5, colour_box_current, 2)


    # Initialize writer only once
    if writer is None:
        fourcc = cv2.VideoWriter_fourcc(*'mp4v')

        # Create the writer for the output video file
        # (25 FPS, same width and height as the processed frames)
        writer = cv2.VideoWriter('result.mp4', fourcc, 25,
                                 (frame.shape[1], frame.shape[0]), True)

    # Write processed current frame to the file
    writer.write(frame)


# Release video reader and writer
# (the writer only exists if at least one frame was read)
video.release()
if writer is not None:
    writer.release()


[ WARN:0] global /tmp/pip-req-build-0culq997/opencv/modules/dnn/src/dnn.cpp (1422) setUpNet DNN: OpenCL target is not supported with current OpenCL device (tested with GPUs only), switching to CPU.


Frame number 1 took 0.97270 seconds


2021-11-28 10:41:40.976711: I tensorflow/compiler/mlir/mlir_graph_optimization_pass.cc:185] None of the MLIR Optimization Passes are enabled (registered 2)


Frame number 2 took 0.62464 seconds
Frame number 3 took 0.61929 seconds
Frame number 4 took 0.94200 seconds
Frame number 5 took 0.62285 seconds
Frame number 6 took 0.63592 seconds
Frame number 7 took 0.63684 seconds
Frame number 8 took 0.62177 seconds
Frame number 9 took 0.62528 seconds
Frame number 10 took 0.62592 seconds
Frame number 11 took 0.63127 seconds
Frame number 12 took 0.62715 seconds
Frame number 13 took 0.62412 seconds
Frame number 14 took 0.61850 seconds
Frame number 15 took 0.62691 seconds
Frame number 16 took 0.61947 seconds
Frame number 17 took 0.62048 seconds
Frame number 18 took 0.62384 seconds
Frame number 19 took 0.62417 seconds
Frame number 20 took 0.63115 seconds
Frame number 21 took 0.62574 seconds
Frame number 22 took 0.62499 seconds
Frame number 23 took 0.62690 seconds
Frame number 24 took 0.62043 seconds
Frame number 25 took 0.61799 seconds
Frame number 26 took 0.62202 seconds
Frame number 27 took 0.62283 seconds
Frame number 28 took 0.61869 seconds
Frame num
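
Inside the loop above, each cropped sign is resized with `cv2.dnn.blobFromImage`, which returns a blob in channels-first layout (batch, channels, height, width), and then transposed because the Keras model expects a channels-last input of shape `(None, 48, 48, 3)`. The sketch below reproduces only that layout change with NumPy. The array `demo_crop` is a made-up stand-in for a resized crop.

```python
import numpy as np

# Stand-in for a resized 48x48 crop in height, width, channels order
demo_crop = np.zeros((48, 48, 3), dtype=np.float32)
demo_crop[..., 0] = 1.0  # mark the first channel so the layout can be checked

# Channels-first blob with a batch axis, the layout blobFromImage produces
blob_nchw = demo_crop.transpose(2, 0, 1)[np.newaxis, ...]

# Same transpose as in the loop: (batch, C, H, W) -> (batch, H, W, C)
blob_nhwc = blob_nchw.transpose(0, 2, 3, 1)

print(blob_nchw.shape, blob_nhwc.shape)
```

After the transpose, the blob matches the input shape listed for `input_1` in the model summary.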

## Results

In [9]:
print('Total number of frames', f)
print('Total amount of time {:.5f} seconds'.format(t))
print('FPS:', round((f / t), 1))  

Total number of frames 56
Total amount of time 35.98161 seconds
FPS: 1.6
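
The FPS value is the number of frames divided by the accumulated time. In this notebook the timer is started and stopped around `network.forward()` only, so the figure reflects the detector's forward pass and excludes classification, drawing and writing the video. A small sketch of the calculation, using the totals printed above, might look like this. The helper name `frames_per_second` is an assumption.

```python
def frames_per_second(frame_count, total_seconds):
    """Average number of frames processed per second."""
    if total_seconds <= 0:
        raise ValueError('total_seconds must be positive')
    return frame_count / total_seconds


# Totals reported by the run above
fps = frames_per_second(56, 35.98161)
seconds_per_frame = 35.98161 / 56

print('FPS:', round(fps, 1))
print('Seconds per frame: {:.2f}'.format(seconds_per_frame))
```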


In [10]:
# Saving locally without committing
from IPython.display import FileLink

FileLink('result.mp4')

# Example results from frame

<a href="https://ibb.co/nRTbx17"><img src="https://i.ibb.co/PtRNqWc/Va-r.png" alt="Va-r" border="0"></a>