# Advanced Certification in AIML
## A Program by IIIT-H and TalentSprint

## Learning Objectives

At the end of the experiment, you will be able to:

* load and extract features of images

* implement convolutional neural networks using Keras

In [None]:
#@title Experiment Walkthrough Video

from IPython.display import HTML

HTML("""<video width="854" height="480" controls>
  <source src="https://cdn.exec.talentsprint.com/non-processed/Keras_Traffic_Sign_Detection.mp4" type="video/mp4">
</video>
""")

## Introduction

Traffic sign recognition is a challenging, real-world problem relevant to AI-based transportation systems. Traffic signs vary widely between classes in color, shape, and the presence of pictograms or text. However, some subsets of classes (e.g., speed limit signs) are very similar to each other. Using a comprehensive traffic sign detection dataset, we will train a convolutional neural network (CNN) to classify each road sign.


![alt text](https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/Images/img.png)

 

## Dataset
The dataset chosen for this experiment is the GTSRB dataset; the copy used here contains 1213 images of German traffic signs. There are 43 classes (43 different types of signs that we are going to classify).

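**Labels / Classes**

**Note:** The list below is numbered from 1, whereas the integer labels used in the code run from 0 to 42 (as `to_categorical(y, 43)` requires). Label `k` in the code therefore corresponds to entry `k + 1` in this list.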

    1:'Speed limit (20km/h)',
    2:'Speed limit (30km/h)', 
    3:'Speed limit (50km/h)', 
    4:'Speed limit (60km/h)', 
    5:'Speed limit (70km/h)', 
    6:'Speed limit (80km/h)', 
    7:'End of speed limit (80km/h)', 
    8:'Speed limit (100km/h)', 
    9:'Speed limit (120km/h)', 
    10:'No passing', 
    11:'No passing veh over 3.5 tons', 
    12:'Right-of-way at intersection', 
    13:'Priority road', 
    14:'Yield', 
    15:'Stop', 
    16:'No vehicles', 
    17:'Veh > 3.5 tons prohibited', 
    18:'No entry', 
    19:'General caution', 
    20:'Dangerous curve left', 
    21:'Dangerous curve right', 
    22:'Double curve', 
    23:'Bumpy road', 
    24:'Slippery road', 
    25:'Road narrows on the right', 
    26:'Road work', 
    27:'Traffic signals', 
    28:'Pedestrians', 
    29:'Children crossing', 
    30:'Bicycles crossing', 
    31:'Beware of ice/snow',
    32:'Wild animals crossing', 
    33:'End speed + passing limits', 
    34:'Turn right ahead', 
    35:'Turn left ahead', 
    36:'Ahead only', 
    37:'Go straight or right', 
    38:'Go straight or left', 
    39:'Keep right', 
    40:'Keep left', 
    41:'Roundabout mandatory', 
    42:'End of no passing', 
    43:'End no passing veh > 3.5 tons' 


In [None]:
! apt-get -qq install -y libfluidsynth1
! wget -qq  https://cdn.iiith.talentsprint.com/aiml/Experiment_related_data/FullIJCNN2013.zip
! unzip -qq FullIJCNN2013.zip

### Import Required packages

In [None]:
import pandas as pd
import numpy as np
from sklearn.model_selection import train_test_split
import matplotlib.pyplot as plt
import glob
from PIL import Image
# Keras
import tensorflow as tf 
from tensorflow import keras
from tensorflow.keras.utils import to_categorical
from tensorflow.keras.models import Sequential
from tensorflow.keras.layers import Dense, Conv2D, Flatten, MaxPool2D, Dropout

### Load the data

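#### About glob.glob and glob.iglob:

The glob library provides functions for traversing the file system and returning the paths that match a given glob pattern. `glob.glob` returns all matches as a list, while `glob.iglob` returns an iterator that yields the same paths one at a time.

**Note:** Refer to the [glob documentation](https://docs.python.org/3/library/glob.html)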
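
As a minimal sketch of how the two functions differ, the snippet below builds a small throwaway directory tree and matches it with the same kind of `*/*.ppm` pattern used in the next cell. The folder and file names are made up for illustration and are not part of the dataset.

```python
import glob
import os
import tempfile

# Build a tiny, hypothetical tree: two class folders, three .ppm files, one other file
root = tempfile.mkdtemp()
for folder, name in [("00", "a.ppm"), ("00", "b.ppm"), ("01", "c.ppm"), ("01", "notes.txt")]:
    os.makedirs(os.path.join(root, folder), exist_ok=True)
    open(os.path.join(root, folder, name), "w").close()

pattern = os.path.join(root, "*", "*.ppm")

# glob.glob returns a list, so len() works directly
ppm_files = glob.glob(pattern)

# glob.iglob yields the same paths lazily; sorted() consumes the iterator
ppm_lazy = sorted(glob.iglob(pattern))

print(len(ppm_files))
```

Only the three `.ppm` files match; `notes.txt` is ignored because it does not fit the pattern.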

In [None]:
images_data = glob.glob("/content/FullIJCNN2013/*/*.ppm")
len(images_data)

1213

### Extract features and labels from the data

In [None]:
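data, labels = [], []
for path in images_data:
    try:
        # Resize every sign image to 30x30 RGB so all samples share one shape
        img = Image.open(path).convert("RGB").resize((30, 30))
        # The parent folder name (e.g. ".../FullIJCNN2013/09/x.ppm") is the class id
        label = int(path.split("/")[3])
    except (OSError, ValueError):
        # Skip unreadable images and folders whose name is not a class id
        continue
    data.append(np.array(img))
    labels.append(label)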
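
The loop above reads the class id from the fourth `/`-separated component of the path, which only works while the archive is unpacked under `/content`. A possible alternative, sketched here with a hypothetical helper named `label_from_path`, takes the name of the image's parent folder instead, so it does not depend on where the data lives. The example file names are assumptions.

```python
from pathlib import PurePosixPath

def label_from_path(path):
    """Return the class id encoded in the name of the image's parent folder."""
    return int(PurePosixPath(path).parent.name)

example = "/content/FullIJCNN2013/09/00000.ppm"  # hypothetical file name
print(label_from_path(example))  # 9
```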

In [None]:
#Converting lists into numpy arrays
data = np.array(data)
print("data:",data.shape)
labels = np.array(labels)
print("labels:",labels)

data: (1213, 30, 30, 3)
labels: [ 9  9  9 ... 36 36 36]


### Split the data into train and test sets


In [None]:
#Splitting training and testing dataset
X_train, X_test, y_train, y_test = train_test_split(data, labels, test_size=0.2, random_state=42)

In [None]:
print(X_train.shape, X_test.shape, y_train.shape, y_test.shape)
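
The shapes printed above can be worked out by hand. The sketch below assumes scikit-learn's behaviour of rounding a fractional `test_size` up and giving the remaining samples to the training set.

```python
import math

n_samples = 1213   # number of images found by glob
test_size = 0.2

# The test share is rounded up; the rest goes to training
n_test = math.ceil(test_size * n_samples)
n_train = n_samples - n_test

print((n_train, 30, 30, 3), (n_test, 30, 30, 3), (n_train,), (n_test,))
```

Under that assumption the split gives 970 training images and 243 test images.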


### Data Pre-Processing

In [None]:
# Converts a class vector (integers) to binary class matrix
y_train = to_categorical(y_train, 43)
y_test = to_categorical(y_test, 43)
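
To illustrate what one-hot encoding produces, here is a small NumPy-only sketch. The helper `one_hot` is written for illustration and is not the Keras implementation of `to_categorical`; the sample labels are made up.

```python
import numpy as np

def one_hot(class_ids, num_classes):
    """Turn integer class ids into rows with a single 1 at the class index."""
    encoded = np.zeros((len(class_ids), num_classes), dtype="float32")
    encoded[np.arange(len(class_ids)), class_ids] = 1.0
    return encoded

sample_ids = np.array([9, 36, 0])   # made-up labels in the range 0..42
encoded = one_hot(sample_ids, 43)
print(encoded.shape)                # (3, 43)
```

Each row contains exactly one 1, and taking `argmax` along the class axis recovers the original integer label.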

### Visualize the sample image of each class

In [None]:
# Show one randomly chosen sample image for every class present in the data
targets = np.array(labels)
classes = np.unique(targets)
plt.figure(figsize=(16, 16))
for pos, c in enumerate(classes):
    i = np.random.choice(np.where(targets == c)[0])
    plt.subplot(8, 8, pos + 1)
    plt.axis('off')
    plt.title('class: {}'.format(c))
    plt.imshow(data[i])

### Preparing the model

Initialize the CNN model by creating an instance of `Sequential`. The `Sequential` class builds a linear stack of layers, from input to output.

In [None]:
model = Sequential()

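Adding First Convolution Layer to CNN

*   `filters` is the number of output channels (feature maps).
*   `kernel_size` is the height and width of the convolution window.
*   `activation` is the activation function. Use the rectified linear unit, shortened as relu.
*   `input_shape` is the shape of one input image: height, width, and the three RGB channels.

The cell also adds a 2x2 max-pooling layer and a dropout layer that randomly zeroes 25% of the activations during training.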

In [None]:
model.add(Conv2D(filters=32, kernel_size=(5,5), activation='relu', input_shape=(30, 30, 3)))
model.add(MaxPool2D(pool_size=(2, 2)))
model.add(Dropout(rate=0.25))


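Adding Second Convolution Layer to CNN

*   `filters` is the number of output channels (feature maps).
*   `kernel_size` is the height and width of the convolution window.
*   `activation` is the activation function. Use the rectified linear unit, shortened as relu.

After pooling and dropout, `Flatten` turns the feature maps into a single vector for the fully connected layers.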

In [None]:
model.add(Conv2D(filters=64, kernel_size=(3, 3), activation='relu'))
model.add(MaxPool2D(pool_size=(2, 2))) # Max Pool with kernel size (2, 2)
model.add(Dropout(rate=0.25))
model.add(Flatten())
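
The size of the vector produced by `Flatten` can be traced by hand. The sketch below assumes the Keras defaults used in the cells above: `'valid'` padding and stride 1 for `Conv2D`, and a pooling stride equal to the pool size for `MaxPool2D`. The helper names are chosen for illustration.

```python
def conv_out(size, kernel):
    """Output size of a 'valid' convolution with stride 1."""
    return size - kernel + 1

def pool_out(size, pool):
    """Output size of max pooling whose stride equals the pool size."""
    return size // pool

side = 30
side = conv_out(side, 5)   # Conv2D(32, (5, 5))  -> 26
side = pool_out(side, 2)   # MaxPool2D((2, 2))   -> 13
side = conv_out(side, 3)   # Conv2D(64, (3, 3))  -> 11
side = pool_out(side, 2)   # MaxPool2D((2, 2))   -> 5
flat_features = side * side * 64
print(side, flat_features)  # 5 1600
```

Dropout does not change the shape, so the fully connected layer receives 1600 input features.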

Adding Fully Connected Layer

*   First parameter is the number of units (nodes) in this layer.
*   Second parameter is the activation function. Use the rectified linear unit, shortened as relu.


In [None]:
model.add(Dense(256, activation='relu'))

Adding Output Layer 

* The first parameter of the output layer is set to 43, because this is a classification problem with 43 categories in the GTSRB dataset and the layer needs one node per class.

* The activation function is changed to softmax to get a probability for each of the 43 classes. Softmax generalizes the sigmoid function to more than two classes: it maps the layer's raw scores to positive values that sum to 1.

In [None]:
model.add(Dense(43, activation='softmax'))
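
As a rough illustration of what the softmax activation computes, the NumPy sketch below turns made-up scores into probabilities and checks the two-class case against the sigmoid. It is a stand-alone sketch, not the Keras implementation.

```python
import numpy as np

def softmax(logits):
    """Map raw scores to probabilities that are positive and sum to 1."""
    shifted = logits - np.max(logits)   # subtract the max for numerical stability
    exp = np.exp(shifted)
    return exp / exp.sum()

scores = np.array([2.0, 1.0, 0.1])      # made-up scores for three classes
probs = softmax(scores)
print(probs.round(3), probs.sum())

# With two classes and scores (z, 0), softmax reduces to the sigmoid of z
z = 0.7
two_class = softmax(np.array([z, 0.0]))[0]
sigmoid = 1.0 / (1.0 + np.exp(-z))
print(np.isclose(two_class, sigmoid))
```

The largest score receives the largest probability, and the predicted class is the index of that maximum.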

Compiling the CNN
 
* The first parameter is the optimizer, the algorithm that updates the network's weights to reduce the loss. Adam is a widely used choice that adapts the learning rate for each weight.
* The second parameter is the loss function. Since the labels are one-hot encoded classes, we use the categorical_crossentropy loss function.
* The final argument is the metric used to evaluate the model. In this case we use accuracy.

In [None]:
model.compile(optimizer='adam', loss='categorical_crossentropy', metrics=['accuracy'])
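
To make the loss concrete, the following NumPy sketch computes categorical cross-entropy for one-hot labels. The function is written for illustration (including the clipping constant `eps`, which is an assumption) and is not the Keras implementation; the probabilities are made up.

```python
import numpy as np

def categorical_crossentropy(y_true, y_pred, eps=1e-7):
    """Mean over samples of -sum(y_true * log(y_pred))."""
    y_pred = np.clip(y_pred, eps, 1.0)   # avoid log(0)
    return float(np.mean(-np.sum(y_true * np.log(y_pred), axis=1)))

y_true = np.array([[0.0, 1.0, 0.0]])            # the true class is index 1
confident = np.array([[0.05, 0.90, 0.05]])      # high probability on the true class
unsure = np.array([[0.40, 0.30, 0.30]])         # low probability on the true class

loss_confident = categorical_crossentropy(y_true, confident)
loss_unsure = categorical_crossentropy(y_true, unsure)
print(round(loss_confident, 4), round(loss_unsure, 4))
```

The loss depends only on the probability assigned to the true class, so the confident prediction is penalized far less than the unsure one.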

To view the model summary

In [None]:
model.summary()
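
The parameter counts reported by the summary can be cross-checked by hand. This sketch assumes the architecture built above (30x30x3 input, 1600 flattened features) and that every layer uses a bias term, which is the Keras default; the helper names are illustrative.

```python
def conv_params(kernel_h, kernel_w, in_channels, filters):
    """Weights of one kernel per filter, plus one bias per filter."""
    return (kernel_h * kernel_w * in_channels + 1) * filters

def dense_params(inputs, units):
    """One weight per input per unit, plus one bias per unit."""
    return (inputs + 1) * units

counts = {
    "conv2d (5x5, 3 -> 32)": conv_params(5, 5, 3, 32),
    "conv2d (3x3, 32 -> 64)": conv_params(3, 3, 32, 64),
    "dense (1600 -> 256)": dense_params(5 * 5 * 64, 256),
    "dense (256 -> 43)": dense_params(256, 43),
}
total = sum(counts.values())
for name, n in counts.items():
    print(name, n)
print("total", total)
```

Pooling, dropout, and flatten layers have no trainable parameters, so most of the weights sit in the first fully connected layer.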

### Training the Model

* First parameter contains the training images
* Second parameter contains the one-hot encoded training labels
* `epochs` is the number of times the full training set is passed through the CNN.

In [None]:
history2 = model.fit(X_train, y_train, epochs=15)
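
As a back-of-the-envelope check on the amount of training, the sketch below counts weight updates. It assumes 970 training images (1213 minus the 20% test split) and the Keras default batch size of 32, since `fit` is called without `batch_size`.

```python
import math

n_train = 970      # assumed size of X_train after the 80/20 split
batch_size = 32    # Keras default when fit() is called without batch_size
epochs = 15

# The last batch of each epoch is smaller, so round up
steps_per_epoch = math.ceil(n_train / batch_size)
total_updates = steps_per_epoch * epochs
print(steps_per_epoch, total_updates)  # 31 465
```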

### Evaluate the model

* First parameter contains the test images
* Second parameter contains the actual (one-hot encoded) labels of the test images

In [None]:
test_loss, test_acc = model.evaluate(X_test, y_test)
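
The accuracy metric compares the class with the highest predicted probability against the true class. Below is a minimal NumPy sketch of that computation on made-up probabilities; the helper `accuracy` is illustrative and not the Keras metric itself.

```python
import numpy as np

def accuracy(y_true_one_hot, y_prob):
    """Fraction of samples whose highest-probability class matches the true class."""
    return float(np.mean(y_true_one_hot.argmax(axis=1) == y_prob.argmax(axis=1)))

y_true = np.array([[1, 0, 0], [0, 1, 0], [0, 0, 1], [0, 1, 0]])
y_prob = np.array([
    [0.7, 0.2, 0.1],   # predicted class 0, correct
    [0.1, 0.8, 0.1],   # predicted class 1, correct
    [0.3, 0.4, 0.3],   # predicted class 1, wrong (true class is 2)
    [0.2, 0.5, 0.3],   # predicted class 1, correct
])
print(accuracy(y_true, y_prob))  # 0.75
```

With the trained model, the probabilities would come from `model.predict(X_test)` and the true labels from `y_test`.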