# Traffic Sign Recognition using CNN

This is my implementation of the tutorial [Traffic Sign Recognition using CNN](https://data-flair.training/blogs/python-project-traffic-signs-recognition/). It is one of the projects in [DataFlair’s series of machine learning projects](https://data-flair.training/blogs/python-machine-learning-project-detecting-parkinson-disease/).
In this notebook, we will develop and train the model using PyTorch, which we can later package into a standalone application.

## Dataset

To train the model we will use the [Traffic Sign Dataset](http://benchmark.ini.rub.de/?section=gtsrb&subsection=dataset). It was made available in 2011 as part of a competition at the International Joint Conference on Neural Networks (IJCNN). The dataset contains 43 classes and over 50,000 images in total. It is about 300 MB in size. The training set can be downloaded from this [link](http://benchmark.ini.rub.de/Dataset/GTSRB_Final_Training_Images.zip) and the test set from this [link](http://benchmark.ini.rub.de/Dataset/GTSRB_Final_Test_Images.zip).

Once downloaded and extracted, the dataset contains three folders: Train, Test and Meta.

<img src="images/dataset folders.png" width="200px" />

Inside the *Train* folder there are 43 folders labeled from 0 to 42 and each folder contains images for a specific class.
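
Before loading the images, it can help to check the folder layout described above. The following is a minimal sketch; `count_images_per_class` and its `train_dir` argument are hypothetical names, and it assumes only that each class has its own numerically named subfolder.

```python
import os

def count_images_per_class(train_dir):
    """Return {class_id: number_of_files} for numerically named subfolders."""
    counts = {}
    for name in os.listdir(train_dir):
        folder = os.path.join(train_dir, name)
        # skip anything that is not a class folder such as "0" ... "42"
        if os.path.isdir(folder) and name.isdigit():
            counts[int(name)] = len(os.listdir(folder))
    return dict(sorted(counts.items()))
```

If the dataset was extracted as described, calling this on the *Train* folder should give 43 entries. Large differences between the counts would point to class imbalance.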

## Loading the training data

In [4]:
import numpy as np
import matplotlib.pyplot as plt
import os
from PIL import Image

In [78]:
X = [] #images stored as np array
y = [] #class labels (0 to 42)

classes = 43
current_path = os.getcwd()

for i in range(classes):
    path = os.path.join(current_path, 'data', 'train', str(i))
    images = os.listdir(path)
    
    for img_name in images:
        image = Image.open(os.path.join(path, img_name))  # portable path, not Windows-only
        image = image.resize((30, 30))
        image = np.array(image)
        X.append(image)
        y.append(i)
        
X = np.array(X)
y = np.array(y)

Each image is resized to 30x30 because that is the input size our neural network expects. Because each image is in color, it has three channels (R, G, B), so we expect our training data to have the shape (N, 30, 30, 3), where N is the total number of training instances (39,209).

In [8]:
# shape of training data
X.shape

(39209, 30, 30, 3)

## Train/Test Split

At this point, it is a good idea to split the data into training data and testing data: the former is used to train the model, while the latter is used to evaluate its performance on unseen data before deploying it into a production environment.

In [10]:
def train_test_split(data, labels, test_size=0.2, random_state=0):
    np.random.seed(random_state)
    N = labels.shape[0]
    idx = np.random.permutation(N)
    train_size = int(np.ceil((1-test_size)*N))
    X_train = data[idx[:train_size]]
    y_train = labels[idx[:train_size]]
    X_test = data[idx[train_size:]]
    y_test = labels[idx[train_size:]]
    return X_train, X_test, y_train, y_test

In [79]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=113)
print(f"X_train shape: {X_train.shape}")
print(f"X_test shape: {X_test.shape}")
print(f"y_train shape: {y_train.shape}")
print(f"y_test shape: {y_test.shape}")

X_train shape: (31368, 30, 30, 3)
X_test shape: (7841, 30, 30, 3)
y_train shape: (31368,)
y_test shape: (7841,)
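
The shapes above follow directly from the rounding used in `train_test_split`. As a quick check, using only the numbers already shown (39,209 images and `test_size=0.2`):

```python
import math

N = 39209          # total number of images loaded above
test_size = 0.2

# same rounding as train_test_split: the training part is rounded up
train_size = int(math.ceil((1 - test_size) * N))
n_test = N - train_size

print(train_size, n_test)  # 31368 7841
```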


The last step in data preparation is to convert the class labels into a vector representation using one-hot encoding: label $r$ is represented by a vector of length 43 in which every entry is zero except the $r^{th}$ entry, which is one.

In [15]:
def to_categorical(y, num_classes=None):
    N = y.shape[0]
    y_max, y_min = np.max(y), np.min(y)
    if num_classes is None:
        num_classes = y_max - y_min + 1
    y_matrix = np.zeros((N, num_classes),dtype='float32')
    for i in range(N):
        label = y[i] - y_min
        y_matrix[i,label] = 1.0
    return y_matrix

In [16]:
y_train = to_categorical(y_train, 43)
y_test = to_categorical(y_test, 43)

print(f"y_train shape: {y_train.shape}")
print(f"y_test shape: {y_test.shape}")

y_train shape: (31368, 43)
y_test shape: (7841, 43)
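
For reference, the same encoding can be written without an explicit loop. This is a sketch of an alternative, not the function used in this notebook; `one_hot` is a hypothetical name, and it assumes the labels are integers in the range 0 to `num_classes - 1`.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Pick row r of the identity matrix for every label r."""
    return np.eye(num_classes, dtype='float32')[labels]

labels = np.array([0, 2, 42])
encoded = one_hot(labels, 43)

print(encoded.shape)               # (3, 43)
print(encoded[1, :4])              # [0. 0. 1. 0.]
print(np.argmax(encoded, axis=1))  # recovers the labels 0, 2, 42
```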


## Building the Model

To classify the images we will build a CNN model with the following architecture:

`CONV2D -> MAXPOOL -> DROPOUT -> CONV2D -> MAXPOOL -> DROPOUT -> FLATTEN -> FULLYCONNECTED -> DROPOUT -> FULLYCONNECTED`

In [80]:
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim
from torch.utils.data import TensorDataset
from torch.utils.data import DataLoader

In [81]:
class Net(nn.Module):
    
    def __init__(self):
        super(Net, self).__init__()
        self.conv1 = nn.Conv2d(3, 32, 5) #output (26,26,32)
        self.pool = nn.MaxPool2d(2,2)
        self.dropout = nn.Dropout(0.25)
        self.conv2 = nn.Conv2d(32, 64, 3) #output (11,11,64), pooled to (5,5,64)
        self.fc1 = nn.Linear(5*5*64,256)
        self.fc2 = nn.Linear(256, 43)
        
    def forward(self, x):
        x = self.pool(F.relu(self.conv1(x)))
        x = self.dropout(x)
        x = self.pool(F.relu(self.conv2(x)))
        x = self.dropout(x)
        x = x.view(-1, 5*5*64)
        x = F.relu(self.fc1(x))
        x = self.dropout(x)
        x = self.fc2(x)
        return x

In [82]:
model = Net()
model

Net(
  (conv1): Conv2d(3, 32, kernel_size=(5, 5), stride=(1, 1))
  (pool): MaxPool2d(kernel_size=2, stride=2, padding=0, dilation=1, ceil_mode=False)
  (dropout): Dropout(p=0.25, inplace=False)
  (conv2): Conv2d(32, 64, kernel_size=(3, 3), stride=(1, 1))
  (fc1): Linear(in_features=1600, out_features=256, bias=True)
  (fc2): Linear(in_features=256, out_features=43, bias=True)
)
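
The `5*5*64` input size of `fc1` follows from the layer sizes. Here is a small sketch of the arithmetic, assuming no padding and the kernel sizes and strides shown in the printout above (`conv_out` and `pool_out` are hypothetical helper names):

```python
def conv_out(size, kernel, stride=1):
    """Output width/height of a convolution without padding."""
    return (size - kernel) // stride + 1

def pool_out(size, kernel=2, stride=2):
    """Output width/height of max pooling (rounds down, ceil_mode=False)."""
    return (size - kernel) // stride + 1

size = 30                 # input images are 30x30
size = conv_out(size, 5)  # conv1: 26
size = pool_out(size)     # pool:  13
size = conv_out(size, 3)  # conv2: 11
size = pool_out(size)     # pool:  5
flat = size * size * 64   # 64 channels after conv2

print(size, flat)  # 5 1600
```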

We will use the cross-entropy loss and the Adam optimizer.

In [83]:
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=0.001)  # 0.001 is Adam's default learning rate
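
`nn.CrossEntropyLoss` is applied to the raw outputs of `fc2`, which is why `forward` does not end with a softmax. As a rough illustration of what it computes for a single sample with an integer class label, here is a NumPy sketch (the function name `cross_entropy` is hypothetical, and this is not PyTorch's implementation):

```python
import numpy as np

def cross_entropy(logits, target):
    """Negative log of the softmax probability assigned to the target class."""
    shifted = logits - np.max(logits)  # for numerical stability
    log_probs = shifted - np.log(np.sum(np.exp(shifted)))
    return -log_probs[target]

# an untrained model that scores all 43 classes equally
uniform = np.zeros(43)
print(cross_entropy(uniform, 7))  # log(43), about 3.761

# a confident, correct prediction gives a loss close to zero
confident = np.zeros(43)
confident[7] = 20.0
print(cross_entropy(confident, 7) < 1e-6)  # True
```

A loss near log(43) early in training therefore suggests the model is still guessing uniformly across the classes.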

## Train the model

We will train the model for 15 epochs using a batch size of 64.

In [87]:
# data is loaded as (batch, height, width, channels)
# but PyTorch expects (batch, channels, height, width)
X_train_tensor = torch.from_numpy(np.transpose(X_train, (0, 3, 1, 2))).float()
# CrossEntropyLoss expects class indices, so undo the one-hot encoding
y_train_tensor = torch.from_numpy(np.argmax(y_train, axis=1)).long()
train_data = TensorDataset(X_train_tensor, y_train_tensor)
dataloader = DataLoader(dataset=train_data, batch_size=64)
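
To see what the transpose does, here is a small NumPy sketch on a dummy batch of zeros (not real data). It also shows how many mini-batches one epoch contains, assuming the batch size of 64 mentioned in the text above.

```python
import math
import numpy as np

batch = np.zeros((2, 30, 30, 3), dtype=np.uint8)    # dummy images, channels last
channels_first = np.transpose(batch, (0, 3, 1, 2))  # move axis 3 to position 1

print(channels_first.shape)  # (2, 3, 30, 30)

# number of mini-batches per epoch for 31,368 training images
print(math.ceil(31368 / 64))  # 491
```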

In [88]:
epochs = 15

for epoch in range(epochs):  # loop over the dataset multiple times

    running_loss = 0.0
    for i, data in enumerate(dataloader, 0):
        inputs, labels = data
        # zero the parameter gradients
        optimizer.zero_grad()
        # forward + backward + optimize
        outputs = model(inputs)
        loss = criterion(outputs, labels)
        loss.backward()
        optimizer.step()

        # accumulate the loss of each mini-batch
        running_loss += loss.item()

    # print the average loss of the epoch
    print('[%d] loss: %.3f' % (epoch + 1, running_loss / len(dataloader)))

print('Finished Training')

Our model gets x% accuracy on the training dataset. We can plot a graph of the loss.

In [None]:
# plot loss
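
One way to fill in this cell is sketched below. It assumes the average loss of each epoch has been collected in a list, which the training loop above does not do as written. `plot_losses` is a hypothetical name, and the values in the last line are placeholders to show the call, not results from this model.

```python
import matplotlib.pyplot as plt

def plot_losses(epoch_losses):
    """Plot one loss value per epoch and return the figure."""
    fig, ax = plt.subplots()
    epochs = range(1, len(epoch_losses) + 1)
    ax.plot(epochs, epoch_losses, marker='o')
    ax.set_xlabel('epoch')
    ax.set_ylabel('average training loss')
    ax.set_title('Training loss')
    return fig

# placeholder values only; replace with the losses recorded during training
fig = plot_losses([3.0, 2.0, 1.5])
```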

## Testing the model