<a href="https://colab.research.google.com/github/RobInLabUJI/MultilayerPerceptron/blob/main/Traffic_Sign_Classification_Evaluation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Traffic Sign Classification: Validation

Our dataset comes from [**GTSRB** - The German Traffic Sign Recognition Benchmark](https://benchmark.ini.rub.de/gtsrb_news.html). It contains 43 classes and more than 50,000 images of traffic signs, prepared for a multi-class, single-image classification challenge. In this notebook we use its test set to evaluate a simple classifier.

In [None]:
import os

TMP_DATA_DIR = "dataset/tmp"
TMP_LABELS_DIR = os.path.join(TMP_DATA_DIR, "GTSRB/Final_Test")

TESTING_DATA_DIR = "dataset/testing"

In [None]:
#Fetch images from GTSRB website
#Images for validation
!curl -LOC - https://sid.erda.dk/public/archives/daaeac0d7ce1152aea9b61d9f1e19370/GTSRB_Final_Test_Images.zip
#Labels for validation
!curl -LOC - https://sid.erda.dk/public/archives/daaeac0d7ce1152aea9b61d9f1e19370/GTSRB_Final_Test_GT.zip

In [None]:
import zipfile

In [None]:
%%time
to_unpack = [
    ("GTSRB_Final_Test_Images.zip", TMP_DATA_DIR),
    ("GTSRB_Final_Test_GT.zip", TMP_LABELS_DIR)
]
 
for file, directory in to_unpack:
    print("Unzipping {} to {}...".format(file, directory))
    with zipfile.ZipFile(file,"r") as zip_ref:
        zip_ref.extractall(directory)

## Prepare validation data and labels

The dataset originally uses numeric labels (e.g. "00000"). Let's make them human-readable. The labels will be saved in the `output/` directory.
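
As a small illustration of this mapping, the sketch below converts a zero-padded label such as "00014" into a readable name. `LABEL_EXCERPT` and `readable_name` are hypothetical names introduced only for this example, and the dictionary holds just three entries of the full label map defined in the next cell.

```python
# Hypothetical three-entry excerpt of the full label map (illustration only).
LABEL_EXCERPT = {0: "20_speed", 13: "give_way", 14: "stop"}


def readable_name(numeric_label, label_map=LABEL_EXCERPT):
    """Convert a zero-padded label such as "00014" into a readable name."""
    class_id = int(numeric_label)  # int() ignores the leading zeros
    return label_map[class_id]


print(readable_name("00014"))
```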

In [None]:
OUTPUT_ROOT_DIR = "output/"
OUTPUT_LABELS = os.path.join(OUTPUT_ROOT_DIR, "retrained_labels_mv1_100_224.txt")
OUTPUT_READABLE_LABELS = os.path.join(OUTPUT_ROOT_DIR, "labels_readable.txt")

In [None]:
label_map = {
    0: '20_speed',
    1: '30_speed',
    2: '50_speed',
    3: '60_speed',
    4: '70_speed',
    5: '80_speed',
    6: '80_lifted',
    7: '100_speed',
    8: '120_speed',
    9: 'no_overtaking_general',
    10: 'no_overtaking_trucks',
    11: 'right_of_way_crossing',
    12: 'right_of_way_general',
    13: 'give_way',
    14: 'stop',
    15: 'no_way_general',
    16: 'no_way_trucks',
    17: 'no_way_one_way',
    18: 'attention_general',
    19: 'attention_left_turn',
    20: 'attention_right_turn',
    21: 'attention_curvy',
    22: 'attention_bumpers',
    23: 'attention_slippery',
    24: 'attention_bottleneck',
    25: 'attention_construction',
    26: 'attention_traffic_light',
    27: 'attention_pedestrian',
    28: 'attention_children',
    29: 'attention_bikes',
    30: 'attention_snowflake',
    31: 'attention_deer',
    32: 'lifted_general',
    33: 'turn_right',
    34: 'turn_left',
    35: 'turn_straight',
    36: 'turn_straight_right',
    37: 'turn_straight_left',
    38: 'turn_right_down',
    39: 'turn_left_down',
    40: 'turn_circle',
    41: 'lifted_no_overtaking_general',
    42: 'lifted_no_overtaking_trucks'
}

# Create the output directory if it does not exist yet
os.makedirs(OUTPUT_ROOT_DIR, exist_ok=True)

# Write one readable label per line, ordered by class ID
with open(OUTPUT_READABLE_LABELS, 'w') as labels_file:
    for key, val in sorted(label_map.items()):
        labels_file.write("{}\n".format(val))

In [None]:
from PIL import Image

In [None]:
import matplotlib.pyplot as plt

The model expects float inputs in the `[0, 1]` range. Use the `ImageDataGenerator`'s `rescale` parameter to achieve this.

The image size will be handled later.
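
To make the effect of rescaling concrete, here is a minimal plain-Python sketch that multiplies 8-bit pixel values by `1/255`. The `rescale` helper is a hypothetical stand-in for illustration and is not how the generator is implemented.

```python
def rescale(pixels, factor=1 / 255):
    """Multiply every pixel value by the given factor."""
    return [p * factor for p in pixels]


# 8-bit pixel values range from 0 to 255.
raw_pixels = [0, 128, 255]
scaled_pixels = rescale(raw_pixels)
print(scaled_pixels)
```

The smallest value stays at 0 and the largest lands at the upper end of the range, so every scaled value falls inside `[0, 1]`.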

In [None]:
import tensorflow as tf
from tensorflow import keras
import numpy as np

In [None]:
IMAGE_SHAPE = (224, 224)
image_generator = keras.preprocessing.image.ImageDataGenerator(rescale=1/255)

Let's convert the validation dataset from `*.ppm` to `*.jpg`.

In [None]:
#Load testing set
tmp_test_data_dir = os.path.join(TMP_DATA_DIR, "GTSRB/Final_Test/Images")

In [None]:
%%time

tmp_test_data_files = [f for f in os.listdir(tmp_test_data_dir) if f.endswith(".ppm")]
test_images = []

# Create the target directory once, before the loop
os.makedirs(TESTING_DATA_DIR, exist_ok=True)

# Export each PPM image as a JPG
for ppm_file in tmp_test_data_files:
    image_path = os.path.join(tmp_test_data_dir, ppm_file)
    image_filename = "{}.jpg".format(os.path.splitext(ppm_file)[0])
    final_image = os.path.join(TESTING_DATA_DIR, image_filename)

    # Close the source file as soon as the JPG copy is written
    with Image.open(image_path) as image:
        image.save(final_image)

    test_images.append(final_image)

# Sort once, after all files have been collected
test_images.sort()

print("Test images count:", len(test_images))

## Model evaluation

In [None]:
from google.colab import files
files.upload()

Replace the filename with the appropriate value.

In [None]:
!tar xf model1675845690.tar.gz

In [None]:
model = keras.models.load_model('output/model1675845690')

### Transformation of the evaluation dataset

Load the validation labels from the CSV file into a pandas data frame.
Change the file extension in the `Filename` column from `.ppm` to `.jpg`.
The image generator then rescales the pixel values for the model.
The `ClassId` must be a string with leading zeros, because this naming was also used for the subfolders in the training dataset.
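
The sketch below shows why the leading zeros matter, assuming the data generator assigns class indices in alphanumeric order of the label strings. The sample IDs are arbitrary values chosen for illustration.

```python
# Arbitrary sample class IDs, for illustration only.
class_ids = [0, 7, 10, 42]

# Zero-padded to five characters, as done for the ClassId column.
padded = [str(i).zfill(5) for i in class_ids]

# Sorting unpadded strings breaks numeric order ("10" sorts before "7").
unpadded_sorted = sorted(str(i) for i in class_ids)

# Sorting padded strings preserves numeric order.
padded_sorted = sorted(padded)

print(unpadded_sorted)
print(padded_sorted)
```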

In [None]:
import pandas as pd

tmp_test_labels_csv = os.path.join(TMP_LABELS_DIR, "GT-final_test.csv")
test_data_frame = pd.read_csv(tmp_test_labels_csv, header=0, sep=';')
# regex=False treats '.ppm' as a literal string, so '.' cannot match any character
test_data_frame['Filename'] = test_data_frame['Filename'].str.replace('.ppm', '.jpg', regex=False)
test_data_frame['ClassId'] = test_data_frame['ClassId'].astype(str).str.zfill(5)

image_test_data = image_generator.flow_from_dataframe(test_data_frame, x_col="Filename", directory=TESTING_DATA_DIR, y_col="ClassId", target_size=IMAGE_SHAPE)

In [None]:
for image_test_batch, label_test_batch in image_test_data:
  print("Image batch shape: ", image_test_batch.shape)
  print("Label batch shape: ", label_test_batch.shape)
  break

### Display some prediction results


Run one image batch through the model and extract the predicted traffic sign ID. 
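
As a toy illustration of this step, the sketch below takes the index of the largest value in each row, once for predicted probabilities and once for one-hot labels. It uses plain Python lists and made-up numbers with three classes; the `argmax` helper is a hypothetical stand-in for `np.argmax` on a single row.

```python
def argmax(values):
    """Return the index of the largest value in a sequence."""
    return max(range(len(values)), key=lambda i: values[i])


# Made-up model outputs: one row of class probabilities per image.
probabilities = [[0.1, 0.7, 0.2], [0.6, 0.3, 0.1]]
# Made-up one-hot labels for the same two images.
one_hot_labels = [[0, 1, 0], [0, 0, 1]]

predicted_ids = [argmax(row) for row in probabilities]
true_ids = [argmax(row) for row in one_hot_labels]

# Fraction of images whose predicted ID matches the label ID.
accuracy = sum(p == t for p, t in zip(predicted_ids, true_ids)) / len(true_ids)
print(predicted_ids, true_ids, accuracy)
```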


In [None]:
predicted_batch = model.predict(image_test_batch)
predicted_id = np.argmax(predicted_batch, axis=-1)
label_id = np.argmax(label_test_batch, axis=-1)

Display the images in one batch with the prediction result. 
Print the accuracy for the displayed batch. 

In [None]:
from sklearn.metrics import accuracy_score

batch_size = image_test_batch.shape[0]
num_plot_column = 5
num_plot_row = batch_size // num_plot_column + (batch_size % num_plot_column > 0)

plt.figure(figsize=(10,9))
plt.subplots_adjust(hspace=0.5)
for n in range(batch_size):
  plt.subplot(num_plot_row,num_plot_column,n+1)
  plt.imshow(image_test_batch[n])
  color = "green" if predicted_id[n] == label_id[n] else "red"
  plt.title(label_map[predicted_id[n]].title(), color=color)
  plt.axis('off')
_ = plt.suptitle("Model predictions (green: correct, red: incorrect)")

print("Accuracy of the shown eval batch:")
accuracy_score(label_id, predicted_id)

### Calculate validation dataset metrics 

Calculate the prediction accuracy and loss for all images in the test dataset.
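
Covering every image exactly once takes a whole number of batches, with the last batch possibly smaller than the others. A minimal sketch of that arithmetic follows; `steps_per_pass` is a hypothetical helper, and the sample count and batch size are illustrative values rather than numbers read from this notebook.

```python
import math


def steps_per_pass(num_samples, batch_size):
    """Number of batches needed to see every sample once."""
    return math.ceil(num_samples / batch_size)


# Illustrative values only.
num_samples = 12630
batch_size = 32

steps = steps_per_pass(num_samples, batch_size)
# Size of the final, possibly partial, batch.
last_batch = num_samples - (steps - 1) * batch_size
print(steps, last_batch)
```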

In [None]:
# One full pass over the data: len() of the iterator is its number of batches
score = model.evaluate(image_test_data, steps=len(image_test_data))
print("Loss: ", score[0], "Accuracy: ", score[1])


That's it, congratulations! :-)