# Universidade Federal do Rio Grande do Norte


## Programa de Pós-Graduação em Engenharia Elétrica e de Computação
## EEC1509 - Machine Learning (Aprendizagem de Máquina)


# Group

## João Lucas Correia Barbosa de Farias

## Júlio Freire Peixoto Gomes


# Project 2 - Traffic Sign Recognition


## About the Project
This project is divided into six files, including this one, each representing one step in the process of deploying a machine learning model. In this case, we chose a neural network as the classifier. The goal is to explore learning, generalization, and batch-normalization techniques and compare the results.

The dataset has over 50k images of traffic signs. Our goal is to predict which sign a specific image refers to.


### Dataset Details

The German Traffic Sign Benchmark is a multi-class, single-image classification challenge held at the International Joint Conference on Neural Networks (IJCNN) 2011.

*   Single-image, multi-class classification problem
*   More than 40 classes
*   More than 50,000 images in total
*   Large, lifelike database

For more information, visit:

https://www.kaggle.com/datasets/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign

In addition, each class has an associated shape ID, color ID, and sign ID. They are described as follows:



1.   Shape ID
  *   0: triangle
  *   1: circle
  *   2: diamond
  *   3: hexagon
  *   4: inverse-triangle
2.   Color ID
  *   0: red
  *   1: blue
  *   2: yellow
  *   3: white
3.   Sign ID
  *   float: value according to Ukrainian Traffic Rule
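
As an illustration of how these codes can be used, here is a minimal sketch that turns a shape ID and a color ID into a readable description. It assumes that shape codes name geometric forms and color codes name colors; `SHAPE_NAMES`, `COLOR_NAMES`, and `describe_sign` are hypothetical names introduced here and are not part of the dataset or of this project's code.

```python
# Hypothetical lookup tables built from the ID codes described for this dataset.
SHAPE_NAMES = {
    0: "triangle",
    1: "circle",
    2: "diamond",
    3: "hexagon",
    4: "inverse-triangle",
}
COLOR_NAMES = {
    0: "red",
    1: "blue",
    2: "yellow",
    3: "white",
}


def describe_sign(shape_id, color_id):
    """Return a short description such as 'red circle' for a pair of ID codes."""
    return f"{COLOR_NAMES[color_id]} {SHAPE_NAMES[shape_id]}"
```

For example, `describe_sign(1, 0)` describes a sign whose shape ID is 1 and whose color ID is 0.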

## The dataset was taken from Kaggle:
https://www.kaggle.com/datasets/meowmeowmeowmeowmeow/gtsrb-german-traffic-sign

# 1.0 Install and Load Libraries


In [None]:
%%capture
# install wandb
!pip install wandb

In [None]:
import logging
import tempfile
import pandas as pd
import os
import wandb
from sklearn.model_selection import train_test_split

# 2.0 Data Segregation
In this step we segregate our data into train and test sets.

## 2.1 Login to Weights & Biases

In [None]:
# login to wandb
!wandb login --relogin

wandb: Logging into wandb.ai. (Learn how to deploy a W&B server locally: https://wandb.me/wandb-server)
wandb: You can find your API key in your browser here: https://wandb.ai/authorize
wandb: Paste an API key from your profile and hit enter, or press ctrl+c to quit: 
wandb: Appending key for api.wandb.ai to your netrc file: /root/.netrc


## 2.2 Export the train and test sets

Since the downloaded dataset was already split into train and test sets, no split ratios need to be defined here. We only export these sets to W&B as artifacts of type `segregated_data`.

In [None]:
# configure logging
logging.basicConfig(level=logging.INFO,
                    format="%(asctime)s %(message)s",
                    datefmt='%d-%m-%Y %H:%M:%S')

# reference for a logging obj
logger = logging.getLogger()

In [None]:
# name of the input artifacts
artifact_input_name_train = "traffic_sign_recognition/preprocessed_data_train.h5:latest"
artifact_input_name_labels = "traffic_sign_recognition/preprocessed_data_train_labels.csv:latest"
artifact_input_name_test = "traffic_sign_recognition/raw_data_test.h5:latest"
artifact_input_name_test_labels = "traffic_sign_recognition/raw_data_test_labels.csv:latest"

In [None]:
# type of the artifact
artifact_type = "segregated_data"

# name of the output artifacts
artifact_output_name_train = "train.h5"
artifact_output_name_labels = "train_labels.csv"
artifact_output_name_test = "test.h5"
artifact_output_name_test_labels = "test_labels.csv"

In [None]:
# initiate wandb project
run = wandb.init(project="traffic_sign_recognition", job_type="split_data")

wandb: Currently logged in as: jotafarias (ppgeec-ml-jj). Use `wandb login --relogin` to force relogin


In [None]:
logger.info("Downloading and reading artifact...")
artifact_train = run.use_artifact(artifact_input_name_train)
artifact_train_path = artifact_train.file()

artifact_labels = run.use_artifact(artifact_input_name_labels)
artifact_labels_path = artifact_labels.file()

artifact_test = run.use_artifact(artifact_input_name_test)
artifact_test_path = artifact_test.file()

artifact_test_labels = run.use_artifact(artifact_input_name_test_labels)
artifact_test_labels_path = artifact_test_labels.file()

26-07-2022 03:03:32 Downloading and reading artifact...


In [None]:
!cp $artifact_train_path $artifact_output_name_train
!cp $artifact_labels_path $artifact_output_name_labels
!cp $artifact_test_path $artifact_output_name_test
!cp $artifact_test_labels_path $artifact_output_name_test_labels
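
The `!cp` lines above rely on IPython shell escapes, so they only work inside a notebook on a system that provides `cp`. A minimal sketch of the same copy step in plain Python follows; `copy_to_workdir` is a hypothetical helper introduced here for illustration, not part of the original pipeline.

```python
import shutil
from pathlib import Path


def copy_to_workdir(src_path, dst_name, workdir="."):
    """Copy a downloaded artifact file into workdir under the name dst_name."""
    dst = Path(workdir) / dst_name
    shutil.copy(src_path, dst)  # copies the file contents and permission bits
    return dst
```

Under these assumptions, `copy_to_workdir(artifact_train_path, artifact_output_name_train)` would stand in for the first `!cp` line, and the other three files follow the same pattern.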

In [None]:
# uploading artifacts to W&B

artifact_train = wandb.Artifact(name=artifact_output_name_train,
                                type=artifact_type,
                                description="Train data after segregation")
artifact_train.add_file(artifact_output_name_train)

artifact_labels = wandb.Artifact(name=artifact_output_name_labels,
                                 type=artifact_type,
                                 description="Labels of train data after segregation")
artifact_labels.add_file(artifact_output_name_labels)

artifact_test = wandb.Artifact(name=artifact_output_name_test,
                               type=artifact_type,
                               description="Test data after segregation")
artifact_test.add_file(artifact_output_name_test)

artifact_test_labels = wandb.Artifact(name=artifact_output_name_test_labels,
                                      type=artifact_type,
                                      description="Labels of test data after segregation")
artifact_test_labels.add_file(artifact_output_name_test_labels)

<ManifestEntry digest: BobYBvm1vBjnP0h+IgeMJQ==>

In [None]:
run.log_artifact(artifact_train)
run.log_artifact(artifact_labels)
run.log_artifact(artifact_test)
run.log_artifact(artifact_test_labels)

artifact_train.wait()
artifact_labels.wait()
artifact_test.wait()
artifact_test_labels.wait()

<Artifact QXJ0aWZhY3Q6MTYzMDM0NDY1>

In [None]:
# finishing the run
run.finish()

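
The create, add, log, and wait steps in this notebook follow the same pattern for each of the four files. A minimal sketch of that pattern as a loop is given below. It assumes an active `run` returned by `wandb.init` and the four output files present in the working directory; `ArtifactSpec`, `SPECS`, and `upload_all` are hypothetical names introduced here and are not part of the W&B API.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class ArtifactSpec:
    """A file to upload plus its description (hypothetical helper type)."""
    filename: str
    description: str


# Same file names and descriptions as used in this notebook.
SPECS = [
    ArtifactSpec("train.h5", "Train data after segregation"),
    ArtifactSpec("train_labels.csv", "Labels of train data after segregation"),
    ArtifactSpec("test.h5", "Test data after segregation"),
    ArtifactSpec("test_labels.csv", "Labels of test data after segregation"),
]


def upload_all(run, specs, artifact_type="segregated_data"):
    """Create, log, and wait for one W&B artifact per spec."""
    import wandb  # imported here so SPECS can be inspected without wandb installed

    for spec in specs:
        artifact = wandb.Artifact(name=spec.filename,
                                  type=artifact_type,
                                  description=spec.description)
        artifact.add_file(spec.filename)
        run.log_artifact(artifact)
        artifact.wait()  # block until the artifact has been committed
```

Under these assumptions, a single call to `upload_all(run, SPECS)` placed before `run.finish()` would cover the artifact creation and logging cells.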