# Car Classification and Generation

[Martina Cioffi](https://github.com/martinacioffi) – 3010036

[Edoardo Manieri](https://github.com/edoardomanieri) – 3084469

[Valentina Parietti](https://github.com/ValentinaParietti) – 3007385

[Edoardo Pericoli](https://github.com/Edoardopericoli) – 3001596


## Table of contents
1. [Datasets](#datasets)
    1. [Stanford Dataset](#stanford)
    2. [Our Dataset](#our)

2. [Classification](#classification)
    1. [From Scratch](#scratch)
    2. [EfficientNet](#effnet)
    3. [YOLO](#yolo)
    
3. [Generation](#generation)
    1. [Data Preparation](#datapreparation)
    2. [StyleGAN](#stylegan)

4. [References](#refs)

Note that the full code, together with a more detailed explanation of how to run it, is available on [GitHub](https://github.com/Edoardopericoli/Car_Prediction).

## 1. Datasets <a name="datasets"></a>

The main idea of this project was to train a model that classifies cars from pictures of them. What follows is a brief explanation of the steps we took, both to collect and build the datasets and to choose the model(s) used for classification. In addition, we generate new images of cars starting from our own pictures.

### 1.1. Stanford Dataset <a name="stanford"></a>

Our first attempt was to predict the make, model and year of a car using images from the [Stanford Dataset](https://ai.stanford.edu/~jkrause/cars/car_dataset.html). It contains slightly more than 16,000 images, of which, however, only half are labelled. To train our initial model, we therefore used those 8,144 labelled images, to which we added four classes corresponding to cars we were more familiar with. We downloaded the images for these classes with [Fatkun Batch Download Images](https://chrome.google.com/webstore/detail/fatkun-batch-download-ima/nnjjahlikiabnchcpehcpkdeckfgnohf?hl=en) and annotated them with bounding boxes using [VGG Image Annotator (VIA)](http://www.robots.ox.ac.uk/~vgg/software/via/).

We performed this last step because some images contain a relatively small car, with the background occupying most of the picture. A possible improvement could therefore be obtained by first having a model draw bounding boxes around the car, then cropping the image to the box (keeping only the first box when the picture contains more than one car), and finally feeding the cropped images, rather than the original ones, to the classifier.
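The cropping step described above can be sketched as follows. This is only an illustration under stated assumptions: `crop_to_bbox` is a hypothetical helper name, the box is given as corner coordinates in the same form as the `bbox_x1`, `bbox_y1`, `bbox_x2`, `bbox_y2` columns of our annotations, and a synthetic grey picture stands in for a real photo.

```python
from PIL import Image


def crop_to_bbox(im, x1, y1, x2, y2):
    """Return the part of `im` inside the box, clamped to the image borders.

    Hypothetical helper: (x1, y1) is the top-left corner and (x2, y2)
    the bottom-right corner of the bounding box, in pixels.
    """
    width, height = im.size
    left, upper = max(0, int(x1)), max(0, int(y1))
    right, lower = min(width, int(x2)), min(height, int(y2))
    if right <= left or lower <= upper:
        raise ValueError('empty bounding box')
    return im.crop((left, upper, right, lower))


# Synthetic 640x480 picture standing in for a photo with a small car in it
picture = Image.new('RGB', (640, 480), (200, 200, 200))
car = crop_to_bbox(picture, 100, 150, 500, 420)
print(car.size)  # (400, 270)
```

Clamping the box to the image borders is a design choice of this sketch: it avoids padded regions when an annotation slightly exceeds the picture.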

Before our addition, the dataset contained 196 classes; below is a graphical representation of the distribution of the different brands. Note, however, that the graph does not imply imbalance between the classes: we predict the car's model and year rather than merely its make. Still, since cars of the same make are inevitably more similar to each other than cars of different makes, the picture helps in understanding the difficulty of the task.

See the [Classification](#classification) section for a summary of the results.

In [None]:
# Example of cars we annotated with bounding boxes (to be run with the correct local path).
# all_labels_final is the DataFrame read from the csv with the bounding boxes of every car.

import numpy as np
import matplotlib.pyplot as plt
import matplotlib.patches as patches
from PIL import Image

for x in range(8209, 8274):
    im = np.array(Image.open(f'/Users/martinacioffi/PycharmProjects/cars/Car_Prediction/mini_cooper_clubman_2019___Google_Search/0{x}.jpg'), dtype=np.uint8)
    fig, ax = plt.subplots(1)
    ax.imshow(im)
    # Image number x corresponds to row x-1 of the annotations
    x1 = all_labels_final.bbox_x1[x-1]
    y1 = all_labels_final.bbox_y1[x-1]
    x2 = all_labels_final.bbox_x2[x-1]
    y2 = all_labels_final.bbox_y2[x-1]
    # Rectangle takes the top-left corner, then width and height
    rect = patches.Rectangle((x1, y1), x2 - x1, y2 - y1, linewidth=1, edgecolor='r', facecolor='none')
    ax.add_patch(rect)
    plt.show()

In [None]:
# To be run by someone who has the annotations, to produce the summary graph

import seaborn as sns
import matplotlib.pyplot as plt
brands = df['brand'].value_counts()
plt.figure(figsize=(10,10))
# x and y are passed as keywords: recent seaborn versions reject positional data arguments
plot = sns.barplot(x=brands.values, y=brands.index, alpha=0.8, orient='h')
plt.title('Distribution of Brands - Stanford Dataset', fontsize=16)
plt.xlabel('Number of Occurrences', fontsize=12)
# Save the figure before showing it
fig = plot.get_figure()
fig.savefig('brands_stanford.png')
plt.show()

In [None]:
# Optionally, also this plot, to show that some models are repeated across different years

import numpy as np
models = df['model'].value_counts()
models = models[:20]  # keep the 20 most frequent models
plt.figure(figsize=(10,10))
plot = sns.barplot(x=models.values, y=models.index, alpha=0.8, orient='h')
plt.title('Distribution of Models - Stanford Dataset', fontsize=16)
plt.xlabel('Number of Occurrences', fontsize=12)
plt.xticks(np.arange(min(models.values), max(models.values)+1, 1.0))
# Save the figure before showing it
fig = plot.get_figure()
fig.savefig('models_stanford.png')
plt.show()

In [None]:
# Optionally, also a count showing the number of classes and the average number of pictures per class
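A minimal sketch of such a count, assuming the annotations are loaded in a pandas DataFrame with one row per image. The toy table below is made up for illustration, and the `year` column name is an assumption (only `brand` and `model` appear in the plots above); a class is taken to be a unique combination of make, model and year.

```python
import pandas as pd

# Toy stand-in for the annotation table (one row per image).
# The values are illustrative and the 'year' column name is an assumption.
df = pd.DataFrame({
    'brand': ['Audi', 'Audi', 'Audi', 'BMW', 'BMW', 'Fiat'],
    'model': ['A4', 'A4', 'A4', 'X5', 'X5', '500'],
    'year':  [2011, 2011, 2012, 2012, 2012, 2012],
})

# One class per (make, model, year) combination
images_per_class = df.groupby(['brand', 'model', 'year']).size()
n_classes = len(images_per_class)
avg_images = images_per_class.mean()

print(f'{n_classes} classes, {avg_images:.1f} images per class on average')
```

On the real annotations, the same two numbers would summarise how many classes there are and how few images each of them has.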

### 1.2. Our Dataset <a name="our"></a>

However, since the Stanford dataset has relatively few images per class and a very high number of classes, we decided to build a new dataset from scratch, containing (i) cars we were more familiar with (i.e. mostly sold in Europe rather than in the United States), and (ii) more images per class (in the end, we had around 200 images on average for each car model).

In [None]:
# Same plot with our final labels

brands = df['brand'].value_counts()
plt.figure(figsize=(10,10))
plot = sns.barplot(x=brands.values, y=brands.index, alpha=0.8, orient='h')
plt.title('Distribution of Brands - Our Dataset', fontsize=16)
plt.xlabel('Number of Occurrences', fontsize=12)
# Save the figure before showing it
fig = plot.get_figure()
fig.savefig('brands_our.png')
plt.show()

## 2. Classification <a name="classification"></a>

### 2.1. From Scratch <a name="scratch"></a>

### 2.2. EfficientNet <a name="effnet"></a>

### 2.3. YOLO <a name="yolo"></a>

In [None]:
# Maybe the part about the bounding boxes we annotated ourselves fits better here

## 3. Generation <a name="generation"></a>

### 3.1. Data Preparation <a name="datapreparation"></a>

In [1]:
#Import necessary libraries

from PIL import Image
import os

In [4]:
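# Read the raw image file names

files = os.listdir('data/raw_data/StyleGAN/StyleGAN_raw')
files.sort()
# Drop the first entry after sorting (presumably a hidden system file such as .DS_Store)
files = files[1:]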

In [5]:
# Define a function that adds white borders to non-square images so that they become square
# (the rescaling to 256x256 is done in the next cell)

def make_square(im, min_size=256, fill_color=(255, 255, 255)):
    x, y = im.size
    size = max(min_size, x, y)
    new_im = Image.new('RGB', (size, size), fill_color)
    # Paste the original image at the centre of the white canvas
    new_im.paste(im, ((size - x) // 2, (size - y) // 2))
    return new_im

In [6]:
# Apply make_square to the raw images, resize them to 256x256 and save them in a folder of final images

raw_dir = 'data/raw_data/StyleGAN/StyleGAN_raw'
final_dir = 'data/raw_data/StyleGAN/StyleGAN_final'
os.makedirs(final_dir, exist_ok=True)  # create the output folder if it does not exist

for name in files:
    im = Image.open(os.path.join(raw_dir, name))
    new_im = make_square(im).resize((256, 256))
    new_im.save(os.path.join(final_dir, name))
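The effect of the padding and resizing steps can be checked on a synthetic picture. The sketch below combines them in a single function; `pad_and_resize` is a hypothetical name, not part of our repository, and the black 400x200 image only stands in for a real car photo.

```python
from PIL import Image


def pad_and_resize(im, target=256, fill_color=(255, 255, 255)):
    """Centre `im` on a white square canvas, then resize it to target x target.

    Hypothetical helper that merges the padding and resizing steps.
    """
    width, height = im.size
    side = max(target, width, height)
    canvas = Image.new('RGB', (side, side), fill_color)
    canvas.paste(im.convert('RGB'), ((side - width) // 2, (side - height) // 2))
    return canvas.resize((target, target))


# Synthetic 400x200 black picture standing in for a landscape car photo
example = Image.new('RGB', (400, 200), (0, 0, 0))
result = pad_and_resize(example)

print(result.size)                  # (256, 256)
print(result.getpixel((128, 10)))   # pixel inside the white border added above the picture
print(result.getpixel((128, 128)))  # pixel inside the original picture
```

Padding before resizing keeps the aspect ratio of the car intact, at the cost of white bands on the shorter side.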

In [8]:
# Clone the repository needed to generate cars from our dataset.
# It is a fork of the original StyleGAN repository, with some changes
# made in order to run StyleGAN on our dataset. More on this later.

!git clone https://github.com/ValentinaParietti/stylegan.git

Cloning into 'stylegan'...
remote: Enumerating objects: 419, done.
remote: Total 419 (delta 0), reused 0 (delta 0), pack-reused 419
Receiving objects: 100% (419/419), 20.69 MiB | 3.60 MiB/s, done.
Resolving deltas: 100% (245/245), done.


In [9]:
#Run the following command to convert the images into .tfrecords (format required by StyleGAN)

!python stylegan/dataset_tool.py create_from_images stylegan/datasets/custom_datasets data/raw_data/StyleGAN/StyleGAN_final

# Note: the datasets folder should not be pushed to the repository (too heavy)

Loading images from "data/raw_data/StyleGAN/StyleGAN_final"
Creating dataset "stylegan/datasets/custom_datasets"
Added 2108 images.


In [10]:
#Zip the newly filled datasets folder

os.chdir('stylegan')
!zip -r datasets_zip datasets

  adding: datasets/ (stored 0%)
  adding: datasets/custom_datasets/ (stored 0%)
  adding: datasets/custom_datasets/custom_datasets-r04.tfrecords (deflated 38%)
  adding: datasets/custom_datasets/custom_datasets-r02.tfrecords (deflated 48%)
  adding: datasets/custom_datasets/custom_datasets-r08.tfrecords (deflated 48%)
  adding: datasets/custom_datasets/custom_datasets-r05.tfrecords (deflated 40%)
  adding: datasets/custom_datasets/custom_datasets-r03.tfrecords (deflated 41%)
  adding: datasets/custom_datasets/custom_datasets-r06.tfrecords (deflated 43%)
  adding: datasets/custom_datasets/custom_datasets-r07.tfrecords (deflated 47%)


### 3.2. StyleGAN <a name="stylegan"></a>

Training StyleGAN and generating new images requires a GPU.

The rest of the code for our generation task is therefore in this [Google Colab notebook](https://colab.research.google.com/drive/1FE9GBqh0qBQ8nUDDIDjWhy5R2sdgiqD0).

## 4. References <a name="refs"></a>

[**3D Object Representations for Fine-Grained Categorization**](https://ai.stanford.edu/~jkrause/cars/car_dataset.html). Jonathan Krause, Michael Stark, Jia Deng, Li Fei-Fei. 2013. *4th IEEE Workshop on 3D Representation and Recognition, at ICCV 2013 (3dRR-13)*. Sydney, Australia. Dec. 8, 2013.

[**A Style-Based Generator Architecture for Generative Adversarial Networks**](https://arxiv.org/abs/1812.04948). Tero Karras, Samuli Laine, Timo Aila (NVIDIA). 2018. *arXiv:1812.04948*.

[**EfficientNet: Rethinking Model Scaling for Convolutional Neural Networks**](https://arxiv.org/abs/1905.11946). Mingxing Tan, Quoc V. Le (Google Research, Brain Team, Mountain View, CA). 2019. *arXiv:1905.11946*.

[**YOLOv3: An Incremental Improvement**](https://arxiv.org/abs/1804.02767). Joseph Redmon, Ali Farhadi. 2018. *arXiv:1804.02767*.