# __Welcome to the AUDI Machine Learning Challenge!__

## General Information

This notebook provides you with some code to set up the classification problem and a rough skeleton of what we want you to do. There is no single optimal solution; there are many good answers.

We are using the [Cars Dataset](http://ai.stanford.edu/~jkrause/cars/car_dataset.html) and want to classify images of cars into 196 categories of different car models (including Audis!).

This fine-grained classification problem is not as easy as MNIST or CIFAR-10, so we are excited to see what you can come up with.

## How to work with this notebook

Please follow these rules when working with this notebook:


1) Only use the dataset which is downloaded in Ch.1 and Ch.2. You are not allowed to use any other data source. 

2) Do not change the code in Ch.1 and Ch.2 

3) Please answer all the questions

4) This notebook requires an Ubuntu workstation with a GPU. If you do not have access to one, we recommend using the free [Google Colab](https://colab.research.google.com/) service.

__First question__
- Can you think of a real life use case where we could use a model trained on this dataset? 
- For your use case, what would be the best metric?

# Set up the project

## (Optional for Google Colab) Upload the helper files to the project directory

Upload the provided files to the project directory:

- utils.py 
- idx2classes.json
- example.jpg
- inference.py 

Navigate to the *Files* tab on the right-hand side and use the *Upload* dialogue.

## Download the necessary files  

**Important: You must not change the following lines of code**

Just run the code below. It should download the data and put it into the folder `./data/`. This can take 5-10 minutes depending on your connection.

In [3]:
# Create the identification hash for your submission 

import hashlib
import glob
submission_hash = hashlib.md5().hexdigest()
is_hash = glob.glob("./hash.txt")
if len(is_hash)==0:
    with open("./hash.txt", "w") as f:
        f.write(submission_hash)

In [4]:
from IPython.display import clear_output

!wget http://imagenet.stanford.edu/internal/car196/cars_train.tgz
!wget http://ai.stanford.edu/~jkrause/cars/car_devkit.tgz
!wget http://imagenet.stanford.edu/internal/car196/cars_annos.mat

!mkdir data
!tar -xvzf ./cars_train.tgz -C ./data/
!rm -r ./cars_train.tgz
!tar -xvzf ./car_devkit.tgz -C ./data/
!rm -r ./car_devkit.tgz
clear_output()
print("Finished downloading files")

Finished downloading files


## Prepare the Dataset
Run this cell to create the dataset. The training data you should use will be located in: `./data/data_in_class_folder`

In [1]:
from utils import create_dataset
create_dataset()

creating folder AM General Hummer SUV 2000
creating folder Acura RL Sedan 2012
creating folder Acura TL Sedan 2012
creating folder Acura TL Type-S 2008
creating folder Acura TSX Sedan 2012
creating folder Acura Integra Type R 2001
creating folder Acura ZDX Hatchback 2012
creating folder Aston Martin V8 Vantage Convertible 2012
creating folder Aston Martin V8 Vantage Coupe 2012
creating folder Aston Martin Virage Convertible 2012
creating folder Aston Martin Virage Coupe 2012
creating folder Audi RS 4 Convertible 2008
creating folder Audi A5 Coupe 2012
creating folder Audi TTS Coupe 2012
creating folder Audi R8 Coupe 2012
creating folder Audi V8 Sedan 1994
creating folder Audi 100 Sedan 1994
creating folder Audi 100 Wagon 1994
creating folder Audi TT Hatchback 2011
creating folder Audi S6 Sedan 2011
creating folder Audi S5 Convertible 2012
creating folder Audi S5 Coupe 2012
creating folder Audi S4 Sedan 2012
creating folder Audi S4 Sedan 2007
creating folder Audi TT RS Coupe 2012
creati

# Train a car classifier

Now it is your turn. Show us how you tackle this problem. Good luck!

## Exploration

Your first task is to explore the provided training data, for example with summary statistics and plots.

*Remember:* Before training a classifier it is usually a good idea to get to know your training data!
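
One possible starting point for the exploration is to check how many images each class contains, since strongly imbalanced classes affect both training and the choice of metric. The following is only a minimal sketch using the standard library; the function name `count_images_per_class` is hypothetical, and it assumes that `./data/data_in_class_folder` contains one sub-folder per class, as the folder names printed above suggest.

```python
from collections import Counter
from pathlib import Path


def count_images_per_class(root):
    """Return a Counter mapping each class-folder name to its number of files."""
    counts = Counter()
    for class_dir in sorted(Path(root).iterdir()):
        if class_dir.is_dir():
            counts[class_dir.name] = sum(1 for p in class_dir.iterdir() if p.is_file())
    return counts


if __name__ == "__main__":
    # Assumed location of the prepared training data (see "Prepare the Dataset").
    root = Path("./data/data_in_class_folder")
    if root.is_dir():
        counts = count_images_per_class(root)
        print("classes:", len(counts), "images:", sum(counts.values()))
        print("largest classes:", counts.most_common(3))
        print("smallest classes:", counts.most_common()[:-4:-1])
```

The resulting counts can be passed to a plotting library of your choice to draw a histogram of class sizes.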

In [0]:
# Insert your exploration code and graphs here

## Modelling
As we told you in the beginning, we want to classify the images into the 196 different car model classes. Choose a suitable model and train the best classifier you can come up with!

**Requirements**

- Use the data placed in `data_in_class_folder` for training.
- The network should be trained with PyTorch. Please do not use PyTorch wrappers like FastAI. You are encouraged to use established architectures and pretrained weights.
- **Visualize your training**: You can create custom graphs displaying training / validation loss and metrics. We prefer TensorBoard, though. Don't forget to include your TensorBoard logs in your submission.
- The notebook should include a **summary of your model performance** (loss ...) 
- **Save the checkpoint** of your best model in `best_checkpoint.pth`
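
Because only `data_in_class_folder` may be used, reporting a validation loss requires holding out part of it. The sketch below shows one way to do that with a per-class (stratified) split, so that every class with at least two images appears in both parts. It uses only the standard library; the function name `stratified_split`, the 20% validation fraction and the fixed seed are assumptions for illustration, not requirements of the challenge.

```python
import random
from pathlib import Path


def stratified_split(root, val_fraction=0.2, seed=0):
    """Split the files of every class folder into train and validation lists.

    Returns two lists of (path, class_name) tuples.
    """
    rng = random.Random(seed)  # fixed seed keeps the split reproducible
    train, val = [], []
    for class_dir in sorted(Path(root).iterdir()):
        if not class_dir.is_dir():
            continue
        files = sorted(p for p in class_dir.iterdir() if p.is_file())
        rng.shuffle(files)
        # Hold out at least one image per class, unless the class has only one.
        n_val = max(1, round(len(files) * val_fraction)) if len(files) > 1 else 0
        val += [(p, class_dir.name) for p in files[:n_val]]
        train += [(p, class_dir.name) for p in files[n_val:]]
    return train, val
```

The two lists can then back separate PyTorch datasets for training and validation.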


In [0]:
# Insert your code to train a classifier

# Evaluation & Outlook

After finishing your training, take a look at your out-of-sample predictions and analyse the strengths and weaknesses of your model.

Try to answer the following questions:

- Which car models often get mixed up?
- What are good visualizations and metrics to evaluate your model?
- What could be the next steps to further improve the model?
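
As a starting point for the first two questions, the sketch below counts the most frequent (true class, predicted class) mistakes and computes per-class recall from two lists of labels. It is a minimal illustration using only the standard library; the function names are hypothetical, and `y_true` and `y_pred` are assumed to be equally long lists of class names (or indices) collected on held-out images.

```python
from collections import Counter


def most_confused_pairs(y_true, y_pred, top_n=5):
    """Return the top_n most frequent (true_class, predicted_class) mistakes."""
    errors = Counter((t, p) for t, p in zip(y_true, y_pred) if t != p)
    return errors.most_common(top_n)


def per_class_recall(y_true, y_pred):
    """Return {class: fraction of its samples that were predicted correctly}."""
    total, correct = Counter(y_true), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            correct[t] += 1
    return {c: correct[c] / n for c, n in total.items()}
```

Averaging the per-class recall values gives a macro-averaged score that weights all 196 classes equally, which can differ noticeably from plain accuracy when class sizes are uneven.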

# Create an Inference class to make predictions on new pictures
Now that you have finished training your model, you should implement a small class to use it. Head over to `inference.py` and finish the provided skeleton class.

Your class should be initialized and used as follows:

```python
from inference import CarClassifier
classifier = CarClassifier()
classifier.predict("example.jpg")
```

The output should be a string containing the predicted class name, e.g. `Audi R8 Coupe 2012`.
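
The last step of `predict`, turning model scores into a class-name string, could look roughly like the sketch below. It is not the provided skeleton: both function names are hypothetical, and the format of `idx2classes.json` is an assumption (a JSON object whose keys are class indices as strings and whose values are class names, with indices matching the positions of the model's output scores). Check the actual file before relying on it.

```python
import json


def load_class_names(path="idx2classes.json"):
    """Load the index -> class-name mapping.

    ASSUMED file format: {"0": "some class name", "1": "another class name", ...}
    """
    with open(path, "r", encoding="utf-8") as f:
        return {int(idx): name for idx, name in json.load(f).items()}


def scores_to_class_name(scores, class_names):
    """Return the class name that belongs to the highest score."""
    best_index = max(range(len(scores)), key=lambda i: scores[i])
    return class_names[best_index]
```

Inside `CarClassifier.predict`, `scores` would be the model output for the image converted to a plain list.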


In [4]:
# Run the following cell to test your inference class.

from inference import CarClassifier
classifier = CarClassifier()

image_file = "example.jpg"
prediction = classifier.predict(image_file)
print("You predicted {} for image {}".format(prediction, image_file))
if isinstance(prediction, str):
    print("...seems like your classifier is working as expected...")

NameError: ignored

# Prepare your submission 

In [0]:
# Run this cell to get your submission file name
with open("./hash.txt","r") as f:
    submission_hash = f.read()
print("Please upload your submission as {}.zip as outlined below.".format(submission_hash))

Please upload your submission as d41d8cd98f00b204e9800998ecf8427e.zip as outlined below.


Please submit your solution by **XX.XX.XX**

**Requirements**:

1) Create a zip file `<submission_hash>.zip` (run the cell above to get your filename)

2) All of the following files have to be included: 

- `Audi IT-ML-Challenge-Applicants.ipynb` and `Audi IT-ML-Challenge-Applicants.html` (exported HTML Notebook)
- `inference.py`: Your inference class from Ch. 5
- If you used TensorBoard for training visualization: directory named `runs` including the training summary of your best model
- `model.pth`: Checkpoint of your trained classifier

*Please do not include the data directory or any other files!*

3) Upload your compressed submission [here](https://collaboration.msi.audi.com/seafile/u/d/475b5c823a3b47aa83eb/)

4) **Afterwards, send us the name (hash) of your submission zip file by email.**
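
If you prefer to build the zip file from code, the sketch below shows one way to do it with the standard library. The function name `build_submission` is hypothetical; it simply packs the files you pass in at the top level of the archive and, if present, the contents of a `runs` directory.

```python
import zipfile
from pathlib import Path


def build_submission(zip_path, files, runs_dir=None):
    """Write the given files (and optionally a TensorBoard runs directory) to a zip."""
    with zipfile.ZipFile(zip_path, "w", zipfile.ZIP_DEFLATED) as zf:
        for name in files:
            # Store each file at the top level of the archive.
            zf.write(name, arcname=Path(name).name)
        if runs_dir is not None and Path(runs_dir).is_dir():
            for p in sorted(Path(runs_dir).rglob("*")):
                if p.is_file():
                    # Keep the folder structure below "runs/".
                    arcname = (Path("runs") / p.relative_to(runs_dir)).as_posix()
                    zf.write(p, arcname=arcname)
    return zip_path
```

Call it with `"<submission_hash>.zip"` as `zip_path` (the hash is stored in `./hash.txt`), the files listed under requirement 2 as `files`, and `runs_dir="runs"` if you used TensorBoard.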