# Picnic Hackathon 🥇🏆💯
 **When great customer support meets data**


## Why the challenge?
One of our core beliefs is to offer our customers the best support possible, for example by allowing them to send in pictures of defective products they wish to be reimbursed for. But processing these pictures is very time-consuming, as it is all done manually.

## What is the challenge?
The challenge we propose is the following: as a first step in helping customer support, come up with a way of labeling every incoming picture according to the product it shows. In keeping with the Picnic spirit, we encourage you to be as innovative and creative with your solutions as possible.


## This Notebook 📓📒
This notebook shows how to reproduce the result of my **Final Submission**.




## Requirements ✅

- Python 3
- fastai library (v1 API, as used by the `fastai.vision` imports below)
- Pandas & Numpy

I used the Colab environment during this hackathon because it offers free GPUs and integrates well with Google Drive, which held the training and test images as well as the final model weights.

The final model file can be accessed via this link. Make sure you change the path in the **Loading The Solution Model** section to match your setup. I'm using Colab, so I access the file directly after mounting my Drive.



## Setting & Imports
Here we load all the required libraries.

In [0]:
# Data Science Things
import pandas as pd
import numpy as np

# fast.ai Library
import fastai
from fastai.vision import *
from fastai.vision.models import *
import torch

# Images & Paths
from PIL import ImageFile
from pathlib import Path
import glob

# Other
import os   # used later for os.path.basename
import sys  # used later for sys.stdout.flush
from google.colab import drive
from datetime import date

In [0]:
# setting random seed
np.random.seed(42)
# Change this to your own dataset folder; it is used to load the test images.
path_to_folder = 'gdrive/My Drive/Dataset/The Picnic Hackathon 2019/'
# Let PIL load truncated image files instead of raising an error.
ImageFile.LOAD_TRUNCATED_IMAGES = True

In [0]:
# Mount Google Drive to get access to the model file.
drive.mount('/content/gdrive')

## Loading The Solution Model
In this section, we load the trained model file, which reached an F1-score of about 0.86. Make sure you can reference the model file; otherwise, contact me. I tried many pre-trained models; the one behind my last submission was DenseNet161. Here are its [official repository](https://github.com/liuzhuang13/DenseNet) and its [paper](https://arxiv.org/pdf/1608.06993v3.pdf).

In [4]:
# This cell sometimes fails with a Google Drive OSError; just re-run it, it happens only once.
# Change this to the folder where the model file is stored.
path_to_model_file = 'gdrive/My Drive/'
# Change this if you have renamed the model file.
file_name = 'densenet161_final_model.pkl'
model = load_learner(path = path_to_model_file, file = file_name)
print('Done')

Done
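
The F1-score quoted above combines precision and recall for each class. The averaging used by the hackathon leaderboard is not stated in this notebook, so the sketch below assumes a macro average (the unweighted mean of per-class F1). The function name `macro_f1` and the toy labels are illustrative only and are not taken from the Picnic dataset.

```python
from collections import Counter


def macro_f1(y_true, y_pred):
    """Unweighted mean of per-class F1 over all classes seen in either list."""
    tp, fp, fn = Counter(), Counter(), Counter()
    for t, p in zip(y_true, y_pred):
        if t == p:
            tp[t] += 1
        else:
            fp[p] += 1  # p was predicted, but it was wrong
            fn[t] += 1  # the true class t was missed
    classes = set(y_true) | set(y_pred)
    scores = []
    for c in classes:
        # F1 = 2*TP / (2*TP + FP + FN); defined as 0 when the denominator is 0.
        denom = 2 * tp[c] + fp[c] + fn[c]
        scores.append(2 * tp[c] / denom if denom else 0.0)
    return sum(scores) / len(scores)


# Toy labels (assumption, for illustration only).
y_true = ["Eggs", "Eggs", "Potatoes", "Potatoes"]
y_pred = ["Eggs", "Potatoes", "Potatoes", "Potatoes"]
score = macro_f1(y_true, y_pred)
print(round(score, 3))
```

If the leaderboard used a weighted or micro average instead, the per-class scores would be combined differently, so treat this only as a way to read the number, not as the official metric.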


## Predicting the Test Set.

In [5]:
# First, get references to all the files in the test set.
files = glob.glob(path_to_folder + 'test/*')
total = len(files)
print('Found {} images'.format(total))

Found 820 images
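
The `test/*` pattern matches every entry in the folder, so a stray non-image file would also be passed to the model. A minimal sketch of filtering by extension is shown here; the helper name `list_images` and the extension set are assumptions and not part of the original notebook.

```python
import tempfile
from pathlib import Path

# Assumed extension set; the outputs in this notebook show .jpeg and .png files.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png"}


def list_images(folder):
    """Return sorted paths of files in `folder` whose suffix looks like an image."""
    return sorted(
        p for p in Path(folder).iterdir()
        if p.is_file() and p.suffix.lower() in IMAGE_EXTENSIONS
    )


# Demo on a throwaway folder with made-up file names.
with tempfile.TemporaryDirectory() as tmp:
    for name in ["7263.jpeg", "7266.png", "notes.txt"]:
        (Path(tmp) / name).touch()
    found = [p.name for p in list_images(tmp)]

print(found)
```

In the notebook, this could replace the `glob.glob` call with something like `list_images(path_to_folder + 'test')`, if the folder is not guaranteed to contain only images.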


In [6]:
# Loop over all the files: load -> predict -> store the result.
# Final list to hold the results.
results = []

# enumerate tracks the progress, starting at 1.
for i, file in enumerate(files, start=1):
    print("\rImage #{} of {} , Total Progress {}% .".format(i, total, int((i/total)*100)), end="")
    sys.stdout.flush()
    # Open the image and resize it to size 224.
    img = open_image(Path(file)).apply_tfms(None, size=224)
    # Predict the class.
    predicted_class, idx, out = model.predict(img)
    # Get the file name.
    filename = os.path.basename(file)
    results.append([filename, str(predicted_class)])

Image #820 of 820 , Total Progress 100% .
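
The loop above follows a load, predict, store pattern. The sketch below isolates that pattern with a stand-in predictor so that it runs without fastai. The names `collect_predictions`, `predict_fn`, and `fake_predict` are hypothetical, and recording failures instead of stopping is a design choice of this sketch, not something the original loop does.

```python
import os


def collect_predictions(files, predict_fn):
    """Apply predict_fn to each path and return (results, failures)."""
    results, failures = [], []
    for file in files:
        try:
            label = predict_fn(file)
        except Exception as exc:  # e.g. an unreadable image
            failures.append((file, repr(exc)))
            continue
        results.append([os.path.basename(file), str(label)])
    return results, failures


def fake_predict(path):
    """Stand-in for the real model call; fails on one made-up file name."""
    if path.endswith("broken.png"):
        raise OSError("cannot read image")
    return "Eggs"


results, failures = collect_predictions(
    ["test/7264.jpeg", "test/broken.png"], fake_predict
)
print(results)
print(len(failures))
```

In the notebook, `predict_fn` could wrap the `open_image(...)` and `model.predict(...)` calls from the cell above, so that one bad file does not discard the predictions already collected.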

In [0]:
# Constructing The Submission file.
headers = ['file', 'label']
submission = pd.DataFrame(results, columns=headers)
submission = submission.sort_values(['file'])

In [8]:
# Check that the submission looks right.
submission.head()

| | file | label |
|---|---|---|
| 503 | 7263.jpeg | Bell peppers, zucchinis & eggplants |
| 520 | 7264.jpeg | Eggs |
| 571 | 7265.jpeg | Broccoli, cauliflowers, carrots & radish |
| 332 | 7266.png | Lunch & Deli Meats |
| 513 | 7267.jpeg | Potatoes |


In [0]:
# saving the file into the desired format.
today = date.today()
name_file = today.strftime("%d-%m-%y") + '_1.tsv'
submission.to_csv(name_file, sep = '\t', index = False)
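
Before uploading, it can be worth reading the file back to confirm that it is tab-separated and has the `file` and `label` columns. A small sketch using only the standard library follows. The helper name `check_submission` and the demo rows are illustrative assumptions, and the actual submission rules should be checked against the hackathon page.

```python
import csv
import os
import tempfile


def check_submission(path, expected_rows=None):
    """Read a TSV file and verify its header and, optionally, its row count."""
    with open(path, newline="", encoding="utf-8") as handle:
        rows = list(csv.reader(handle, delimiter="\t"))
    header, body = rows[0], rows[1:]
    if header != ["file", "label"]:
        raise ValueError("unexpected header: {}".format(header))
    if expected_rows is not None and len(body) != expected_rows:
        raise ValueError(
            "expected {} rows, found {}".format(expected_rows, len(body))
        )
    return body


# Demo with a throwaway file and made-up rows.
demo_path = os.path.join(tempfile.mkdtemp(), "demo.tsv")
with open(demo_path, "w", newline="", encoding="utf-8") as handle:
    writer = csv.writer(handle, delimiter="\t")
    writer.writerow(["file", "label"])
    writer.writerow(["7264.jpeg", "Eggs"])
    writer.writerow(["7267.jpeg", "Potatoes"])

body = check_submission(demo_path, expected_rows=2)
print(body)
```

In the notebook, the equivalent call would be `check_submission(name_file, expected_rows=total)`.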

## Conclusion 😀🎉🙋‍♂️
I want to thank Picnic for this opportunity to tackle a real-world problem, for sharing their problem with the community of Devpost hackers, and for letting us experiment with lots of things on their dataset.