### Background

- Data scraped from a website. Training data contains noise (mislabels+distorted images). Test data has been cleaned. 
- Data contains 101 labeleled images. 750 per category are included in the train set and 250 per category are included in the test set.

### Objectives
- Build a model which gets better ~85% top-one test accuracy across all categories using a ResNet50 or smaller architecture

### Approach

- Build a baseline model with transfer learning applied to ResNet50
- Assess performance on validation set (or puruse CV)

#### Potential Improvements

- Scrape additional images for poor-performing classes to augment training set
- Use data augmentation to increase data

#### Resources

- Test Time Augmentation: https://towardsdatascience.com/test-time-augmentation-tta-and-how-to-perform-it-with-keras-4ac19b67fb4d


# Data Preparation

In [30]:
import json
import os
import pandas as pd
import shutil

In [33]:
train_dict = json.load(open('food-101/meta/train.json'))

In [37]:
original_dataset_dir = '/Users/brady/Documents/GitHub/food101/food-101/images'
base_dir = '/Users/brady/Documents/GitHub/food101/data'

In [39]:
# Create class directories in train folder
# for key in train_dict.keys():
#     class_dir = os.path.join(base_dir, 'train/' + key)
#     os.mkdir(class_dir)

/Users/brady/Documents/GitHub/food101/data/train/churros
/Users/brady/Documents/GitHub/food101/data/train/hot_and_sour_soup
/Users/brady/Documents/GitHub/food101/data/train/samosa
/Users/brady/Documents/GitHub/food101/data/train/sashimi
/Users/brady/Documents/GitHub/food101/data/train/pork_chop
/Users/brady/Documents/GitHub/food101/data/train/spring_rolls
/Users/brady/Documents/GitHub/food101/data/train/panna_cotta
/Users/brady/Documents/GitHub/food101/data/train/beef_tartare
/Users/brady/Documents/GitHub/food101/data/train/greek_salad
/Users/brady/Documents/GitHub/food101/data/train/foie_gras
/Users/brady/Documents/GitHub/food101/data/train/tacos
/Users/brady/Documents/GitHub/food101/data/train/pad_thai
/Users/brady/Documents/GitHub/food101/data/train/poutine
/Users/brady/Documents/GitHub/food101/data/train/ramen
/Users/brady/Documents/GitHub/food101/data/train/pulled_pork_sandwich
/Users/brady/Documents/GitHub/food101/data/train/bibimbap
/Users/brady/Documents/GitHub/food101/data/tra

In [45]:
# Copy files from original dataset_dir to appropriate train directory
# for key in train_dict.keys():
#     for file in train_dict[key]:
#         src = os.path.join(original_dataset_dir, file + '.jpg')
#         dst = os.path.join(base_dir + '/train/', file + '.jpg')
#         shutil.copyfile(src, dst)

### Initial Baseline Model

In [46]:
from keras.preprocessing.image import ImageDataGenerator

Using TensorFlow backend.


In [47]:
datagen = ImageDataGenerator(validation_split=0.2)

In [48]:
!pwd

/Users/brady/Documents/GitHub/food101


In [52]:
train_data = datagen.flow_from_directory('data/train/', class_mode='categorical', subset='training')

Found 60600 images belonging to 101 classes.


In [54]:
# Base CNN Architecture
from keras import layers
from keras import models
from keras import optimizers

In [56]:
model = models.Sequential()
model.add(layers.Conv2D(32, (3,3), activation='relu', input_shape=(256,256,3)))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(64, (3,3), activation='relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(128, (3,3), activation='relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Conv2D(128, (3,3), activation='relu'))
model.add(layers.MaxPooling2D((2,2)))
model.add(layers.Flatten())
model.add(layers.Dense(512, activation='relu'))
model.add(layers.Dense(101, activation='sigmoid'))

model.compile(loss='binary_crossentropy',
              optimizer=optimizers.RMSprop(),
              metrics=['acc'])

In [57]:
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_3 (Conv2D)            (None, 254, 254, 32)      896       
_________________________________________________________________
max_pooling2d_2 (MaxPooling2 (None, 127, 127, 32)      0         
_________________________________________________________________
conv2d_4 (Conv2D)            (None, 125, 125, 64)      18496     
_________________________________________________________________
max_pooling2d_3 (MaxPooling2 (None, 62, 62, 64)        0         
_________________________________________________________________
conv2d_5 (Conv2D)            (None, 60, 60, 128)       73856     
_________________________________________________________________
max_pooling2d_4 (MaxPooling2 (None, 30, 30, 128)       0         
_________________________________________________________________
conv2d_6 (Conv2D)            (None, 28, 28, 128)       147584    
__________

In [58]:
history = model.fit_generator(
    train_data,
    steps_per_epoch=100,
    epochs=3,
    verbose=2,
)

Instructions for updating:
Use tf.cast instead.
Epoch 1/3
 - 267s - loss: 3.4407 - acc: 0.7825
Epoch 2/3
 - 272s - loss: 3.4086 - acc: 0.7863
Epoch 3/3
 - 270s - loss: 3.4104 - acc: 0.7862


In [60]:
history.history.keys()

dict_keys(['loss', 'acc'])

In [62]:
history.history['acc']

[0.7825464349985123, 0.7862809705734253, 0.7861695873737335]