<a href="https://colab.research.google.com/github/eshaanrathi2/vada-pav-classifier/blob/master/vada_pav.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

**First, I created a dataset of vada_pav and not_vada_pav images collected via Google, using a Chrome Web Store extension to download multiple images at once. I cleaned the downloaded data and created train and test folders, each containing a vada_pav and a not_vada_pav subfolder. I have uploaded these folders to my Google Drive. Here's the link:
https://drive.google.com/drive/folders/100tmP-8bvE2Fe99d0ihQEGqX1JpZkIFT?usp=sharing
For not_vada_pav, I used images of Indian food other than vada pav, as well as humans, places, things, random objects, etc.**
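Before training, it can help to confirm that each split has one subfolder per class and to see how many images each holds. The sketch below is a minimal check, not part of the original notebook; the helper names, the demo file names, and the set of image extensions are assumptions made for illustration.

```python
import os
import tempfile

# Assumed set of image extensions; adjust to match the actual dataset.
IMAGE_EXTENSIONS = {".jpg", ".jpeg", ".png", ".bmp", ".gif"}

def count_images_per_class(split_dir):
    """Return {class_name: image_count} for each subfolder of split_dir."""
    counts = {}
    for class_name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, class_name)
        if not os.path.isdir(class_dir):
            continue  # ignore stray files next to the class folders
        counts[class_name] = sum(
            1 for name in os.listdir(class_dir)
            if os.path.splitext(name)[1].lower() in IMAGE_EXTENSIONS
        )
    return counts

def build_demo_split(root):
    """Create a tiny fake split made of empty placeholder files (demo only)."""
    layout = {
        "vada_pav": ["a.jpg", "b.png"],
        "not_vada_pav": ["c.jpg", "notes.txt"],  # the .txt file is not counted
    }
    for class_name, files in layout.items():
        class_dir = os.path.join(root, class_name)
        os.makedirs(class_dir)
        for name in files:
            open(os.path.join(class_dir, name), "w").close()

with tempfile.TemporaryDirectory() as demo_root:
    build_demo_split(demo_root)
    demo_counts = count_images_per_class(demo_root)

print(demo_counts)
```

On the real data, the same function could be pointed at the train and test folders on the mounted drive.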

**Mounting my Google Drive in the Colab notebook, since the dataset is stored on Drive and the weights will be saved there.**

In [8]:
from google.colab import drive
drive.mount('/content/gdrive')

Drive already mounted at /content/gdrive; to attempt to forcibly remount, call drive.mount("/content/gdrive", force_remount=True).


**Importing a few dependencies/libraries**

In [0]:
import numpy as np
import matplotlib.pyplot as plt
import os
import cv2
import keras
from keras.layers import Dense, Activation, Conv2D, Flatten, Dropout, MaxPooling2D
from keras.applications import InceptionV3
from keras.applications.inception_v3 import preprocess_input # InceptionV3's preprocessing function lives in its own submodule
from keras.preprocessing import image
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img
from keras.models import Model, Sequential
from keras.optimizers import Adam
from keras.callbacks import ModelCheckpoint # to save our model

In [0]:
datagen = ImageDataGenerator(rescale=1./255)
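The generator above rescales pixels by 1/255, which maps them to [0, 1]. The `preprocess_input` function imported earlier for InceptionV3 is, to my knowledge, equivalent to `x / 127.5 - 1`, which maps pixels to [-1, 1] instead. The plain-Python sketch below only compares the two ranges; the function names are made up for illustration, and it does not call Keras.

```python
def rescale_01(pixel):
    """What rescale=1./255 does to a single 0-255 pixel value."""
    return pixel / 255.0

def inception_style(pixel):
    """Assumed equivalent of InceptionV3's preprocess_input for one pixel."""
    return pixel / 127.5 - 1.0

for pixel in (0, 127.5, 255):
    print(pixel, rescale_01(pixel), inception_style(pixel))
```

If the pretrained weights expect the [-1, 1] range, passing `preprocessing_function=preprocess_input` to `ImageDataGenerator` instead of `rescale` may be worth trying; that is a suggestion, not something the original notebook does.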

**Creating batch of train images :**

In [11]:
train_generator = datagen.flow_from_directory('/content/gdrive/My Drive/vada_pav/dataset/train', # this is where you specify the path to the main data folder
                                                 target_size = (224,224),
                                                 color_mode = 'rgb',
                                                 batch_size = 32,
                                                 shuffle = True)

Found 658 images belonging to 2 classes.


**Creating batch of test images**

In [12]:
test_generator = datagen.flow_from_directory('/content/gdrive/My Drive/vada_pav/dataset/test', # this is where you specify the path to the main data folder
                                                 target_size = (224,224),
                                                 color_mode = 'rgb',
                                                 batch_size = 32,
                                                 shuffle = True)

Found 159 images belonging to 2 classes.


**Architecture construction:
Using InceptionV3 as the base for the model, then adding a Flatten layer and a fully connected (Dense) softmax layer with 2 output classes, i.e. vada_pav and not_vada_pav.**

In [0]:
base_model = InceptionV3(weights = 'imagenet',include_top = False, input_shape = (224,224,3))
x = Flatten()(base_model.output)
x = Dense(2, activation = 'softmax')(x)

model = Model(base_model.input, x)
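To get a feel for the size of the new head, the arithmetic below estimates the feature-map size that reaches `Flatten` for a 224x224 input. It is a sketch under the assumption that the Keras InceptionV3 uses the size-changing layers listed here (3x3 kernels with 'valid' padding) and ends with 2048 channels; `model.summary()` is the authoritative check.

```python
def valid_out(size, kernel, stride):
    """Output size of a conv/pool layer with 'valid' padding."""
    return (size - kernel) // stride + 1

# Assumed (kernel, stride) of the layers that change spatial size.
SIZE_CHANGING_LAYERS = [
    (3, 2),  # first stem convolution
    (3, 1),  # second stem convolution
    (3, 2),  # first max pooling
    (3, 1),  # 3x3 convolution before the second pooling
    (3, 2),  # second max pooling
    (3, 2),  # first reduction block
    (3, 2),  # second reduction block
]
FINAL_CHANNELS = 2048  # assumed channel count of the last Inception block

def inception_v3_feature_size(input_size):
    """Spatial size of the final feature map for a square input."""
    size = input_size
    for kernel, stride in SIZE_CHANGING_LAYERS:
        size = valid_out(size, kernel, stride)
    return size

side = inception_v3_feature_size(224)
flattened = side * side * FINAL_CHANNELS
dense_params = flattened * 2 + 2  # weights plus biases for 2 output classes

print(side, flattened, dense_params)
```

Under these assumptions the Dense layer holds roughly a hundred thousand trainable parameters, which is small next to the frozen base.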

**Only the newly added Dense layer is trained; the InceptionV3 weights are kept frozen.**

In [0]:
for layer in base_model.layers:
    layer.trainable = False


**Compiling the model :**

In [0]:
model.compile(optimizer = Adam(lr=0.001), loss = 'categorical_crossentropy', metrics = ['accuracy'])
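The `categorical_crossentropy` loss compares the softmax output with a one-hot label. The plain-Python sketch below shows the per-image computation, assuming the class order is [not_vada_pav, vada_pav]; the probabilities are made up, and the small `eps` clip is an illustrative guard against `log(0)`, not necessarily the exact value Keras uses.

```python
import math

def categorical_crossentropy(one_hot, probs, eps=1e-7):
    """Cross-entropy between a one-hot label and predicted probabilities."""
    return -sum(t * math.log(max(p, eps)) for t, p in zip(one_hot, probs))

true_label = [0, 1]  # one-hot for vada_pav under the assumed class order

confident_right = categorical_crossentropy(true_label, [0.1, 0.9])
undecided = categorical_crossentropy(true_label, [0.5, 0.5])
confident_wrong = categorical_crossentropy(true_label, [0.9, 0.1])

print(confident_right, undecided, confident_wrong)
```

The loss grows as the probability assigned to the true class shrinks, so confident mistakes are penalised most heavily.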

**Checkpointing: saving the best model to Google Drive during training**

In [0]:
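# No validation data is passed to fit(), so monitor the training loss instead of the default 'val_loss'
checkpointer = ModelCheckpoint(filepath = '/content/gdrive/My Drive/vada_pav/weights.hdf5', monitor='loss', verbose=1, save_best_only=True)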
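With `save_best_only=True`, the checkpoint file is only overwritten when the monitored quantity improves. As far as I know, `ModelCheckpoint` monitors `val_loss` by default, so a run without validation data needs a different monitored quantity for anything to be saved. The sketch below imitates the "keep the best" rule in plain Python; it is not Keras code, and the loss values are invented.

```python
def epochs_that_save(monitored_values):
    """Return the 1-based epochs at which a lower-is-better metric improves."""
    best = float("inf")
    saved_epochs = []
    for epoch, value in enumerate(monitored_values, start=1):
        if value < best:
            best = value
            saved_epochs.append(epoch)  # a checkpoint would be written here
    return saved_epochs

hypothetical_losses = [0.90, 0.55, 0.60, 0.40, 0.42]  # made-up values
saved = epochs_that_save(hypothetical_losses)
print(saved)
```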

In [0]:
step_size_train = train_generator.n//train_generator.batch_size
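Integer division drops the final partial batch. With the image counts reported by the generators above (658 training and 159 test images) and a batch size of 32, the arithmetic below shows how many images each choice of step count covers. The helper name is hypothetical.

```python
import math

def steps_and_coverage(n_images, batch_size):
    """Compare floor-based and ceil-based step counts for one pass over the data."""
    floor_steps = n_images // batch_size
    ceil_steps = math.ceil(n_images / batch_size)
    return {
        "floor_steps": floor_steps,
        "images_seen_with_floor": floor_steps * batch_size,
        "ceil_steps": ceil_steps,
    }

train_info = steps_and_coverage(658, 32)
test_info = steps_and_coverage(159, 32)
print(train_info)
print(test_info)
```

Using the ceil-based count would cover every image, at the cost of one smaller final batch.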

**A few images in my dataset are corrupted even after cleaning. The following lines tell PIL to load them anyway; otherwise, an "image file is truncated" error occurs.**

In [0]:
from PIL import ImageFile
ImageFile.LOAD_TRUNCATED_IMAGES = True

**Training the model:**

In [25]:
history = model.fit(train_generator, epochs=15, callbacks=[checkpointer], steps_per_epoch = step_size_train)

Epoch 1/15
 2/20 [==>...........................] - ETA: 2:42 - loss: 0.8953 - acc: 0.7656

Epoch 2/15
Epoch 3/15
Epoch 4/15
Epoch 5/15
Epoch 6/15
Epoch 7/15
Epoch 8/15
Epoch 9/15
Epoch 10/15
Epoch 11/15
Epoch 12/15
Epoch 13/15
Epoch 14/15
Epoch 15/15


**Evaluation (on unseen data) :**

In [26]:
step_size_test = test_generator.n//test_generator.batch_size
model.evaluate(test_generator, verbose=1, steps=step_size_test) # evaluate() accepts generators in recent Keras; evaluate_generator is deprecated



[0.7076681479811668, 0.8984375]

**Predictions :**

In [27]:
test_generator.reset()
preds = model.predict(test_generator, steps= step_size_test, verbose=1) # predict() accepts generators in recent Keras; predict_generator is deprecated



In [0]:
predicted_class_indices = np.argmax(preds,axis=1) # index of the highest-probability class for each image

labels = (test_generator.class_indices) # maps class (folder) name -> index
labels = dict((v,k) for k,v in labels.items()) # invert to index -> class name
predictions = [labels[k] for k in predicted_class_indices] # predicted class names (taken from the folder names), not the true labels
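The mapping from softmax rows to class names can be traced on a toy example. The sketch below assumes `class_indices` is `{'not_vada_pav': 0, 'vada_pav': 1}`, which is what alphabetical ordering of the folder names would give; the probability rows are made up.

```python
def argmax(row):
    """Index of the largest value in a list."""
    return max(range(len(row)), key=row.__getitem__)

class_indices = {"not_vada_pav": 0, "vada_pav": 1}  # assumed folder-name ordering
index_to_label = {index: name for name, index in class_indices.items()}

toy_preds = [
    [0.9, 0.1],
    [0.2, 0.8],
    [0.4, 0.6],
]  # made-up softmax outputs, one row per image

toy_predictions = [index_to_label[argmax(row)] for row in toy_preds]
print(toy_predictions)
```

Note that the test generator above shuffles its images, so matching these predicted names against file names or true labels would require a generator created with `shuffle=False`.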

**Saving the model :**

In [0]:
model.save('/content/gdrive/My Drive/vada_pav/weights.hdf5')