## Submission for CV Coding Challenge, for summer internship at Step.ai

- Name   :  Sai Chowdary Gullapally
- Contact:  
   - scgullap@eng.ucsd.edu
   - 858-729-8482  

- Coded in Python 3 (the image-downloading function needs minor changes to run on Python 2)
- Libraries: Keras 


### Aim: Build a classifier for the images of the three classes given at the following URLs:
1. Class One
  - http://image-net.org/api/text/imagenet.synset.geturls?wnid=n02084071
2. Class Two
  - http://image-net.org/api/text/imagenet.synset.geturls?wnid=n02121808
3. Class Three
  - http://image-net.org/api/text/imagenet.synset.geturls?wnid=n00007846


### Approach: Transfer Learning

Why Transfer Learning?
- Models like VGG-16 have already been trained for weeks on huge datasets using GPUs and are very accurate.
- These pretrained models are readily available in all popular deep learning libraries. We use Keras because it makes the work quite easy. (Note: on my computer, Keras uses TensorFlow as its backend.)
- The initial layers of these pretrained models are very good feature extractors, so we can retain them and modify only the later layers, then train or fine-tune those to get good results. Because the number of parameters to be trained is quite small, we need only a few samples per category for training.
- **Combined with another trick that we will see shortly, this eliminates the need for GPUs and saves a lot of time, as training is quite fast.**

Alternative approach:
- Building our own convolutional network.
- But the images in this dataset vary a lot within each class, so we would need at least a moderately sized network. That means using GPUs and training for a long time, and even then the results would likely not be as good.

### Part 1: Data Acquisition Pipeline
#### Function: download_images(filename,num_images) 

Arguments:
1. The name of a ".txt" file containing the image URLs, given without the extension (the function appends ".txt")
2. The number of images to be downloaded for that category

What does the function do?
- The function downloads the specified number of images for each category into folders with the same name as the filename

Obtaining the ".txt" files:
- I went to the links provided for the datasets and manually copied the image URLs of each category into a separate ".txt" file
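
The manual copy step could plausibly be scripted. The sketch below is not part of the original submission: `parse_url_list`, `save_url_list` and `fetch_url_list` are hypothetical helper names, and it assumes the synset endpoint returns plain text with one URL per line. Only the parsing step is exercised here, on a hand-written sample, so nothing is downloaded.

```python
from urllib.request import urlopen


def parse_url_list(text):
    """Return the stripped lines of `text` that look like HTTP(S) URLs."""
    urls = []
    for line in text.splitlines():
        line = line.strip()
        if line.startswith(("http://", "https://")):
            urls.append(line)
    return urls


def save_url_list(urls, filename):
    """Write one URL per line to <filename>.txt, the layout download_images reads."""
    path = filename + ".txt"
    with open(path, "w") as handle:
        handle.write("\n".join(urls))
    return path


def fetch_url_list(synset_url, timeout=30):
    """Download the raw listing (assumption: the endpoint serves plain text)."""
    with urlopen(synset_url, timeout=timeout) as response:
        return response.read().decode("utf-8", errors="replace")


# Offline demonstration with made-up URLs; blank and malformed lines are dropped.
sample = "http://example.com/dog1.jpg\n\n  http://example.com/dog2.jpg \nnot a url\n"
sample_urls = parse_url_list(sample)
print(sample_urls)
```

With network access, something like `save_url_list(parse_url_list(fetch_url_list(url)), 'category1')` would produce the file that `download_images('category1', 1000)` expects.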

In [6]:
#function to download the images
#References
#http://www.gadgetcluster.com/2014/07/download-images-from-urls-using-python/
#http://stackoverflow.com/questions/17960942/attributeerror-module-object-has-no-attribute-urlretrieve
import os
from urllib.request import urlretrieve  #Python 3 location of urlretrieve

def download_images(filename,num_img):       
    print("Reading URL's from "+filename+".txt")
    lines = open(filename+".txt","r").read().split("\n")
    print("Total number of URL's is: %d"%len(lines))
    print("Creating folder for downloading the images")
    os.makedirs(filename)
    print("Starting to download first %d existing images into 'data' folder..................."%num_img)
    count=0
    path=os.getcwd()
    for line in lines:
        if count<num_img:            
            try:                   
                URL= line
                IMAGE = URL.rsplit('/',1)[1]
                fullfilename = os.path.join(path,filename, IMAGE)
                urlretrieve(URL, fullfilename)
                if count%10==0:
                    print(">",end="")
                count=count+1
            except Exception:
                #skip URLs that are malformed or whose image can no longer be downloaded
                pass
        else:
            break
    return
download_images('category1',1000)
print()
download_images('category2',1000)
print()
download_images('category3',1000)

Reading URL's from category1.txt
Total number of URL's is: 1604
Creating folder for downloading the images
Starting to download first 1000 existing images into 'data' folder...................
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Reading URL's from category2.txt
Total number of URL's is: 1832
Creating folder for downloading the images
Starting to download first 1000 existing images into 'data' folder...................
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
Reading URL's from category3.txt
Total number of URL's is: 1243
Creating folder for downloading the images
Starting to download first 1000 existing images into 'data' folder...................
>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>>
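
One detail of the download function worth illustrating: `URL.rsplit('/',1)[1]` keeps any query string in the local file name and raises on lines that contain no `/` (the bare `try`/`except` then silently skips them). A minimal alternative sketch, assuming the standard-library URL parser is acceptable here; `filename_from_url` and its `fallback` argument are hypothetical names, not part of the original code:

```python
import posixpath
from urllib.parse import urlparse, unquote


def filename_from_url(url, fallback="image.jpg"):
    """Derive a local file name from a URL, ignoring query string and fragment."""
    path = urlparse(url.strip()).path          # drops "?query" and "#fragment"
    name = posixpath.basename(unquote(path))   # last path component, percent-decoded
    return name if name else fallback


examples = [
    "http://example.com/photos/dog.jpg",
    "http://example.com/photos/dog.jpg?size=large",
    "http://example.com/photos/",
]
names = [filename_from_url(u) for u in examples]
print(names)
```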

In [1]:
import numpy as np
from keras.models import Sequential
from keras.utils import np_utils
from keras.preprocessing.image import ImageDataGenerator, array_to_img,img_to_array,load_img
import os
from keras.preprocessing import image
from keras.applications.vgg16 import preprocess_input
from keras.applications import VGG16
from keras.models import Model
from keras.layers import Dense, Dropout, Activation, Flatten
from keras.optimizers import SGD,Adam,RMSprop
from PIL import Image

Using TensorFlow backend.


### Note 1:
After the images were downloaded, manual inspection showed that there were a lot of blank images like the one shown below:
![alt text](blank_resized.jpg "blank images")
- This and other similar kinds of "noise" are present in the data for all categories. As we only use a small training set, I deleted these images manually. In future I plan to write a detector function that deletes such images automatically (they could be identified by the amount of white space in the image).
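
A rough sketch of what such a white-space detector might look like. The function names and both thresholds (`threshold=250`, `max_white=0.9`) are assumptions chosen for illustration, not values tested on this dataset:

```python
def white_fraction(pixels, threshold=250):
    """Fraction of RGB pixels whose three channels are all >= threshold."""
    total = 0
    white = 0
    for r, g, b in pixels:
        total += 1
        if r >= threshold and g >= threshold and b >= threshold:
            white += 1
    return white / total if total else 0.0


def looks_blank(pixels, max_white=0.9, threshold=250):
    """Flag an image as blank when at least max_white of it is near-white."""
    return white_fraction(pixels, threshold) >= max_white


# Synthetic pixel lists standing in for real images (hypothetical data).
mostly_white = [(255, 255, 255)] * 95 + [(10, 20, 30)] * 5
photo_like = [(255, 255, 255)] * 20 + [(120, 90, 60)] * 80
print(looks_blank(mostly_white), looks_blank(photo_like))
```

With Pillow, the pixel sequence for a real file can be obtained via `Image.open(path).convert("RGB").getdata()`. Images with a legitimately white background would also be flagged, so the threshold would need tuning against a few hand-labelled examples.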

### Note 2: Very Important
- For the code to work, we need to create a new folder 'data' (I have already done this part) and put the three category folders into it. I will automate this in future.
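
A possible way to automate this step, sketched with the standard library only. `collect_into_data` is a hypothetical helper, and the demonstration runs in a temporary directory so that no real files are moved:

```python
import os
import shutil
import tempfile


def collect_into_data(base_dir, categories, data_folder="data"):
    """Move each existing category folder under base_dir into base_dir/data_folder."""
    target = os.path.join(base_dir, data_folder)
    os.makedirs(target, exist_ok=True)
    moved = []
    for name in categories:
        source = os.path.join(base_dir, name)
        if os.path.isdir(source):
            shutil.move(source, os.path.join(target, name))
            moved.append(name)
    return moved


# Demonstration in a throwaway directory with empty category folders.
demo_root = tempfile.mkdtemp()
for name in ("category1", "category2", "category3"):
    os.makedirs(os.path.join(demo_root, name))
moved = collect_into_data(demo_root, ["category1", "category2", "category3"])
data_contents = sorted(os.listdir(os.path.join(demo_root, "data")))
shutil.rmtree(demo_root)
print(data_contents)
```

In the notebook this would be called as `collect_into_data(os.getcwd(), ['category1', 'category2', 'category3'])` after the downloads finish.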

In [2]:
path=os.getcwd()
folder='data'
path1=os.path.join(path,folder)
print("Path from which data will be taken")
print(path1)
#path1=r'C:\Users\Sai\Documents\deep_learning-master\data'
#path2=r'C:\Users\Sai\Documents\deep_learning-master\used_data'

Path from which data will be taken
C:\Users\Sai\Documents\deep_learning-master\Sai_submission\data


Now we read the images from the folders to prepare the training data:
- We read *samples_per_class* images from each category for the training data and create the corresponding labels
   - Here I use only **130 images per category**. With transfer learning we do not need many samples, because we are in a way **fine-tuning a model that was already trained to detect objects, and importantly objects of a similar nature, so taking only a few samples is justified**
- In fact, with only three categories we could use even fewer images, but since enough images are available and more data generally helps, I used this many.
- We read *20* images from each category for the validation data and create the corresponding labels

In [64]:
folders_list=sorted(os.listdir(path1))  #sorted so that class labels follow folder-name order
count=0  #integer label of the class currently being read
x_train=[]
y_train=[]
x_valid=[]
y_valid=[]
samples_per_class=130
for iter1 in folders_list:
    image_folder_path=os.path.join(path1,iter1)
    images_list=os.listdir(image_folder_path)
    #the first samples_per_class images of each folder form the training set
    images_chopped_list=images_list[0:samples_per_class]
    for i in images_chopped_list:
        img = image.load_img(os.path.join(image_folder_path,i), target_size=(224, 224))
        #img.save(path2+'\\'+ i, "JPEG")
        img=image.img_to_array(img)
        x_train.append(img)
        y_train.append(count)    
    #the next 20 images of each folder form the validation set
    images_chopped_list=images_list[samples_per_class:samples_per_class+20]
    for i in images_chopped_list:
        img = image.load_img(os.path.join(image_folder_path,i), target_size=(224, 224))
        #img.save(path2+'\\'+ i, "JPEG")
        img=image.img_to_array(img)
        x_valid.append(img)
        y_valid.append(count)
        
    count=count+1
print("completed reading training, validation data")

completed reading training, validation data
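
The slicing used in the cell above can be isolated into a small function, which makes one property easy to see: if a folder holds fewer than 150 images, the validation slice silently shrinks. This is an illustrative sketch; `split_listing` and the synthetic file names are hypothetical.

```python
def split_listing(names, n_train=130, n_valid=20):
    """First n_train names go to training, the next n_valid to validation."""
    train = names[:n_train]
    valid = names[n_train:n_train + n_valid]
    return train, valid


# Synthetic listing standing in for os.listdir(...) of one category folder.
listing = ["img_%04d.jpg" % i for i in range(200)]
train_names, valid_names = split_listing(listing)

# A folder with only 140 usable images yields just 10 validation images.
short_train, short_valid = split_listing(listing[:140])
print(len(train_names), len(valid_names), len(short_valid))
```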


In [65]:
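output_dim = 3
#preprocess_input applies the VGG16 channel-mean subtraction; the additional /255
#rescaling below must then also be applied to any image passed in at test time
x_train=preprocess_input(np.array(x_train))
x_train=x_train/255
x_valid=preprocess_input(np.array(x_valid))
x_valid=x_valid/255
#one-hot encode the integer class labels
y_train=np.array(np_utils.to_categorical(y_train,output_dim))
y_valid=np.array(np_utils.to_categorical(y_valid,output_dim))
#keep a reference so that each model cell can start from the same data
data_list=[x_train,y_train,x_valid,y_valid]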

With all the data ready, we now begin the model pipeline.

###  Part 2: Model Pipeline
#### Note: I have experimented with several models. *Only one of them should be executed*; if more than one is executed, the results will correspond to the model executed last.

#### Model 1: 
- We download the VGG16 model with weights trained on ImageNet and modify it: we remove the last softmax layer (and add a new softmax layer a few code blocks later)

In [66]:
[x_train,y_train,x_valid,y_valid]=data_list
vgg_model = VGG16( weights='imagenet', include_top=True )
vgg_out = vgg_model.layers[-2].output #output of fc2, the last FC layer before the softmax

#Build the pretrained feature extractor: images in, vgg_out features out
#(the new softmax layer is added later, on top of the saved features)
fprop = Model(input=vgg_model.input, output=vgg_out)
fprop.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_2 (InputLayer)             (None, 224, 224, 3)   0                                            
____________________________________________________________________________________________________
block1_conv1 (Convolution2D)     (None, 224, 224, 64)  1792        input_2[0][0]                    
____________________________________________________________________________________________________
block1_conv2 (Convolution2D)     (None, 224, 224, 64)  36928       block1_conv1[0][0]               
____________________________________________________________________________________________________
block1_pool (MaxPooling2D)       (None, 112, 112, 64)  0           block1_conv2[0][0]               
___________________________________________________________________________________________

#### Model 2: 
- We retain the entire VGG16 model with weights trained on ImageNet (and add a new softmax layer a few code blocks later), i.e. we classify on top of VGG16's own 1000-class predictions, a "representation of representations" model

In [23]:
[x_train,y_train,x_valid,y_valid]=data_list
vgg_model = VGG16( weights='imagenet', include_top=True )
vgg_out = vgg_model.layers[-1].output #output of the final 1000-way softmax layer

#Build the pretrained feature extractor: images in, vgg_out features out
#(the new softmax layer is added later, on top of the saved features)
fprop = Model(input=vgg_model.input, output=vgg_out)
fprop.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_3 (InputLayer)             (None, 224, 224, 3)   0                                            
____________________________________________________________________________________________________
block1_conv1 (Convolution2D)     (None, 224, 224, 64)  1792        input_3[0][0]                    
____________________________________________________________________________________________________
block1_conv2 (Convolution2D)     (None, 224, 224, 64)  36928       block1_conv1[0][0]               
____________________________________________________________________________________________________
block1_pool (MaxPooling2D)       (None, 112, 112, 64)  0           block1_conv2[0][0]               
___________________________________________________________________________________________

#### Model 3: 
- We download the VGG16 model with weights trained on ImageNet and modify it: we remove the last softmax layer and also the layer before it (and add a new softmax layer a few code blocks later)

In [37]:
[x_train,y_train,x_valid,y_valid]=data_list
vgg_model = VGG16( weights='imagenet', include_top=True )
vgg_out = vgg_model.layers[-3].output #output of fc1, the first FC layer

#Build the pretrained feature extractor: images in, vgg_out features out
#(the new softmax layer is added later, on top of the saved features)
fprop = Model(input=vgg_model.input, output=vgg_out)
fprop.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_6 (InputLayer)             (None, 224, 224, 3)   0                                            
____________________________________________________________________________________________________
block1_conv1 (Convolution2D)     (None, 224, 224, 64)  1792        input_6[0][0]                    
____________________________________________________________________________________________________
block1_conv2 (Convolution2D)     (None, 224, 224, 64)  36928       block1_conv1[0][0]               
____________________________________________________________________________________________________
block1_pool (MaxPooling2D)       (None, 112, 112, 64)  0           block1_conv2[0][0]               
___________________________________________________________________________________________

- fprop is the pretrained part of the model. As this part is not trained, we can forward-propagate the entire input through it once, store the output in a numpy array and save it. In the next part we build a new network that takes these values as input and consists of just one softmax layer. This reduces training time and lets us train on huge datasets (though that is not needed here, as we only have three categories) without needing GPUs
- The code below shows how we save this "new" representation of the inputs

In [67]:
train_features=fprop.predict(x_train,batch_size=8,verbose=1)
np.save('train_features',train_features)
np.save('train_labels',y_train)
valid_features=fprop.predict(x_valid,batch_size=8,verbose=1)
np.save('valid_features',valid_features)
np.save('valid_labels',y_valid)



- From now on, if we need to change a few parameters of the model etc., we need not run the data generation code again; we can start from the block below, where we load the numpy arrays of training and validation data!

In [68]:
train_features=np.load('train_features.npy')
y_train=np.load('train_labels.npy')
valid_features=np.load('valid_features.npy')
y_valid=np.load('valid_labels.npy')
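
The save-then-load pattern above can be folded into a single "compute only if missing" helper. This is a sketch under assumptions: `load_or_compute` is a hypothetical name, and `fake_forward_pass` stands in for the expensive `fprop.predict(...)` call so that the example runs without Keras.

```python
import os
import shutil
import tempfile

import numpy as np


def load_or_compute(path, compute):
    """Return the array cached at `path`, computing and saving it on first use."""
    if os.path.exists(path):
        return np.load(path)
    features = compute()
    np.save(path, features)
    return features


calls = []


def fake_forward_pass():
    # Stand-in for the expensive forward pass; records each invocation.
    calls.append(1)
    return np.arange(12, dtype=np.float32).reshape(3, 4)


cache_dir = tempfile.mkdtemp()
cache_path = os.path.join(cache_dir, "train_features.npy")
first = load_or_compute(cache_path, fake_forward_pass)
second = load_or_compute(cache_path, fake_forward_pass)  # served from disk
shutil.rmtree(cache_dir)
print(len(calls))
```

One caveat: the cache does not know which network produced it. Since this notebook reuses the name `train_features.npy` for VGG16, ResNet50 and InceptionV3, a per-architecture file name would be needed to avoid loading stale features.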

- Once loading of the data is done we complete the model by adding a softmax layer

In [69]:
#select model

#model 1
inp_size=4096

#model 2
#inp_size=1000

#model 3
#inp_size=4096

#model 4: I put two trainable softmax layers (dim 1000 followed by dim 3), 
#        it gives almost the same results as model 1, so I did not pursue it further
#inp_size=4096

tl_model=Sequential()
tl_model.add(Dense(output_dim,input_dim=inp_size,activation='softmax'))
#only for model 4(need to change the first layer as well)
#tl_model.add(Dense(output_dim,activation='softmax'))
tl_model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
dense_2 (Dense)                  (None, 3)             12291       dense_input_2[0][0]              
Total params: 12,291
Trainable params: 12,291
Non-trainable params: 0
____________________________________________________________________________________________________
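
The 12,291 parameters reported in the summary follow directly from the layer shape: one weight per (input, output) pair plus one bias per output. A small worked check (`dense_params` is a helper name introduced here for illustration):

```python
def dense_params(input_dim, output_dim):
    """Weights (input_dim * output_dim) plus biases (output_dim) of a dense layer."""
    return input_dim * output_dim + output_dim


# Feature sizes used in this notebook, each feeding a 3-way softmax layer.
for label, dim in [("4096-d features", 4096),
                   ("1000-d class scores", 1000),
                   ("2048-d features", 2048)]:
    print("%s -> %d trainable parameters" % (label, dense_params(dim, 3)))
```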


- Training the model (pretty much straight forward!)

In [70]:
tl_model.compile(optimizer='rmsprop',loss='categorical_crossentropy', metrics=['acc'])
tl_model.fit(train_features,y_train,batch_size=22,validation_data=(valid_features,y_valid),nb_epoch=100,verbose=1,shuffle=True)

Train on 390 samples, validate on 60 samples
Epoch 1/100
...
Epoch 74/100
[output truncated]

<keras.callbacks.History at 0x6180336be0>

## Classifier
Test image:
![alt text](test.jpg "test images")

In [73]:
# inputs list, here I just take one input 
x_test=[]
#Note: here for simplicity I am just using a single image, but we can put a loop and take multiple images as before
img = image.load_img(os.path.join(os.getcwd(),'test.jpg'), target_size=(224, 224))
img=image.img_to_array(img)
x_test.append(img)
x_test=preprocess_input(np.array(x_test))
x_test=x_test/255  #same rescaling as applied to the training and validation data

In [76]:
#no need to make a new model unnecessarily
from time import time

t1=time()
x=fprop.predict(x_test)
y=tl_model.predict(x)
category=np.argmax(y, axis=1)
if category[0]==0:
    print("This is the image of a dog.",end="")
elif category[0]==1:
    print("This is the image of a cat.",end="")
else:
    print("This is the image of a human.",end="")
t2=time()
print(" Time taken for prediction is %f second(s)."%(t2-t1))

This is the image of a dog. Time taken for prediction is 2.868860 second(s).
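
The `if`/`elif` chain can also be written as a lookup into a list of class names, which scales to more categories. This sketch assumes the label order implied by the cell above (0 = dog, 1 = cat, 2 = human); `CLASS_NAMES`, `describe` and the example probabilities are hypothetical.

```python
CLASS_NAMES = ["dog", "cat", "human"]  # index = integer label used during training


def describe(probabilities, class_names=CLASS_NAMES):
    """Return a sentence naming the highest-scoring class."""
    best = max(range(len(probabilities)), key=lambda i: probabilities[i])
    return "This is the image of a %s (score %.2f)." % (class_names[best],
                                                        probabilities[best])


# Made-up softmax output for a single image.
message = describe([0.81, 0.12, 0.07])
print(message)
```

In the notebook, `describe(y[0])` would take the place of the branch on `category[0]`.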


## Results:
epochs: 100, batch size: 22, 130 samples per category
- With Model 1: up to 79% validation accuracy, up to 92% training accuracy. (I also tried 500 epochs: training accuracy then reaches 100% or almost 100%, BUT validation accuracy stays more or less in the 72%-78% range, i.e. the model is probably overfitting. As validation accuracy is what we are interested in, I trained for just 100 epochs.)
- With Model 2: up to 46% validation accuracy, up to 52% training accuracy
- With Model 3: up to 40% validation accuracy, up to 46% training accuracy
- The time taken for prediction is around 3 seconds
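
The observation that validation accuracy plateaus while training accuracy keeps rising is the situation early stopping is meant for. Below is a plain-Python sketch of the idea; `best_epoch` and the accuracy values are hypothetical and are not the numbers from this run.

```python
def best_epoch(val_acc, patience=5):
    """Scan per-epoch validation accuracy and stop once it has not improved
    for `patience` consecutive epochs; return (index, accuracy) of the best epoch."""
    best_idx = 0
    best_val = float("-inf")
    for i, acc in enumerate(val_acc):
        if acc > best_val:
            best_idx, best_val = i, acc
        elif i - best_idx >= patience:
            break  # no improvement for `patience` epochs
    return best_idx, best_val


# Made-up validation accuracies, one value per epoch.
history = [0.50, 0.62, 0.70, 0.74, 0.78, 0.77, 0.76, 0.78, 0.75, 0.74, 0.73, 0.79]
stop = best_epoch(history, patience=5)
print(stop)
```

Keras ships an `EarlyStopping` callback (with `monitor` and `patience` arguments) that applies the same rule during `fit`, which would avoid choosing the epoch count by hand.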

## Discussion:
- With Model 1 (Model 4 gives the same results) we get pretty good results, i.e. almost 80%. This is much as expected, because we are using VGG16, which has been trained on ImageNet for many days, and we just change the output softmax layer to fit our needs.
- With Model 2 we first get the predictions of the VGG16 net and then try to classify these into 3 categories. It is possible that the final layer is specialized to the full set of classes VGG was trained on, and as a result it does not give good results here.
- With Model 3 we remove an entire layer, i.e. we reduce the depth of the network, and as ablation studies suggest, the error percentage is very high.
- I also experimented with adding an additional trainable layer in between to compensate for the removed layer, but surprisingly it did not improve the results. I have yet to analyze why this happens.

## Improving the results:
- Clearly, the more data we add the better the results are likely to be, BUT how much to add should be decided by the application the network is meant for, as that determines the level of accuracy demanded.
- We could train custom networks from scratch, but transfer learning is much quicker and gives results almost as good as, if not better than, the custom networks we could train.
- We could crop various (224,224) sections of the input images and feed these as inputs, or apply global contrast normalization etc.
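
As one concrete reading of the cropping idea, the sketch below computes five (224,224) crop boxes per image: the four corners plus the centre. The choice of these five positions and the name `five_crop_boxes` are assumptions made for illustration; the boxes use the (left, upper, right, lower) convention accepted by Pillow's `Image.crop`.

```python
def five_crop_boxes(width, height, crop=224):
    """Return corner and centre crop boxes as (left, upper, right, lower) tuples."""
    if width < crop or height < crop:
        raise ValueError("image is smaller than the crop size")
    right = width - crop    # largest valid left offset
    bottom = height - crop  # largest valid upper offset
    offsets = [(0, 0), (right, 0), (0, bottom), (right, bottom),
               (right // 2, bottom // 2)]
    return [(x, y, x + crop, y + crop) for x, y in offsets]


# Example: an image resized to 256x256 before cropping.
boxes = five_crop_boxes(256, 256)
print(boxes)
```

Each training image would then contribute five inputs instead of one, all sharing the same label.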

## Part 3: Other Architectures:

## ResNet50

In [65]:
from keras.applications.resnet50 import preprocess_input, decode_predictions
from keras.applications.resnet50 import ResNet50

[x_train,y_train,x_valid,y_valid]=data_list

resnet = ResNet50(weights='imagenet')
resnet_out = resnet.layers[-2].output #2048-d features feeding the final FC layer

#Build the pretrained feature extractor: images in, resnet_out features out
#(the new softmax layer is added later, on top of the saved features)
fprop = Model(input=resnet.input, output=resnet_out)
fprop.summary()

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels.h5
____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_8 (InputLayer)             (None, 224, 224, 3)   0                                            
____________________________________________________________________________________________________
zeropadding2d_1 (ZeroPadding2D)  (None, 230, 230, 3)   0           input_8[0][0]                    
____________________________________________________________________________________________________
conv1 (Convolution2D)            (None, 112, 112, 64)  9472        zeropadding2d_1[0][0]            
____________________________________________________________________________________________________
bn_conv1 (BatchNormalization)    (None, 112, 112, 6

In [67]:
train_features=fprop.predict(x_train,batch_size=8,verbose=1)
np.save('train_features',train_features)
np.save('train_labels',y_train)
valid_features=fprop.predict(x_valid,batch_size=8,verbose=1)
np.save('valid_features',valid_features)
np.save('valid_labels',y_valid)



In [68]:
train_features=np.load('train_features.npy')
y_train=np.load('train_labels.npy')
valid_features=np.load('valid_features.npy')
y_valid=np.load('valid_labels.npy')

In [69]:
inp_size=2048
tl_model=Sequential()
tl_model.add(Dense(output_dim,input_dim=inp_size,activation='softmax'))
tl_model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
dense_25 (Dense)                 (None, 3)             6147        dense_input_18[0][0]             
Total params: 6,147
Trainable params: 6,147
Non-trainable params: 0
____________________________________________________________________________________________________


In [77]:
tl_model.compile(optimizer='rmsprop',loss='categorical_crossentropy', metrics=['acc'])
tl_model.fit(train_features,y_train,batch_size=22,validation_data=(valid_features,y_valid),nb_epoch=100,verbose=1,shuffle=True)

Train on 390 samples, validate on 60 samples
Epoch 1/100
...
Epoch 74/100
[output truncated]

<keras.callbacks.History at 0xb2178f29b0>

## Results and Discussions:
- 1500 epochs 
- Training accuracy 75% 
- validation accuracy 47% 
- Discussion: Here the number of parameters to fine-tune is half as many as we had with VGG16 (fully connected features of dimension 4096 vs 2048), which may be why the performance is not as good. If we start modifying the convolutional layers then, as there are a lot of filters, we will have too many parameters, which might lead to overfitting and also gets computationally expensive. So for our case VGG is better than ResNet50.

## InceptionV3

## Note: 
Inception expects input images of shape (299,299), but our images are (224,224). There is a custom input shape option for Inception, but it was giving me an error that I think will take a while to debug, so for this architecture experiment I read the images again directly in (299,299) format.
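
Since the only difference in the loading code is the target size, one option is to look the size up per architecture rather than copying the loading cell. A small sketch; `INPUT_SIZES` and `target_size_for` are hypothetical names, and the sizes listed are the default input sizes of the corresponding Keras application models.

```python
# Default input sizes of the pretrained networks used in this notebook.
INPUT_SIZES = {
    "vgg16": (224, 224),
    "resnet50": (224, 224),
    "inception_v3": (299, 299),
}


def target_size_for(architecture):
    """Return the (height, width) an architecture expects, or raise ValueError."""
    try:
        return INPUT_SIZES[architecture]
    except KeyError:
        raise ValueError("unknown architecture: %r" % architecture)


print(target_size_for("inception_v3"))
```

The loading loop could then call `image.load_img(..., target_size=target_size_for('inception_v3'))`.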

In [3]:
folders_list=sorted(os.listdir(path1))  #sorted so that class labels follow folder-name order
count=0  #integer label of the class currently being read
x_train=[]
y_train=[]
x_valid=[]
y_valid=[]
samples_per_class=130
for iter1 in folders_list:
    image_folder_path=os.path.join(path1,iter1)
    images_list=os.listdir(image_folder_path)
    #the first samples_per_class images of each folder form the training set
    images_chopped_list=images_list[0:samples_per_class]
    for i in images_chopped_list:
        img = image.load_img(os.path.join(image_folder_path,i), target_size=(299, 299))
        #img.save(path2+'\\'+ i, "JPEG")
        img=image.img_to_array(img)
        x_train.append(img)
        y_train.append(count)    
    #the next 20 images of each folder form the validation set
    images_chopped_list=images_list[samples_per_class:samples_per_class+20]
    for i in images_chopped_list:
        img = image.load_img(os.path.join(image_folder_path,i), target_size=(299, 299))
        #img.save(path2+'\\'+ i, "JPEG")
        img=image.img_to_array(img)
        x_valid.append(img)
        y_valid.append(count)
        
    count=count+1
print("completed reading training, validation data")

output_dim = 3
x_train=preprocess_input(np.array(x_train))
x_train=x_train/255
x_valid=preprocess_input(np.array(x_valid))
x_valid=x_valid/255
y_train=np.array(np_utils.to_categorical(y_train,output_dim))
y_valid=np.array(np_utils.to_categorical(y_valid,output_dim))
data_list=[x_train,y_train,x_valid,y_valid]

completed reading training, validation data


In [4]:
from keras.applications.inception_v3 import InceptionV3
from keras.layers import Input

[x_train,y_train,x_valid,y_valid]=data_list
inception = InceptionV3(weights='imagenet', include_top=True)
inception_out = inception.layers[-2].output #2048-d features feeding the final FC layer

#Build the pretrained feature extractor: images in, inception_out features out
#(the new softmax layer is added later, on top of the saved features)
fprop = Model(input=inception.input, output=inception_out)
fprop.summary()


____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
input_1 (InputLayer)             (None, 299, 299, 3)   0                                            
____________________________________________________________________________________________________
convolution2d_1 (Convolution2D)  (None, 149, 149, 32)  896         input_1[0][0]                    
____________________________________________________________________________________________________
batchnormalization_1 (BatchNorma (None, 149, 149, 32)  128         convolution2d_1[0][0]            
____________________________________________________________________________________________________
convolution2d_2 (Convolution2D)  (None, 147, 147, 32)  9248        batchnormalization_1[0][0]       
___________________________________________________________________________________________

In [5]:
train_features=fprop.predict(x_train,batch_size=8,verbose=1)
np.save('train_features',train_features)
np.save('train_labels',y_train)
valid_features=fprop.predict(x_valid,batch_size=8,verbose=1)
np.save('valid_features',valid_features)
np.save('valid_labels',y_valid)



In [6]:
train_features=np.load('train_features.npy')
y_train=np.load('train_labels.npy')
valid_features=np.load('valid_features.npy')
y_valid=np.load('valid_labels.npy')

In [7]:
inp_size=2048
tl_model=Sequential()
tl_model.add(Dense(output_dim,input_dim=inp_size,activation='softmax'))
#tl_model.add(Dense(3,activation='softmax'))
tl_model.summary()

____________________________________________________________________________________________________
Layer (type)                     Output Shape          Param #     Connected to                     
dense_1 (Dense)                  (None, 3)             6147        dense_input_1[0][0]              
Total params: 6,147
Trainable params: 6,147
Non-trainable params: 0
____________________________________________________________________________________________________


In [8]:
tl_model.compile(optimizer='rmsprop',loss='categorical_crossentropy', metrics=['acc'])
tl_model.fit(train_features,y_train,batch_size=22,validation_data=(valid_features,y_valid),nb_epoch=20,verbose=1,shuffle=True)

Train on 390 samples, validate on 60 samples
Epoch 1/20
...
Epoch 20/20


<keras.callbacks.History at 0x61b5628780>

## Classifier
Test image:
![alt text](test.jpg "test images")

In [None]:
# inputs list, here I just take one input 
x_test=[]
#Note: here for simplicity I am just using a single image, but we can put a loop and take multiple images as before
img = image.load_img(os.path.join(os.getcwd(),'test.jpg'), target_size=(299, 299))
img=image.img_to_array(img)
x_test.append(img)
x_test=preprocess_input(np.array(x_test))
x_test=x_test/255  #same rescaling as applied to the training and validation data

In [63]:
#no need to make a new model unnecessarily
from time import time

t1=time()
x=fprop.predict(x_test)
y=tl_model.predict(x)
category=np.argmax(y, axis=1)
if category[0]==0:
    print("This is the image of a dog.",end="")
elif category[0]==1:
    print("This is the image of a cat.",end="")
else:
    print("This is the image of a human.",end="")
t2=time()
print(" Time taken for prediction is %f second(s)."%(t2-t1))

This is the image of a dog. Time taken for prediction is 2.032044 second(s).


## Results and Discussions: This network is a BEAST!!!
- 20 epochs (stays almost same after that)
- Training accuracy 99% 
- validation accuracy 87%
- Learning is much faster than with the VGG model
- Discussion: Clearly this gives us the best performance so far! The speed of convergence is expected, as Inception uses deep supervision, which helps avoid the vanishing gradient problem. The impressive accuracy is probably attributable to Inception's very deep architecture.
- I tweaked the model a bit by adding extra layers etc., but there was not much of an improvement in the results.
- So from the results it seems that InceptionV3 is the best network for the task!
- The time taken for prediction is around 2 seconds. This is a bit counterintuitive, as I had expected Inception to take slightly longer since it has many layers. Perhaps prediction is faster because it has far fewer parameters than VGG16 (roughly 24 million versus 138 million). This is interesting and I will look into it further.
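
One possible factor in these timings is that each measurement covers a single call, and the first call to a model may include one-time setup cost. A sketch of how repeated measurements could separate the two; `time_calls` is a hypothetical helper, and `fake_predict` simulates a slow first call so that the example runs without Keras.

```python
import time


def time_calls(fn, repeats=5):
    """Call fn `repeats` times and return the duration of each call in seconds."""
    durations = []
    for _ in range(repeats):
        start = time.perf_counter()
        fn()
        durations.append(time.perf_counter() - start)
    return durations


state = {"warm": False}


def fake_predict():
    # Simulated model call: extra delay on the first invocation only.
    if not state["warm"]:
        time.sleep(0.05)
        state["warm"] = True
    time.sleep(0.005)


durations = time_calls(fake_predict, repeats=4)
first, steady = durations[0], min(durations[1:])
print("first call: %.3fs, steady state: %.3fs" % (first, steady))
```

In the notebook, `time_calls(lambda: tl_model.predict(fprop.predict(x_test)))` would give a per-architecture steady-state figure that is fairer to compare.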