The data which is being used for this assignment is that of Images belonging to Adidas and Starbucks. This is our project data. Our project revolves around the concept of logo recognition of various brands from a wide range of images which contain the logos.


# Image Classifier Using Convolutional Neural Network


The goal of this image classifier is to identify class to which an image belongs to. The way I am going to achieve it is by training an artificial neural network on few thousand images of adidas and starbcuks and make the NN(Neural Network) learn to predict which class the image belongs to, next time it sees an image having adidas or starbucks in it.

This model can be trained on any type of classes like for example a doctor can train Neural Network that can take a brain scan as an input and predict if the scan contains tumor or not.

So coming to the coding part, I am going to use Keras deep learning library in python to build CNN(Convolutional Neural Network).

The process of building a Convolutional Neural Network always involves four major steps:

Convolution.
Pooling.
Flattening.
Full connection.

## PART A - DEEP LEARNING MODEL

In [11]:
from keras.models import Sequential
from keras.layers import Conv2D
from keras.layers import MaxPooling2D
from keras.layers import Flatten
from keras.layers import Dense
import os

I have imported Sequential from keras.models, to initialise my neural network model as a sequential network. There are two basic ways of initialising a neural network, either by a sequence of layers or as a graph.

I have imported Conv2D from keras.layers, this is to perform the convolution operation i.e the first step of a CNN, on the training images. Since we are working on images here, which a basically 2 Dimensional arrays, we’re using Convolution 2-D. For videos we can use convolution 3D.

I have imported MaxPooling2D from keras.layers, which is used for pooling operation, that is the step — 2 in the process of building a cnn. For building this particular neural network, we are using a Maxpooling function, there exist different types of pooling operations like Min Pooling, Mean Pooling, etc. Here in MaxPooling we need the maximum value pixel from the respective region of interest.

Flatten is used for flattening, it is the process of converting resultant 2D array into a single long continuos linear vector.

Dense is used to perform full connection of the neural network.

In [103]:
classifier = Sequential()    #Creating an object of the sequential class below.

## PART B - ACTIVATION FUNCTION

The activation function which is being used here is that of ReLU

In [104]:
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))    

Let’s break down the above code function by function. I took the sequential object , then I added a convolution layer by using the “Conv2D” function. The Conv2D function is taking 4 arguments, the first is the number of filters i.e 32 here, the second argument is the shape each filter is going to be i.e 3x3 here, the third is the input shape and the type of image(RGB or Black and White)of each image i.e the input image our CNN is going to be taking is of a 64x64 resolution and “3” stands for RGB, which is a colour img, the fourth argument is the activation function we want to use, here ‘relu’ stands for a rectifier function.

## Pooling

To understand pooling I recommend to use the following link: http://ufldl.stanford.edu/tutorial/supervised/Pooling/

In [105]:
classifier.add(MaxPooling2D(pool_size = (2, 2)))

In [106]:
classifier.add(Conv2D(32, (3, 3), input_shape = (64, 64, 3), activation = 'relu'))    

In [107]:
classifier.add(MaxPooling2D(pool_size = (2, 2)))

## Flattening

In [108]:
classifier.add(Flatten())

What I am basically doing here is taking the 2-D array, i.e pooled image pixels and converting them to a one dimensional single vector.

In [109]:
classifier.add(Dense(units = 128, activation = 'relu'))

In [110]:
classifier.add(Dense(units = 1, activation = 'sigmoid'))


In this step we need to create a fully connected layer, and to this layer we are going to connect the set of nodes we got after the flattening step, these nodes will act as an input layer to these fully-connected layers.

‘units’ is where we define the number of nodes that should be present in this hidden layer, these units value will be always between the number of input nodes and the output nodes but the art of choosing the most optimal number of nodes can be achieved only through experimental tries. Though it’s a common practice to use a power of 2. And the activation function will be a rectifier function.

## PART C & E- COST FUNCTION & GRADIENT ESTIMATION

The cost function which is being used here is Cross-Entropy. Since, the number of classes to predict is only two, the type of cross-entropy is binary.
The optimization function to be used for estimating the highest probability to classify correctly is "ADAM"

In [111]:
classifier.compile(optimizer = 'adam', loss = 'binary_crossentropy', metrics = ['accuracy'])


Optimizer parameter is to choose the stochastic gradient descent algorithm. Loss parameter is to choose the loss function. The metrics parameter is to choose the performance metric.

So before we fit our images to the neural network, we need to perform some image augmentations on them, which is basically synthesising the training data. This step is done to prevent overfitting of the model. I am going to do this using keras.preprocessing library for doing the synthesising part as well as to prepare the training set and the test set of images that are present in a properly structured directories, where the directory’s name is take as the label of all the images present in it.

## Making new predictions from our trained model

The test_image holds the image that needs to be tested on the CNN. Preparing the image to be sent into the model by converting its resolution to 64x64 as the model only excepts that resolution. Then I am using predict() method on classifier object to get the prediction. As the prediction will be in a binary form, we will be receiving either a 1 or 0, which will represent a dog or a cat respectively.

In [114]:
from keras.preprocessing.image import ImageDataGenerator
train_datagen = ImageDataGenerator(rescale = 1./255,
shear_range = 0.2,
zoom_range = 0.2,
horizontal_flip = True)
test_datagen = ImageDataGenerator(rescale = 1./255)
training_set = train_datagen.flow_from_directory('assign4/train',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')
test_set = test_datagen.flow_from_directory('assign4/test',
target_size = (64, 64),
batch_size = 32,
class_mode = 'binary')

Found 229 images belonging to 2 classes.
Found 20 images belonging to 2 classes.


## Fiting the model

## PART D - EPOCHS

A single epoch is a single step in training a neural network; in other words when a neural network is trained on every training samples only in one pass we say that one epoch is finished. So training process should consist more than one epochs.In this case we have defined 2 epochs.

In [115]:
classifier.fit_generator(training_set,
steps_per_epoch = 400,
epochs = 2,
validation_data = test_set,
validation_steps = 6)

Epoch 1/2
Epoch 2/2


<keras.callbacks.History at 0x1f05a0c8940>

In [None]:
import matplotlib.pyplot as plt
%matplotlib.inline
fig=plt.figure()
plt.plot(display)

In [10]:
import numpy as np
from keras.preprocessing import image
path = "abc.png"
#imageID+=1
test_image = image.load_img(path, target_size = (64, 64))
test_image = image.img_to_array(test_image)
test_image = np.expand_dims(test_image, axis = 0)
result = load.predict(test_image)
#training_set.class_indices
#print(training_set.class_indices)

if result[0]==0:
    print('dunkin donuts')
else:
    print('starbucks')




    


dunkin donuts


In [38]:
starCount = 0
ddCount = 0

In [39]:
directoryPath = 'assign4/test/dunkin donuts'
for dire,subdir,files in os.walk(directoryPath):
    for i in files:
        path = 'assign4/test/dunkin donuts/' + str(i)
        test_image = image.load_img(path, target_size = (64, 64))
        test_image = image.img_to_array(test_image)
        test_image = np.expand_dims(test_image, axis = 0)
        result = load.predict(test_image)
        if result[0]==0:
            ddCount += 1
            #print('dunkin donuts')
            flag = 0
        else:
            starCount += 1
            #print('starbucks')
            flag = 1


            
if flag ==0:
    print('No. of predicted Dunkin Donuts images: ', ddCount)
else:
    print('No. of predicted Starbucks images: ', starCount)
 

No. of predicted Dunkin Donuts images:  8


In [41]:
print(starCount)

6


In [42]:
print(ddCount)

8


## PART F - NETWORK ARCHITECTURE

In [127]:
classifier.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
conv2d_7 (Conv2D)            (None, 62, 62, 32)        896       
_________________________________________________________________
max_pooling2d_7 (MaxPooling2 (None, 31, 31, 32)        0         
_________________________________________________________________
conv2d_8 (Conv2D)            (None, 29, 29, 32)        9248      
_________________________________________________________________
max_pooling2d_8 (MaxPooling2 (None, 14, 14, 32)        0         
_________________________________________________________________
flatten_3 (Flatten)          (None, 6272)              0         
_________________________________________________________________
dense_9 (Dense)              (None, 128)               802944    
_________________________________________________________________
dense_10 (Dense)             (None, 1)                 129       
Total para

In [128]:
classifier.save("C:/Users/chels/bdiaprojfinal.h5")

In [3]:
from keras.models import load_model
load = load_model('C://Users//chels//bigdatafinalproject//bdiaprojfinal.h5')

MIT License

Copyright (c) 2018, Karan Barai, Gauravi Chaudhari, Pranav Swaminathan

Permission is hereby granted, free of charge, to any person obtaining a copy
of this software and associated documentation files (the "Software"), to deal
in the Software without restriction, including without limitation the rights
to use, copy, modify, merge, publish, distribute, sublicense, and/or sell
copies of the Software, and to permit persons to whom the Software is
furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all
copies or substantial portions of the Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR
IMPLIED, INCLUDING BUT NOT LIMITED TO THE WARRANTIES OF MERCHANTABILITY,
FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE
AUTHORS OR COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER
LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR OTHERWISE, ARISING FROM,
OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE
SOFTWARE.