# Image Classifier - Animal Classification 


This notebook shows how a CNN model can be trained with images of animals (birds, dogs, cats and fishes) and then be used to classify an image accordingly.

## 1. Setup
To prepare your environment, install the required packages and enter your credentials for IBM Cloud Object Storage.

### 1.1 Install the necessary packages
You need the latest version of this package:
ibm-cos-sdk: the IBM Cloud Object Storage SDK for Python, which provides the ibm_boto3 client used below.

### Install the IBM Cloud Object Storage client

This step is not required when running the notebook on Watson Studio.

In [2]:
!pip install ibm-cos-sdk

Requirement not upgraded as not directly required: ibm-cos-sdk in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages
Requirement not upgraded as not directly required: ibm-cos-sdk-core==2.*,>=2.0.0 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from ibm-cos-sdk)
Requirement not upgraded as not directly required: ibm-cos-sdk-s3transfer==2.*,>=2.0.0 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from ibm-cos-sdk)
Requirement not upgraded as not directly required: python-dateutil<3.0.0,>=2.1 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from ibm-cos-sdk-core==2.*,>=2.0.0->ibm-cos-sdk)
Requirement not upgraded as not directly required: jmespath<1.0.0,>=0.7.1 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from ibm-cos-sdk-core==2.*,>=2.0.0->ibm-cos-sdk)
Requirement not upgraded as not directly required: docutils>=0.10 in /opt/conda/envs/DSX-Python35/lib/python3.5/site-packages (from ibm-cos-sdk-core==2.*,>=2.0.0->ibm-cos-sd

### Now restart the kernel by choosing Kernel > Restart.

### 1.2 Import packages and libraries
Import the packages and libraries that you'll use. Keras is used to build the CNN model.

In [6]:
import os, random
import numpy as np
import pandas as pd
import PIL
import keras
import itertools
from PIL import Image
import ibm_boto3
from ibm_botocore.client import Config  # Config class that ships with ibm-cos-sdk


from sklearn.utils import shuffle
from sklearn.model_selection import train_test_split  # sklearn.cross_validation was removed in scikit-learn 0.20
from sklearn import preprocessing
from skimage import feature, data, io, measure
from sklearn.metrics import confusion_matrix

import matplotlib.pyplot as plt
from matplotlib import ticker
import seaborn as sns
%matplotlib inline 

from keras import backend as K
from keras.models import Sequential
from keras.layers import Input, Dropout, Flatten, Conv2D, MaxPooling2D, Dense, Activation
from keras.optimizers import RMSprop, Adam
from keras.callbacks import ModelCheckpoint, Callback, EarlyStopping
from keras.utils import np_utils
from keras.preprocessing.image import ImageDataGenerator, array_to_img, img_to_array, load_img


Using TensorFlow backend.


## 2. Configuration
Set the configurable items of the notebook below.

### 2.1 Add your service credentials for Object Storage
You must create an Object Storage service on IBM Cloud. To access a file in Object Storage, you need the Object Storage authentication credentials. Remove the current contents of the following cell, insert the StreamingBody credentials for your own data file, and make sure the resulting variable is named streaming_body_1.

In [3]:

import types
import pandas as pd
from ibm_botocore.client import Config
import ibm_boto3

def __iter__(self): return 0

# @hidden_cell
# The following code accesses a file in your IBM Cloud Object Storage. It includes your credentials.
# You might want to remove those credentials before you share your notebook.
client_06ebd8cf4b5948e4b14a454efee02c0f = ibm_boto3.client(service_name='s3',
    ibm_api_key_id='iJFsJgWO9_FYuS0_5IbTcgyhdNgThw1bUtyph3P9cdC-',
    ibm_auth_endpoint="https://iam.bluemix.net/oidc/token",
    config=Config(signature_version='oauth'),
    endpoint_url='https://s3.eu-geo.objectstorage.service.networklayer.com')

# Your data file was loaded into a botocore.response.StreamingBody object.
# Please read the documentation of ibm_boto3 and pandas to learn more about your possibilities to load the data.
# ibm_boto3 documentation: https://ibm.github.io/ibm-cos-sdk-python/
# pandas documentation: http://pandas.pydata.org/
streaming_body_1 = client_06ebd8cf4b5948e4b14a454efee02c0f.get_object(Bucket='datasciencepython-donotdelete-pr-b0ptb4tc6m7zkt', Key='Data.zip')['Body']
# add missing __iter__ method, so pandas accepts body as file-like object
if not hasattr(streaming_body_1, "__iter__"): streaming_body_1.__iter__ = types.MethodType( __iter__, streaming_body_1 ) 
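
The cell above embeds the API key directly in the notebook. One way to avoid sharing it by accident is to read it from the environment instead. The following is a minimal sketch, not the code that Watson Studio generates: the helper cos_client_kwargs and the variable names COS_API_KEY and COS_ENDPOINT are hypothetical, and the returned dictionary only mirrors the keyword arguments used in the cell above (config=Config(signature_version='oauth') would still be passed separately).

```python
import os

def cos_client_kwargs(env=os.environ):
    """Build ibm_boto3.client keyword arguments from environment variables.

    COS_API_KEY and COS_ENDPOINT are hypothetical names chosen for this sketch.
    """
    api_key = env.get('COS_API_KEY')
    if not api_key:
        raise KeyError('COS_API_KEY is not set')
    return {
        'service_name': 's3',
        'ibm_api_key_id': api_key,
        'ibm_auth_endpoint': 'https://iam.bluemix.net/oidc/token',
        # Fall back to the endpoint used in the cell above
        'endpoint_url': env.get(
            'COS_ENDPOINT',
            'https://s3.eu-geo.objectstorage.service.networklayer.com'),
    }

# Example with a fake key; in the notebook you would call cos_client_kwargs()
# and pass the result on as ibm_boto3.client(config=..., **kwargs)
kwargs = cos_client_kwargs({'COS_API_KEY': 'example-key'})
print(sorted(kwargs))
```

Keeping the key out of the notebook means the cell can be shared without editing it first.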



### 2.2 Global Variables 
Set the batch sizes for the training, validation and test datasets, along with the other global settings.

In [1]:
batch_size_train = 4 # number of training images per batch
batch_size_val = 1 # number of validation images per batch
batch_size_test = 1 # number of test images per batch
num_classes= 4 # number of animal classes
intereseted_folder='Dogs' # class folder of interest
STANDARD_SIZE=(224,224) # image size (width, height) expected by VGG16

# 3. Storage

## 3.1 Extract the Dataset 

Read the zip file from Object Storage and extract the data into the /home/dsxuser/work folder.

In [4]:
from io import BytesIO
import zipfile

zip_ref = zipfile.ZipFile(BytesIO(streaming_body_1.read()),'r')
paths = zip_ref.namelist()
classes_required=[]
for path in paths:
    zip_ref.extract(path)
    temp=path.split('/')
    if len(temp) > 3:
        if temp[2] not in classes_required:
            classes_required.append(temp[2])
print(classes_required)
zip_ref.close()

['Birds', 'Cats', 'Dogs', 'Fishes']
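
The class names above come from the third path component of each archive entry (for example Data/Train_Data/Birds/...). The sketch below reproduces that logic on a small in-memory archive so it can be tried without Object Storage. The helper name class_names and the file names inside the archive are made up for illustration.

```python
from io import BytesIO
import zipfile

def class_names(paths):
    """Collect the third path component of every entry with more than three components."""
    found = []
    for path in paths:
        parts = path.split('/')
        if len(parts) > 3 and parts[2] not in found:
            found.append(parts[2])
    return found

# Build a tiny archive in memory (hypothetical file names)
buffer = BytesIO()
with zipfile.ZipFile(buffer, 'w') as archive:
    for name in ['Data/Train_Data/Birds/b1.jpg',
                 'Data/Train_Data/Cats/c1.jpg',
                 'Data/Val_Data/Birds/b2.jpg']:
        archive.writestr(name, b'')

with zipfile.ZipFile(buffer, 'r') as archive:
    print(class_names(archive.namelist()))
```

Opening the archive in a with block closes it automatically, which replaces the explicit zip_ref.close() call.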


# 4. Classification

## 4.1 Create the Dataset

In [7]:
'''Converting Data Format according to the backend used by Keras
'''
datagen=keras.preprocessing.image.ImageDataGenerator(data_format=K.image_data_format())

In [8]:
'''Input the Training Data
'''
train_path = '/home/dsxuser/work/Data/Train_Data/'
train_batches = ImageDataGenerator().flow_from_directory(train_path, target_size=(224,224), classes=classes_required, batch_size=batch_size_train)
type(train_batches)

Found 16 images belonging to 4 classes.


keras.preprocessing.image.DirectoryIterator

In [9]:
'''Input the Validation Data
'''

val_path = '/home/dsxuser/work/Data/Val_Data/'
val_batches = ImageDataGenerator().flow_from_directory(val_path, target_size=(224,224), classes=classes_required, batch_size=batch_size_val)


Found 4 images belonging to 4 classes.


In [10]:
'''Input the Test Data
'''
test_path = '/home/dsxuser/work/Data/Test_Data/'
test_batches = ImageDataGenerator().flow_from_directory(test_path, target_size=(224,224), classes=classes_required, batch_size=batch_size_test)


Found 4 images belonging to 4 classes.


In [13]:
test_imgs, test_labels = next(test_batches)
test_labels

array([[ 0.,  0.,  1.,  0.]], dtype=float32)

In [14]:
y_test= [ np.where(r==1)[0][0] for r in test_labels ]
y_test

[2]
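
Each row of test_labels is a one-hot vector, so the class index is the position of the 1. As a plain-Python sketch of what the list comprehension above does (the helper name is hypothetical; np.argmax(test_labels, axis=1) would give the same indices in one call):

```python
def one_hot_to_index(rows):
    """Return, for each one-hot row, the position of its largest entry."""
    return [max(range(len(row)), key=lambda i: row[i]) for row in rows]

labels = [[0.0, 0.0, 1.0, 0.0]]   # the batch shown above
print(one_hot_to_index(labels))   # [2]
```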

## 4.2 Build the Model

In [15]:
vgg16_model = keras.applications.vgg16.VGG16()
vgg16_model.summary()

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.1/vgg16_weights_tf_dim_ordering_tf_kernels.h5
_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
____

In [14]:
type(vgg16_model) # VGG16 is built with the Keras functional API, so convert it to a Sequential model
model = Sequential() # iterate over the functional model's layers and stack them in order
for layer in vgg16_model.layers:
    model.add(layer)

In [15]:
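model.pop() # remove the final 1000-class ImageNet layer; Sequential.pop() also updates the model's output, which model.layers.pop() does not reliably do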
model.summary()

_________________________________________________________________
Layer (type)                 Output Shape              Param #   
input_1 (InputLayer)         (None, 224, 224, 3)       0         
_________________________________________________________________
block1_conv1 (Conv2D)        (None, 224, 224, 64)      1792      
_________________________________________________________________
block1_conv2 (Conv2D)        (None, 224, 224, 64)      36928     
_________________________________________________________________
block1_pool (MaxPooling2D)   (None, 112, 112, 64)      0         
_________________________________________________________________
block2_conv1 (Conv2D)        (None, 112, 112, 128)     73856     
_________________________________________________________________
block2_conv2 (Conv2D)        (None, 112, 112, 128)     147584    
_________________________________________________________________
block2_pool (MaxPooling2D)   (None, 56, 56, 128)       0         
__________

In [16]:
for layer in model.layers: # The model is already trained, so freeze every layer to keep its weights unchanged
    layer.trainable = False

In [16]:
model.add(Dense(4, activation='sigmoid')) # Add the output layer with 4 units, one per class
model.summary()


In [18]:
# Compile the model
model.compile(Adam(lr=.00015), loss='categorical_crossentropy', metrics=['accuracy'])
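
The output layer above uses a sigmoid activation, which scores each class independently, while categorical_crossentropy is usually paired with softmax, whose outputs form a probability distribution over the classes. The following plain-Python sketch only illustrates that difference on made-up logits; it is not part of the notebook's model.

```python
import math

def sigmoid(logits):
    """Score each class independently in (0, 1)."""
    return [1.0 / (1.0 + math.exp(-z)) for z in logits]

def softmax(logits):
    """Turn logits into probabilities that sum to 1."""
    largest = max(logits)
    exps = [math.exp(z - largest) for z in logits]   # subtract the max for numerical stability
    total = sum(exps)
    return [e / total for e in exps]

logits = [0.07, -0.02, 0.06, -0.03]   # made-up values
print(sum(sigmoid(logits)))   # generally not 1
print(sum(softmax(logits)))   # 1, up to rounding
```

In Keras, trying the softmax variant would amount to passing activation='softmax' to the final Dense layer.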

## 4.3 Train the Model

The model will take about 30-45 minutes to train. 

In [19]:
model.fit_generator(train_batches, steps_per_epoch=20, validation_data=val_batches, validation_steps=20, epochs=5, verbose=1)

Epoch 1/5
Epoch 2/5
Epoch 3/5
Epoch 4/5
Epoch 5/5


<keras.callbacks.History at 0x7f580859dda0>
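
With steps_per_epoch=20 and a batch size of 4, each epoch draws 80 images from a generator that holds only 16, so every image is seen several times per epoch. If one pass over the data per epoch is preferred, the step count can be derived from the dataset size. A small sketch (the helper name is hypothetical; the image counts are the ones reported by flow_from_directory above):

```python
import math

def steps_for_one_pass(num_images, batch_size):
    """Number of batches needed to see every image once."""
    return math.ceil(num_images / batch_size)

print(steps_for_one_pass(16, 4))   # training: 16 images, batch size 4 -> 4
print(steps_for_one_pass(4, 1))    # validation: 4 images, batch size 1 -> 4
```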

## 4.4 Test the Model with External Test Images

In [34]:

# Your data file was loaded into a botocore.response.StreamingBody object.
# Please read the documentation of ibm_boto3 and pandas to learn more about your possibilities to load the data.
# ibm_boto3 documentation: https://ibm.github.io/ibm-cos-sdk-python/
# pandas documentation: http://pandas.pydata.org/
streaming_body_2 = client_06ebd8cf4b5948e4b14a454efee02c0f.get_object(Bucket='datasciencepython-donotdelete-pr-b0ptb4tc6m7zkt', Key='test_doc-external.zip')['Body']
# add missing __iter__ method so pandas accepts body as file-like object
if not hasattr(streaming_body_2, "__iter__"): streaming_body_2.__iter__ = types.MethodType( __iter__, streaming_body_2 ) 



In [None]:
#model.save_weights('my_model_weights.h5')
#model.load_weights('my_model_weights.h5')

In [35]:
from io import BytesIO
import zipfile

zip_ref = zipfile.ZipFile(BytesIO(streaming_body_2.read()),'r')
paths = zip_ref.namelist()
# del paths[0]
print(paths)
for path in paths:
    print(zip_ref.extract(path))
zip_ref.close()

['test_doc-external/bird7.jpg', 'test_doc-external/cat7.jpg', 'test_doc-external/dog7.jpg', 'test_doc-external/fish7.jpg']
/home/dsxuser/work/test_doc-external/bird7.jpg
/home/dsxuser/work/test_doc-external/cat7.jpg
/home/dsxuser/work/test_doc-external/dog7.jpg
/home/dsxuser/work/test_doc-external/fish7.jpg


In [36]:
X_test=[]
def convert_to_image(X):
    '''Resize every extracted test image to STANDARD_SIZE and append it to X as an array
    '''
    for f in paths:
        if os.path.isdir(f): # skip directory entries from the zip listing
            continue
        img = PIL.Image.open(f)
        img = img.resize(STANDARD_SIZE)
        X.append(np.array(img))
    return X
X_test=np.array(convert_to_image(X_test))
datagen.fit(X_test)

In [37]:
predictions= model.predict(X_test)
predictions

array([[ 0.51688582,  0.49448106,  0.47963449,  0.49022463],
       [ 0.47832951,  0.51879197,  0.50910962,  0.48012954],
       [ 0.50886524,  0.49857593,  0.51481497,  0.49262235],
       [ 0.48639402,  0.47868204,  0.50364858,  0.51672333]], dtype=float32)

In [38]:
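# Predicted class index for each image: the position of the highest score
y_pred = [np.argmax(p) for p in predictions]
# paths lists one image per class in the same order as classes_required,
# so paths[k] serves here as a stand-in for class k
for k in y_pred:
    print(paths[k])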

test_doc-external/bird7.jpg
test_doc-external/cat7.jpg
test_doc-external/dog7.jpg
test_doc-external/fish7.jpg


In [41]:
# Index of each class in classes_required; edit the class names here to report other classes
index= classes_required.index('Dogs')
index1= classes_required.index('Cats')
index2= classes_required.index('Fishes')
index3= classes_required.index('Birds')
for i in range(len(y_pred)):
    if y_pred[i] == index:
        print("Image that has been classified as a dog is: ", paths[i]) 
    elif y_pred[i] == index1:
        print("Image that has been classified as a cat is: ", paths[i]) 
    elif y_pred[i] == index2:
        print("Image that has been classified as a fish is: ", paths[i])
    elif y_pred[i] == index3:
        print("Image that has been classified as a bird is: ", paths[i])

Image that has been classified as a bird is:  test_doc-external/bird7.jpg
Image that has been classified as a cat is:  test_doc-external/cat7.jpg
Image that has been classified as a dog is:  test_doc-external/dog7.jpg
Image that has been classified as a fish is:  test_doc-external/fish7.jpg
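
The if/elif chain above grows with every new class. Because classes_required already lists the class names in index order, the predicted index can be used to look the name up directly. A minimal sketch using the values printed earlier in this section (the helper name predicted_labels is hypothetical):

```python
classes_required = ['Birds', 'Cats', 'Dogs', 'Fishes']
paths = ['test_doc-external/bird7.jpg', 'test_doc-external/cat7.jpg',
         'test_doc-external/dog7.jpg', 'test_doc-external/fish7.jpg']
# Scores copied from the model.predict output above
predictions = [[0.51688582, 0.49448106, 0.47963449, 0.49022463],
               [0.47832951, 0.51879197, 0.50910962, 0.48012954],
               [0.50886524, 0.49857593, 0.51481497, 0.49262235],
               [0.48639402, 0.47868204, 0.50364858, 0.51672333]]

def predicted_labels(scores, class_names):
    """Map each row of scores to the name of its highest-scoring class."""
    return [class_names[max(range(len(row)), key=lambda i: row[i])] for row in scores]

for path, label in zip(paths, predicted_labels(predictions, classes_required)):
    print(path, '->', label)
```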


## 4.5 Accuracy Testing

In [42]:
predictions = model.predict(test_imgs) # score the batch that y_test came from; test_batches shuffles by default, so a new batch may hold a different image
predictions

array([[ 0.50884491,  0.49853694,  0.51481485,  0.49263278]], dtype=float32)

In [44]:
# Predicted class index for each test image
y_pred = [np.argmax(p) for p in predictions]
print(y_pred)

# Accuracy: percentage of predictions that match the true labels in y_test
ctr=0
for i in range(len(y_pred)):
    if y_pred[i] == y_test[i]:
        ctr=ctr+1
res = ctr/len(y_pred)*100
print(res)


[2]
100.0
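
The 100.0 above is computed from a single test image, because batch_size_test is 1 and only one batch was drawn, so it says little about the model. Accuracy over several images follows the same counting logic. The sketch below wraps it in a hypothetical helper and runs it on made-up labels.

```python
def accuracy_percent(y_true, y_pred):
    """Percentage of positions where the predicted class equals the true class."""
    if len(y_true) != len(y_pred):
        raise ValueError('y_true and y_pred must have the same length')
    if not y_true:
        raise ValueError('at least one label is required')
    correct = sum(1 for t, p in zip(y_true, y_pred) if t == p)
    return correct / len(y_true) * 100

# Made-up labels for four test images, one of them misclassified
print(accuracy_percent([0, 1, 2, 3], [0, 1, 2, 1]))   # 75.0
```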
