# Pokemon Classifier
- We'll use the ResNet50 model, a 50-layer classifier trained to recognize 1000 object classes, [from Keras](https://keras.io/api/applications/).

In [1]:
# use the public tensorflow.keras API rather than the private tensorflow.python.keras modules
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.applications.resnet50 import preprocess_input, decode_predictions
from tensorflow.keras.preprocessing import image
import numpy as np

- `preprocess_input` is the function that converts images into the input format ResNet50 expects.
- `decode_predictions` is the function that converts the model output (1000 class probabilities) into human-readable labels.
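
As a rough illustration of what `preprocess_input` does for ResNet50, the sketch below mimics the "caffe"-style preprocessing in plain NumPy: it reorders the channels from RGB to BGR and subtracts the ImageNet per-channel means, without rescaling. The name `caffe_style_preprocess` is made up for this example; in the notebook itself the Keras function should be used.

```python
import numpy as np

# ImageNet channel means in BGR order, as used by Keras' 'caffe' preprocessing mode
IMAGENET_BGR_MEANS = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def caffe_style_preprocess(batch):
    """Hypothetical stand-in for preprocess_input: (N, H, W, 3) RGB values in [0, 255]."""
    batch = np.asarray(batch, dtype=np.float32)
    bgr = batch[..., ::-1]            # RGB -> BGR
    return bgr - IMAGENET_BGR_MEANS   # zero-center each channel, no scaling

demo = np.zeros((1, 2, 2, 3), dtype=np.float32)
demo[..., 0], demo[..., 1], demo[..., 2] = 10.0, 20.0, 30.0   # R=10, G=20, B=30
out = caffe_style_preprocess(demo)
print(out.shape)
print(out[0, 0, 0])   # BGR pixel minus the channel means
```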

In [2]:
# Load weights trained on the ImageNet dataset. Set weights=None to initialize the model with random weights
model = ResNet50(weights='imagenet')
print('Model Downloaded Successfully')

Model Downloaded Successfully


In [3]:
# Load the image
img_path = 'elephant.jpg'
img = image.load_img(img_path, target_size=(224,224)) # resize to the 224x224 input size the model expects
x = image.img_to_array(img) # convert to a NumPy array of shape (224, 224, 3)
x = np.expand_dims(x, axis=0) # add the batch dimension: (1, 224, 224, 3)
x = preprocess_input(x)
print('image preprocess done!')

image preprocess done!


### Make the predictions

In [4]:
x.shape

(1, 224, 224, 3)

In [5]:
pred = model.predict(x)
# decode the results into a list of (class ID, description, probability) tuples
# top=3 returns the three most likely classes; up to 1000 are available
print(decode_predictions(pred, top=3)[0])

[('n01871265', 'tusker', 0.40294957), ('n02504013', 'Indian_elephant', 0.3175466), ('n02504458', 'African_elephant', 0.2786193)]
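
The decoding step amounts to picking the highest-probability entries and attaching their names. A minimal sketch of that idea, assuming a made-up five-class label list (the helper `top_k` and the class names are illustrative, not part of Keras):

```python
import numpy as np

def top_k(probs, class_names, k=3):
    """Return the k most probable (name, probability) pairs, best first."""
    order = np.argsort(probs)[::-1][:k]   # indices sorted by descending probability
    return [(class_names[i], float(probs[i])) for i in order]

# Made-up 5-class example; the real model outputs 1000 probabilities
names = ['cat', 'dog', 'tusker', 'car', 'tree']
probs = np.array([0.05, 0.10, 0.60, 0.05, 0.20])
print(top_k(probs, names, k=3))
```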


- Note that no label has a probability above 50%. The probability is split across three closely related elephant classes, and the image may also differ from the images in the original dataset.
- To overcome this, we use a transfer learning strategy to build our own classifier.

# Project
About the dataset
- It contains images of 10 different Pokemon.


1. Load, shuffle, and preprocess (one-hot encode the labels) our dataset

In [7]:
import os
from tensorflow.keras.preprocessing import image

folders = os.listdir('Train') # folders inside the Train folder
print(folders)

# Collect every image as a NumPy array, with one integer label per class folder
image_data = []
labels = []

for label, folder in enumerate(folders):
    path = os.path.join('Train', folder) # path to each class folder

    for im in os.listdir(path):
        try:
            img = image.load_img(os.path.join(path, im), target_size = (224, 224))
            img_array = image.img_to_array(img) # the loaded image as a NumPy array
            image_data.append(img_array)
            labels.append(label)
        except Exception:
            # skip files that cannot be read as images
            continue

['Spearow', 'Squirtle', 'Pikachu', 'Charmander', 'Fearow', 'Aerodactyl', 'Bulbasaur', 'Meowth', 'Psyduck', 'Dratini']




In [8]:
# Shuffle the dataset, keeping each image paired with its label

import random

combined_dataset = list(zip(image_data, labels)) # zip() pairs each image with its label
random.shuffle(combined_dataset)
image_data[:], labels[:] = zip(*combined_dataset) # unzip back into two lists
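
An alternative way to shuffle two parallel collections, sketched here with small stand-in arrays, is to draw one random permutation of the indices and apply it to both. The arrays and the fixed seed are assumptions for the demo only.

```python
import numpy as np

rng = np.random.default_rng(0)       # fixed seed only to make the demo repeatable
images = np.arange(5) * 10           # stand-ins for image arrays: 0, 10, 20, 30, 40
targets = np.arange(5)               # stand-ins for labels:       0, 1, 2, 3, 4

perm = rng.permutation(len(images))  # one random ordering of the indices
images_shuffled = images[perm]
targets_shuffled = targets[perm]

# Every stand-in image is still paired with its own label
print(np.all(images_shuffled == targets_shuffled * 10))
```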

In [9]:
# Convert labels to one-hot encoding (the format a multi-layer perceptron classifier expects)

from tensorflow.keras.utils import to_categorical
# convert the lists to NumPy arrays
X_train = np.array(image_data)
Y_train = np.array(labels)
# convert the integer labels to one-hot vectors
Y_train = to_categorical(Y_train)
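
To make the one-hot step concrete, here is a small NumPy sketch of the same transformation. The helper `one_hot` is hypothetical and only illustrates the idea: each integer label selects one row of an identity matrix.

```python
import numpy as np

def one_hot(labels, num_classes):
    """Hypothetical NumPy equivalent of one-hot encoding integer labels."""
    labels = np.asarray(labels, dtype=int)
    return np.eye(num_classes, dtype=np.float32)[labels]

encoded = one_hot([0, 2, 1], num_classes=3)
print(encoded)
```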

2. Use the ResNet50 model

In [11]:
from tensorflow.keras.applications.resnet50 import ResNet50
from tensorflow.keras.optimizers import Adam
from tensorflow.keras.layers import GlobalAveragePooling2D, Dense, Dropout
from tensorflow.keras.models import Model
import numpy as np
print("Imported successfully")

Imported successfully


In [12]:
# include_top=False drops the original 1000-class classifier, so we can build our own on top of the convolutional base
# input_shape can only be customized when include_top is False; here we keep the standard (224, 224, 3)
# weights='imagenet' loads the weights the model learned on the ImageNet dataset
model = ResNet50(include_top = False, weights = 'imagenet', input_shape = (224, 224, 3))
# print the summary to understand the architecture (summary() prints directly)
model.summary()

Downloading data from https://github.com/fchollet/deep-learning-models/releases/download/v0.2/resnet50_weights_tf_dim_ordering_tf_kernels_notop.h5
__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            (None, 224, 224, 3)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 112, 112, 64) 9472        input_2[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 112, 112, 64) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_49 (Activation)      (None, 112, 112, 64) 0       

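The convolutional base has about 23 million trainable parameters. We won't train all of them.
The summary also shows that the base output is large: flattening the (7, 7, 2048) feature map gives 7 * 7 * 2048 = 100,352 features, so a fully connected layer on top of it would need a very large number of parameters. To avoid this, we apply `GlobalAveragePooling2D`, which reduces (7, 7, 2048) to (1, 2048), before passing the features to our fully connected layer.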
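
The effect of pooling on the parameter count can be checked with a few lines of NumPy. This is only a sketch: a random array stands in for the base model's output, and the 256-unit dense layer size is taken from the classifier built in this notebook.

```python
import numpy as np

features = np.random.rand(1, 7, 7, 2048)   # stand-in for the ResNet50 base output

flat = features.reshape(1, -1)             # what Flatten would produce
pooled = features.mean(axis=(1, 2))        # what global average pooling produces

units = 256                                # size of the first dense layer
params_after_flatten = flat.shape[1] * units + units      # weights + biases
params_after_pooling = pooled.shape[1] * units + units

print(flat.shape, pooled.shape)
print(params_after_flatten, params_after_pooling)
```

With flattening, the first dense layer alone would hold roughly 25.7 million parameters; with global average pooling it holds about 0.5 million.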

Note: we're using the Keras Functional API.

In [16]:
# Build the classifier
av1 = GlobalAveragePooling2D()(model.output) # the input of this layer is the output of the base model
fc1 = Dense(256, activation = 'relu')(av1) # dense layer with 256 neurons
d1 = Dropout(0.5)(fc1) # randomly drop half of fc1's outputs during training
fc2 = Dense(10, activation = 'softmax')(d1) # final layer: 10 neurons with softmax (outputs sum to 1)

(?, 2048)
(?, 1, 1, 2048)


In [17]:
# connect the pre-trained base with the classifier we built
model_new = Model(inputs = model.input, outputs = fc2) # groups the layers into a single model object
model_new.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            (None, 224, 224, 3)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 112, 112, 64) 9472        input_2[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 112, 112, 64) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_49 (Activation)      (None, 112, 112, 64) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
max_poolin

3. Make predictions using our aggregated model

In [21]:
# Check our Pikachu image
from tensorflow.keras.applications.resnet50 import preprocess_input

image_path = 'pikachu.jpg'
img = image.load_img(image_path, target_size = (224,224))
x = image.img_to_array(img)
x = np.expand_dims(x, axis=0)
x = preprocess_input(x)

pred = model_new.predict(x)
print(np.argmax(pred))

2
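
The printed index refers to the position of the class folder in the list printed when the dataset was loaded. A small sketch of mapping the index back to a name, using that folder list and a made-up probability vector in place of the real `model_new.predict` output:

```python
import numpy as np

# Folder order printed when the dataset was loaded; labels were assigned in this order
class_names = ['Spearow', 'Squirtle', 'Pikachu', 'Charmander', 'Fearow',
               'Aerodactyl', 'Bulbasaur', 'Meowth', 'Psyduck', 'Dratini']

# Made-up softmax output for one image (real values come from the model)
pred = np.array([[0.02, 0.03, 0.70, 0.05, 0.02, 0.03, 0.05, 0.04, 0.03, 0.03]])

idx = int(np.argmax(pred, axis=1)[0])   # index of the most probable class
print(idx, class_names[idx])
```

Because `os.listdir` does not guarantee any particular order, the mapping is only valid for the folder list from the same run.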


4. Fine-tuning the model

- Update the weights using a low learning rate.
- Mark some layers as trainable and freeze the rest.

In [24]:
# print every layer of the model with its index
for ix, layer in enumerate(model_new.layers):
    print(ix, layer)

0 <tensorflow.python.keras.engine.input_layer.InputLayer object at 0x7ffc3836a6d8>
1 <tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ffc3836ae10>
2 <tensorflow.python.keras.layers.normalization.BatchNormalization object at 0x7ffc3836ac88>
3 <tensorflow.python.keras.layers.core.Activation object at 0x7ffc0b28f0f0>
4 <tensorflow.python.keras.layers.pooling.MaxPooling2D object at 0x7ffc0b28fa20>
5 <tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ffc5fae4390>
6 <tensorflow.python.keras.layers.normalization.BatchNormalization object at 0x7ffc38244588>
7 <tensorflow.python.keras.layers.core.Activation object at 0x7ffc382444a8>
8 <tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ffc382447b8>
9 <tensorflow.python.keras.layers.normalization.BatchNormalization object at 0x7ffc3823beb8>
10 <tensorflow.python.keras.layers.core.Activation object at 0x7ffc3823bf28>
11 <tensorflow.python.keras.layers.convolutional.Conv2D object at 0x7ffc38259fd0>
12 

In [28]:
# There are 178 layers. Freeze layers 0 to 168 so that only the last 9 layers (169 to 177) can be updated
for ix in range(169):
    model_new.layers[ix].trainable = False
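
To double-check which layers remain trainable after a loop like the one above, the indexing can be tried on stand-in objects. This sketch assumes 178 layers and uses `SimpleNamespace` objects as hypothetical substitutes for Keras layers; only the `trainable` flag is modelled.

```python
from types import SimpleNamespace

# Hypothetical stand-ins for Keras layers; only the `trainable` flag matters here
layers = [SimpleNamespace(name=f"layer_{i}", trainable=True) for i in range(178)]

n_frozen = 169
for layer in layers[:n_frozen]:       # same indices as range(169): 0 through 168
    layer.trainable = False

trainable = [i for i, layer in enumerate(layers) if layer.trainable]
print(len(trainable), trainable[0], trainable[-1])
```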

In [29]:
adam = Adam(learning_rate = 0.00003) # low learning rate for fine-tuning
model_new.compile(loss = 'categorical_crossentropy', optimizer = adam, metrics = ['accuracy'])
model_new.summary()

__________________________________________________________________________________________________
Layer (type)                    Output Shape         Param #     Connected to                     
input_2 (InputLayer)            (None, 224, 224, 3)  0                                            
__________________________________________________________________________________________________
conv1 (Conv2D)                  (None, 112, 112, 64) 9472        input_2[0][0]                    
__________________________________________________________________________________________________
bn_conv1 (BatchNormalization)   (None, 112, 112, 64) 256         conv1[0][0]                      
__________________________________________________________________________________________________
activation_49 (Activation)      (None, 112, 112, 64) 0           bn_conv1[0][0]                   
__________________________________________________________________________________________________
max_poolin

5. Train our model with our data
- Pass our training data and labels to `fit()`.

In [None]:
hist = model_new.fit(X_train, Y_train, shuffle = True, batch_size = 16, epochs = 8, validation_split = 0.2)

Train on 1316 samples, validate on 329 samples
Epoch 1/8
Epoch 2/8
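
The sample counts in the log follow from `validation_split = 0.2`: Keras holds out the last fraction of the provided samples, before any per-epoch shuffling, which is one reason to shuffle the dataset beforehand. A minimal sketch of that split, assuming 1645 samples in total (1316 + 329) and a hypothetical helper name:

```python
import numpy as np

def split_last_fraction(x, y, validation_split=0.2):
    """Hypothetical helper: hold out the last fraction of the samples for validation."""
    split_at = int(len(x) * (1.0 - validation_split))
    return (x[:split_at], y[:split_at]), (x[split_at:], y[split_at:])

x = np.zeros((1645, 1))    # 1645 dummy samples, matching 1316 + 329 in the log
y = np.zeros((1645, 10))   # dummy one-hot labels for 10 classes
(train_x, train_y), (val_x, val_y) = split_last_fraction(x, y)
print(len(train_x), len(val_x))
```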