#### Copyright 2018 Google LLC.

## Classification of Various Metal Oxide Materials with Model Fine-Tuning



## Feature Extraction Using a Pretrained Model

One thing that is commonly done in computer vision is to take a model trained on a very large dataset, run it on your own, smaller dataset, and extract the intermediate representations (features) that the model generates. These representations are frequently informative for your own computer vision task, even though the task may be quite different from the problem that the original model was trained on. This versatility and repurposability of convnets is one of the most interesting aspects of deep learning.

In our case, we will use the [Inception V3 model](https://arxiv.org/abs/1512.00567) developed at Google and pre-trained on [ImageNet](http://image-net.org/), a large dataset of web images (1.4M images and 1000 classes). This is a powerful model; let's see what the features it has learned can do for our metal oxide support classification problem, which has three classes: NPs, SiO2NBs, and TiO2.

First, we need to pick which intermediate layer of Inception V3 we will use for feature extraction. A common practice is to use the output of the very last layer before the `Flatten` operation, the so-called "bottleneck layer." The reasoning here is that the following fully connected layers will be too specialized for the task the network was trained on, and thus the features learned by these layers won't be very useful for a new task. The bottleneck features, however, retain much generality.

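Before instantiating an Inception V3 model preloaded with weights trained on ImageNet, we mount Google Drive (where the images are stored), import the libraries we need, and check the runtime's GPU and RAM: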


In [8]:
# Mount Google Drive, which holds the images
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [9]:
!pip install -q pyyaml h5py  # Required to save models in HDF5 format

# Colab magic: select TensorFlow 2.x before TensorFlow is imported
%tensorflow_version 2.x

import os
import numpy as np
import pandas as pd
import seaborn as sn
import matplotlib.pyplot as plt

import tensorflow as tf

from tensorflow.keras import layers
from tensorflow.keras import Model
from tensorflow.keras.preprocessing.image import ImageDataGenerator
from tensorflow.keras.applications.inception_v3 import InceptionV3
from tensorflow.keras.optimizers import RMSprop

from sklearn.metrics import classification_report, confusion_matrix

print('TensorFlow version is {}'.format(tf.__version__))

TensorFlow version is 2.2.0


In [10]:
# Check that the runtime has a GPU and enough RAM
device_name = tf.test.gpu_device_name()
if device_name != '/device:GPU:0':
  raise SystemError('GPU device not found')
print('Found GPU at: {}'.format(device_name))

gpu_info = !nvidia-smi
gpu_info = '\n'.join(gpu_info)
if gpu_info.find('failed') >= 0:
  print('Select the Runtime > "Change runtime type" menu to enable a GPU accelerator, ')
  print('and then re-execute this cell.')
else:
  print(gpu_info)

from psutil import virtual_memory
ram_gb = virtual_memory().total / 1e9
print('Your runtime has {:.1f} gigabytes of available RAM\n'.format(ram_gb))

if ram_gb < 20:
  print('To enable a high-RAM runtime, select the Runtime > "Change runtime type"')
  print('menu, and then select High-RAM in the Runtime shape dropdown. Then, ')
  print('re-execute this cell.')
else:
  print('You are using a high-RAM runtime!')

Found GPU at: /device:GPU:0
Thu Jun  4 06:00:27 2020       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 440.82       Driver Version: 418.67       CUDA Version: 10.1     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|   0  Tesla P100-PCIE...  Off  | 00000000:00:04.0 Off |                    0 |
| N/A   44C    P0    35W / 250W |    353MiB / 16280MiB |      0%      Default |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Processes:                                                       GPU Memory |
|  GPU       PID   Type   Process name                      

Now let's download the weights:

In [0]:
#https://www.tensorflow.org/tutorials/keras/
#https://colab.research.google.com/github/tensorflow/docs/blob/master/site/en/guide/keras/custom_callback.ipynb#scrollTo=Ct0VCSI2dt3a
#above reference is related to callbacks and modularity of training
def create_model():
    !wget --no-check-certificate \
        https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5 \
        -O /tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5

    #from tensorflow.keras.applications.inception_v3 import InceptionV3
    #from tensorflow.keras.optimizers import RMSprop

    local_weights_file = '/tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5'
    pre_trained_model = InceptionV3(
        input_shape=(150, 150, 3), include_top=False, weights=None)
    pre_trained_model.load_weights(local_weights_file)

    # Freeze the pretrained layers: we use the base model only for feature
    # extraction, so its weights are not updated during training.
    for layer in pre_trained_model.layers:
      layer.trainable = False

    # The layer we use for feature extraction in Inception V3 is called mixed7.
    # It is not the bottleneck of the network, but it keeps a sufficiently
    # large feature map (7x7 in this case). Using the bottleneck layer would have
    # resulted in a 3x3 feature map, which is a bit small. Get the output of mixed7:
    last_layer = pre_trained_model.get_layer('mixed7')
    last_output = last_layer.output

    #Now let's stick a fully connected classifier on top of last_output:
    # Flatten the output layer to 1 dimension
    x = layers.Flatten()(last_output)
    # Add a fully connected layer with 1,024 hidden units and ReLU activation
    x = layers.Dense(1024, activation='relu')(x)
    # Add a dropout rate of 0.2
    x = layers.Dropout(0.2)(x)
    # Add a final softmax layer for classification
    x = layers.Dense(3, activation='softmax', name='visualized_layer')(x)

    # Build the full model, from the pretrained input to the new classifier head
    model = Model(pre_trained_model.input, x)

    # Return the uncompiled model; the caller is responsible for compiling it
    return model
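
The feature-map sizes quoted in the comments of `create_model` (7x7 at `mixed7`, 3x3 at the bottleneck) can be checked by hand. The sketch below is not part of the original notebook. It assumes the standard Keras Inception V3 layout (the kernel sizes, strides, and paddings listed are an assumption about that layout, and `conv_out` is a hypothetical helper) and tracks only the spatial size for a 150x150 input.

```python
def conv_out(size, kernel, stride=1, padding="valid"):
    """Spatial output size of a conv/pool layer along one dimension."""
    if padding == "same":
        return -(-size // stride)  # ceil(size / stride)
    return (size - kernel) // stride + 1

# (kernel, stride, padding) for the stem layers, ASSUMING the standard
# Keras Inception V3 architecture.
stem = [
    (3, 2, "valid"),  # conv
    (3, 1, "valid"),  # conv
    (3, 1, "same"),   # conv
    (3, 2, "valid"),  # max pool
    (1, 1, "valid"),  # conv
    (3, 1, "valid"),  # conv
    (3, 2, "valid"),  # max pool
]

size = 150
for kernel, stride, padding in stem:
    size = conv_out(size, kernel, stride, padding)
stem_size = size  # mixed0 to mixed2 keep this size

# mixed3 reduces the grid (3x3, stride 2); mixed4 to mixed7 keep that size
mixed7_size = conv_out(stem_size, 3, 2)
# mixed8 reduces the grid again; mixed9 and mixed10 keep that size
bottleneck_size = conv_out(mixed7_size, 3, 2)

print(stem_size, mixed7_size, bottleneck_size)  # 16 7 3
```

Under the same assumption that `mixed7` has 768 channels, `Flatten` would produce 7 * 7 * 768 = 37,632 features feeding the 1,024-unit dense layer.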

In [0]:
# Data locations
# This dataset is class-balanced and has only three classes
train_dir = './drive/My Drive/SupportClassification/three_supports/train'
validation_dir = './drive/My Drive/SupportClassification/three_supports/validation'

train_NPs_dir = os.path.join(train_dir, 'NPs')
train_SiO2NBs_dir = os.path.join(train_dir, 'SiO2NBs')
train_TiO2_dir = os.path.join(train_dir, 'TiO2')

validation_NPs_dir = os.path.join(validation_dir, 'NPs')
validation_SiO2NBs_dir = os.path.join(validation_dir, 'SiO2NBs')
validation_TiO2_dir = os.path.join(validation_dir, 'TiO2')

num_NPs_tr = len(os.listdir(train_NPs_dir))
num_SiO2NBs_tr = len(os.listdir(train_SiO2NBs_dir))
num_TiO2_tr = len(os.listdir(train_TiO2_dir))

num_NPs_val = len(os.listdir(validation_NPs_dir))
num_SiO2NBs_val = len(os.listdir(validation_SiO2NBs_dir))
num_TiO2_val = len(os.listdir(validation_TiO2_dir))

total_train = num_NPs_tr + num_SiO2NBs_tr + num_TiO2_tr

total_val = num_NPs_val + num_SiO2NBs_val + num_TiO2_val
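
The per-class counting above repeats the same pattern for every class, which makes it easy to leave one out of a total. A more compact alternative might look like the sketch below. `count_images_per_class` is a hypothetical helper, not part of the original notebook, and the demonstration runs on a throwaway directory with made-up file names rather than on the Drive folders.

```python
import os
import tempfile

def count_images_per_class(split_dir):
    """Return {class_name: number of files} for each class subdirectory."""
    counts = {}
    for name in sorted(os.listdir(split_dir)):
        class_dir = os.path.join(split_dir, name)
        if os.path.isdir(class_dir):
            counts[name] = len(os.listdir(class_dir))
    return counts

# Demonstration on a temporary directory tree (file names are made up)
with tempfile.TemporaryDirectory() as root:
    for class_name, n_files in [("NPs", 3), ("SiO2NBs", 2), ("TiO2", 4)]:
        os.makedirs(os.path.join(root, class_name))
        for i in range(n_files):
            open(os.path.join(root, class_name, f"img_{i}.png"), "w").close()
    demo_counts = count_images_per_class(root)

demo_total = sum(demo_counts.values())
print(demo_counts, demo_total)  # {'NPs': 3, 'SiO2NBs': 2, 'TiO2': 4} 9
```

In the notebook, something like `sum(count_images_per_class(train_dir).values())` would then give the training total across all three classes.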

In [0]:
# Define the checkpoint callback, starting with the save path
checkpoint_path = "drive/My Drive/SupportClassification/three_supports/SandV3_modelweights/weights.{epoch:02d}-{val_acc:.2f}.hdf5.ckpt"
checkpoint_dir = os.path.dirname(checkpoint_path)
# Create a callback that saves the model's weights
cp_callback = tf.keras.callbacks.ModelCheckpoint(filepath=checkpoint_path,
                                                 save_weights_only=True,
                                                 verbose=1)
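
The placeholders in `checkpoint_path` are named format fields that Keras fills in from the epoch number and the logged metrics each time it saves. The sketch below reproduces that with plain `str.format`; the epoch and accuracy values are made up for illustration.

```python
template = "weights.{epoch:02d}-{val_acc:.2f}.hdf5.ckpt"

# Made-up values: Keras would supply the 1-based epoch number and the
# validation accuracy logged for that epoch.
filename = template.format(epoch=4, val_acc=0.9878)
print(filename)  # weights.04-0.99.hdf5.ckpt
```

Note that the `val_acc` field only resolves if the model is compiled with `metrics=['acc']` and trained with validation data, as is done in the training cell.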

Set up the generators that produce batches of training and validation data, then train the model.

In [14]:
epochs = [5]
zoom_ranges = [0.5]
width_shift_ranges = [.5]
height_shift_ranges = [.5]
learning_rates = [.0001]

for epoch in epochs:
  for zr in zoom_ranges:
    for wsr in width_shift_ranges:
      for hsr in height_shift_ranges:
        for lr in learning_rates:

          print('Optimizing with epoch {}, zoom_range {}, width_shift_range {}, and height_shift_range {} and learning rate {}'.format(epoch, zr, wsr, hsr, lr))

          model = create_model()
          print('Extracted pre_trained_model!')
          
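          # Three mutually exclusive classes with one-hot labels call for
          # categorical (not binary) cross-entropy
          model.compile(loss='categorical_crossentropy',
                optimizer=RMSprop(lr),
                metrics=['acc'])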

          train_image_generator = ImageDataGenerator(
                              rescale=1./255,
                              #rotation_range=45,
                              #width_shift_range = wsr,
                              #height_shift_range = hsr,
                              horizontal_flip=True,
                              #zoom_range = zr
                              )
          
          validation_image_generator = ImageDataGenerator(rescale=1./255) # Generator for our validation data

          batch_size = 128
          epochs = epoch
          IMG_HEIGHT = 150
          IMG_WIDTH = 150

          train_data_gen = train_image_generator.flow_from_directory(batch_size=batch_size,
                                                                    directory=train_dir,
                                                                    shuffle=True,
                                                                    target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                                    class_mode='categorical')

          val_data_gen = validation_image_generator.flow_from_directory(batch_size=batch_size,
                                                                        directory=validation_dir,
                                                                        target_size=(IMG_HEIGHT, IMG_WIDTH),
                                                                        class_mode='categorical',
                                                                        shuffle = False)
          
          # Train with the checkpoint callback; Model.fit accepts generators
          # directly, and fit_generator is deprecated
          history = model.fit(
              train_data_gen,
              steps_per_epoch=total_train // batch_size,
              epochs=epochs,
              validation_data=val_data_gen,
              validation_steps=total_val // batch_size,
              callbacks=[cp_callback]
              )

Optimizing with epoch 5, zoom_range 0.5, width_shift_range 0.5, and height_shift_range 0.5 and learning rate 0.0001
--2020-06-04 06:00:29--  https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5
Resolving storage.googleapis.com (storage.googleapis.com)... 172.217.203.128, 2607:f8b0:400c:c13::80
Connecting to storage.googleapis.com (storage.googleapis.com)|172.217.203.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 87910968 (84M) [application/x-hdf]
Saving to: ‘/tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5’


2020-06-04 06:00:30 (93.8 MB/s) - ‘/tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5’ saved [87910968/87910968]

Extracted pre_trained_model!
Found 1571 images belonging to 3 classes.
Found 329 images belonging to 3 classes.
Instructions for updating:
Please use Model.fit, which supports generators.
Epoch 1/5
Epoch 00001: saving model to drive/My Drive/SupportClassification/th
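
The training call derives its step counts by floor division, so any final partial batch is left out of each pass. The sketch below works through that arithmetic, assuming the totals equal the image counts reported by `flow_from_directory` in the output above (1571 training and 329 validation images); it is an illustration, not part of the original notebook.

```python
import math

batch_size = 128
n_train, n_val = 1571, 329  # counts reported by flow_from_directory

steps = {}
for name, n in [("train", n_train), ("validation", n_val)]:
    floor_steps = n // batch_size           # full batches only
    ceil_steps = math.ceil(n / batch_size)  # includes the final partial batch
    skipped = n - floor_steps * batch_size  # images outside the full batches
    steps[name] = (floor_steps, ceil_steps, skipped)
    print(name, floor_steps, ceil_steps, skipped)
```

With these numbers, floor division gives 12 training steps and 2 validation steps, leaving 35 and 73 images respectively outside the full batches. Using `math.ceil` for the step counts is one way to cover every image once per pass.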

### Let's load one of the models that we created and see whether we can interpret it.

In [0]:
loaded_model = create_model()
loaded_model.load_weights("drive/My Drive/SupportClassification/three_supports/SandV3_modelweights/weights.04-0.99.hdf5.ckpt")

--2020-06-04 06:01:54--  https://storage.googleapis.com/mledu-datasets/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5
Resolving storage.googleapis.com (storage.googleapis.com)... 172.253.123.128, 2607:f8b0:400c:c15::80
Connecting to storage.googleapis.com (storage.googleapis.com)|172.253.123.128|:443... connected.
HTTP request sent, awaiting response... 200 OK
Length: 87910968 (84M) [application/x-hdf]
Saving to: ‘/tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5’


2020-06-04 06:01:55 (297 MB/s) - ‘/tmp/inception_v3_weights_tf_dim_ordering_tf_kernels_notop.h5’ saved [87910968/87910968]



### Now that we have a specific model, let's look into interpretability! We start by importing some utility functions, as usual.

In [0]:
# This import can be finicky, so double-check that it runs without problems.
# Sometimes net_visualization... and image_utils.py need to be re-uploaded to the
# folder, because stale __pycache__ entries get in the way.
import sys
sys.path.append('./drive/My Drive/SupportClassification/three_supports')
from image_utils import preprocess_image, deprocess_image
from net_visualization_tensorflow import compute_saliency_maps

In [0]:
# Draw one batch of training images and one-hot labels from the generator
sample_training_images, sample_onehot_training_labels = next(train_data_gen) # (128, 150, 150, 3), (128, 3)
# Convert the one-hot labels to integer class indices
sample_training_labels = np.argmax(sample_onehot_training_labels, axis=1) # (128,)

# This function will plot images in the form of a grid with 1 row and 5 columns where images are placed in each column.
def plotImages(images_arr):
    fig, axes = plt.subplots(1, 5, figsize=(20,20))
    axes = axes.flatten()
    for img, ax in zip( images_arr, axes):
        ax.imshow(img)
        ax.axis('off')
    plt.tight_layout()
    plt.show()
plotImages(sample_training_images[:5])

print(type(sample_training_labels[0]))
sample_training_labels = sample_training_labels.astype(np.int64) 
print(type(sample_training_images[0]))
sample_training_images = sample_training_images[:5, :, :, :] #(5, 150, 150, 3), (5,)
sample_training_labels = sample_training_labels[:5]
print(sample_training_images.shape, sample_training_labels.shape)

In [0]:
class_names = {0 : 'NPs', 1 : 'SiO2NBs', 2 : 'TiO2'}

def show_saliency_maps(X, y, mask):
    mask = np.asarray(mask)
    Xm = X[mask]
    ym = y[mask]
    
    saliency = compute_saliency_maps(Xm, ym, loaded_model)

    for i in range(mask.size):
        plt.subplot(2, mask.size, i + 1)
        plt.imshow(deprocess_image(Xm[i]))
        plt.axis('off')
        plt.title(class_names[ym[i]])
        plt.subplot(2, mask.size, mask.size + i + 1)
        plt.title(mask[i])
        plt.imshow(saliency[i], cmap=plt.cm.hot)
        plt.axis('off')
        plt.gcf().set_size_inches(20, 8)
    plt.show()

mask = np.arange(5)
show_saliency_maps(sample_training_images, sample_training_labels, mask)
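
The source of `compute_saliency_maps` is not shown here, so the following is only a sketch of one common formulation of a saliency map: take the gradient of the correct-class score with respect to the input pixels, then keep the maximum absolute value across the color channels. To stay self-contained it uses a toy linear scorer in NumPy, where the gradient is available in closed form. `toy_saliency_maps`, the weights, and the data are all hypothetical; for the real network the gradient would come from automatic differentiation (for example `tf.GradientTape`).

```python
import numpy as np

def toy_saliency_maps(X, y, W):
    """Saliency for a linear scorer s = W @ x: max over channels of |d s_y / d x|.

    X: (N, H, W_img, 3) images, y: (N,) integer labels,
    W: (num_classes, H * W_img * 3) weights of the toy scorer.
    """
    n, h, w, c = X.shape
    # For a linear model, the gradient of the class-y score with respect to
    # the input is row y of the weight matrix, independent of the image.
    grads = W[y].reshape(n, h, w, c)
    return np.abs(grads).max(axis=-1)  # (N, H, W_img)

rng = np.random.default_rng(0)
X_demo = rng.random((5, 8, 8, 3))          # five made-up 8x8 RGB images
y_demo = np.array([0, 1, 2, 0, 1])         # made-up class indices
W_demo = rng.normal(size=(3, 8 * 8 * 3))   # made-up scorer weights

saliency_demo = toy_saliency_maps(X_demo, y_demo, W_demo)
print(saliency_demo.shape)  # (5, 8, 8)
```

Each map has one value per pixel, which is why `show_saliency_maps` can display it as a single-channel heat map beneath the corresponding image.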