

## Convolutional Neural Networks

---

In your upcoming project, you will download pre-computed bottleneck features.  In this notebook, we'll show you how to calculate VGG-16 bottleneck features on a toy dataset.  Note that unless you have a powerful GPU, computing the bottleneck features takes a significant amount of time.

### 1. Load and Preprocess Sample Images

Before supplying an image to a pre-trained network in Keras, there are some required preprocessing steps.  You will learn more about this in the project; for now, we have implemented this functionality for you in the first code cell of the notebook.  We have imported a very small dataset of 8 images and stored the preprocessed image input as `img_input`.  This array has shape `(8, 224, 224, 3)`: each of the 8 images is a 3D tensor with shape `(224, 224, 3)`.

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Mounted at /content/drive


In [None]:
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing import image
import numpy as np
import glob

In [None]:
# from sklearn.datasets import load_files
# path = '/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/'
# a_temp = load_files(path)
# a_temp

{'DESCR': None,
 'data': [],
 'filenames': array([], dtype=float64),
 'target': array([], dtype=float64),
 'target_names': []}

In [None]:
# img_paths = glob.glob("/content/drive/MyDrive/01 - My Documents/**",recursive=True)
# img_paths
img_paths = glob.glob("/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/*.jpg")
img_paths

['/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/Labrador_retriever_06449.jpg',
 '/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/Curly-coated_retriever_03896.jpg',
 '/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/Brittany_02625.jpg',
 '/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/American_water_spaniel_00648.jpg',
 '/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/Labrador_retriever_06455.jpg',
 '/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/sopa.jpg',
 '/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/Labrador_retriever_06457.jpg',
 '/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/Welsh_springer_spaniel_08203.jpg']

In [None]:
# path = '/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/Labrador_retriever_06449.jpg'

# def a_fun_to_read_image_in_array(img_path):
#   return np.expand_dims(image.img_to_array(image.load_img(img_path, target_size=(224, 224))), axis=0)

# a_fun_to_read_image_in_array(path)



In [None]:
def path_to_tensor(img_path):
    # loads RGB image as PIL.Image.Image type
    img = image.load_img(img_path, target_size=(224, 224))
    # convert PIL.Image.Image type to 3D tensor with shape (224, 224, 3)
    x = image.img_to_array(img)
    # convert 3D tensor to 4D tensor with shape (1, 224, 224, 3) and return 4D tensor
    return np.expand_dims(x, axis=0)

def paths_to_tensor(img_paths):
    list_of_tensors = [path_to_tensor(img_path) for img_path in img_paths]
    return np.vstack(list_of_tensors)


# a_array = paths_to_tensor(img_paths)


# calculate the image input. you will learn more about how this works in the project!
img_input = preprocess_input(paths_to_tensor(img_paths))

print(img_input.shape)

(8, 224, 224, 3)
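
To give a rough idea of what `preprocess_input` does for VGG-16, here is a minimal NumPy sketch. It assumes channels-last RGB input with values in `[0, 255]` and the "caffe"-style convention used with the VGG-16 weights (reorder RGB to BGR, then subtract the ImageNet channel means, with no rescaling). The function name `vgg16_preprocess_sketch` and the gray test batch are hypothetical; in the notebook itself, keep using the Keras function.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by the "caffe"-style
# preprocessing associated with the VGG-16 weights.
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68], dtype=np.float32)

def vgg16_preprocess_sketch(rgb_batch):
    """Approximate keras.applications.vgg16.preprocess_input for a
    channels-last batch of shape (n, height, width, 3) with values in [0, 255]."""
    x = rgb_batch.astype(np.float32)   # work on a float copy
    x = x[..., ::-1]                   # reorder channels RGB -> BGR
    return x - IMAGENET_BGR_MEAN       # zero-center each channel; no scaling

# Hypothetical stand-in for the real image tensor: 2 uniform gray "images".
fake_batch = np.full((2, 224, 224, 3), 128, dtype=np.uint8)
out = vgg16_preprocess_sketch(fake_batch)
print(out.shape)      # (2, 224, 224, 3)
print(out[0, 0, 0])   # per-channel values after mean subtraction
```

The shape is unchanged by preprocessing; only the channel order and the value range differ.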


### 2. Recap How to Import VGG-16

Recall how we import the VGG-16 network (including the final classification layer) that has been pre-trained on ImageNet.

![VGG-16 model](figures/vgg16.png)

In [None]:
from keras.applications.vgg16 import VGG16
model = VGG16()
model.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels.h5
Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_1 (InputLayer)        [(None, 224, 224, 3)]     0         
                                                                 
 block1_conv1 (Conv2D)       (None, 224, 224, 64)      1792      
                                                                 
 block1_conv2 (Conv2D)       (None, 224, 224, 64)      36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, 112, 112, 64)      0         
                                                                 
 block2_conv1 (Conv2D)       (None, 112, 112, 128)     73856     
                                                                 
 block2_conv2 (Conv2D)       (None, 112, 112, 128)     14758
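
The `Param #` column in the summary can be checked by hand. The sketch below assumes what is standard for VGG-16: every convolution uses a 3x3 kernel and one bias per filter. The helper name `conv2d_params` is hypothetical.

```python
def conv2d_params(kernel_size, in_channels, out_channels):
    """Weights plus one bias per filter for a square-kernel Conv2D layer."""
    return (kernel_size * kernel_size * in_channels + 1) * out_channels

print(conv2d_params(3, 3, 64))     # 1792   (block1_conv1)
print(conv2d_params(3, 64, 64))    # 36928  (block1_conv2)
print(conv2d_params(3, 64, 128))   # 73856  (block2_conv1)
```

Pooling layers have no trainable weights, which is why `block1_pool` reports 0 parameters.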

For this network, `model.predict` returns a 1000-dimensional probability vector for each image, containing the predicted probability that the image belongs to each of the 1000 ImageNet categories.  Passing `img_input` through the model therefore yields an array of shape `(8, 1000)`, where the leading `8` merely denotes that 8 images were passed through the network.

In [None]:
x = model.predict(img_input)

In [None]:
x.shape

(8, 1000)
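
One way to read an `(8, 1000)` probability array is to pick the most likely classes per image. The sketch below is only illustrative: `top_k_classes` is a hypothetical helper, and `fake_probs` is random data standing in for the real `x`. Keras also ships `decode_predictions` in `keras.applications.vgg16`, which maps such indices to ImageNet class names.

```python
import numpy as np

def top_k_classes(probs, k=5):
    """Return (indices, probabilities) of the k most likely classes per row."""
    # argsort is ascending, so reverse each row and keep the first k columns
    idx = np.argsort(probs, axis=1)[:, ::-1][:, :k]
    return idx, np.take_along_axis(probs, idx, axis=1)

# Hypothetical stand-in for `x = model.predict(img_input)`:
# random rows normalized with a softmax so that each row sums to 1.
rng = np.random.default_rng(0)
logits = rng.normal(size=(8, 1000))
fake_probs = np.exp(logits) / np.exp(logits).sum(axis=1, keepdims=True)

idx, top = top_k_classes(fake_probs, k=5)
print(idx.shape, top.shape)   # (8, 5) (8, 5)
```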

### 3. Import the VGG-16 Model, with the Final Fully-Connected Layers Removed

When performing transfer learning, we need to remove the final layers of the network, as they are too specific to the ImageNet database.  This is accomplished in the code cell below.

![VGG-16 model for transfer learning](figures/vgg16_transfer.png)

In [None]:
from keras.applications.vgg16 import VGG16
model = VGG16(include_top=False)
model.summary()

Downloading data from https://storage.googleapis.com/tensorflow/keras-applications/vgg16/vgg16_weights_tf_dim_ordering_tf_kernels_notop.h5
Model: "vgg16"
_________________________________________________________________
 Layer (type)                Output Shape              Param #   
 input_2 (InputLayer)        [(None, None, None, 3)]   0         
                                                                 
 block1_conv1 (Conv2D)       (None, None, None, 64)    1792      
                                                                 
 block1_conv2 (Conv2D)       (None, None, None, 64)    36928     
                                                                 
 block1_pool (MaxPooling2D)  (None, None, None, 64)    0         
                                                                 
 block2_conv1 (Conv2D)       (None, None, None, 128)   73856     
                                                                 
 block2_conv2 (Conv2D)       (None, None, None, 128)  

### 4. Extract Output of Final Max Pooling Layer

Now, the network stored in `model` is a truncated version of the VGG-16 network, with the final three fully-connected layers removed.  For each image, `model.predict` now returns a 3D array (with dimensions $7\times 7\times 512$) corresponding to the output of the final max pooling layer of VGG-16.  Passing `img_input` through the model therefore yields an array of shape `(8, 7, 7, 512)`, where the leading `8` merely denotes that 8 images were passed through the network.

In [None]:
bottle_neck_features = model.predict(img_input)

In [None]:
bottle_neck_features.shape

(8, 7, 7, 512)
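
The `7 x 7 x 512` part of this shape can be derived by hand. The sketch below assumes the standard VGG-16 layout: convolutions preserve height and width, each of the five max pooling layers halves them, and the last block has 512 filters. The helper name `vgg16_bottleneck_shape` is hypothetical.

```python
def vgg16_bottleneck_shape(height, width):
    """Per-image output shape after VGG-16's five 2x2, stride-2 max pooling layers."""
    for _ in range(5):
        height //= 2   # each pooling layer halves the height (rounding down)
        width //= 2    # ... and the width
    return (height, width, 512)   # the last conv block has 512 filters

shape = vgg16_bottleneck_shape(224, 224)
print(shape)                            # (7, 7, 512)
print(shape[0] * shape[1] * shape[2])   # 25088 values per image
```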

In [None]:
# reuse the features computed above instead of running the network a second time
np.savez('bottleneck_features_testing.npz', bottle_neck_features)
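
To read saved features back later, note that `np.savez` stores arrays passed positionally under the keys `arr_0`, `arr_1`, and so on. The sketch below uses a temporary directory and a zero-filled array as hypothetical stand-ins, so it does not touch the file written above.

```python
import os
import tempfile
import numpy as np

# Hypothetical stand-in for the real bottleneck features.
features = np.zeros((8, 7, 7, 512), dtype=np.float32)

with tempfile.TemporaryDirectory() as tmp_dir:
    path = os.path.join(tmp_dir, "features_demo.npz")
    np.savez(path, features)           # positional array -> stored as "arr_0"
    with np.load(path) as archive:
        print(archive.files)           # ['arr_0']
        restored = archive["arr_0"]    # read into memory before the file closes

print(restored.shape)   # (8, 7, 7, 512)
```

Passing the array by keyword instead, for example `np.savez(path, features=features)`, stores it under a more descriptive key.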

In [None]:
from sklearn.datasets import load_files   
from keras.utils import np_utils    

def load_dataset(path):
    data = load_files(path)
    # dog_files = np.array(data['filenames'])
    dog_targets = np_utils.to_categorical(np.array(data['target']), 133)
    # return dog_files, dog_targets
    return dog_targets

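# train_targets = load_dataset('/content/drive/MyDrive/dogImages/train')
# NOTE: load_files expects one subfolder per class. This folder holds the .jpg
# files directly, so no samples are found and the result below is empty.
train_targets = load_dataset('/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/')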


# valid_targets = load_dataset('/content/drive/MyDrive/dogImages/valid')
# test_targets = load_dataset('/content/drive/MyDrive/dogImages/test')

In [None]:
train_targets

array([], shape=(0, 133), dtype=float32)

In [None]:
p

['133.Yorkshire_terrier',
 '044.Cane_corso',
 '090.Italian_greyhound',
 '116.Parson_russell_terrier',
 '012.Australian_shepherd',
 '021.Belgian_sheepdog',
 '100.Lowchen',
 '063.English_springer_spaniel',
 '092.Keeshond',
 '052.Clumber_spaniel',
 '095.Kuvasz',
 '056.Dachshund',
 '048.Chihuahua',
 '038.Brussels_griffon',
 '120.Pharaoh_hound',
 '066.Field_spaniel',
 '114.Otterhound',
 '080.Greater_swiss_mountain_dog',
 '050.Chinese_shar-pei',
 '097.Lakeland_terrier',
 '006.American_eskimo_dog',
 '016.Beagle',
 '108.Norwegian_buhund',
 '005.Alaskan_malamute',
 '130.Welsh_springer_spaniel',
 '128.Smooth_fox_terrier',
 '073.German_wirehaired_pointer',
 '055.Curly-coated_retriever',
 '015.Basset_hound',
 '106.Newfoundland',
 '047.Chesapeake_bay_retriever',
 '009.American_water_spaniel',
 '004.Akita',
 '022.Belgian_tervuren',
 '079.Great_pyrenees',
 '030.Border_terrier',
 '014.Basenji',
 '072.German_shorthaired_pointer',
 '033.Bouvier_des_flandres',
 '025.Black_and_tan_coonhound',
 '031.Borzoi

In [None]:
np_utils.to_categorical(np.array(p))

ValueError: ignored
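
The `ValueError` above is most likely raised because `to_categorical` expects integer class indices, not strings such as `'133.Yorkshire_terrier'`. A minimal sketch of one workaround follows: map each name to an integer first, then one-hot encode. The helper `one_hot_from_names` is hypothetical, and the alphabetical class order is an assumption chosen for reproducibility.

```python
import numpy as np

def one_hot_from_names(names):
    """Map string labels to integer indices, then to one-hot rows."""
    classes = sorted(set(names))   # stable, alphabetical class order
    index_of = {name: i for i, name in enumerate(classes)}
    indices = np.array([index_of[name] for name in names])
    one_hot = np.eye(len(classes), dtype=np.float32)[indices]
    return classes, one_hot

# A few folder names taken from the listing above.
names = ['133.Yorkshire_terrier', '044.Cane_corso', '016.Beagle', '044.Cane_corso']
classes, targets = one_hot_from_names(names)
print(classes)   # ['016.Beagle', '044.Cane_corso', '133.Yorkshire_terrier']
print(targets)
```

Each row of `targets` has a single 1 in the column of its class, which is the same layout that `to_categorical` produces for integer labels.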

In [None]:
import os
from keras.utils import np_utils    
p=os.listdir(r'/content/drive/MyDrive/dogImages/train')

# np_utils.to_categorical(np.array(p, 133))

# for i in p:
#     if os.path.isdir(i):
#         print(i)

TypeError: ignored

In [None]:
data = load_files('/content/drive/MyDrive/AML_TL_SAMPLE_IMAGES/images/')
np_utils.to_categorical(np.array(data['target']), 133)

[]

This is exactly how we calculate the bottleneck features for your project!