DSC160 Data Science and the Arts - Twomey - Spring 2020 - [dsc160.roberttwomey.com](http://dsc160.roberttwomey.com)

# Convolutional Neural Networks as Feature Extractors

## Introduction

This notebook shows how to use a pretrained convolutional neural network (VGG16) as a feature extractor for images. It produces a 25,088-dimensional feature vector describing each image, taken from the activations of the feature maps in the deepest convolutional block of the network. Such features are commonly used for clustering or visualizing (via dimensionality reduction) image data.

For background on convolutional neural networks, please see the following links: 

- Introduction to Image Kernels / convolution: http://setosa.io/ev/image-kernels/
- Visualization of MNIST Digit recognition with CNN: https://www.cs.ryerson.ca/~aharley/vis/conv/flat.html
- VGG16
  - Simonyan and Zisserman, 'Very Deep Convolutional Networks for Large-Scale Image Recognition' (2014) [https://arxiv.org/abs/1409.1556](https://arxiv.org/abs/1409.1556)
  - Web introduction to VGG16: https://neurohive.io/en/popular-networks/vgg16
  
We will talk more about convolutional neural networks in the coming weeks.

## Setup

Install necessary packages:

In [4]:
# the `umap` module imported below is provided by the `umap-learn` package
!pip install umap-learn --user


In [5]:
# Keras 2.3.1 was installed in the original run; it needs TensorFlow as its backend
!pip install keras tensorflow --user

Import libraries:

In [8]:
import os
import umap
import numpy as np
import pandas as pd
import seaborn as sns
import matplotlib.pyplot as plt

from keras.applications.vgg16 import VGG16
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.applications.vgg16 import preprocess_input


In [None]:
from keras.models import Sequential
from keras.layers.core import Flatten, Dense, Dropout
from keras.layers.convolutional import Convolution2D, MaxPooling2D, ZeroPadding2D
from keras.optimizers import SGD

In [7]:
from keras import backend as K
K.set_image_data_format('channels_first')


### VGG Architecture

VGG-16 was one of the best-performing architectures in the 2014 [ILSVRC](http://www.image-net.org/challenges/LSVRC/) challenge. It was the runner-up in the classification task, with a top-5 classification error of 7.32% (behind only GoogLeNet, at 6.66%). It also won the localization task, with a 25.32% localization error.

![title](https://neurohive.io/wp-content/uploads/2018/11/vgg16-1-e1542731207177.png)

VGG is a deep convolutional neural network, with stacks of convolutional layers producing higher-level features as you get deeper into the network. The output (fully connected) layers and softmax take those feature maps and predict the image class ('sailboat', 'automobile', etc.). This is the basis for a whole-image classifier.

During training, input images (at left) and class labels (at right, 1000 classes) are used to learn the kernel weights for each of the convolutional layers above. Once the network is trained, we can also take the output of the last pooling layer (7 x 7 x 512 above) as a feature vector describing each image in this high-dimensional feature space.
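
To see where the 7 x 7 x 512 shape comes from, the following sketch traces the feature-map size through the five convolutional blocks. It assumes a 224 x 224 input, 3 x 3 convolutions with "same" padding (so only the pooling layers change height and width), and the block widths from the VGG16 paper; the variable names are illustrative only.

```python
# Trace the feature-map shape through VGG16's five convolutional blocks.
# Assumptions: 224 x 224 input, "same"-padded 3 x 3 convolutions that keep
# height and width, and a 2 x 2 max pool (stride 2) closing each block.
block_filters = [64, 128, 256, 512, 512]  # output channels of each block

size = 224
shapes = []
for filters in block_filters:
    size = size // 2  # each max pool halves height and width
    shapes.append((size, size, filters))

for i, shape in enumerate(shapes, start=1):
    print(f"block {i}: {shape}")

# Number of values in the final block's output
final_h, final_w, final_c = shapes[-1]
n_features = final_h * final_w * final_c
print(n_features)
```

The last block's output is what this notebook uses as the image descriptor.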

### Use of kernels and convolution

![2DConvUrl](https://upload.wikimedia.org/wikipedia/commons/1/19/2D_Convolution_Animation.gif "2D Conv")

An image kernel is a small matrix used to apply effects like the ones you might find in Photoshop or GIMP, such as blurring, sharpening, outlining, or embossing. Kernels are also used in machine learning for 'feature extraction', a technique for determining the most important portions of an image. In this context the process is referred to more generally as "convolution".
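
As a rough illustration of the sliding-window idea, here is a minimal NumPy sketch that applies two hand-picked kernels to a toy image. The function name `apply_kernel`, the toy image, and the kernels are assumptions made for this example. Like the layers in most deep learning libraries, it does not flip the kernel, so strictly speaking it computes a cross-correlation.

```python
import numpy as np

def apply_kernel(image, kernel):
    """Slide `kernel` over `image` (no padding, stride 1), summing the products.

    The kernel is not flipped, so this is a cross-correlation, which is what
    CNN layers compute; a strict convolution would flip the kernel first.
    """
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

# Toy 5 x 5 image: dark on the left, bright on the right (a vertical edge)
toy_image = np.array([[0, 0, 1, 1, 1]] * 5, dtype=float)

# Illustrative kernels: a box blur and a horizontal-gradient (Sobel) filter
box_blur = np.ones((3, 3)) / 9.0
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=float)

blurred = apply_kernel(toy_image, box_blur)
edges = apply_kernel(toy_image, sobel_x)
print(blurred)
print(edges)
```

The blur output ramps smoothly across the edge, while the Sobel output is large where the edge sits and zero in the flat bright region. A convolutional layer learns kernels like these from data instead of having them chosen by hand.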

The following cell loads the pretrained VGG16 (trained on ImageNet), but leaves off the "top" of the network (the fully connected layers and softmax). We do not want to do classification here; we are only interested in the feature maps, so that we can use the network as a feature extractor.

In [None]:
model = VGG16(weights='imagenet', include_top=False)
model.summary() # shows the various layers, etc.

## Load test images

Download two Mondrian paintings as test images and save to the current directory. We will try an abstract image and a landscape image.

In [None]:
!wget -O landscape.jpg https://images.rkd.nl/rkd/thumb/650x650/bcb9558d-08a1-a57f-b5fc-ec562c446838.jpg
!wget -O abstract.jpg https://images.rkd.nl/rkd/thumb/650x650/56c1a7ff-4661-12ea-e5bc-0f8be29c977a.jpg

Now let's use the `keras.preprocessing.image` function `load_img` to read in the file, and display it with matplotlib. (Try with both the landscape and the abstract image.)

In [None]:
image = load_img("landscape.jpg", target_size=(224, 224))
# image = load_img("abstract.jpg", target_size=(224, 224))

plt.imshow(image)
plt.show()

In [None]:
image.size

Convert the image to a numpy array using the `keras.preprocessing.image` function `img_to_array` imported above:

In [None]:
image = img_to_array(image)
image.shape

Keras expects an array of images when evaluating/predicting with the convolutional neural network. Let's reshape our single image into an array of images with only 1 member:

In [None]:
image = image.reshape((1, image.shape[0], image.shape[1], image.shape[2]))
image.shape

Notice this added an additional dimension to the front of the array. 

If you were extending this activity to calculate features for a whole set of images (say the Mondrian paintings, or Rothko, or your images from exercise 1), you would read in each of those images and store all `n` of them in an `(n, 3, 224, 224)` array.
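
A minimal sketch of that batching step, using NumPy only. Random arrays stand in for real images already converted with `img_to_array`, and the channels-first `(3, 224, 224)` layout is assumed to match the setting used in this notebook.

```python
import numpy as np

# Stand-ins for images already loaded and converted with img_to_array.
# Assumption: channels-first layout (3, 224, 224); random values replace
# real pixel data here.
rng = np.random.default_rng(0)
images = [rng.uniform(0, 255, size=(3, 224, 224)) for _ in range(4)]

# A single image becomes a batch of one by adding a leading axis
single_batch = np.expand_dims(images[0], axis=0)

# Several images become one batch by stacking along a new leading axis
batch = np.stack(images, axis=0)

print(single_batch.shape)
print(batch.shape)
```

`np.expand_dims` is equivalent to the explicit `reshape` used earlier for one image; `np.stack` requires every image to have the same shape, which `target_size=(224, 224)` ensures when loading.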

## Extract Feature Vectors with VGG16

In this step we will prepare an image to work with VGG16, using the `keras.applications.vgg16` `preprocess_input` method.

In [None]:
image = preprocess_input(image)
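
For intuition, the sketch below approximates what this preprocessing does for VGG16, to the best of my understanding of the Keras implementation: it reorders the channels from RGB to BGR and subtracts the ImageNet per-channel means, without rescaling. The function name `vgg_preprocess_sketch` is hypothetical, and a channels-first batch is assumed; use the real `preprocess_input` in practice.

```python
import numpy as np

# ImageNet per-channel means in BGR order, as used by "caffe"-style preprocessing
IMAGENET_BGR_MEAN = np.array([103.939, 116.779, 123.68])

def vgg_preprocess_sketch(batch_rgb):
    """Approximate VGG16 preprocessing for a channels-first batch (n, 3, h, w).

    Hypothetical helper for illustration only, not the Keras function itself.
    """
    batch_bgr = batch_rgb[:, ::-1, :, :].astype(float)  # RGB -> BGR
    return batch_bgr - IMAGENET_BGR_MEAN.reshape(1, 3, 1, 1)  # zero-center

# Tiny 2 x 2 demo "image": R = 255, G = 128, B = 128 everywhere
demo = np.full((1, 3, 2, 2), 128.0)
demo[:, 0] = 255.0

out = vgg_preprocess_sketch(demo)
print(out[0, :, 0, 0])  # B, G, R values after mean subtraction
```

This is why the preprocessed array contains negative values and can no longer be displayed directly as an image.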

Evaluate the network in the "forward" direction. This means we provide the preprocessed image as an input and calculate the feature maps and activations for every layer in our network:

In [None]:
vgg16_feature = model.predict(image)
vgg16_feature.shape

Notice that our output (`vgg16_feature`) is an array of 512 7 x 7 maps. If we had passed in multiple test images to predict, we would see a result of shape `(n, 512, 7, 7)`, where each of `n` inputs has 512 7 x 7 feature maps.

Let's grab the feature vectors as a numpy array:

In [None]:
vgg16_feature_np = np.array(vgg16_feature)
vgg16_feature_np.shape

## Display the Extracted Features as Feature Maps

We can display our extracted features as a grid of feature maps (using the raw `vgg16_feature` directly). Here we will plot a subset of the 512 feature maps (only 64 of them) in an 8 x 8 grid:

In [None]:
plt.figure(figsize=(10, 10))
# suptitle labels the whole figure; plt.title would attach to a single axes
plt.suptitle('First 64 Feature Maps (7 x 7) for Image', fontsize=16)
# plot 64 of the maps on an 8x8 grid (NOTE: we have 512 total)
xcount = 8
ycount = 8
ix = 1
for _ in range(xcount):
    for _ in range(ycount):
        # specify subplot and turn off axis ticks
        ax = plt.subplot(xcount, ycount, ix)
        ax.set_xticks([])
        ax.set_yticks([])
        # plot one feature map (channel) in grayscale
        plt.imshow(vgg16_feature[0, ix-1, :, :], cmap='gray')
        ix += 1
# show the figure
plt.show()

NOTE: we are only displaying a subset (64) of the full 512 feature maps.

### Display the Extracted Features as a Linear Vector

We can flatten our feature maps into one linear vector:

In [None]:
vgg16_feature_vector = vgg16_feature_np.flatten()
vgg16_feature_vector.shape

Notice that our 512 7 x 7 feature maps give us a 25088-dimensional (7 x 7 x 512 = 25088) feature vector.
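
With more than one image, calling `flatten()` on the whole output would merge every image into a single vector. A small sketch of keeping one row per image instead, assuming a channels-first output of shape `(n, 512, 7, 7)`; a zero array stands in for real `model.predict` results.

```python
import numpy as np

# Placeholder for model.predict output on a batch of 3 images.
# Assumption: channels-first features of shape (n, 512, 7, 7); zeros stand
# in for real activations.
features = np.zeros((3, 512, 7, 7), dtype=np.float32)

# Keep the batch axis and flatten everything else: one row per image
feature_matrix = features.reshape(len(features), -1)
print(feature_matrix.shape)
```

Each row of `feature_matrix` is then the 25088-dimensional descriptor of one image, which is the layout most clustering and dimensionality-reduction tools expect.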

In [None]:
vgg16_feature_vector

This vector is easiest to display as a small grid.

We can reshape our 1 x 25088 feature vector into a more manageable 64 x 392 array and display it as an image. Bright spots correspond to higher "activations" on the output feature map.

In [None]:
plt.figure(figsize=(10, 4), dpi=150)
plt.imshow(vgg16_feature_vector.reshape((64, 392)), cmap='gray')
plt.show()

The brightness at each position shows how strongly the corresponding high-order feature, which the network learned to detect during training, is activated by this image.

## Extensions
- Assemble a set of images (for instance Mondrian paintings, or your images from exercise 1)
- Calculate and store the VGG16 feature maps for each image
- Clustering: use a clustering algorithm (k-means, affinity propagation, any others you know) to group the paintings according to their feature vectors. What groups do you see? Do they make sense?
- Displaying: plot the results using UMAP, PCA, or t-SNE to see how our images are grouped in this high-dimensional feature space.
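
As a starting point for the clustering and display steps, here is a NumPy-only sketch on synthetic data. It assumes a feature matrix with one 25088-dimensional row per image; the two artificial groups, the PCA-via-SVD projection, and the bare-bones two-cluster k-means loop are all illustrative choices, not the course's reference solution. In practice you would more likely reach for library implementations of k-means, PCA, or UMAP.

```python
import numpy as np

rng = np.random.default_rng(0)

# Synthetic stand-in for real VGG16 features: 20 "images" x 25088 features,
# drawn from two groups whose means differ. Real rows would come from the model.
group_a = rng.normal(0.0, 1.0, size=(10, 25088))
group_b = rng.normal(1.0, 1.0, size=(10, 25088))
feature_matrix = np.vstack([group_a, group_b])

# PCA via SVD: center the data, then project onto the top two components
centered = feature_matrix - feature_matrix.mean(axis=0)
_, _, vt = np.linalg.svd(centered, full_matrices=False)
embedding = centered @ vt[:2].T  # shape (20, 2), ready for a scatter plot

# Bare-bones k-means (k = 2) on the 2-D embedding.
# Initial centers are simply the first and last points (an arbitrary choice).
centers = embedding[[0, -1]].copy()
for _ in range(10):
    # assign each point to its nearest center
    dists = np.linalg.norm(embedding[:, None, :] - centers[None, :, :], axis=2)
    labels = dists.argmin(axis=1)
    # move each center to the mean of its assigned points
    centers = np.array([embedding[labels == k].mean(axis=0) for k in range(2)])

print(embedding.shape)
print(labels)
```

Plotting `embedding[:, 0]` against `embedding[:, 1]`, colored by `labels`, gives a first look at how the images group; with real paintings the structure will be far less clean than in this synthetic case.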

## Reference
- Code for visualizing all layers: https://machinelearningmastery.com/how-to-visualize-filters-and-feature-maps-in-convolutional-neural-networks