# CNN Filter & Feature Map Visualization 

## Visualizing the activations and first-layer weights

**Layer Activations.** The most straightforward visualization technique is to show the activations of the network during the forward pass. For ReLU networks, the activations usually start out looking relatively blobby and dense, but as training progresses they usually become sparser and more localized. One dangerous pitfall that is easy to spot with this visualization is that some activation maps may be all zero for many different inputs, which can indicate dead filters and can be a symptom of high learning rates.
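The dead-filter check described above can also be done numerically instead of by eye. The following is a minimal sketch, assuming the post-ReLU activations of one layer have already been collected into a NumPy array of shape `(n_inputs, height, width, n_filters)`. The helper name `dead_filter_fraction` and the synthetic data are illustrative only and are not part of the notebook.

```python
import numpy as np

def dead_filter_fraction(activations, tol=0.0):
    """Return (fraction_dead, dead_mask) for post-ReLU activations.

    activations: array of shape (n_inputs, height, width, n_filters).
    A filter counts as dead when its activation map is <= tol at every
    spatial position for every input.
    """
    activations = np.asarray(activations)
    # Largest activation of each filter over all inputs and positions.
    peak = activations.max(axis=(0, 1, 2))
    dead_mask = peak <= tol
    return dead_mask.mean(), dead_mask

# Synthetic stand-in for real activations: 8 inputs, 5x5 maps, 4 filters.
rng = np.random.default_rng(0)
acts = np.maximum(rng.normal(size=(8, 5, 5, 4)), 0.0)  # ReLU of random values
acts[..., 2] = 0.0  # force filter 2 to be dead for every input

fraction, mask = dead_filter_fraction(acts)
print(fraction, mask)
```

With real data, `acts` would come from running a batch of varied inputs through the network. A filter that is flagged on a single image is not necessarily dead, so the batch should be reasonably diverse.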

Typical-looking activations on the first CONV layer (left), and the 5th CONV layer (right) of a trained AlexNet looking at a picture of a cat. Every box shows an activation map corresponding to some filter. Notice that the activations are sparse (most values are zero, in this visualization shown in black) and mostly local.


**Conv/FC Filters.** The second common strategy is to visualize the weights. These are usually most interpretable on the first CONV layer which is looking directly at the raw pixel data, but it is possible to also show the filter weights deeper in the network. The weights are useful to visualize because well-trained networks usually display nice and smooth filters without any noisy patterns. Noisy patterns can be an indicator of a network that hasn’t been trained for long enough, or possibly a very low regularization strength that may have led to overfitting.

Typical-looking filters on the first CONV layer (left), and the 2nd CONV layer (right) of a trained AlexNet. Notice that the first-layer weights are very nice and smooth, indicating a nicely converged network. The color/grayscale features are clustered because the AlexNet contains two separate streams of processing, and an apparent consequence of this architecture is that one stream develops high-frequency grayscale features and the other low-frequency color features. The 2nd CONV layer weights are not as interpretable, but it is apparent that they are still smooth, well-formed, and free of noisy patterns.
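A filter grid like the one described in this caption can be assembled with a few lines of NumPy. This is a sketch under the assumption that the kernel uses the Keras layout `(height, width, input_channels, n_filters)`. The helper `tile_filters` and the random stand-in kernel are hypothetical and only illustrate the layout logic.

```python
import numpy as np

def tile_filters(filters, n_cols=8, pad=1):
    """Arrange filters of shape (h, w, channels, n) into one image grid.

    Values are min-max scaled to [0, 1] over the whole bank, so contrast
    is comparable across filters. Padding pixels are set to 1 (white).
    """
    filters = np.asarray(filters, dtype=np.float64)
    lo, hi = filters.min(), filters.max()
    scaled = (filters - lo) / (hi - lo) if hi > lo else np.zeros_like(filters)
    h, w, c, n = scaled.shape
    n_rows = -(-n // n_cols)  # ceiling division
    grid = np.ones((n_rows * (h + pad) + pad, n_cols * (w + pad) + pad, c))
    for i in range(n):
        row, col = divmod(i, n_cols)
        top = pad + row * (h + pad)
        left = pad + col * (w + pad)
        grid[top:top + h, left:left + w, :] = scaled[:, :, :, i]
    return grid

rng = np.random.default_rng(0)
bank = rng.normal(size=(3, 3, 3, 64))  # stand-in for a first-layer kernel
grid = tile_filters(bank)
print(grid.shape)  # (33, 33, 3)
```

For a first-layer kernel with three input channels, the result can be passed directly to `pyplot.imshow(grid)` and is rendered as RGB. Deeper layers have many input channels, so they need one grid per channel or some other reduction.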






In [14]:
#!pip install tf-nightly

# USELESS !!!!



In [None]:
# MODULES
from matplotlib import pyplot
from tensorflow.keras.applications.vgg16 import VGG16
from tensorflow.keras.applications.vgg19 import VGG19
from tensorflow.keras.applications.xception import Xception
from tensorflow.keras.applications.densenet import DenseNet169


# LOAD MODELS
model1 = VGG16()
model2 = VGG19()
model3 = Xception()
model4 = DenseNet169()

# SUMMARIZE MODELS
#model1.summary()
#model2.summary()
#model3.summary()
model4.summary()


In [None]:
# summarize filter shapes of the convolutional layers
from tensorflow.keras.layers import Conv2D

for net in (model1, model2, model3, model4):
    print(net.name)
    for layer in net.layers:
        # match on layer type rather than name: DenseNet has batch-norm,
        # ReLU and concat layers whose names contain 'conv', and Xception's
        # auto-generated 'conv2d_N' names change between sessions
        if not isinstance(layer, Conv2D):
            continue
        # the kernel is the first weight, shaped
        # (height, width, input_channels, n_filters)
        print(' ', layer.name, layer.get_weights()[0].shape)

In [None]:
# collect the filters of the last Conv2D layer in each model
from tensorflow.keras.layers import Conv2D

def last_conv_kernel(net):
    """Return the kernel array of the last Conv2D layer in `net`."""
    kernel = None
    for layer in net.layers:
        # match on layer type rather than name: DenseNet has batch-norm,
        # ReLU and concat layers whose names contain 'conv', and those
        # layers return no kernel from get_weights()
        if isinstance(layer, Conv2D):
            # the kernel is always the first weight; DenseNet and Xception
            # conv layers have no bias, so do not unpack two values
            kernel = layer.get_weights()[0]
    return kernel

model1_filters = last_conv_kernel(model1)
model2_filters = last_conv_kernel(model2)
model3_filters = last_conv_kernel(model3)
model4_filters = last_conv_kernel(model4)

In [136]:
# normalize filter values to 0-1 so we can visualize them
def normalize(filters):
    f_min, f_max = filters.min(), filters.max()
    return (filters - f_min) / (f_max - f_min)

model1_filters = normalize(model1_filters)
model2_filters = normalize(model2_filters)
model3_filters = normalize(model3_filters)
model4_filters = normalize(model4_filters)

print(model1_filters.shape, "||", model2_filters.shape, "||", model3_filters.shape, "||", model4_filters.shape)

In [140]:
fig = pyplot.figure(figsize=(16, 16))
# plot first few filters
n_filters, n_channels, ix = 6, 3, 1
for i in range(n_filters):
    # get the i-th filter, shaped (height, width, input_channels)
    filt = model4_filters[:, :, :, i]
    # plot the first three input channels separately
    for j in range(n_channels):
        # specify subplot and turn off axis ticks
        ax = pyplot.subplot(n_filters, n_channels, ix)
        ax.set_xticks([])
        ax.set_yticks([])
        # plot filter channel in grayscale
        pyplot.imshow(filt[:, :, j], cmap='gray')
        ix += 1
# show the figure
pyplot.show()

Tying this together, the complete example of plotting the first six filters from the first convolutional layer in the VGG16 model is listed below.

In [84]:
from matplotlib import pyplot
from tensorflow.keras.applications.vgg16 import VGG16

# load the model
model = VGG16()
# retrieve weights from the first convolutional layer
# (layers[0] is the InputLayer, which has no weights to unpack)
filters, biases = model.layers[1].get_weights()
# normalize filter values to 0-1 so we can visualize them
f_min, f_max = filters.min(), filters.max()
filters = (filters - f_min) / (f_max - f_min)
# plot first few filters
fig = pyplot.figure(figsize=(16, 16))
n_filters, ix = 6, 1
for i in range(n_filters):
    # get the i-th filter, shaped (height, width, input_channels)
    filt = filters[:, :, :, i]
    # plot each of the three input channels separately
    for j in range(3):
        # specify subplot and turn off axis ticks
        ax = pyplot.subplot(n_filters, 3, ix)
        ax.set_xticks([])
        ax.set_yticks([])
        # plot filter channel in grayscale
        pyplot.imshow(filt[:, :, j], cmap='gray')
        ix += 1
# show the figure
pyplot.show()

In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
import skimage.io
from skimage.transform import resize
from imgaug import augmenters as iaa
from tqdm import tqdm
import PIL
from PIL import Image, ImageOps
import cv2
from sklearn.utils import class_weight, shuffle
from keras.losses import binary_crossentropy, categorical_crossentropy
#from keras.applications.resnet50 import preprocess_input
from keras.applications.densenet import DenseNet121,DenseNet169
import keras.backend as K
import tensorflow as tf
from sklearn.metrics import f1_score, fbeta_score, cohen_kappa_score
from keras.utils import Sequence
from keras.utils import to_categorical
from sklearn.model_selection import train_test_split
import imgaug as ia

WORKERS = 2
CHANNEL = 3

import warnings
warnings.filterwarnings("ignore")
SIZE = 300
NUM_CLASSES = 5

In [None]:
df_train = pd.read_csv('../input/train.csv')
df_test = pd.read_csv('../input/test.csv')

In [None]:
def display_samples(df, columns=4, rows=3):
    """Show the first columns*rows training images, titled with their diagnosis."""
    fig = plt.figure(figsize=(5*columns, 4*rows))

    for i in range(columns*rows):
        id_code = df.loc[i, 'id_code']
        diagnosis = df.loc[i, 'diagnosis']
        img = cv2.imread(f'../input/train_images/{id_code}.png')
        # OpenCV loads images as BGR; matplotlib expects RGB
        img = cv2.cvtColor(img, cv2.COLOR_BGR2RGB)

        fig.add_subplot(rows, columns, i+1)
        plt.title(diagnosis)
        plt.imshow(img)

    plt.tight_layout()

display_samples(df_train)

In [None]:
x = df_train['id_code']
y = df_train['diagnosis']

x, y = shuffle(x, y, random_state=8)
y.hist()

In [None]:
y = to_categorical(y, num_classes=NUM_CLASSES)
train_x, valid_x, train_y, valid_y = train_test_split(x, y, test_size=0.15,
                                                      stratify=y, random_state=8)
print(train_x.shape)
print(train_y.shape)
print(valid_x.shape)
print(valid_y.shape)

In [None]:
# plot feature map of first conv layer for given image
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.models import Model
from matplotlib import pyplot
from numpy import expand_dims

# load the model
model = VGG16()
# redefine model to output right after the first convolutional layer
# (layers[0] is the InputLayer, layers[1] is the first conv layer)
model = Model(inputs=model.inputs, outputs=model.layers[1].output)
model.summary()
# load the image with the required shape
img = load_img('../input/test_images/270a532df702.png', target_size=(224, 224))
# convert the image to an array
img = img_to_array(img)
# expand dimensions so that it represents a single 'sample'
img = expand_dims(img, axis=0)
# prepare the image (e.g. scale pixel values for the vgg)
img = preprocess_input(img)
# get feature maps for the first convolutional layer
feature_maps = model.predict(img)
# plot all 64 maps in an 8x8 grid
fig = pyplot.figure(figsize=(16, 16))
square = 8
ix = 1
for _ in range(square):
    for _ in range(square):
        # specify subplot and turn off axis ticks
        ax = pyplot.subplot(square, square, ix)
        ax.set_xticks([])
        ax.set_yticks([])
        # plot one feature map with the viridis colormap
        pyplot.imshow(feature_maps[0, :, :, ix-1], cmap='viridis')
        ix += 1
# show the figure
pyplot.show()


In [None]:
# visualize feature maps output from each block in the vgg model
from keras.applications.vgg16 import VGG16
from keras.applications.vgg16 import preprocess_input
from keras.preprocessing.image import load_img
from keras.preprocessing.image import img_to_array
from keras.models import Model
import matplotlib.pyplot as plt
from numpy import expand_dims

# load the model
model = VGG16()
# redefine model to output the last convolutional layer of each of the five blocks
ixs = [2, 5, 9, 13, 17]
outputs = [model.layers[i].output for i in ixs]
model = Model(inputs=model.inputs, outputs=outputs)
# load the image with the required shape
img = load_img('../input/test_images/270a532df702.png', target_size=(224, 224))
# convert the image to an array
img = img_to_array(img)
# expand dimensions so that it represents a single 'sample'
img = expand_dims(img, axis=0)
# prepare the image (e.g. scale pixel values for the vgg)
img = preprocess_input(img)
# get one feature-map tensor per selected layer
feature_maps = model.predict(img)
# plot the output from each block
square = 8
for fmap in feature_maps:
    # one figure per block, showing the first 64 maps in an 8x8 grid
    # (created once per block, not once per row)
    plt.figure(figsize=(16, 16))
    ix = 1
    for _ in range(square):
        for _ in range(square):
            # specify subplot and turn off axis ticks
            ax = plt.subplot(square, square, ix)
            ax.set_xticks([])
            ax.set_yticks([])
            # plot one feature map with the viridis colormap
            plt.imshow(fmap[0, :, :, ix-1], cmap='viridis')
            ix += 1
    # show the figure for this block
    plt.show()

Running the example results in five plots showing the feature maps from the five main blocks of the VGG16 model.

We can see that the feature maps closer to the input of the model capture a lot of fine detail in the image and that as we progress deeper into the model, the feature maps show less and less detail.

This pattern is to be expected, as the model abstracts the features from the image into more general concepts that can be used to make a classification. By the final block it is no longer clear from the maps what the input image showed; we generally lose the ability to interpret these deeper feature maps.
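Part of the loss of fine detail is simple arithmetic: each pooling stage halves the spatial resolution of the maps. The sketch below works through the shapes at the five selected layers, assuming the standard VGG16 configuration with a 224x224 input, size-preserving 3x3 convolutions, and a 2x2 max-pool with stride 2 at the end of every block. The variable names are illustrative.

```python
# Spatial size and channel count at the last conv layer of each VGG16 block,
# for a 224x224 input. The 3x3 convolutions use "same" padding and keep the
# size; the 2x2 max-pool that ends each block halves it.
input_size = 224
block_channels = [64, 128, 256, 512, 512]

shapes = []
size = input_size
for channels in block_channels:
    shapes.append((size, size, channels))
    size //= 2  # pooling at the end of the block

for block, shape in enumerate(shapes, start=1):
    print(f"block {block}: {shape}")
```

The last block therefore produces 14x14 maps, each pixel of which summarizes a large region of the input. The plots above also show only the first 64 channels of each block, which is all of block 1 but only an eighth of blocks 4 and 5.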