<a href="https://colab.research.google.com/github/HSE-LAMBDA/MLDM-2021/blob/master/09-convolutions-and-regularization/MLDM_2021_seminar09_Intro_to_CNN.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Convolutions

In [None]:
#!wget https://raw.githubusercontent.com/HSE-LAMBDA/MLDM-2021/main/09-convolutions-and-regularization/img.npy

In [None]:
import numpy as np
import tensorflow as tf
import matplotlib.pyplot as plt

## Demonstration: convolving to extract features

Let's check out the image we have:

In [None]:
img = np.load("img.npy")

plt.figure(dpi=150)
plt.imshow(img);

First, we'll experiment with `tf.nn.conv2d`, the function that performs 2D image convolution.

*Note:* this function is designed to work in the context of a neural network (i.e. where input and output come in batches and have multiple channels), so the function expects 4D tensors rather than 2D. We'll write a short wrapper to work with 2D images.

In [None]:
def convolve(img, kernel):
  return tf.nn.conv2d(
      img[None,...,None],
      kernel[...,None,None], strides=1, padding='VALID'
    ).numpy().squeeze()
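
To make the wrapper less of a black box, here is a minimal NumPy sketch of what it should compute for a single-channel image with `padding='VALID'` and `strides=1`. The helper name `correlate2d_valid` and the toy arrays are made up for illustration. Note that TensorFlow's "convolution" does not flip the kernel, so strictly speaking this is a cross-correlation.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def correlate2d_valid(img, kernel):
    """Slide `kernel` over `img` without padding and sum the elementwise products."""
    # windows[i, j] is the patch of img whose top-left corner sits at (i, j)
    windows = sliding_window_view(img, kernel.shape)
    return np.einsum("ijkl,kl->ij", windows, kernel)

img_demo = np.arange(12, dtype=float).reshape(3, 4)
kernel_demo = np.array([[1., 0.],
                        [0., -1.]])

out = correlate2d_valid(img_demo, kernel_demo)
print(out.shape)  # (2, 3): each side shrinks by kernel_size - 1
print(out)        # every entry is img[i, j] - img[i+1, j+1] = -5
```

With no padding, an image of size `H x W` and a kernel of size `kh x kw` give an output of size `(H - kh + 1) x (W - kw + 1)`.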

Let's try some simple kernels extracting horizontal and vertical edges:

In [None]:
kernel_ver_edge = tf.convert_to_tensor(
    [[ 1., -1.],
     [ 1., -1.]]
)
kernel_hor_edge = tf.convert_to_tensor(
    [[ 1.,  1.],
     [-1., -1.]]
)

vertical_edges = convolve(img, kernel_ver_edge)
horizontal_edges = convolve(img, kernel_hor_edge)

plt.figure(figsize=(4, 5), dpi=150)
plt.subplot(2, 1, 1)
plt.imshow(vertical_edges);
plt.colorbar()
plt.subplot(2, 1, 2)
plt.imshow(horizontal_edges);
plt.colorbar();
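
To see why these kernels respond to edges, it may help to try them on a toy image with a single vertical edge. The sketch below writes out the two 2x2 kernels as explicit sums and differences of neighbouring pixels in plain NumPy (the array `toy` is invented for this example).

```python
import numpy as np

# Toy 4x6 image: dark left half, bright right half (one vertical edge)
toy = np.zeros((4, 6))
toy[:, 3:] = 1.0

# Neighbouring pixels of each 2x2 window, addressed by slicing
top_left, top_right = toy[:-1, :-1], toy[:-1, 1:]
bottom_left, bottom_right = toy[1:, :-1], toy[1:, 1:]

# [[1, -1], [1, -1]]: left column minus right column
ver = top_left + bottom_left - top_right - bottom_right
# [[1, 1], [-1, -1]]: top row minus bottom row
hor = top_left + top_right - bottom_left - bottom_right

print(ver)  # non-zero only in the column where brightness jumps
print(hor)  # all zeros: there is no horizontal edge in this image
```

The sign of the response tells the direction of the brightness change: here the image gets brighter from left to right, so the vertical-edge response is negative.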

We can combine the results, e.g. like this:

In [None]:
edges = (vertical_edges**2 + horizontal_edges**2)**0.5
plt.figure(dpi=150)
plt.imshow(edges);

Another example: a blurring kernel.

In [None]:
kernel_blur = tf.convert_to_tensor([[1.,  4.,  7.,  4., 1.],
                                    [4., 16., 26., 16., 4.],
                                    [7., 26., 41., 26., 7.],
                                    [4., 16., 26., 16., 4.],
                                    [1.,  4.,  7.,  4., 1.]]) / 273

edges_blurred = convolve(edges, kernel_blur)

### Uncomment these lines one by one to see the blur
### gradually increase:
# edges_blurred = convolve(edges_blurred, kernel_blur)
# edges_blurred = convolve(edges_blurred, kernel_blur)
# edges_blurred = convolve(edges_blurred, kernel_blur)
# edges_blurred = convolve(edges_blurred, kernel_blur)
### Afterwards, leave all four **uncommented**: the cells below
### assume five blur passes in total

plt.imshow(edges_blurred);
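
The division by 273 normalizes the kernel: its weights sum to one, so blurring averages neighbouring pixels without changing the overall brightness. A quick NumPy check on a constant image (a sketch; `flat` and `kernel_blur_np` are names made up here):

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

kernel_blur_np = np.array([[1.,  4.,  7.,  4., 1.],
                           [4., 16., 26., 16., 4.],
                           [7., 26., 41., 26., 7.],
                           [4., 16., 26., 16., 4.],
                           [1.,  4.,  7.,  4., 1.]]) / 273

print(kernel_blur_np.sum())  # approximately 1

# Blur a constant 9x9 image without padding
flat = np.full((9, 9), 3.0)
windows = sliding_window_view(flat, kernel_blur_np.shape)
blurred = np.einsum("ijkl,kl->ij", windows, kernel_blur_np)

print(blurred.shape)               # (5, 5): a 5x5 kernel trims 4 pixels per side length
print(np.allclose(blurred, 3.0))   # True: a flat image stays flat
```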

Let's pick a small patch out of this image:

In [None]:
edges_subset = edges_blurred[210:243, 246:282]
plt.imshow(edges_subset);

What do you think will happen if we use this patch as a kernel when running convolution on the edges image?

In [None]:
plt.figure(dpi=150)
plt.imshow(convolve(edges_blurred, edges_subset))
plt.colorbar();

Note how this kernel highlighted the location of that shape on the input!
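
The same effect can be reproduced on a tiny synthetic example. The sketch below hides a small all-positive pattern in an otherwise empty image and uses the pattern itself as the kernel; all arrays are invented for illustration.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

pattern = np.arange(1., 10.).reshape(3, 3)   # 3x3 pattern with values 1..9
scene = np.zeros((8, 8))
scene[2:5, 4:7] = pattern                    # place the pattern at row 2, column 4

# Slide the pattern over the scene (no padding, no kernel flip)
windows = sliding_window_view(scene, pattern.shape)
response = np.einsum("ijkl,kl->ij", windows, pattern)

peak = tuple(int(i) for i in np.unravel_index(np.argmax(response), response.shape))
print(peak)            # (2, 4): top-left corner of the hidden pattern
print(response.max())  # 285.0, the sum of squares of the pattern values
```

On an empty background the response is largest where the window coincides with the pattern. On real images, large bright regions can also produce strong responses, which is why template matching in practice often normalizes each window first.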

## Convolutional layer

Keras has predefined convolutional layers that make use of the convolution function described above.

Note that in the context of deep learning the convolutional kernel is **trainable**, i.e. the network tries to find the best kernel to extract useful features.

In [None]:
# Let's build a layer that takes an image with a single channel and outputs 
# two-channel feature representation:
conv_layer = tf.keras.layers.Conv2D(
    filters=2, kernel_size=2)
conv_layer.build(input_shape=(None, None, None, 1))  # (batch, height, width, channels)

Note that the kernel is initialized randomly, as a starting point for optimization:

In [None]:
conv_layer.kernel

but we can set it to, e.g., our edge-detecting kernel values:

In [None]:
conv_layer.kernel[..., 0, 0].assign(kernel_hor_edge)
conv_layer.kernel[..., 0, 1].assign(kernel_ver_edge)
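
The indexing above relies on the layout of a `Conv2D` kernel: `(kernel_height, kernel_width, input_channels, filters)`. As a sketch of the same layout in plain NumPy (the names `k_hor`, `k_ver` and `kernel_4d` are made up here), the two 2x2 kernels can be packed into one array of that shape:

```python
import numpy as np

k_hor = np.array([[1., 1.], [-1., -1.]], dtype="float32")
k_ver = np.array([[1., -1.], [1., -1.]], dtype="float32")

# Stack along a new last axis (one slice per filter),
# then insert the input-channel axis in front of it
kernel_4d = np.stack([k_hor, k_ver], axis=-1)[:, :, None, :]

print(kernel_4d.shape)  # (2, 2, 1, 2)
print(np.array_equal(kernel_4d[..., 0, 0], k_hor))  # True: filter 0 is the horizontal-edge kernel
print(np.array_equal(kernel_4d[..., 0, 1], k_ver))  # True: filter 1 is the vertical-edge kernel
```

Assuming the layer keeps its default bias, an array like this plus a zero bias of shape `(2,)` could also be passed to the layer in one go via `conv_layer.set_weights([kernel_4d, bias])`.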

And now the layer performs exactly the same edge-detecting operation:

In [None]:
# Note how we add the batch and channel dimensions here
result = conv_layer(img[None,...,None].astype('float32')).numpy().squeeze()

plt.figure(figsize=(10, 4), dpi=100)
plt.subplot(1, 2, 1)
plt.imshow(result[...,0])
plt.subplot(1, 2, 2)
plt.imshow(result[...,1]);

## Ridiculously impractical example: trying to learn the kernels from the 1st demo

Let's make a Keras model that performs a transformation similar to the one we did above (i.e. edge detection + blur). We'll try to learn the corresponding kernels.

In [None]:
model = tf.keras.Sequential(
    [
      # a block to "reproduce" edge detection:
      tf.keras.layers.Conv2D(filters=2, kernel_size=2, activation='elu'),
 
      tf.keras.layers.Conv2D(filters=100, kernel_size=1, activation='elu'),
      tf.keras.layers.Conv2D(filters=1, kernel_size=1, activation='elu'),

      # a block to "reproduce" blurring
      tf.keras.layers.Conv2D(filters=4, kernel_size=3, activation='elu'),
      tf.keras.layers.Conv2D(filters=4, kernel_size=3, activation='elu'),
      tf.keras.layers.Conv2D(filters=4, kernel_size=3, activation='elu'),
      tf.keras.layers.Conv2D(filters=4, kernel_size=3, activation='elu'),
      tf.keras.layers.Conv2D(filters=4, kernel_size=3, activation='elu'),
      tf.keras.layers.Conv2D(filters=4, kernel_size=3, activation='elu'),
      tf.keras.layers.Conv2D(filters=4, kernel_size=3, activation='elu'),
      tf.keras.layers.Conv2D(filters=4, kernel_size=3, activation='elu'),
      tf.keras.layers.Conv2D(filters=4, kernel_size=3, activation='elu'),
      tf.keras.layers.Conv2D(filters=1, kernel_size=3, activation='elu'),
    ]
)
model.build(input_shape=(None, None, None, 1))
model.summary()

Note: we have quite a lot of parameters and just a single image - we'll probably overfit heavily...
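
To get a feeling for where the parameters come from, they can be counted by hand: a convolution with a `k x k` kernel has `k * k * in_channels * filters` weights plus one bias per filter. The sketch below does this for the layer list above in plain Python; the helper `conv2d_params` is made up for illustration, and the total should agree with what `model.summary()` reports.

```python
def conv2d_params(kernel_size, in_channels, filters, use_bias=True):
    """Number of trainable parameters in a 2D convolution with a square kernel."""
    weights = kernel_size * kernel_size * in_channels * filters
    return weights + (filters if use_bias else 0)

# (kernel_size, filters) for each layer of the model above, in order
layer_specs = [(2, 2), (1, 100), (1, 1)] + [(3, 4)] * 9 + [(3, 1)]

total, in_channels = 0, 1   # the input image has a single channel
for kernel_size, filters in layer_specs:
    total += conv2d_params(kernel_size, in_channels, filters)
    in_channels = filters   # this layer's output feeds the next one

print(total)  # 1672
```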

In [None]:
from tqdm import trange

opt = tf.optimizers.Adam()

loss_values = []
for _ in trange(500):
  with tf.GradientTape() as t:
    prediction = model(img[None,...,None].astype('float32'))
    loss = tf.reduce_mean((prediction - edges_blurred[None,...,None])**2)
  grads = t.gradient(loss, model.trainable_variables)
  opt.apply_gradients(zip(grads, model.trainable_variables))
  loss_values.append(loss.numpy())

plt.plot(loss_values);
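
The loss subtracts the target from the prediction elementwise, so both must have the same spatial size. Since no layer uses padding, each one trims `kernel_size - 1` pixels from every side length. The sketch below checks the bookkeeping in plain Python for a hypothetical side length of 300 (the real image size is not assumed here), with the target built from one 2x2 edge kernel followed by five 5x5 blur passes:

```python
def valid_output_size(size, kernel_sizes):
    """Side length left after a chain of unpadded, stride-1 convolutions."""
    for k in kernel_sizes:
        size -= k - 1
    return size

side = 300  # hypothetical input side length

model_kernels = [2, 1, 1] + [3] * 10   # kernel sizes of the model's layers
target_kernels = [2] + [5] * 5         # edge detection, then five blur passes

print(valid_output_size(side, model_kernels))    # 279
print(valid_output_size(side, target_kernels))   # 279
print(valid_output_size(side, [2, 5]))           # 295: with one blur pass the sizes differ
```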

Let's have a look at the result of our model's transformation:

In [None]:
plt.imshow(model(img[None,...,None].astype('float32')).numpy().squeeze());

Try checking the following things:
 - Do the first layers indeed extract the edges?
 - What do the intermediate representations of our model look like? (e.g. take the input and only apply a subset of layers from our model to it)
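
One possible way to approach the second question is a small helper that feeds the input through only the first `n` layers. The sketch below uses stand-in functions instead of real layers so that it runs on its own; the helper name `apply_first_layers` is made up for illustration.

```python
import numpy as np

def apply_first_layers(layers, x, n):
    """Feed x through the first n callables in `layers` and return the result."""
    for layer in layers[:n]:
        x = layer(x)
    return x

# Stand-in "layers" so the sketch runs without a trained model
toy_layers = [lambda x: x + 1, lambda x: x * 2, lambda x: x - 3]
x0 = np.array([0., 1., 2.])

print(apply_first_layers(toy_layers, x0, 2))  # [2. 4. 6.]
```

With the Keras model above, `model.layers` is a list of callable layers, so something like `apply_first_layers(model.layers, img[None,...,None].astype('float32'), 1)` should give the two-channel output of the first convolution, which can then be plotted channel by channel.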
