<a href="https://colab.research.google.com/github/nyp-sit/sdaai-iti107/blob/main/session-1/convnet_filter.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Convolutional Filter

Welcome to the programming exercise. This is part of a series of exercises to help you understand and apply convolutional neural networks (ConvNets or CNNs) to different machine learning problems (e.g. computer vision tasks). In this exercise, we will see how a kernel, or convolution filter (one of the key building blocks of a ConvNet), acts as a feature detector (such as an edge detector or a line detector). We use hand-tuned parameters for our filter in this exercise, but in the following labs we will train the filters so that their parameters are learned automatically from data.

You will learn: 
- to use the kernel as a feature detector  


Let us download some sample images to be used for experimenting with our hand-crafted filter later on. 

In [None]:
!wget https://github.com/nyp-sit/sdaai-iti107/raw/main/session-1/images/black_circle.jpg
!wget https://github.com/nyp-sit/sdaai-iti107/raw/main/session-1/images/vertical_horizontal.png

In [None]:
import tensorflow as tf

import numpy as np
import matplotlib 
import matplotlib.pyplot as plt
from matplotlib import image

First, we define some convenience functions to plot images.

In [None]:
# expect an image of shape (X,Y)
def plot_image(image):
    plt.imshow(image, cmap="gray")
    plt.axis("off")
    plt.show()

def plot_color_image(image):
    plt.imshow(image.astype(np.uint8))
    plt.axis("off")
    plt.show()

In the following, we will construct a vertical edge detector, using a $3\times3$ filter. 
The filter will be as follows (see the explanation in the lecture):
$$\begin{bmatrix}
 -1 & 0 & 1  \\ -1 & 0 & 1 \\ -1 & 0 & 1
\end{bmatrix}$$
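
To build some intuition for why this filter responds to vertical edges, here is a minimal NumPy sketch. The two patches are made-up values for illustration only (they are not taken from the lab images), and `demo_filter` is just a local name for the matrix above. At each position, the filter's response is the sum of the element-wise products between the filter and the patch under it.

```python
import numpy as np

# The 3x3 vertical edge filter shown above
demo_filter = np.array([[-1, 0, 1],
                        [-1, 0, 1],
                        [-1, 0, 1]], dtype=np.float32)

# Hypothetical patches (illustrative values only)
edge_patch = np.array([[0, 0, 1],
                       [0, 0, 1],
                       [0, 0, 1]], dtype=np.float32)  # dark on the left, bright on the right
flat_patch = np.ones((3, 3), dtype=np.float32)         # uniform brightness, no edge

# Response at one position = sum of element-wise products
edge_response = float(np.sum(demo_filter * edge_patch))
flat_response = float(np.sum(demo_filter * flat_patch))

print(edge_response)  # 3.0
print(flat_response)  # 0.0
```

The left and right columns cancel out on a uniform patch, so only a change in brightness from left to right produces a non-zero response.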



In [None]:
vertical_filter = np.zeros(shape=(3,3), dtype=np.float32)
vertical_filter[:,0] = -1.  # set the first column to -1
vertical_filter[:,2] = 1.   # set the 3rd column to 1
print(vertical_filter)

Let us visualize the filter.

In [None]:
plt.imshow(vertical_filter, cmap='gray')

Now we want to apply this filter to our image below. Let's load and visualize our original image. You will see that the image consists of vertical and horizontal lines.

In [None]:
img = image.imread('vertical_horizontal.png')
plt.imshow(img)
print(img.shape)

We will now convolve this image with our filter and see the resulting output. 
TensorFlow provides a function to perform 2D convolution, called `conv2d()`:

```
tf.nn.conv2d(
    input,
    filters,
    strides,
    padding,
    data_format='NHWC',
    dilations=None,
    name=None
)
```

It expects an input tensor of shape `[batch, in_height, in_width, in_channels]` and a filter (kernel) tensor of shape `[filter_height, filter_width, in_channels, out_channels]`.

So we have to reshape our image and our filter to the appropriate number of axes, i.e. as 4-D tensors.
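
As a quick illustration of the two target shapes, the following sketch uses stand-in arrays with made-up sizes (a hypothetical 4x5 grayscale image and an all-zero 3x3 kernel) and adds the missing axes with `np.newaxis`. This is equivalent to calling `reshape` with the full 4-D shape.

```python
import numpy as np

# Stand-in data with made-up sizes (hypothetical, not the lab image)
gray = np.zeros((4, 5), dtype=np.float32)    # (height, width)
kernel = np.zeros((3, 3), dtype=np.float32)  # (filter_height, filter_width)

# Add a batch axis in front and a channel axis at the end
batch = gray[np.newaxis, :, :, np.newaxis]

# Add in_channels and out_channels axes at the end
kernel_4d = kernel[:, :, np.newaxis, np.newaxis]

print(batch.shape)      # (1, 4, 5, 1)
print(kernel_4d.shape)  # (3, 3, 1, 1)
```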

In [None]:
height, width, channels = img.shape

# here we average the RGB channels and treat the result as a grayscale image with a single channel
img = img.mean(axis=2).astype(np.float32)
images = img.reshape(1, height, width, 1)
print(images.shape)

In [None]:
vertical_filters = vertical_filter.reshape(3,3,1,1)
outputs = tf.nn.conv2d(images, vertical_filters, strides=1, padding="SAME")
print(outputs.shape)

# take the absolute values so that negative responses (bright-to-dark edges) also show up as bright pixels
outputs = np.abs(outputs)
# plot the image data at batch=0 and channel=0; plot_image expects a 2-D array of shape (X,Y)
plot_image(outputs[0,:,:,0])

Here you can see that a bright line is shown for every vertical edge detected in the image.
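
To see what happens at each pixel, and why the absolute value is taken before plotting, here is a rough NumPy sketch of the sliding-window computation. It is not how TensorFlow implements `conv2d` internally. The helper `correlate_same` and the stripe image are hypothetical, and the sketch assumes an odd-sized kernel, a stride of 1 and zero padding, which is meant to mirror `padding="SAME"` in the cell above.

```python
import numpy as np

def correlate_same(img, kernel):
    """Slide `kernel` over `img` with zero padding so the output keeps the input's shape."""
    kh, kw = kernel.shape
    # Pad with zeros so that every pixel has a full neighbourhood
    padded = np.pad(img, ((kh // 2, kh // 2), (kw // 2, kw // 2)))
    out = np.zeros_like(img, dtype=np.float32)
    for r in range(img.shape[0]):
        for c in range(img.shape[1]):
            # Sum of element-wise products between the kernel and the window
            out[r, c] = np.sum(padded[r:r + kh, c:c + kw] * kernel)
    return out

# The vertical edge filter used in this exercise
kernel = np.array([[-1, 0, 1]] * 3, dtype=np.float32)

# Hypothetical 5x7 image: one bright vertical stripe on a dark background
stripe = np.zeros((5, 7), dtype=np.float32)
stripe[:, 3] = 1.0

response = correlate_same(stripe, kernel)
print(response[2].tolist())          # [0.0, 0.0, 3.0, 0.0, -3.0, 0.0, 0.0]
print(np.abs(response)[2].tolist())  # [0.0, 0.0, 3.0, 0.0, 3.0, 0.0, 0.0]
```

The left side of the stripe (dark to bright) gives a positive response and the right side (bright to dark) gives a negative one. Taking the absolute value makes both edges appear equally bright in the plot.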

**Exercise 1:**

Construct a horizontal edge detector (filter) and apply it to the image. Plot the resulting image. (Hint: in the vertical edge detector, the values transition from -1 to 1 going from the left column to the right column.)

<details><summary>Click here for answer</summary>
<br/>

```
horizontal_filter = np.zeros(shape=(3,3), dtype=np.float32)
horizontal_filter[0,:] = -1.  # set the first row to -1
horizontal_filter[2,:] = 1.   # set the 3rd row to 1
print(horizontal_filter)
horizontal_filter = horizontal_filter.reshape(3,3,1,1)
outputs = tf.nn.conv2d(images, horizontal_filter, strides=1, padding="SAME")
print(outputs.shape)
outputs = np.abs(outputs)
plot_image(outputs[0,:,:,0])
```
    
</details>

In [None]:
### START YOUR CODE HERE ###



### END YOUR CODE HERE ###

**Exercise 2 (Optional):** 

The [Sobel filter](https://en.wikipedia.org/wiki/Sobel_operator) is commonly used in computer vision for edge detection, producing images that emphasise edges. $G_x$ and $G_y$ are used to detect the pixel gradient in the x and y directions respectively, and $|G|$ is the gradient magnitude at each pixel.


$$G_x=\begin{bmatrix}-1 & 0 & 1  \\ -2 & 0 & 2 \\ -1 & 0 & 1\end{bmatrix}$$

$$G_y=\begin{bmatrix}-1 & -2 & -1  \\ 0 & 0 & 0 \\ 1 & 2 & 1\end{bmatrix}$$

$$|G|=\sqrt{G_x^2 + G_y^2}$$
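
As a small worked example of these formulas, the sketch below evaluates $G_x$, $G_y$ and $|G|$ at a single position. The 3x3 patch is a made-up corner-like pattern chosen for illustration, and the variable names are hypothetical. It does not replace the exercise, which applies the filters to the whole image.

```python
import numpy as np

# The two Sobel kernels from the formulas above
sobel_x = np.array([[-1, 0, 1],
                    [-2, 0, 2],
                    [-1, 0, 1]], dtype=np.float32)
sobel_y = np.array([[-1, -2, -1],
                    [ 0,  0,  0],
                    [ 1,  2,  1]], dtype=np.float32)

# Hypothetical patch: bright region in the bottom-right corner
patch = np.array([[0, 0, 0],
                  [0, 0, 1],
                  [0, 1, 1]], dtype=np.float32)

# Response of each kernel at this position (sum of element-wise products)
gx = float(np.sum(sobel_x * patch))
gy = float(np.sum(sobel_y * patch))

# Combined gradient magnitude, sqrt(gx^2 + gy^2)
magnitude = float(np.sqrt(gx ** 2 + gy ** 2))

print(gx, gy)               # 3.0 3.0
print(round(magnitude, 3))  # 4.243
```

Because the brightness changes in both directions at this position, both kernels respond, and the magnitude combines them into a single edge strength.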



In [None]:
circle_image = image.imread('black_circle.jpg')
plt.imshow(circle_image, cmap=plt.cm.gray, interpolation='nearest')
plt.show()
print(circle_image.shape)

Construct the two Sobel filters (based on the formulas given above) and apply them to the image. Combine the two resulting images to get the final image. Plot the final image.

<details><summary>Click here for answer</summary>
<br/>
    
```
# use float32 throughout so that the filters and the image have the same dtype
gradx_filter = np.array([[-1,0,1],[-2,0,2],[-1,0,1]], dtype=np.float32)
gradx_filter = gradx_filter.reshape(3,3,1,1)

grady_filter = np.array([[-1,-2,-1],[0,0,0],[1,2,1]], dtype=np.float32)
grady_filter = grady_filter.reshape(3,3,1,1)

height, width, channels = circle_image.shape

# collapse the 3 channels into a single channel by taking the average
circle_image = circle_image.mean(axis=2).astype(np.float32)

# reshape to add in additional batch dimension
circle_image = circle_image.reshape(1, height, width, 1)

    
image_x = tf.nn.conv2d(circle_image, gradx_filter, strides=1, padding="SAME")
plot_image(image_x[0,:,:,0])

image_y = tf.nn.conv2d(circle_image, grady_filter, strides=1, padding="SAME")
plot_image(image_y[0,:,:,0])

gradient_magnitude = np.sqrt(np.square(image_x[0,:,:,0]) + np.square(image_y[0,:,:,0]))
plt.imshow(gradient_magnitude, cmap='gray')

```
</details>


In [None]:
### START YOUR CODE HERE ###

# define the G_x filter 
gradx_filter = None

# define the G_y filter 
grady_filter = None

# reshape the circle image to appropriate shape suitable for conv2d
height, width, channels = circle_image.shape

# collapse the 3 channels into a single channel by taking the average (as float32, to match float32 filters)
circle_image = circle_image.mean(axis=2).astype(np.float32)

# add in batch and channel dimension
circle_image = None 

# apply the G_x filter on circle_image to obtain a new image, image_x
image_x = None 

# apply the G_y filter on circle_image to obtain a new image, image_y 
image_y = None 


# compute combined gradient magnitude, i.e. sqrt(G_x^2 + G_y^2)
gradient_magnitude = None


plt.imshow(gradient_magnitude, cmap='gray')


### END YOUR CODE HERE ###