# Introduction #

In these exercises, you'll explore the effect of the `strides` and `padding` parameters in `Conv2D` and `MaxPool2D` layers, learn how convnets can capture large-scale visual features by stacking layers, and finally see how convolution can be used on one-dimensional data such as a time series.

Run the cell below to set everything up.

In [None]:
# Setup feedback system
from learntools.core import binder
binder.bind(globals())
from learntools.computer_vision.ex4 import *

from cv_prelude import *

# Moving Windows #

In this exercise, you'll have a chance to experiment with the moving window parameters in `Conv2D` and `MaxPool2D` layers.
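
As a rough guide to what these parameters do, the sketch below computes the output size along one spatial dimension for a given `strides` and `padding` setting. The helper `conv_output_size` is a hypothetical name introduced here for illustration, not part of Keras, and it assumes no dilation. The input size of 192 is just an example value.

```python
import math

def conv_output_size(input_size, kernel_size, stride=1, padding="valid"):
    """Output length along one spatial dimension (hypothetical helper, no dilation)."""
    if padding == "valid":
        # No padding: only windows that fit entirely inside the input are used.
        return (input_size - kernel_size) // stride + 1
    if padding == "same":
        # Zero padding is added at the borders, so the size depends only on the stride.
        return math.ceil(input_size / stride)
    raise ValueError(f"unknown padding: {padding!r}")

# Compare the four combinations for a 3-wide window on a 192-wide input.
for stride in (1, 2):
    for padding in ("valid", "same"):
        size = conv_output_size(192, 3, stride, padding)
        print(f"strides={stride}, padding={padding!r}: 192 -> {size}")
```

The same arithmetic applies to `MaxPool2D`, with `pool_size` playing the role of the kernel size.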

### 1a) Explore Stride

In [None]:
# Lines below will give you a hint or solution code
#_COMMENT_IF(PROD)_
q_1.a.hint()
#_COMMENT_IF(PROD)_
q_1.a.solution()

### 1b) Explore Padding


In [None]:
# Lines below will give you a hint or solution code
#_COMMENT_IF(PROD)_
q_1.b.hint()
#_COMMENT_IF(PROD)_
q_1.b.solution()

# The Receptive Field #

In all of the examples we've done, we've used $3 \times 3$ kernels; this is perhaps the most common choice for kernel size in convolutional networks. We might worry, though: if our images have dimension `(192, 192)`, are kernels with dimension `(3, 3)` large enough to capture all of the important features? A $3 \times 3$ square wouldn't cover the shape of an eye or an ear, for instance. Maybe we should use larger kernels?

In fact, this is rarely necessary. Networks will occasionally have an initial `Conv2D` layer with larger kernels, perhaps with `kernel_size=(5, 5)`, but usually not much larger. Instead, convnets can more effectively capture large-scale information from an image by stacking convolution and pooling layers. This stacking increases the number of pixels the output neurons receive information from; that is, it increases the size of the neurons' **receptive field**. Let's see how this happens now.

### 3) How the Receptive Field Grows

This next picture illustrates two stacked convolutional layers both with `(3, 3)` kernels. The bottom layer represents the input. Each of the neurons in the first (middle) layer has a $3 \times 3$ receptive field. Following the path of connections, we can see that each of the neurons in the second (top) layer has a $5 \times 5$ receptive field.

<figure>
<img src="https://i.imgur.com/HmwQm2S.png" alt="Illustration of the receptive field of two stacked convolutions." width=250>
</figure>
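
The path-following argument in the figure can also be written as a small calculation. This is a minimal sketch, assuming square kernels and tracking one spatial dimension only. `receptive_field` is a hypothetical helper named here for illustration.

```python
def receptive_field(layers):
    """Receptive field, in input pixels along one dimension, of the last layer's outputs.

    `layers` is a list of (kernel_size, stride) pairs ordered from input to output.
    """
    field = 1  # a single input pixel only "sees" itself
    jump = 1   # distance, in input pixels, between neighboring outputs of the current layer
    for kernel_size, stride in layers:
        # Each extra kernel position reaches `jump` more input pixels.
        field += (kernel_size - 1) * jump
        # Strided layers spread their outputs further apart on the input grid.
        jump *= stride
    return field

print(receptive_field([(3, 1)]))          # one 3x3 convolution: 3
print(receptive_field([(3, 1), (3, 1)]))  # the two stacked convolutions in the figure: 5
```

Each `(3, 1)` entry stands for a `(3, 3)` kernel with stride 1, matching the layers in the figure.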

If you added a *third* convolutional layer with a `(3, 3)` kernel, each of its neurons would have a receptive field of:

In [None]:
# Lines below will give you a hint or solution
#_COMMENT_IF(PROD)_
q_2.a.hint()
#_COMMENT_IF(PROD)_
q_2.a.solution()

Now say you add a `(2, 2)` maximum pooling layer with `strides=2` after the third convolution. What receptive field do the outputs have now? (This is harder. Try the hint if you need help.)

In [None]:
# Lines below will give you a hint or solution
#_COMMENT_IF(PROD)_
q_2.b.hint()
#_COMMENT_IF(PROD)_
q_2.b.solution()

# One-Dimensional Convolution #

Though we've been using convolutional networks on two-dimensional data, it turns out that they can also be useful on *one*-dimensional data, like time series or natural language texts. In fact, convolutional networks tend to be successful on any kind of data with a strong **local topological structure**, meaning that the information about a point tends to be concentrated in nearby points. You can most successfully predict the value of a pixel by looking at nearby pixels, and you can most successfully predict the weather today by looking at the weather yesterday rather than a month ago.

### 4) Apply a 1D Convolution

In this exercise, we'll see how a convolution can be used on a **time series**. The time series we'll use is from [Google Trends](https://trends.google.com/trends/); it measures the weekly popularity of the search term "machine learning" from January 25, 2015 to January 15, 2020.

In [None]:
import pandas as pd
import tensorflow as tf

# Load the time series as a Pandas dataframe
machinelearning = pd.read_csv(
    '/kaggle/input/computer-vision-resources/machinelearning.csv',
    parse_dates=['Week'],
    index_col='Week',
)

machinelearning.plot();

Because our data is one-dimensional, the kernel needs to be one-dimensional as well. Define a one-dimensional kernel. Though not required, you'll get better results if the entries sum to 1.

In [None]:
# YOUR CODE HERE: Define a 1D kernel. 
kernel = tf.constant([____])
q_3.check()

In [None]:
#%%RM_IF(PROD)%%
kernel = tf.constant([0.1, 0.2, 0.3, 0.4])
q_3.assert_check_passed()

Now run the next cell to apply the kernel with a convolution and see what effect it had on the time series.

In [None]:
# Reformat for TensorFlow
ts_data = machinelearning.to_numpy()
ts_data = tf.expand_dims(ts_data, axis=0)
ts_data = tf.cast(ts_data, dtype=tf.float32)
kern = tf.reshape(kernel, shape=(*kernel.shape, 1, 1))

ts_filter = tf.nn.conv1d(
    input=ts_data,
    filters=kern,
    stride=1,
    padding='VALID',
)

# Format as Pandas Series
machinelearning_filtered = pd.Series(tf.squeeze(ts_filter).numpy())

machinelearning_filtered.plot();
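
To see what the convolution above computes, here is a plain-Python sketch of the same sliding-window operation with `'VALID'` padding and a stride of 1. Like `tf.nn.conv1d`, it does not flip the kernel. The function name `sliding_weighted_sum` and the short series are made up for illustration; the values are not the Google Trends data.

```python
def sliding_weighted_sum(series, weights):
    """Slide `weights` along `series`, taking a weighted sum wherever the window fully fits."""
    width = len(weights)
    return [
        sum(w * x for w, x in zip(weights, series[start:start + width]))
        for start in range(len(series) - width + 1)
    ]

# Made-up values for illustration only.
series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
weights = [0.1, 0.2, 0.3, 0.4]  # entries sum to 1, so each output is a weighted average

smoothed = sliding_weighted_sum(series, weights)
print(len(series), "->", len(smoothed))  # a width-4 window shortens the series by 3
print(smoothed)
```

Because the weights sum to 1, the outputs stay on the same scale as the inputs, which is why such a kernel acts as a smoothing filter rather than rescaling the series.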

# Conclusion #

This lesson ends our discussion of feature extraction. Hopefully, having completed these lessons, you've gained some intuition about how the process works and why the usual choices for its implementation are often the best ones.

In the next lesson, Lesson 5, you'll learn how to compose the `Conv2D` and `MaxPool2D` layers to build your own convolutional networks from scratch.