In [None]:
import torch

### The cross-correlation operator

In [None]:
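def corr2d(X, K):
    """Compute the 2D cross-correlation of input X with kernel K."""
    H, W = X.shape
    h, w = K.shape
    # No padding, stride 1: the output shrinks by (kernel size - 1) per axis
    Y = torch.zeros(H - h + 1, W - w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            # Elementwise product of the current window with K, then sum
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y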

In [None]:
# Test corr2d on a small example
X = torch.tensor([[0, 1, 2],
                  [3, 4, 5],
                  [6, 7, 8]], dtype=torch.float32)
K = torch.tensor([[0, 1],
                  [2, 3]], dtype=torch.float32)

corr2d(X, K)  # expected values: [[19., 25.], [37., 43.]]
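
As a sanity check, the result can be compared against a hand-computed answer and against PyTorch's built-in `torch.nn.functional.conv2d`, which also computes a cross-correlation. The sketch below repeats `corr2d` so that it runs on its own; the names `expected`, `ours`, and `builtin` are introduced here for illustration.

```python
import torch
import torch.nn.functional as F

def corr2d(X, K):
    """Naive 2D cross-correlation, same logic as the function above."""
    h, w = K.shape
    Y = torch.zeros(X.shape[0] - h + 1, X.shape[1] - w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

X = torch.arange(9, dtype=torch.float32).reshape(3, 3)  # [[0,1,2],[3,4,5],[6,7,8]]
K = torch.arange(4, dtype=torch.float32).reshape(2, 2)  # [[0,1],[2,3]]

# Worked by hand, e.g. top-left: 0*0 + 1*1 + 3*2 + 4*3 = 19
expected = torch.tensor([[19., 25.], [37., 43.]])
ours = corr2d(X, K)

# F.conv2d expects input (batch, channels, H, W) and weight (out, in, kH, kW)
builtin = F.conv2d(X.reshape(1, 1, 3, 3), K.reshape(1, 1, 2, 2)).reshape(2, 2)

print(torch.equal(ours, expected))     # True
print(torch.allclose(ours, builtin))   # True
```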

### Convolutional Layers

A convolutional layer cross-correlates the input with a kernel and adds a scalar bias to produce an output.
The parameters of the convolutional layer are precisely the values that constitute the kernel and the scalar
bias. When training models based on convolutional layers, we typically initialize the kernels randomly,
just as we would with a fully-connected layer.

In [None]:
# A minimal convolutional layer built on corr2d
class Conv2D(torch.nn.Module):
    def __init__(self, kernel_size, **kwargs):
        super().__init__(**kwargs)
        # nn.Parameter sets requires_grad=True by default
        self.weight = torch.nn.Parameter(torch.randn(kernel_size))
        self.bias = torch.nn.Parameter(torch.randn(1))

    def forward(self, x):
        # Cross-correlate the input with the kernel, then add the scalar bias
        return corr2d(x, self.weight) + self.bias

In [None]:
# let's test
my_cnn = Conv2D((2, 2))
out = my_cnn(X)

In [None]:
out
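
One way to check the layer is to copy its parameters into PyTorch's built-in `torch.nn.Conv2d` and compare outputs. This is only a sketch: `corr2d` and `Conv2D` are repeated so that it runs on its own, the seed is an arbitrary choice for reproducibility, and the names `layer`, `builtin`, and `ref` are introduced here for illustration.

```python
import torch

def corr2d(X, K):
    h, w = K.shape
    Y = torch.zeros(X.shape[0] - h + 1, X.shape[1] - w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

class Conv2D(torch.nn.Module):
    def __init__(self, kernel_size):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(kernel_size))
        self.bias = torch.nn.Parameter(torch.randn(1))

    def forward(self, x):
        return corr2d(x, self.weight) + self.bias

torch.manual_seed(0)  # arbitrary seed, for reproducibility only
X = torch.arange(9, dtype=torch.float32).reshape(3, 3)
layer = Conv2D((2, 2))
out = layer(X)

# Both the kernel and the bias are registered as trainable parameters
print([name for name, _ in layer.named_parameters()])  # ['weight', 'bias']

# Built-in layer with one input and one output channel, same parameters
builtin = torch.nn.Conv2d(1, 1, kernel_size=(2, 2))
with torch.no_grad():
    builtin.weight.copy_(layer.weight.reshape(1, 1, 2, 2))
    builtin.bias.copy_(layer.bias)

# nn.Conv2d expects (batch, channels, H, W)
ref = builtin(X.reshape(1, 1, 3, 3)).reshape(2, 2)
print(torch.allclose(out, ref, atol=1e-5))
```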

### Object Edge Detection in Images

Let’s look at a simple application of a convolutional layer: detecting the edge of an object in an image by
finding the location of the pixel change. First, we construct an 'image' of 6 x 8 pixels. The middle four
columns are black (0) and the rest are white (1).

In [None]:
# Create the image: white (1) everywhere, black (0) in the middle four columns
X = torch.ones((6, 8), dtype=torch.float32)
X[:, 2:6] = 0
X

Next, we construct a kernel K with a height of 1 and width of 2. When we perform the cross-correlation
operation with the input, if the horizontally adjacent elements are the same, the output is 0. Otherwise, the
output is non-zero.

In [None]:
# Create the kernel: a horizontal finite difference
K = torch.tensor([[1, -1]], dtype=torch.float32)

We now pass X and our kernel K to corr2d to perform the cross-correlation. As you can see, the output is
1 at the edge from white to black and -1 at the edge from black to white. All other outputs are 0.

In [None]:
# check output
Y = corr2d(X, K)
Y
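
Because K = [1, -1], each output element is simply `X[i, j] - X[i, j+1]`, so the same result can be written as a difference of two shifted slices. The sketch below checks that equivalence; `corr2d` is repeated so that it runs on its own, and `Y_diff` is a name introduced here for illustration.

```python
import torch

def corr2d(X, K):
    h, w = K.shape
    Y = torch.zeros(X.shape[0] - h + 1, X.shape[1] - w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

X = torch.ones((6, 8))
X[:, 2:6] = 0
K = torch.tensor([[1.0, -1.0]])

# Each pixel minus its right-hand neighbour
Y_diff = X[:, :-1] - X[:, 1:]

print(Y_diff[0].tolist())                 # [0.0, 1.0, 0.0, 0.0, 0.0, -1.0, 0.0]
print(torch.equal(Y_diff, corr2d(X, K)))  # True
```

Every row of the output is identical, since every row of the image is identical.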

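Let’s apply the kernel to the transposed image. Because K only compares horizontally adjacent pixels, it detects vertical edges only. The transposed image contains only horizontal edges, so the output is all zeros.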

In [None]:
# Apply the same kernel to the transposed image
corr2d(X.T, K)
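
To pick up the horizontal edges of the transposed image, the kernel can be transposed as well, giving a 2 x 1 vertical difference. The sketch below is an illustration rather than part of the original exercise; `corr2d` is repeated so that it runs on its own.

```python
import torch

def corr2d(X, K):
    h, w = K.shape
    Y = torch.zeros(X.shape[0] - h + 1, X.shape[1] - w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

X = torch.ones((6, 8))
X[:, 2:6] = 0
K = torch.tensor([[1.0, -1.0]])
Y = corr2d(X, K)

# The horizontal-difference kernel sees no change within any row of X.T
print(corr2d(X.T, K).abs().sum().item())   # 0.0

# The transposed kernel compares vertically adjacent pixels instead,
# and recovers the original edge map, transposed
print(torch.equal(corr2d(X.T, K.T), Y.T))  # True
```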

### Learning a Kernel

Designing an edge detector by finite differences [1, -1] is neat if we know this is precisely what we are
looking for. However, as we look at larger kernels and consider successive layers of convolutions, it might
be impossible to specify manually what each filter should be doing.

Now let’s see whether we can learn the kernel that generated Y from X by looking at the (input, output)
pairs only. We first construct a convolutional layer and initialize its kernel as a random array. Next, in
each iteration, we will use the squared error to compare Y and the output of the convolutional layer, then
calculate the gradient to update the weight.
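
Written out, and assuming the layer computes $\hat{Y} = \mathrm{corr2d}(X, W) + b$ as in the Conv2D class above, the objective and one gradient step look as follows. The symbols $W$, $b$, $L$, and the learning rate $\eta$ are notation introduced here for illustration.

$$
L(W, b) = \sum_{i,j} \left(\hat{Y}_{ij} - Y_{ij}\right)^2,
\qquad
W \leftarrow W - \eta \, \frac{\partial L}{\partial W},
\qquad
b \leftarrow b - \eta \, \frac{\partial L}{\partial b}
$$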

In [None]:
# Target output generated by the hand-designed kernel
K = torch.tensor([[1, -1]], dtype=torch.float32)
Y = corr2d(X, K)

# A 1 x 2 layer with randomly initialized kernel and bias
my_net = Conv2D((1, 2))
optimizer = torch.optim.SGD(my_net.parameters(),
                            lr=0.01)

for step in range(10):
    Y_hat = my_net(X)
    # Squared error summed over all output elements
    loss = ((Y_hat - Y) ** 2).sum()
    print("Error is", float(loss))
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

As you can see, the error decreases over the 10 iterations. How small it gets depends on the random initialization. Now we will take a look at the kernel array we learned.

In [None]:
# check the learned weights
my_net.weight

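Indeed, the learned kernel array has moved toward the kernel array K = [1, -1] we defined earlier. After only 10 iterations it may still be some distance off, depending on the random initialization, and training for more iterations brings it closer.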
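
To see the kernel converge more fully, the same loop can be run for longer. The sketch below keeps the learning rate of 0.01 but assumes 100 iterations instead of 10 and fixes a random seed; both are choices made here for illustration. `corr2d` and `Conv2D` are repeated so that it runs on its own.

```python
import torch

def corr2d(X, K):
    h, w = K.shape
    Y = torch.zeros(X.shape[0] - h + 1, X.shape[1] - w + 1)
    for i in range(Y.shape[0]):
        for j in range(Y.shape[1]):
            Y[i, j] = (X[i:i + h, j:j + w] * K).sum()
    return Y

class Conv2D(torch.nn.Module):
    def __init__(self, kernel_size):
        super().__init__()
        self.weight = torch.nn.Parameter(torch.randn(kernel_size))
        self.bias = torch.nn.Parameter(torch.randn(1))

    def forward(self, x):
        return corr2d(x, self.weight) + self.bias

torch.manual_seed(0)  # assumption: fixed seed, for reproducibility only

X = torch.ones((6, 8))
X[:, 2:6] = 0
K = torch.tensor([[1.0, -1.0]])
Y = corr2d(X, K)

net = Conv2D((1, 2))
optimizer = torch.optim.SGD(net.parameters(), lr=0.01)

for step in range(100):  # assumption: 100 steps rather than 10
    loss = ((net(X) - Y) ** 2).sum()
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()

print("final error:", float(loss))
print("weight:", net.weight.data)  # approaches [[1, -1]]
print("bias:", net.bias.data)      # approaches 0
```

Since Y was generated without a bias, the learned bias should also approach zero.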