In [1]:
import numpy as np

In [24]:
X = np.random.randint(low=0, high=256, size=(5,5))  # high is exclusive, so pixel values span 0-255
X

array([[143, 218, 157,  12, 101],
       [189, 190, 222, 208, 146],
       [ 63, 126, 141, 187,  29],
       [218, 217,  31,  61, 174],
       [241,   2,  68, 123, 223]])

In [25]:
X[2, 0] # [row, col]

63

In [26]:
x = 0
y = 0
X[y: y+3, x: x+3]

array([[143, 218, 157],
       [189, 190, 222],
       [ 63, 126, 141]])

$x$ is the column pointer and $y$ is the row pointer. The row slice `y: y+3` starts at row 0 and stops before row 3, so it selects rows 0, 1, and 2. The column slice `x: x+3` does the same for columns 0, 1, and 2.

It may seem strange to see $x$ index the columns, but recall that images have their origin in the top-left corner. From this point of view, moving right across the columns is the $x$ direction and moving down the rows is the $y$ direction.
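To see why the order matters, here is a minimal sketch on a non-square array, where swapping the two axes would be visible. The array `img` and its values are made up for illustration and are not part of the notebook's data.

```python
import numpy as np

# Hypothetical 2-row by 4-column "image"; each value encodes its own position.
img = np.array([[ 0,  1,  2,  3],
                [10, 11, 12, 13]])

n_rows, n_cols = img.shape  # shape is (rows, cols), i.e. (height, width)
x, y = 3, 1                 # x picks the column, y picks the row

pixel = img[y, x]           # NumPy indexes [row, col], so y comes first
patch = img[0:2, 1:3]       # rows 0-1, columns 1-2

print(pixel)                # 13
print(patch.tolist())       # [[1, 2], [11, 12]]
```

On a square matrix, swapping rows and columns still produces valid indices, so a mix-up is easy to miss.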

So how can we iterate over every pixel of the image?

In [27]:
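n_rows, n_cols = X.shape       # shape is (rows, cols)
for y_i in range(n_rows):      # y_i walks down the rows
    for x_i in range(n_cols):  # x_i walks right across the columns
        print(f"({y_i}, {x_i})")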

(0, 0)
(0, 1)
(0, 2)
(0, 3)
(0, 4)
(1, 0)
(1, 1)
(1, 2)
(1, 3)
(1, 4)
(2, 0)
(2, 1)
(2, 2)
(2, 3)
(2, 4)
(3, 0)
(3, 1)
(3, 2)
(3, 3)
(3, 4)
(4, 0)
(4, 1)
(4, 2)
(4, 3)
(4, 4)


Great! Now let's take a $2 \times 2$ window of pixels starting at each position.

In [31]:
X

array([[143, 218, 157,  12, 101],
       [189, 190, 222, 208, 146],
       [ 63, 126, 141, 187,  29],
       [218, 217,  31,  61, 174],
       [241,   2,  68, 123, 223]])

In [37]:
n_rows, n_cols = X.shape
for y_i in range(n_rows):
    for x_i in range(n_cols):
        # 2x2 window whose top-left corner is at row y_i, column x_i
        S = X[y_i: y_i+2, x_i: x_i+2]
        print(S)
        print()

[[143 218]
 [189 190]]

[[218 157]
 [190 222]]

[[157  12]
 [222 208]]

[[ 12 101]
 [208 146]]

[[101]
 [146]]

[[189 190]
 [ 63 126]]

[[190 222]
 [126 141]]

[[222 208]
 [141 187]]

[[208 146]
 [187  29]]

[[146]
 [ 29]]

[[ 63 126]
 [218 217]]

[[126 141]
 [217  31]]

[[141 187]
 [ 31  61]]

[[187  29]
 [ 61 174]]

[[ 29]
 [174]]

[[218 217]
 [241   2]]

[[217  31]
 [  2  68]]

[[ 31  61]
 [ 68 123]]

[[ 61 174]
 [123 223]]

[[174]
 [223]]

[[241   2]]

[[ 2 68]]

[[ 68 123]]

[[123 223]]

[[223]]



The scan runs, but not every window is $2 \times 2$ as we want it to be. Any window that starts in the last row or the last column runs past the edge of the matrix, so it comes back with fewer rows or columns.
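The truncation comes from how NumPy handles slices: a stop index past the end of an axis is clipped rather than raising an error. A small sketch, using a stand-in array `X_demo` (an assumption for illustration, not the notebook's `X`):

```python
import numpy as np

X_demo = np.arange(25).reshape(5, 5)  # stand-in 5x5 matrix

full = X_demo[3:5, 3:5]     # fits entirely inside the array
clipped = X_demo[4:6, 4:6]  # asks for row 5 and column 5, which do not exist

print(full.shape)     # (2, 2)
print(clipped.shape)  # (1, 1)
```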

One way to fix this is zero-padding: we add rows and columns of 0's around the matrix so that windows near the edge still have a full set of values to cover. In this case, because we're interested in a $2 \times 2$ view, let's pad $X$ with a single layer of 0's on all sides.
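As a rough sketch of how `np.pad` sizes its output: a pad width of $p$ on both sides of an axis lengthens that axis by $2p$, and the width can also be given per axis and per side. The array `A` below is a made-up example.

```python
import numpy as np

A = np.ones((2, 3), dtype=int)  # hypothetical 2x3 array of ones

# Scalar pad width: one layer on every side of every axis.
both_sides = np.pad(A, 1, mode="constant")

# The same padding spelled out as ((before, after), (before, after)).
explicit = np.pad(A, ((1, 1), (1, 1)), mode="constant", constant_values=0)

# Pad only after the last row and after the last column.
after_only = np.pad(A, ((0, 1), (0, 1)), mode="constant")

print(both_sides.shape)  # (4, 5)
print(after_only.shape)  # (3, 4)
```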

In [40]:
X_pad = np.pad(X, (1,1), mode='constant')  # (before, after) = (1, 1) on every axis; 'constant' pads with 0
X_pad

array([[  0,   0,   0,   0,   0,   0,   0],
       [  0, 143, 218, 157,  12, 101,   0],
       [  0, 189, 190, 222, 208, 146,   0],
       [  0,  63, 126, 141, 187,  29,   0],
       [  0, 218, 217,  31,  61, 174,   0],
       [  0, 241,   2,  68, 123, 223,   0],
       [  0,   0,   0,   0,   0,   0,   0]])

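Great, the padding worked! Now let's repeat the scan on the padded matrix, keeping only the first five start positions along each axis so that we get exactly one full $2 \times 2$ window per pixel of $X$.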

In [49]:
n_rows, n_cols = X_pad.shape
for y_i in range(n_rows - 2):      # stop early: one window per pixel of X
    for x_i in range(n_cols - 2):
        S = X_pad[y_i: y_i+2, x_i: x_i+2]
        print(S)
        print()

[[  0   0]
 [  0 143]]

[[  0   0]
 [143 218]]

[[  0   0]
 [218 157]]

[[  0   0]
 [157  12]]

[[  0   0]
 [ 12 101]]

[[  0 143]
 [  0 189]]

[[143 218]
 [189 190]]

[[218 157]
 [190 222]]

[[157  12]
 [222 208]]

[[ 12 101]
 [208 146]]

[[  0 189]
 [  0  63]]

[[189 190]
 [ 63 126]]

[[190 222]
 [126 141]]

[[222 208]
 [141 187]]

[[208 146]
 [187  29]]

[[  0  63]
 [  0 218]]

[[ 63 126]
 [218 217]]

[[126 141]
 [217  31]]

[[141 187]
 [ 31  61]]

[[187  29]
 [ 61 174]]

[[  0 218]
 [  0 241]]

[[218 217]
 [241   2]]

[[217  31]
 [  2  68]]

[[ 31  61]
 [ 68 123]]

[[ 61 174]
 [123 223]]



Perfect! Now let's multiply each window $S$ element-wise by a second matrix of the same shape, the kernel, which we call $K$.
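In NumPy, `*` between two arrays of the same shape multiplies matching entries, whereas `@` performs a matrix product. The sketch below contrasts the two on made-up $2 \times 2$ values (`S_demo` and `K_demo` are illustrative names):

```python
import numpy as np

S_demo = np.array([[1, 2],
                   [3, 4]])
K_demo = np.array([[0,  1],
                   [4, -1]])

elementwise = K_demo * S_demo  # entry (i, j) is K_demo[i, j] * S_demo[i, j]
matrix_product = K_demo @ S_demo  # rows of K_demo times columns of S_demo

print(elementwise.tolist())     # [[0, 2], [12, -4]]
print(matrix_product.tolist())  # [[3, 4], [1, 4]]
```

The scan uses the element-wise form.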

In [54]:
K = np.random.randint(low=-5, high=5, size=(2,2))  # random integer kernel; high is exclusive
print(K)
print()
print()
n_rows, n_cols = X_pad.shape
for y_i in range(n_rows - 2):
    for x_i in range(n_cols - 2):
        S = X_pad[y_i: y_i+2, x_i: x_i+2]
        print(K*S)  # element-wise product, not a matrix product
        print()

[[-3  3]
 [ 3 -1]]


[[   0    0]
 [   0 -143]]

[[   0    0]
 [ 429 -218]]

[[   0    0]
 [ 654 -157]]

[[  0   0]
 [471 -12]]

[[   0    0]
 [  36 -101]]

[[   0  429]
 [   0 -189]]

[[-429  654]
 [ 567 -190]]

[[-654  471]
 [ 570 -222]]

[[-471   36]
 [ 666 -208]]

[[ -36  303]
 [ 624 -146]]

[[  0 567]
 [  0 -63]]

[[-567  570]
 [ 189 -126]]

[[-570  666]
 [ 378 -141]]

[[-666  624]
 [ 423 -187]]

[[-624  438]
 [ 561  -29]]

[[   0  189]
 [   0 -218]]

[[-189  378]
 [ 654 -217]]

[[-378  423]
 [ 651  -31]]

[[-423  561]
 [  93  -61]]

[[-561   87]
 [ 183 -174]]

[[   0  654]
 [   0 -241]]

[[-654  651]
 [ 723   -2]]

[[-651   93]
 [   6  -68]]

[[ -93  183]
 [ 204 -123]]

[[-183  522]
 [ 369 -223]]



Awesome. Now let's sum the entries of each product so that every window collapses to a single number.
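As a small worked example (the values of `K_demo` and `S_demo` are picked for illustration), multiplying element-wise and then summing gives one number per window:

```python
import numpy as np

K_demo = np.array([[-2, -2],
                   [ 4, -2]])
S_demo = np.array([[  0,   0],
                   [143, 218]])

product = K_demo * S_demo  # [[0, 0], [572, -436]]
total = product.sum()      # 0 + 0 + 572 - 436

print(product.tolist())  # [[0, 0], [572, -436]]
print(total)             # 136
```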

In [58]:
K = np.random.randint(low=-5, high=5, size=(2,2))
print(K)
print()
print()
n_rows, n_cols = X_pad.shape
for y_i in range(n_rows - 2):
    for x_i in range(n_cols - 2):
        S = X_pad[y_i: y_i+2, x_i: x_i+2]
        print( (K*S).sum() )  # sum of the element-wise product
        print()

[[-2 -2]
 [ 4 -2]]


-286

136

558

604

-154

-664

-346

-434

134

314

-504

-758

-602

-670

-18

-562

60

272

-654

-536

-918

90

-624

-158

-424



Let's collect these sums into a new matrix $Z$, storing each sum at the index where its window starts.
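Written as a formula, with $X^{\text{pad}}$ denoting the zero-padded matrix and all indices starting at 0 (notation introduced here for illustration), each entry of $Z$ for a $2 \times 2$ kernel is:

$$
Z_{i,j} = \sum_{a=0}^{1} \sum_{b=0}^{1} K_{a,b} \, X^{\text{pad}}_{i+a,\, j+b}, \qquad 0 \le i, j \le 4.
$$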

In [62]:
Z = np.zeros_like(X)  # the output has the same shape as the input

K = np.random.randint(low=-5, high=5, size=(2,2))
print('Kernel:')
print(K)
print()
print()
n_rows, n_cols = X_pad.shape
for y_i in range(n_rows - 2):
    for x_i in range(n_cols - 2):
        S = X_pad[y_i: y_i+2, x_i: x_i+2]
        Z[y_i, x_i] = (K*S).sum()  # one number per window
print('Convolved:')
Z

Kernel:
[[ 0  1]
 [ 4 -1]]


Convolved:


array([[-143,  354,  715,  616,  -53],
       [ -46,  784,  695,  692,  787],
       [ 126,  316,  585,  585,  865],
       [-155,  781,  978,  250,   99],
       [ -23, 1179,  -29,  210,  443]])

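Done. $Z$ is the result of sliding $K$ over the zero-padded $X$. Because $K$ is applied without being flipped, this is strictly a cross-correlation, which is the operation deep learning libraries usually call convolution.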
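For reference, the steps above can be collected into a function and cross-checked against a vectorized version. This is a sketch under assumptions rather than the notebook's own code: the function names `scan_loop` and `scan_windows` are hypothetical, the kernel is assumed to be $2 \times 2$ with one layer of zero padding as above, and `sliding_window_view` requires NumPy 1.20 or newer. The arrays `X` and `K` are copied from the outputs shown above.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view


def scan_loop(X, K):
    """Hypothetical helper: explicit double loop, as in the cells above."""
    X_pad = np.pad(X, 1, mode="constant")  # one layer of zeros on every side
    n_rows, n_cols = X.shape               # output keeps the input's shape
    Z = np.zeros_like(X)
    for y_i in range(n_rows):
        for x_i in range(n_cols):
            S = X_pad[y_i:y_i + 2, x_i:x_i + 2]  # 2x2 window starting at (y_i, x_i)
            Z[y_i, x_i] = (K * S).sum()
    return Z


def scan_windows(X, K):
    """Hypothetical helper: same result without Python-level loops."""
    X_pad = np.pad(X, 1, mode="constant")
    windows = sliding_window_view(X_pad, (2, 2))  # every 2x2 window of X_pad
    windows = windows[:X.shape[0], :X.shape[1]]   # same start positions as the loop
    return (windows * K).sum(axis=(-2, -1))       # sum over each window's entries


X = np.array([[143, 218, 157,  12, 101],
              [189, 190, 222, 208, 146],
              [ 63, 126, 141, 187,  29],
              [218, 217,  31,  61, 174],
              [241,   2,  68, 123, 223]])
K = np.array([[0,  1],
              [4, -1]])

Z_loop = scan_loop(X, K)
Z_windows = scan_windows(X, K)

print(np.array_equal(Z_loop, Z_windows))  # True
print(Z_loop[0].tolist())                 # [-143, 354, 715, 616, -53]
```

The loop version mirrors the cells above step by step. The windowed version builds all windows as a view and lets broadcasting do the multiplication, which tends to be faster on larger arrays.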