In [1]:
import numpy as np

## creating the image, where h is the image
h = [2,1,0]

## creating the kernel, where x is the kernel
x = [3,4,5]

In [3]:
y = np.convolve(x,h)
print(y)

[ 6 11 14  5  0]


In [4]:
print("Compare with the following values from Python: y[0] = {0} ; y[1] = {1}; y[2] = {2}; y[3] = {3}; y[4] = {4}".format(y[0],y[1],y[2],y[3],y[4])) 

Compare with the following values from Python: y[0] = 6 ; y[1] = 11; y[2] = 14; y[3] = 5; y[4] = 0
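
The values above follow from the definition of discrete convolution, y[n] = sum over k of x[k] * h[n - k]. As a minimal sketch, the sum can be evaluated with explicit loops; the helper name `convolve_full` is made up for this illustration and is not part of NumPy.

```python
import numpy as np

def convolve_full(x, h):
    """Full 1D convolution by direct summation (illustrative helper)."""
    y = [0] * (len(x) + len(h) - 1)
    for i, xi in enumerate(x):
        for j, hj in enumerate(h):
            # the product x[i] * h[j] contributes to output index i + j
            y[i + j] += xi * hj
    return y

x = [3, 4, 5]
h = [2, 1, 0]
y_manual = convolve_full(x, h)
print(y_manual)            # [6, 11, 14, 5, 0]
print(np.convolve(x, h))   # NumPy gives the same values
```

For example, y[1] = x[0]*h[1] + x[1]*h[0] = 3*1 + 4*2 = 11.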


There are 3 ways to apply the kernel: with padding ("full"), with padding ("same"), and without padding ("valid").


# 1) Visually understanding the operation with padding (full)

Let's think of the kernel as a sliding window. To apply it at the edges, we pad the input array with zeros. This is a very common implementation, and it is easiest to show how it works with a simple example. Consider this case:



x[i] = [6,2]
h[i] = [1,2,5,4]

In [7]:
x = [6,2]
h = [1,2,5,4]

In [12]:
import numpy as np
y = np.convolve(x, h, "full") #now, because of the zero padding, the final dimension of the array is bigger
y

array([ 6, 14, 34, 34,  8])
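
The sliding-window view described above can be sketched step by step. This is only an illustration under the assumption that x plays the role of the kernel: x is flipped, h is padded with len(x) - 1 zeros on each side, and each window is multiplied element-wise with the flipped kernel and summed. The variable names are chosen for this example.

```python
import numpy as np

x = np.array([6, 2])          # kernel
h = np.array([1, 2, 5, 4])    # input array

pad = len(x) - 1
h_padded = np.pad(h, pad)     # zeros on both sides: [0 1 2 5 4 0]
x_flipped = x[::-1]           # [2 6]

steps = []
for i in range(len(h_padded) - len(x) + 1):
    window = h_padded[i:i + len(x)]
    steps.append(int(np.dot(window, x_flipped)))
    print(window, "*", x_flipped, "->", steps[-1])

print(steps)                  # [6, 14, 34, 34, 8]
```

The first and last steps are the ones that touch a padded zero, which is why the "full" output is longer than h.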

# 2) Visually understanding the operation with "same"

In this approach, we only add the zero on the left (and on the top of the matrix in 2D). That is, we keep only the first 4 steps of the "full" method:

In [14]:
y = np.convolve(x, h, "same") #it also uses zero padding, but the output has the same length as the longer input
y

array([ 6, 14, 34, 34])
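
One way to see where the "same" output comes from is to treat it as a slice of the "full" result. The sketch below assumes the slice starts at offset (min(M, N) - 1) // 2, which reproduces NumPy's result for this example; the variable names are illustrative.

```python
import numpy as np

x = [6, 2]
h = [1, 2, 5, 4]

full = np.convolve(x, h, "full")          # length M + N - 1 = 5
out_len = max(len(x), len(h))             # "same" keeps the longer length: 4
start = (min(len(x), len(h)) - 1) // 2    # assumed offset; 0 for this example
same_from_full = full[start:start + out_len]

print(same_from_full)                     # [ 6 14 34 34]
print(np.convolve(x, h, "same"))          # [ 6 14 34 34]
```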

# 3) Visually understanding the operation with no padding (valid)

In the last case, we only apply the kernel where it fully overlaps the h array. In some cases you want a dimensionality reduction; for this purpose, we simply ignore the steps that would need padding:

x[i] = [6 2] h[i] = [1 2 5 4]

You have to invert (flip) the filter x; otherwise the operation would be cross-correlation. First step (now without zero padding):

In [18]:
y = np.convolve(x, h, "valid") #we will understand why we used the argument valid in the next example
y

array([14, 34, 34])
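
The remark about inverting the filter can be checked numerically. As a small sketch, `np.correlate` slides the filter without flipping it, so flipping x by hand before correlating should give the "valid" convolution, while skipping the flip gives a different result (the cross-correlation).

```python
import numpy as np

x = np.array([6, 2])
h = np.array([1, 2, 5, 4])

conv_valid = np.convolve(h, x, "valid")            # convolution flips x internally
corr_flipped = np.correlate(h, x[::-1], "valid")   # flip x by hand, then slide
corr_unflipped = np.correlate(h, x, "valid")       # no flip: cross-correlation

print(conv_valid)       # [14 34 34]
print(corr_flipped)     # [14 34 34]
print(corr_unflipped)   # [10 22 38]
```

The output has max(M, N) - min(M, N) + 1 = 4 - 2 + 1 = 3 elements, one for each position where x fits entirely inside h.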

# Convolution: 2D operation with Python (Numpy/Scipy)

In [22]:
from scipy import signal as sg
I = [[255,7,3],
    [212,240,4],
    [218,216,230]]
g = [[-1,1]]
print("without zero padding \n")
print('{0} \n'.format(sg.convolve(I, g, 'valid')))
# The 'valid' argument states that the output consists only of those elements 
# that do not rely on the zero-padding.
print("with zero padding \n")
print(sg.convolve(I,g))

without zero padding 

[[248   4]
 [-28 236]
 [  2 -14]] 

with zero padding 

[[-255  248    4    3]
 [-212  -28  236    4]
 [-218    2  -14  230]]


In [23]:
print("with zero padding ('full') \n")
print("{0} \n".format(sg.convolve(I, g, 'full')))
print("default mode (also 'full') \n")
print(sg.convolve(I,g))

with zero padding ('full') 

[[-255  248    4    3]
 [-212  -28  236    4]
 [-218    2  -14  230]] 

default mode (also 'full') 

[[-255  248    4    3]
 [-212  -28  236    4]
 [-218    2  -14  230]]


In [24]:
print("with zero padding ('same') \n")
print("{0} \n".format(sg.convolve(I, g, 'same')))
print("default mode ('full') \n")
print(sg.convolve(I,g))

with zero padding ('same') 

[[-255  248    4]
 [-212  -28  236]
 [-218    2  -14]] 

default mode ('full') 

[[-255  248    4    3]
 [-212  -28  236    4]
 [-218    2  -14  230]]


In [26]:
from scipy import signal as sg
I = [[255,7,3],
    [212,240,4],
    [218,216,230]]
g =[[-1,1],
  [2,3]]

In [30]:
# 'full': the output is the full discrete linear convolution of the inputs. 
# It will use zero to complete the input matrix

print('with zero padding (full) \n')
print('{0} \n'.format(sg.convolve(I, g, 'full')))

# 'same': the output has the same size as the first input, taken from the 'full' result. 
# It still uses zero to complete the input matrix
print('with zero padding (same) \n')
print('{0} \n'.format(sg.convolve(I, g, 'same')))

# The 'valid' argument states that the output consists only of those elements 
# that do not rely on the zero-padding.
print('without zero padding \n')
print(sg.convolve(I, g, 'valid'))

with zero padding (full) 

[[-255  248    4    3]
 [ 298  751  263   13]
 [ 206 1118  714  242]
 [ 436 1086 1108  690]] 

with zero padding (same) 

[[-255  248    4]
 [ 298  751  263]
 [ 206 1118  714]] 

without zero padding 

[[ 751  263]
 [1118  714]]
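
To make the 'valid' result less of a black box, here is a minimal sketch that computes it by hand with NumPy only: the kernel is flipped along both axes and then moved over every position where it fits entirely inside I. The variable names (`g_flipped`, `patch`, `valid`) are chosen for this illustration.

```python
import numpy as np

I = np.array([[255, 7, 3],
              [212, 240, 4],
              [218, 216, 230]])
g = np.array([[-1, 1],
              [2, 3]])

g_flipped = g[::-1, ::-1]             # flip rows and columns: [[3, 2], [1, -1]]
kh, kw = g.shape
out_h = I.shape[0] - kh + 1           # 3 - 2 + 1 = 2
out_w = I.shape[1] - kw + 1           # 3 - 2 + 1 = 2

valid = np.zeros((out_h, out_w), dtype=I.dtype)
for r in range(out_h):
    for c in range(out_w):
        patch = I[r:r + kh, c:c + kw]             # region under the kernel
        valid[r, c] = np.sum(patch * g_flipped)   # multiply element-wise and sum

print(valid)
# [[ 751  263]
#  [1118  714]]
```

For the top-left element: 255*3 + 7*2 + 212*1 + 240*(-1) = 751.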


# Coding with Tensorflow 

NumPy is great because it has highly optimized matrix operations implemented in a C/C++ backend. However, if our goal is to work with deep learning, we need much more. TensorFlow does the same work, but instead of returning to Python every time, it builds all the operations into a graph and executes them at once on its highly optimized backend.

Suppose that you have two tensors:

- 3x3 filter (4D tensor = [3,3,1,1] = [height, width, input channels, number of filters])
- 10x10 image (4D tensor = [1,10,10,1] = [batch size, height, width, number of channels])

The output size for zero padding 'SAME' mode will be:

- the same as input = 10x10

The output size without zero padding 'VALID' mode:

- input size - kernel dimension + 1 = 10 - 3 + 1 = 8, i.e. 8x8
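
The size arithmetic above can be wrapped in a small function. This is a sketch only: `conv_output_size` is a hypothetical helper, not a TensorFlow function, and the `stride` parameter is an addition that follows the rounding convention TensorFlow documents for its padding modes (the text above only covers stride 1).

```python
import math

def conv_output_size(input_size, kernel_size, padding, stride=1):
    """Output length along one spatial dimension (hypothetical helper)."""
    if padding == "SAME":
        # zero padding keeps the size (divided by the stride, rounded up)
        return math.ceil(input_size / stride)
    if padding == "VALID":
        # only positions where the kernel fits entirely inside the input
        return math.ceil((input_size - kernel_size + 1) / stride)
    raise ValueError("padding must be 'SAME' or 'VALID'")

print(conv_output_size(10, 3, "SAME"))    # 10
print(conv_output_size(10, 3, "VALID"))   # 8
```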

In [31]:
import tensorflow as tf

In [32]:
## now build the graph
input = tf.Variable(tf.random_normal([1,10,10,1]))
filter = tf.Variable(tf.random_normal([3,3,1,1]))
op = tf.nn.conv2d(input, filter, strides=[1,1,1,1], padding = 'VALID')
op2 = tf.nn.conv2d(input, filter, strides=[1,1,1,1], padding = 'SAME')
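
The shapes produced by the two ops can be reproduced with NumPy alone. The sketch below is not TensorFlow code and makes two assumptions: that the convolution slides the filter without flipping it (TensorFlow's documentation describes its convolution ops as cross-correlation), and that for a 3x3 filter with stride 1 the 'SAME' mode corresponds to one ring of zeros around the image. `correlate2d_valid` is an illustrative helper, and the 2D arrays stand in for the 4D tensors.

```python
import numpy as np

def correlate2d_valid(image, kernel):
    """Slide the kernel over the image without flipping it (illustrative helper)."""
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.empty((out_h, out_w))
    for r in range(out_h):
        for c in range(out_w):
            out[r, c] = np.sum(image[r:r + kh, c:c + kw] * kernel)
    return out

rng = np.random.default_rng(0)
image = rng.standard_normal((10, 10))    # stands in for the [1,10,10,1] tensor
kernel = rng.standard_normal((3, 3))     # stands in for the [3,3,1,1] tensor

valid_map = correlate2d_valid(image, kernel)
# one ring of zeros around the image, then the same sliding window
same_map = correlate2d_valid(np.pad(image, 1), kernel)

print(valid_map.shape)   # (8, 8)
print(same_map.shape)    # (10, 10)
```

The interior of the padded result, `same_map[1:-1, 1:-1]`, matches `valid_map`, because those positions never touch the padded zeros.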

In [38]:
## initialization and session 
init = tf.global_variables_initializer()
with tf.Session() as sess:
    sess.run(init)
    print("Input \n")
    print('{0} \n'.format(input.eval()))
    print("Filter/Kernel \n")
    print('{0} \n'.format(filter.eval()))
    print("Result/Feature Map with valid positions \n")
    result = sess.run(op)
    print(result)
    print('\n')
    print("Result/Feature Map with padding \n")
    result2 = sess.run(op2)
    print(result2)

Input 

[[[[-1.38936436]
   [-0.00345891]
   [-1.09597671]
   [-1.77963614]
   [ 1.08780801]
   [-1.49857414]
   [ 2.1747942 ]
   [ 0.19542922]
   [ 1.04823709]
   [-0.17813747]]

  [[-1.43401778]
   [-2.04771852]
   [ 0.24121429]
   [ 1.23904002]
   [-0.34568408]
   [-1.79317725]
   [-0.46358258]
   [ 0.36715981]
   [-1.37458265]
   [ 1.02449059]]

  [[ 0.77515209]
   [-1.00133491]
   [ 0.10918215]
   [-1.51642907]
   [-0.83707547]
   [ 1.77123642]
   [ 0.3847937 ]
   [-1.01701093]
   [ 2.03626585]
   [ 1.26673007]]

  [[ 0.05478265]
   [-0.49241954]
   [ 1.39341569]
   [-0.03377312]
   [ 0.46627158]
   [-1.62419963]
   [ 0.4087334 ]
   [ 0.76176333]
   [-0.57188118]
   [ 0.44882923]]

  [[-0.68774509]
   [-0.26597512]
   [ 1.8318578 ]
   [-0.69608659]
   [-1.07974589]
   [-0.68666738]
   [-0.69825584]
   [ 1.0078733 ]
   [-0.88290262]
   [-0.55823547]]

  [[ 0.48838812]
   [ 1.99886167]
   [-0.46106985]
   [ 1.78463745]
   [-0.72148538]
   [ 1.05484653]
   [ 0.27525723]
   [ 3.030698

# Convolution applied on images


Upload your own image (drag and drop it into this window) and type its name in the input field in the next cell (press Shift + Enter). The result of this pre-processing will be an image with only a greyscale channel.

You can type bird.jpg to use a default image.

In [43]:
## downloading the standard image (requires wget to be available on the PATH)
!wget --quiet https://ibm.box.com/shared/static/cn7yt7z10j8rx6um1v9seagpgmzzxnlz.jpg --output-document bird.jpg


'wget' is not recognized as an internal or external command,
operable program or batch file.
