<br><br>
<font size='6'><b>Convolutional Neural Networks (CNN)
</b></font><br><br>

<br>
<div class=pull-right>
By Prof. Seungchul Lee<br>
http://iai.postech.ac.kr/<br>
Industrial AI Lab at POSTECH
</div>


# 1. Convolution

## 1.1. 1D Convolution

<br>
<center><img src="./image_files/1d_conv.png" width = 500></center>
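
As a small illustration of 1D convolution, the sketch below slides a kernel over a signal and compares the result with NumPy's `np.convolve`. The signal `x` and kernel `w` are illustrative values chosen here, not the ones in the figure, and `conv1d_valid` is a hypothetical helper name.

```python
import numpy as np

def conv1d_valid(x, w):
    """Slide the flipped kernel w over x and take a dot product at each position."""
    x = np.asarray(x, dtype=float)
    w = np.asarray(w, dtype=float)
    k = len(w)
    w_flipped = w[::-1]  # mathematical convolution flips the kernel
    return np.array([np.dot(x[i:i + k], w_flipped)
                     for i in range(len(x) - k + 1)])

x = np.array([1, 3, -1, 1, -3, 2])   # illustrative signal (assumption)
w = np.array([1, 0, -1])             # illustrative kernel (assumption)

y_manual = conv1d_valid(x, w)
y_numpy = np.convolve(x, w, mode="valid")

print(y_manual)   # [-2. -2. -2.  1.]
print(y_numpy)    # [-2 -2 -2  1]
```

Note that deep learning frameworks usually skip the kernel flip (cross-correlation) and still call the operation convolution. Since the kernel is learned, the distinction does not matter in practice.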


## 1.2. Convolution on Image (= Convolution in 2D)

__Filter (or Kernel)__
- Modify or enhance an image by filtering
- Filter images to emphasize certain features or remove other features
- Filtering includes smoothing, sharpening and edge enhancement

- Discrete convolution can be viewed as sliding the kernel over the image: at each position, multiply the kernel element-wise with the underlying image patch and sum the result

<br>
<center><img src="./image_files/conv_animation.gif" width = 350></center>

<center><img src="./image_files/conv_lena.png" width = 800></center>
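
The filtering step can be sketched in a few lines of NumPy. This is a minimal version assuming no padding and a stride of 1. The toy image, the box-blur kernel and the vertical-edge kernel are illustrative choices, and `conv2d_valid` is a hypothetical helper name.

```python
import numpy as np

def conv2d_valid(image, kernel):
    """Sliding-window filter (no padding, stride 1).

    Uses the deep learning convention: the kernel is not flipped.
    """
    H, W = image.shape
    kh, kw = kernel.shape
    out = np.zeros((H - kh + 1, W - kw + 1))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = image[i:i + kh, j:j + kw]
            out[i, j] = np.sum(patch * kernel)  # element-wise product, then sum
    return out

# Toy image: dark left half, bright right half (a vertical edge)
image = np.zeros((6, 6))
image[:, 3:] = 1.0

smooth = np.ones((3, 3)) / 9.0                # box blur (smoothing)
edge = np.array([[-1, 0, 1],
                 [-1, 0, 1],
                 [-1, 0, 1]], dtype=float)    # responds to vertical edges

smooth_out = conv2d_valid(image, smooth)
edge_out = conv2d_valid(image, edge)

print(smooth_out[0])   # the sharp step becomes a gradual ramp
print(edge_out[0])     # [0. 3. 3. 0.]: nonzero only around the edge
```

The smoothing kernel spreads the step over several pixels, while the edge kernel responds only where the intensity changes.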



__How to find the right Kernels__


- So far, we have seen many hand-designed kernels, each producing a specific effect on images

- Let’s take the opposite approach

- Instead of designing the kernel, we learn it from data

- A deep learning framework can learn the feature extractor from data
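
To make "learning the kernel from data" concrete, the sketch below recovers a $3\times3$ kernel from an input image and its filtered output. It uses ordinary least squares instead of the gradient-based training a deep learning framework would use, so it only illustrates the idea. The random image, the Sobel-like `true_kernel` and the helper `extract_patches` are assumptions made for this example.

```python
import numpy as np

rng = np.random.default_rng(0)

def extract_patches(image, k):
    """Return every k x k patch of the image as one row of a matrix."""
    H, W = image.shape
    rows = [image[i:i + k, j:j + k].ravel()
            for i in range(H - k + 1)
            for j in range(W - k + 1)]
    return np.array(rows)

# Kernel that generated the data; treated as unknown by the fitting step
true_kernel = np.array([[-1, 0, 1],
                        [-2, 0, 2],
                        [-1, 0, 1]], dtype=float)

image = rng.normal(size=(20, 20))     # random input image
X = extract_patches(image, 3)         # shape (18 * 18, 9)
y = X @ true_kernel.ravel()           # filtered output, one value per patch

# Fit the 9 kernel weights that best map patches to outputs
w, *_ = np.linalg.lstsq(X, y, rcond=None)
learned_kernel = w.reshape(3, 3)

print(np.round(learned_kernel, 3))
```

Given enough input and output pairs, the fitted weights match the kernel that produced the data. No one had to design it by hand.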

# 2. Convolutional Neural Networks (CNN)


## 2.1. Motivation: Learning Visual Features

<br>
The bird occupies a local area and looks the same in different parts of an image. We should construct neural networks which exploit these properties.

<br><br>
<center><img src="./image_files/cnn_motivation.png" width = 800></center>

- ANN structure for object detection in an image

    - does not seem the best
    - does not make use of the fact that we are dealing with images
    - Spatial organization of the input is destroyed by flattening

<br>
<center><img src="./image_files/bird_ANN.png" width = 600></center>


<br>

- __Locality__: objects tend to have a local spatial support
    - fully connected layer $\rightarrow$ locally connected layer

<br>
<center><img src="./image_files/cnn_locality.png" width = 600></center>
<br>

- __Translation invariance__: object appearance is independent of location
    - Weight sharing: units connected to different locations have the same weights
    - We are not designing the kernel, but are learning the kernel from data
    - _i.e._, we are learning a visual feature extractor from data

<br>
<center><img src="./image_files/translation_invariance.png" width = 600></center>
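
The effect of locality and weight sharing on model size can be checked with simple arithmetic. The sketch below counts weights (biases omitted) for one layer mapping a single-channel image to a single output map. It assumes a $28\times28$ input, a $3\times3$ receptive field, a stride of 1 and no padding. `count_weights` is a hypothetical helper.

```python
def count_weights(input_hw=(28, 28), kernel=3):
    """Number of weights for three ways of connecting input to output."""
    H, W = input_hw
    out_h, out_w = H - kernel + 1, W - kernel + 1   # stride 1, no padding
    n_in, n_out = H * W, out_h * out_w
    return {
        "fully connected": n_in * n_out,           # every output sees every pixel
        "locally connected": n_out * kernel ** 2,  # each output sees its own k x k patch
        "convolutional": kernel ** 2,              # one k x k kernel shared by all outputs
    }

for name, n in count_weights().items():
    print(f"{name:>18}: {n:,} weights")
# fully connected: 529,984 / locally connected: 6,084 / convolutional: 9
```

Locality removes most of the connections, and weight sharing then reduces what remains to a single small kernel.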

## 2.2. Convolutional Operator

__Convolution of CNN__

- Local connectivity
- Weight sharing
- Typically have sparse interactions

- Convolutional Neural Networks
    - Simply neural networks that use the convolution in place of general matrix multiplication in at least one of their layers
   

- Multiple channels

<br>
<center><img src="./image_files/mult_channel_01.png" width = 400></center>


- Multiple kernels

<br>
<center><img src="./image_files/multi_kernels.png" width = 400></center>
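
With multiple channels and multiple kernels, each kernel spans all input channels and produces one output channel. The NumPy sketch below shows the resulting shapes. It assumes a channels-last layout `(H, W, C_in)` and a kernel tensor of shape `(k, k, C_in, C_out)`. The sizes are illustrative and `conv_layer` is a hypothetical helper.

```python
import numpy as np

def conv_layer(x, kernels):
    """x: (H, W, C_in), kernels: (k, k, C_in, C_out) -> (H-k+1, W-k+1, C_out)."""
    H, W, C_in = x.shape
    k, _, k_cin, C_out = kernels.shape
    assert C_in == k_cin, "kernel depth must match the number of input channels"
    out = np.zeros((H - k + 1, W - k + 1, C_out))
    for i in range(out.shape[0]):
        for j in range(out.shape[1]):
            patch = x[i:i + k, j:j + k, :]   # (k, k, C_in)
            # Sum over height, width and channels for each of the C_out kernels
            out[i, j, :] = np.tensordot(patch, kernels, axes=([0, 1, 2], [0, 1, 2]))
    return out

rng = np.random.default_rng(0)
x = rng.normal(size=(8, 8, 3))            # e.g. an 8 x 8 image with 3 channels
kernels = rng.normal(size=(3, 3, 3, 4))   # four 3 x 3 kernels, each spanning 3 channels

y = conv_layer(x, kernels)
print(y.shape)   # (6, 6, 4)
```

The number of output channels equals the number of kernels, regardless of how many channels the input has.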



## 2.3. Stride and Padding

- Strides: increment step size for the convolution operator
    - Reduces the size of the output map


- No strides (stride of 1) and no padding

<br>
<center><img src="./image_files/no_padding_no_strides.gif" width = 300></center>

- Stride example with kernel size 3×3 and a stride of 2

<br>
<center><img src="./image_files/stride_example.gif" width = 300></center>


- Padding: artificially fill borders of image
    - Useful to keep spatial dimension constant across filters
    - Useful with strides and large receptive fields
    - Usually fill with 0s

<br>
<center><img src="./image_files/same_padding_no_strides.gif" width = 300></center>
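
The combined effect of stride and padding on the output size follows the standard formula $\left\lfloor \frac{n + 2p - k}{s} \right\rfloor + 1$ for input size $n$, kernel size $k$, padding $p$ and stride $s$ along one axis. A small sketch follows. The input sizes are illustrative and may differ from those in the animations, and `conv_output_size` is a hypothetical helper.

```python
def conv_output_size(n, k, stride=1, padding=0):
    """Output length along one axis for input size n and kernel size k."""
    return (n + 2 * padding - k) // stride + 1

print(conv_output_size(4, 3))                        # 2: stride 1, no padding
print(conv_output_size(5, 3, stride=2))              # 2: stride 2 shrinks the map
print(conv_output_size(5, 3, padding=1))             # 5: padding 1 keeps the size
print(conv_output_size(28, 3, stride=2, padding=1))  # 14
```

With a $3\times3$ kernel and a stride of 1, a padding of 1 keeps the spatial size unchanged.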


## 2.4. Nonlinear Activation Function

<br>
<center><img src="./image_files/ReLU.png" width = 500></center>
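
ReLU, shown in the figure, keeps positive values and sets negative values to zero, element by element. A minimal NumPy sketch with an illustrative feature map:

```python
import numpy as np

def relu(x):
    """Rectified linear unit: max(0, x), applied element-wise."""
    return np.maximum(0, x)

feature_map = np.array([[-2.0, 0.5],
                        [ 3.0, -0.1]])   # illustrative values (assumption)
activated = relu(feature_map)
print(activated)   # negative entries become 0, positive entries are unchanged
```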

## 2.5. Pooling
- Compute a maximum value in a sliding window (max pooling)
    - Reduce spatial resolution for faster computation
    - Achieve invariance to any permutation inside one of the cells

<br>
<center><img src="./image_files/pooling_invariance.png" width = 700></center>

- Pooling size : $2\times2$ for example


<center><img src="./image_files/Max_pooling_image.png" width = 500></center>
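
Max pooling with a $2\times2$ window can be sketched with a reshape. This assumes non-overlapping windows and an input whose height and width are divisible by the pooling size. The input values are illustrative and `max_pool2d` is a hypothetical helper.

```python
import numpy as np

def max_pool2d(x, size=2):
    """Non-overlapping max pooling; assumes H and W are divisible by size."""
    H, W = x.shape
    blocks = x.reshape(H // size, size, W // size, size)
    return blocks.max(axis=(1, 3))   # maximum inside each size x size cell

x = np.array([[1, 3, 2, 1],
              [4, 2, 0, 1],
              [5, 1, 7, 2],
              [0, 6, 3, 8]])
pooled = max_pool2d(x)
print(pooled)
# [[4 2]
#  [6 8]]

# Shuffle the four values inside the top-left cell: the output does not change
x_perm = x.copy()
x_perm[0:2, 0:2] = [[2, 4], [3, 1]]
print(np.array_equal(pooled, max_pool2d(x_perm)))   # True
```

The second check illustrates the invariance to permutations within a cell: only the maximum of each cell survives, not its position.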


## 2.6. CNN for Classification 

- CONV and POOL layers output high-level features of input
- Fully connected layer uses these features for classifying input image
- Express output as probability of image belonging to a particular class


<br>
<center><img src="./image_files/cnn_clf.png" width = 900></center>
<br>
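
Class probabilities are commonly obtained by applying a softmax to the scores of the last fully connected layer. The sketch below assumes that choice. The `logits` values are hypothetical scores for three classes.

```python
import numpy as np

def softmax(logits):
    """Turn raw class scores into probabilities that sum to 1."""
    z = logits - np.max(logits, axis=-1, keepdims=True)  # subtract max for numerical stability
    e = np.exp(z)
    return e / np.sum(e, axis=-1, keepdims=True)

logits = np.array([2.0, 1.0, 0.1])   # hypothetical scores for 3 classes
probs = softmax(logits)

print(probs, probs.sum())   # non-negative values that sum to 1
print(np.argmax(probs))     # index of the predicted class
```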

# 3. Lab: CNN with TensorFlow (MNIST)


- MNIST example 
- To classify handwritten digits

<br>
<center><img src="./image_files/CNN_arch.png" width = 900></center>
<br>

## 3.1. Training
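
A minimal training sketch with `tf.keras` is given below as one possible way to obtain the `model` and `test_x` used in the evaluation step. The number of layers, filter counts, optimizer, batch size and number of epochs are assumptions and are not taken from the architecture figure. `build_model` and `train_mnist` are hypothetical helper names.

```python
import numpy as np
import tensorflow as tf

def build_model(input_shape=(28, 28, 1), n_classes=10):
    """Small CNN: two CONV + POOL stages followed by fully connected layers."""
    return tf.keras.Sequential([
        tf.keras.Input(shape=input_shape),
        tf.keras.layers.Conv2D(32, (3, 3), padding='same', activation='relu'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Conv2D(64, (3, 3), padding='same', activation='relu'),
        tf.keras.layers.MaxPooling2D((2, 2)),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(128, activation='relu'),
        tf.keras.layers.Dense(n_classes, activation='softmax'),
    ])

def train_mnist(epochs=1, batch_size=128):
    """Download MNIST, train the model, and return it with the test set."""
    (train_x, train_y), (test_x, test_y) = tf.keras.datasets.mnist.load_data()
    # Scale pixels to [0, 1] and add a channel axis: (N, 28, 28) -> (N, 28, 28, 1)
    train_x = (train_x / 255.0).astype('float32')[..., np.newaxis]
    test_x = (test_x / 255.0).astype('float32')[..., np.newaxis]

    model = build_model()
    model.compile(optimizer='adam',
                  loss='sparse_categorical_crossentropy',  # labels are integers 0..9
                  metrics=['accuracy'])
    model.fit(train_x, train_y, batch_size=batch_size, epochs=epochs)
    model.evaluate(test_x, test_y)
    return model, test_x, test_y
```

Calling `model, test_x, test_y = train_mnist()` downloads the dataset on first use and trains for one epoch.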

## 3.2. Testing or Evaluating

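```python
import numpy as np
import matplotlib.pyplot as plt

# `model` and `test_x` come from the training step in Section 3.1
# Pick one random test image; indexing with an array keeps the batch dimension
test_img = test_x[np.random.choice(test_x.shape[0], 1)]

# Model output for this image, and the digit with the highest score
predict = model.predict_on_batch(test_img)
mypred = np.argmax(predict, axis = 1)

plt.figure(figsize = (12,5))

# Left: input image, right: model output for each digit
plt.subplot(1,2,1)
plt.imshow(test_img.reshape(28, 28), cmap = 'gray')
plt.axis('off')
plt.subplot(1,2,2)
plt.stem(predict[0])
plt.show()

print('Prediction : {}'.format(mypred[0]))
```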

# 4. Lab: CNN with TensorFlow (Steel Surface Defects)

- NEU steel surface defects example 
- To classify defect images into 6 classes

<br>
<center><img src="./image_files/NEU.jpg" width = 700></center>
<br>

Download [NEU steel surface defects](http://faculty.neu.edu.cn/yunhyan/NEU_surface_defect_database.html) images and labels 

- [NEU train images](https://www.dropbox.com/s/5fcdf9zfj95dztt/NEU_train_imgs.npy?dl=1)
- [NEU train labels](https://www.dropbox.com/s/0sy8nd8auwrt43m/NEU_train_labels.npy?dl=1)
- [NEU test images](https://www.dropbox.com/s/znjylp2hwnro2j6/NEU_test_imgs.npy?dl=1)
- [NEU test labels](https://www.dropbox.com/s/rm18trt9lr32bxb/NEU_test_labels.npy?dl=1)

## 4.1. Training

## 4.2. Testing or Evaluating

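```python
import numpy as np
import matplotlib.pyplot as plt

# `model`, `test_x` and `test_y` come from the training step in Section 4.1
# Class names, indexed by label
name = ['scratches', 'rolled-in scale', 'pitted surface', 'patches', 'inclusion', 'crazing']

# Pick one random test image together with its ground-truth label
idx = np.random.choice(test_x.shape[0], 1)
test_img = test_x[idx]
GT = test_y[idx]

# Model output for this image, and the class with the highest score
predict = model.predict_on_batch(test_img)
mypred = np.argmax(predict, axis = 1)

plt.figure(figsize = (12,5))

# Left: input image, right: model output for each class
plt.subplot(1,2,1)
plt.imshow(test_img.reshape(200, 200), cmap = 'gray')
plt.axis('off')
plt.subplot(1,2,2)
plt.stem(predict[0])
plt.show()

print('Prediction : {}'.format(name[mypred[0]]))
print('Ground Truth : {}'.format(name[GT[0]]))
```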
