# Python Tutorial for ECE4880J (Coding part)

This tutorial was written by Yueyuan for ECE4880J at SJTU-UMJI. It uses Python 3 by default.

## Introduction

In this tutorial, we will cover:

- Basic Python
- Numpy $\to$ (extend) $\to$ Scipy
- Matplotlib $\leftarrow$ (compare) $\rightarrow$ OpenCV, PIL
- Pytorch

### About Jupyter Notebook

Jupyter is a web-based interactive development environment. The two basic components of a Jupyter notebook are code blocks and text blocks. In a code block, you can execute code in over 40 programming languages, including Python, R, Julia, and Scala. In a text block, you can produce rich-text content, including HTML, LaTeX, and Markdown. The modular design of Jupyter Notebook allows extensions to expand and enrich its functionality.

For more details about Jupyter Notebook, you can visit its official documentation [here](https://docs.jupyter.org/en/latest/).

#### Checklist

- Select Python path.
- Add, delete, and move the blocks.
- Clear output.
- Restart kernel.

### Routine checkup before running Python

Before you run a new Python script for the first time, I strongly suggest checking your current working directory, the Python version, and the package manager version.

```![command]``` means that the line is a Terminal command.

In [None]:
!pwd

In [None]:
!python --version
!pip --version

Before you run open-source code, always remember to install its required packages first.

```shell
pip install -r requirements.txt
```

In [None]:
# Run this block to install all the packages we need for this tutorial
!pip install numpy
!pip install pandas
!pip install scipy
!pip install matplotlib
!pip install pillow
!pip install opencv-python
!pip install torch
!pip install torchvision
# in case you use conda
# !conda install numpy
# !conda install pandas
# !conda install scipy
# !conda install matplotlib
# !conda install pillow
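# !conda install opencv       # the conda package is named "opencv", not "opencv-python"
# !conda install pytorch      # the conda package is named "pytorch", not "torch"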
# !conda install torchvision

## Basic Python (checklist)

- Indentation matters in Python: it defines code blocks.
- You do not have to declare the data type of a variable before using it.

In [None]:
# I do not encourage you to assign a value to a variable like this, but it works in Python.
def declare_variable(input=None):
    if input == 1:
        a = 1
    else:
        a = 0
    print(a)

declare_variable(1)
declare_variable()

- The data type of a variable can be changed at any time.

In [None]:
a = int(1)
print(a, end=", ")
a = "a"
print(a, end=", ")
a = float(1)
print(a)

- In a Python list, the elements do not have to share the same data type.

In [None]:
b = [int(42), float(3.14), "Matt Bomber", {"wagtail"}, {1: "2"}]
b

- Check an instance's data type with ```isinstance```

In [None]:
print(isinstance(b, list))

- Assignment statements in Python do not copy objects; they create bindings between a target and an object.

  **Be careful when you copy a variable!**

  (I only demo ```copy()```. Please check ```deepcopy()``` [here](https://docs.python.org/3/library/copy.html) by yourself.)

In [None]:
# case 1
list1 = [1,2,3,4,5]
list2 = list1
list2[0] = 0
print(list1, list2)

In [None]:
# case 2
list1 = [1,2,3,4,5]
list2 = list1.copy() # or list2 = list1[:]
list2[0] = 0
print(list1, list2)

In [None]:
# case 3
def change_first_element(list_test:list):
    list_test[0] = 0

list1 = [1,2,3,4,5]
list2 = list1
change_first_element(list2)
print(list1, list2)

In [None]:
# case 4
def change_first_element(list_test:list):
    list_test[0] = 0

list1 = [1,2,3,4,5]
list2 = list1.copy()
change_first_element(list2)
print(list1, list2)
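
The four cases above use flat lists of integers. For nested containers, `copy()` is only a shallow copy: the outer list is new, but the inner lists are still shared. The sketch below, which uses a small hypothetical nested list, contrasts `copy.copy` with `copy.deepcopy` from the standard library.

```python
import copy

nested = [[1, 2], [3, 4]]      # hypothetical nested list for illustration
shallow = copy.copy(nested)    # new outer list, inner lists are shared
deep = copy.deepcopy(nested)   # new outer list and new inner lists

nested[0][0] = 0               # mutate an inner list of the original
print(nested)                  # [[0, 2], [3, 4]]
print(shallow)                 # [[0, 2], [3, 4]]  (follows the original)
print(deep)                    # [[1, 2], [3, 4]]  (unaffected)
```

If the elements of a list are themselves mutable, `deepcopy()` is usually the safer choice.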

- Inline if-else

In [None]:
ECE4880J = True
i_want_to_sleep = True if not ECE4880J else False
print(i_want_to_sleep)

- Inline loop

In [None]:
assignment_is_hard = [False, False, False]
assignment_is_finished = [not x for x in assignment_is_hard]
print(assignment_is_hard, assignment_is_finished)

- More code samples about basic data types (list, dictionary, set, tuple) can be found [here](https://colab.research.google.com/github/cs231n/cs231n.github.io/blob/master/python-colab.ipynb).

## Numpy

> NumPy (**Numerical Python**) is an open source Python library...It’s the universal standard for working with numerical data in Python, and it’s at the core of the scientific Python and PyData ecosystems...The NumPy API is used extensively in Pandas, SciPy, Matplotlib, scikit-learn, scikit-image and most other data science and scientific Python packages. <br><br> from the *Official Numpy Documentation*

![](https://bids.berkeley.edu/sites/default/files/2020-0916-numpy-nature-fig2.png)


In [None]:
import numpy as np

### ```np.ndarray``` vs. ```list```

#### 1. Initialize an array with arbitrary length

In [None]:
n = 5
sample_list = [5] * n
sample_np = np.ones(n) * 5
print(sample_list, sample_np)

#### 2. Generate an array with evenly spaced intervals

In [None]:
sample_list = [*range(0,6,2)]
sample_np = np.arange(0,6,2)
print(sample_list, sample_np)

In [None]:
# The step in range() must be an integer, so the next line raises a TypeError.
sample_list = [*range(0,6,0.5)]

In [None]:
# You have more freedom when generating an array with numpy
sample_np = np.arange(0,6,0.5)
print(sample_np)
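
When the number of points matters more than the step size, `np.linspace` is a common alternative to `np.arange`. The short sketch below generates the same values both ways; the variable names are only for illustration.

```python
import numpy as np

by_step = np.arange(0, 6, 0.5)       # fixed step, stop value excluded
by_count = np.linspace(0, 5.5, 12)   # fixed number of points, end point included

print(by_step.size, by_count.size)
print(np.allclose(by_step, by_count))
```

With a floating-point step, `np.arange` can be sensitive to rounding near the stop value, which is one reason to prefer `np.linspace` when the end point must be hit exactly.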

#### 3. Adding elements

In [None]:
# Add one element to a list
sample_list.append(5)
print(sample_list)

In [None]:
# Concatenate two lists
print(sample_list + [1,2,3])

In [None]:
# NumPy arrays have no append method; use np.concatenate to join them
print(np.concatenate((sample_np, np.zeros(3))))

In [None]:
# np.concatenate requires the shapes to match on every axis except the one being joined (axis 0 by default)
print(np.concatenate((np.ones((3,2)), np.zeros((4,2)))))

In [None]:
# This raises a ValueError: the arrays differ on axis 1 (2 columns vs. 3 columns)
print(np.concatenate((np.ones((3,2)), np.zeros((4,3)))))

In [None]:
# You can concatenate numpy arrays along different axes.
print(np.concatenate((np.ones((3,2)), np.zeros((3,2))), axis=1))
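
To summarize the concatenation cells above, here is a small sketch that prints the resulting shapes and catches the error from a mismatched pair. The array names are hypothetical.

```python
import numpy as np

top = np.ones((3, 2))
bottom = np.zeros((4, 2))
side = np.zeros((3, 2))

rows = np.concatenate((top, bottom), axis=0)   # column counts agree -> shape (7, 2)
cols = np.concatenate((top, side), axis=1)     # row counts agree -> shape (3, 4)
print(rows.shape, cols.shape)

try:
    np.concatenate((top, bottom), axis=1)      # 3 rows vs. 4 rows: not allowed
    mismatch_raised = False
except ValueError as err:
    mismatch_raised = True
    print("ValueError:", err)
```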

#### 4. The shape and size of an array

In [None]:
# For a list, len() only gives the number of top-level elements
sample_list = [[1,2], [3,4], [5,6]]
print(len(sample_list))

In [None]:
# You can check the dimension, total number of elements, and the shape of a np.ndarray
sample_np = np.array(sample_list)
print(sample_np.ndim, sample_np.size, sample_np.shape)

#### 5. Basic array operations

In [None]:
# "+" operation in a python list means joining the lists.
# You cannot apply "+" to a list and an element; the next line raises a TypeError.
print([1,2,3,4] + 1)

In [None]:
# "+" can be applied to an array and a scalar, or to two arrays (element by element)
print(np.array([1,2,3,4]) + 1)
print(np.array([1,2,3,4]) + np.ones(4))
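
Adding a scalar to an array is a special case of NumPy broadcasting, where dimensions of size 1 are stretched to match the other operand. A minimal sketch with two hypothetical arrays:

```python
import numpy as np

column = np.array([[0], [10], [20]])   # shape (3, 1)
row = np.array([1, 2, 3, 4])           # shape (4,)

table = column + row                   # both operands are broadcast to shape (3, 4)
print(table.shape)
print(table)
```

Each row of `table` is `row` shifted by the corresponding entry of `column`, and no explicit loop is needed.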

In [None]:
# "*" repeats the list the given number of times
print([1,2] * 5)

In [None]:
# In numpy, "*" multiplies by a scalar or element by element; dot and cross products have their own functions
print(np.array([1,2,3]) * 2)
print(np.array([1,2,3]) * np.array([4,5,6])) # multiply element by element
print(np.dot([1,2,3], [4,5,6])) # dot product
print(np.cross([1,2,3], [4,5,6])) # cross product
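
For 2-D arrays, the distinction between `*` and a true matrix product matters even more. The sketch below compares the two on small hypothetical matrices, using the `@` operator (equivalent to `np.matmul`).

```python
import numpy as np

A = np.array([[1, 2], [3, 4]])
B = np.array([[0, 1], [1, 0]])

elementwise = A * B   # multiplies matching entries
matrix = A @ B        # matrix product, same as np.matmul(A, B)

print(elementwise)    # [[0 2] [3 0]]
print(matrix)         # [[2 1] [4 3]]
```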

### Specialties of Numpy

#### 1. Reshape an array

In [None]:
sample_np = np.arange(6)
print(sample_np)
print(sample_np.reshape(2,3))
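
Two details of `reshape` are easy to miss: one dimension may be given as `-1` and inferred by NumPy, and the result is a view of the same data whenever possible. The sketch below uses hypothetical names and does not touch `sample_np`.

```python
import numpy as np

flat = np.arange(6)
grid = flat.reshape(2, -1)   # -1 lets NumPy infer the missing dimension (3 here)
print(grid.shape)

grid[0, 0] = 100             # grid is a view here, so this also changes flat
print(flat[0], np.shares_memory(flat, grid))

independent = flat.reshape(2, 3).copy()   # explicit copy when a view is not wanted
independent[0, 0] = -1
print(flat[0])
```

This is the same binding-versus-copy issue as with Python lists: call `.copy()` when the reshaped array must be independent.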

#### 2. Select elements

In [None]:
print(sample_np[sample_np % 2 == 0])
print(sample_np[(sample_np > 2) & (sample_np < 5)])

#### 3. Transpose

In [None]:
sample_np = sample_np.reshape(2,3)
print(sample_np)
print(sample_np.T)

#### 4. Reverse

In [None]:
print(np.flip(sample_np, axis=0))
print(np.flip(sample_np, axis=1))

#### 5. Powerful APIs

You can search for the details [here](https://numpy.org/doc/1.22/reference/index.html).

- Mathematical functions: sin, cos, exp, sum, max, min, etc.
- Sorting, searching, and counting
- Discrete Fourier Transform (```numpy.fft``` module)
    
    The Fast Fourier Transform (FFT) is widely used to restore noisy images.
    
    <img src="https://kirkt.smugmug.com/photos/45700484_3hrU5-L.jpg" width="500px" />
- Linear algebra (```numpy.linalg``` module)
- Masked array operations
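
As a one-dimensional analogue of the FFT-based restoration mentioned above, the sketch below removes high-frequency noise from a sine wave with `numpy.fft`. The signal, the noise level, and the 10 Hz cutoff are all assumptions chosen for illustration; the same idea extends to images with `np.fft.fft2`.

```python
import numpy as np

rng = np.random.default_rng(0)           # fixed seed so the run is repeatable
n_samples = 500
t = np.arange(n_samples) / n_samples     # one second sampled at 500 Hz
clean = np.sin(2 * np.pi * 5 * t)        # 5 Hz sine wave
noisy = clean + rng.normal(scale=0.5, size=n_samples)

spectrum = np.fft.rfft(noisy)                         # FFT of a real-valued signal
freqs = np.fft.rfftfreq(n_samples, d=1 / n_samples)   # frequency of each bin in Hz
spectrum[freqs > 10] = 0                              # assumed cutoff: drop everything above 10 Hz
denoised = np.fft.irfft(spectrum, n=n_samples)

mse_noisy = np.mean((noisy - clean) ** 2)
mse_denoised = np.mean((denoised - clean) ** 2)
print(mse_noisy, mse_denoised)
```

The low-pass step keeps the 5 Hz component and discards most of the noise energy, so the second printed error should be much smaller than the first.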

## Scipy

> SciPy (**Scientific Python**) is a free and open-source Python library used for scientific computing and technical computing. <br> SciPy contains modules for optimization, linear algebra, integration, interpolation, special functions, FFT, signal and image processing, ODE solvers and other tasks common in science and engineering... <br> The basic data structure used by SciPy is a multidimensional array provided by the **NumPy** module. <br> <br> from *Wikipedia*

Some of the submodules we may frequently use for this course:

- scipy.fft: Discrete Fourier Transform algorithms
- scipy.signal: signal processing tools
- scipy.linalg: linear algebra routines
- scipy.ndimage: various functions for multi-dimensional image processing
- scipy.io: data input and output

In [None]:
import scipy
import scipy.ndimage  # imported explicitly; HarrisCorner below calls scipy.ndimage.gaussian_filter
from scipy import signal

## Matplotlib

### Differences between matplotlib, PIL, and OpenCV

Matplotlib is a comprehensive library for creating static, animated, and interactive visualizations in Python. It is written in Python and C++. It focuses more on **visualization**. Its ability to process an image is relatively weak.

Pillow and OpenCV are image libraries for loading, processing, and creating images. They provide implementations of many classical image processing methods.
- Pillow is written in Python and C, while OpenCV is written in C++ and C. In most cases, OpenCV is faster.
- OpenCV reads images in BGR format by default. Pillow reads images in RGB format by default.
- Both libraries provide tools for

    - Image filters (blur, sharpen, etc.)
    - Image transformation (flip, rotate, warp, etc.)
    - Conversion between image types

- OpenCV also provides

    - Tools to process videos
    - Feature extraction methods (SIFT, HOG, HAAR, etc.)
    - Classical machine learning models (Bayes classifier, KNN, SVM, etc.)
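
Because the two libraries disagree on channel order, an image loaded with OpenCV usually needs its channels reversed before it is displayed with Matplotlib or handed to Pillow. The sketch below shows the conversion on a tiny hypothetical array using plain NumPy slicing.

```python
import numpy as np

# Hypothetical 1 x 2 image in BGR order: one pure-blue pixel and one pure-red pixel
bgr = np.array([[[255, 0, 0], [0, 0, 255]]], dtype=np.uint8)

rgb = bgr[..., ::-1]          # reverse the last (channel) axis: BGR -> RGB
print(rgb[0, 0], rgb[0, 1])   # blue is now [0 0 255], red is now [255 0 0]
```

OpenCV offers the same conversion as `cv2.cvtColor(img, cv2.COLOR_BGR2RGB)`.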

### Be careful of your backend!

If your image is not plotted as expected and no error is reported, the problem most likely lies in your choice of backend!

The Matplotlib architecture is composed of three main layers:

- Backend Layer: Handles all the heavy lifting by communicating with the drawing toolkits on your machine. It is the most complex layer.
- Artist Layer: Allows full control and fine-tuning of the Matplotlib figure, the top-level container for all plot elements.
- Scripting Layer: The lightest scripting interface among the three layers, designed to make Matplotlib work like MATLAB script.

In [None]:
import urllib.request
import matplotlib
import matplotlib.pyplot as plt
import cv2
import PIL

In [None]:
%matplotlib --list

In [None]:
matplotlib.get_backend()

### Structure of a plot in matplotlib

![](https://3.bp.blogspot.com/-AtPG_12l4e8/XRSuQEECZGI/AAAAAAAAHxY/ZsgtA4rMphMZujcWUur9BB-xYKoWDkKPQCLcBGAs/s1600/basics_matplotlib.PNG)
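
The pieces named in the figure map directly onto objects in code: one `Figure` holds one or more `Axes`, and each `Axes` owns its title, labels, and legend. The sketch below builds a figure with two axes; the data and the titles are made up for illustration, and the figure is written to an in-memory buffer so that it does not depend on a display.

```python
import io

import matplotlib.pyplot as plt
import numpy as np

x = np.linspace(0, 2 * np.pi, 100)
fig, (ax_left, ax_right) = plt.subplots(1, 2, figsize=(8, 3))   # one Figure, two Axes

ax_left.plot(x, np.sin(x), label="sin")    # line plot with a legend entry
ax_left.set_title("Line plot")
ax_left.set_xlabel("x")
ax_left.set_ylabel("sin(x)")
ax_left.legend()

ax_right.scatter(x[::10], np.cos(x[::10]))   # scatter plot on the second Axes
ax_right.set_title("Scatter plot")

fig.tight_layout()
buffer = io.BytesIO()
fig.savefig(buffer, format="png")            # render through the active backend
png_bytes = buffer.getvalue()
print(len(fig.axes), len(png_bytes) > 0)
```

Calling methods on `ax_left` and `ax_right` (the artist layer) instead of `plt.title` and friends (the scripting layer) makes it explicit which subplot each call affects.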



## Convolution

Convolution is the process of adding each element of the image to its local neighbors, weighted by the kernel. This operation is widely used in image processing.

<img src="https://assets.leetcode.com/users/images/a7e6370b-84cb-4e9a-861a-deceb8064a07_1599380839.3085368.png" width="500px" />
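
To make the definition concrete, here is a deliberately naive NumPy sketch of a 2-D convolution without padding ("valid" mode). The function name and the test image are hypothetical; in practice a library routine such as `scipy.signal.convolve2d` is faster.

```python
import numpy as np

def convolve2d_valid(image, kernel):
    """Naive 2-D convolution that only keeps positions where the kernel fits."""
    kernel = np.flipud(np.fliplr(kernel))   # convolution flips the kernel (correlation does not)
    kh, kw = kernel.shape
    out_h = image.shape[0] - kh + 1
    out_w = image.shape[1] - kw + 1
    out = np.zeros((out_h, out_w))
    for i in range(out_h):
        for j in range(out_w):
            # weighted sum of the neighborhood under the kernel
            out[i, j] = np.sum(image[i:i + kh, j:j + kw] * kernel)
    return out

image = np.arange(16, dtype=float).reshape(4, 4)
box = np.ones((3, 3)) / 9.0            # averaging (box blur) kernel
result = convolve2d_valid(image, box)
print(result)                          # [[ 5.  6.] [ 9. 10.]]
```

A 4 x 4 image and a 3 x 3 kernel leave a 2 x 2 output, and each output value is the mean of the 3 x 3 neighborhood it covers.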

### Application 1: Extract features

<img src="https://s2.loli.net/2022/05/25/rbo15VqQT7PxgcX.png" width="500px" />


In [None]:
# First we load an online image
f_name, _ = urllib.request.urlretrieve(
    "https://www.kindpng.com/picc/m/134-1342850_logo-harry-potter-hogwarts-png-download-logo-hogwarts.png", 
    "SJTU.png")
img = plt.imread(f_name)
print(img.shape)
img = img[:,:,:3]
plt.imshow(img)
plt.axis("off")

In [None]:
def HarrisCorner(img, k=0.1):
    G_x = np.array([[-1,0,1], [-2,0,2], [-1,0,1]]) # Sobel operator on x direction
    G_y = np.array([[-1,-2,-1], [0,0,0], [1,2,1]]) # Sobel operator on y direction
    I_x = signal.convolve2d(img[:,:,0], G_x, mode="same")
    I_y = signal.convolve2d(img[:,:,0], G_y, mode="same")
    I_xx = scipy.ndimage.gaussian_filter(I_x**2, sigma=1)
    I_xy = scipy.ndimage.gaussian_filter(I_y*I_x, sigma=1)
    I_yy = scipy.ndimage.gaussian_filter(I_y**2, sigma=1)

    determinant = I_xx * I_yy - I_xy ** 2
    trace = I_xx + I_yy
    harris_response = determinant - k * trace ** 2
    return harris_response

In [None]:
img_for_corners = np.copy(img)
harris_response = HarrisCorner(img)
# Mark every pixel with a positive Harris response
img_for_corners[harris_response > 0] = [255,0,0]

In [None]:
plt.imshow(img_for_corners)

### Application 2: Denoise an image

In [None]:
f_name, _ = urllib.request.urlretrieve(
    "https://raw.githubusercontent.com/timlentse/Add-Salt_Pepper_noise/master/add%20noise%20%20image.png",
    "pepper_noise.png")
img = plt.imread(f_name)
print(img.shape)
img = img[:,:,:3]
plt.imshow(img)
plt.axis("off")

In [None]:
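# Use a 3 x 3 window within each channel; kernel_size=3 would also take the median across channels
denoised_img = signal.medfilt(img, kernel_size=(3, 3, 1))
print(denoised_img.shape)
plt.imshow(denoised_img)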
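
A median filter replaces each pixel with the median of its neighborhood, which is why isolated salt-and-pepper pixels disappear. The sketch below is a NumPy-only illustration on a small synthetic image; the function name is hypothetical, and edge padding is one possible choice for the border.

```python
import numpy as np
from numpy.lib.stride_tricks import sliding_window_view

def median_filter_2d(channel, size=3):
    """Median filter for one 2-D channel, with edge padding at the border."""
    pad = size // 2
    padded = np.pad(channel, pad, mode="edge")
    windows = sliding_window_view(padded, (size, size))   # shape (H, W, size, size)
    return np.median(windows, axis=(-2, -1))

clean = np.full((8, 8), 0.5)   # synthetic flat gray image
noisy = clean.copy()
noisy[2, 3] = 1.0              # one "salt" pixel
noisy[5, 6] = 0.0              # one "pepper" pixel

restored = median_filter_2d(noisy)
print(np.array_equal(restored, clean))
```

Each 3 x 3 window contains at most one outlier here, so the median is always the background value and the noise is removed completely.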

## Pooling

[Pooling Methods in Deep Neural Networks, a Review](https://arxiv.org/pdf/2009.07485.pdf#:~:text=We%20divided%20pooling%20methods%20into,of%20Interest%20Pooling%20are%20discussed.)

- **Average Pooling**: performs down-sampling by dividing the input into rectangular pooling regions and computing the average values of each region.
- **Max Pooling**: passes forward the maximum value within a group of $R$ activations. A max-pooling operator can be applied to down-sample the convolutional output bands, thus reducing variability.
- **Mixed Pooling**: hybrid approach by combining the average pooling and max pooling.
$$
s_j = \lambda \max_{i\in R_j} a_i+(1-\lambda)\frac{1}{|R_j|}\sum_{i\in R_j} a_i
$$
- **$L_p$ Pooling**: takes a weighted average of inputs.
- **Stochastic Pooling**: picks the value randomly according to a multinomial distribution.
- **Spatial Pyramid Pooling**: partitions the image into divisions from finer to coarser levels and aggregates local features in them.
- **Region of Interest Pooling**: mostly used for object detection and segmentation. The ROI pooling layer works by shifting the processing specific to individual bounding boxes to a later stage of the network architecture.

<table style="width:100%; table-layout:fixed;">
    <tr>
        <td> <img src="https://s2.loli.net/2022/05/25/pSu6D5WyMGYmdaz.png" width="350px" /> </td>
        <td> <img src="https://s2.loli.net/2022/05/25/pDJMRy9GKCv2ugZ.png" width="300px" /> </td>
    </tr>
</table>
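
The mixed-pooling formula above reduces to max pooling for $\lambda = 1$ and to average pooling for $\lambda = 0$. The following NumPy sketch applies it to non-overlapping $k \times k$ regions of a 2-D array; the function name and the test input are hypothetical.

```python
import numpy as np

def mixed_pooling(x, k=2, lam=0.5):
    """Mixed pooling over non-overlapping k x k regions of a 2-D array."""
    h, w = x.shape[0] // k, x.shape[1] // k
    # Crop to a multiple of k, then expose each region on axes 1 and 3
    blocks = x[:h * k, :w * k].reshape(h, k, w, k)
    max_pool = blocks.max(axis=(1, 3))
    avg_pool = blocks.mean(axis=(1, 3))
    return lam * max_pool + (1 - lam) * avg_pool

x = np.arange(16, dtype=float).reshape(4, 4)
print(mixed_pooling(x, lam=1.0))   # pure max pooling:     [[ 5.  7.] [13. 15.]]
print(mixed_pooling(x, lam=0.0))   # pure average pooling: [[ 2.5  4.5] [10.5 12.5]]
print(mixed_pooling(x, lam=0.5))   # equal mix of the two
```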

> Check: Up to now, we have only talked about down-pooling. How would you do up-pooling?

### Application 3: Resize an image

In [None]:
def max_pooling(img, kernel_size = 3):
    img_resize = np.zeros((img.shape[0] // kernel_size, img.shape[1] // kernel_size))
    for i in range(img_resize.shape[0]):
        for j in range(img_resize.shape[1]):
            # take the maximum over each non-overlapping kernel_size x kernel_size block
            img_resize[i,j] = np.max(img[i*kernel_size:(i+1)*kernel_size, j*kernel_size:(j+1)*kernel_size])
    return img_resize

In [None]:
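# Pool each color channel separately; the output shape follows from the default 3 x 3 kernel
img_resize = np.zeros((denoised_img.shape[0] // 3, denoised_img.shape[1] // 3, 3))
for i in range(3):
    img_resize[:,:,i] = max_pooling(denoised_img[:,:,i])
plt.imshow(img_resize)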

## Pytorch

PyTorch is an open-source machine learning framework that accelerates the path from research prototyping to production deployment. It is primarily developed by Meta AI.

In [None]:
import pandas as pd
import torch
from torch.utils.data import Dataset, DataLoader
from torchvision import datasets
from torchvision.transforms import ToTensor
import torch.nn as nn
import torch.optim as optim
from torch.autograd import Variable

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(torch.cuda.is_available())

### Data type: Tensors

> A PyTorch Tensor is basically the same as a numpy array: it does not know anything about deep learning or computational graphs or gradients, and is just a generic n-dimensional array to be used for arbitrary numeric computation. <br> The biggest difference between a numpy array and a PyTorch Tensor is that <u>a PyTorch Tensor can run on either CPU or GPU</u>. To run operations on the GPU, just cast the Tensor to a cuda datatype. <br><br> from the *Official Pytorch document*.

**You should keep in mind that all the inputs and outputs of a PyTorch-based network are ```torch.Tensor``` objects.**
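
In practice, data often starts as a NumPy array and has to be converted before it reaches the network. The sketch below shows one possible round trip, with a hypothetical array, that runs on either CPU or GPU.

```python
import numpy as np
import torch

array = np.arange(6, dtype=np.float32).reshape(2, 3)
tensor = torch.from_numpy(array)           # shares memory with the NumPy array

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
tensor_on_device = tensor.to(device)       # moved to the GPU when one is available
doubled = tensor_on_device * 2             # computed on the chosen device

back_to_numpy = doubled.cpu().numpy()      # NumPy only works with CPU memory
print(tensor.shape, tensor.dtype, tensor_on_device.device)
print(back_to_numpy)
```

The `.cpu()` call is a no-op when the tensor is already on the CPU, so the same code works on machines with and without a GPU.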

### Dataloader: Prepare data for training

In [None]:
# Download a built-in dataset from Pytorch
train_data = datasets.FashionMNIST(
    root="data",
    train=True,
    download=True,
    transform=ToTensor()
)

test_data = datasets.FashionMNIST(
    root="data",
    train=False,
    download=True,
    transform=ToTensor()
)

print(train_data.data.size(), train_data.targets.size())
print(test_data.data.size(), test_data.targets.size())

In [None]:
# Define the labels
labels_map = {
    0: "T-Shirt",
    1: "Trouser",
    2: "Pullover",
    3: "Dress",
    4: "Coat",
    5: "Sandal",
    6: "Shirt",
    7: "Sneaker",
    8: "Bag",
    9: "Ankle Boot",
}

# Iterate and visualize the dataset
figure = plt.figure(figsize=(8, 8))
cols, rows = 3, 3
for i in range(1, cols * rows + 1):
    sample_idx = torch.randint(len(train_data), size=(1,)).item()
    img, label = train_data[sample_idx]
    figure.add_subplot(rows, cols, i)
    plt.title(labels_map[label])
    plt.axis("off")
    plt.imshow(img.squeeze(), cmap="gray")
plt.show()

In [None]:
# Create a custom dataset
# This class has the same functionality as datasets.FashionMNIST.
# It is only for demonstration. Do not run it.
import os
from torchvision.io import read_image

class CustomImageDataset(Dataset):
    def __init__(
        self, annotations_file, img_dir: str, transform=None, target_transform=None):
        """Run once when instantiating the Dataset object."""
        self.img_labels = pd.read_csv(annotations_file)
        self.img_dir = img_dir
        self.transform = transform
        self.target_transform = target_transform

    def __len__(self):
        """Return the number of samples in our dataset."""
        return len(self.img_labels)

    def __getitem__(self, idx):
        """Load and return a sample from the dataset at the given index idx"""
        img_path = os.path.join(self.img_dir, self.img_labels.iloc[idx, 0])
        image = read_image(img_path)
        label = self.img_labels.iloc[idx, 1]
        if self.transform:
            image = self.transform(image)
        if self.target_transform:
            label = self.target_transform(label)
        return image, label
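
Since the class above needs image files and an annotation file on disk, here is a runnable sketch of the same three-method pattern with random in-memory data. The class name, the tensor sizes, and the seed are all hypothetical.

```python
import torch
from torch.utils.data import Dataset, DataLoader

class RandomImageDataset(Dataset):
    """Hypothetical in-memory dataset of random 1 x 28 x 28 images."""

    def __init__(self, num_samples=100, num_classes=10, seed=0):
        generator = torch.Generator().manual_seed(seed)
        self.images = torch.rand(num_samples, 1, 28, 28, generator=generator)
        self.labels = torch.randint(0, num_classes, (num_samples,), generator=generator)

    def __len__(self):
        return len(self.labels)

    def __getitem__(self, idx):
        return self.images[idx], self.labels[idx]

dataset = RandomImageDataset()
loader = DataLoader(dataset, batch_size=32, shuffle=False)
batch_shapes = [tuple(images.shape) for images, _ in loader]
print(len(dataset), len(loader))
print(batch_shapes)
```

With 100 samples and a batch size of 32, the loader yields three full batches and a final batch of 4 samples, because `drop_last` defaults to `False`.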

In [None]:
# Load FashionMNIST into the DataLoader and iterate through the dataset as needed
train_dataloader = DataLoader(train_data, batch_size=64, shuffle=True)
test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True)
# If you have successfully set up your custom dataset, you can run
# training_data = CustomImageDataset(**kwargs)
# test_data = CustomImageDataset(**kwargs)
# train_dataloader = DataLoader(training_data, batch_size=64, shuffle=True)
# test_dataloader = DataLoader(test_data, batch_size=64, shuffle=True)

In [None]:
# Display a random image and its label.
train_features, train_labels = next(iter(train_dataloader))
print(f"Feature batch shape: {train_features.size()}")
print(f"Labels batch shape: {train_labels.size()}")
img = train_features[0].squeeze()
label = train_labels[0]
plt.imshow(img, cmap="gray")
plt.show()
print(f"Label: {label}")

### Model Structure

#### Network Components

##### Classical layers for a CNN network:

- **Convolution layer**: The first layer of a CNN is usually a convolutional layer. Convolutional layers apply a convolution operation to the input and pass the result to the next layer. A convolution converts all the pixels in its receptive field into a single value.
- **Activation layer**: The choice of activation function in the hidden layer will control how well the network model learns the training dataset. The choice of activation function in the output layer will define the type of predictions the model can make.
- **Normalization layer**: Layer normalization normalizes each sample across its features, whereas batch normalization normalizes each feature across the batch dimension.
- **Pooling layer**: The main purpose of a pooling layer is to progressively reduce the spatial size of its input, which reduces the number of computations in the network.
- **Dropout layer**: The dropout layer randomly sets input units to 0 with a given probability at each step during training, which helps prevent overfitting.
- **Linear layer**: Linear layers use matrix multiplication to transform their input features into output features using a weight matrix.
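
Two of the layers above are simple enough to write out by hand. The NumPy sketch below shows ReLU (a common activation) and "inverted" dropout, which rescales the surviving units by $1/(1-p)$ so that the expected activation stays the same; `nn.Dropout` applies this scaling during training. The function names and the input values are hypothetical.

```python
import numpy as np

def relu(x):
    """Rectified linear unit: negative values become 0."""
    return np.maximum(x, 0)

def dropout(x, p, rng):
    """Inverted dropout: zero each unit with probability p, rescale the rest."""
    keep_mask = rng.random(x.shape) >= p
    return x * keep_mask / (1 - p)

rng = np.random.default_rng(0)
activations = relu(np.array([-2.0, -1.0, 0.0, 1.0, 2.0, 3.0]))
dropped = dropout(activations, p=0.5, rng=rng)
print(activations)   # [0. 0. 0. 1. 2. 3.]
print(dropped)       # each entry is either 0 or twice the activation
```

At test time dropout is switched off, which is what `model.eval()` does for PyTorch modules.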

##### Important parameters for the layers:

- in_channels (int): number of channels in the input image
- out_channels (int): number of channels produced by the convolution
- kernel_size (int or tuple): size of the convolving kernel
- stride (int or tuple, optional): number of pixels the kernel moves at each step when sliding over the input. Default: 1
- padding (int or tuple, optional): amount of zero padding added to each border of the input, often used to preserve its size. Default: 0
- dilation (int or tuple, optional): space between kernel elements. Default: 1

##### Calculate the output matrix size based on input size and the layer's parameters

<img width="700px" src="https://s2.loli.net/2022/05/25/JeXZ317hgKNOIF4.png">

<!-- $$
H_{\textrm{out}} = \left\lfloor \frac{H_{\textrm{in}}+2\times \textrm{padding}[0]-\textrm{dilation}[0]\times(\textrm{kernel_size}[0]-1)-1}{\textrm{stride}[0]} +1 \right\rfloor
$$
$$
W_{\textrm{out}} = \left\lfloor \frac{W_{\textrm{in}}+2\times \textrm{padding}[1]-\textrm{dilation}[1]\times(\textrm{kernel_size}[1]-1)-1}{\textrm{stride}[1]} +1 \right\rfloor
$$ -->
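
The output-size rule, $\lfloor (\text{in} + 2 \times \text{padding} - \text{dilation} \times (\text{kernel\_size} - 1) - 1) / \text{stride} \rfloor + 1$, is easy to check with a small helper. The function below is a hypothetical convenience for one spatial dimension, not part of PyTorch; it is applied to a 28-pixel input, the side length of a FashionMNIST image.

```python
def conv_output_size(size_in, kernel_size, stride=1, padding=0, dilation=1):
    """Output length along one spatial dimension of a convolution layer."""
    return (size_in + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

after_conv1 = conv_output_size(28, kernel_size=3)            # 28 -> 26
after_conv2 = conv_output_size(after_conv1, kernel_size=3)   # 26 -> 24
print(after_conv1, after_conv2)

# A stride of 2 with padding 1 halves the size instead: 28 -> 14
print(conv_output_size(28, kernel_size=3, stride=2, padding=1))
```

Running this kind of check before defining a network helps to get the input size of the first linear layer right.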

<table style="width:100%; table-layout:fixed;">
    <tr>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/no_padding_no_strides.gif"></td>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/arbitrary_padding_no_strides.gif"></td>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/same_padding_no_strides.gif"></td>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/full_padding_no_strides.gif"></td>
    </tr>
    <tr>
        <td>No padding, no strides</td>
        <td>Arbitrary padding, no strides</td>
        <td>Half padding, no strides</td>
        <td>Full padding, no strides</td>
  </tr>
    <tr>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/no_padding_strides.gif"></td>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/padding_strides.gif"></td>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/padding_strides_odd.gif"></td>
    </tr>
    <tr>
        <td>No padding, strides</td>
        <td>Padding, strides</td>
        <td>Padding, strides (odd)</td>
    </tr>
    <tr>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/no_padding_no_strides_transposed.gif"></td>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/arbitrary_padding_no_strides_transposed.gif"></td>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/same_padding_no_strides_transposed.gif"></td>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/full_padding_no_strides_transposed.gif"></td>
    </tr>
    <tr>
        <td>No padding, no strides, <br>transposed</td>
        <td>Arbitrary padding, no strides, <br>transposed</td>
        <td>Half padding, no strides, <br>transposed</td>
        <td>Full padding, no strides, <br>transposed</td>
    </tr>
    <tr>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/no_padding_strides_transposed.gif"></td>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/padding_strides_transposed.gif"></td>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/padding_strides_odd_transposed.gif"></td>
    </tr>
    <tr>
        <td>No padding, strides, <br>transposed</td>
        <td>Padding, strides, <br>transposed</td>
        <td>Padding, strides, <br>transposed (odd)</td>
    </tr>
    <tr>
        <td><img width="150px" src="https://raw.githubusercontent.com/vdumoulin/conv_arithmetic/master/gif/dilation.gif"></td>
    </tr>
    <tr>
        <td>No padding, no stride, dilation</td>
    </tr>
</table>

#### Famous CNNs

##### VGGNet (2014)

<img width="100%" src="https://miro.medium.com/max/1400/1*hs8Ud3X2LBzf5XMAFTmGGw.jpeg">

##### U-Net (2015)

<img width="100%" src="https://nchlis.github.io/2019_10_30/architecture_unetV2.png">

##### ResNet (2015)

<img width="100%" src="https://lh5.googleusercontent.com/fIXDrntrxU-YJewW148x4VsJICzisvWOj6voUUq0eU2bdxv54e2OWJEIrgyjC4K3c2Y_zHOrpT7AQl5UP-laUEo_U0HXdAtSanORH6JudtCwwGK6oUjhSH8Coj3d2mqovYBhEe0s">


##### Inception v3 (2015)

<img width="100%" src="https://lh3.googleusercontent.com/bA_Rkj4a0sA3NZ1wjUYIO5_eq0hUmiBNbagOFb84C8Y9GxeedGUYNd-LIbhAlpW-1o8xSeNypMnbD6p-XsrAQvup3FeWXrAoZig7l7Y9WIK3uDHooEMEKiNNQ2qt0PfA4Zfsyltn">



In [None]:
class TypicalCNN(nn.Module):
    def __init__(self):
        # batch size is 64
        # input size 64 x 1 x 28 x 28
        super(TypicalCNN, self).__init__()
        self.features = nn.Sequential(
            nn.Conv2d(in_channels=1, out_channels=16, kernel_size=3, stride=1), # 64 x 16 x 26 x 26
            nn.ReLU(),
            nn.Conv2d(16, 32, 3, 1), # 64 x 32 x 24 x 24
            nn.ReLU(),
        )
        # fully connected layer, output 10 classes
        self.out = nn.Linear(32*24*24, 10)

    def forward(self, x):
        x = self.features(x)
        # flatten the output of conv2 to (64, 32 x 24 x 24)
        x = x.view(x.size(0), -1)
        output = self.out(x)
        return output, x

In [None]:
cnn_model = TypicalCNN()
print(cnn_model)
cnn_model.to(device)

#### Loss

In [None]:
criterion = nn.CrossEntropyLoss()
print(criterion)
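
`nn.CrossEntropyLoss` takes raw, unnormalized scores (logits) and integer class labels, and combines a log-softmax with the negative log-likelihood. The NumPy sketch below reproduces that computation with the default mean reduction; the function name and the inputs are hypothetical.

```python
import numpy as np

def cross_entropy(logits, targets):
    """Mean cross-entropy of raw scores (N x C) against class indices (N,)."""
    shifted = logits - logits.max(axis=1, keepdims=True)   # for numerical stability
    log_probs = shifted - np.log(np.exp(shifted).sum(axis=1, keepdims=True))
    return -log_probs[np.arange(len(targets)), targets].mean()

targets = np.array([0, 3, 5, 9])

uniform_logits = np.zeros((4, 10))                # no preference among 10 classes
uniform_loss = cross_entropy(uniform_logits, targets)
print(uniform_loss)                               # equals ln(10)

confident_logits = np.full((4, 10), -10.0)
confident_logits[np.arange(4), targets] = 10.0    # strongly favor the right class
confident_loss = cross_entropy(confident_logits, targets)
print(confident_loss)                             # close to 0
```

A loss near $\ln(10) \approx 2.3$ on a 10-class problem therefore means the model is doing no better than guessing uniformly.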

#### Optimizer

In [None]:
optimizer = optim.Adam(cnn_model.parameters(), lr=0.01)
print(optimizer)
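
To show what one optimizer step does, here is a pure-Python sketch of the Adam update rule for a single scalar parameter. It follows the standard formulation with bias correction; the toy objective $f(x) = (x - 3)^2$ and the hyperparameters are assumptions for illustration, and the code is not PyTorch's implementation.

```python
import math

def adam_minimize(grad_fn, x0, lr=0.1, beta1=0.9, beta2=0.999, eps=1e-8, steps=500):
    """Run Adam on a single scalar parameter and return its final value."""
    x, m, v = x0, 0.0, 0.0
    for t in range(1, steps + 1):
        g = grad_fn(x)
        m = beta1 * m + (1 - beta1) * g        # running mean of gradients
        v = beta2 * v + (1 - beta2) * g * g    # running mean of squared gradients
        m_hat = m / (1 - beta1 ** t)           # bias correction for the early steps
        v_hat = v / (1 - beta2 ** t)
        x -= lr * m_hat / (math.sqrt(v_hat) + eps)
    return x

# Toy objective f(x) = (x - 3)^2 with gradient 2 * (x - 3)
x_final = adam_minimize(lambda x: 2 * (x - 3), x0=0.0)
print(x_final)
```

The printed value should end up close to the minimizer $x = 3$. In PyTorch, `optimizer.step()` applies this kind of update to every parameter using the gradients stored by `loss.backward()`.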

#### Train

In [None]:
def train(num_epochs, model, data_loader):
    model.train()

    # train the model
    for epoch in range(num_epochs):
        for images, labels in data_loader:
            batch_x = images.to(device)
            batch_y = labels.to(device)

            # the model returns (logits, flattened features); only the logits feed the loss
            output = model(batch_x)[0]
            loss = criterion(output, batch_y)
            # clear gradients for this training step
            optimizer.zero_grad()
            # backpropagation, compute gradients
            loss.backward()
            # apply gradients
            optimizer.step()

        # report the loss of the last batch in this epoch
        print('Epoch {}, Loss: {:.4f}'.format(epoch, loss.item()))

In [None]:
train(10, cnn_model, train_dataloader)

#### Test

In [None]:
def test(model, data_loader):
    model.eval()
    with torch.no_grad():
        correct = 0
        total = 0
        for images, labels in data_loader:
            batch_x = images.to(device)
            batch_y = labels.to(device)

            test_output, last_layer = model(batch_x)
            pred_y = torch.max(test_output, 1)[1].data.squeeze()
            
            correct += (pred_y == batch_y).sum().item()
            total += batch_y.size(0)
        accuracy = correct / total
        print('Test Accuracy of the model on the 10000 test images: %.2f' % accuracy)


In [None]:
test(cnn_model, test_dataloader)