# Coding Project: Deep Learning Basics

* ### Based on the paper: K. He, X. Zhang, S. Ren and J. Sun, “Deep Residual Learning for Image Recognition,” IEEE Conference on Computer Vision and Pattern Recognition (CVPR), 2016.

* ### Assignment

  1. Get familiar with our coding environment (on cloud)!
  2. Find a codebase for this paper and download the CIFAR10 and CIFAR100 datasets
  3. Run the basic code on the server with deep residual networks of 20, 56 and 110 layers, and obtain results (3-time average) on both CIFAR10 and CIFAR100
  4. Finish the required task and one of the optional tasks (see the following slides); of course, you can do more than one optional task if you wish (bonus points)
  5. If you have more ideas, please specify a new task yourself (bonus points)
  6. Remember: integrate your results into your reading report
  7. Submit your report (as PDF) and code (as README doc) on iLearningX: https://ilearningx-ru.huaweiuniversity.com/courses/course-v1:HuaweiX+WHURU001+Self-paced/courseware/8825cc7815fa444696520baaf31fa2b0/77b7babd6ae34949bc209d7a8f0ba409/(8)

Date assigned: Oct. 15, 2019;    Date Due: Dec 31, 2019

## Optional Task 2
* Modifying network architecture
    * Based on the results of the basic (required) experiments
    * How does the change of network structure impact final performance?
* Questions to be discussed in the report
    * What if we adjust the number of residual blocks in different stages? For a fair comparison, please keep the total number of residual blocks unchanged
    * What if we train residual networks with 50 or 62 layers? How do they compare against the network with 56 layers? 3-time average required!
    * What if we remove all skip connections in residual networks? What if we add a skip connection after every 1 or 3 (not 2) convolutional layers? For a fair comparison, please keep the number of conv layers unchanged
    * Note: do not simply report accuracy, discussion on reasons is expected!
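
The 3-time average requested above can be computed with a small helper. This is a minimal sketch, not part of the original codebase: the function name `summarize_runs` is hypothetical, and the input numbers are placeholders rather than measured results.

```python
from statistics import mean, stdev


def summarize_runs(accuracies):
    """Return (mean, sample standard deviation) of accuracies from repeated runs."""
    if len(accuracies) < 2:
        raise ValueError("need at least two runs to estimate the spread")
    return mean(accuracies), stdev(accuracies)


# Placeholder numbers for illustration only, NOT measured results.
placeholder_runs = [0.50, 0.60, 0.70]
avg, spread = summarize_runs(placeholder_runs)
print(f"mean={avg:.3f}, std={spread:.3f}")
```

Reporting the spread next to the mean helps judge whether a difference between two architectures is larger than the run-to-run noise.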

## Preparation
One-time installation of the required libraries from requirements.txt and creation of the data path

In [None]:
!pip install -r requirements.txt
!mkdir data
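
Re-running `!mkdir data` reports an error once the directory exists. As a hedged alternative, the data path could be created idempotently from Python. The helper name `ensure_dir` is hypothetical and is not part of the project code.

```python
from pathlib import Path


def ensure_dir(path):
    """Create the directory (and any parents) if missing; do nothing if it exists."""
    directory = Path(path)
    directory.mkdir(parents=True, exist_ok=True)
    return directory


# Safe to call repeatedly, e.g. when the notebook cell is re-run.
data_dir = ensure_dir("data")
data_dir = ensure_dir("data")
```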

Downloading CIFAR10 and CIFAR100 datasets

In [None]:
from dataset.dataset_dowloader_ import *

cifar10_dowloader()
cifar100_dowloader()

## The basic training and testing pipeline
### What if we adjust the number of residual blocks in different stages? For fair comparison, please keep the number of residual blocks unchanged.
* `layer_values = [[3, 3, 3], [1, 3, 5], [5, 3, 1], [3, 1, 5]]` - define ResNet20 variants by the number of residual blocks in each of the 3 stages (9 blocks in total for every variant)
* `history_stages = []` - container for the train/validation logs
* `auto_resnet(layer_j, 100, 1, 180, history_stages)`:
    * `100` - CIFAR100 dataset
    * `1` - learning rate multiplier (base learning rate is 1*0.1)
    * `180` - number of epochs

In [None]:
%%time
from auto_resnet import * 

layer_values = [[3, 3, 3], [1, 3, 5], [5, 3, 1], [3, 1, 5]] # 20
history_stages = []

for layer_j in layer_values:
    auto_resnet(layer_j, 100, 1, 180, history_stages)

## Plot results

In [None]:
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

legend_stages = ['train_r20_333', 'train_r20_135', 'train_r20_531', 'train_r20_315']

plt_different_history(history_stages, legend_stages)

## The basic training and testing pipeline
### What if we train residual networks with 50 or 62 layers? How do they compare against the network with 56 layers?
* `layer_values = [[9, 9, 9], [8, 8, 8], [10, 10, 10]]` - define ResNet56, ResNet50 and ResNet62 (9, 8 and 10 residual blocks per stage, respectively)
* `history_50_62 = []` - container for the train/validation logs
* `auto_resnet(layer_j, 100, 1, 180, history_50_62)`:
    * `100` - CIFAR100 dataset
    * `1` - learning rate multiplier (base learning rate is 1*0.1)
    * `180` - number of epochs
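
The block counts above follow from the paper's depth rule for CIFAR networks, depth = 6n + 2, where n is the number of residual blocks per stage. A minimal sketch of the inverse mapping, using a hypothetical helper `blocks_per_stage_for_depth` that is not part of `auto_resnet`:

```python
def blocks_per_stage_for_depth(depth, convs_per_block=2, num_stages=3):
    """Blocks per stage for a CIFAR-style ResNet of the given depth."""
    layers_in_blocks = depth - 2  # exclude the first conv and the final FC layer
    layers_per_block_row = convs_per_block * num_stages
    if layers_in_blocks <= 0 or layers_in_blocks % layers_per_block_row != 0:
        raise ValueError(f"depth {depth} does not fit the 6n + 2 pattern")
    return layers_in_blocks // layers_per_block_row


for depth in (50, 56, 62):
    n = blocks_per_stage_for_depth(depth)
    print(depth, [n] * 3)
```

Depths that do not fit the pattern (for example 60) raise a `ValueError` instead of being silently rounded.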

In [None]:
%%time
from auto_resnet import * 

layer_values = [[9, 9, 9], [8, 8, 8], [10, 10, 10]] # 56, 50, 62
history_50_62 = []

for layer_j in layer_values:
    auto_resnet(layer_j, 100, 1, 180, history_50_62)

## Plot results

In [None]:
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

legend_50_62 = ['train_r56', 'train_r50', 'train_r62']

plt_different_history(history_50_62, legend_50_62)

## The basic training and testing pipeline
### What if we remove all skip connections in residual networks? What if we add a skip connection after every 1 or 3 (not 2) convolutional layers? For a fair comparison, please keep the number of conv layers unchanged.
* `rm_conn_values = [True, False]` - define skip connection flags (True - remove all skip connections in the residual network, False - keep all skip connections; the default flag is False)
* `conv_num_values = [1, 2, 3]` - define the number of conv layers inside a residual block (the default number is 2)
* `history_structure = []` - container for the train/validation logs
* `auto_resnet([3,3,3], 100, 1, 180, history_structure, 0.8, 64, conv_num_j, rm_conn_i)`:
    * `[3,3,3]` - ResNet20 model
    * `100` - CIFAR100 dataset
    * `1` - learning rate multiplier (base learning rate is 1*0.1)
    * `180` - number of epochs
    * `0.8` - fraction of the data used for training (the default train/validation split is 0.8/0.2)
    * `64` - batch size (the default batch size is 64)
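
The experiment grid described above can be enumerated explicitly, which also fixes the order of the plot labels. This is a sketch under stated assumptions: the label names are hypothetical, and the block arithmetic assumes the conv budget is kept fixed by changing the number of blocks per stage, which may differ from what `auto_resnet` does internally.

```python
from itertools import product

rm_conn_values = [True, False]  # True = skip connections removed
conv_num_values = [1, 2, 3]     # conv layers per residual block

# product() yields pairs in the same order as a nested loop with
# rm_conn_values outside and conv_num_values inside.
runs = list(product(rm_conn_values, conv_num_values))

# Hypothetical legend labels, one per run, in run order.
legend = [f"{'no_skip' if rm_conn else 'skip'}_{conv_num}" for rm_conn, conv_num in runs]

# Assumption: ResNet20 has 6 conv layers per stage (3 blocks x 2 convs), and
# that budget stays fixed by changing the number of blocks per stage.
CONVS_PER_STAGE = 6
blocks_per_stage = {conv_num: CONVS_PER_STAGE // conv_num for conv_num in conv_num_values}

print(legend)
print(blocks_per_stage)
```

With `True` listed first in `rm_conn_values`, the three runs without skip connections come first, so any legend passed to the plotting helper needs the same order.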

In [None]:
%%time
from auto_resnet import * 

rm_conn_values = [True, False]
conv_num_values = [1, 2, 3]
history_structure = []

for rm_conn_i in rm_conn_values:
    for conv_num_j in conv_num_values:
        auto_resnet([3,3,3], 100, 1, 180, history_structure, 0.8, 64, conv_num_j, rm_conn_i)

## Plot results

In [None]:
import matplotlib.pyplot as plt

%matplotlib inline
plt.rcParams['figure.figsize'] = (10.0, 8.0) # set default size of plots
plt.rcParams['image.interpolation'] = 'nearest'
plt.rcParams['image.cmap'] = 'gray'

%load_ext autoreload
%autoreload 2

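# Label order follows the training loops: rm_conn_values = [True, False], so the
# no-skip runs come first (assuming auto_resnet appends one history entry per run, in call order).
legend_structure = ['no_skip_1', 'no_skip_2', 'no_skip_3', 'skip_1', 'skip_2', 'skip_3']
plt_different_history(history_structure, legend_structure)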