In [None]:
#@title ## Mount Your Google Drive

#@markdown The next two cells are **magic** cells.
#@markdown They look like text cells, but they run code behind the scenes.
#@markdown You can run them by either clicking on the ▶️ button (to the left of the cell), or by clicking on the cell and typing `Ctrl+Enter` (or `Shift+Enter`).

#@markdown Please run this cell and follow the steps printed after running it. Specifically, it will print a URL: open it, follow the instructions there, and paste the code into the textbox below (then press `Enter`).

from google.colab import drive
drive.mount('/content/gdrive')

In [None]:
#@title ## Map Your Directory
import os

def check_assignment(assignment_dir, files_list):
  files_in_dir = set(os.listdir(assignment_dir))
  for fname in files_list:
    if fname not in files_in_dir:
      raise FileNotFoundError(f'could not find file: {fname} in {assignment_dir}')

assignment_dest = "/content/hw3"
assignment_dir = "/content/gdrive/MyDrive/DL4CV/hw3"  #@param{type:"string"}
assignment_files = ['hw3.ipynb', 'autograd.py', 'functional.py', 'nn.py', 'optim.py',
                    'hw2_functional.py', 'hw2_nn.py', 'hw2_optim.py',
                    'models.py', 'train.py', 'utils.py',
                    'test_functional.py', 'test_nn.py', 'test_optim.py']

# check Google Drive is mounted
if not os.path.isdir("/content/gdrive"):
  raise FileNotFoundError("Your Google Drive isn't mounted. Please run the above cell.")

# check all files there
check_assignment(assignment_dir, assignment_files)

# create symbolic link
!rm -f {assignment_dest}
!ln -s "{assignment_dir}" "{assignment_dest}"
print(f'Successfully mapped (ln -s) "{assignment_dest}" -> "{assignment_dir}"')

# cd to linked dir
%cd -q {assignment_dest}
print(f'Successfully changed directory (cd) to "{assignment_dest}"')
#@markdown Set the path `assignment_dir` to the assignment directory in your Google Drive and run this cell.

#@markdown If you are not sure what the path is, you can use the **Files (📁)** menu (on the left side) to check the path.

## Imports and `autoreload`-Magic
Please run the cell below (only once) to load and set the `autoreload` magic, which automatically reloads the import calls to the python files with your solutions. That means that you can edit the files (in the right-side window), save them (`Ctrl+S`) and just re-run the relevant cells -- the new code will kick in automatically.

**Note:** You **MUST NOT** install any package. If you can't load something, you probably didn't follow the instructions (you either didn't upload all the files, didn't mount your Google Drive, or didn't map your directory).

**Note:** The exercise works as is. If you add or modify imports, it may break things in the notebook. You may do so **AT YOUR OWN RISK**. We will not assist with issues in notebooks with modified imports.

**Note:** Make sure you run **all the cells** up to the point you are working on. Some cells depend on previous cells (mainly imports). Furthermore, make sure to run the cell below (with the autoreload magic) before any cell below it.

In [1]:
import torch
import os

%load_ext autoreload
%autoreload 2

In [2]:
os.chdir('/home/labs/antebilab/guyilan/Courses/DL4CV/hw3')

# (A) Written Assignment

In addition to the coding assignment, there is also a theoretical written assignment that can be found in `hw3.pdf`. 
Please solve this assignment and upload your solution to the google drive folder as `hw3-sol.pdf`. It will be packed together with your coding solution in the **Submit Your Solution** section.

Your solution to the written part should be typed, not hand-written. We recommend using LyX or LaTeX, but you can also use Word or a similar text editor.


# (B) Implement CNN from scratch
In this part you will implement a deep **convolutional** neural network from scratch, including the necessary building blocks (similarly to HW2). You will implement it in the following order:
1. **Differentiable Functions:** a set of differentiable functions that are used as atomic building blocks.
2. **Learnable Layers:** Convolutional layer and MaxPool layer.
3. **Optimizer:** SGD *with momentum* (building on your previous vanilla SGD implementation of HW2).

Note that many functions you have implemented in HW2 will be useful in this assignment. They are already imported (since you uploaded `hw2_functional.py`).

## (B.1) Differentiable Functions

In this section you will implement a set of differentiable functions from scratch. For each function, you will implement the forward and backward methods. After the description of each function, there is a testing cell which tests the correctness of your code.

The skeletons of the differentiable functions to implement are in the `functional.py` file. Open this file by clicking on this link: `/content/hw3/functional.py`. Alternatively, you can go to the left menu, click on **Files (📁)**, go to the directory `hw3` (or `content/hw3`) and double-click on `functional.py` to open it. The tests can be found in `test_functional.py` (link: `/content/hw3/test_functional.py`).

In each step you should fill the blanks (between `# BEGIN SOLUTION` and `# END SOLUTION`) in the relevant methods. DO NOT change any other code segments. You are provided with a cell to run the tests, and with a cell to debug your code (with the relevant imports). As a reminder, for your code to take effect, make sure you save `.py` files using `Ctrl+S`.

### Reminder - `ctx`
As in HW2, in the "from scratch" implementation, you should use a `ctx` (context) variable. This variable is needed for the back propagation algorithm.

Specifically, `ctx` is just a list (or stack) of "backward calls", where each "backward call" is a pair (list/tuple) of two objects:

1. **`backward_fn`:** The backward function. A reference to the backward function to be called in the backward pass.
2. **`args`:** A list (or tuple) of arguments to be passed to `backward_fn`. This list usually consists of the inputs and the outputs of the forward function. Sometimes additional arguments are passed as well. It's important to pass the actual inputs and outputs (same pointer), otherwise it would break the chain of gradients propagation.

The "backward calls" in `ctx` should be ordered according to the time of addition. That is, a backward call that was added later should have a higher index in the list `ctx`. If `ctx` is `None`, it means that gradients (i.e. backward calls) should not be tracked.
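To make the bookkeeping concrete, here is a minimal sketch. It assumes a hypothetical `Var` class (a plain-Python stand-in for a tensor with a `.grad` field) and a hypothetical `scale` function; neither is part of the assignment files. It only illustrates how a backward call is appended to `ctx`, and why the stored arguments must be the very same objects that took part in the forward pass.

```python
class Var:
    """Hypothetical stand-in for a tensor: a value plus an accumulated gradient."""
    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def scale_backward(y, x, factor):
    # dL/dx = dL/dy * factor, accumulated (not overwritten) into x.grad
    x.grad += y.grad * factor


def scale(x, factor, ctx=None):
    y = Var(x.value * factor)
    if ctx is not None:  # ctx is None means "do not track gradients"
        # Store the actual x and y objects, not copies of them.
        ctx.append((scale_backward, (y, x, factor)))
    return y


ctx = []
x = Var(2.0)
y = scale(x, 3.0, ctx=ctx)

# Later backward calls sit at higher indices; here there is exactly one.
backward_fn, args = ctx[-1]
y.grad = 1.0          # pretend y is the loss, so dL/dy = 1
backward_fn(*args)    # writes into the same x object created above
print(x.grad)         # 3.0
```

Had `scale` stored a copy of `x` instead of `x` itself, the gradient would have been written to the copy and the chain would stop there.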



### (B.1.1) Implement the `conv2d` Function

Here you will implement a differentiable `conv2d` function. This includes the forward `conv2d` function and the backward `conv2d_backward` function. 

**Note:** Your solution (both `conv2d` and `conv2d_backward`) should be vectorized. Specifically, consider the `fold`, `unfold`, `einsum` functions of pytorch for vectorized solutions.
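As an illustration of the idea behind `unfold`, the sketch below rewrites a tiny convolution as a product between a patch matrix and a flattened kernel. It is written in plain Python for a single 2D map with stride 1 and no padding, and the helper names `extract_patches` and `conv2d_single` are hypothetical. It is not a substitute for the batched, multi-channel, vectorized solution required here. Like PyTorch's `Conv2d`, it computes a cross-correlation (the kernel is not flipped).

```python
def extract_patches(x, kh, kw):
    """Flatten every kh x kw window of a 2D map (stride 1, no padding)."""
    out_h, out_w = len(x) - kh + 1, len(x[0]) - kw + 1
    patches = [[x[i + r][j + c] for r in range(kh) for c in range(kw)]
               for i in range(out_h) for j in range(out_w)]
    return patches, out_h, out_w


def conv2d_single(x, w, b=0.0):
    """Cross-correlate one 2D map with one kernel via a patch matrix."""
    kh, kw = len(w), len(w[0])
    patches, out_h, out_w = extract_patches(x, kh, kw)
    w_flat = [w[r][c] for r in range(kh) for c in range(kw)]
    # One dot product per patch: this is the "matrix times vector" view.
    flat = [sum(p * q for p, q in zip(patch, w_flat)) + b for patch in patches]
    return [flat[i * out_w:(i + 1) * out_w] for i in range(out_h)]


x = [[1.0, 2.0, 3.0],
     [4.0, 5.0, 6.0],
     [7.0, 8.0, 9.0]]
w = [[1.0, 0.0],
     [0.0, -1.0]]
print(conv2d_single(x, w))  # [[-4.0, -4.0], [-4.0, -4.0]]
```

A slow reference like this can be handy for checking a vectorized implementation on very small inputs.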

#### `conv2d`
The `conv2d` function receives the following arguments (in addition to the autograd context `ctx`):

  * `x`: The batched input. Has shape `(batch_size, in_channels, in_height, in_width)`.
  * `w`: The convolution kernel. Has shape `(out_channels, in_channels, kernel_height, kernel_width)`.
  * `b`: The bias term. Has shape `(out_channels,)`.
  * `padding`: The padding in each dimension, has shape (`width_padding`, `height_padding`), or an int representing same padding in both dimensions.
  * `stride`: The stride in each dimension, has shape (`width_stride`, `height_stride`), or an int representing same stride in both dimensions.
  * `dilation` (**Bonus**): The dilation in each dimension, has shape (`width_dilation`, `height_dilation`), or an int representing same dilation in both dimensions.
  * `groups` (**Bonus**): Division of channels into groups. Read more in [Conv2d documentation](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) of Pytorch.

It computes the (batched version of the) function: $$ \mathbf{y} = W*\mathbf{x} + \mathbf{b} $$
Note that the output `y` has shape which depends on the input parameters.
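The output size along each spatial dimension follows the formula given in the PyTorch `Conv2d` documentation. The helper below, `conv_output_size`, is a hypothetical name introduced only for illustration; it is applied once per spatial dimension with that dimension's padding, stride and dilation.

```python
def conv_output_size(in_size, kernel_size, padding=0, stride=1, dilation=1):
    """Output length along one spatial dimension of a convolution."""
    effective_kernel = dilation * (kernel_size - 1) + 1
    return (in_size + 2 * padding - effective_kernel) // stride + 1


# Examples for an input dimension of 32 pixels.
print(conv_output_size(32, kernel_size=3, padding=1))            # 32
print(conv_output_size(32, kernel_size=5))                       # 28
print(conv_output_size(32, kernel_size=3, padding=1, stride=2))  # 16
print(conv_output_size(32, kernel_size=3, dilation=2))           # 28
```

The same formula gives the output size of a pooling window, with `kernel_size` being the window size.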

#### `conv2d_backward`
The `conv2d_backward` function receives the following arguments:

  * `y`: The batched output. Has shape `(batch_size, out_channels, out_height, out_width)`.
  * `x`: The batched input. Has shape `(batch_size, in_channels, in_height, in_width)`.
  * `w`: The convolution kernel. Has shape `(out_channels, in_channels, kernel_height, kernel_width)`.
  * `b`: The bias term. Has shape `(out_channels,)`.
  * `padding`: The padding in each dimension, has shape (`width_padding`, `height_padding`), or an int representing same padding in both dimensions.
  * `stride`: The stride in each dimension, has shape (`width_stride`, `height_stride`), or an int representing same stride in both dimensions.
  * `dilation` (**Bonus**): The dilation in each dimension, has shape (`width_dilation`, `height_dilation`), or an int representing same dilation in both dimensions.
  * `groups` (**Bonus**): Division of channels into groups. Read more in [Conv2d documentation](https://pytorch.org/docs/stable/generated/torch.nn.Conv2d.html) of Pytorch.

It computes the gradients of the loss w.r.t. `x`, `w` and `b`, given the gradient w.r.t. `y` (in `y.grad`), and accumulates these gradients in `x.grad`, `w.grad` and `b.grad`, respectively. Note that `stride` and `padding` also have an effect on the backward calculation.

**Note:** If you choose not to do the bonus `groups` and `dilation` arguments, you may assume that their value is always 1 (their default value in the function). Note that some tests may fail in this case (with groups/dilation in their names).


---
You should test your solution by running the following cell. You can debug your solution in the cell below it.

In [3]:
!python -m unittest test_functional.TestConv2d


.................
----------------------------------------------------------------------
Ran 17 tests in 0.851s

OK


In [4]:
# Playground for debugging conv2d
from functional import conv2d, conv2d_backward

### (B.1.2) Implement the `max_pool2d` function

Here you will implement a differentiable MaxPool activation. This includes the forward `max_pool2d` function and the backward `max_pool2d_backward` function.

**Note:** Your solution should be vectorized. Please consider `fold`, `unfold`, `gather` and `scatter_add_` functions for a vectorized solution.

#### `max_pool2d`



The `max_pool2d` function receives the following arguments:

  * `x`: The input. Has shape `(batch_size, in_channels, in_height, in_width)`.
  * `kernel_size`: The window size which should be maxpooled, with shape `(width,height)`.
  * `padding`: The padding in each dimension, has shape (`width_padding`, `height_padding`), or an int representing same padding in both dimensions.
  * `stride`: The stride in each dimension, has shape (`width_stride`, `height_stride`), or an int representing same stride in both dimensions.
  * `dilation` (**Bonus**): The dilation in each dimension, has shape (`width_dilation`, `height_dilation`), or an int representing same dilation in both dimensions.

The output `y` is the max value in each window of size `kernel_size` (The windows locations are affected by the stride and padding as well).
Note that the output `y` has shape which depends on the input parameters.

#### `max_pool2d_backward`
The `max_pool2d_backward` function receives the following arguments:

  * `y`: The output. Has shape `(batch_size, in_channels, out_height, out_width)`.
  * `x`: The input. 
  * `index`: The indices of the maximum values in each window.
  * `kernel_size`: The window size which should be maxpooled, with shape `(width,height)`.
  * `padding`: The padding in each dimension, has shape (`width_padding`, `height_padding`), or an int representing same padding in both dimensions.
  * `stride`: The stride in each dimension, has shape (`width_stride`, `height_stride`), or an int representing same stride in both dimensions.
  * `dilation` (**Bonus**): The dilation in each dimension, has shape (`width_dilation`, `height_dilation`), or an int representing same dilation in both dimensions.


It computes the gradient of the loss w.r.t. `x`, given the gradient w.r.t. `y` (in `y.grad`), and accumulates this gradient in `x.grad`.

**Note:** If you choose not to do the bonus `dilation` argument, you may assume that its value is always 1 (its default value in the function). Note that some tests may fail in this case (with dilation in their names).
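The role of `index` in the backward pass can be seen in a small loop-based reference. This is a sketch in plain Python for a single 2D map, without padding or dilation, using hypothetical helper names. It is deliberately not vectorized, so it does not satisfy the requirement above, but it shows what a `gather`/`scatter_add_`-based solution has to reproduce.

```python
def max_pool2d_single(x, kernel_size, stride):
    """Max-pool one 2D map (list of rows), without padding or dilation.

    Returns the pooled map and, per output cell, the flat index
    (row * width + col) of the winning input element.
    """
    height, width = len(x), len(x[0])
    out_h = (height - kernel_size) // stride + 1
    out_w = (width - kernel_size) // stride + 1
    y, index = [], []
    for i in range(out_h):
        y_row, idx_row = [], []
        for j in range(out_w):
            window = [(x[r][c], r * width + c)
                      for r in range(i * stride, i * stride + kernel_size)
                      for c in range(j * stride, j * stride + kernel_size)]
            value, flat = max(window, key=lambda item: item[0])
            y_row.append(value)
            idx_row.append(flat)
        y.append(y_row)
        index.append(idx_row)
    return y, index


def max_pool2d_single_backward(y_grad, index, height, width):
    """Route each output gradient to the input position that won its window."""
    x_grad = [[0.0] * width for _ in range(height)]
    for grad_row, idx_row in zip(y_grad, index):
        for grad, flat in zip(grad_row, idx_row):
            # Accumulate: overlapping windows may select the same input.
            x_grad[flat // width][flat % width] += grad
    return x_grad


x = [[1.0, 2.0, 5.0, 3.0],
     [4.0, 0.0, 1.0, 2.0],
     [7.0, 1.0, 0.0, 2.0],
     [3.0, 6.0, 9.0, 4.0]]
y, index = max_pool2d_single(x, kernel_size=2, stride=2)
print(y)      # [[4.0, 5.0], [7.0, 9.0]]
print(index)  # [[4, 2], [8, 14]]

x_grad = max_pool2d_single_backward([[1.0, 2.0], [3.0, 4.0]], index, height=4, width=4)
print(x_grad[1][0], x_grad[0][2], x_grad[2][0], x_grad[3][2])  # 1.0 2.0 3.0 4.0
```

All input positions that never won a window keep a zero gradient.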


---
You should test your solution by running the following cell. You can debug your solution in the cell below it.

In [5]:
!python -m unittest test_functional.TestMaxPool2d

...............
----------------------------------------------------------------------
Ran 15 tests in 0.761s

OK


In [6]:
# Playground for debugging max_pool2d
from functional import max_pool2d, max_pool2d_backward

### (B.1.3) Implement the `view` Function

Here you will implement a differentiable `view` function. This function takes an input and changes its shape. It will be required when training a CNN, since the input is 2D, but the output should be a 1D vector. You may use the `view` function of PyTorch in your implementation.

#### `view`
The `view` function receives:

  * `x`: Input, with arbitrary shape.
  * `size`: The output shape.

#### `view_backward`
The `view_backward` function receives the following arguments:

  * `y`: The output with arbitrary shape. 
  * `x`: The input.


It computes the gradient of the loss w.r.t. `x`, given the gradient w.r.t. `y` (in `y.grad`), and accumulates this gradient in `x.grad`.

---
You should test your solution by running the following cell. You can debug your solution in the cell below it.

In [7]:
!python -m unittest test_functional.TestView

.......
----------------------------------------------------------------------
Ran 7 tests in 0.559s

OK


In [8]:
# Playground for debugging view
from functional import view, view_backward

### (B.1.4) Implement the `add` Function

Here you will implement a differentiable `add` function. This function takes two inputs with the same size and adds them. It will be required for designing a neural network with residual connections.

#### `add`
The `add` function receives:

  * `a`: Input, with arbitrary shape.
  * `b`: Input, should have the same shape as `a`.

#### `add_backward`
The `add_backward` function receives the following arguments:

  * `y`: The output with arbitrary shape. 
  * `a`: The first input.
  * `b`: The second input.

It computes the gradients of the loss w.r.t. `a` and `b`, given the gradient w.r.t. `y` (in `y.grad`), and accumulates these gradients in `a.grad` and `b.grad`.

---
You should test your solution by running the following cell. You can debug your solution in the cell below it.

In [9]:
!python -m unittest test_functional.TestAdd

.......
----------------------------------------------------------------------
Ran 7 tests in 0.540s

OK


In [10]:
# Playground for debugging add
from functional import add, add_backward

## (B.2) Autograd
The `autograd.py` file you implemented as part of HW2 should be in the folder; you will use it again in this assignment.

Reminder:

`autograd.py` contains a general `backward` method. This method stands at the core of back-propagation.

This method receives two arguments:

* `loss`: The loss tensor. This tensor must be a scalar (has shape `()`). The gradients of this `loss` w.r.t. the other tensors will be computed.
* `ctx`: The autograd context. A list of backward calls. These backward calls should be evaluated to back-propagate the gradient from `loss` to the tensors used in the computation of `loss`.
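As a reminder of how such a `backward` can work, here is a sketch on plain Python scalars. It assumes that `backward` seeds the loss gradient with one and evaluates the backward calls from the latest to the earliest; your own `autograd.py` may differ in details. The `Var` class and the `square` and `add` functions are hypothetical stand-ins, not the functions from `functional.py`.

```python
class Var:
    """Hypothetical stand-in for a tensor: a value plus an accumulated gradient."""
    def __init__(self, value):
        self.value = value
        self.grad = 0.0


def square(x, ctx=None):
    y = Var(x.value ** 2)
    if ctx is not None:
        ctx.append((square_backward, (y, x)))
    return y


def square_backward(y, x):
    x.grad += y.grad * 2.0 * x.value  # d(x^2)/dx = 2x


def add(a, b, ctx=None):
    y = Var(a.value + b.value)
    if ctx is not None:
        ctx.append((add_backward, (y, a, b)))
    return y


def add_backward(y, a, b):
    a.grad += y.grad  # addition passes the gradient through unchanged
    b.grad += y.grad


def backward(loss, ctx):
    loss.grad = 1.0                          # dL/dL = 1
    for backward_fn, args in reversed(ctx):  # latest call first
        backward_fn(*args)


ctx = []
x = Var(2.0)
h = square(x, ctx=ctx)     # h = x^2
loss = add(x, h, ctx=ctx)  # loss = x + x^2, a residual-style connection
backward(loss, ctx)
print(loss.value, x.grad)  # 6.0 5.0
```

Because `x` feeds both branches, its gradient is the sum of two contributions (1 from the skip path and 2x = 4 from the squared path), which is why backward functions accumulate into `.grad` rather than overwrite it.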

## (B.3) Layers

So far in this exercise, you have implemented *stateless* differentiable functions. In this section, you will implement *stateful* layers, with parameters. 

In this section you will implement a learnable layer `Conv2d`. You will also implement a non-learnable layer - `MaxPool2d` (it has non-learnable parameters that we wish to store in a per-layer fashion). The implementation is similar to vanilla PyTorch.

You may use any layer you have implemented in HW2 in this assignment. It is already imported (since you uploaded `hw2_nn.py`).

The skeleton of the layers to implement is in the `nn.py` file (link: `/content/hw3/nn.py`). The tests can be found in `test_nn.py` (link: `/content/hw3/test_nn.py`).

Reminder: Layers (and networks) inherit from the provided class `Module` (which is similar to PyTorch's `nn.Module`). This abstract class implements some utility methods. 

In the `nn.py` file, you should fill the blanks (between `# BEGIN SOLUTION` and `# END SOLUTION`) in the relevant methods. DO NOT change any other code segments. You are provided with a cell to run the tests, and with a cell to debug your code (with the relevant imports). 

In your layers, you should:

1. **Create parameter tensors:** create tensors for the **learnable** parameters in the correct shape. The parameters should be attributes of the layer, i.e. set as `self.<param> = <tensor>`. This is done in `__init__`.
2. **Register learnable parameters:** add their names to `self._parameters`. This will be used by the provided `Module.parameters()` (to list the module's parameters) and `Module.to()` (to transfer the module's parameters to a device) methods. This is done in `__init__`.
3. **Initialize learnable parameters:** initialization of the layer parameters has significant influence on the local minimum the network reaches during training. This is done in `init_parameters()`. You should call this method from `__init__`, so newly created layers are initialized.
4. **Store non-learnable arguments:** Create instance attributes (i.e. `self.<something>`) to hold the non-learnable arguments. These will be used to call the stateless functions implemented in B.1 with the correct parameters.
5. **Implement a forward method:** use the existing differentiable functions from section B.1, and implement the `forward()` method.


**Note:** Since this part doesn't use PyTorch's built-in autograd mechanism, please do not use tensors' `requires_grad` (this will result in errors/warnings).
Furthermore, do not use `nn.Parameter` in _from scratch_ layers.

### (B.3.1) Implement `Conv2d` Layer

The learnable parameters of the `Conv2d` layer are the convolution kernels `weight`, and the bias term `bias`. Note that there are non-learnable arguments which should be stored in your layers (e.g. `stride`).

**Note:** When the `Conv2d` doesn't require `bias`, do not store it in `self._parameters`.

In [11]:
!python -m unittest test_nn.TestConv2d

...
----------------------------------------------------------------------
Ran 3 tests in 0.028s

OK


In [12]:
# Playground for debugging Conv2d
from nn import Conv2d

### (B.3.2) Implement `MaxPool2d` Layer

The `MaxPool2d` layer doesn't have learnable parameters. Note that there are non-learnable arguments which should be stored in your layers (e.g. `stride`). 

In [13]:
!python -m unittest test_nn.TestMaxPool2d

...
----------------------------------------------------------------------
Ran 3 tests in 0.008s

OK


In [14]:
# Playground for debugging MaxPool2d
from nn import MaxPool2d

## (B.4) Optimizer

In this section you will implement an advanced optimizer. In HW2, a vanilla SGD optimizer has been implemented. In this assignment, you will implement *SGD with momentum*. Reminder - The optimizer has three main functions:

1. `__init__`: Receives the list of parameters (weights) to update their values and save them. May receive additional arguments, such as learning-rate and **momentum**.
2. `step`: Updates the parameters values based on the value of their gradients. Doesn't receive any argument.
3. `zero_grad`: Zeros the gradients of the tracked parameters. This is necessary since gradients are accumulated in each backward pass, and we don't want to mix between batches. Doesn't receive any argument.

The skeleton of the optimizer is in the `optim.py` file (link: `/content/hw3/optim.py`). The tests can be found in `test_optim.py` (link: `/content/hw3/test_optim.py`). You should fill the blanks between `# BEGIN SOLUTION` and `# END SOLUTION`. DO NOT change any other code segments. 


### (B.4.1) MomentumSGD Optimizer
In this part, you'll implement a MomentumSGD optimizer. This optimizer has the following update rule:
$$
\begin{aligned}
\mathbf{v}_{0} &\leftarrow \mathbf{0}
\\
\mathbf{v}_{n+1} &\leftarrow \mu \cdot \mathbf{v}_{n} + \mathbf{g}_{n+1}
\\
\mathbf{x}_{n+1} &\leftarrow \mathbf{x}_{n} - \eta \cdot \mathbf{v}_{n+1}
\end{aligned}
$$
Where $\mathbf{x}_{n}$ is the parameter at step $n$, $\mathbf{g}_{n}$ is its gradient at step $n$, $\eta$ is the learning rate (also called `lr`), and $\mu$ is the momentum parameter.

You should implement the `__init__`, `step` and `zero_grad` methods of the `MomentumSGD` optimizer in `optim.py`. You are encouraged to base your solution on your SGD solution of HW2.

**Note:** Parameters (tensors) should be updated **in-place** (i.e. with the `-=` operator) in `step`.

**Note:** A gradient (`param.grad`) which is set to `None` is also considered as zero.
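The update rule can be traced by hand on a single scalar parameter. The sketch below uses plain Python floats and a hypothetical helper `momentum_step`. It reads the note about `None` gradients literally, treating them as zero inside the rule, which is an assumption: the provided tests are the authoritative reference. Since floats are immutable, it returns new values, whereas the real optimizer must update tensors in place with `-=`.

```python
def momentum_step(x, v, g, lr, momentum):
    """One update of the momentum rule for a single scalar parameter.

    A gradient of None is treated as zero (assumption, see the lead-in).
    """
    if g is None:
        g = 0.0
    v = momentum * v + g   # v_{n+1} = mu * v_n + g_{n+1}
    x = x - lr * v         # x_{n+1} = x_n - eta * v_{n+1}
    return x, v


x, v = 1.0, 0.0            # v_0 = 0
for g in [1.0, 1.0, None]:
    x, v = momentum_step(x, v, g, lr=0.1, momentum=0.9)
    print(round(x, 4), round(v, 4))
# 0.9 1.0
# 0.71 1.9
# 0.539 1.71
```

Note how the velocity keeps moving the parameter in the third step even though the gradient is zero there.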

In [15]:
!python -m unittest test_optim.TestMomentumSGD

  torch.testing.assert_allclose(param, ref_param, msg=dbg)
.......
----------------------------------------------------------------------
Ran 7 tests in 1.064s

OK


In [16]:
# Playground for debugging MomentumSGD
from optim import MomentumSGD

# Setup Before Training

In this part you will need to use GPU (this will have a significant impact on the training speed). To get a GPU in Google Colab, please go to the top menu and to: **Runtime ➔ Change runtime type**. Then, select **GPU** as **Hardware accelerator**.

Please run the cell below to set your pytorch device (either GPU or CPU), to load the dataset and to create data loaders.

In HW2, the dataset was the MNIST digits dataset (where you classified black and white digits). In this exercise, you will be required to classify RGB images into ten categories, using the [CIFAR-10](https://www.cs.toronto.edu/~kriz/cifar.html) dataset. These are RGB images (i.e. they have 3 channels) of size $32 \times 32$.



In [3]:
from utils import load_cifar10

# Set the device
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
pin_memory = device.type == 'cuda'

# Load the training and test sets
train_data = load_cifar10(mode='train')
test_data = load_cifar10(mode='test')

# Create dataloaders for training and test sets
train_loader = torch.utils.data.DataLoader(train_data, batch_size=64, shuffle=True, pin_memory=pin_memory)
test_loader = torch.utils.data.DataLoader(test_data, batch_size=64, pin_memory=pin_memory)

Files already downloaded and verified
Files already downloaded and verified


# (C) Define and Train Convolutional Neural Networks From Scratch


In this part, you will define and train neural networks from scratch. You will use your differentiable functions from section (B).

The skeletons for this assignment can be found in the `models.py` (link: `/content/hw3/models.py`) and `train.py` (link: `/content/hw3/train.py`). You should fill the blanks between `# BEGIN SOLUTION` and `# END SOLUTION`. 

### `train.py`
Your solution for `train.py` should be very similar to your solution of HW2 (except for a minor change).

**Note:** A vectorized implementation of the differentiable functions (in `functional.py` and `hw2_functional.py`) has a dramatic effect on the training speed. If your solution to HW2 (i.e. in `hw2_functional.py`) was not vectorized, consider re-implementing it in a vectorized form. You may consult and share vectorized code **of the already submitted hw2** with your colleagues.

### `models.py`
Here you will implement the `ConvNet`.

In `/content/hw3/models.py`, define the architecture (depth, width, etc.) in the `__init__` function. Remember to register every layer in the following way: `self._modules = ['conv1', 'conv2', ..., 'fc']`. 

The `forward` method will be used for defining the forward pass in your CNN.

---

Once finished, please run the cell below to import the relevant objects in order to train the models.

In [7]:
from functional import cross_entropy_loss as cross_entropy_scratch
from models import ConvNet
from optim import MomentumSGD
from train import train_loop as train_loop_scratch

## (C.1) Implement and Train a ConvNet

 

Once the network is designed, use the following cell to train the network. This includes the following parts:

1. Create the model.
2. (Optional) Transfer the model to `device`.
3. Create an optimizer and set its parameters (this must be done once the model is on its final device; it will not work otherwise).
4. Set other hyper-parameters (loss function, number of epochs, etc.).
5. Train the model.

Your goal is to reach high accuracy (>65%). Achieving substantially higher accuracy (>80%) will be awarded a bonus. You are encouraged to fine-tune the following:
* Network architecture (number of layers, convolution parameters, number of channels, etc.).
* Weights initialization.
* Optimizer parameters.
* Number of epochs.

Furthermore, you may also try other ways to optimize your performance, such as:
* Adding dilation and groups to your conv implementation.
* Using residual-connections in your network.
* Changing the learning rate during training (called scheduling, can be done in the notebook, by calling training loop several times)
* Implementing and adding `BatchNorm2d` ([read this before](https://kratzert.github.io/2016/02/12/understanding-the-gradient-flow-through-the-batch-normalization-layer.html)). <br>**Note:** This is technically very difficult and time-consuming. You would have to implement `functional.batchnorm2d`, `functional.batchnorm2d_backward` and `nn.BatchNorm2d`. The `mu` and `sigma` should be accumulated in **buffers** (`self._buffers`) of `nn.BatchNorm2d`, not **parameters**. You may implement additional auxiliary differentiable functions for that task, and consult with us regarding this.

In [9]:
# BEGIN SOLUTION
# Define your model
model = ConvNet(in_channels=3, num_classes=10)

# Transfer it to device
model = model.to(device=device)

# Set a criterion (loss function)
criterion = cross_entropy_scratch

# Set the number of epochs
epochs = 200
lr = 0.002
momentum = 0.9

# Train your model
optimizer = MomentumSGD(model.parameters(), lr=lr, momentum=momentum)
train_loop_scratch(model=model,
                   criterion=criterion,
                   optimizer=optimizer,
                   train_loader=train_loader,
                   test_loader=test_loader,
                   device=device,
                   epochs=epochs,
                   rounds = 10)



Train   Epoch: 001 / 200   Loss:   2.302   Accuracy: 0.117
 Test   Epoch: 001 / 200   Loss:     2.3   Accuracy: 0.166
Train   Epoch: 002 / 200   Loss:   2.152   Accuracy: 0.211
 Test   Epoch: 002 / 200   Loss:   1.983   Accuracy: 0.259
Train   Epoch: 003 / 200   Loss:   1.791   Accuracy: 0.340
 Test   Epoch: 003 / 200   Loss:   1.638   Accuracy: 0.400
Train   Epoch: 004 / 200   Loss:   1.553   Accuracy: 0.435
 Test   Epoch: 004 / 200   Loss:   1.481   Accuracy: 0.464
Train   Epoch: 005 / 200   Loss:   1.412   Accuracy: 0.494
 Test   Epoch: 005 / 200   Loss:    1.33   Accuracy: 0.520
Train   Epoch: 006 / 200   Loss:   1.275   Accuracy: 0.546
 Test   Epoch: 006 / 200   Loss:   1.227   Accuracy: 0.566
Train   Epoch: 007 / 200   Loss:   1.149   Accuracy: 0.594
 Test   Epoch: 007 / 200   Loss:   1.142   Accuracy: 0.601
Train   Epoch: 008 / 200   Loss:   1.046   Accuracy: 0.633
 Test   Epoch: 008 / 200   Loss:   1.049   Accuracy: 0.634
Train   Epoch: 009 / 200   Loss:  0.9551   Accuracy: 0.6

# (D) Define and Train Convolutional Neural Network Using Pytorch
In this section, you will use Pytorch to build and train a neural network for the task of super resolution of a single image. 

Please open the notebook located in `zssr/main.ipynb` (in a different tab), and follow the instructions there. Save and close this notebook.