# Computing gradients

In this notebook, your task is to implement the forward and backward passes of a Linear module.

Resources:

* Backprop with focus on PyTorch: https://www.youtube.com/watch?v=ma2KXWblllc (see also other lectures from this series)

* Lecture on backpropagation: https://www.cs.ox.ac.uk/people/nando.defreitas/machinelearning/ (Lecture 8)

# Setup

In [2]:
%load_ext autoreload
%autoreload 2

import numpy as np
import tqdm
import json

import torch
import torch.nn.functional as F

from torch import optim
from torch import nn
from torch.autograd import Variable

from keras.datasets import fashion_mnist
from keras.utils import np_utils

%matplotlib inline
import matplotlib.pylab as plt
import matplotlib as mpl

from torch.autograd import gradcheck

Using Theano backend.


# Linear

Your task is to implement the forward and backward passes of a Linear module.

Hint: try implementing it with for loops first, and then rewrite the code using matrix operations.
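The hint above can be illustrated on the forward pass alone. The following is a minimal NumPy sketch, not the notebook's reference solution; the function names `linear_forward_loops` and `linear_forward_matrix` are hypothetical and chosen only for this example.

```python
import numpy as np

def linear_forward_loops(x, w, b=None):
    """out[n, j] = sum_k x[n, k] * w[j, k] (+ b[j]), written with explicit loops."""
    n_samples, n_in = x.shape
    n_out = w.shape[0]
    out = np.zeros((n_samples, n_out))
    for n in range(n_samples):
        for j in range(n_out):
            for k in range(n_in):
                out[n, j] += x[n, k] * w[j, k]
            if b is not None:
                out[n, j] += b[j]
    return out

def linear_forward_matrix(x, w, b=None):
    """Same computation as a single matrix product, mirroring input.mm(weight.t())."""
    out = x @ w.T
    if b is not None:
        out = out + b  # b is broadcast over the batch dimension
    return out

rng = np.random.default_rng(0)
x = rng.standard_normal((4, 3))   # 4 samples, 3 input features
w = rng.standard_normal((2, 3))   # 2 output features
b = rng.standard_normal(2)

# Both versions should agree up to floating-point rounding.
assert np.allclose(linear_forward_loops(x, w, b), linear_forward_matrix(x, w, b))
```

Writing the loop version first makes the index pattern explicit, which helps when deriving the backward pass by hand.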

In [3]:
# Inherit from Function
class Linear(torch.autograd.Function):
    # bias is an optional argument
    def forward(self, input, weight, bias=None):
        self.save_for_backward(input, weight, bias)
        output = input.mm(weight.t())
        if bias is not None:
            output += bias.unsqueeze(0).expand_as(output)
        return output

    # This function has only a single output, so it gets only one gradient
    def backward(self, grad_output):
        # This is a pattern that is very convenient - at the top of backward
        # unpack saved_tensors and initialize all gradients w.r.t. inputs to
        # None. Thanks to the fact that additional trailing Nones are
        # ignored, the return statement is simple even when the function has
        # optional inputs.
        input, weight, bias = self.saved_tensors
        grad_input = grad_weight = grad_bias = None
        
        print(grad_output.shape)

        # These needs_input_grad checks are optional and there only to
        # improve efficiency. If you want to make your code simpler, you can
        # skip them. Returning gradients for inputs that don't require it is
        # not an error.
        if self.needs_input_grad[0]:
            grad_input = grad_output.mm(weight)  # dL/dx = dL/dout * dout/dx
        if self.needs_input_grad[1]:
            grad_weight = grad_output.t().mm(input)  # dL/dw = dL/dout * dout/dw
        if bias is not None and self.needs_input_grad[2]:
            grad_bias = grad_output.sum(0).squeeze(0)  # dL/db = dL/dout * dout/db

        # Return dL/d(input_i), where i=0 is input, i=1 is weight, i=2 is bias
        return grad_input, grad_weight, grad_bias
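
The gradient expressions used in `backward` can be derived from the element-wise form of the forward pass. The derivation below is a sketch under assumed notation that does not appear in the notebook: $X$ is the input of shape $N \times D_{in}$, $W$ the weight of shape $D_{out} \times D_{in}$, $b$ the bias, $Y$ the output, and $G = \partial L / \partial Y$ the incoming `grad_output`.

$$
Y_{nj} = \sum_{k} X_{nk} W_{jk} + b_j
$$

$$
\frac{\partial L}{\partial X_{nk}} = \sum_{j} G_{nj} W_{jk},
\qquad
\frac{\partial L}{\partial W_{jk}} = \sum_{n} G_{nj} X_{nk},
\qquad
\frac{\partial L}{\partial b_j} = \sum_{n} G_{nj}
$$

In matrix form these are $G W$, $G^\top X$, and the column sums of $G$. The last two match the `grad_output.t().mm(input)` and `grad_output.sum(0)` expressions in the cell above.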

In [5]:
input = (Variable(torch.randn(20,20).double(), requires_grad=False), 
         Variable(torch.randn(15,20).double(), requires_grad=True))
test = gradcheck(Linear(), input, eps=1e-6, atol=1e-4)
print(test)

torch.Size([20, 15])
...
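
The idea behind the check above is to compare an analytic gradient against a finite-difference estimate. The following NumPy sketch is a simplified illustration, not how `gradcheck` is implemented: `gradcheck` compares Jacobians, whereas this version reduces the output to a scalar with an arbitrary upstream gradient `g`. The helper `numerical_grad` is a hypothetical name; `eps` and `atol` reuse the values from the call above.

```python
import numpy as np

def numerical_grad(f, x, eps=1e-6):
    """Central-difference estimate of d f(x) / d x for a scalar-valued f."""
    grad = np.zeros_like(x)
    for idx in np.ndindex(*x.shape):
        orig = x[idx]
        x[idx] = orig + eps
        f_plus = f(x)
        x[idx] = orig - eps
        f_minus = f(x)
        x[idx] = orig  # restore the perturbed entry
        grad[idx] = (f_plus - f_minus) / (2 * eps)
    return grad

rng = np.random.default_rng(0)
x = rng.standard_normal((20, 20))   # input, same shape as in the cell above
w = rng.standard_normal((15, 20))   # weight
g = rng.standard_normal((20, 15))   # arbitrary upstream gradient dL/dout

def loss(w_):
    # Scalar L = sum(out * g), chosen so that dL/dout equals g.
    return np.sum((x @ w_.T) * g)

analytic = g.T @ x                  # same formula as grad_output.t().mm(input)
numeric = numerical_grad(loss, w)
assert np.allclose(analytic, numeric, atol=1e-4)
```

Each entry of the numerical gradient costs two forward evaluations, which is one reason a gradient check calls the function many times.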

# Conv1d

Your task is to implement the forward and backward passes of a 1D convolutional layer.

We will have a separate lab on convolutions. A crash course on CNNs:

<img width=300 src=http://cs231n.github.io/assets/nn1/neural_net2.jpeg>

<img width=400 src=http://cs231n.github.io/assets/cnn/stride.jpeg>
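
As a starting point for the loop-based version, here is a minimal NumPy sketch of a 1D convolution with its backward pass. It makes simplifying assumptions the notebook does not state: a single channel, no padding, stride 1, no bias, and cross-correlation (no kernel flip), as is common in deep learning libraries. The function names are hypothetical.

```python
import numpy as np

def conv1d_forward(x, w):
    """Valid cross-correlation: out[i] = sum_k x[i + k] * w[k]."""
    n_out = len(x) - len(w) + 1
    out = np.zeros(n_out)
    for i in range(n_out):
        for k in range(len(w)):
            out[i] += x[i + k] * w[k]
    return out

def conv1d_backward(x, w, grad_out):
    """Gradients of a scalar loss w.r.t. x and w, given grad_out = dL/dout."""
    grad_x = np.zeros_like(x)
    grad_w = np.zeros_like(w)
    for i in range(len(grad_out)):
        for k in range(len(w)):
            # out[i] depends on x[i + k] through w[k], and on w[k] through x[i + k].
            grad_x[i + k] += grad_out[i] * w[k]
            grad_w[k] += grad_out[i] * x[i + k]
    return grad_x, grad_w

x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
w = np.array([1.0, 0.0, -1.0])

out = conv1d_forward(x, w)                                  # [-2., -2., -2.]
grad_x, grad_w = conv1d_backward(x, w, np.ones_like(out))   # loss = sum(out)
```

The backward loop visits the same `(i, k)` pairs as the forward loop and accumulates each contribution in the opposite direction, the same pattern as in the Linear module.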

# Tests

In [6]:
result = {}  # assumption: a dict of per-task scores; it is not defined earlier in this notebook
input = (Variable(torch.randn(20,20).double(), requires_grad=True),
         Variable(torch.randn(15,20).double(), requires_grad=True))
result['linear'] = 0.5*int(gradcheck(Linear(), input, eps=1e-6, atol=1e-4))

torch.Size([20, 15])
...