# EECS 445 PyTorch Tutorial

Adapted from the `EECS 498 - Deep Learning` tutorial found on ( https://github.com/honglaklee/deep-learning-course/blob/master/CodeExamples/01_basics.ipynb ).

## PyTorch

WHAT IS PYTORCH?

It’s a Python-based scientific computing package targeted at two sets of audiences:

* A replacement for NumPy to use the power of GPUs
* a deep learning research platform that provides maximum flexibility and speed

### Tensors

Tensors are similar to NumPy’s ndarrays, with the addition being that Tensors can also be used on a GPU to accelerate computing.

PyTorch provides functions similar to numpy to create tensors:

In [5]:
import torch
import numpy as np
import matplotlib.pyplot as plt
%matplotlib inline

In [6]:
x = torch.empty(5, 3) # Construct a 5x3 matrix, uninitialized
print(x)
print(x.size()) # torch.Size is a tuple, so it supports all tuple operations

tensor([[0.0000e+00, 3.6893e+19, 0.0000e+00],
        [3.6893e+19, 5.6052e-45, 0.0000e+00],
        [0.0000e+00, 0.0000e+00, 0.0000e+00],
        [3.6893e+19, 3.8246e-34, 8.5920e+09],
        [2.8026e-45, 3.6893e+19, 3.8226e-34]])
torch.Size([5, 3])


In [7]:
x = torch.rand(2, 3) # Construct a randomly initialized matrix
print(x)

tensor([[0.0207, 0.8633, 0.0724],
        [0.2607, 0.4335, 0.4868]])


In [8]:
x = torch.zeros(5, 3, dtype=torch.long) # Construct a matrix filled zeros and of dtype long
print(x)

tensor([[0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0],
        [0, 0, 0]])


In [9]:
x = torch.tensor([5.5, 3]) # Construct a tensor directly from data, similarly to how you construct NumPy Arrays
print(x)

tensor([5.5000, 3.0000])


In [11]:
# Create a tensor based on an existing tensor.
# These methods will reuse properties of the input tensor, e.g. dtype, unless new values are provided by user
x = x.new_ones(2, 3)      # new_* methods take in sizes
x = x.new_ones((2, 3), dtype=torch.double)      # this is equivalent
print(x)

x = torch.randn_like(x, dtype=torch.float)    # override dtype!
print(x)

tensor([[1., 1., 1.],
        [1., 1., 1.]], dtype=torch.float64)
tensor([[ 1.4561,  0.1384,  1.4581],
        [-0.3650,  0.6728, -1.1754]])


### Operations

There are multiple syntaxes for operations. In the following example, we will take a look at the addition operation.

In [12]:
y = torch.rand(2, 3)
print(x + y)

print(torch.add(x, y))

result = torch.empty(2, 3)
torch.add(x, y, out=result) # Providing an output tensor as argument
print(result)

y.add_(x) # In-place addition
print(y)

tensor([[ 2.0091,  0.8896,  2.2819],
        [-0.1125,  1.3262, -0.3386]])
tensor([[ 2.0091,  0.8896,  2.2819],
        [-0.1125,  1.3262, -0.3386]])
tensor([[ 2.0091,  0.8896,  2.2819],
        [-0.1125,  1.3262, -0.3386]])
tensor([[ 2.0091,  0.8896,  2.2819],
        [-0.1125,  1.3262, -0.3386]])


**Note**: Any operation that mutates a tensor in-place is post-fixed with an \_. For example: x.copy_(y), x.t_(), will change x.

### Indexing

You can use standard numpy-like indexing

If you have a one element tensor, use .item() to get the value as a Python number

In [14]:
x = torch.randn(1)
print(x)
print(x.item())

tensor([-1.0010])
-1.00099778175354


**Read later:**


  100+ Tensor operations, including transposing, indexing, slicing,
  mathematical operations, linear algebra, random numbers, etc.,
  are described here <https://pytorch.org/docs/torch>.

### NumPy Bridge

Converting a Torch Tensor to a NumPy array and vice versa is a breeze.

**Important**: The Torch Tensor and NumPy array will share their underlying memory
locations, and changing one will change the other.


In [15]:
a = torch.ones(5)
print(a)
b = a.numpy() # Converts torch tensor to numpy array
print(b)

a.add_(1)
print(a)
print(b)

tensor([1., 1., 1., 1., 1.])
[1. 1. 1. 1. 1.]
tensor([2., 2., 2., 2., 2.])
[2. 2. 2. 2. 2.]


In [16]:
import numpy as np
a = np.ones(5)
b = torch.from_numpy(a) # Converts numpy array to torch tensor
np.add(a, 1, out=a) # broadcasting
print(a)
print(b)

[2. 2. 2. 2. 2.]
tensor([2., 2., 2., 2., 2.], dtype=torch.float64)


### CUDA Tensor
Tensors can be moved onto any device using the .to method.

In [17]:
# let us run this cell only if CUDA is available
# We will use ``torch.device`` objects to move tensors in and out of GPU
if torch.cuda.is_available():
    device = torch.device("cuda")          # a CUDA device object
    y = torch.ones_like(x, device=device)  # directly create a tensor on GPU
    x = x.to(device)                       # or just use strings ``.to("cuda")``
    z = x + y
    print(z)
    print(z.to("cpu", torch.double))       # ``.to`` can also change dtype together!

### Linear Regression example

torch.nn is a Neural Network package useful for constructing models which provides layers, activation functions, loss functions, etc. ( https://github.com/torch/nn )

In [1]:
from itertools import count

import torch
import torch.autograd
import torch.nn as nn
import torch.nn.functional as F

Define a polynomial function

In [2]:
POLY_DEGREE = 4
W_target = torch.randn(POLY_DEGREE, 1) * 5
b_target = torch.randn(1) * 5

def f(x):
    """Approximated function."""
    return x.mm(W_target) + b_target.item()

Data loader and other utils

In [3]:
def make_features(x):
    """Builds features i.e. a matrix with columns [x, x^2, x^3, x^4]."""
    # change shape of x from (N,) to (N,1)
    x = x.unsqueeze(1)
    # create POLY_DEGREE (N,1) tensors and concatenate them across dimension 1 into 1 (N, POLY_DEGREE) tensor
    return torch.cat([x ** i for i in range(1, POLY_DEGREE+1)], 1)

def get_batch(batch_size=32):
    """Builds a batch i.e. (x, f(x)) pair."""
    random = torch.randn(batch_size)
    x = make_features(random)
    y = f(x)
    return x, y

def poly_desc(W, b):
    """Creates a string description of a polynomial."""
    result = 'y = '
    for i, w in enumerate(W):
        result += '{:+.2f} x^{} '.format(w, len(W) - i)
    result += '{:+.2f}'.format(b[0])
    return result

Defining the model

In [6]:
# Define model
class myNN(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc = nn.Linear(W_target.size(0), 1)
        self.init_weights()
    
    def init_weights(self):
        nn.init.normal_(self.fc.weight, 0.0, 1 / np.sqrt(W_target.size(0)))
        nn.init.constant_(self.fc.bias, 0.0)
        
    def forward(self, x):
        N, POLY_DEGREE = x.shape
        z = self.fc(x)
        h = z.clone()  # normally you would apply some activation function here
        return h

    
model = myNN()

Training loop

In [25]:
for batch_idx in count(1):
    # Get data
    batch_x, batch_y = get_batch()

    # Reset gradients
    # This is necessary because otherwise gradients would accumulate, and end results would be skewed
    model.zero_grad()

    # Forward pass
    output = F.smooth_l1_loss(model(batch_x), batch_y)
    loss = output.item()

    # Backward pass using backpropagation of the gradient
    output.backward()
    
    # Optimize weights using SGD
    for param in model.parameters():
        param.data.add_(-0.1 * param.grad.data)

    # Stop criterion
    if loss < 1e-3:
        break

print('Loss: {:.6f} after {} batches'.format(loss, batch_idx))
print('==> Learned function:\t' + poly_desc(model.fc.weight.view(-1), model.fc.bias))
print('==> Actual function:\t' + poly_desc(W_target.view(-1), b_target))

Loss: 0.000881 after 378 batches
==> Learned function:	y = +8.82 x^4 -1.26 x^3 -4.52 x^2 -1.83 x^1 -5.72
==> Actual function:	y = +8.85 x^4 -1.26 x^3 -4.52 x^2 -1.83 x^1 -5.71


Here's a slightly more advanced example of a Convolutional Neural Network for MNIST digit classification. <https://github.com/pytorch/examples/blob/master/mnist/main.py>

### Other Resources

Here are some other useful resources on PyTorch

* https://cs230-stanford.github.io/pytorch-getting-started.html
* https://github.com/jcjohnson/pytorch-examples
* https://pytorch.org/tutorials/beginner/deep_learning_60min_blitz.html