## PyTorch Tutorial

This Jupyter Notebook covers the PyTorch functions that you will find most useful in Homework 7.

In [1]:
import torch
import numpy as np
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

### Network Layers
The following are neural network layers. In Homework 7, we will be using `nn.Linear`, `nn.Conv2d`, and `nn.MaxPool2d`.

In [2]:
# The module torch.nn contains network layers

# creates a linear layer with 128 input dimensions and 64 output dimensions. 
in_dim, out_dim = 128, 64
linear = nn.Linear(in_dim, out_dim)

# creates a 2d convolutional layer with the following params
in_channels = 4
out_channels = 12
kernel_size = 5
stride = 2
dilation = 2
conv2d = nn.Conv2d(in_channels, out_channels, kernel_size, stride=stride, dilation=dilation)

# creates a max pool layer
kernel_size = 5
max_pool  = nn.MaxPool2d(kernel_size=kernel_size)

Convolutional layers can get fairly complex. The following guide illustrates how parameters such as stride, padding, and dilation affect the output:

https://github.com/vdumoulin/conv_arithmetic/blob/master/README.md
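
As a rough check on how these parameters interact, the sketch below passes an input through the `conv2d` and `max_pool` layers defined earlier and compares the result with the output-size formula from the `nn.Conv2d` documentation. The 32x32 input size and the helper name `conv_out_size` are assumptions made for illustration.

```python
import torch
import torch.nn as nn

def conv_out_size(size, kernel_size, stride=1, padding=0, dilation=1):
    # Hypothetical helper: output length along one spatial axis,
    # following the shape formula in the nn.Conv2d documentation
    return (size + 2 * padding - dilation * (kernel_size - 1) - 1) // stride + 1

conv2d = nn.Conv2d(4, 12, kernel_size=5, stride=2, dilation=2)
max_pool = nn.MaxPool2d(kernel_size=5)  # stride defaults to kernel_size

# Assumed input: 1 sample, 4 channels, 32x32 pixels
images = torch.rand(1, 4, 32, 32)
conv_out = conv2d(images)
pool_out = max_pool(conv_out)

print(conv_out.shape)   # torch.Size([1, 12, 12, 12])
print(pool_out.shape)   # torch.Size([1, 12, 2, 2])
print(conv_out_size(32, 5, stride=2, dilation=2))  # 12
```

Note that `nn.Conv2d` and `nn.MaxPool2d` expect input shaped (batch, channels, height, width).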

In [3]:
# You can chain layers together using nn.Sequential
seq_layer = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 2),
    nn.Sigmoid()
)
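
To make the chaining concrete, the sketch below checks that `nn.Sequential` gives the same result as applying each layer in order by hand. The seed and the variable names `manual` and `same` are illustrative choices.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)
seq_layer = nn.Sequential(
    nn.Linear(128, 64),
    nn.ReLU(),
    nn.Linear(64, 32),
    nn.ReLU(),
    nn.Linear(32, 2),
    nn.Sigmoid()
)

x = torch.rand(30, 128)

# nn.Sequential is iterable, so the layers can be applied one at a time
manual = x
for layer in seq_layer:
    manual = layer(manual)

same = torch.allclose(manual, seq_layer(x))
print(same)          # True
print(seq_layer[0])  # individual layers can also be accessed by index
```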

### Activation Functions

Activation functions are also provided by the `torch.nn` module.

In [4]:
# Rectified Linear Unit (ReLU)
relu = nn.ReLU()

# Sigmoid
sig = nn.Sigmoid()
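
A small sketch of what these two activations compute on a hand-picked input. The input values are chosen only for illustration; `F.relu` and `torch.sigmoid` are the functional counterparts of the module versions.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

x = torch.tensor([-2.0, 0.0, 3.0])

# ReLU replaces negative values with zero
relu_out = nn.ReLU()(x)
print(relu_out)  # tensor([0., 0., 3.])

# Sigmoid squashes every value into (0, 1); sigmoid(0) is 0.5
sig_out = nn.Sigmoid()(x)
print(sig_out)

# The functional forms compute the same values without creating a module
print(torch.equal(F.relu(x), relu_out))        # True
print(torch.equal(torch.sigmoid(x), sig_out))  # True
```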

### Using Layers and Activation Functions

In [5]:
# After initializing a layer or activation function, pass input to it by calling it like a function.
# This works for every subclass of nn.Module: calling the module runs its forward() method.

# Create 30 samples with dimension 128
x = torch.rand(30, 128)
out = linear(x)
out = relu(out)
out.shape

torch.Size([30, 64])

In [6]:
seq_out = seq_layer(x)
seq_out.shape

torch.Size([30, 2])

### Loss function
We will only be using the cross-entropy loss function for this homework.

In [7]:
pred = out
targets = torch.randint(0, 64, (30,))

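# For cross entropy, pred holds unnormalized scores (logits) with shape (num_samples, num_classes).
# The loss applies log-softmax internally, so pred should not be one-hot encoded or softmax-normalized.
# targets[i] is the integer class index (0 to num_classes - 1) for row i of pred.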

# You can run cross entropy in two ways
loss_func = nn.CrossEntropyLoss()
loss = loss_func(pred, targets)
print(loss)
# or
loss = F.cross_entropy(pred, targets)
print(loss)

tensor(4.2204, grad_fn=<NllLossBackward>)
tensor(4.2204, grad_fn=<NllLossBackward>)
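
To show what the loss computes, the sketch below reproduces `F.cross_entropy` as log-softmax followed by the negative log-likelihood of the target class. The random logits and the names `via_nll` and `by_hand` are assumptions for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.rand(30, 64)              # 30 samples, 64 classes
targets = torch.randint(0, 64, (30,))    # one class index per sample

loss = F.cross_entropy(logits, targets)

# Equivalent two-step form: log-softmax over classes, then NLL loss
log_probs = F.log_softmax(logits, dim=1)
via_nll = F.nll_loss(log_probs, targets)

# By hand: pick the log-probability of each target class and average
by_hand = -log_probs[torch.arange(30), targets].mean()

print(torch.allclose(loss, via_nll))  # True
print(torch.allclose(loss, by_hand))  # True
```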


### Optimizer

The optimizer updates the parameters of a network using their gradients.

In [8]:
# Initialize SGD Optimizer to update the parameters of a linear layer with learning rate of .001
optimizer = optim.SGD(linear.parameters(), lr=.001)

# Compute the gradients of the loss with respect to the parameters
loss.backward()
# Update the parameters using the computed gradients
optimizer.step()
# Reset the gradients to zero so they do not accumulate into the next backward pass
optimizer.zero_grad()
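
The three calls above are usually repeated in a loop. The following is a minimal training-loop sketch, not the homework solution: the model size, random data, learning rate, and number of steps are all assumed for illustration. Here `zero_grad()` is called at the start of each step, which is equivalent as long as the gradients are cleared before every `backward()`.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F
import torch.optim as optim

torch.manual_seed(0)
model = nn.Linear(20, 3)            # assumed: 20 features, 3 classes
x = torch.rand(30, 20)              # assumed random training data
y = torch.randint(0, 3, (30,))

optimizer = optim.SGD(model.parameters(), lr=0.05)

losses = []
for step in range(100):
    optimizer.zero_grad()                 # clear gradients from the previous step
    loss = F.cross_entropy(model(x), y)   # forward pass and loss
    loss.backward()                       # compute gradients
    optimizer.step()                      # update parameters
    losses.append(loss.item())

print(losses[0], losses[-1])  # the loss should go down over the steps
```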

### Other useful functions

In [9]:
# torch.Tensor.numpy
# Convert a tensor to a NumPy array
np_x = x.numpy()

# Tensors that require gradients, such as network parameters or network outputs,
# must be detached from the computation graph with .detach() first
np_out = out.detach().numpy()

# Convert a NumPy array to a torch tensor
ones = np.ones(100)
ones = torch.tensor(ones)
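
Two details of the NumPy conversion are easy to miss, so here is a short sketch: `torch.tensor` copies the data, while `torch.from_numpy` shares memory with the array, and NumPy's default `float64` dtype carries over. The variable names are illustrative.

```python
import numpy as np
import torch

arr = np.ones(3)

copied = torch.tensor(arr)       # makes an independent copy
shared = torch.from_numpy(arr)   # shares memory with arr

arr[0] = 5.0
print(copied[0].item())  # 1.0, the copy is unaffected
print(shared[0].item())  # 5.0, the shared tensor sees the change

# np.ones produces float64, while layers such as nn.Linear use float32 weights,
# so a cast is often needed before passing the tensor to a network
print(copied.dtype)      # torch.float64
as_float = torch.tensor(arr, dtype=torch.float32)
print(as_float.dtype)    # torch.float32
```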

In [10]:
# torch.rand
# random floats in [0, 1) with shape (a, b, c)
a = 5
b = 10
c = 30
r = torch.rand(a, b, c)

# torch.randint
# random integers from 0 to n-1 with shape (a, b, c)
n = 100
ri = torch.randint(n, (a, b, c))

In [11]:
# torch.Tensor.flatten
# Flatten an n-D tensor into a 1-D tensor
r = torch.rand(100, 100)
print(r.shape)
r = r.flatten()
print(r.shape)

torch.Size([100, 100])
torch.Size([10000])
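
When feeding convolutional output into a linear layer, it is common to flatten everything except the batch dimension. A sketch, assuming a batch of 30 feature maps with shape (12, 2, 2):

```python
import torch
import torch.nn as nn

feature_maps = torch.rand(30, 12, 2, 2)  # assumed (batch, channels, height, width)

# start_dim=1 keeps the batch dimension and flattens the rest
flat = feature_maps.flatten(start_dim=1)
print(flat.shape)  # torch.Size([30, 48])

# nn.Flatten does the same by default and can be placed inside nn.Sequential
also_flat = nn.Flatten()(feature_maps)
print(also_flat.shape)  # torch.Size([30, 48])
```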


In [14]:
# torch.Tensor.permute
# Rearrange the axes of a tensor
# For example, this transforms a tensor with shape (4, 2, 1) to (1, 4, 2)
x = torch.zeros(4, 2, 1)
print(x.shape)
permute_x = x.permute(2, 0, 1)
print(permute_x.shape)

torch.Size([4, 2, 1])
torch.Size([1, 4, 2])
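
One typical use of `permute` is converting an image from (height, width, channels) to the (channels, height, width) layout that `nn.Conv2d` expects. The sketch below assumes a 32x32 RGB image and also shows that `permute` returns a view of the same data rather than a copy.

```python
import torch

hwc = torch.zeros(32, 32, 3)   # assumed image layout: (height, width, channels)
chw = hwc.permute(2, 0, 1)     # now (channels, height, width)
print(chw.shape)               # torch.Size([3, 32, 32])

# permute returns a view: writing through it changes the original tensor
chw[0, 0, 0] = 1.0
print(hwc[0, 0, 0].item())     # 1.0

# The view is not contiguous in memory; .contiguous() makes a contiguous copy,
# which is required before calling .view() on it
print(chw.is_contiguous())     # False
chw_copy = chw.contiguous()
print(chw_copy.is_contiguous())  # True
```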
