# PyTorch





In [13]:
import torch
from torch import nn
import torch.nn.functional as F
from torchvision import datasets, transforms
from torch import optim

# setting device on GPU if available, else CPU
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
print()

#Additional Info when using cuda
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))
    print('Memory Usage:')
    print('Allocated:', round(torch.cuda.memory_allocated(0)/1024**3,1), 'GB')
    print('Cached:   ', round(torch.cuda.memory_reserved(0)/1024**3,1), 'GB')

Using device: cuda

GeForce RTX 2070 SUPER
Memory Usage:
Allocated: 0.0 GB
Cached:    0.0 GB


## Tensors

A `torch.Tensor` is a multi-dimensional matrix containing elements of a single data type, similar to a NumPy array.


In [4]:
torch.tensor([
    [1,2,3],
    [4,5,6],
    [7,8,9]
], device=device)

tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]], device='cuda:0')

### Constructors

 - `torch.randn(size)` - Returns a tensor of the given size filled with random numbers from a normal distribution with mean 0 and variance 1.
 - `torch.randn_like(input)` - Returns a tensor with the same size as the tensor `input`, filled with random numbers from a normal distribution with mean 0 and variance 1.
 - `torch.zeros(size)` - Returns a tensor of the given size filled with the scalar value 0.
 - `torch.zeros_like(input)` - Returns a tensor filled with the scalar value 0, with the same size as the tensor `input`.
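A minimal sketch of the four constructors listed above. The shapes and the seed are arbitrary choices for illustration, and everything runs on the CPU so it does not depend on a GPU being present.

```python
import torch

torch.manual_seed(0)  # fix the seed so the random draws are repeatable

a = torch.randn(2, 3)         # 2x3 tensor sampled from a standard normal distribution
b = torch.randn_like(a)       # new random tensor with the same shape and dtype as `a`
z = torch.zeros(2, 3)         # 2x3 tensor of zeros
z_like = torch.zeros_like(a)  # zeros with the same shape and dtype as `a`

print(a.shape, b.shape)
print(z)
```

The `*_like` variants are convenient when the new tensor has to match an existing one, because shape, dtype, and device are all taken from the input.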

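### Parameters

`torch.tensor(data, dtype=None, device=None, requires_grad=False)` accepts the following parameters:

- `data` (array_like) – Initial data for the tensor. The returned tensor always copies `data`.
- `dtype` (`torch.dtype`, optional) – the desired type of the returned tensor. Default: if None, the data type is inferred from `data`.
- `device` (`torch.device`, optional) – the desired device of the returned tensor. Default: if None and `data` is a tensor, the device of `data` is used; otherwise the tensor is created on the current default device.
- `requires_grad` (bool, optional) – Whether autograd should record operations on the returned tensor. Default: False.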
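The parameters above can be seen together in a short sketch. The values in `data` are made up for illustration, and the device is fixed to `"cpu"` so the example runs anywhere.

```python
import torch

data = [[1.0, 2.0], [3.0, 4.0]]

# Explicit dtype and device; requires_grad=True asks autograd to track operations on x
x = torch.tensor(data, dtype=torch.float64, device="cpu", requires_grad=True)

# torch.tensor copies its input, so changing the list afterwards does not affect x
data[0][0] = 99.0

# Autograd records this computation and can differentiate it
y = (x ** 2).sum()
y.backward()

print(x)
print(x.grad)  # d/dx sum(x^2) = 2x
```

Note that `requires_grad=True` only works with floating point (or complex) dtypes, which is one reason to set `dtype` explicitly.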

### Data Types

| Data type | dtype |
|---|---|
| 32-bit floating point | `torch.float32` or `torch.float` |
| 64-bit floating point | `torch.float64` or `torch.double` |
| 64-bit complex | `torch.complex64` or `torch.cfloat` |
| 128-bit complex | `torch.complex128` or `torch.cdouble` |
| 16-bit floating point (binary16: 1 sign, 5 exponent, 10 significand bits) | `torch.float16` or `torch.half` |
| 16-bit floating point (bfloat16: 1 sign, 8 exponent, 7 significand bits) | `torch.bfloat16` |
| 8-bit integer (unsigned) | `torch.uint8` |
| 8-bit integer (signed) | `torch.int8` |
| 16-bit integer (signed) | `torch.int16` or `torch.short` |
| 32-bit integer (signed) | `torch.int32` or `torch.int` |
| 64-bit integer (signed) | `torch.int64` or `torch.long` |
| Boolean | `torch.bool` |
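As a rough illustration of the table, the sketch below shows how dtypes are inferred and converted. It assumes the default floating point dtype has not been changed from `torch.float32`.

```python
import torch

ints = torch.tensor([1, 2, 3])     # Python ints are inferred as torch.int64
floats = torch.tensor([1.0, 2.0])  # Python floats use the default dtype (torch.float32)

as_float = ints.to(torch.float32)  # explicit conversion with .to()
as_half = floats.half()            # shorthand for .to(torch.float16)

print(ints.dtype, floats.dtype, as_float.dtype, as_half.dtype)
print(torch.float is torch.float32)  # the short names are aliases of the same dtype
print(as_half.element_size())        # bytes per element
```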

### Tensor Arithmetic

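- `torch.mm(input1, input2)` - Performs a matrix multiplication of the 2-D matrices `input1` and `input2`. This function does not broadcast.
- `torch.sum(input)` - Returns the sum of all elements in the input tensor.
- `torch.matmul(input1, input2)` - Matrix product of two tensors. The behavior depends on the dimensionality of the tensors: two 1-D inputs give a dot product, two 2-D inputs give a matrix product, and higher-dimensional inputs give a batched matrix product with broadcasting.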
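A small worked example of the three operations above, using hand-picked 2x2 matrices so the results are easy to check by hand.

```python
import torch

A = torch.tensor([[1.0, 2.0], [3.0, 4.0]])
B = torch.tensor([[5.0, 6.0], [7.0, 8.0]])

product = torch.mm(A, B)  # strict 2-D matrix multiplication
total = torch.sum(A)      # sum over every element: 1 + 2 + 3 + 4

# matmul adapts to the input dimensions
v = torch.tensor([1.0, 1.0])
matrix_vector = torch.matmul(A, v)  # 2-D @ 1-D gives a 1-D result
batch = torch.ones(10, 2, 2)
batched = torch.matmul(batch, B)    # B is broadcast across the batch of 10

print(product)        # tensor([[19., 22.], [43., 50.]])
print(total)          # tensor(10.)
print(matrix_vector)  # tensor([3., 7.])
print(batched.shape)  # torch.Size([10, 2, 2])
```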

### Tensor Reshape

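- `tensor.reshape(a,b)` - Returns a tensor with the new shape. The result is a view of the original data when possible, otherwise a copy.
- `tensor.resize_(a,b)` - Resizes the same tensor in place. If the new shape has fewer elements, the extra ones are dropped from view; if it has more, the new elements are uninitialized.
- `tensor.view(a,b)` - Returns a new tensor with the new shape that shares data with the original. Raises an error if the shape is not compatible with the tensor's memory layout.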
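The difference between `view` and `reshape` is easiest to see on a non-contiguous tensor. The following is a sketch with a small made-up tensor; a transpose is used here only as a convenient way to obtain a non-contiguous layout.

```python
import torch

t = torch.arange(6)  # tensor([0, 1, 2, 3, 4, 5])

v = t.view(2, 3)     # a view: shares memory with t
v[0, 0] = 100        # so this write is also visible through t

r = t.reshape(3, 2)  # same data, new shape (a view here, since t is contiguous)

tt = v.t()           # the transpose is not contiguous in memory
try:
    tt.view(6)       # view cannot express this layout change
    view_failed = False
except RuntimeError:
    view_failed = True

flat = tt.reshape(6)  # reshape falls back to making a copy

print(t[0].item(), view_failed, flat)
```

In general, `reshape` is the safer default, while `view` is useful when a guarantee that no copy is made is wanted.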

## Datasets

The `torchvision` package provides several ready-made datasets, such as MNIST, that can be downloaded on first use.

See [torchvision.datasets](https://pytorch.org/docs/stable/torchvision/datasets.html)

In [11]:
# define transform to normalize data
data_transform = transforms.Compose([
                    transforms.ToTensor(), 
                    transforms.Normalize((0.5,), (0.5,))
                ])

trainset = datasets.MNIST('~/.pytorch/MNIST_data/', download=True, train=True, transform=data_transform)
trainloader = torch.utils.data.DataLoader(trainset, batch_size=64, shuffle=True)

# Get our data
images, labels = next(iter(trainloader))

labels

tensor([9, 0, 9, 7, 7, 8, 3, 3, 5, 4, 9, 6, 3, 8, 2, 9, 0, 6, 9, 1, 6, 0, 6, 2,
        5, 5, 3, 8, 4, 1, 8, 8, 9, 1, 3, 3, 4, 7, 9, 0, 8, 3, 1, 3, 5, 6, 2, 2,
        7, 0, 6, 7, 9, 1, 4, 9, 8, 1, 7, 3, 9, 7, 5, 7])
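To make the normalization and batching above concrete without downloading MNIST, here is a sketch on synthetic data. `fake_images` and `fake_labels` are hypothetical stand-ins, and the sample count of 100 is arbitrary. `ToTensor()` scales image pixels to [0, 1], and `Normalize((0.5,), (0.5,))` then computes `(x - 0.5) / 0.5`, which maps that range to [-1, 1].

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

torch.manual_seed(0)

# Hypothetical stand-in for MNIST: 100 random "images" with values in [0, 1)
fake_images = torch.rand(100, 1, 28, 28)
fake_labels = torch.randint(0, 10, (100,))

# Same arithmetic as Normalize((0.5,), (0.5,)): (x - mean) / std
normalized = (fake_images - 0.5) / 0.5

# The DataLoader splits the dataset into shuffled batches of at most 64 samples
loader = DataLoader(TensorDataset(normalized, fake_labels), batch_size=64, shuffle=True)
batch_shapes = [tuple(images.shape) for images, _ in loader]

print(normalized.min().item(), normalized.max().item())
print(len(loader), batch_shapes)
```

`len(loader)` is the number of batches, not the number of samples, and the last batch is smaller when the dataset size is not a multiple of `batch_size`.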

## Neural Network



In [15]:
# Define feed-forward network
model = nn.Sequential(
            nn.Linear(784,128), # hidden layer 1
            nn.ReLU(),          # ReLU activation function
            nn.Linear(128,64),  # hidden layer 2
            nn.ReLU(),          # ReLU activation function
            nn.Linear(64,10)    # output layer
          )

# Define loss function
criterion = nn.CrossEntropyLoss()

# Define optimizer
optimizer = optim.SGD(model.parameters(), lr=0.01)

epochs = 5  # five epochs, matching the five loss values in the output below
for e in range(epochs):
    running_loss = 0
    for images, labels in trainloader:
        # Flatten MNIST images into a 784 long vector
        images = images.view(images.shape[0], -1)
    
        optimizer.zero_grad()

        # Training pass
        logits = model(images)  # call the module directly rather than .forward() so hooks run
        loss = criterion(logits, labels)
        loss.backward()
        optimizer.step()
        
        running_loss += loss.item()
    # Report the mean loss per batch for this epoch
    print(f"Training loss: {running_loss/len(trainloader)}")

Training loss: 1.013490302110913
Training loss: 0.38199806323787294
Training loss: 0.3228171606069562
Training loss: 0.2900727862345257
Training loss: 0.26480596926388966
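After training, the logits can be turned into class predictions. The sketch below is an assumption about how one might do this, not part of the original notebook: it rebuilds the same architecture with untrained weights and feeds it a random batch (`images` and `labels` here are synthetic), so the accuracy it prints is meaningless and only the mechanics matter.

```python
import torch
from torch import nn
import torch.nn.functional as F

torch.manual_seed(0)

# Same architecture as above, but untrained (illustration only)
model = nn.Sequential(
    nn.Linear(784, 128), nn.ReLU(),
    nn.Linear(128, 64), nn.ReLU(),
    nn.Linear(64, 10),
)

# Synthetic batch standing in for real MNIST images and labels
images = torch.randn(64, 1, 28, 28)
labels = torch.randint(0, 10, (64,))

model.eval()           # switch layers such as dropout to inference behavior
with torch.no_grad():  # no gradients are needed for evaluation
    logits = model(images.view(images.shape[0], -1))
    probs = F.softmax(logits, dim=1)  # convert logits to class probabilities
    preds = probs.argmax(dim=1)       # most likely class per image

accuracy = (preds == labels).float().mean().item()
print(f"Accuracy on this batch: {accuracy:.3f}")
```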


### Linear Transformation

- `nn.Linear(in_features, out_features, bias=True)` - Applies a linear transformation to the incoming data: $y = xA^T + b$
  - `in_features` – size of each input sample
  - `out_features` – size of each output sample
  - `bias` – If set to False, the layer will not learn an additive bias. Default: True
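The formula $y = xA^T + b$ can be checked directly against the layer's own parameters. This is a small sketch with arbitrary sizes (4 inputs, 3 outputs, batch of 5); `layer.weight` plays the role of $A$ and `layer.bias` the role of $b$.

```python
import torch
from torch import nn

torch.manual_seed(0)

layer = nn.Linear(in_features=4, out_features=3)
x = torch.randn(5, 4)  # batch of 5 samples with 4 features each

y = layer(x)                             # output has shape (5, 3)
manual = x @ layer.weight.T + layer.bias # weight has shape (out_features, in_features)

print(layer.weight.shape, layer.bias.shape)
print(torch.allclose(y, manual, atol=1e-6))
```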

### Activation Functions

- `nn.ReLU()` - Applies the rectified linear unit function element-wise: $\text{ReLU}(x) = (x)^+ = \max(0, x)$
- `nn.Sigmoid()` - Applies the sigmoid function element-wise: $\text{Sigmoid}(x) = \sigma(x) = \frac{1}{1 + e^{-x}}$
- `nn.Softmax(dim)` - Applies the softmax function along dimension `dim` of an n-dimensional input tensor, rescaling the elements so that they lie in the range [0, 1] and sum to 1 along that dimension: $\text{Softmax}(x_i) = \frac{e^{x_i}}{\sum_j e^{x_j}}$
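Each of these modules can be compared against its formula written out by hand. The input values below are arbitrary.

```python
import torch
from torch import nn

x = torch.tensor([[-1.0, 0.0, 2.0]])

relu_out = nn.ReLU()(x)
sigmoid_out = nn.Sigmoid()(x)
softmax_out = nn.Softmax(dim=1)(x)  # normalize across the 3 entries of each row

# The same functions written out from their definitions
manual_relu = torch.clamp(x, min=0)
manual_sigmoid = 1 / (1 + torch.exp(-x))
manual_softmax = torch.exp(x) / torch.exp(x).sum(dim=1, keepdim=True)

print(relu_out)             # tensor([[0., 0., 2.]])
print(sigmoid_out)
print(softmax_out, softmax_out.sum(dim=1))
```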

### Loss Functions

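- `nn.NLLLoss()` - The negative log likelihood loss. It is useful for training a classification problem with C classes, and expects log-probabilities as input (for example, the output of `nn.LogSoftmax()`).
- `nn.CrossEntropyLoss()` - This criterion combines `nn.LogSoftmax()` and `nn.NLLLoss()` in one single class, so it expects raw, unnormalized logits as input.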
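The relationship between the two criteria can be verified numerically. The logits and labels in this sketch are random placeholders.

```python
import torch
from torch import nn

torch.manual_seed(0)

logits = torch.randn(4, 10)           # raw scores for 4 samples and 10 classes
labels = torch.tensor([1, 0, 4, 9])   # target class index for each sample

# CrossEntropyLoss works directly on raw logits
ce = nn.CrossEntropyLoss()(logits, labels)

# NLLLoss needs log-probabilities, so apply LogSoftmax first
log_probs = nn.LogSoftmax(dim=1)(logits)
nll = nn.NLLLoss()(log_probs, labels)

print(ce.item(), nll.item())
print(torch.allclose(ce, nll, atol=1e-6))
```

This is why the network above ends with a plain `nn.Linear` layer: adding a softmax before `nn.CrossEntropyLoss()` would apply the normalization twice.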