# Machine Learning Notes - PyTorch

## A List of Related Posts
1. [PyTorch (this post)]({% post_url 2021-05-04-machine-learning-pytorch %})
2. [Loss function]({% post_url 2021-05-07-machine-learning-loss %})
3. [Backpropagation]({% post_url 2021-05-07-machine-learning-backpropagation %})

## PyTorch Introduction
[PyTorch](https://pytorch.org/) is an open source machine learning framework.
You can find more information about PyTorch by following one of the [official tutorials](https://pytorch.org/tutorials/) or by [reading the documentation](https://pytorch.org/docs/stable/).

## Import PyTorch
<code>
    torch.cuda.is_available()
</code>

In [8]:
# Import pytorch and check its version

import torch
import numpy as np
print(torch.__version__)
print(f'Is cuda available? {torch.cuda.is_available()}')
#

1.8.1
Is cuda available? False
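
The availability check is usually used to pick a device once and reuse it. The following is a minimal sketch of that pattern; the variable names `device` and `x` are illustrative and not part of the original notes.

```python
import torch

# Pick the GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the chosen device.
x = torch.ones(2, 2, device=device)
print(x.device.type)  # "cuda" or "cpu", depending on the machine
```

On the machine used for the output above, this would select the CPU because CUDA is not available.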


## PyTorch Tensor

### Tensor Initialization
<code>
    torch.tensor(),
    torch.from_numpy(),
    torch.zeros_like(),
    torch.ones_like(),
    torch.rand(),
    torch.ones(),
    torch.zeros(),
    torch.eye(),
    torch.full(),
</code>

In [28]:
# Tensor Initialization

# Directly from data
data = [[1, 2],[3, 4]]
x_data = torch.tensor(data)
print(f"Direct Tensor: \n {x_data} \n")

# From a NumPy array
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
print(f"Numpy Tensor: \n {x_np} \n")

# From another tensor:
x_zeros = torch.zeros_like(x_data)
print(f"Zeros Tensor: \n {x_zeros} \n")

x_ones = torch.ones_like(x_data) 
print(f"Ones Tensor: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) 
print(f"Random Tensor: \n {x_rand} \n")

shape = (2,3,)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)
full_tensor = torch.full(shape, 2)
print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}\n")
print(f"Full twos Tensor: \n {full_tensor}\n")
#

Direct Tensor: 
 tensor([[1, 2],
        [3, 4]]) 

Numpy Tensor: 
 tensor([[1, 2],
        [3, 4]]) 

Zeros Tensor: 
 tensor([[0, 0],
        [0, 0]]) 

Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Random Tensor: 
 tensor([[0.0389, 0.8098],
        [0.6647, 0.5246]]) 

Random Tensor: 
 tensor([[0.7586, 0.7846, 0.7909],
        [0.0545, 0.1561, 0.5322]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]])

Full twos Tensor: 
 tensor([[2, 2, 2],
        [2, 2, 2]])
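
The list above mentions `torch.eye()`, which the cell does not exercise, and `torch.from_numpy()` differs from `torch.tensor()` in one way that is easy to miss: it shares memory with the source array. A small sketch of both points (the variable names are illustrative):

```python
import numpy as np
import torch

# Identity matrix: ones on the diagonal, zeros elsewhere.
eye_tensor = torch.eye(3)
print(eye_tensor)

# torch.from_numpy shares memory with the source array,
# so an in-place change to the array is visible through the tensor.
np_array = np.array([[1, 2], [3, 4]])
shared = torch.from_numpy(np_array)
np_array[0, 0] = 100
print(shared[0, 0].item())  # 100

# torch.tensor copies the data, so later changes to the array do not reach it.
copied = torch.tensor(np_array)
np_array[0, 0] = -1
print(copied[0, 0].item())  # 100
print(shared[0, 0].item())  # -1
```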



### Tensor Attributes
<code>
    tensor.dim(),
    tensor.shape,
    tensor.dtype,
    tensor.device,
</code>

In [27]:
# Tensor Attributes
tensor = torch.rand(3,4)

print(f"Dimension of tensor: {tensor.dim()}\n")
print(f"Shape of tensor: {tensor.shape}\n")
print(f"Datatype of tensor: {tensor.dtype}\n")
print(f"Device tensor is stored on: {tensor.device}\n")

#

Dimension of tensor: 2

Shape of tensor: torch.Size([3, 4])

Datatype of tensor: torch.float32

Device tensor is stored on: cpu



### Tensor Operations
<code>
    torch.cat(),
</code>

In [23]:
# Tensor Operations

# Standard numpy-like indexing and slicing
tensor = torch.ones(4, 4)
tensor[:,1] = 0
tensor[2,2] = 2
tensor[3,3] = 3
print(f'{tensor}\n')

# Joining tensors
t_h = torch.cat([tensor, tensor], dim=1)
print(f'horizontal cat:\n {t_h}\n')

t_v = torch.cat([tensor, tensor], dim=0)
print(f'vertical cat:\n {t_v}\n')
# 

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 2., 1.],
        [1., 0., 1., 3.]])

horizontal cat:
 tensor([[1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 2., 1., 1., 0., 2., 1.],
        [1., 0., 1., 3., 1., 0., 1., 3.]])

vertical cat:
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 2., 1.],
        [1., 0., 1., 3.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 2., 1.],
        [1., 0., 1., 3.]])



**Note**:
- With `torch.cat(..., dim=x)`, the output size along dimension `x` is the sum of the inputs' sizes along `x`. All other dimensions must match and stay unchanged.
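
A short sketch of the note, using inputs of different sizes along the concatenated dimension. The comparison with `torch.stack` is an addition for contrast and is not covered in the original notes.

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(4, 3)

# Concatenate along dim=0: sizes along dim 0 add up (2 + 4); dim 1 must match.
c = torch.cat([a, b], dim=0)
print(c.shape)  # torch.Size([6, 3])

# torch.stack inserts a new dimension instead, so all inputs need the same shape.
s = torch.stack([a, a], dim=0)
print(s.shape)  # torch.Size([2, 2, 3])
```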

### Tensor Data Type
<code>
    tensor.to(),
    tensor.new_zeros(),
    tensor.float(),
    tensor.double(),
</code>

In [26]:
# Tensor Data Type

# Let torch choose the datatype
x0 = torch.tensor([1, 2])   # List of integers
x1 = torch.tensor([1., 2.]) # List of floats
x2 = torch.tensor([1., 2])  # Mixed list
print('dtype when torch chooses for us:')
print('List of integers:', x0.dtype)
print('List of floats:', x1.dtype)
print('Mixed list:', x2.dtype)

# Force a particular datatype
y0 = torch.tensor([1, 2], dtype=torch.float32)  # 32-bit float
y1 = torch.tensor([1, 2], dtype=torch.int32)    # 32-bit (signed) integer
y2 = torch.tensor([1, 2], dtype=torch.int64)    # 64-bit (signed) integer
print('\ndtype when we force a datatype:')
print('32-bit float: ', y0.dtype)
print('32-bit integer: ', y1.dtype)
print('64-bit integer: ', y2.dtype)

# Other creation ops also take a dtype argument
z0 = torch.ones(1, 2)  # Let torch choose for us
z1 = torch.ones(1, 2, dtype=torch.int16) # 16-bit (signed) integer
z2 = torch.ones(1, 2, dtype=torch.uint8) # 8-bit (unsigned) integer
print('\ntorch.ones with different dtypes')
print('default dtype:', z0.dtype)
print('16-bit integer:', z1.dtype)
print('8-bit unsigned integer:', z2.dtype)

x0 = torch.eye(3, dtype=torch.int64)
x1 = x0.float()  # Cast to 32-bit float
x2 = x0.double() # Cast to 64-bit float
x3 = x0.to(torch.float32) # Alternate way to cast to 32-bit float
x4 = x0.to(torch.float64) # Alternate way to cast to 64-bit float
print('\nx0:', x0.dtype)
print('x1:', x1.dtype)
print('x2:', x2.dtype)
print('x3:', x3.dtype)
print('x4:', x4.dtype)

x0 = torch.eye(3, dtype=torch.float64)  # Shape (3, 3), dtype torch.float64
x1 = torch.zeros_like(x0)               # Shape (3, 3), dtype torch.float64
x2 = x0.new_zeros(4, 5)                 # Shape (4, 5), dtype torch.float64
x3 = torch.ones(6, 7).to(x0)            # Shape (6, 7), dtype torch.float64
print('\nx0 shape is %r, dtype is %r' % (x0.shape, x0.dtype))
print('x1 shape is %r, dtype is %r' % (x1.shape, x1.dtype))
print('x2 shape is %r, dtype is %r' % (x2.shape, x2.dtype))
print('x3 shape is %r, dtype is %r' % (x3.shape, x3.dtype))
#

dtype when torch chooses for us:
List of integers: torch.int64
List of floats: torch.float32
Mixed list: torch.float32

dtype when we force a datatype:
32-bit float:  torch.float32
32-bit integer:  torch.int32
64-bit integer:  torch.int64

torch.ones with different dtypes
default dtype: torch.float32
16-bit integer: torch.int16
8-bit unsigned integer: torch.uint8

x0: torch.int64
x1: torch.float32
x2: torch.float64
x3: torch.float32
x4: torch.float64

x0 shape is torch.Size([3, 3]), dtype is torch.float64
x1 shape is torch.Size([3, 3]), dtype is torch.float64
x2 shape is torch.Size([4, 5]), dtype is torch.float64
x3 shape is torch.Size([6, 7]), dtype is torch.float64
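
The cell above sets dtypes explicitly. When tensors of different dtypes meet in one operation, PyTorch promotes the result instead. The sketch below illustrates this under the assumption of a PyTorch version with type promotion (which includes the 1.8.1 used here); the variable names are illustrative.

```python
import torch

i = torch.tensor([1, 2])        # torch.int64
f = torch.tensor([0.5, 1.5])    # torch.float32

# Mixing an integer tensor with a float tensor promotes the result to float.
s = i + f
print(s.dtype)                  # torch.float32

# torch.result_type reports the promoted dtype without computing the sum.
print(torch.result_type(i, f))  # torch.float32

# A float64 tensor operand promotes the result further.
d = f + torch.tensor([1.0, 1.0], dtype=torch.float64)
print(d.dtype)                  # torch.float64
```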


### Tensor Indexing
<code>
    tensor[start:stop:step]
</code>

In [36]:
# Tensor slicing
x = torch.tensor([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print('Original tensor:')
print(x)
print('shape: ', x.shape)

# Get row 1, and all columns. 
print('\nSingle row:')
print(x[1, :].shape, x[1, :])
print('shape: ', x[1, :].shape, x[1].shape)  

print('\nSingle column:')
print(x[:, 1])
print('shape: ', x[:, 1].shape)

# Get the first two rows and the last three columns
print('\nFirst two rows, last three columns:')
print(x[:2, -3:])
print('shape: ', x[:2, -3:].shape)

# Get every other row, and columns at index 1 and 2
print('\nEvery other row, middle columns:')
print(x[::2, 1:3])
print('shape: ', x[::2, 1:3].shape)

#

Original tensor:
tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])
shape:  torch.Size([3, 4])

Single row:
torch.Size([4]) tensor([5, 6, 7, 8])
shape:  torch.Size([4]) torch.Size([4])

Single column:
tensor([ 2,  6, 10])
shape:  torch.Size([3])

First two rows, last three columns:
tensor([[2, 3, 4],
        [6, 7, 8]])
shape:  torch.Size([2, 3])

Every other row, middle columns:
tensor([[ 2,  3],
        [10, 11]])
shape:  torch.Size([2, 2])



More generally, given index arrays `idx0` and `idx1` with `N` elements each, `a[idx0, idx1]` is equivalent to:

```
torch.tensor([
  a[idx0[0], idx1[0]],
  a[idx0[1], idx1[1]],
  ...,
  a[idx0[N - 1], idx1[N - 1]]
])
```

(A similar pattern extends to tensors with more than two dimensions)
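
The equivalence can be checked directly. This is a small sketch with concrete index arrays; the values of `idx0` and `idx1` are chosen for illustration.

```python
import torch

a = torch.tensor([[1, 2, 3, 4], [5, 6, 7, 8], [9, 10, 11, 12]])
idx0 = torch.tensor([0, 1, 0])  # row indices
idx1 = torch.tensor([3, 2, 1])  # column indices

# Index-array form: picks a[0, 3], a[1, 2], a[0, 1].
picked = a[idx0, idx1]

# Explicit element-by-element form from the text above.
explicit = torch.stack([a[i, j] for i, j in zip(idx0.tolist(), idx1.tolist())])

print(picked)                          # tensor([4, 7, 2])
print(torch.equal(picked, explicit))   # True
```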

In [40]:
# Integer Index

x = torch.tensor([[1,2,3,4], [5,6,7,8], [9,10,11,12]])
print('Original tensor:')
print(x)
print('shape: ', x.shape)

idx = torch.tensor([3, 2, 1, 0])  # Index arrays can be int64 torch tensors
print('\nReordered columns:')
print(x[:, idx])


a = torch.tensor([0, 1, 0])  # Index arrays can be int64 torch tensors
b = torch.tensor([3, 2, 1])  # Index arrays can be int64 torch tensors
print('\nReordered rows/columns:')
print(x[a, b])

#

Original tensor:
tensor([[ 1,  2,  3,  4],
        [ 5,  6,  7,  8],
        [ 9, 10, 11, 12]])
shape:  torch.Size([3, 4])

Reordered columns:
tensor([[ 4,  3,  2,  1],
        [ 8,  7,  6,  5],
        [12, 11, 10,  9]])

Reordered rows/columns:
tensor([4, 7, 2])


In [41]:
# Boolean indexing
x = torch.tensor([[1,2], [3, 4], [5, 6]])
print('Original tensor:')
print(x)

mask = (x > 3)
print('\nMask tensor:')
print(mask)

# We can use the mask to construct a rank-1 tensor containing the elements of x
# that are selected by the mask
print('\nSelecting elements with the mask:')
print(x[mask])

# We can also use boolean masks to modify tensors; for example this sets all
# elements <= 3 to zero:
x[x <= 3] = 0
print('\nAfter modifying with a mask:')
print(x)
#

Original tensor:
tensor([[1, 2],
        [3, 4],
        [5, 6]])

Mask tensor:
tensor([[False, False],
        [False,  True],
        [ True,  True]])

Selecting elements with the mask:
tensor([4, 5, 6])

After modifying with a mask:
tensor([[0, 0],
        [0, 4],
        [5, 6]])
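
Assigning through a mask modifies `x` in place. When the original tensor should stay untouched, `torch.where` or `Tensor.masked_fill` produce a new tensor instead. This is a sketch of that alternative, not something the original notes cover.

```python
import torch

x = torch.tensor([[1, 2], [3, 4], [5, 6]])

# Keep elements where the condition holds and use 0 elsewhere; x is not modified.
y = torch.where(x > 3, x, torch.zeros_like(x))
print(y)

# masked_fill writes the fill value where the mask is True, also out of place.
z = x.masked_fill(x <= 3, 0)
print(torch.equal(y, z))  # True

print(x)  # still the original values
```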


### Tensor reshape
<code>
    tensor.view(), tensor.reshape(), tensor.transpose()
</code>
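
The notes list these three functions without an example. The sketch below is one way to see how they relate, assuming a small 1-D tensor as the starting point; all variable names are illustrative.

```python
import torch

x = torch.arange(6)        # tensor([0, 1, 2, 3, 4, 5])

# view returns a tensor with a new shape that shares storage with x.
v = x.view(2, 3)
v[0, 0] = 100
print(x[0].item())         # 100, because v and x share memory

# transpose swaps two dimensions; the result here is not contiguous.
t = v.transpose(0, 1)
print(t.shape)             # torch.Size([3, 2])
print(t.is_contiguous())   # False

# reshape also works on non-contiguous tensors and copies only when needed.
# Calling t.view(6) here would raise a RuntimeError instead.
r = t.reshape(6)
print(r)                   # tensor([100, 3, 1, 4, 2, 5])
```

A common rule of thumb is to use `view` when sharing memory is required and `reshape` when only the resulting shape matters.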

## PyTorch Autograd
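
This section is empty in the original notes. As a placeholder, here is a minimal sketch of what autograd does: it records operations on tensors created with `requires_grad=True` and fills in `.grad` when `backward()` is called. The function being differentiated is chosen for illustration.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# y = sum(x_i^2), so dy/dx_i = 2 * x_i
y = (x ** 2).sum()
y.backward()
print(x.grad)            # tensor([2., 4., 6.])

# Inside torch.no_grad(), operations are not recorded for differentiation.
with torch.no_grad():
    z = x * 2
print(z.requires_grad)   # False
```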

## PyTorch Loss Functions
<code>
   torch.nn.CrossEntropyLoss(),torch.nn.MultiLabelMarginLoss()
</code>

### CrossEntropyLoss()
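- Input: (N, C), where N is the batch size and C is the number of classes
- Target: (N,), with $0 \leq target[i] \leq C-1$
- Output: scalar when `reduction` is `'mean'` (the default) or `'sum'`; (N,) when `reduction` is `'none'`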

In [49]:
# torch.nn.CrossEntropyLoss(weight=None, size_average=None, ignore_index=-100, reduce=None, reduction='mean')

import torch
loss = torch.nn.CrossEntropyLoss(reduction='mean')
input = torch.randn(3, 5, requires_grad=True)
target = torch.empty(3, dtype=torch.long).random_(5)
output = loss(input, target)
output.backward()

# Image pixel-level classification for 5 classes
width = 10
height = 10
loss = torch.nn.CrossEntropyLoss(reduction='mean')
input = torch.randn(3, 5, width, height, requires_grad=True)
target = torch.empty(3, width, height, dtype=torch.long).random_(5)
output = loss(input, target)
output.backward()

#
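
To make the shapes and the computation concrete, the sketch below reproduces the loss by hand as the negative log-softmax of the target class. The fixed seed, the target values, and the variable names are assumptions made for illustration.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)
logits = torch.randn(3, 5)           # (N, C) = (3, 5)
target = torch.tensor([1, 0, 4])     # (N,), each entry in [0, C-1]

# reduction='none' keeps one loss value per sample.
per_sample = torch.nn.CrossEntropyLoss(reduction='none')(logits, target)
print(per_sample.shape)              # torch.Size([3])

# reduction='mean' averages them into a scalar.
mean_loss = torch.nn.CrossEntropyLoss(reduction='mean')(logits, target)
print(mean_loss.shape)               # torch.Size([])

# Manual version: negative log-probability of the target class.
log_probs = F.log_softmax(logits, dim=1)
manual = -log_probs[torch.arange(3), target]
print(torch.allclose(manual, per_sample))             # True
print(torch.isclose(manual.mean(), mean_loss).item()) # True
```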

### MultiLabelMarginLoss()
Creates a criterion that optimizes a multi-class multi-classification hinge loss, meaning that a sample x can have several correct labels at once.
- Input: (C,) or (N, C), where N is the batch size and C is the number of classes
- Target: (C,) or (N, C), same shape as the input, holding class indices padded with -1; entries after the first -1 are ignored
- Output: scalar. If `reduction` is `'none'`, then (N,)

In [63]:
# torch.nn.MultiLabelMarginLoss(size_average=None, reduce=None, reduction='mean')

import torch
loss = torch.nn.MultiLabelMarginLoss()
x = torch.FloatTensor([[0.1, 0.2, 0.4, 0.8]])
# Single class hinge loss, so label == 3
y = torch.LongTensor([[3, -1, -1, -1]])
output = loss(x, y)
print(f'single class loss: {output.item():.4f}')
expected = torch.tensor([0.25 * ((1-(0.8-0.1)) + (1-(0.8-0.2)) + (1-(0.8-0.4)))])
assert(torch.isclose(output, expected))

# Multi-label hinge loss, so label == 3 and label == 0
y = torch.LongTensor([[3, 0, -1, -1]])
output = loss(x, y)
print(f'multi-class loss: {output.item():.4f}')
expected = torch.tensor([0.25 * ((1-(0.1-0.2)) + (1-(0.1-0.4)) + (1-(0.8-0.2)) + (1-(0.8-0.4)))])
assert(torch.isclose(output, expected))

#

single class loss: 0.3250
multi-class loss: 0.8500
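
The `expected` values in the cell above follow one pattern: for every target label j and every non-target class i, add max(0, 1 - (x[j] - x[i])), then divide by the number of classes. The helper below is a hypothetical reference implementation of that pattern for a single sample, written for illustration and checked against the built-in loss.

```python
import torch

def multilabel_margin_single(x, y):
    """Hinge loss for one sample x of shape (C,) with -1 padded targets y of shape (C,)."""
    # Collect target labels up to (not including) the first -1.
    labels = []
    for label in y.tolist():
        if label == -1:
            break
        labels.append(label)
    # Every class that is not a target label.
    others = [i for i in range(x.numel()) if i not in labels]
    total = x.new_zeros(())
    for j in labels:
        for i in others:
            total = total + torch.clamp(1 - (x[j] - x[i]), min=0)
    return total / x.numel()

x = torch.tensor([0.1, 0.2, 0.4, 0.8])
y = torch.tensor([3, 0, -1, -1])

manual = multilabel_margin_single(x, y)
builtin = torch.nn.MultiLabelMarginLoss()(x.unsqueeze(0), y.unsqueeze(0))
print(f'{manual.item():.4f} {builtin.item():.4f}')  # 0.8500 0.8500
```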
