<!--BOOK_INFORMATION-->
<img align="left" style="width:80px;height:98px;padding-right:20px;" src="https://raw.githubusercontent.com/joe-papa/pytorch-book/main/files/pytorch-book-cover.jpg">

This notebook contains an excerpt from the [PyTorch Pocket Reference](http://pytorchbook.com) book by [Joe Papa](http://joepapa.ai); content is available [on GitHub](https://github.com/joe-papa/pytorch-book).

In [1]:
import numpy as np
import matplotlib.pyplot as plt
import torch

In [2]:
torch.__version__

'2.1.2+cu121'

In [3]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(device)

cuda


## PyTorch Tensors
<hr>

Tensors are the fundamental data structure of PyTorch. A tensor is a multi-dimensional array, similar to NumPy's `ndarray`:

- A scalar can be represented as a zero-dimensional tensor.
- A vector can be represented as a one-dimensional tensor.
- ...

<div style="text-align:center">
    <img src="images/tensor_intro.png" width=500>
    <center><caption><br><font color="purple"><b><u>Figure 1:</u></b> Tensors of different shapes</font></caption></center>
</div>

In [4]:
x = torch.tensor([[1, 2]])    # 2D matrix of size: (1, 2)
y = torch.tensor([[1], [2]])  # 2D matrix of size: (2, 1)
z = torch.tensor([1, 2])      # 1D vector of size: (2,  )

In [5]:
# tensor.shape OR tensor.size() to print the tensor's dimensions

print(x.shape)
print(y.shape)
print(z.shape)

torch.Size([1, 2])
torch.Size([2, 1])
torch.Size([2])
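The list above mentions zero-dimensional tensors, which the cells so far do not construct. A minimal sketch of the difference between a scalar tensor and a one-element vector (the names `s` and `v` are illustrative and not taken from the book):

```python
import torch

s = torch.tensor(3.5)      # scalar: zero-dimensional tensor
v = torch.tensor([3.5])    # vector with one element: one-dimensional tensor

print(s.shape, s.ndim)     # torch.Size([]) 0
print(v.shape, v.ndim)     # torch.Size([1]) 1
print(s.item())            # 3.5 as a plain Python float
```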


In [6]:
zeros = torch.zeros((3, 4))   # 2D tensor of zeros of size: (3, 4)
ones = torch.ones((2, 2))     # 2D tensor of ones of size: (2, 2)

In [7]:
zeros

tensor([[0., 0., 0., 0.],
        [0., 0., 0., 0.],
        [0., 0., 0., 0.]])

In [11]:
# x = 2D tensor of size (3, 4) containing random integers [0, 10)
x = torch.randint(low=0, high=10, size=(3, 4))
print(x)

tensor([[4, 6, 7, 5],
        [1, 3, 1, 1],
        [4, 5, 8, 8]])


In [67]:
# y = 2D tensor of size (3, 4) taken from the standard normal distribution
y = torch.randn((3, 4))
print(y)

tensor([[-0.1624, -1.0752, -0.2584,  1.0793],
        [ 1.5542,  1.5877, -0.3448,  0.3777],
        [-0.1208,  0.5605,  1.9917, -0.4709]])


In [68]:
y.type()

'torch.FloatTensor'

In [69]:
y = y.type(torch.int32)

In [70]:
y

tensor([[ 0, -1,  0,  1],
        [ 1,  1,  0,  0],
        [ 0,  0,  1,  0]], dtype=torch.int32)

In [72]:
# Use to() method to cast to a new type
y = y.to(dtype=torch.float64)
y

tensor([[ 0., -1.,  0.,  1.],
        [ 1.,  1.,  0.,  0.],
        [ 0.,  0.,  1.,  0.]], dtype=torch.float64)
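The integer output above shows that casting a float tensor to an integer type drops the fractional part instead of rounding. A small sketch of that behaviour, using made-up values chosen to avoid ties:

```python
import torch

t = torch.tensor([-1.9, -0.4, 0.6, 1.9])

truncated = t.to(torch.int32)          # fractional part is dropped (toward zero)
rounded = t.round().to(torch.int32)    # round first if the nearest integer is wanted

print(truncated)   # tensor([-1,  0,  0,  1], dtype=torch.int32)
print(rounded)     # tensor([-2,  0,  1,  2], dtype=torch.int32)
```

Shorthand methods such as `t.int()`, `t.long()`, `t.float()` and `t.double()` perform the same kind of cast.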

### Casting NumPy Arrays into Torch Tensors

- `torch.tensor()`
- `torch.from_numpy()`

In [27]:
x = np.array([
    [1, 2, 3],
    [4, 5, 6],
])

y = torch.tensor(x)
# y = torch.from_numpy(x)

print(type(x), type(y))

<class 'numpy.ndarray'> <class 'torch.Tensor'>


In [28]:
y

tensor([[1, 2, 3],
        [4, 5, 6]], dtype=torch.int32)
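The two functions listed above differ in how they treat the underlying memory: `torch.tensor()` copies the data, while `torch.from_numpy()` returns a tensor that shares memory with the array. A short sketch (the names `arr`, `copied` and `shared` are illustrative):

```python
import numpy as np
import torch

arr = np.array([[1, 2, 3], [4, 5, 6]])

copied = torch.tensor(arr)        # independent copy of the data
shared = torch.from_numpy(arr)    # shares memory with `arr`

arr[0, 0] = 100                   # modify the NumPy array in place

print(copied[0, 0].item())        # 1   (unaffected by the change)
print(shared[0, 0].item())        # 100 (sees the change)
```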

### Operations on Tensors

In [32]:
x = torch.tensor([
    [1, 2, 3],
    [4, 5, 6]
])

In [37]:
x

tensor([[1, 2, 3],
        [4, 5, 6]])

In [33]:
print(x * 10)        # multiplies every element by 10

tensor([[10, 20, 30],
        [40, 50, 60]])


In [34]:
print(x.add(10))     # adds 10 to the tensor elements

tensor([[11, 12, 13],
        [14, 15, 16]])


In [40]:
y = x.view(3, 2)  # reshape x into tensor (3, 2)

In [41]:
y

tensor([[1, 2],
        [3, 4],
        [5, 6]])

In [42]:
print(x.view(4, -1))  # invalid: x has 6 elements, which cannot be split evenly into 4 rows

RuntimeError: shape '[4, -1]' is invalid for input of size 6
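The `-1` in `view` asks PyTorch to infer that dimension from the total number of elements, which is why it only works when the other dimensions divide the element count evenly. A minimal sketch with the same 2x3 tensor (variable names `a`, `b`, `c` are illustrative):

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6]])   # 6 elements

a = x.view(-1)       # flatten: shape (6,)
b = x.view(3, -1)    # 6 / 3 = 2, so shape (3, 2)
c = x.view(-1, 6)    # 6 / 6 = 1, so shape (1, 6)

print(a.shape, b.shape, c.shape)

# A view shares storage with the original tensor
a[0] = 100
print(x[0, 0].item())   # 100
```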

In [43]:
# another way to reshape a tensor is by using the `squeeze` method
x = torch.randn(10, 1, 10)     # 3D tensor of size (10, 1, 10)

# remove the dimension at index 1
y = torch.squeeze(x, 1)        # 2D tensor of size (10, 10)
# y = x.squeeze(1)

In [45]:
# the opposite of `squeeze` is `unsqueeze` to add a dimension

reverse_y = y.unsqueeze(1)    # add a new dimension at index 1 (the index argument is required)
print(reverse_y.shape)        # y.shape = (10, 10); after adding the dim, reverse_y.shape = (10, 1, 10)

torch.Size([10, 1, 10])


In [46]:
# unsqueezing can also be done using [None] indexing
# adding None will auto-create a fake dimension at the specified axis

x = torch.randn(10, 10)
z1, z2, z3 = x[None], x[:,None], x[:,:,None]

print(x.shape)
print(z1.shape, z2.shape, z3.shape)

torch.Size([10, 10])
torch.Size([1, 10, 10]) torch.Size([10, 1, 10]) torch.Size([10, 10, 1])
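As a quick check, each `None` indexing form above should match `unsqueeze` at the corresponding position. The sketch below verifies this and also shows a negative position, which counts from the end:

```python
import torch

x = torch.randn(10, 10)

# Each indexing form matches unsqueeze at the corresponding position
same_0 = torch.equal(x[None], x.unsqueeze(0))
same_1 = torch.equal(x[:, None], x.unsqueeze(1))
same_2 = torch.equal(x[:, :, None], x.unsqueeze(2))
print(same_0, same_1, same_2)

# -1 appends a trailing dimension
print(x.unsqueeze(-1).shape)   # torch.Size([10, 10, 1])
```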


In [2]:
# MATRIX MULTIPLICATION

x = torch.tensor([[1, 2],[3, 4]])    # shape = (2, 2)
y = torch.tensor([[5],[6]])          # shape = (2, 1)
z = torch.matmul(x, y)
print(z)

tensor([[17],
        [39]])


In [3]:
# alternatively, multiplication can be performed as
print(x @ y)

tensor([[17],
        [39]])


In [4]:
batch = 10
a = torch.rand(batch, 2, 3)
b = torch.rand(batch, 3, 4)

# batch matrix-matrix multiplication
z = torch.bmm(a, b)

In [6]:
z.shape

torch.Size([10, 2, 4])
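To make explicit what `torch.bmm` computes, the sketch below compares it against multiplying each pair of matrices in a loop, and against the `@` operator, which also handles a leading batch dimension. The names `z_bmm`, `z_loop` and `z_matmul` are illustrative:

```python
import torch

batch = 10
a = torch.rand(batch, 2, 3)
b = torch.rand(batch, 3, 4)

z_bmm = torch.bmm(a, b)

# Multiply each pair of matrices separately and stack the results
z_loop = torch.stack([a[i] @ b[i] for i in range(batch)])

# The @ operator (torch.matmul) treats the leading dimension as a batch
z_matmul = a @ b

print(z_bmm.shape)
print(torch.allclose(z_bmm, z_loop), torch.allclose(z_bmm, z_matmul))
```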

In [50]:
# CONCATENATION

x = torch.randn(10, 10, 10)
y = torch.cat([x, x], dim=0)

print('Cat axis 0:', x.shape, y.shape)

Cat axis 0: torch.Size([10, 10, 10]) torch.Size([20, 10, 10])


In [54]:
# AGGREGATE FUNCTIONS

x = torch.arange(25).reshape(5, 5)

print(x)
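print('Max:', x.max())   # maximum over all elements

# dim=0 reduces over the rows: one maximum per column, with the row index where it occurs
print('\nThe maximum row (dim=0) is the following: (entries and the row number) \n', x.max(dim=0))
# dim=1 reduces over the columns: one maximum per row, with the column index where it occurs
print('\nThe maximum col (dim=1) is the following: (entries and the col number) \n', x.max(dim=1))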

tensor([[ 0,  1,  2,  3,  4],
        [ 5,  6,  7,  8,  9],
        [10, 11, 12, 13, 14],
        [15, 16, 17, 18, 19],
        [20, 21, 22, 23, 24]])
Max: tensor(24)

The maximum row (dim=0) is the following: (entries and the row number) 
 torch.return_types.max(
values=tensor([20, 21, 22, 23, 24]),
indices=tensor([4, 4, 4, 4, 4]))

The maximum col (dim=1) is the following: (entries and the col number) 
 torch.return_types.max(
values=tensor([ 4,  9, 14, 19, 24]),
indices=tensor([4, 4, 4, 4, 4]))
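The object returned by `max(dim=...)` is a named tuple, so it can be unpacked or accessed by field name. A short sketch of a few related calls on the same tensor:

```python
import torch

x = torch.arange(25).reshape(5, 5)

values, indices = x.max(dim=1)    # unpack the (values, indices) named tuple
print(values)                     # tensor([ 4,  9, 14, 19, 24])
print(indices)                    # tensor([4, 4, 4, 4, 4])

print(x.argmax(dim=1))            # indices only

# keepdim=True keeps the reduced dimension with size 1
print(x.max(dim=1, keepdim=True).values.shape)   # torch.Size([5, 1])

print(x.sum(dim=0))               # column sums: tensor([50, 55, 60, 65, 70])
```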


In [55]:
# Permute the dimensions of a tensor object
# To swap dimensions, use permute rather than view: view will not raise an error,
# but it only reinterprets the existing memory layout, so elements end up in the wrong positions

x = torch.randn(10, 20, 30)
z = x.permute(2, 0, 1)
print('Permute dimensions:', x.shape, z.shape)

Permute dimensions: torch.Size([10, 20, 30]) torch.Size([30, 10, 20])
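The difference between `permute` and `view` is easier to see on a small tensor. In the sketch below both results have shape (3, 2), but only `permute` produces the transpose:

```python
import torch

x = torch.arange(6).reshape(2, 3)
# tensor([[0, 1, 2],
#         [3, 4, 5]])

swapped = x.permute(1, 0)    # true transpose: element [i, j] moves to [j, i]
# tensor([[0, 3],
#         [1, 4],
#         [2, 5]])

viewed = x.view(3, 2)        # same shape, but elements are only re-chunked in row-major order
# tensor([[0, 1],
#         [2, 3],
#         [4, 5]])

print(torch.equal(swapped, x.T))   # True
print(torch.equal(viewed, x.T))    # False

# permute returns a non-contiguous tensor; call .contiguous() before .view()
flat = swapped.contiguous().view(-1)
print(flat)                        # tensor([0, 3, 1, 4, 2, 5])
```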


### Moving Tensors between CPU & GPU

In [60]:
x = torch.tensor([[1, 2, 3],[4, 5, 6]])
y = torch.tensor([[7, 8, 9],[4, 5, 6]])

x = x.to(device)   # .to() returns a new tensor, so the result must be reassigned
y = y.to(device)

z = x + y          # computed on `device`
z

tensor([[ 8, 10, 12],
        [ 8, 10, 12]], device='cuda:0')

In [61]:
x = torch.tensor([[1, 2, 3],[4, 5, 6]], device=device)
y = torch.tensor([[7, 8, 9],[4, 5, 6]], device=device)

z = torch.add(x, y)

In [62]:
z

tensor([[ 8, 10, 12],
        [ 8, 10, 12]], device='cuda:0')
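A point that is easy to miss is that `.to(device)` does not move a tensor in place; it returns a new tensor. The sketch below is written to run with or without a GPU (the name `x_dev` is illustrative):

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.tensor([[1, 2, 3], [4, 5, 6]])   # created on the CPU
x_dev = x.to(device)                       # the result must be assigned

print(x.device)       # cpu: the original tensor is unchanged
print(x_dev.device)   # cuda:0 if a GPU is available, otherwise cpu

# Bring a result back to the CPU before converting it to NumPy
z = (x_dev + x_dev).cpu().numpy()
print(z)
```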

### Indexing

In [73]:
x = torch.tensor([[1,2],[3,4],[5,6],[7,8]])
print(x)

tensor([[1, 2],
        [3, 4],
        [5, 6],
        [7, 8]])


In [74]:
# Indexing, returns a tensor
print(x[1,1])

# Indexing, returns value as Python number
print(x[1,1].item())

tensor(4)
4


In [75]:
# Slicing
print(x[:2,1])

# Boolean indexing: only keep elements less than 5
print(x[x<5])

tensor([2, 4])
tensor([1, 2, 3, 4])
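Slicing and boolean indexing differ in one practical respect: a slice is a view into the original tensor, while a boolean mask returns a new tensor. A minimal sketch (the names `col` and `small` are illustrative):

```python
import torch

x = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])

col = x[:, 1]       # basic slicing: a view into x
small = x[x < 5]    # boolean indexing: a new tensor holding copies

col[0] = 20         # writes through to x
small[0] = -1       # does not touch x

print(x[0])         # tensor([ 1, 20])

# A boolean mask on the left-hand side assigns into x in place
x[x > 6] = 0
print(x)
```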


In [76]:
# Transpose array, x.t() or x.T can be used
print(x.T)

tensor([[1, 3, 5, 7],
        [2, 4, 6, 8]])


In [77]:
# Combining tensors
y = torch.stack((x, x))
print(y)

tensor([[[1, 2],
         [3, 4],
         [5, 6],
         [7, 8]],

        [[1, 2],
         [3, 4],
         [5, 6],
         [7, 8]]])
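`torch.stack` creates a new dimension, whereas `torch.cat` (used earlier) extends an existing one. A short sketch comparing the resulting shapes:

```python
import torch

x = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])   # shape (4, 2)

stacked = torch.stack((x, x))              # new leading dimension: (2, 4, 2)
stacked_last = torch.stack((x, x), dim=2)  # new trailing dimension: (4, 2, 2)
concatenated = torch.cat((x, x), dim=0)    # existing dimension grows: (8, 2)

print(stacked.shape, stacked_last.shape, concatenated.shape)
```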


In [78]:
# Splitting tensors
a,b = x.unbind(dim=1)
print(a,b)

tensor([1, 3, 5, 7]) tensor([2, 4, 6, 8])
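`unbind` removes the dimension it splits along. When the pieces should keep that dimension, `chunk` and `split` are possible alternatives; the sketch below shows both on the same tensor (the variable names are illustrative):

```python
import torch

x = torch.tensor([[1, 2], [3, 4], [5, 6], [7, 8]])   # shape (4, 2)

top, bottom = x.chunk(2, dim=0)         # two equal chunks of shape (2, 2)
first, rest = x.split([1, 3], dim=0)    # pieces with 1 row and 3 rows

print(top.shape, bottom.shape)
print(first.shape, rest.shape)
```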


### Automatic Differentiation (Autograd)

In [81]:
x = torch.tensor([[1,2,3],[4,5,6]], 
         dtype=torch.float, requires_grad=True)

# let's define result as the sum of squares of the entries in x
f = x.pow(2).sum()

print(f)

# f is the sum of x_ij^2, so the gradient df/dx_ij is 2 * x_ij
f.backward()

x.grad

tensor(91., grad_fn=<SumBackward0>)


tensor([[ 2.,  4.,  6.],
        [ 8., 10., 12.]])
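The result above can be checked against the analytic gradient $2x$. The sketch below also illustrates two behaviours worth knowing: gradients accumulate across `backward()` calls, and `torch.no_grad()` disables tracking. The names `g` and `h` are illustrative:

```python
import torch

x = torch.tensor([[1, 2, 3], [4, 5, 6]], dtype=torch.float, requires_grad=True)

f = x.pow(2).sum()
f.backward()
print(torch.equal(x.grad, 2 * x.detach()))   # True: df/dx = 2x

# Gradients accumulate: a second backward pass adds another 2x
g = x.pow(2).sum()
g.backward()
print(torch.equal(x.grad, 4 * x.detach()))   # True: 2x + 2x

x.grad.zero_()                               # reset before the next pass

# Inside no_grad, operations are not recorded for differentiation
with torch.no_grad():
    h = x.pow(2).sum()
print(h.requires_grad)                       # False
```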

## NumPy vs PyTorch
<hr>

Training a deep neural network involves tens of thousands of matrix multiplications, which are computationally expensive. PyTorch tensors can run these operations on a GPU, which speeds them up considerably. Below is a comparison between NumPy arrays and PyTorch tensors.

In [82]:
x = torch.rand(1, 6400)
y = torch.rand(6400, 5000)

x, y = x.to(device), y.to(device)

`x` and `y` are fairly large matrices. Computing $z=x \times y$ with tensors on the GPU takes about **0.5 ms**:

In [83]:
%timeit z = (x@y)

502 µs ± 2.39 µs per loop (mean ± std. dev. of 7 runs, 1,000 loops each)


Computing $z=x \times y$ with tensors on the CPU takes about **4.6 ms**:

In [84]:
x, y = x.cpu(), y.cpu()

%timeit z = (x@y)

4.6 ms ± 61.9 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)


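Computing $z=x \times y$ with `np.matmul` takes about **5.73 ms**. Note that `x` and `y` are still CPU tensors in this cell, so NumPy has to convert them on every call; converting them once with `x.numpy()` and `y.numpy()` beforehand would likely give a fairer comparison.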

In [85]:
%timeit z = np.matmul(x, y)

5.73 ms ± 130 µs per loop (mean ± std. dev. of 7 runs, 100 loops each)
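CUDA kernels are launched asynchronously, so a GPU timing is more reliable when the device is synchronized before the clock is stopped. The sketch below is one way to repeat the comparison outside of `%timeit`, assuming the same matrix sizes; `time_matmul` is a hypothetical helper and not part of the book, and the numbers it prints will depend on the hardware:

```python
import time

import torch


def time_matmul(a, b, repeats=20):
    """Average seconds per `a @ b` (hypothetical helper, for illustration only)."""
    on_gpu = isinstance(a, torch.Tensor) and a.is_cuda
    if on_gpu:
        torch.cuda.synchronize()      # finish any pending GPU work first
    start = time.perf_counter()
    for _ in range(repeats):
        z = a @ b
    if on_gpu:
        torch.cuda.synchronize()      # wait for the queued kernels to finish
    return (time.perf_counter() - start) / repeats


device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

x = torch.rand(1, 6400)
y = torch.rand(6400, 5000)

t_numpy = time_matmul(x.numpy(), y.numpy())          # genuine NumPy arrays
t_cpu = time_matmul(x, y)                            # tensors on the CPU
t_dev = time_matmul(x.to(device), y.to(device))      # tensors on `device`

print(f"NumPy arrays:        {t_numpy * 1e3:.2f} ms")
print(f"CPU tensors:         {t_cpu * 1e3:.2f} ms")
print(f"Tensors on {device}: {t_dev * 1e3:.2f} ms")
```

On a machine without a GPU, the last two measurements both run on the CPU.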
