PyTorch is a popular deep-learning framework that allows you to
- build a neural network of arbitrary complexity
- perform computations on <b>hardware accelerators</b> (GPUs, TPUs, ...)
- automatically compute the gradient of the loss function w.r.t. the weight vectors in the neural network
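
As a minimal sketch of the last point, the snippet below lets PyTorch compute the gradient of a loss with respect to a single weight. The one-weight "model" and the values of w, x and target are made up for illustration only.

```python
import torch

# Hypothetical one-weight model: prediction = w * x, with a squared-error loss.
w = torch.tensor([2.0], requires_grad=True)  # weight tracked by autograd
x = torch.tensor([3.0])
target = torch.tensor([9.0])

loss = ((w * x - target) ** 2).sum()
loss.backward()  # fills w.grad with d(loss)/dw

# By hand: d/dw (w*x - target)^2 = 2 * (w*x - target) * x = 2 * (6 - 9) * 3 = -18
print(w.grad)
```

The value stored in w.grad should match the hand-computed derivative in the comment.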

Tensors are at the core of PyTorch: they are the only way data is represented in the framework.

Whether you have texts, images, videos or even molecules as your input data, you will have to convert them into tensors somehow before unleashing the power of PyTorch!


#### Tensors

Tensors are a specialized data structure, very similar to arrays and matrices. In PyTorch we use tensors to encode the inputs and outputs of a model, as well as the model's parameters.

Tensors are similar to NumPy's ndarrays, except that tensors can run on GPUs or other <b>hardware accelerators</b>. In fact, tensors and NumPy arrays can often share the same underlying memory, eliminating the need to copy data. Tensors are also optimized for <b>automatic differentiation</b>.

In [4]:
import torch
import numpy as np


#### Initializing a Tensor

Tensors can be created directly from data. The data type is automatically inferred.

In [2]:
data = [[1,2], [3,4]]
print(data)

x_data = torch.tensor(data)
x_data


[[1, 2], [3, 4]]


tensor([[1, 2],
        [3, 4]])
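
To make the dtype inference concrete, here is a small sketch that compares the inferred dtype for integer, floating-point and mixed inputs, and shows how to set it explicitly. The variable names are illustrative, and the float result assumes PyTorch's default dtype has not been changed.

```python
import torch

ints = torch.tensor([[1, 2], [3, 4]])              # Python ints
floats = torch.tensor([[1.0, 2.0], [3.0, 4.0]])    # Python floats
mixed = torch.tensor([1, 2.5])                     # ints are promoted to float
explicit = torch.tensor([[1, 2], [3, 4]], dtype=torch.float64)  # dtype set by hand

print(ints.dtype)      # integer data gives an integer dtype
print(floats.dtype)    # float data gives the default floating-point dtype
print(mixed.dtype)     # mixed data also gives the default floating-point dtype
print(explicit.dtype)  # the dtype argument overrides the inference
```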

##### From a NumPy array

Tensors can be created from NumPy arrays (and vice versa).

In [5]:
np_array = np.array(data)
print(np_array)
x_np = torch.from_numpy(np_array)
x_np

[[1 2]
 [3 4]]


tensor([[1, 2],
        [3, 4]])
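
One detail worth knowing here: torch.from_numpy shares memory with the source array, whereas torch.tensor copies the data. The sketch below contrasts the two; the names shared and copied are illustrative.

```python
import numpy as np
import torch

np_array = np.array([[1, 2], [3, 4]])
shared = torch.from_numpy(np_array)  # shares memory with np_array
copied = torch.tensor(np_array)      # makes an independent copy

np_array[0, 0] = 100  # modify the NumPy array in place

print(shared[0, 0].item())  # 100: the change is visible through the shared tensor
print(copied[0, 0].item())  # 1: the copy is unaffected
```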

##### From another tensor

The new tensor retains the properties (shape, data type) of the argument tensor, unless explicitly overridden.

In [20]:
x_ones = torch.ones_like(x_data) # retains the properties of x_data
print(f"Ones Tensor: \n {x_ones} \n")

x_fives = torch.full_like(x_data, 5)
print(f"Fives Tensor: \n {x_fives}\n")

# x_data has an integer dtype; the dtype argument below overrides it with float
print(f"X dtype: {x_data.dtype}")
x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Random Tensor: \n {x_rand} \n")
print(f"X_rand dtype: {x_rand.dtype}")


Ones Tensor: 
 tensor([[1, 1],
        [1, 1]]) 

Fives Tensor: 
 tensor([[5, 5],
        [5, 5]])

X dtype: torch.int64
Random Tensor: 
 tensor([[0.6179, 0.2519],
        [0.2334, 0.1559]]) 

X_rand dtype: torch.float32


##### With random or constant values

<i>shape</i> is a tuple of tensor dimensions. In the functions below, it determines the dimensionality of the output tensor. 

Another commonly used function is torch.randn, which draws numbers from the standard normal distribution.

In [17]:
# No "_like" suffix here: instead of copying another tensor's shape,
# we pass the shape of the new tensor explicitly

shape = (2,3)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)
fives_tensor = torch.full(shape, fill_value=5.0)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor} \n")
print(f"Fives Tensor: \n {fives_tensor} \n")

Random Tensor: 
 tensor([[0.4174, 0.6166, 0.2953],
        [0.4896, 0.0758, 0.3680]]) 

Ones Tensor: 
 tensor([[1., 1., 1.],
        [1., 1., 1.]]) 

Zeros Tensor: 
 tensor([[0., 0., 0.],
        [0., 0., 0.]]) 

Fives Tensor: 
 tensor([[5., 5., 5.],
        [5., 5., 5.]]) 
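
Since torch.randn is mentioned above but not shown, here is a hedged sketch that samples from both torch.rand and torch.randn and prints summary statistics. The seed and the large shape are arbitrary choices made so the statistics are stable.

```python
import torch

torch.manual_seed(0)  # fix the seed so the run is repeatable

shape = (1000, 1000)
uniform = torch.rand(shape)   # uniform samples from [0, 1)
normal = torch.randn(shape)   # standard normal samples (mean 0, std 1)

print(f"rand:  min={uniform.min().item():.3f}, max={uniform.max().item():.3f}")
print(f"randn: mean={normal.mean().item():.3f}, std={normal.std().item():.3f}")
```

With a million samples, the randn mean should be close to 0 and the standard deviation close to 1, while every rand value stays inside [0, 1).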



### Attributes of a Tensor

Tensor attributes describe their shape, data type, and the device on which they are stored.

In [23]:
tensor = torch.rand(3,4)

print(f"Shape of tensor: {tensor.shape}")
print(f"Datatype of tensor: {tensor.dtype}")
print(f"Device tensor is stored on: {tensor.device}")



Shape of tensor: torch.Size([3, 4])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu


A tensor also has a <b>method</b> called <i>size</i> to return its shape

In [21]:
tensor.size()

torch.Size([3, 4])

Because <i>size</i> is a <b>method</b>, you can actually pass in an argument specifying a single dimension that you want to know the size of. 

In [22]:
print(f"Size of dim 0: {tensor.size(dim=0)}") # "dim" is whats known as an "axis" in NumPy
print(f"Size of dim 1: {tensor.size(dim=1)}")

Size of dim 0: 3
Size of dim 1: 4


You can also index the shape attribute directly.

In [25]:
print(f"Shape of tensor 0: {tensor.shape[0]}")
print(f"Shape of tensor 1: {tensor.shape[1]}")


Shape of tensor 0: 3
Shape of tensor 1: 4
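
A few related ways of querying a tensor's dimensions, shown as a small sketch (the names rows and cols are illustrative):

```python
import torch

tensor = torch.rand(3, 4)

print(tensor.ndim)          # number of dimensions: 2
print(tensor.numel())       # total number of elements: 12
print(tensor.size(dim=-1))  # negative dims count from the end: 4

rows, cols = tensor.shape   # torch.Size unpacks like a tuple
print(rows, cols)           # 3 4
```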


### Operations on Tensors

PyTorch provides over 100 tensor operations, including arithmetic, linear algebra, and matrix manipulation (transposing, indexing, slicing).

Each of these operations can be run on a GPU (typically at higher speed than on a CPU).

#### Using GPUs

How can we know if a GPU is available on our machine?

In [27]:
# Returns True if CUDA is set up and a GPU is available, False otherwise
torch.cuda.is_available()

True

"GPU"s are often referred to as "cuda" in PyTorch. 

CUDA is a software toolkit that supports programming on NVIDIA's GPUs. Luckily, thanks to PyTorch, we don't have to write CUDA code ourselves. Early deep-learning practitioners who wanted to use GPUs likely had to write CUDA code in C++ themselves.

The <span style="background-color:black">nvidia-smi</span> terminal command is useful for checking the status of your GPU.

In [28]:
!nvidia-smi

Fri Oct 27 10:49:49 2023       
+---------------------------------------------------------------------------------------+
| NVIDIA-SMI 530.30.02              Driver Version: 530.30.02    CUDA Version: 12.1     |
|-----------------------------------------+----------------------+----------------------+
| GPU  Name                  Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf            Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                                         |                      |               MIG M. |
|   0  NVIDIA RTX A3000 Laptop GPU     On | 00000000:01:00.0 Off |                  N/A |
| N/A   65C    P0               88W / 115W|   5128MiB /  6144MiB |     96%      Default |
|                                         |                      |                  N/A |
+-----------------------------------------+----------------------+----------------------+
                                                                    

To switch seamlessly from CPU to GPU or vice versa, you can define a <span style="background-color:black">device</span> object that represents your current choice of device to run tensor operations on.

In [29]:
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
device

device(type='cuda')

Now, for any tensor creation operation that accepts a keyword argument <span style="background-color:black">device</span>, you can pass this <span style="background-color:black">device</span> in and your new tensor will be created directly on that device. For example:

In [30]:
ones_tensor = torch.ones((2,3), device=device)
ones_tensor

tensor([[1., 1., 1.],
        [1., 1., 1.]], device='cuda:0')

In [31]:
ones_tensor.device

device(type='cuda', index=0)

If one tensor is already on a specific device, you can create a new tensor on the same device using the <span style="background-color:black">torch.*_like</span> functions. In this case, you don't need to pass in <span style="background-color:black">device</span>.

In [34]:
fives_tensor = torch.full_like(ones_tensor, 5.0)
fives_tensor.device

device(type='cuda', index=0)

You may often find yourself in a situation where you first create a tensor on the CPU but later want to move it to the GPU. In that case, you can use the <span style="background-color:black">.to</span> method of a tensor:

In [35]:
data = [[1,2], [3,4]]
x_data = torch.tensor(data)
x_data.device

device(type='cpu')

In [36]:
x_data = x_data.to(device)
x_data.device

device(type='cuda', index=0)

But keep in mind that copying large tensors across devices can be expensive in terms of time and memory!
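
One way to avoid unnecessary transfers is to allocate tensors directly on the target device instead of creating them on the CPU and moving them afterwards. The sketch below falls back to the CPU when no GPU is present; the names direct, moved and same are illustrative.

```python
import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Preferred when possible: allocate directly on the target device.
direct = torch.zeros((1000, 1000), device=device)

# Alternative: allocate on the CPU, then copy across (an extra transfer on GPU machines).
moved = torch.zeros((1000, 1000)).to(device)

# .to() returns the very same tensor when it is already on the requested device.
same = direct.to(direct.device)
print(same is direct)
```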

<b> Standard NumPy-like indexing and slicing </b>

In [37]:
tensor = torch.ones(4,4)
print(f"First row: {tensor[0]}")
print(f"First column {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}")
tensor[:,1] = 0
print(tensor)

First row: tensor([1., 1., 1., 1.])
First column tensor([1., 1., 1., 1.])
Last column: tensor([1., 1., 1., 1.])
tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])


<b> Joining tensors </b>

You can use <span style="background-color:black">torch.cat</span> to concatenate a sequence of tensors along a given dimension.

In [39]:
t1 = torch.cat([tensor, tensor, tensor], dim=1) # same as np.hstack
t1

tensor([[1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.],
        [1., 0., 1., 1., 1., 0., 1., 1., 1., 0., 1., 1.]])

In [44]:
t2 = torch.cat([tensor, tensor, tensor], dim=0) # same as np.vstack
print(t2.shape)
t2

torch.Size([12, 4])


tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]])

In [43]:
t3 = torch.stack([tensor, tensor, tensor], dim=0) # same as np.stack
print(t3.shape)
t3

torch.Size([3, 4, 4])


tensor([[[1., 0., 1., 1.],
         [1., 0., 1., 1.],
         [1., 0., 1., 1.],
         [1., 0., 1., 1.]],

        [[1., 0., 1., 1.],
         [1., 0., 1., 1.],
         [1., 0., 1., 1.],
         [1., 0., 1., 1.]],

        [[1., 0., 1., 1.],
         [1., 0., 1., 1.],
         [1., 0., 1., 1.],
         [1., 0., 1., 1.]]])
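
The difference between the two functions is easiest to see in the resulting shapes: torch.cat joins tensors along an existing dimension, while torch.stack inserts a new one. A small sketch with non-square inputs (a and b are illustrative):

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

print(torch.cat([a, b], dim=0).shape)    # torch.Size([4, 3])
print(torch.cat([a, b], dim=1).shape)    # torch.Size([2, 6])
print(torch.stack([a, b], dim=0).shape)  # torch.Size([2, 2, 3])
print(torch.stack([a, b], dim=2).shape)  # torch.Size([2, 3, 2])
```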

<b> Arithmetic Operations </b>

In [50]:
# This computes the matrix product of two tensors. y1, y2, y3 will have the same value
y1 = tensor @ tensor.T 
print(f"y1: \n {y1} \n")
y2 = tensor.matmul(tensor.T)
print(f"y2: \n {y2} \n")

y3 = torch.rand_like(y1)
print(f"y3: \n {y3} \n")

matmul_1 = torch.matmul(tensor, tensor.T, out=y3)
print(f"matmul_1: \n {matmul_1} \n")
print(f"y3: \n {y3} \n")



# This computes the element-wise product. z1, z2, z3 will have the same value
z1 = tensor * tensor
print(f"z1: \n {z1} \n")

z2 = tensor.mul(tensor)
print(f"z2: \n {z2} \n")

z3 = torch.rand_like(tensor)
# Note: despite its name, matmul_2 holds the element-wise product (written into z3 via out=)
matmul_2 = torch.mul(tensor, tensor, out=z3)
print(f"matmul_2: \n {matmul_2} \n")




y1: 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]) 

y2: 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]) 

y3: 
 tensor([[0.4798, 0.1759, 0.8198, 0.9446],
        [0.7742, 0.6961, 0.5536, 0.0042],
        [0.0060, 0.1001, 0.1396, 0.5689],
        [0.5408, 0.9642, 0.7642, 0.9462]]) 

matmul_1: 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]) 

y3: 
 tensor([[3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.],
        [3., 3., 3., 3.]]) 

z1: 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

z2: 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

matmul_2: 
 tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 



<b>Single-element tensors</b> If you have a one-element tensor, for example by aggregating all values of a tensor into one value, you can convert it to a Python numerical value using <span style="background-color:black">item()</span>.

In [51]:
agg = tensor.sum()
agg_item = agg.item()
print(agg_item, type(agg_item))


12.0 <class 'float'>
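
For tensors with more than one element, tolist() plays the analogous role and returns a (nested) Python list. A brief sketch, rebuilding the same 4x4 tensor used above:

```python
import torch

tensor = torch.ones(4, 4)
tensor[:, 1] = 0

total = tensor.sum().item()            # one-element tensor -> Python float
per_row = tensor.sum(dim=1).tolist()   # multi-element tensor -> Python list

print(total)    # 12.0
print(per_row)  # [3.0, 3.0, 3.0, 3.0]
```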


<b>In-place operations</b> Operations that store the result into the operand are called in-place. They are denoted by a <span style="background-color:black">_</span> suffix. For example, <span style="background-color:black">x.copy_(y)</span> and <span style="background-color:black">x.t_()</span> will change <span style="background-color:black">x</span>.

In [52]:
print(f"{tensor} \n")
tensor.add_(5)
print(tensor)

tensor([[1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.],
        [1., 0., 1., 1.]]) 

tensor([[6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.],
        [6., 5., 6., 6.]])


In-place operations save some memory, but can be problematic when computing derivatives because of an immediate loss of history. <b> Hence, their use is discouraged by the community. </b>
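
One concrete way this shows up: PyTorch refuses an in-place update on a leaf tensor that requires gradients. The sketch below triggers that error on purpose and then shows the out-of-place alternative; the names w, raised and w_plus_one are illustrative.

```python
import torch

w = torch.ones(3, requires_grad=True)  # a leaf tensor tracked by autograd

raised = False
try:
    w.add_(1)  # in-place update of a leaf that requires grad
except RuntimeError as err:
    raised = True
    print(f"RuntimeError: {err}")

# Out-of-place version: creates a new tensor and leaves w untouched.
w_plus_one = w + 1
print(w_plus_one)
```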

Many tensor operations, arithmetic ones in particular, can be invoked in three ways.

Take the <span style="background-color:black">sqrt</span> operation as an example:

In [53]:
rand_tensor = torch.rand(2,3)

sqrt_first = torch.sqrt(rand_tensor) # torch.[op](tensor, ...)
sqrt_second = rand_tensor.sqrt()     # tensor.[op]()
rand_tensor.sqrt_()                  # tensor.[op]_(...) (in-place)


tensor([[0.1707, 0.5972, 0.7887],
        [0.6107, 0.9641, 0.8558]])

In [56]:
# Checks whether two tensors are element-wise equal within a tolerance
torch.allclose(sqrt_first, sqrt_second)

True

In [55]:
# Also True: sqrt_() above replaced rand_tensor's values with their square roots
torch.allclose(sqrt_first, rand_tensor)

True
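
torch.allclose compares within a tolerance, while torch.equal requires exactly the same values. A small sketch of the difference, using float64 so that the tiny perturbation (an arbitrary value chosen for illustration) is representable:

```python
import torch

a = torch.tensor([1.0, 2.0], dtype=torch.float64)
b = a + 1e-9  # tiny perturbation, well within allclose's default tolerances

print(torch.equal(a, b))     # False: exact comparison
print(torch.allclose(a, b))  # True: comparison within tolerance
```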

#### NumPy array from Tensor and to Tensor

In [57]:
t = torch.ones(5)
print(f"t:{t}")
n = t.numpy()
print(f"n: {n}")

t:tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]


A change in the tensor is reflected in the NumPy array, because they share the same memory.

In [58]:
t.add_(1)
print(f"t:{t}")
print(f"n: {n}")

t:tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]


NumPy array to Tensor

In [60]:
n = np.ones(5)
t = torch.from_numpy(n)

A change in the NumPy array is reflected in the tensor.

In [61]:
np.add(n, 1, out=n)
print(f"t:{t}")
print(f"n: {n}")

t:tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n: [2. 2. 2. 2. 2.]
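
When this sharing is not wanted, one option is to clone the tensor before converting it. The sketch below assumes a CPU tensor (a tensor on the GPU would first have to be moved to the CPU before calling numpy()); the names shared and independent are illustrative.

```python
import torch

t = torch.ones(5)
shared = t.numpy()               # view on the same memory as t
independent = t.clone().numpy()  # clone first, so the array has its own memory

t.add_(1)  # in-place change to the tensor

print(shared)       # [2. 2. 2. 2. 2.]
print(independent)  # [1. 1. 1. 1. 1.]
```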
