# Introduction to PyTorch 

`PyTorch` is a Python-based scientific computing package serving two broad purposes:

- A replacement for NumPy to use the power of GPUs.
- An automatic differentiation library. This comes in handy when calculating gradients, which are essential for neural network optimization.
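
To illustrate the second point, here is a minimal sketch of automatic differentiation. It assumes the standard `requires_grad=True` flag and the `.backward()` call from PyTorch's autograd; the function y = x² is chosen purely for illustration.

```python
import torch

# Ask PyTorch to track operations on x so that gradients can be computed
x = torch.tensor(3.0, requires_grad=True)

# y = x^2, so analytically dy/dx = 2x
y = x ** 2

# Backpropagate from y; the result is stored in x.grad
y.backward()

print(x.grad)  # tensor(6.)
```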

Some advantages of `PyTorch` over other similar libraries (e.g., `TensorFlow`) are its simplicity and the fact that it was built natively in Python, making it more intuitive and easier to use for Python developers.

At the end of this lesson you should:

- Understand why we use `PyTorch`.
- Learn the main features of `PyTorch`.


## 1. Installing and importing PyTorch

We will first install PyTorch with Python's `pip` installer by executing the following command:

In [None]:
!pip install torch torchvision

Once installed, we can import it. We will also import `NumPy`.

In [3]:
import torch
import numpy as np

## 2. Tensors

Tensors are the main data structure in PyTorch. They are used to encode inputs and outputs and to perform computations. In many ways, they are similar to `NumPy` arrays, except that they can run on GPUs.

### 2.1 Initialization

`.tensor()`: directly from a list of data.

In [4]:
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)
x_data

tensor([[1, 2],
        [3, 4]])

`.from_numpy()`: from a `NumPy` array.

In [5]:
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
x_np

tensor([[1, 2],
        [3, 4]])

`.zeros()`, `.ones()`, `.rand()`: They initialize tensors of a given shape with constant values (zeros or ones) or with random values drawn uniformly from [0, 1).

In [6]:
shape = (2, 2, 2)
rand_tensor = torch.rand(shape)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)

print(f"Random Tensor: \n {rand_tensor} \n")
print(f"Ones Tensor: \n {ones_tensor} \n")
print(f"Zeros Tensor: \n {zeros_tensor}")

Random Tensor: 
 tensor([[[0.8834, 0.5425],
         [0.8239, 0.8566]],

        [[0.2978, 0.1709],
         [0.3613, 0.9654]]]) 

Ones Tensor: 
 tensor([[[1., 1.],
         [1., 1.]],

        [[1., 1.],
         [1., 1.]]]) 

Zeros Tensor: 
 tensor([[[0., 0.],
         [0., 0.]],

        [[0., 0.],
         [0., 0.]]])


`.ones_like()`, `.rand_like()`: They initialize tensors of ones and random numbers, respectively, while retaining the properties (shape, datatype) of the original tensor.

In [7]:
x_ones = torch.ones_like(x_data) 
print(f" Tensor of ones: \n {x_ones} \n")

x_rand = torch.rand_like(x_data, dtype=torch.float) # overrides the datatype of x_data
print(f"Tensor of random numbers: \n {x_rand} \n")

 Tensor of ones: 
 tensor([[1, 1],
        [1, 1]]) 

Tensor of random numbers: 
 tensor([[0.8780, 0.3566],
        [0.4789, 0.4242]]) 



The following would fail because `rand_like` is only implemented for floating-point types and does not automatically upcast other types.

In [8]:
torch.rand_like(x_data)

RuntimeError: "check_uniform_bounds" not implemented for 'Long'
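
Two possible workarounds are sketched below, assuming the same integer `x_data` as above: convert the tensor to a floating-point type before calling `rand_like`, or use `torch.randint_like` when random integers are what is actually wanted. The bounds 0 and 10 are arbitrary choices for illustration.

```python
import torch

x_data = torch.tensor([[1, 2], [3, 4]])  # integer (int64) tensor

# Workaround 1: convert to float first, then sample uniformly from [0, 1)
x_rand_float = torch.rand_like(x_data.float())

# Workaround 2: stay with integers and sample from [0, 10)
x_rand_int = torch.randint_like(x_data, 0, 10)

print(x_rand_float.dtype, x_rand_float.shape)
print(x_rand_int.dtype, x_rand_int.shape)
```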

### 2.2 Attributes

In [10]:
rand_tensor = torch.rand((2,2,3))

print(f"Shape of tensor: {rand_tensor.shape}")
print(f"Datatype of tensor: {rand_tensor.dtype}")
print(f"Device tensor is stored on: {rand_tensor.device}")

Shape of tensor: torch.Size([2, 2, 3])
Datatype of tensor: torch.float32
Device tensor is stored on: cpu
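
Two further attributes are often useful alongside these. The sketch below assumes a tensor of the same shape as above and shows `.ndim` (number of dimensions) and `.numel()` (total number of elements).

```python
import torch

rand_tensor = torch.rand((2, 2, 3))

# Number of dimensions (the length of the shape)
print(rand_tensor.ndim)     # 3

# Total number of elements (the product of the shape: 2 * 2 * 3)
print(rand_tensor.numel())  # 12
```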


### 2.3 Some methods

- `.flatten()`: Convert to a 1D tensor

In [45]:
rand_tensor_1d = rand_tensor.flatten()
print(rand_tensor_1d)
print(rand_tensor_1d.shape)

tensor([0.8953, 0.2115, 0.7669, 0.3520, 0.3678, 0.6777, 0.4700, 0.7755, 0.6938,
        0.5987, 0.0118, 0.0060], device='cuda:0')
torch.Size([12])


- `.reshape()`: Change the shape of the tensor

In [51]:
rand_tensor_reshaped = rand_tensor_1d.reshape((12,1))

print(rand_tensor_reshaped)
print(rand_tensor_reshaped.shape)

tensor([[0.8953],
        [0.2115],
        [0.7669],
        [0.3520],
        [0.3678],
        [0.6777],
        [0.4700],
        [0.7755],
        [0.6938],
        [0.5987],
        [0.0118],
        [0.0060]], device='cuda:0')
torch.Size([12, 1])


- `.squeeze()`: Remove dimensions of size 1

In [53]:
rand_tensor_squeeze = rand_tensor_reshaped.squeeze()

print(rand_tensor_squeeze)
print(rand_tensor_squeeze.shape)

tensor([0.8953, 0.2115, 0.7669, 0.3520, 0.3678, 0.6777, 0.4700, 0.7755, 0.6938,
        0.5987, 0.0118, 0.0060], device='cuda:0')
torch.Size([12])
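
As a small complementary sketch, `.squeeze()` also accepts a specific dimension, and `.unsqueeze()` does the reverse by inserting a dimension of size 1. The tensor `t` below is a hypothetical example, not one of the tensors used above.

```python
import torch

t = torch.zeros((1, 12, 1))

# Remove every dimension of size 1
print(t.squeeze().shape)               # torch.Size([12])

# Remove only dimension 0
print(t.squeeze(0).shape)              # torch.Size([12, 1])

# Insert a new dimension of size 1 at position 0
print(t.squeeze().unsqueeze(0).shape)  # torch.Size([1, 12])
```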


### 2.4 Operations

Operations can run on the CPU or on a GPU. When a GPU is available, it is typically the faster option.


In [14]:
if torch.cuda.is_available():
  rand_tensor = rand_tensor.to('cuda')
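
A common way to write code that runs with or without a GPU is to choose the device once and reuse it. The following is a sketch of that pattern; the variable names `device`, `a`, `b` and `c` are illustrative.

```python
import torch

# Pick the GPU if one is available, otherwise fall back to the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Create a tensor directly on the chosen device...
a = torch.rand((2, 3), device=device)

# ...or move an existing tensor there
b = torch.ones((2, 3)).to(device)

# Both operands must live on the same device
c = a + b
print(c.device)

# Move the result back to the CPU (needed, for example, before calling .numpy())
c_cpu = c.cpu()
print(c_cpu.device)
```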

**Indexing and slicing**

This works much as it does in `NumPy`.

In [15]:
tensor = torch.rand((3,3))
print(tensor[:,:2])

tensor[0, :] = 0  
print(tensor)

tensor([[0.0059, 0.8722],
        [0.4371, 0.9406],
        [0.6000, 0.1199]])
tensor([[0.0000, 0.0000, 0.0000],
        [0.4371, 0.9406, 0.1961],
        [0.6000, 0.1199, 0.5274]])


**Joining**

- In a given dimension

In [26]:
t1 = torch.cat([tensor, tensor, tensor], dim=1)
print(t1)
print("Size:", t1.shape)

tensor([[0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000, 0.0000],
        [0.4371, 0.9406, 0.1961, 0.4371, 0.9406, 0.1961, 0.4371, 0.9406, 0.1961],
        [0.6000, 0.1199, 0.5274, 0.6000, 0.1199, 0.5274, 0.6000, 0.1199, 0.5274]])
Size: torch.Size([3, 9])


- Along a new dimension

In [27]:
t2 = torch.stack([tensor, tensor, tensor])
print(t2)

print(t2.shape)

tensor([[[0.0000, 0.0000, 0.0000],
         [0.4371, 0.9406, 0.1961],
         [0.6000, 0.1199, 0.5274]],

        [[0.0000, 0.0000, 0.0000],
         [0.4371, 0.9406, 0.1961],
         [0.6000, 0.1199, 0.5274]],

        [[0.0000, 0.0000, 0.0000],
         [0.4371, 0.9406, 0.1961],
         [0.6000, 0.1199, 0.5274]]])
torch.Size([3, 3, 3])
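
The difference between the two is easiest to see in the resulting shapes. In this sketch, with two hypothetical 2×3 tensors, `torch.cat` grows an existing dimension while `torch.stack` adds a new one.

```python
import torch

a = torch.zeros((2, 3))
b = torch.ones((2, 3))

# cat: the chosen dimension grows, the number of dimensions stays the same
print(torch.cat([a, b], dim=0).shape)    # torch.Size([4, 3])
print(torch.cat([a, b], dim=1).shape)    # torch.Size([2, 6])

# stack: a new dimension of size 2 (the number of tensors) is inserted
print(torch.stack([a, b], dim=0).shape)  # torch.Size([2, 2, 3])
print(torch.stack([a, b], dim=2).shape)  # torch.Size([2, 3, 2])
```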


**Multiplying**

- Element-wise:

In [28]:
print(f"tensor.mul(tensor) \n {tensor.mul(tensor)} \n")

# Alternatively:
print(f"tensor * tensor \n {tensor * tensor}")

tensor.mul(tensor) 
 tensor([[0.0000, 0.0000, 0.0000],
        [0.1911, 0.8848, 0.0385],
        [0.3600, 0.0144, 0.2782]]) 

tensor * tensor 
 tensor([[0.0000, 0.0000, 0.0000],
        [0.1911, 0.8848, 0.0385],
        [0.3600, 0.0144, 0.2782]])


- Matrix multiplication:

In [29]:
tensor_b = torch.rand((3,2))
print(f"tensor.matmul(tensor_b) \n {tensor.matmul(tensor_b)} \n")

# Alternatively:
print(f"tensor @ tensor_b \n {tensor @ tensor_b}")

tensor.matmul(tensor_b) 
 tensor([[0.0000, 0.0000],
        [0.7385, 1.0233],
        [0.6661, 0.5493]]) 

tensor @ tensor_b 
 tensor([[0.0000, 0.0000],
        [0.7385, 1.0233],
        [0.6661, 0.5493]])


- In-place:

Any tensor method with a `_` suffix operates in place (e.g., `.copy_()` copies the values of another tensor into this one, `.t_()` transposes it, etc.).



In [30]:
tensor = torch.rand((3,2))

print(tensor, "\n")
tensor.add_(10)
print(tensor, "\n")

tensor.t_()
print(tensor)

tensor([[0.2734, 0.5806],
        [0.1785, 0.6944],
        [0.7930, 0.6472]]) 

tensor([[10.2734, 10.5806],
        [10.1785, 10.6944],
        [10.7930, 10.6472]]) 

tensor([[10.2734, 10.1785, 10.7930],
        [10.5806, 10.6944, 10.6472]])


More tensor operations can be found at https://pytorch.org/docs/stable/torch.html


## 3. PyTorch and NumPy

Tensors on the CPU and NumPy arrays can share their underlying memory locations, in which case changing one will change the other.


- `.numpy()`: Tensor to array

In [40]:
t = torch.ones(5)
print(f"t: {t}")
n = t.numpy()
print(f"n: {n}")

t: tensor([1., 1., 1., 1., 1.])
n: [1. 1. 1. 1. 1.]


A change in the tensor is reflected in the array.

In [38]:
t.add_(1)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.])
n: [2. 2. 2. 2. 2.]


- `.from_numpy()`: Array to tensor

In [41]:
n = np.ones(5)
t = torch.from_numpy(n)

Again, changes in the array are reflected in the tensor.

In [42]:
np.add(n, 1, out=n)
print(f"t: {t}")
print(f"n: {n}")

t: tensor([2., 2., 2., 2., 2.], dtype=torch.float64)
n: [2. 2. 2. 2. 2.]
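
When this sharing is not wanted, one option is to build the tensor with `torch.tensor()`, which copies the data. The sketch below contrasts the two constructors on a small hypothetical array.

```python
import numpy as np
import torch

n = np.ones(3)

shared = torch.from_numpy(n)  # shares memory with n
copied = torch.tensor(n)      # makes an independent copy of the data

# Modify the array in place
n[0] = 5.0

print(shared)  # tensor([5., 1., 1.], dtype=torch.float64)
print(copied)  # tensor([1., 1., 1.], dtype=torch.float64)
```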


## 📝 4. Exercises

1. **Tensor Creation & Shape Inspection.**
   
    **a. Create a 3×4 tensor filled with zeros.**  
    **b. Create a 2×3×5 tensor with random values between 0 and 1.**  
    **c. Print the shape, number of dimensions, and number of elements for each.**


In [59]:
#YOUR CODE HERE

2. **Tensor operations.**

    **a. Create two 2×2 tensors and perform:**
   
    **- Element-wise addition**  
    **- Element-wise multiplication**  
    **- Element-wise division**
   
    **b. Perform matrix multiplication between a 2×3 and a 3×4 tensor.**  

   **c. Multiply two incompatible shapes and write down the error message.**


In [60]:
#YOUR CODE HERE

3. **Indexing & Slicing**
   
    **a. Given a 4×5 tensor of random integers between 1 and 10:**
   
   **- Select the second row.**  
   **- Select the third column.**  
   **- Extract a 2×2 block from the top-left corner.**  

    **b. Set all values in the first row to zero.**


In [61]:
#YOUR CODE HERE

4. **In-place vs Out-of-place Operations**

    **a. Create a tensor and square its elements:**
   
    **- First using an out-of-place operation.**  
    **- Then using an in-place operation (e.g., `.pow_()`).**  

    **b. Print the memory ID (`id()`) of the tensor before and after each operation and explain the result.**

In [70]:
#YOUR CODE HERE