## Introduction to PyTorch

### Overview

PyTorch is an open-source machine learning library that provides a flexible and efficient platform for deep learning research. It is widely used in academia and industry due to its ease of use and strong GPU acceleration capabilities.

PyTorch serves two primary purposes:
- It functions **similarly to NumPy but with built-in GPU support** for accelerated computations.
- It provides an **intuitive framework for building and training deep neural networks**.

PyTorch’s **dynamic computation graph** and **autograd engine** make it particularly suitable for deep learning applications, including natural language processing (NLP), computer vision, and solving partial differential equations (PDEs).


### (1) Installing PyTorch

Before installing PyTorch, it is recommended (though not required) to set up an isolated environment, for example with Anaconda, since this makes dependencies easier to manage.

**Installation via pip:** Run the following command to install PyTorch and its dependencies:
```bash
pip3 install torch torchvision torchaudio
```

Alternatively, refer to the **official installation guide** for the latest installation instructions and CUDA compatibility: [PyTorch Installation Guide](https://pytorch.org/get-started/locally/)

To check that PyTorch was installed successfully, import it and print its version:

In [1]:
import torch # if this gives "ModuleNotFoundError: No module named 'torch'", then your pytorch installation is not complete
print(torch.__version__)

2.2.0


### (2) Understanding Tensors in PyTorch

Tensors are the fundamental data structure in PyTorch, similar to NumPy arrays but with **GPU acceleration capabilities**. They allow efficient mathematical operations on large datasets, making them essential for deep learning applications.

A tensor can be thought of as a generalization of scalars, vectors, and matrices to higher dimensions:
- **Scalar (0D tensor)**: A single numerical value (e.g., 5)
- **Vector (1D tensor)**: A 1D array of numbers (e.g., [1, 2, 3])
- **Matrix (2D tensor)**: A 2D grid of numbers
- **Higher-Dimensional Tensor**: A multi-dimensional array

#### (2.1) Converting from NumPy to PyTorch Tensor

PyTorch provides methods to create tensors from NumPy arrays.

In [2]:
import torch 
import numpy as np

# Create a NumPy array
ndarray = np.array([0, 1, 2])

# Convert NumPy array to a PyTorch tensor
tensor1 = torch.from_numpy(ndarray)  
tensor2 = torch.tensor(ndarray)  

print(tensor1)  
print(tensor2)

tensor([0, 1, 2])
tensor([0, 1, 2])


To convert a tensor back to a NumPy array, call ```.numpy()``` on it (the tensor must be on the CPU):

In [6]:
tensor_numpy = tensor2.numpy()
print(type(tensor_numpy))

<class 'numpy.ndarray'>


**Note:** Both methods create tensors, but ```torch.from_numpy()``` shares memory with the original NumPy array, while ```torch.tensor()``` creates a new copy.
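
The difference described in the note can be observed by modifying the NumPy array after conversion. The following is a minimal sketch; the variable names ```shared``` and ```copied``` are chosen here for illustration only.

```python
import numpy as np
import torch

ndarray = np.array([0, 1, 2])

shared = torch.from_numpy(ndarray)  # shares memory with ndarray
copied = torch.tensor(ndarray)      # independent copy of the data

# Modify the NumPy array in place
ndarray[0] = 100

print(shared)  # the change is visible through the shared tensor
print(copied)  # the copy still holds the original values
```

In practice, ```torch.from_numpy()``` avoids a copy for large arrays, while ```torch.tensor()``` is the safer choice when the array may be modified later.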

#### (2.2) Creating Tensors from Python Lists

You can directly create a tensor from a Python list:

In [6]:
tensor3 = torch.tensor([0, 1, 2])
print(tensor3)

tensor([0, 1, 2])


Tensors can also be multi-dimensional:

In [7]:
tensor4 = torch.tensor([[0, 1, 2], [3, 4, 5]])
print(tensor4)

tensor([[0, 1, 2],
        [3, 4, 5]])


#### (2.3) Creating Random and Constant Tensors

PyTorch provides built-in functions to create tensors with specific properties:

In [8]:
# Create a tensor with random values
random_tensor = torch.rand(3, 3)  
print(random_tensor)

# Create a tensor filled with zeros
zero_tensor = torch.zeros(2, 3)  
print(zero_tensor)

# Create a tensor filled with ones
ones_tensor = torch.ones(4, 4)  
print(ones_tensor)

# Create an identity matrix (diagonal ones)
identity_tensor = torch.eye(3)  
print(identity_tensor)

tensor([[0.5869, 0.3458, 0.0021],
        [0.2556, 0.6001, 0.5145],
        [0.3606, 0.0298, 0.2418]])
tensor([[0., 0., 0.],
        [0., 0., 0.]])
tensor([[1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.],
        [1., 1., 1., 1.]])
tensor([[1., 0., 0.],
        [0., 1., 0.],
        [0., 0., 1.]])
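
The values printed by ```torch.rand``` differ from run to run. If reproducible results are needed, one option is to fix the random seed with ```torch.manual_seed``` before sampling. A small sketch (the seed value 0 is arbitrary):

```python
import torch

# Seeding the generator makes the random draws repeatable
torch.manual_seed(0)
first = torch.rand(3, 3)

torch.manual_seed(0)
second = torch.rand(3, 3)

# Both tensors hold identical values because the seed was reset
print(torch.equal(first, second))
```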


#### (2.4) Creating Tensors with Similar Properties

PyTorch can create a new tensor that inherits the shape, data type, and device of an existing tensor:

In [9]:
# Create a tensor based on an existing one
new_tensor = torch.ones_like(random_tensor)  
print(new_tensor)

tensor([[1., 1., 1.],
        [1., 1., 1.],
        [1., 1., 1.]])
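
Other ```*_like``` functions work the same way, and an inherited property can be overridden with a keyword argument. A brief sketch, where ```base``` is a stand-in for any existing tensor:

```python
import torch

base = torch.rand(3, 3)  # float32 tensor on the CPU by default

zeros = torch.zeros_like(base)                        # same shape, dtype, and device
noise = torch.rand_like(base)                         # random values, same properties
int_ones = torch.ones_like(base, dtype=torch.int64)   # same shape, dtype overridden

print(zeros.shape, zeros.dtype)
print(noise.shape, noise.dtype)
print(int_ones.shape, int_ones.dtype)
```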


#### (2.5) Tensor Attributes

Every PyTorch tensor has three key attributes:
- **Shape:** The dimensions of the tensor
- **Data type:** The type of elements in the tensor (e.g., torch.float32, torch.int64)
- **Device:** The storage location (CPU or GPU)

In [10]:
print('Shape:', new_tensor.shape)  
print('Data Type:', new_tensor.dtype)  
print('Device:', new_tensor.device)  

Shape: torch.Size([3, 3])
Data Type: torch.float32
Device: cpu


In [15]:
shape = (3,4)
dtype = torch.float32
device = 'mps'  # Apple-silicon GPU; use 'cuda' for an NVIDIA GPU, or 'cpu' if no GPU is available
# Create a random tensor with the desired attributes
new_tensor = torch.rand(size=shape, dtype=dtype).to(device)
print('Shape:', new_tensor.shape)  
print('Data Type:', new_tensor.dtype)  
print('Device:', new_tensor.device)  

Shape: torch.Size([3, 4])
Data Type: torch.float32
Device: mps:0


### (3) Tensor Operations

PyTorch provides a rich set of operations that can be performed on tensors, similar to NumPy.

#### (3.1) Slicing a Tensor

Slicing allows us to extract specific elements from a tensor. It follows Python’s indexing rules.

In [3]:
# Create a tensor
tensor = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
print("Original Tensor:\n", tensor)

# Extract a single element (0-based index)
print("Element at (1,2):", tensor[1, 2])  # Output: 6

# Extract a row
print("First row:", tensor[0])  # Output: tensor([1, 2, 3])

# Extract a column
print("Second column:", tensor[:, 1])  # Output: tensor([2, 5, 8])

# Slice a submatrix
print("Submatrix:\n", tensor[0:2, 1:3])  # Output: tensor([[2, 3], [5, 6]])

Original Tensor:
 tensor([[1, 2, 3],
        [4, 5, 6],
        [7, 8, 9]])
Element at (1,2): tensor(6)
First row: tensor([1, 2, 3])
Second column: tensor([2, 5, 8])
Submatrix:
 tensor([[2, 3],
        [5, 6]])
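
One detail worth keeping in mind is that basic slicing returns a view that shares memory with the original tensor, so writing into a slice also changes the original. The sketch below illustrates this, with ```clone()``` used to obtain an independent copy.

```python
import torch

tensor = torch.tensor([[1, 2, 3], [4, 5, 6], [7, 8, 9]])

# A slice is a view: it refers to the same underlying data
sub = tensor[0:2, 1:3]
sub[0, 0] = -1
print(tensor[0, 1])  # the original tensor now holds -1 at this position

# clone() creates an independent copy of the slice
independent = tensor[0:2, 1:3].clone()
independent[0, 0] = 99
print(tensor[0, 1])  # unchanged by the write to the clone
```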


#### (3.2) Reshaping a Tensor

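Reshaping changes the dimensions of a tensor without altering its data. ```view()``` requires the tensor to be contiguous in memory and always returns a view of the same data, while ```reshape()``` also works on non-contiguous tensors and copies the data when necessary.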

In [15]:
tensor = torch.arange(12)  # Create a 1D tensor with values from 0 to 11
print("Original Tensor:", tensor)

# Reshape to a 3x4 matrix
reshaped_tensor = tensor.view(3, 4)
print("Reshaped Tensor:\n", reshaped_tensor)

# Reshape to a 2x6 matrix
reshaped_tensor = tensor.reshape(2, 6)
print("Reshaped Tensor:\n", reshaped_tensor)

# Reshape to a 4x3 matrix (-1 lets PyTorch infer that dimension from the number of elements)
reshaped_tensor = tensor.reshape(-1, 3)
print("Reshaped Tensor:\n", reshaped_tensor)

Original Tensor: tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
Reshaped Tensor:
 tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
Reshaped Tensor:
 tensor([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11]])
Reshaped Tensor:
 tensor([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]])
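
The two reshaping methods differ when a tensor is not contiguous in memory, for example after a transpose. The following sketch shows ```view()``` failing in that case while ```reshape()``` succeeds; the flag ```view_failed``` is only there to record the outcome.

```python
import torch

tensor = torch.arange(6).view(2, 3)
transposed = tensor.T  # shape (3, 2), no longer contiguous in memory
print(transposed.is_contiguous())

# view() needs contiguous memory, so it raises a RuntimeError here
try:
    transposed.view(6)
    view_failed = False
except RuntimeError:
    view_failed = True
print("view failed:", view_failed)

# reshape() falls back to copying the data when a view is not possible
flat = transposed.reshape(6)
print(flat)
```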


#### (3.3) Transposing a Tensor

Transposing swaps two dimensions of a tensor; for a 2D tensor, rows become columns.

In [5]:
tensor = torch.tensor([[1, 2, 3], [4, 5, 6]])
transposed = tensor.T  # Equivalent to tensor.transpose(0, 1)
print("Transposed Tensor:\n", transposed)

Transposed Tensor:
 tensor([[1, 4],
        [2, 5],
        [3, 6]])
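
For tensors with more than two dimensions, the dimensions to swap are given explicitly. As a sketch, ```transpose``` swaps exactly two dimensions, while ```permute``` reorders all of them at once:

```python
import torch

x = torch.zeros(2, 3, 4)

# Swap the first and last dimensions
swapped = x.transpose(0, 2)
print(swapped.shape)  # torch.Size([4, 3, 2])

# Reorder all dimensions: new dim 0 is old dim 1, and so on
reordered = x.permute(1, 2, 0)
print(reordered.shape)  # torch.Size([3, 4, 2])
```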


#### (3.4) Element-wise Operations

Operations can be applied element-wise on tensors.

In [6]:
tensor1 = torch.tensor([1, 2, 3])
tensor2 = torch.tensor([4, 5, 6])

# Addition
print("Addition:", tensor1 + tensor2)

# Multiplication
print("Multiplication:", tensor1 * tensor2)

# Power
print("Squared:", tensor1 ** 2)

Addition: tensor([5, 7, 9])
Multiplication: tensor([ 4, 10, 18])
Squared: tensor([1, 4, 9])


#### (3.5) Matrix Multiplication

In [7]:
A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])

# Matrix multiplication
C = torch.matmul(A, B)  # Equivalent to A @ B
print("Matrix Multiplication:\n", C)

Matrix Multiplication:
 tensor([[19, 22],
        [43, 50]])
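
Matrix multiplication is easy to confuse with the element-wise product, since both take two tensors of the same shape here. A short comparison using the same ```A``` and ```B```:

```python
import torch

A = torch.tensor([[1, 2], [3, 4]])
B = torch.tensor([[5, 6], [7, 8]])

# Element-wise product: multiplies matching entries
elementwise = A * B
print("Element-wise:\n", elementwise)  # [[5, 12], [21, 32]]

# Matrix product: rows of A times columns of B
matrix_product = A @ B
print("Matrix product:\n", matrix_product)  # [[19, 22], [43, 50]]
```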


#### (3.6) Moving Tensors to GPU

To leverage GPU acceleration, PyTorch allows you to move tensors between CPU and GPU.

In [13]:
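# Use CUDA if available, otherwise Apple's MPS backend, otherwise fall back to the CPU
if torch.cuda.is_available():
    device = torch.device("cuda")
elif torch.backends.mps.is_available():
    device = torch.device("mps")
else:
    device = torch.device("cpu")

# Move tensor to the selected device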
tensor = torch.rand((2,3))
print('The original device:', tensor.device)
gpu_tensor = tensor.to(device)
print('The current device:', gpu_tensor.device)

The original device: cpu
The current device: mps:0


### (5) Automatic Differentiation with Autograd

PyTorch’s ```autograd``` module automates the computation of gradients, making it easier to implement backpropagation in neural networks.

#### (5.1) Enabling Autograd with ```requires_grad```

In [7]:
x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x + 5
print("y:", y)

y: tensor(15., grad_fn=<AddBackward0>)


#### (5.2) Computing Gradients using ```backward()```

In [8]:
y.backward()  # Compute dy/dx
print("Gradient of y with respect to x:", x.grad)

Gradient of y with respect to x: tensor(7.)
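
The result can be checked by hand: for $y = x^2 + 3x + 5$ the derivative is $dy/dx = 2x + 3$, which equals 7 at $x = 2$. The sketch below compares autograd with this analytic derivative at a few points; the helper ```f``` and the list ```results``` are introduced here for illustration.

```python
import torch

def f(x):
    # Same function as above: y = x^2 + 3x + 5
    return x ** 2 + 3 * x + 5

results = []
for value in [-1.0, 0.0, 2.0]:
    # A fresh tensor per point, so gradients from earlier points do not accumulate
    x = torch.tensor(value, requires_grad=True)
    f(x).backward()
    analytic = 2 * value + 3  # dy/dx derived by hand
    results.append((value, x.grad.item(), analytic))
    print(f"x = {value}: autograd = {x.grad.item()}, analytic = {analytic}")
```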


#### (5.3) Stopping Gradient Tracking

Sometimes, we may want to stop PyTorch from tracking gradients:

In [10]:
with torch.no_grad():
    y = x * 2 + 5  # No gradient will be computed
# y.backward()  # would raise a RuntimeError, since y was computed without gradient tracking

Alternatively, turn off gradient tracking on the tensor itself, in place:

In [11]:
x.requires_grad_(False)

tensor(2.)
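
The effect of ```torch.no_grad()``` can be seen by comparing the same computation inside and outside the block. A minimal sketch, with the names ```tracked``` and ```untracked``` chosen for illustration:

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

# Outside no_grad: the result is part of the computation graph
tracked = x * 2 + 5

# Inside no_grad: the result is detached from the graph
with torch.no_grad():
    untracked = x * 2 + 5

print(tracked.requires_grad)    # True
print(untracked.requires_grad)  # False
print(untracked.grad_fn)        # None
```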

## Exercise

- 1. Create a 3×3 tensor filled with random numbers and print its shape, data type, and device.
- 2. Convert the NumPy array ```np.array([1, 2, 3, 4, 5, 6])``` into a PyTorch tensor:
     - Reshape it into a (3,2) tensor and denote it as ```tensorA```
     - Reshape it into a (2,3) tensor and denote it as ```tensorB```
     - Compute the transpose of ```tensorB``` and compare it with ```tensorA```.
     - Perform matrix multiplication on ```tensorB``` and ```tensorA```.
- 3. Create a 4×4 tensor and
     - Extract the first row
     - Extract the last column
     - Modify the first row of the tensor by setting all its values to 10.
     - Check if a GPU is available and move a tensor to the GPU (if available), then back to the CPU.
- 4. Create a tensor ```x = torch.tensor(2.0, requires_grad=True)```. Define a function $y = x^3 + 3x^2 + 5$ and compute its derivative using ```backward()```.
     - Print the gradient (```x.grad```).
     - Disable gradient tracking using ```torch.no_grad()```, then verify that ```requires_grad``` is False in the new tensor.