# Session 5: PyTorch Fundamentals - Tensors and Gradients

**Objective:** Introduce the core concepts of PyTorch: its fundamental data structure (the tensor) and its automatic differentiation engine (`torch.autograd`).

In [None]:
! pip install torch

## Part 1: Concepts 

### 1. What is PyTorch?

PyTorch is an open-source machine learning library. It's widely used for applications such as computer vision and natural language processing. Its two core features are:

- **Tensors:** Multi-dimensional arrays, similar to NumPy's `ndarray`, that can also run on GPUs for accelerated computing.
- **Automatic Differentiation:** A system called `torch.autograd` that automatically calculates gradients, which are essential for training neural networks.

### 2. Tensors: The Building Blocks
Tensors hold a model's inputs, outputs, and parameters, so nearly every PyTorch operation takes and returns them.

In [None]:
import torch
import numpy as np

# --- Creating Tensors ---

# From a Python list
data = [[1, 2], [3, 4]]
x_data = torch.tensor(data)
print(f"Tensor from list:\n {x_data}\n")

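# From a NumPy array (and vice versa)
# Note: torch.from_numpy() and .numpy() share memory with their source on the CPU,
# so an in-place change to one is visible in the other.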
np_array = np.array(data)
x_np = torch.from_numpy(np_array)
print(f"Tensor from NumPy array:\n {x_np}\n")

np_from_tensor = x_np.numpy()
print(f"NumPy array from Tensor:\n {np_from_tensor}\n")

# Tensors of ones, zeros, and random numbers
shape = (2, 3,)
ones_tensor = torch.ones(shape)
zeros_tensor = torch.zeros(shape)
rand_tensor = torch.rand(shape)

print(f"Ones Tensor:\n {ones_tensor} \n")
print(f"Zeros Tensor:\n {zeros_tensor} \n")
print(f"Random Tensor:\n {rand_tensor} \n")
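Part 1 mentions that tensors can run on GPUs. The following is a minimal sketch of how a tensor might be moved to a GPU when one is available, falling back to the CPU otherwise. The names `device`, `cpu_tensor`, and `moved_tensor` are illustrative and not part of the original notebook.

```python
import torch

# Pick a GPU when one is visible to PyTorch, otherwise fall back to the CPU.
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

cpu_tensor = torch.ones(2, 3)         # created on the CPU unless told otherwise
moved_tensor = cpu_tensor.to(device)  # a tensor with the same values on `device`

print(f"Original device: {cpu_tensor.device}")
print(f"Target device:   {moved_tensor.device}")
print(f"Shape: {moved_tensor.shape}, dtype: {moved_tensor.dtype}")
```

On a machine without a GPU both lines report `cpu`, so the cell should run unchanged in either environment.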

### 3. Tensor Operations
Tensor operations use a syntax similar to NumPy's: indexing, slicing, and arithmetic work the way they do on an `ndarray`.

In [None]:
tensor = torch.ones(4, 4)

# --- Indexing and Slicing ---
print(f"First row: {tensor[0]}")
print(f"First column: {tensor[:, 0]}")
print(f"Last column: {tensor[..., -1]}\n")

# --- Arithmetic Operations ---
y1 = tensor + tensor  # element-wise addition
y2 = tensor * tensor  # element-wise multiplication (not a matrix product)

# Matrix multiplication
y3 = tensor.matmul(tensor.T) # Using .matmul()
y4 = tensor @ tensor.T      # Using the @ operator (more common)

print(f"Matrix Multiplication Result:\n {y4}")
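Because the cell above uses a tensor of all ones, the difference between `*` and `@` is easy to miss. A small sketch with distinct values, using a hypothetical matrix `m`, makes the two operations easier to tell apart:

```python
import torch

m = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])

elementwise = m * m  # multiplies matching entries
matrix_prod = m @ m  # rows of the left operand times columns of the right

print(elementwise)  # tensor([[ 1.,  4.], [ 9., 16.]])
print(matrix_prod)  # tensor([[ 7., 10.], [15., 22.]])
```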

### 4. Automatic Differentiation with `torch.autograd`
Training a model requires the gradient of the loss function with respect to each of the model's parameters. `torch.autograd` computes these gradients automatically.

- We can tell PyTorch to track operations on a tensor by setting `requires_grad=True`.
- When we finish our computation (e.g., `y = 3*x**2`), we call `y.backward()`. Called without arguments, `backward()` requires `y` to be a scalar.
- PyTorch then computes the gradient `dy/dx` and stores it in the `.grad` attribute of the tensor `x`.

In [None]:
# Create a tensor and set requires_grad=True to track computation
x = torch.tensor(2.0, requires_grad=True)

# Define a simple function
y = 3*x**2 + 5
# Analytically, dy/dx = 6*x, so at x = 2 the gradient should be 12.

# Use autograd to calculate the gradients
y.backward()

# The calculated gradient is now stored in x.grad
print(f"The gradient dy/dx at x=2 is: {x.grad}")
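One way to build confidence in the autograd result is to compare it with a numerical estimate. The sketch below does this with a central finite difference on plain Python floats. The helper `f` and the step size `h` are assumptions made for illustration, not part of the original notebook.

```python
import torch

def f(t):
    # Same function as in the cell above; works on tensors and on plain floats.
    return 3 * t**2 + 5

# Gradient from autograd
x = torch.tensor(2.0, requires_grad=True)
f(x).backward()
autograd_grad = x.grad.item()

# Central finite-difference estimate of dy/dx at x = 2
h = 1e-5
numeric_grad = (f(2.0 + h) - f(2.0 - h)) / (2 * h)

print(f"autograd:          {autograd_grad}")
print(f"finite difference: {numeric_grad:.6f}")
```

The two values should agree to several decimal places. The finite-difference estimate is only an approximation, so small discrepancies are expected.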

### 5. Disabling Gradient Tracking with `torch.no_grad()`

By default, every operation on a tensor with `requires_grad=True` is recorded so that gradients can be computed later. This is unnecessary when we only need the forward pass, for example during model evaluation (inference), and skipping it saves memory and speeds up computation.

We do this using a `with torch.no_grad():` block.

In [None]:
z = torch.tensor(5.0, requires_grad=True)
print(f"z requires grad: {z.requires_grad}")

with torch.no_grad():
    # Inside this block, operations are not tracked
    q = z**2

print(f"q was created without tracking, so requires_grad is: {q.requires_grad}")

# Calling backward() raises a RuntimeError because no computation history was recorded for q:
try:
    q.backward()
except RuntimeError as e:
    print(f"\nError trying to call backward() on q: {e}")
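To see what the block changes, it can help to run the same computation inside and outside `torch.no_grad()` and inspect the results. This is a minimal sketch. The names `tracked` and `untracked` are illustrative.

```python
import torch

z = torch.tensor(5.0, requires_grad=True)

tracked = z**2        # recorded in the computation graph
with torch.no_grad():
    untracked = z**2  # same value, but nothing is recorded

print(tracked.requires_grad, tracked.grad_fn is not None)  # True True
print(untracked.requires_grad, untracked.grad_fn)          # False None
print(z.requires_grad)  # True: the block does not modify z itself

# Gradients still flow through the tracked result: d(z^2)/dz = 2z = 10
tracked.backward()
print(z.grad)  # tensor(10.)
```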

## Part 2: Exercises & Debugging (90 mins)

### Lab 5.1: Tensor Creation and Manipulation
* **Task:** 
  1. Create a 3x3 tensor with random values.
  2. Create a 3x3 tensor of all ones.
  3. Add the two tensors together.
  4. Multiply the resulting tensor by a scalar (e.g., 5).

In [None]:
# Your code here
rand_t = torch.rand(3, 3)
ones_t = torch.ones(3, 3)

added_t = rand_t + ones_t
final_t = added_t * 5

print("Final result of Lab 5.1:")
print(final_t)
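A quick sanity check for this lab, sketched under the assumption that the random tensor comes from `torch.rand`, which samples uniformly from [0, 1): adding one and multiplying by 5 should leave every entry between 5 and 10.

```python
import torch

rand_t = torch.rand(3, 3)  # uniform samples in [0, 1)
final_t = (rand_t + torch.ones(3, 3)) * 5

in_range = bool(((final_t >= 5.0) & (final_t <= 10.0)).all())

print(f"Shape: {tuple(final_t.shape)}")
print(f"All entries between 5 and 10: {in_range}")
```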

### Lab 5.2: Simple Linear Equation
* **Task:** Manually perform the calculation for a simple linear layer: `y = Wx + b`.
  1. Create a weight tensor `W` of shape (1, 3) with random values.
  2. Create an input tensor `x` of shape (3, 1) with random values.
  3. Create a bias tensor `b` of shape (1, 1) with a value of 0.5.
  4. Calculate `y` using matrix multiplication (`@`).

In [None]:
# Your code here
W = torch.rand(1, 3)
x_in = torch.rand(3, 1)
b = torch.tensor([[0.5]])

y_out = W @ x_in + b

print("Result of y = Wx + b:")
print(y_out)
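The manual result can be cross-checked against PyTorch's built-in linear function. The sketch below assumes `torch.nn.functional.linear`, which computes `input @ weight.T + bias` with inputs stored as rows, so the column vector is transposed and the bias flattened before the call. The seed and the names `manual` and `library` are illustrative.

```python
import torch
import torch.nn.functional as F

torch.manual_seed(0)  # fixed seed so the run is repeatable

W = torch.rand(1, 3)
x_in = torch.rand(3, 1)
b = torch.tensor([[0.5]])

manual = W @ x_in + b  # shape (1, 1)

# F.linear expects input of shape (batch, in_features) and bias of shape (out_features,)
library = F.linear(x_in.T, W, b.flatten())  # shape (1, 1)

print(manual, library)
print(f"Results match: {torch.allclose(manual, library)}")
```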

### Lab 5.3: Practice with Autograd
* **Task:** Calculate the gradient for the function `z = 2a^3 + 3b`.
  1. Create tensors `a` and `b` with values `2.0` and `5.0` respectively. Make sure to set `requires_grad=True`.
  2. Calculate `z`.
  3. Call `backward()` on `z`.
  4. Print the gradients stored in `a.grad` and `b.grad`.
  5. **Question:** Before running, what do you expect the gradients `dz/da` and `dz/db` to be?

In [None]:
# Your code here
# Math check: dz/da = 6a^2. At a=2, this is 6 * 4 = 24.
# Math check: dz/db = 3.

a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(5.0, requires_grad=True)

z_calc = 2*a**3 + 3*b

z_calc.backward()

print(f"The gradient dz/da is: {a.grad}")
print(f"The gradient dz/db is: {b.grad}")
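As an alternative to `backward()`, the same gradients can be requested directly with `torch.autograd.grad`, which returns them as a tuple instead of storing them in `.grad`. The sketch below is one possible variation on the lab, not a replacement for the solution above.

```python
import torch

a = torch.tensor(2.0, requires_grad=True)
b = torch.tensor(5.0, requires_grad=True)
z = 2 * a**3 + 3 * b

# Returns one gradient per input, in the same order as the inputs
grad_a, grad_b = torch.autograd.grad(z, (a, b))

print(grad_a, grad_b)  # tensor(24.) tensor(3.)
print(a.grad, b.grad)  # None None: .grad is left untouched
```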