# Introduction to PyTorch

## Part 1 Bonus Exercises: In-Class Practice

Additional hands-on exercises for practicing tensor operations, reshaping, and autograd.

**Instructions:**
- Complete the code in each cell where you see `# YOUR CODE HERE`
- Run the assertion cells to check your work
- These exercises increase in difficulty

In [None]:
import torch

print(f"PyTorch version: {torch.__version__}")

---
## Exercise B1: Tensor Manipulation Challenge

Practice combining multiple tensor operations.

### B1.1 Build a specific tensor

Create a 4x4 tensor that looks like this (an identity matrix scaled so the diagonal holds 5s):
```
[[5, 0, 0, 0],
 [0, 5, 0, 0],
 [0, 0, 5, 0],
 [0, 0, 0, 5]]
```

Hint: Use `torch.eye()` and multiplication.

In [None]:
# YOUR CODE HERE
diagonal_5 = ...

print(f"Result:\n{diagonal_5}")

In [None]:
# Verification
expected = torch.tensor([[5., 0., 0., 0.],
                         [0., 5., 0., 0.],
                         [0., 0., 5., 0.],
                         [0., 0., 0., 5.]])
assert torch.equal(diagonal_5, expected), f"Got:\n{diagonal_5}"
print("B1.1 Passed!")

### B1.2 Create a checkerboard pattern

Create a 4x4 tensor that looks like a checkerboard:
```
[[1, 0, 1, 0],
 [0, 1, 0, 1],
 [1, 0, 1, 0],
 [0, 1, 0, 1]]
```

Hint: Think about how row index + column index determines the value.
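
To illustrate how a value can depend on both indices, here is a minimal sketch that builds a small multiplication table from a column of row indices and a row of column indices. The 3x3 size and the names `rows`, `cols`, and `table` are illustrative assumptions, and the pattern is deliberately different from the checkerboard.

```python
import torch

# Hypothetical 3x3 example (not the exercise answer): a multiplication table.
rows = torch.arange(3).unsqueeze(1)  # shape (3, 1): one row index per row
cols = torch.arange(3)               # shape (3,): one column index per column

# Broadcasting (3, 1) against (3,) yields a (3, 3) grid in which
# entry [i, j] is computed from row index i and column index j.
table = rows * cols
print(table)
```

The same index grids can feed any elementwise formula, which is the idea the hint points at.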

In [None]:
# YOUR CODE HERE
# Approach: (row + col) % 2 gives alternating 0s and 1s, then flip with 1 - x
checkerboard = ...

print(f"Checkerboard:\n{checkerboard}")

In [None]:
# Verification
expected = torch.tensor([[1, 0, 1, 0],
                         [0, 1, 0, 1],
                         [1, 0, 1, 0],
                         [0, 1, 0, 1]])
assert torch.equal(checkerboard, expected), f"Got:\n{checkerboard}"
print("B1.2 Passed!")

### B1.3 Create a lower triangular matrix of ones

Create a 4x4 tensor:
```
[[1, 0, 0, 0],
 [1, 1, 0, 0],
 [1, 1, 1, 0],
 [1, 1, 1, 1]]
```

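Hint: Look up `torch.tril()` (triangular lower). The expected tensor in the verification cell has an integer dtype, so your result may need an integer dtype as well (for example via `dtype=torch.long` or `.long()`).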
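
As a companion to the hint, this sketch shows the mirror-image function `torch.triu()` on a small 3x3 matrix. The matrix `m` is a made-up example. Both `torch.triu()` and `torch.tril()` accept a `diagonal` argument that shifts which diagonal is kept.

```python
import torch

m = torch.arange(1, 10).reshape(3, 3)  # [[1, 2, 3], [4, 5, 6], [7, 8, 9]]

# Keep the main diagonal and everything above it; zero out the rest.
upper = torch.triu(m)

# diagonal=1 starts one step above the main diagonal, so the diagonal is zeroed too.
strict_upper = torch.triu(m, diagonal=1)

print(upper)
print(strict_upper)
```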

In [None]:
# YOUR CODE HERE
lower_tri = ...

print(f"Lower triangular:\n{lower_tri}")

In [None]:
# Verification
expected = torch.tensor([[1, 0, 0, 0],
                         [1, 1, 0, 0],
                         [1, 1, 1, 0],
                         [1, 1, 1, 1]])
assert torch.equal(lower_tri, expected), f"Got:\n{lower_tri}"
print("B1.3 Passed!")

---
## Exercise B2: Broadcasting Puzzles

Practice understanding and using broadcasting rules.

### B2.1 Predict the output shape

For each pair of tensors, predict what shape the result will be after addition.
Then run the code to verify.
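
One way to check a prediction without allocating any tensors is `torch.broadcast_shapes()`, which applies the broadcasting rules to shapes alone. The shapes below are illustrative and differ from the ones in the exercise cells.

```python
import torch

# Dimensions are aligned from the right; a size of 1 stretches to match.
shape_a = torch.broadcast_shapes((5, 1), (1, 6))
print(shape_a)

# A missing leading dimension is treated as size 1.
shape_b = torch.broadcast_shapes((2, 1, 7), (3, 1))
print(shape_b)
```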

In [None]:
# Predict: What will be the shape of (a + b)?
a = torch.ones(3, 1)
b = torch.ones(1, 4)

# YOUR PREDICTION (as a tuple, e.g. (2, 5)):
predicted_shape_1 = ...

result_1 = a + b
print(f"a shape: {a.shape}, b shape: {b.shape}")
print(f"Result shape: {result_1.shape}")
print(f"Your prediction: {predicted_shape_1}")

In [None]:
# Predict: What will be the shape of (c + d)?
c = torch.ones(2, 3, 4)
d = torch.ones(4)

# YOUR PREDICTION (as a tuple):
predicted_shape_2 = ...

result_2 = c + d
print(f"c shape: {c.shape}, d shape: {d.shape}")
print(f"Result shape: {result_2.shape}")
print(f"Your prediction: {predicted_shape_2}")

In [None]:
# Verification
assert predicted_shape_1 == (3, 4), f"First prediction should be (3, 4), got {predicted_shape_1}"
assert predicted_shape_2 == (2, 3, 4), f"Second prediction should be (2, 3, 4), got {predicted_shape_2}"
print("B2.1 Passed!")

### B2.2 Normalize rows using broadcasting

Given a matrix, subtract the mean of each row from that row (row-wise mean centering).

The result should have each row's mean equal to 0.
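
For comparison, here is a sketch of the column-wise version of the same idea on made-up data, using `keepdim=True` so the reduced dimension stays in place as size 1. The tensor `scores` and its values are assumptions for illustration only.

```python
import torch

scores = torch.tensor([[1.0, 10.0],
                       [3.0, 20.0],
                       [5.0, 30.0]])

# Reduce over rows (dim=0); keepdim=True keeps the result as shape (1, 2).
col_means = scores.mean(dim=0, keepdim=True)

# (3, 2) - (1, 2): the single row of means is broadcast across all three rows.
centered = scores - col_means

print(col_means)
print(centered)
```

When reducing over `dim=1` instead, the un-kept result has shape `(3,)`, which lines up with the columns rather than the rows, so keeping or restoring the size-1 dimension matters more in that direction.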

In [None]:
data = torch.tensor([[1.0, 2.0, 3.0],
                     [4.0, 5.0, 6.0],
                     [7.0, 8.0, 9.0]])

# YOUR CODE HERE
# Step 1: Compute the mean of each row (should have shape (3, 1) or (3,))
row_means = ...

# Step 2: Subtract row means from data (use broadcasting)
# Hint: You may need to reshape row_means to (3, 1) for broadcasting
normalized = ...

print(f"Original:\n{data}")
print(f"Row means: {row_means}")
print(f"Normalized:\n{normalized}")
print(f"Row means after normalization: {normalized.mean(dim=1)}")

In [None]:
# Verification
expected = torch.tensor([[-1., 0., 1.],
                         [-1., 0., 1.],
                         [-1., 0., 1.]])
assert torch.allclose(normalized, expected), f"Got:\n{normalized}"
assert torch.allclose(normalized.mean(dim=1), torch.zeros(3)), "Each row mean should be 0"
print("B2.2 Passed!")

### B2.3 Outer product using broadcasting

Compute the outer product of two vectors without using `torch.outer()`.

Given vectors u = [1, 2, 3] and v = [4, 5], compute the 3x2 outer product matrix.

In [None]:
u = torch.tensor([1.0, 2.0, 3.0])  # shape (3,)
v = torch.tensor([4.0, 5.0])       # shape (2,)

# YOUR CODE HERE
# Reshape u and v so that multiplication broadcasts to shape (3, 2)
# Hint: Reshape u to (3, 1); v can stay (2,) or be reshaped to (1, 2)
outer_product = ...

print(f"u: {u}")
print(f"v: {v}")
print(f"Outer product:\n{outer_product}")

In [None]:
# Verification
expected = torch.tensor([[4., 5.],
                         [8., 10.],
                         [12., 15.]])
assert torch.equal(outer_product, expected), f"Got:\n{outer_product}"
print("B2.3 Passed!")

---
## Exercise B3: Advanced Indexing

Practice more complex indexing patterns.

### B3.1 Extract diagonal elements

Extract the main diagonal of a matrix without using `torch.diag()`.

Hint: Use fancy indexing with `torch.arange()`.
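
As a rough illustration of fancy (integer-array) indexing, the sketch below indexes a made-up matrix with two index tensors of equal length. The indices are paired up element by element, so the result has one entry per pair. The names `grid`, `rows`, and `cols` are illustrative.

```python
import torch

grid = torch.tensor([[10, 11, 12],
                     [20, 21, 22],
                     [30, 31, 32]])

rows = torch.tensor([0, 2, 1])
cols = torch.tensor([2, 0, 0])

# Picks grid[0, 2], grid[2, 0], grid[1, 0] in that order.
picked = grid[rows, cols]
print(picked)
```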

In [None]:
matrix = torch.tensor([[1, 2, 3],
                       [4, 5, 6],
                       [7, 8, 9]])

# YOUR CODE HERE
# Hint: matrix[rows, cols] where rows = cols = [0, 1, 2]
diagonal = ...

print(f"Matrix:\n{matrix}")
print(f"Diagonal: {diagonal}")

In [None]:
# Verification
assert torch.equal(diagonal, torch.tensor([1, 5, 9])), f"Got: {diagonal}"
print("B3.1 Passed!")

### B3.2 Select specific elements

Given a batch of vectors, select a specific element from each vector based on an index tensor.

In [None]:
# Batch of 4 vectors, each with 3 elements
batch = torch.tensor([[10, 20, 30],
                      [40, 50, 60],
                      [70, 80, 90],
                      [100, 110, 120]])

# Indices: for each row, which column to select
indices = torch.tensor([0, 2, 1, 2])  # Select col 0 from row 0, col 2 from row 1, etc.

# YOUR CODE HERE
# Result should be [10, 60, 80, 120]
selected = ...

print(f"Batch:\n{batch}")
print(f"Indices: {indices}")
print(f"Selected elements: {selected}")

In [None]:
# Verification
assert torch.equal(selected, torch.tensor([10, 60, 80, 120])), f"Got: {selected}"
print("B3.2 Passed!")

### B3.3 Replace values using boolean indexing

Given a tensor, replace all values outside the range [-1, 1] with the boundary values (clamp).
Values < -1 become -1, values > 1 become 1.
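
The general pattern of boolean-mask assignment can be sketched on a different task: zeroing out negative entries. The tensor `readings` and the threshold are assumptions chosen for illustration.

```python
import torch

readings = torch.tensor([3.0, -1.0, 0.5, -4.0])

# A comparison produces a boolean tensor of the same shape.
mask = readings < 0

# Work on a copy so the original stays untouched.
cleaned = readings.clone()

# Assigning through the mask only writes where the mask is True.
cleaned[mask] = 0.0

print(mask)
print(cleaned)
```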

In [None]:
values = torch.tensor([-2.5, -0.5, 0.0, 0.8, 1.5, 3.0])

# YOUR CODE HERE
# Don't use torch.clamp() - practice boolean indexing!
clamped = values.clone()  # Start with a copy
# Set values < -1 to -1
...
# Set values > 1 to 1
...

print(f"Original: {values}")
print(f"Clamped:  {clamped}")

In [None]:
# Verification
expected = torch.tensor([-1.0, -0.5, 0.0, 0.8, 1.0, 1.0])
assert torch.equal(clamped, expected), f"Got: {clamped}"
print("B3.3 Passed!")

---
## Exercise B4: Reshape Challenges

Practice more complex reshaping scenarios.

### B4.1 Transpose a batch of matrices

Given a batch of 2x3 matrices (shape: batch x 2 x 3), transpose each matrix to get shape (batch x 3 x 2).
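
For a sense of how dimension reordering works in general, this sketch uses `.permute()` to move a channel dimension to the end of a 4D tensor. The layout names (batch, channels, height, width) and the tensor `x` are illustrative assumptions. `.transpose()` is the two-dimension special case of the same operation.

```python
import torch

x = torch.arange(24).reshape(1, 2, 3, 4)  # (batch, channels, height, width)

# List the old dimension numbers in their new order.
y = x.permute(0, 2, 3, 1)                 # (batch, height, width, channels)

print(y.shape)
# Element [b, h, w, c] of y is element [b, c, h, w] of x.
print(y[0, 1, 2, 0] == x[0, 0, 1, 2])
# permute returns a view with rearranged strides, so memory is not contiguous.
print(y.is_contiguous())
```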

In [None]:
batch_matrices = torch.arange(12).reshape(2, 2, 3)  # 2 matrices, each 2x3
print(f"Original shape: {batch_matrices.shape}")
print(f"Matrix 0:\n{batch_matrices[0]}")
print(f"Matrix 1:\n{batch_matrices[1]}")

# YOUR CODE HERE
# Transpose the last two dimensions (hint: use .transpose() or .permute())
transposed = ...

print(f"\nTransposed shape: {transposed.shape}")
print(f"Transposed Matrix 0:\n{transposed[0]}")

In [None]:
# Verification
assert transposed.shape == torch.Size([2, 3, 2]), f"Expected (2, 3, 2), got {transposed.shape}"
assert transposed[0, 0, 0] == 0 and transposed[0, 0, 1] == 3, "Transpose not correct"
print("B4.1 Passed!")

### B4.2 Flatten and unflatten

Convert a batch of images (batch x channels x height x width) to a batch of flat vectors, then back.
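
A small sketch of the round trip on a differently shaped tensor may help. It assumes a made-up tensor `clips` of shape (2, 5, 6) and shows two common ways to flatten everything after the first dimension.

```python
import torch

clips = torch.arange(2 * 5 * 6).reshape(2, 5, 6)

# -1 asks PyTorch to infer the size from the remaining elements (5 * 6 = 30).
flat = clips.reshape(2, -1)

# torch.flatten with start_dim=1 collapses every dimension after the first.
also_flat = torch.flatten(clips, start_dim=1)

# Reshaping back with the original sizes restores the layout.
restored = flat.reshape(2, 5, 6)

print(flat.shape, also_flat.shape, restored.shape)
```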

In [None]:
# Simulating a batch of 3 RGB images, each 4x4 pixels
images = torch.arange(3 * 3 * 4 * 4).reshape(3, 3, 4, 4).float()
print(f"Images shape: {images.shape}")

# YOUR CODE HERE
# Flatten each image to a vector (keep batch dimension)
# Result shape should be (3, 48) where 48 = 3*4*4
flattened = ...

# Unflatten back to original shape
unflattened = ...

print(f"Flattened shape: {flattened.shape}")
print(f"Unflattened shape: {unflattened.shape}")

In [None]:
# Verification
assert flattened.shape == torch.Size([3, 48]), f"Flattened should be (3, 48), got {flattened.shape}"
assert unflattened.shape == torch.Size([3, 3, 4, 4]), f"Unflattened should be (3, 3, 4, 4), got {unflattened.shape}"
assert torch.equal(images, unflattened), "Round-trip should preserve values"
print("B4.2 Passed!")

### B4.3 Stack vs Concatenate

Understand the difference between `torch.stack()` and `torch.cat()`.
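
The difference shows up most clearly in the resulting shapes. The sketch below uses two made-up 2x3 tensors: `torch.stack()` adds a new dimension, while `torch.cat()` extends an existing one.

```python
import torch

a = torch.zeros(2, 3)
b = torch.ones(2, 3)

# stack inserts a new leading dimension of size len([a, b]).
stacked = torch.stack([a, b])

# cat joins along an existing dimension, so that dimension grows.
joined_rows = torch.cat([a, b], dim=0)
joined_cols = torch.cat([a, b], dim=1)

print(stacked.shape)      # new dimension added
print(joined_rows.shape)  # more rows
print(joined_cols.shape)  # more columns
```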

In [None]:
t1 = torch.tensor([1, 2, 3])
t2 = torch.tensor([4, 5, 6])
t3 = torch.tensor([7, 8, 9])

# YOUR CODE HERE
# Use torch.stack() to create a 3x3 matrix (stack as rows)
stacked = ...

# Use torch.cat() to create a 1D tensor of length 9
concatenated = ...

print(f"Stacked (shape {stacked.shape}):\n{stacked}")
print(f"Concatenated (shape {concatenated.shape}): {concatenated}")

In [None]:
# Verification
assert stacked.shape == torch.Size([3, 3]), f"Stacked should be (3, 3), got {stacked.shape}"
assert concatenated.shape == torch.Size([9]), f"Concatenated should be (9,), got {concatenated.shape}"
assert torch.equal(stacked[0], t1), "First row of stacked should be t1"
print("B4.3 Passed!")

---
## Exercise B5: Autograd Applications

Apply autograd to practical scenarios.

### B5.1 Gradient of a polynomial

Compute the gradient of f(x) = x^3 - 2x^2 + x at x = 2.

Mathematical derivation:
- f'(x) = 3x^2 - 4x + 1
- f'(2) = 3(4) - 4(2) + 1 = 12 - 8 + 1 = 5
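
The derivation above can be sanity-checked numerically without autograd. This is a minimal sketch using a central finite difference in plain Python. The helper name `central_difference` and the step size `h` are assumptions, not part of the exercise.

```python
def f(x):
    # The polynomial from the exercise: f(x) = x^3 - 2x^2 + x
    return x**3 - 2 * x**2 + x


def central_difference(func, x, h=1e-5):
    # Approximates func'(x) as the slope between x - h and x + h.
    return (func(x + h) - func(x - h)) / (2 * h)


approx = central_difference(f, 2.0)
print(f"Approximate f'(2): {approx:.6f}")
```

The printed value should agree with the hand-derived 5 to several decimal places, which is a useful cross-check for the autograd result.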

In [None]:
# YOUR CODE HERE
x = torch.tensor([2.0], requires_grad=True)

# Compute f(x) = x^3 - 2x^2 + x
f = ...

# Compute gradient
...

print(f"x = {x.item()}")
print(f"f(x) = {f.item()}")
print(f"f'(x) = {x.grad.item()}")

In [None]:
# Verification
assert f.item() == 2.0, f"f(2) should be 2.0, got {f.item()}"  # 8 - 8 + 2 = 2
assert x.grad.item() == 5.0, f"f'(2) should be 5.0, got {x.grad.item()}"
print("B5.1 Passed!")

### B5.2 Gradient with multiple inputs

Compute gradients of z = x^2 * y + y^3 with respect to both x and y at (x=2, y=3).

Mathematical derivation:
- dz/dx = 2xy = 2(2)(3) = 12
- dz/dy = x^2 + 3y^2 = 4 + 27 = 31

In [None]:
# YOUR CODE HERE
x = torch.tensor([2.0], requires_grad=True)
y = torch.tensor([3.0], requires_grad=True)

# Compute z = x^2 * y + y^3
z = ...

# Compute gradients
...

print(f"x = {x.item()}, y = {y.item()}")
print(f"z = {z.item()}")
print(f"dz/dx = {x.grad.item()}")
print(f"dz/dy = {y.grad.item()}")

In [None]:
# Verification
assert z.item() == 39.0, f"z should be 39.0 (4*3 + 27), got {z.item()}"
assert x.grad.item() == 12.0, f"dz/dx should be 12.0, got {x.grad.item()}"
assert y.grad.item() == 31.0, f"dz/dy should be 31.0, got {y.grad.item()}"
print("B5.2 Passed!")

### B5.3 Simple gradient descent step

Perform one step of gradient descent to minimize f(x) = (x - 3)^2.

Starting at x = 0, with learning rate 0.1, compute the new value of x after one update.
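
To see where repeated updates lead, here is a plain-Python sketch of the same update rule on a different, hypothetical objective g(t) = (t + 1)^2, with its derivative written out by hand instead of computed by autograd.

```python
def grad_g(t):
    # Derivative of the hypothetical objective g(t) = (t + 1)^2
    return 2 * (t + 1)


t = 4.0
lr = 0.1
history = [t]

for _ in range(50):
    # Gradient descent update: step against the gradient.
    t = t - lr * grad_g(t)
    history.append(t)

print(f"After 1 step:   {history[1]}")
print(f"After 50 steps: {t:.6f}")
```

Each step shrinks the distance to the minimizer t = -1 by a constant factor here, so the iterates approach it steadily.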

In [None]:
# YOUR CODE HERE
x = torch.tensor([0.0], requires_grad=True)
learning_rate = 0.1

# Forward pass: compute loss
loss = ...

# Backward pass: compute gradient
...

print(f"Initial x: {x.item()}")
print(f"Loss: {loss.item()}")
print(f"Gradient: {x.grad.item()}")

# Gradient descent update: x_new = x - learning_rate * gradient
# Use torch.no_grad() to prevent tracking this update
with torch.no_grad():
    x_new = ...

print(f"New x after one step: {x_new.item()}")

In [None]:
# Verification
# f(x) = (x-3)^2, f'(x) = 2(x-3)
# At x=0: f'(0) = 2(0-3) = -6
# x_new = 0 - 0.1 * (-6) = 0.6
assert loss.item() == 9.0, f"Loss should be 9.0, got {loss.item()}"
assert x.grad.item() == -6.0, f"Gradient should be -6.0, got {x.grad.item()}"
assert abs(x_new.item() - 0.6) < 1e-6, f"x_new should be 0.6, got {x_new.item()}"
print("B5.3 Passed!")

---
## Exercise B6: Mini-Project - Linear Regression by Hand

Implement a single gradient descent step for linear regression using raw PyTorch operations.

### B6.1 Set up the problem

We want to fit y = wx + b to some data points.
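
As a worked example of the loss arithmetic, the sketch below computes the mean squared error in plain Python for a hypothetical guess of w = 1, b = 0 on the same four data points. The guess is an assumption for illustration and differs from the starting values used in the exercise.

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w_guess, b_guess = 1.0, 0.0  # hypothetical guess, not the exercise's starting point

# Predictions from the line y = w*x + b
preds = [w_guess * x + b_guess for x in xs]

# Squared error for each point, then the mean over all points
squared_errors = [(p - y) ** 2 for p, y in zip(preds, ys)]
mse = sum(squared_errors) / len(squared_errors)

print(f"Predictions:    {preds}")
print(f"Squared errors: {squared_errors}")
print(f"MSE:            {mse}")
```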

In [None]:
# Data: 4 points
X = torch.tensor([1.0, 2.0, 3.0, 4.0])
y_true = torch.tensor([3.0, 5.0, 7.0, 9.0])  # True relationship: y = 2x + 1

# YOUR CODE HERE
# Initialize parameters (start both at zero)
w = torch.tensor([0.0], requires_grad=True)  # weight
b = torch.tensor([0.0], requires_grad=True)  # bias

# Forward pass: compute predictions
y_pred = ...

# Compute MSE loss: mean((y_pred - y_true)^2)
loss = ...

print(f"Predictions: {y_pred.detach()}")
print(f"True values: {y_true}")
print(f"MSE Loss: {loss.item()}")

In [None]:
# Verification
# With w=0, b=0: predictions are all 0
# MSE = mean([9, 25, 49, 81]) = 164/4 = 41
assert torch.allclose(y_pred, torch.zeros(4)), "Predictions should be zeros with w=0, b=0"
assert loss.item() == 41.0, f"Loss should be 41.0, got {loss.item()}"
print("B6.1 Passed!")

### B6.2 Compute gradients and update

Compute the gradients of the loss with respect to w and b, then perform one gradient descent update.
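
For reference, the MSE gradients have closed forms: dL/dw = mean(2 * (y_pred - y_true) * x) and dL/db = mean(2 * (y_pred - y_true)). The plain-Python sketch below evaluates them at a hypothetical point w = 1, b = 0 (an assumption, chosen to differ from the exercise's starting values) and applies one update.

```python
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 5.0, 7.0, 9.0]

w, b = 1.0, 0.0        # hypothetical current parameters
learning_rate = 0.01
n = len(xs)

# Residuals: prediction minus target for each point
residuals = [w * x + b - y for x, y in zip(xs, ys)]

# Closed-form MSE gradients
grad_w = sum(2 * r * x for r, x in zip(residuals, xs)) / n
grad_b = sum(2 * r for r in residuals) / n

# One gradient descent update
w_next = w - learning_rate * grad_w
b_next = b - learning_rate * grad_b

print(f"grad_w = {grad_w}, grad_b = {grad_b}")
print(f"w_next = {w_next:.4f}, b_next = {b_next:.4f}")
```

Both gradients come out negative at this point, so the update increases w and b, moving them toward the true values w = 2 and b = 1.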

In [None]:
# Continuing from B6.1...
# YOUR CODE HERE
# Backward pass
...

print(f"Gradient w.r.t. w: {w.grad.item()}")
print(f"Gradient w.r.t. b: {b.grad.item()}")

# Update parameters with learning rate 0.01
learning_rate = 0.01
with torch.no_grad():
    w_new = ...
    b_new = ...

print(f"New w: {w_new.item():.4f}")
print(f"New b: {b_new.item():.4f}")

In [None]:
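# Verification
# At w=0, b=0 the residuals (y_pred - y_true) are [-3, -5, -7, -9], so
# dL/dw = mean(2 * residual * x) = -35 and dL/db = mean(2 * residual) = -12.
# The update should therefore move w and b toward the true values (w=2, b=1).
# Gradients accumulate across backward() calls: re-run B6.1 first if you ran the cell above twice.
assert w.grad is not None and b.grad is not None, "Gradients should be computed"
assert abs(w.grad.item() + 35.0) < 1e-4, f"dL/dw should be -35.0, got {w.grad.item()}"
assert abs(b.grad.item() + 12.0) < 1e-4, f"dL/db should be -12.0, got {b.grad.item()}"
assert w_new.item() > 0, "w should increase (move toward 2)"
assert b_new.item() > 0, "b should increase (move toward 1)"
print("B6.2 Passed!")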

---
## Congratulations!

You've completed all the bonus exercises. These covered:

1. **Tensor Manipulation** - Creating specific patterns (diagonal, checkerboard, triangular)
2. **Broadcasting** - Shape prediction, normalization, outer products
3. **Advanced Indexing** - Diagonal extraction, batch selection, boolean masking
4. **Reshape Challenges** - Batch transpose, flatten/unflatten, stack vs cat
5. **Autograd Applications** - Polynomial gradients, multi-input gradients, gradient descent
6. **Mini-Project** - Linear regression fundamentals

You're well-prepared for Part 2: Building & Training Neural Networks!