# Introduction to PyTorch

## Part 1 Exercises: Tensors, Operations & Autograd

Practice what you've learned with hands-on tensor exercises.

**Instructions:**
- Complete the code in each cell where you see `# YOUR CODE HERE`
- Run the assertion cells to check your work
- Each exercise builds on concepts from the slides

In [1]:
import torch

print(f"PyTorch version: {torch.__version__}")

PyTorch version: 2.6.0+cu124


---
## Exercise 1: Creating Tensors

Practice creating tensors from lists, using factory functions, and specifying data types.

### 1.1 Create tensors from lists

Create the following tensors:
- `t1`: A 1D tensor containing `[1, 2, 3, 4, 5]`
- `t2`: A 2D tensor (matrix) with shape (2, 3) containing `[[1, 2, 3], [4, 5, 6]]`

In [2]:
# YOUR CODE HERE
t1 = torch.tensor([1, 2, 3, 4, 5])
t2 = torch.tensor([[1, 2, 3], [4, 5, 6]])

print(f"t1: {t1}")
print(f"t1 shape: {t1.shape}")
print(f"t2:\n{t2}")
print(f"t2 shape: {t2.shape}")

t1: tensor([1, 2, 3, 4, 5])
t1 shape: torch.Size([5])
t2:
tensor([[1, 2, 3],
        [4, 5, 6]])
t2 shape: torch.Size([2, 3])


In [3]:
# Verification
assert t1.shape == torch.Size([5]), f"t1 should have shape (5,), got {t1.shape}"
assert t2.shape == torch.Size([2, 3]), f"t2 should have shape (2, 3), got {t2.shape}"
assert torch.equal(t1, torch.tensor([1, 2, 3, 4, 5])), "t1 values are incorrect"
assert torch.equal(t2, torch.tensor([[1, 2, 3], [4, 5, 6]])), "t2 values are incorrect"
print("1.1 Passed!")

1.1 Passed!


### 1.2 Use factory functions

Create the following tensors using factory functions:
- `zeros_t`: A tensor of zeros with shape (3, 4)
- `ones_t`: A tensor of ones with shape (2, 2)
- `rand_t`: A tensor of random values (uniform on [0, 1)) with shape (5,)
- `arange_t`: A tensor with values from 0 to 9 (inclusive of 0, exclusive of 10)
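
As a rough sketch of how these factory functions take their arguments (the variable names and shapes below are illustrative only and are not the ones the exercise asks for):

```python
import torch

# Factory functions accept the shape as a tuple or as separate ints.
z_demo = torch.zeros(2, 5)        # all zeros, shape (2, 5)
o_demo = torch.ones((4,))         # all ones, shape (4,)
r_demo = torch.rand(3)            # uniform samples from [0, 1), shape (3,)
ar_demo = torch.arange(2, 8, 2)   # start=2, stop=8 (exclusive), step=2

print(z_demo.shape, o_demo.shape, r_demo.shape)
print(ar_demo.tolist())  # [2, 4, 6]
```

Note that `torch.arange` takes a range rather than a shape, so the length of its result follows from `start`, `stop`, and `step`.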

In [5]:
# YOUR CODE HERE
zeros_t = torch.zeros((3, 4))
ones_t = torch.ones((2, 2))
rand_t = torch.rand(5)
arange_t = torch.arange(0, 10)

print(f"zeros_t shape: {zeros_t.shape}")
print(f"ones_t:\n{ones_t}")
print(f"rand_t shape: {rand_t.shape}")
print(f"arange_t: {arange_t}")

zeros_t shape: torch.Size([3, 4])
ones_t:
tensor([[1., 1.],
        [1., 1.]])
rand_t shape: torch.Size([5])
arange_t: tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])


In [None]:
# Verification
assert zeros_t.shape == torch.Size([3, 4]), f"zeros_t should have shape (3, 4), got {zeros_t.shape}"
assert (zeros_t == 0).all(), "zeros_t should contain all zeros"
assert ones_t.shape == torch.Size([2, 2]), f"ones_t should have shape (2, 2), got {ones_t.shape}"
assert (ones_t == 1).all(), "ones_t should contain all ones"
assert rand_t.shape == torch.Size([5]), f"rand_t should have shape (5,), got {rand_t.shape}"
assert (rand_t >= 0).all() and (rand_t < 1).all(), "rand_t values should be in [0, 1)"
assert torch.equal(arange_t, torch.tensor([0, 1, 2, 3, 4, 5, 6, 7, 8, 9])), "arange_t should be [0, 1, ..., 9]"
print("1.2 Passed!")

### 1.3 Specify data types

Create tensors with specific data types:
- `float_t`: A tensor `[1, 2, 3]` with dtype `float32`
- `int_t`: A tensor `[1.5, 2.7, 3.9]` converted to `int64` (note: this truncates!)
- `bool_t`: A boolean tensor from `[0, 1, 0, 1, 1]`
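
The truncation mentioned above is easiest to see with a negative value. This is a minimal sketch with made-up values (`vals`, `as_int`, and `as_bool` are illustrative names), assuming the usual float-to-integer cast behaviour of rounding toward zero:

```python
import torch

vals = torch.tensor([-1.7, 0.4, 2.9])        # float32 by default
as_int = vals.to(torch.int64)                # drops the fractional part (toward zero)
as_bool = torch.tensor([0, 3, -2]).bool()    # any non-zero value becomes True

print(vals.dtype)        # torch.float32
print(as_int.tolist())   # [-1, 0, 2]
print(as_bool.tolist())  # [False, True, True]
```

Truncation is not rounding: `2.9` becomes `2`, and `-1.7` becomes `-1` rather than `-2`.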

In [None]:
# YOUR CODE HERE
float_t = torch.tensor([1, 2, 3], dtype=torch.float)
int_t = torch.tensor([1.5, 2.7, 3.9]).to(torch.int64)  # float -> int64 truncates toward zero
bool_t = torch.tensor([0, 1, 0, 1, 1], dtype=torch.bool)

print(f"float_t: {float_t}, dtype: {float_t.dtype}")
print(f"int_t: {int_t}, dtype: {int_t.dtype}")
print(f"bool_t: {bool_t}, dtype: {bool_t.dtype}")

float_t: tensor([1., 2., 3.]), dtype: torch.float32
int_t: tensor([1, 2, 3]), dtype: torch.int64
bool_t: tensor([False,  True, False,  True,  True]), dtype: torch.bool


In [None]:
# Verification
assert float_t.dtype == torch.float32, f"float_t should be float32, got {float_t.dtype}"
assert int_t.dtype == torch.int64, f"int_t should be int64, got {int_t.dtype}"
assert torch.equal(int_t, torch.tensor([1, 2, 3])), "int_t should be [1, 2, 3] after truncation"
assert bool_t.dtype == torch.bool, f"bool_t should be bool, got {bool_t.dtype}"
print("1.3 Passed!")

---
## Exercise 2: Tensor Operations

Practice element-wise arithmetic, matrix multiplication, and reduction operations.

### 2.1 Element-wise arithmetic

Given tensors `a` and `b`, compute:
- `add_result`: a + b
- `mul_result`: a * b (element-wise)
- `pow_result`: a raised to the power of 2

In [7]:
a = torch.tensor([1.0, 2.0, 3.0, 4.0])
b = torch.tensor([10.0, 20.0, 30.0, 40.0])

# YOUR CODE HERE
add_result = a + b
mul_result = a * b
pow_result = a ** 2

print(f"a + b = {add_result}")
print(f"a * b = {mul_result}")
print(f"a ** 2 = {pow_result}")

a + b = tensor([11., 22., 33., 44.])
a * b = tensor([ 10.,  40.,  90., 160.])
a ** 2 = tensor([ 1.,  4.,  9., 16.])


In [8]:
# Verification
assert torch.equal(add_result, torch.tensor([11.0, 22.0, 33.0, 44.0])), "add_result is incorrect"
assert torch.equal(mul_result, torch.tensor([10.0, 40.0, 90.0, 160.0])), "mul_result is incorrect"
assert torch.equal(pow_result, torch.tensor([1.0, 4.0, 9.0, 16.0])), "pow_result is incorrect"
print("2.1 Passed!")

2.1 Passed!


### 2.2 Matrix multiplication

Given matrices `m1` (2x3) and `m2` (3x2), compute their matrix product.

Hint: Use `@` operator or `torch.matmul()`
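
The shape rule behind the hint can be sketched as follows: an `(n, k)` matrix times a `(k, m)` matrix gives an `(n, m)` result. The tensors `p` and `q` are illustrative and unrelated to `m1` and `m2`:

```python
import torch

p = torch.ones(4, 3)
q = torch.ones(3, 5)

prod = p @ q                 # inner dimensions (3 and 3) must match
same = torch.matmul(p, q)    # equivalent to the @ operator for these inputs

print(prod.shape)               # torch.Size([4, 5])
print(torch.equal(prod, same))  # True
print(prod[0, 0].item())        # 3.0, the sum of three 1 * 1 products
```

By contrast, `*` multiplies element by element and would fail here, because shapes (4, 3) and (3, 5) cannot be broadcast together.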

In [12]:
m1 = torch.tensor([[1.0, 2.0, 3.0],
                   [4.0, 5.0, 6.0]])  # shape (2, 3)

m2 = torch.tensor([[1.0, 2.0],
                   [3.0, 4.0],
                   [5.0, 6.0]])  # shape (3, 2)

# YOUR CODE HERE
matmul_result = m1 @ m2

print(f"m1 @ m2 =\n{matmul_result}")
print(f"Result shape: {matmul_result.shape}")

m1 @ m2 =
tensor([[22., 28.],
        [49., 64.]])
Result shape: torch.Size([2, 2])


In [11]:
# Verification
expected = torch.tensor([[22.0, 28.0], [49.0, 64.0]])
assert matmul_result.shape == torch.Size([2, 2]), f"Result should be (2, 2), got {matmul_result.shape}"
assert torch.equal(matmul_result, expected), "Matrix multiplication result is incorrect"
print("2.2 Passed!")

2.2 Passed!


### 2.3 Reduction operations

Given a 2D tensor, compute:
- `total_sum`: Sum of all elements
- `row_mean`: Mean of each row (result should have shape (3,))
- `col_max`: Maximum of each column (result should have shape (4,))
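
The `dim` argument names the dimension that gets collapsed, which is easy to mix up. A small sketch on an illustrative tensor (`grid` is not part of the exercise):

```python
import torch

grid = torch.tensor([[1.0, 5.0],
                     [4.0, 2.0],
                     [0.0, 3.0]])  # shape (3, 2)

per_col_sum = grid.sum(dim=0)   # collapses the 3 rows    -> shape (2,)
per_row_sum = grid.sum(dim=1)   # collapses the 2 columns -> shape (3,)
row_max = grid.max(dim=1)       # named tuple with .values and .indices

print(per_col_sum.tolist())      # [5.0, 10.0]
print(per_row_sum.tolist())      # [6.0, 6.0, 3.0]
print(row_max.values.tolist())   # [5.0, 4.0, 3.0]
print(row_max.indices.tolist())  # [1, 0, 1]
```

Unlike `sum` and `mean`, `max` with a `dim` argument returns both the values and their positions, so `.values` is needed to get a plain tensor.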

In [15]:
data = torch.tensor([[1.0, 2.0, 3.0, 4.0],
                     [5.0, 6.0, 7.0, 8.0],
                     [9.0, 10.0, 11.0, 12.0]])  # shape (3, 4)

# YOUR CODE HERE
total_sum = data.sum()
row_mean = data.mean(dim=1)  # Reduce over dim=1 (across columns): one mean per row, shape (3,)
col_max = data.max(dim=0).values  # Reduce over dim=0 (across rows): one max per column, shape (4,)

print(f"Total sum: {total_sum}")
print(f"Row means: {row_mean}")
print(f"Column maxes: {col_max}")

Total sum: 78.0
Row means: tensor([ 2.5000,  6.5000, 10.5000])
Column maxes: tensor([ 9., 10., 11., 12.])


In [16]:
# Verification
assert total_sum == 78.0, f"total_sum should be 78.0, got {total_sum}"
assert torch.equal(row_mean, torch.tensor([2.5, 6.5, 10.5])), "row_mean is incorrect"
assert torch.equal(col_max, torch.tensor([9.0, 10.0, 11.0, 12.0])), "col_max is incorrect"
print("2.3 Passed!")

2.3 Passed!


---
## Exercise 3: Reshaping

Practice reshaping tensors, understanding views vs copies, and using squeeze/unsqueeze.

### 3.1 Reshape tensors

Given a 1D tensor with 12 elements, reshape it to:
- `reshaped_3x4`: Shape (3, 4)
- `reshaped_2x6`: Shape (2, 6)
- `reshaped_auto`: Shape (4, -1) - let PyTorch infer the second dimension
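
A quick sketch of how `-1` is resolved, using an illustrative 24-element tensor rather than the exercise's `original`:

```python
import torch

flat = torch.arange(24)

wide = flat.view(6, -1)        # -1 is inferred as 24 / 6 = 4
cube = flat.reshape(2, 3, -1)  # -1 is inferred as 24 / (2 * 3) = 4

print(wide.shape)  # torch.Size([6, 4])
print(cube.shape)  # torch.Size([2, 3, 4])
```

Only one dimension may be `-1`, and the total number of elements must divide evenly by the dimensions that are given.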

In [17]:
original = torch.arange(12)  # [0, 1, 2, ..., 11]

# YOUR CODE HERE
reshaped_3x4 = original.view(3, 4)
reshaped_2x6 = original.view(2, 6)
reshaped_auto = original.view(4, -1)  # -1 is inferred as 12 / 4 = 3, giving shape (4, 3)

print(f"Original: {original}")
print(f"Reshaped to (3, 4):\n{reshaped_3x4}")
print(f"Reshaped to (2, 6):\n{reshaped_2x6}")
print(f"Reshaped to (4, ?): {reshaped_auto.shape}\n{reshaped_auto}")

Original: tensor([ 0,  1,  2,  3,  4,  5,  6,  7,  8,  9, 10, 11])
Reshaped to (3, 4):
tensor([[ 0,  1,  2,  3],
        [ 4,  5,  6,  7],
        [ 8,  9, 10, 11]])
Reshaped to (2, 6):
tensor([[ 0,  1,  2,  3,  4,  5],
        [ 6,  7,  8,  9, 10, 11]])
Reshaped to (4, ?): torch.Size([4, 3])
tensor([[ 0,  1,  2],
        [ 3,  4,  5],
        [ 6,  7,  8],
        [ 9, 10, 11]])


In [18]:
# Verification
assert reshaped_3x4.shape == torch.Size([3, 4]), f"Expected (3, 4), got {reshaped_3x4.shape}"
assert reshaped_2x6.shape == torch.Size([2, 6]), f"Expected (2, 6), got {reshaped_2x6.shape}"
assert reshaped_auto.shape == torch.Size([4, 3]), f"Expected (4, 3), got {reshaped_auto.shape}"
print("3.1 Passed!")

3.1 Passed!


### 3.2 Views vs Copies

Demonstrate the difference between views and copies:
1. Create a view of `source` using `.view()`
2. Create a copy of `source` using `.clone()`
3. Modify element [0, 0] in both the view and the copy
4. Observe what happens to `source`
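
One way to check memory sharing directly is to compare `data_ptr()`, the address of a tensor's first element. The sketch below uses illustrative names (`base`, `alias`, `independent`) and writes through the original tensor rather than through the view:

```python
import torch

base = torch.arange(4)
alias = base.view(2, 2)                 # shares storage with base
independent = base.clone().view(2, 2)   # clone() allocates new storage first

print(alias.data_ptr() == base.data_ptr())        # True
print(independent.data_ptr() == base.data_ptr())  # False

base[3] = -1                     # write through the original tensor
print(alias[1, 1].item())        # -1, the view sees the change
print(independent[1, 1].item())  # 3, the copy does not
```

Sharing works in both directions, so a write through the view would show up in the original in the same way.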

In [None]:
source = torch.tensor([1, 2, 3, 4, 5, 6])

# YOUR CODE HERE
# Create a view with shape (2, 3)
view_tensor = ...

# Modify view_tensor[0, 0] to 99
# YOUR CODE HERE
...

print("After modifying view:")
print(f"  source: {source}")
print(f"  view_tensor:\n{view_tensor}")

# Check: did source change?
source_changed_by_view = source[0].item() == 99
print(f"  Source changed by view modification: {source_changed_by_view}")

In [None]:
# Reset and test with clone
source = torch.tensor([1, 2, 3, 4, 5, 6])

# YOUR CODE HERE
# Create a copy with shape (2, 3)
copy_tensor = ...

# Modify copy_tensor[0, 0] to 99
# YOUR CODE HERE
...

print("After modifying copy:")
print(f"  source: {source}")
print(f"  copy_tensor:\n{copy_tensor}")

# Check: did source change?
source_changed_by_copy = source[0].item() == 99
print(f"  Source changed by copy modification: {source_changed_by_copy}")

In [None]:
# Verification
assert source_changed_by_view, "View should share memory with source"
assert not source_changed_by_copy, "Clone should NOT share memory with source"
print("3.2 Passed!")

### 3.3 Squeeze and Unsqueeze

Practice adding and removing dimensions:
- `unsqueezed`: Add a dimension at position 0 to `vec` (result shape: (1, 4))
- `squeezed`: Remove the size-1 dimensions from `bulky` (result shape: (3, 4))
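
The difference between squeezing every size-1 dimension and squeezing a single one can be sketched like this (the tensors `row` and `padded` are illustrative):

```python
import torch

row = torch.zeros(5)             # shape (5,)
as_column = row.unsqueeze(1)     # new axis at position 1 -> shape (5, 1)
padded = torch.zeros(1, 2, 1)    # shape (1, 2, 1)

print(as_column.shape)           # torch.Size([5, 1])
print(padded.squeeze().shape)    # torch.Size([2]), all size-1 dims removed
print(padded.squeeze(0).shape)   # torch.Size([2, 1]), only dim 0 removed
```

Passing a dimension to `squeeze` is the safer choice when some size-1 dimension, such as a batch of one, must be kept.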

In [None]:
vec = torch.tensor([1, 2, 3, 4])  # shape (4,)
bulky = torch.zeros(1, 3, 1, 4, 1)  # shape (1, 3, 1, 4, 1)

# YOUR CODE HERE
unsqueezed = ...
squeezed = ...

print(f"vec shape: {vec.shape}")
print(f"unsqueezed shape: {unsqueezed.shape}")
print(f"bulky shape: {bulky.shape}")
print(f"squeezed shape: {squeezed.shape}")

In [None]:
# Verification
assert unsqueezed.shape == torch.Size([1, 4]), f"Expected (1, 4), got {unsqueezed.shape}"
assert squeezed.shape == torch.Size([3, 4]), f"Expected (3, 4), got {squeezed.shape}"
print("3.3 Passed!")

---
## Exercise 4: Indexing & Slicing

Practice basic indexing, slicing patterns, and boolean masking.

### 4.1 Basic indexing

Given a 2D tensor, extract:
- `element`: The element at row 1, column 2
- `row`: The entire second row (index 1)
- `col`: The entire third column (index 2)

In [None]:
matrix = torch.tensor([[1, 2, 3, 4],
                       [5, 6, 7, 8],
                       [9, 10, 11, 12]])

# YOUR CODE HERE
element = ...
row = ...
col = ...

print(f"Element at [1, 2]: {element}")
print(f"Row 1: {row}")
print(f"Column 2: {col}")

In [None]:
# Verification
assert element == 7, f"element should be 7, got {element}"
assert torch.equal(row, torch.tensor([5, 6, 7, 8])), "row is incorrect"
assert torch.equal(col, torch.tensor([3, 7, 11])), "col is incorrect"
print("4.1 Passed!")

### 4.2 Slicing patterns

Extract the following slices:
- `top_left`: Top-left 2x2 submatrix
- `every_other_col`: All rows, but only columns 0 and 2 (every other column)
- `reversed_rows`: All columns, but rows in reverse order
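
One point worth knowing before writing these slices: PyTorch tensors follow `start:stop:step` slicing like Python lists, but they reject negative steps, so the list idiom `[::-1]` does not carry over. A sketch on an illustrative tensor (`grid` is not the exercise's `matrix`), using `torch.flip` as one possible way to reverse an axis:

```python
import torch

grid = torch.arange(6).reshape(2, 3)   # [[0, 1, 2], [3, 4, 5]]

last_two_cols = grid[:, 1:]            # columns 1 and 2
strided = grid[:, ::2]                 # step of 2 -> columns 0 and 2
flipped = torch.flip(grid, dims=[1])   # reverse the column order

try:
    grid[:, ::-1]                      # negative steps are rejected
    negative_step_ok = True
except ValueError:
    negative_step_ok = False

print(last_two_cols.tolist())  # [[1, 2], [4, 5]]
print(strided.tolist())        # [[0, 2], [3, 5]]
print(flipped.tolist())        # [[2, 1, 0], [5, 4, 3]]
print(negative_step_ok)        # False
```

Indexing with an explicit list of positions, such as `grid[:, [2, 1, 0]]`, is another way to reorder an axis.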

In [None]:
matrix = torch.tensor([[1, 2, 3, 4],
                       [5, 6, 7, 8],
                       [9, 10, 11, 12]])

# YOUR CODE HERE
top_left = ...
every_other_col = ...
reversed_rows = ...

print(f"Top-left 2x2:\n{top_left}")
print(f"Every other column:\n{every_other_col}")
print(f"Reversed rows:\n{reversed_rows}")

In [None]:
# Verification
assert torch.equal(top_left, torch.tensor([[1, 2], [5, 6]])), "top_left is incorrect"
assert torch.equal(every_other_col, torch.tensor([[1, 3], [5, 7], [9, 11]])), "every_other_col is incorrect"
assert torch.equal(reversed_rows, torch.tensor([[9, 10, 11, 12], [5, 6, 7, 8], [1, 2, 3, 4]])), "reversed_rows is incorrect"
print("4.2 Passed!")

### 4.3 Boolean masking

Use boolean indexing to:
- `positives`: Extract all positive values from `values`
- `large_values`: Extract values greater than 5 from `values`
- `modified`: Create a copy of `values` where all negative values are replaced with 0
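
The pattern here is to build a boolean mask with a comparison, then use it either to select elements or to assign to them. A minimal sketch with illustrative data (`temps` is not the exercise's `values`):

```python
import torch

temps = torch.tensor([12, -4, 7, 0, -9])

mask = temps < 0            # boolean tensor with the same shape as temps
below_zero = temps[mask]    # 1D tensor holding only the selected elements
clipped = temps.clone()     # clone first so temps itself is left alone
clipped[mask] = 0           # in-place write only where mask is True

print(mask.tolist())        # [False, True, False, False, True]
print(below_zero.tolist())  # [-4, -9]
print(clipped.tolist())     # [12, 0, 7, 0, 0]
print(temps.tolist())       # [12, -4, 7, 0, -9]
```

Selecting with a mask always returns a flattened 1D result, while assigning through a mask keeps the original shape.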

In [None]:
values = torch.tensor([-3, 1, 4, -1, 5, 9, -2, 6])

# YOUR CODE HERE
positives = ...
large_values = ...

# For modified: first clone, then use boolean indexing to set negatives to 0
modified = ...

print(f"Original: {values}")
print(f"Positives: {positives}")
print(f"Values > 5: {large_values}")
print(f"Negatives replaced with 0: {modified}")

In [None]:
# Verification
assert torch.equal(positives, torch.tensor([1, 4, 5, 9, 6])), "positives is incorrect"
assert torch.equal(large_values, torch.tensor([9, 6])), "large_values is incorrect"
assert torch.equal(modified, torch.tensor([0, 1, 4, 0, 5, 9, 0, 6])), "modified is incorrect"
print("4.3 Passed!")

---
## Exercise 5: Autograd Basics

Practice enabling gradient tracking, computing gradients, and using no_grad context.

### 5.1 Enable gradient tracking

Create tensors with gradient tracking:
- `x`: A tensor `[2.0, 3.0]` with `requires_grad=True`
- `y`: Convert an existing tensor to require gradients
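
Both routes to gradient tracking can be sketched as below. The names `p`, `q`, and `counts` are illustrative. The last part shows a restriction that is easy to trip over: integer tensors cannot require gradients.

```python
import torch

p = torch.zeros(3, requires_grad=True)   # tracked from creation
q = torch.zeros(3)                       # not tracked yet
returned = q.requires_grad_()            # in-place switch that returns q itself

counts = torch.zeros(3, dtype=torch.int64)
try:
    counts.requires_grad_()              # only float/complex dtypes are allowed
    int_tracking_ok = True
except RuntimeError:
    int_tracking_ok = False

print(p.requires_grad, q.requires_grad)  # True True
print(returned is q)                     # True
print(int_tracking_ok)                   # False
```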

In [None]:
# YOUR CODE HERE
x = ...

# Start with a tensor without gradient tracking, then enable it
y_initial = torch.tensor([1.0, 2.0])
y = ...  # Enable requires_grad on y_initial (hint: use requires_grad_() method)

print(f"x requires_grad: {x.requires_grad}")
print(f"y requires_grad: {y.requires_grad}")

In [None]:
# Verification
assert x.requires_grad, "x should have requires_grad=True"
assert y.requires_grad, "y should have requires_grad=True"
print("5.1 Passed!")

### 5.2 Compute gradients

Compute the gradient of `z = x^2 + 2x` with respect to `x` at `x = 3.0`.

Mathematical derivation:
- dz/dx = 2x + 2
- At x = 3: dz/dx = 2(3) + 2 = 8
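
The same check, deriving by hand and then comparing against `backward()`, can be sketched on a different function. Here `f(w) = 3w^2 - w` is an illustrative choice, with `df/dw = 6w - 1`, so at `w = 2` the gradient should be 11:

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

f = 3 * w ** 2 - w   # f(w) = 3w^2 - w
f.backward()         # fills w.grad with df/dw evaluated at w = 2

print(f.item())       # 10.0
print(w.grad.item())  # 11.0
```

`backward()` can be called without arguments here because `f` holds a single element.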

In [None]:
# YOUR CODE HERE
x = torch.tensor([3.0], requires_grad=True)

# Compute z = x^2 + 2x
z = ...

# Compute gradients
...

print(f"x = {x.item()}")
print(f"z = x^2 + 2x = {z.item()}")
print(f"dz/dx = {x.grad.item()}")

In [None]:
# Verification
assert z.item() == 15.0, f"z should be 15.0, got {z.item()}"
assert x.grad.item() == 8.0, f"dz/dx should be 8.0, got {x.grad.item()}"
print("5.2 Passed!")

### 5.3 Use no_grad context

Demonstrate the difference between operations inside and outside `torch.no_grad()`:
1. Perform an operation WITH gradient tracking
2. Perform the same operation WITHOUT gradient tracking (using `torch.no_grad()`)
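
Besides `requires_grad`, the `grad_fn` attribute shows whether a result was recorded in the computation graph. A sketch with illustrative names, which also shows `detach()` as a related way to get an untracked tensor from a tracked one:

```python
import torch

w = torch.tensor([1.0, 2.0], requires_grad=True)

tracked = w * 3                # recorded in the graph
with torch.no_grad():
    untracked = w * 3          # same values, nothing recorded
detached = tracked.detach()    # untracked tensor sharing tracked's data

print(tracked.grad_fn is not None)  # True
print(untracked.grad_fn is None)    # True
print(detached.requires_grad)       # False
print(torch.equal(tracked, untracked))  # True
```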

In [None]:
x = torch.tensor([2.0], requires_grad=True)

# Operation WITH gradient tracking
y_with_grad = x ** 2

# YOUR CODE HERE
# Operation WITHOUT gradient tracking (use torch.no_grad() context manager)
with ...:
    y_without_grad = x ** 2

print(f"With grad tracking: y.requires_grad = {y_with_grad.requires_grad}")
print(f"Without grad tracking: y.requires_grad = {y_without_grad.requires_grad}")

In [None]:
# Verification
assert y_with_grad.requires_grad, "y_with_grad should have requires_grad=True"
assert not y_without_grad.requires_grad, "y_without_grad should have requires_grad=False"
print("5.3 Passed!")

### 5.4 Gradient accumulation (Bonus)

Demonstrate that gradients accumulate by default, and how to clear them.
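
The accumulation behaviour can be sketched with linear functions, where the gradients are easy to read off. The names and functions here are illustrative and differ from the ones in the cell below:

```python
import torch

w = torch.tensor(1.0, requires_grad=True)

(2 * w).backward()
first = w.grad.item()    # 2.0

(5 * w).backward()       # added to the existing gradient
summed = w.grad.item()   # 2.0 + 5.0 = 7.0

w.grad = None            # reset; w.grad.zero_() is an alternative
(5 * w).backward()
fresh = w.grad.item()    # 5.0

print(first, summed, fresh)  # 2.0 7.0 5.0
```

This is why a training loop typically resets gradients once per iteration, before or after each update step.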

In [None]:
x = torch.tensor([2.0], requires_grad=True)

# First backward: y1 = x^2, dy1/dx = 2x = 4
y1 = x ** 2
y1.backward()
grad_after_first = x.grad.clone()

# Second backward WITHOUT clearing: y2 = x^3, dy2/dx = 3x^2 = 12
# Gradient should accumulate: 4 + 12 = 16
y2 = x ** 3
y2.backward()
grad_accumulated = x.grad.clone()

# YOUR CODE HERE
# Clear the gradient (set x.grad = None, or call x.grad.zero_())
...

# Third backward after clearing: y3 = x^3, dy3/dx = 3x^2 = 12
y3 = x ** 3
y3.backward()
grad_after_clear = x.grad.clone()

print(f"Gradient after first backward: {grad_after_first.item()}")
print(f"Gradient after second backward (accumulated): {grad_accumulated.item()}")
print(f"Gradient after clearing and third backward: {grad_after_clear.item()}")

In [None]:
# Verification
assert grad_after_first.item() == 4.0, f"First gradient should be 4.0, got {grad_after_first.item()}"
assert grad_accumulated.item() == 16.0, f"Accumulated gradient should be 16.0, got {grad_accumulated.item()}"
assert grad_after_clear.item() == 12.0, f"Gradient after clear should be 12.0, got {grad_after_clear.item()}"
print("5.4 Passed!")

---
## Congratulations!

You've completed all the Part 1 exercises. You now have hands-on experience with:

1. **Creating Tensors** - from lists, factory functions, and with specific dtypes
2. **Tensor Operations** - arithmetic, matrix multiplication, and reductions
3. **Reshaping** - view, reshape, squeeze, unsqueeze, and understanding memory sharing
4. **Indexing & Slicing** - basic indexing, slicing patterns, and boolean masking
5. **Autograd** - gradient tracking, backward(), no_grad(), and gradient accumulation

Next week in Part 2, you'll learn to build neural networks using `nn.Module`, loss functions, optimizers, and the complete training loop!