# PyTorch Fundamentals - Student Guide

This notebook accompanies the PyTorch Fundamentals lesson. It includes explanations, code examples, and solutions to the exercises from the provided notebook, covering tensors, their operations, and how to work with them effectively in PyTorch.

**Note**: Ensure you're running this on a GPU-enabled environment (e.g., Google Colab with GPU runtime) for exercises involving GPU computations.

## What You'll Learn
- Creating and manipulating tensors
- Tensor operations (addition, multiplication, etc.)
- Working with tensor shapes and indexing
- Interfacing with NumPy
- Reproducibility with random seeds
- Running tensors on GPUs
- Solving practical exercises using PyTorch operations

Let's get started!

## 1. Setting Up PyTorch

First, let's import PyTorch and check its version to ensure everything is set up correctly.

In [None]:
import torch
print(f"PyTorch Version: {torch.__version__}")

## 2. Introduction to Tensors

Tensors are the core data structure in PyTorch, used to represent data numerically. They can be scalars (0D), vectors (1D), matrices (2D), or higher-dimensional arrays.

Let's create some basic tensors:

In [None]:
# Scalar (0D tensor)
scalar = torch.tensor(7)
print(f"Scalar: {scalar}, Dimensions: {scalar.ndim}, Shape: {scalar.shape}")

# Vector (1D tensor)
vector = torch.tensor([7, 7])
print(f"Vector: {vector}, Dimensions: {vector.ndim}, Shape: {vector.shape}")

# Matrix (2D tensor)
matrix = torch.tensor([[7, 8], [9, 10]])
print(f"Matrix: \n{matrix}, Dimensions: {matrix.ndim}, Shape: {matrix.shape}")

# Tensor (3D tensor)
tensor = torch.tensor([[[1, 2, 3], [3, 6, 9], [2, 4, 5]]])
print(f"Tensor: \n{tensor}, Dimensions: {tensor.ndim}, Shape: {tensor.shape}")

**Key Points**:
- Use `ndim` to check the number of dimensions.
- Use `shape` to see how elements are arranged.
- Count the opening square brackets at the start of the printed tensor to determine the number of dimensions.
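
As a quick check of the key points above, the following sketch confirms that `ndim` equals the length of `shape`. The tensor `t` and its values are made up for illustration.

```python
import torch

# Hypothetical example tensor: two opening brackets, so two dimensions
t = torch.tensor([[1, 2, 3], [4, 5, 6]])

print(t.ndim)           # number of dimensions
print(tuple(t.shape))   # size along each dimension
print(t.numel())        # total number of elements

# ndim is the length of shape
same = t.ndim == len(t.shape)
print(same)
```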

## 3. Creating Random Tensors

Machine learning models often start with random tensors. Use `torch.rand()` to create tensors with random values.

In [None]:
# Random tensor of size (3, 4)
random_tensor = torch.rand(size=(3, 4))
print(f"Random Tensor: \n{random_tensor}, Dtype: {random_tensor.dtype}")

# Random tensor for an image shape (224, 224, 3)
random_image_tensor = torch.rand(size=(224, 224, 3))
print(f"Image Tensor Shape: {random_image_tensor.shape}, Dimensions: {random_image_tensor.ndim}")

## 4. Tensors of Zeros and Ones

Use `torch.zeros()` and `torch.ones()` to create tensors filled with zeros or ones, often used for masking or initialization.

In [None]:
# Tensor of zeros
zeros = torch.zeros(size=(3, 4))
print(f"Zeros: \n{zeros}, Dtype: {zeros.dtype}")

# Tensor of ones
ones = torch.ones(size=(3, 4))
print(f"Ones: \n{ones}, Dtype: {ones.dtype}")

## 5. Creating Ranges and Tensors Like Others

Use `torch.arange()` for ranges and `torch.zeros_like()` or `torch.ones_like()` to create tensors with the same shape as another.

In [None]:
# Range from 0 to 9
zero_to_ten = torch.arange(start=0, end=10, step=1)
print(f"Range: {zero_to_ten}")

# Zeros like zero_to_ten
ten_zeros = torch.zeros_like(zero_to_ten)
print(f"Zeros Like: {ten_zeros}")
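
The "like" functions copy more than the shape. A minimal sketch, assuming an integer range as input (the names `base` and `float_ones` are illustrative), shows that the dtype is inherited too and can be overridden:

```python
import torch

base = torch.arange(start=0, end=10, step=1)   # integer range, dtype int64
zeros = torch.zeros_like(base)
ones = torch.ones_like(base)

# The "like" tensors inherit shape and dtype from the input
print(zeros.shape == base.shape, zeros.dtype == base.dtype)
print(ones.dtype)

# Pass dtype explicitly to override the inherited dtype
float_ones = torch.ones_like(base, dtype=torch.float32)
print(float_ones.dtype)
```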

## 6. Tensor Datatypes

Tensors have different datatypes (e.g., `float32`, `float16`, `int64`). The default for floating-point values is `float32`, while integer values default to `int64`. Pass `dtype` to change it.

In [None]:
# Float32 tensor (dtype=None lets PyTorch infer the dtype; floats become float32)
float_32_tensor = torch.tensor([3.0, 6.0, 9.0], dtype=None)
print(f"Float32: Shape: {float_32_tensor.shape}, Dtype: {float_32_tensor.dtype}, Device: {float_32_tensor.device}")

# Float16 tensor
float_16_tensor = torch.tensor([3.0, 6.0, 9.0], dtype=torch.float16)
print(f"Float16: Dtype: {float_16_tensor.dtype}")

**Common Issues**:
- Mismatched shapes
- Different datatypes (e.g., `float32` vs. `float16`)
- Tensors on different devices (CPU vs. GPU)
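
The first two issues can be reproduced on any machine. The sketch below uses made-up tensors (`a`, `b`, `int_tensor`, `float_tensor`) to trigger a shape error in matrix multiplication and to show how element-wise operations promote mixed datatypes. The device mismatch is left out because it needs a GPU to reproduce.

```python
import torch

# Shape mismatch: inner dimensions (2 and 3) do not match
a = torch.rand(3, 2)
b = torch.rand(3, 2)
try:
    torch.matmul(a, b)
    shape_error = False
except RuntimeError as err:
    shape_error = True
    print(f"Shape error: {err}")

# Transposing one operand makes the inner dimensions line up
fixed = torch.matmul(a, b.T)
print(fixed.shape)

# Datatype mismatch: element-wise ops promote to a common dtype
int_tensor = torch.tensor([1, 2, 3])           # int64
float_tensor = torch.tensor([0.5, 0.5, 0.5])   # float32
promoted = int_tensor + float_tensor
print(promoted.dtype)
```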

## 7. Manipulating Tensors

Tensor operations include addition, subtraction, multiplication, division, and matrix multiplication.

In [None]:
# Basic operations
tensor = torch.tensor([1, 2, 3])
print(f"Original: {tensor}")
print(f"Add 10: {tensor + 10}")
print(f"Multiply by 10: {tensor * 10}")

# Matrix multiplication
tensor_a = torch.tensor([[1, 2], [3, 4]])
tensor_b = torch.tensor([[5, 6], [7, 8]])
matmul = torch.matmul(tensor_a, tensor_b)
print(f"Matrix Multiplication: \n{matmul}")
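
Element-wise multiplication and matrix multiplication are easy to confuse. This short comparison reuses the two matrices from the cell above; the `@` operator is shorthand for `torch.matmul`.

```python
import torch

tensor_a = torch.tensor([[1, 2], [3, 4]])
tensor_b = torch.tensor([[5, 6], [7, 8]])

# Element-wise: multiplies values at matching positions
elementwise = tensor_a * tensor_b
print(elementwise)   # tensor([[ 5, 12], [21, 32]])

# Matrix multiplication: dot products of rows with columns
matmul = tensor_a @ tensor_b
print(matmul)        # tensor([[19, 22], [43, 50]])
```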

## 8. Dealing with Tensor Shapes

Use `view()`, `reshape()`, `squeeze()`, `unsqueeze()`, and `permute()` to manipulate tensor shapes.

In [None]:
# Reshape
x = torch.arange(1., 8.)
x_reshaped = x.reshape(1, 7)
print(f"Reshaped: {x_reshaped}, Shape: {x_reshaped.shape}")

# Squeeze
x_squeezed = x_reshaped.squeeze()
print(f"Squeezed: {x_squeezed}, Shape: {x_squeezed.shape}")

# Unsqueeze
x_unsqueezed = x_squeezed.unsqueeze(dim=0)
print(f"Unsqueezed: {x_unsqueezed}, Shape: {x_unsqueezed.shape}")

# Permute
x_original = torch.rand(size=(224, 224, 3))
x_permuted = x_original.permute(2, 0, 1)
print(f"Permuted Shape: {x_permuted.shape}")
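
`view()` is listed above but not shown in the cell. A minimal sketch, with illustrative variable names, of how `view()` shares memory with its source and how `permute()` returns a non-contiguous view:

```python
import torch

x = torch.arange(1., 8.)

# view() returns a tensor that shares memory with the original
z = x.view(1, 7)
z[0, 0] = 5.
print(x)   # the first element of x changed as well

# permute() also returns a view; here it is no longer contiguous in memory
img = torch.rand(size=(224, 224, 3))
img_permuted = img.permute(2, 0, 1)
print(img_permuted.is_contiguous())

# reshape() copies when it has to, so it works on non-contiguous tensors
flat = img_permuted.reshape(-1)
print(flat.shape)
```

Because views share memory, call `.clone()` first if the original must stay unchanged.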

## 9. PyTorch and NumPy

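Convert between PyTorch tensors and NumPy arrays using `torch.from_numpy()` and `tensor.numpy()`. NumPy's default float dtype is `float64`, so `torch.from_numpy()` returns a `float64` tensor unless you convert it, as the first example does with `.type(torch.float32)`.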

In [None]:
import numpy as np

# NumPy to PyTorch
array = np.arange(1.0, 8.0)
tensor = torch.from_numpy(array).type(torch.float32)
print(f"NumPy to Tensor: {tensor}, Dtype: {tensor.dtype}")

# PyTorch to NumPy
tensor = torch.ones(7)
numpy_tensor = tensor.numpy()
print(f"Tensor to NumPy: {numpy_tensor}, Dtype: {numpy_tensor.dtype}")
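
One detail worth knowing: `torch.from_numpy()` shares memory with the source array, while a dtype conversion produces a copy. The sketch below illustrates this with hypothetical names `shared` and `converted`.

```python
import numpy as np
import torch

array = np.arange(1.0, 8.0)
shared = torch.from_numpy(array)          # float64, shares memory with array
converted = shared.type(torch.float32)    # dtype conversion makes a copy

array[0] = 100.0                          # modify the NumPy array in place
print(shared.dtype)
print(shared[0].item())      # 100.0, the shared tensor sees the change
print(converted[0].item())   # 1.0, the converted copy does not
```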

## 10. Reproducibility

Set random seeds with `torch.manual_seed()` to make experiments reproducible.

In [None]:
RANDOM_SEED = 42
torch.manual_seed(RANDOM_SEED)
random_tensor_C = torch.rand(3, 4)
torch.manual_seed(RANDOM_SEED)
random_tensor_D = torch.rand(3, 4)
print(f"Tensor C: \n{random_tensor_C}")
print(f"Tensor D: \n{random_tensor_D}")
print(f"Equal: {(random_tensor_C == random_tensor_D).all()}")
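
`torch.manual_seed()` sets the global seed. As an alternative sketch (not part of the original lesson), a local `torch.Generator` keeps the seeded state separate from the rest of the program. The names `gen_a` and `gen_b` are illustrative.

```python
import torch

# Two independent generators with the same seed
gen_a = torch.Generator().manual_seed(42)
gen_b = torch.Generator().manual_seed(42)

sample_a = torch.rand(3, 4, generator=gen_a)
sample_b = torch.rand(3, 4, generator=gen_b)
same_first_draw = torch.equal(sample_a, sample_b)
print(same_first_draw)

# Drawing again from the same generator advances its state
sample_a2 = torch.rand(3, 4, generator=gen_a)
same_second_draw = torch.equal(sample_a, sample_a2)
print(same_second_draw)
```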

## 11. Running Tensors on GPU

Use `to(device)` to move tensors to GPU for faster computations.

In [None]:
# Check for GPU
device = "cuda" if torch.cuda.is_available() else "cpu"
print(f"Device: {device}, GPU Count: {torch.cuda.device_count()}")

# Move tensor to the selected device (GPU if available, otherwise CPU)
tensor = torch.tensor([1, 2, 3])
tensor_on_gpu = tensor.to(device)
print(f"Tensor on {tensor_on_gpu.device}: {tensor_on_gpu}")

# Move back to CPU for NumPy
tensor_back_on_cpu = tensor_on_gpu.cpu().numpy()
print(f"Back to CPU (NumPy): {tensor_back_on_cpu}")
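
Tensors can also be created directly on the target device instead of being moved afterwards. A device-agnostic sketch, assuming the same `device` selection as above (the names `created` and `moved` are illustrative):

```python
import torch

device = "cuda" if torch.cuda.is_available() else "cpu"

# Create one tensor directly on the target device
created = torch.ones(3, device=device)

# Move another tensor to the same device before combining them
moved = torch.tensor([1.0, 2.0, 3.0]).to(device)
total = created + moved
print(total.device.type)

# .cpu() returns the tensor unchanged if it is already on the CPU
as_numpy = total.cpu().numpy()
print(as_numpy)
```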

## 12. Exercises

Below are solutions to the exercises from the provided notebook. Each solution uses only PyTorch/NumPy operations, avoiding loops and conditionals.

### Ex 1: Create a 10x10 Tensor with Numbers 1 to 100

Create a tensor representing a table with numbers 1 to 100 in a 10x10 grid.

In [None]:
# Solution
table_tensor = torch.arange(1, 101).view(10, 10)
print(f"10x10 Table: \n{table_tensor}")

**Explanation**: `torch.arange(1, 101)` creates a 1D tensor with numbers 1 to 100. `view(10, 10)` reshapes it into a 10x10 matrix, filling row-wise.

### Ex 2: Generate a Linear Classification Dataset

Generate a dataset with 10 samples, 3 features drawn from a uniform distribution (-0.5, 0.5), and binary labels based on a linear combination.

In [None]:
# Solution
X = torch.rand(10, 3) - 0.5  # Uniform on [-0.5, 0.5)
a = torch.tensor([1.0, 2.0, -1.0])  # Coefficients a1, a2, a3
a0 = 0.1  # Bias
y = (torch.matmul(X, a) + a0 > 0).float()
print(f"Features X: \n{X}")
print(f"Labels y: {y}")

**Explanation**: `torch.rand(10, 3)` samples from [0, 1), so subtracting 0.5 gives features in [-0.5, 0.5). `torch.matmul(X, a) + a0` computes the linear combination, `> 0` produces boolean labels, and `.float()` converts them to 0.0 or 1.0.

### Ex 3: Classification Output

Convert classification probabilities to predicted classes and their probabilities.

In [None]:
# Given probabilities
probs = torch.softmax(torch.rand(10, 3), dim=1)

# Solution
predicted_classes = torch.argmax(probs, dim=1)
predicted_probs = torch.max(probs, dim=1).values
print(f"Probabilities: \n{probs}")
print(f"Predicted Classes: {predicted_classes}")
print(f"Predicted Probabilities: {predicted_probs}")

**Explanation**: `torch.argmax(probs, dim=1)` finds the index of the maximum probability per sample (predicted class). `torch.max(probs, dim=1).values` gets the corresponding probabilities.
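
As a variation on the solution, `torch.max` with a `dim` argument returns both the values and their indices, so one call can replace the two above. The seed below is arbitrary and only makes the sketch repeatable.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, for repeatability only
probs = torch.softmax(torch.rand(10, 3), dim=1)

# torch.max along a dimension returns (values, indices)
predicted_probs, predicted_classes = torch.max(probs, dim=1)

# The indices agree with torch.argmax
same_classes = torch.equal(predicted_classes, torch.argmax(probs, dim=1))
print(same_classes)
```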

### Ex 4: Concat Datasets (Vertically)

Combine two parts of a dataset (X1, y1) and (X2, y2) vertically.

In [None]:
# Given
X1 = torch.rand(10, 5)
y1 = torch.rand(10)
X2 = torch.rand(15, 5)
y2 = torch.rand(15)

# Solution
X_combined = torch.cat((X1, X2), dim=0)
y_combined = torch.cat((y1, y2), dim=0)
print(f"Combined X Shape: {X_combined.shape}")
print(f"Combined y Shape: {y_combined.shape}")

**Explanation**: `torch.cat((X1, X2), dim=0)` concatenates along the first dimension (rows), combining 10 and 15 samples into 25. Similarly for `y1` and `y2`.

### Ex 5: Concat Datasets (Horizontally)

Combine two feature sets X1 and X2 horizontally.

In [None]:
# Given
X1 = torch.rand(10, 5)
X2 = torch.rand(10, 3)

# Solution
X_combined = torch.cat((X1, X2), dim=1)
print(f"Combined X Shape: {X_combined.shape}")

**Explanation**: `torch.cat((X1, X2), dim=1)` concatenates along the second dimension (columns), combining 5 and 3 features into 8.

### Ex 6: Implementing Softmax

Implement the softmax function for a tensor of shape (n_samples, n_logits).

In [None]:
# Given
n_samples = 10
n_logits = 7
X = torch.rand(n_samples, n_logits)

# Solution
exp_X = torch.exp(X)
sum_exp_X = exp_X.sum(dim=1, keepdim=True)
softmax_X = exp_X / sum_exp_X
print(f"Input: \n{X}")
print(f"Softmax Output: \n{softmax_X}")
print(f"Sum per sample: {softmax_X.sum(dim=1)}")

**Explanation**: Compute `exp(X)` for each logit, sum across logits per sample (`dim=1`), and divide to normalize. `keepdim=True` ensures broadcasting works correctly.
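
The solution above works for small logits, but `exp()` overflows in `float32` for large ones. A common remedy, which is not part of the original solution, is to subtract each row's maximum before exponentiating. The sketch below uses hand-picked logits to show the difference.

```python
import torch

# Hand-picked logits: the first row is large enough to overflow exp()
X = torch.tensor([[1000., 1001., 1002.],
                  [1., 2., 3.]])

# Naive softmax: exp(1000.) is inf in float32, and inf / inf is nan
naive = torch.exp(X) / torch.exp(X).sum(dim=1, keepdim=True)
print(torch.isnan(naive[0]).all())

# Stable softmax: shift each row so its maximum is 0
shifted = X - X.max(dim=1, keepdim=True).values
stable = torch.exp(shifted) / torch.exp(shifted).sum(dim=1, keepdim=True)
print(stable)

# The result matches PyTorch's built-in softmax
matches_builtin = torch.allclose(stable, torch.softmax(X, dim=1))
print(matches_builtin)
```

Both rows give the same output because softmax depends only on the differences between logits.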

### Ex 7: Multiplication Table

Create a 10x10 multiplication table for numbers 1 to 10 using matrix multiplication.

In [None]:
# Solution
numbers = torch.arange(1, 11, dtype=torch.float32)
mul_table = torch.matmul(numbers.view(-1, 1), numbers.view(1, -1))
print(f"Multiplication Table: \n{mul_table.int()}")

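**Explanation**: Reshape `numbers` to column (10, 1) and row (1, 10) vectors. Matrix multiplication produces a 10x10 table whose entry in row `i`, column `j` (counting from 1) is `i * j`. With zero-based indexing, `mul_table[i, j] = (i + 1) * (j + 1)`.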
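
The exercise asks for matrix multiplication, but the same table can also be built with broadcasting. This alternative sketch keeps integer dtype throughout and checks that both approaches agree. The names `table_broadcast` and `table_matmul` are illustrative.

```python
import torch

numbers = torch.arange(1, 11)

# Broadcasting: (10, 1) * (1, 10) expands to (10, 10)
table_broadcast = numbers.view(-1, 1) * numbers.view(1, -1)

# Matrix multiplication on float32, as in the solution above
numbers_f = numbers.to(torch.float32)
table_matmul = torch.matmul(numbers_f.view(-1, 1), numbers_f.view(1, -1))

tables_match = torch.equal(table_broadcast, table_matmul.long())
print(tables_match)

# Zero-based index [2, 3] holds 3 * 4
print(table_broadcast[2, 3].item())   # 12
```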

### Ex 8: Subtract Consecutive Rows in Multiplication Table

Subtract the previous row from each row, leaving the first row unchanged.

In [None]:
# Given
mul_table = torch.arange(1, 26).view(5, 5)  # 5x5 stand-in table; any 2D tensor works

# Solution
diff_table = mul_table.clone()
diff_table[1:] = mul_table[1:] - mul_table[:-1]
print(f"Original Table: \n{mul_table}")
print(f"Difference Table: \n{diff_table}")

**Explanation**: Clone the table to avoid modifying the original. For rows 1 to end, subtract the previous rows (0 to end-1). The first row remains unchanged.
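
An alternative sketch, assuming a PyTorch version that provides `torch.diff`: compute the row differences with `torch.diff` and put the unchanged first row back on top with `torch.cat`.

```python
import torch

mul_table = torch.arange(1, 26).view(5, 5)

# Slicing approach from the solution above
diff_table = mul_table.clone()
diff_table[1:] = mul_table[1:] - mul_table[:-1]

# torch.diff along dim=0 computes row[i + 1] - row[i]
row_diffs = torch.diff(mul_table, dim=0)
diff_alt = torch.cat((mul_table[:1], row_diffs), dim=0)

approaches_match = torch.equal(diff_table, diff_alt)
print(approaches_match)
print(diff_alt)
```

For this particular table every difference is 5, because consecutive rows are offset by the row length.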

### Ex 9: Broadcasting with a Mask

Zero out predictions of invalid samples using a mask and compute the average predicted probability per class for valid samples.

In [None]:
# Given
predictions = torch.rand(5, 4)
mask = torch.tensor([1, 0, 1, 1, 0], dtype=torch.float32)  # 1 for valid, 0 for invalid

# Solution
masked_predictions = predictions * mask.view(-1, 1)  # Broadcasting mask
valid_count = mask.sum()
avg_probs = masked_predictions.sum(dim=0) / valid_count
print(f"Predictions: \n{predictions}")
print(f"Masked Predictions: \n{masked_predictions}")
print(f"Average Probabilities per Class: {avg_probs}")

**Explanation**: Reshape `mask` to (5, 1) and multiply with `predictions` to zero out invalid samples. Sum along `dim=0` and divide by the number of valid samples (`mask.sum()`).
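
The average can be cross-checked with boolean indexing, which selects only the valid rows instead of zeroing the invalid ones. This is a verification sketch rather than a replacement for the broadcasting solution. The seed is arbitrary.

```python
import torch

torch.manual_seed(0)  # arbitrary seed, for repeatability only
predictions = torch.rand(5, 4)
mask = torch.tensor([1, 0, 1, 1, 0], dtype=torch.float32)

# Broadcasting approach from the solution above
avg_masked = (predictions * mask.view(-1, 1)).sum(dim=0) / mask.sum()

# Boolean indexing keeps only the rows where the mask is 1
valid_rows = predictions[mask.bool()]
avg_indexed = valid_rows.mean(dim=0)

averages_match = torch.allclose(avg_masked, avg_indexed)
print(valid_rows.shape)
print(averages_match)
```

Boolean indexing changes the number of rows, while the broadcasting version keeps the original (5, 4) shape, which can matter when later steps expect a fixed batch size.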

## Final Notes

- Always check `shape`, `dtype`, and `device` when debugging.
- Use device-agnostic code (`to(device)`) for flexibility.
- Practice these operations to build intuition for tensor manipulations.

For further help, refer to:
- [PyTorch Documentation](https://pytorch.org/docs/stable/)
- [PyTorch Forums](https://discuss.pytorch.org/)
- [Course GitHub](https://github.com/mrdbourke/pytorch-deep-learning)

Good luck in your class!