# 👩‍💻 Tensor Operations, Gradients, and GPU Practice

## 📋 Overview
In this lab, you will gain hands-on experience with fundamental PyTorch components that form the backbone of deep learning workflows. You'll create and manipulate tensors, compute gradients automatically, and leverage GPU acceleration to speed up your computations. By the end of this lab, you'll understand how these elements work together to enable efficient deep learning model development and training.

## 🎯 Learning Outcomes
By the end of this lab, you will be able to:

- Create and manipulate tensors using various PyTorch operations
- Compute gradients automatically using PyTorch's autograd functionality
- Move computations between CPU and GPU to observe performance differences
- Implement basic tensor reshaping and broadcasting techniques

## 🚀 Starting Point
Access the starter code:

```python
import torch
import time

# Check if GPU is available
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
print(f"Using device: {device}")
```

Required tools/setup:

- Python 3.6 or later
- PyTorch installed (1.7 or later recommended)
- Access to a GPU is beneficial but not required

## Task 1: Create Tensors and Perform Matrix Operations
**Context:** In deep learning, you frequently need to create matrices and vectors to represent data and model parameters. Matrix operations form the foundation of neural network computations.

**Steps:**

1. Create a 3x3 matrix tensor using `torch.tensor()`
    - Use a nested list structure to define your matrix values
    - Use floating-point values (for example `1.0` rather than `1`) so both tensors share a dtype; `torch.mm()` raises an error when the dtypes of its operands differ

2. Create a 3x1 vector tensor

    - Make sure the dimensions are compatible for matrix multiplication

3. Perform matrix multiplication using `torch.mm()`

    - Why does `torch.mm()` require the inner dimensions to match? What happens if they don't?
    - How is this different from element-wise multiplication?

4. Print the result and verify the shape is as expected (should be 3x1)

```python
# Your code for creating matrices and performing operations
# [CODE GOES HERE]
```

**⚙️ Test Your Work:**

- Verify the shape of your result matches what you expect
- Try multiplying your matrices in reverse order - what happens and why?

**💡 Tip:** Remember that matrix multiplication is not commutative - AB ≠ BA in most cases, and for non-square operands the reversed product may not even be defined!
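
The difference between matrix multiplication and element-wise multiplication can be sketched with two small 2x2 matrices. The names `a` and `b` and their values are illustrative only and are not part of the starter code.

```python
import torch

# Two small square matrices (illustrative values)
a = torch.tensor([[1.0, 2.0],
                  [3.0, 4.0]])
b = torch.tensor([[0.0, 1.0],
                  [1.0, 0.0]])

matmul_ab = torch.mm(a, b)   # rows of a dotted with columns of b
matmul_ba = torch.mm(b, a)   # same operands, opposite order
elementwise = a * b          # multiplies matching positions only

print(matmul_ab)                           # [[2., 1.], [4., 3.]]
print(matmul_ba)                           # [[3., 4.], [1., 2.]]
print(elementwise)                         # [[0., 2.], [3., 0.]]
print(torch.equal(matmul_ab, matmul_ba))   # False
```

Square matrices are used here so that both orders are defined; with the 3x3 matrix and 3x1 vector from this task, only one order has compatible shapes.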

## Task 2: Reshape and Broadcast Tensors
**Context:** When working with neural networks, you'll often need to reshape data or perform operations between tensors of different shapes. PyTorch's broadcasting makes this easier.

**Steps:**

1. Create a 1D tensor with 6 elements using `torch.arange()`

    - What does `torch.arange()` do and how is it different from `torch.tensor()`?

2. Reshape this 1D tensor into a 2x3 2D tensor using `.view()`

    - How does `.view()` compare to `.reshape()`? When might you use one over the other?

3. Create another 2x1 tensor with values `[[1], [2]]`

4. Add these tensors together using broadcasting

    - Think about: How does PyTorch determine which dimensions to broadcast?
    - Try to predict the output shape before running your code
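
One way to explore the `.view()` versus `.reshape()` question from step 2 is to try both on a tensor that is not contiguous in memory. This is a minimal sketch with illustrative variable names; it uses a transpose to create the non-contiguous case.

```python
import torch

base = torch.arange(6)        # tensor([0, 1, 2, 3, 4, 5])
grid = base.view(2, 3)        # shares storage with base
transposed = grid.t()         # shape (3, 2), no longer contiguous

try:
    transposed.view(6)        # view needs a compatible memory layout
    view_failed = False
except RuntimeError:
    view_failed = True

flat = transposed.reshape(6)  # reshape copies when a view is impossible

print(grid.data_ptr() == base.data_ptr())  # True: no data was copied
print(view_failed)                         # True
print(flat)                                # tensor([0, 3, 1, 4, 2, 5])
```

In short, `.view()` never copies and fails when the layout does not allow it, while `.reshape()` returns a view when it can and a copy otherwise.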

```python
# Your code for reshaping and broadcasting tensors
# [CODE GOES HERE]
```

**⚙️ Test Your Work:**

- Verify that the resulting tensor has the shape 2x3
- Confirm the values match what you expected from broadcasting

**💡 Tip:** Broadcasting automatically expands smaller tensors to match the shape of larger ones without making copies of data!
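
The "no copies" claim in the tip can be checked with `expand()`, which performs the same stretching explicitly. The sketch below uses different tensors from the task (`column` and `row` are illustrative names) so that it does not give away the exercise.

```python
import torch

column = torch.tensor([[10], [20]])   # shape (2, 1)
row = torch.tensor([1, 2, 3])         # shape (3,)

# Shapes are compared from the last dimension backwards:
# (2, 1) and (3,) are treated as (2, 1) and (1, 3), giving a (2, 3) result
total = column + row
print(total.shape)   # torch.Size([2, 3])
print(total)         # [[11, 12, 13], [21, 22, 23]]

# expand() stretches the size-1 dimension by giving it stride 0
stretched = column.expand(2, 3)
print(stretched.stride())                        # (1, 0)
print(stretched.data_ptr() == column.data_ptr()) # True: same memory
```

A stride of 0 means every step along that dimension rereads the same element, which is how a size-1 dimension can appear larger without duplicating data.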

## Task 3: Move Tensors Between CPU and GPU
**Context:** Modern deep learning relies on GPU acceleration to process large datasets efficiently. Understanding how to move data between devices is essential for optimizing performance.

**Steps:**

1. Create a large tensor (1000x1000) on the CPU using `torch.rand()`

    - How does `torch.rand()` differ from `torch.randn()`?

2. Time how long it takes to square all values in this tensor on the CPU

    - Use the `time` module to measure execution time
    - Why use element-wise operations for testing performance differences?

3. Move your tensor to the GPU using `.to(device)`

    - What happens if you try to move to GPU on a system without one?

4. Time the same square operation on the GPU

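    - Place the `time.time()` calls before and after the operation, and call `torch.cuda.synchronize()` before reading the end time, because CUDA kernels are launched asynchronously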

5. Compare and print the execution times

    - What factors might affect the performance difference you observe?
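
For the `torch.rand()` versus `torch.randn()` question in step 1, a quick look at summary statistics makes the difference visible. This is a small sketch; the sample size and seed are arbitrary choices.

```python
import torch

torch.manual_seed(0)  # fixed seed so repeated runs draw the same numbers

uniform = torch.rand(10000)    # uniform on [0, 1)
normal = torch.randn(10000)    # standard normal: mean 0, standard deviation 1

print(f"rand:  min={uniform.min().item():.3f}, max={uniform.max().item():.3f}")
print(f"randn: mean={normal.mean().item():.3f}, std={normal.std().item():.3f}")
print(f"randn has negative values: {(normal < 0).any().item()}")
```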

```python
# Your code for comparing CPU and GPU performance
# [CODE GOES HERE]
```

**⚙️ Test Your Work:**

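- If a GPU is available, the GPU operation is usually faster for a tensor of this size, although the first CUDA call includes one-time startup overhead that can distort a single measurement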
- Try with different tensor sizes to see how the performance gap changes

**💡 Tip:** For small tensors, the overhead of moving data to the GPU might outweigh the performance benefit!
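
To see how tensor size changes the picture, the measurement can be wrapped in a small helper and repeated for several sizes. `time_square` is a hypothetical helper written for this sketch, not part of PyTorch or the starter code; the warm-up run and the number of repeats are assumptions about what makes a fair comparison.

```python
import time

import torch


def time_square(tensor, repeats=10):
    """Return the average seconds per `tensor ** 2` call (hypothetical helper)."""
    on_gpu = tensor.is_cuda
    _ = tensor ** 2                    # warm-up run, excluded from the timing
    if on_gpu:
        torch.cuda.synchronize()       # finish pending GPU work before starting
    start = time.time()
    for _ in range(repeats):
        _ = tensor ** 2
    if on_gpu:
        torch.cuda.synchronize()       # wait for the queued kernels to finish
    return (time.time() - start) / repeats


for size in (10, 100, 1000):
    cpu_tensor = torch.rand(size, size)
    line = f"{size}x{size}: CPU {time_square(cpu_tensor):.6f}s"
    if torch.cuda.is_available():
        gpu_tensor = cpu_tensor.to("cuda")
        line += f", GPU {time_square(gpu_tensor):.6f}s"
    print(line)
```

The timings exclude the host-to-device copy, so they measure only the computation; including the `.to("cuda")` call inside the timed region would show the transfer overhead the tip refers to.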

## Task 4: Define Custom Functions and Utilize Autograd
**Context:** The ability to automatically compute gradients is essential for training neural networks. PyTorch's autograd system handles this complex task for you.

**Steps:**

1. Create a tensor with `requires_grad=True` to enable gradient tracking

    - Why do we need to set this flag for gradient computation?

2. Define a custom mathematical function (x³ + 2x²)

    - Consider how PyTorch builds a computation graph for this operation

3. Apply this function to your tensor

4. Call `.backward()` on the result to compute gradients

    - What does calling `.backward()` do in PyTorch?
    - How is this related to backpropagation in neural networks?

5. Print the gradient values stored in the `.grad` attribute of your input tensor

    - Check if the values match what you would expect from manual differentiation
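
The effect of `requires_grad=True` on the computation graph can be inspected directly through a tensor's `grad_fn` attribute. The sketch below uses a single scalar input with an arbitrary value and compares a tracked tensor with an untracked one.

```python
import torch

x = torch.tensor(1.5, requires_grad=True)   # tracked leaf tensor
y = x ** 3 + 2 * x ** 2

print(x.is_leaf, x.grad_fn)   # True None: leaves are not produced by an operation
print(y.requires_grad)        # True: inherited from x
print(y.grad_fn)              # the graph node that produced y

plain = torch.tensor(1.5)     # no gradient tracking requested
z = plain ** 3 + 2 * plain ** 2
print(z.requires_grad, z.grad_fn)   # False None: no graph was recorded
```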

```python
# Your code for gradient computation
# [CODE GOES HERE]
```

**⚙️ Test Your Work:**

- Verify that the gradient values match the derivative of our function (3x² + 4x)
- Try with different input values to ensure the gradient calculation is correct

**💡 Tip:** Understanding autograd is crucial for debugging training issues in deep learning!
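
One autograd behavior that often surprises people when debugging is that gradients accumulate across `.backward()` calls instead of being overwritten. A minimal sketch with a single scalar input (the value 2.0 is arbitrary):

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

y = x ** 3 + 2 * x ** 2
y.backward()
first = x.grad.item()       # 3 * 2**2 + 4 * 2 = 20

y = x ** 3 + 2 * x ** 2     # rebuild the graph for a second pass
y.backward()
second = x.grad.item()      # added to the previous value: 40

x.grad.zero_()              # reset before the next backward pass
print(first, second, x.grad.item())   # 20.0 40.0 0.0
```

This is why training loops typically clear gradients before each backward pass.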

## ✅ Success Checklist
- Successfully created tensors and performed matrix multiplication
- Correctly reshaped tensors and utilized broadcasting
- Moved computations between CPU and GPU and observed performance differences
- Defined a custom function and computed gradients using autograd
- Program runs without errors

## 🔍 Common Issues & Solutions
**Problem:** Dimension mismatch in matrix operations
**Solution:** Double-check tensor shapes before operations and use `.shape` to debug

**Problem:** CUDA out of memory error
**Solution:** Reduce tensor size or batch size when working with GPU

**Problem:** Gradients aren't being computed
**Solution:** Ensure you've set `requires_grad=True` and that you're calling `.backward()` on a scalar tensor
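
The scalar requirement can be reproduced in a few lines. This sketch shows the error raised for a non-scalar output and two common ways around it; the input values are arbitrary.

```python
import torch

x = torch.tensor([1.0, 2.0], requires_grad=True)

y = x ** 3 + 2 * x ** 2            # shape (2,), not a scalar
try:
    y.backward()                   # no implicit gradient for non-scalar outputs
    raised = False
except RuntimeError as err:
    raised = True
    print(f"backward() failed: {err}")

# Option 1: reduce to a scalar first
y = x ** 3 + 2 * x ** 2
y.sum().backward()
grad_from_sum = x.grad.clone()

# Option 2: pass an explicit gradient with the same shape as the output
x.grad.zero_()
y = x ** 3 + 2 * x ** 2
y.backward(gradient=torch.ones_like(y))
grad_from_ones = x.grad.clone()

print(grad_from_sum)    # tensor([ 7., 20.])
print(grad_from_ones)   # tensor([ 7., 20.])
```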

## 🔑 Key Points
- Tensors are the fundamental data structure in PyTorch, similar to NumPy arrays but with GPU support
- Broadcasting eliminates the need for explicit tensor resizing in many operations
- GPUs can significantly accelerate tensor operations, especially for large tensors
- Autograd automatically computes gradients by tracking operations in a dynamic computation graph

## 💻 Reference Solution

<details>    
<summary><strong>Click HERE to see a reference solution</strong></summary>    

```python
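# Setup (repeated from the starter cell so the solution runs on its own)
import time

import torch

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

# Task 1: Create Tensors and Perform Matrix Operations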
matrix_3x3 = torch.tensor([[1.0, 2.0, 3.0], 
                           [4.0, 5.0, 6.0], 
                           [7.0, 8.0, 9.0]])
vector_3x1 = torch.tensor([[1.0], [2.0], [3.0]])

# Perform matrix multiplication
result = torch.mm(matrix_3x3, vector_3x1)
print(f"Matrix multiplication result:\n{result}")
print(f"Result shape: {result.shape}")

# Try reverse multiplication
try:
    reverse_result = torch.mm(vector_3x1, matrix_3x3)
except RuntimeError as e:
    print(f"Error in reverse multiplication: {e}")

# Task 2: Reshape and Broadcast Tensors
tensor1D = torch.arange(6)
print(f"Original 1D tensor: {tensor1D}")

reshaped_tensor = tensor1D.view(2, 3)
print(f"Reshaped 2x3 tensor:\n{reshaped_tensor}")

tensor2D = torch.tensor([[1], [2]])
print(f"Second tensor shape: {tensor2D.shape}")

# Broadcasting example
broadcasted_sum = reshaped_tensor + tensor2D
print(f"Result after broadcasting:\n{broadcasted_sum}")

# Task 3: Move Tensors Between CPU and GPU
# Create a large tensor
large_tensor = torch.rand((1000, 1000))

# Time the operation on CPU
start_cpu = time.time()
result_cpu = large_tensor ** 2
end_cpu = time.time()
cpu_time = end_cpu - start_cpu

# Move to GPU if available
if torch.cuda.is_available():
    large_tensor_gpu = large_tensor.to(device)

    # Warm-up run so one-time CUDA startup cost is not included in the timing
    _ = large_tensor_gpu ** 2
    torch.cuda.synchronize()

    # Time the operation on GPU
    start_gpu = time.time()
    result_gpu = large_tensor_gpu ** 2
    # CUDA kernels run asynchronously, so wait for completion before stopping the timer
    torch.cuda.synchronize()
    end_gpu = time.time()
    gpu_time = end_gpu - start_gpu

    print(f"CPU time: {cpu_time:.6f} seconds")
    print(f"GPU time: {gpu_time:.6f} seconds")
    print(f"Speedup: {cpu_time/gpu_time:.2f}x")
else:
    print(f"CPU time: {cpu_time:.6f} seconds")
    print("GPU not available for comparison")

# Task 4: Define Custom Functions and Utilize Autograd
# Create tensor with gradient tracking
x_tensor = torch.tensor([2.0, 3.0, 4.0], requires_grad=True)

# Define and apply custom function
def custom_function(x):
    return x**3 + 2*x**2

y_tensor = custom_function(x_tensor)
print(f"Function output: {y_tensor}")

# Sum to get scalar for backward function
y_sum = y_tensor.sum()

# Compute gradients
y_sum.backward()

# Print gradients (detach so the comparison values are not tracked by autograd)
x_values = x_tensor.detach()
print(f"Input values: {x_values}")
print(f"Computed gradients: {x_tensor.grad}")
print(f"Expected gradients (3x² + 4x): {3*x_values**2 + 4*x_values}")
```

</details>