# 🟢 Lesson 2: Tensors and Devices

Before generating art, we must speak the language of the machine: **PyTorch Tensors**.

### Goals:
1.  Understand what a **Tensor** is.
2.  Check for **GPU Acceleration** (DirectML/CUDA).
3.  Understand precision types (`float32` vs `float16`).

In [None]:
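# 1. Setup
# setup_notebook() (from the course's notebook_utils helper) returns the project root,
# the active compute device, and the default dtype.
import notebook_utils
project_root, device, dtype = notebook_utils.setup_notebook()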

import torch

## 1. What is a Tensor?

A tensor is a multi-dimensional array of numbers that can live in ordinary CPU memory or on a graphics card (GPU).

- **Scalar**: `5` (0 dimensions)
- **Vector**: `[1, 2, 3]` (1 dimension)
- **Matrix**: `[[1, 2], [3, 4]]` (2 dimensions)
- **Image Tensor**: `(Color Channels, Height, Width)` (3 dimensions; PyTorch layers expect channels first)
- **Batch of Images**: `(Batch Size, Color Channels, Height, Width)` (4 dimensions)
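
To make the dimension counts concrete, here is a minimal sketch that builds one tensor per kind and prints its `ndim` and `shape`. The sizes (a 64x64 RGB image, a batch of 8) are arbitrary values chosen for illustration, and the image tensors use the channels-first layout that PyTorch's convolution layers expect.

```python
import torch

# One example tensor per kind (sizes chosen only for illustration).
scalar = torch.tensor(5)                   # 0 dimensions
vector = torch.tensor([1, 2, 3])           # 1 dimension
matrix = torch.tensor([[1, 2], [3, 4]])    # 2 dimensions
image = torch.rand(3, 64, 64)              # channels-first: (C, H, W)
batch = torch.rand(8, 3, 64, 64)           # (N, C, H, W)

for name, t in [("scalar", scalar), ("vector", vector), ("matrix", matrix),
                ("image", image), ("batch", batch)]:
    print(f"{name:>6}: ndim={t.ndim}, shape={tuple(t.shape)}")
```

The `ndim` attribute counts dimensions, while `shape` gives the size along each one.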

In [None]:
# Create a random tensor
x = torch.rand(3, 3)
print("Random 3x3 Matrix:")
print(x)

## 2. Moving to the GPU

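Deep learning involves billions of multiplications. A CPU has a handful of powerful cores that excel at complex, sequential work but handle only a few operations at a time. A GPU has thousands of simpler cores that apply the same arithmetic to many numbers at once, which is exactly what tensor math needs.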

We use the `torch-directml` package to access your AMD GPU through DirectML. NVIDIA GPUs use PyTorch's CUDA backend instead.
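
As a rough sketch of how a device check might look, the snippet below tries DirectML first, then CUDA, and finally falls back to the CPU. The function name `pick_device` is hypothetical and the priority order is an assumption; how `notebook_utils.setup_notebook()` actually chooses its device is not shown in this lesson.

```python
import torch

def pick_device():
    """Hypothetical helper: prefer DirectML, then CUDA, then CPU."""
    try:
        # Only importable when the torch-directml package is installed.
        import torch_directml
        return torch_directml.device()
    except ImportError:
        pass
    if torch.cuda.is_available():
        return torch.device("cuda")
    return torch.device("cpu")

chosen = pick_device()
probe = torch.ones(2, 2).to(chosen)  # small test tensor to confirm the device works
print(f"Selected device: {chosen}; probe tensor lives on: {probe.device}")
```

Because the function always ends with a CPU fallback, the same notebook can run, slowly, on a machine with no supported GPU.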

In [None]:
print(f"Current Active Device: {device}")

# Copy the tensor to the active device.
# .to() returns a new tensor; the original x stays where it was.
x_gpu = x.to(device)
print(f"Tensor is now on: {x_gpu.device}")

## 3. Precision (FP32 vs FP16)

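- **FP32 (Single Precision)**: The standard for numerical work. Uses 4 bytes per number and keeps about 7 significant decimal digits.
- **FP16 (Half Precision)**: Uses 2 bytes per number and keeps about 3 significant decimal digits. Less accurate, but often **up to 2x faster** on GPUs with fast half-precision support, and it **uses 50% less VRAM** for the same tensors.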

Stable Diffusion is robust enough to run comfortably on FP16.
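
The memory claim can be checked directly. This sketch allocates the same number of values in both formats and compares their sizes, then prints each format's limits with `torch.finfo`. The count of one million values is a hypothetical figure for illustration, not the size of any real model.

```python
import torch

n = 1_000_000  # hypothetical number of values, for illustration only
w32 = torch.zeros(n, dtype=torch.float32)
w16 = w32.to(torch.float16)

# Memory used = number of elements * bytes per element.
bytes32 = w32.numel() * w32.element_size()
bytes16 = w16.numel() * w16.element_size()
print(f"FP32: {bytes32 / 1e6:.1f} MB, FP16: {bytes16 / 1e6:.1f} MB")

# finfo reports the numeric limits of each floating-point format.
for dt in (torch.float32, torch.float16):
    info = torch.finfo(dt)
    print(f"{dt}: eps={info.eps}, max={info.max}")
```

Note that the largest finite FP16 value is 65504, so very large intermediate values can overflow to infinity in half precision. That limited range is part of the trade-off alongside the reduced number of digits.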

In [None]:
# FP32
a_32 = torch.tensor([1.5555555], dtype=torch.float32)
print(f"FP32: {a_32.item()}")

# FP16
a_16 = a_32.to(torch.float16)
print(f"FP16: {a_16.item()} (Notice the loss of precision!)")

## Next Steps
Now that we know how to move data to the GPU, let's load the actual AI models in Lesson 3!