# PyTorch Basics

## Objective
Understand PyTorch fundamentals by:
- Working with tensors
- Understanding autograd
- Running computations on CPU/GPU

## Why this matters
PyTorch works at a lower level than scikit-learn: instead of calling `fit()` on a ready-made estimator, you write the training loop yourself.
That requires a working understanding of tensors and gradients.

## Typical PyTorch Workflow

1. Prepare data (tensors)
2. Define model (nn.Module)
3. Define loss function
4. Define optimizer
5. Training loop (forward → loss → backward → update)
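
The five steps above can be sketched end to end on a toy problem. This is a minimal illustration, not a recommended recipe: the synthetic data (`y = 2x + 1`), the single `nn.Linear` layer, the learning rate, and the epoch count are all assumptions chosen to keep the example short.

```python
import torch
from torch import nn

torch.manual_seed(0)  # make the sketch reproducible

# 1. Prepare data: synthetic points on the line y = 2x + 1 (made up for illustration)
X = torch.linspace(-1, 1, 64).unsqueeze(1)  # shape (64, 1)
y = 2 * X + 1

# 2. Define model: a single linear layer, which is itself an nn.Module
model = nn.Linear(1, 1)

# 3. Define loss function
loss_fn = nn.MSELoss()

# 4. Define optimizer
optimizer = torch.optim.SGD(model.parameters(), lr=0.1)

# 5. Training loop: forward -> loss -> backward -> update
losses = []
for epoch in range(200):
    pred = model(X)             # forward pass
    loss = loss_fn(pred, y)     # compute loss
    optimizer.zero_grad()       # clear gradients left over from the previous step
    loss.backward()             # backward pass: fill .grad on each parameter
    optimizer.step()            # update parameters
    losses.append(loss.item())

print(f"first loss: {losses[0]:.4f}, last loss: {losses[-1]:.6f}")
print("learned weight and bias:", model.weight.item(), model.bias.item())
```

The loss should fall steadily, and the learned weight and bias should approach 2 and 1.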

In [1]:
import torch

torch.__version__


'2.4.1+cpu'

In [None]:
# Create tensors
# Tensors behave much like NumPy arrays, with added GPU support and gradient tracking
a = torch.tensor([1, 2, 3])
b = torch.tensor([4, 5, 6])

print("a:", a)
print("b:", b)
print("a + b:", a + b)

a: tensor([1, 2, 3])
b: tensor([4, 5, 6])
a + b: tensor([5, 7, 9])


In [3]:
x = torch.randn(3, 4)

print("Shape:", x.shape)
print("Data type:", x.dtype)

Shape: torch.Size([3, 4])
Data type: torch.float32
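
The `dtype` matters because PyTorch infers it from the input data. The short sketch below assumes the default dtype has not been changed; it shows how integer and float inputs differ and how to convert between them. The variable names are illustrative only.

```python
import torch

ints = torch.tensor([1, 2, 3])       # Python ints -> torch.int64
floats = torch.tensor([1.0, 2.0])    # Python floats -> torch.float32 (the default)
as_float = ints.to(torch.float32)    # explicit conversion returns a new tensor
mixed = ints + torch.tensor([0.5, 0.5, 0.5])  # int64 + float32 promotes to float32

print(ints.dtype, floats.dtype, as_float.dtype, mixed.dtype)
# torch.int64 torch.float32 torch.float32 torch.float32
```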


In [5]:
# Matrix multiplication: (2, 3) x (3, 4) -> (2, 4); the inner dimensions must match
m1 = torch.randn(2, 3)
m2 = torch.randn(3, 4)

result = torch.matmul(m1, m2)
result.shape

torch.Size([2, 4])
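
As a small illustration of the shape rule, the sketch below shows that the `@` operator is equivalent to `torch.matmul`, and that swapping the operands fails because the inner dimensions no longer match. The flag names `same` and `mismatch_raised` are hypothetical, used only for this example.

```python
import torch

m1 = torch.randn(2, 3)
m2 = torch.randn(3, 4)

# The @ operator is shorthand for torch.matmul
same = torch.allclose(m1 @ m2, torch.matmul(m1, m2))

# (3, 4) x (2, 3) is invalid: inner dimensions 4 and 2 differ
try:
    torch.matmul(m2, m1)
    mismatch_raised = False
except RuntimeError:
    mismatch_raised = True

print("same result:", same)
print("shape mismatch raised an error:", mismatch_raised)
```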

## Automatic Differentiation (Autograd)

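PyTorch records the operations applied to tensors created with `requires_grad=True`.
Calling `.backward()` on a scalar result walks that record in reverse and stores
the gradients in the `.grad` attribute of those tensors, ready for an optimizer to use.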

In [6]:
x = torch.tensor(2.0, requires_grad=True)  # track operations on x
y = x ** 2 + 3 * x + 1

y.backward()  # compute dy/dx and store it in x.grad

print("y:", y.item())
print("dy/dx:", x.grad)

y: 11.0
dy/dx: tensor(7.)
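
The value 7 can be checked by hand: for y = x² + 3x + 1 the derivative is dy/dx = 2x + 3, which equals 7 at x = 2. The sketch below compares the autograd result with that analytic expression; `analytic` and `matches` are names introduced only for this check.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)
y = x ** 2 + 3 * x + 1
y.backward()

# Analytic derivative dy/dx = 2x + 3, evaluated outside the autograd graph
analytic = 2 * x.detach() + 3
matches = torch.allclose(x.grad, analytic)

print("autograd:", x.grad.item(), "analytic:", analytic.item(), "match:", matches)
# autograd: 7.0 analytic: 7.0 match: True
```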


In [7]:
# Disable gradient tracking (e.g. for inference): z is computed without a gradient history
with torch.no_grad():
    z = x ** 2

z

tensor(4.)
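
To make the effect of `torch.no_grad()` visible, the following sketch computes the same expression with and without tracking and inspects the `requires_grad` and `grad_fn` attributes. The names `tracked` and `untracked` are illustrative.

```python
import torch

x = torch.tensor(2.0, requires_grad=True)

tracked = x ** 2             # recorded by autograd
with torch.no_grad():
    untracked = x ** 2       # same value, but not recorded

print(tracked.requires_grad, tracked.grad_fn is not None)  # True True
print(untracked.requires_grad, untracked.grad_fn)          # False None
```

Both tensors hold the value 4, but only `tracked` can be used in a later `.backward()` call.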

In [None]:
# Set device if GPU is available
device = "cuda" if torch.cuda.is_available() else "cpu"
device

'cpu'

In [11]:
# Move a tensor to the selected device
x = torch.randn(3, 3).to(device)
x.device

device(type='cpu')
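
A model and its inputs must live on the same device before they can be combined. The sketch below assumes a small `nn.Linear` layer as a stand-in model. It moves the model with `.to(device)` and creates the input directly on the device with the `device=` argument, which avoids a separate copy step.

```python
import torch
from torch import nn

device = torch.device("cuda" if torch.cuda.is_available() else "cpu")

model = nn.Linear(3, 2).to(device)        # move the model's parameters to the device
batch = torch.randn(5, 3, device=device)  # create the input directly on the device

out = model(batch)                        # works because both are on the same device
print(out.shape, out.device.type)
```

On a machine without a GPU this runs unchanged on the CPU.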