# Mastering PyTorch: Fundamental Concepts for Machine Learning
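This Google Colab notebook walks through PyTorch features from **basic to advanced**: tensors, autograd, a full training loop, custom autograd functions, and model serialization. These are the same building blocks used to train research-scale models.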

*Author: ChatGPT‑4o*  
*Last updated:* 2025-06-29 00:57 UTC


## Environment Check

In [None]:

import torch
print(f'PyTorch version: {torch.__version__}')
device = torch.device('cuda' if torch.cuda.is_available() else 'cpu')
print('Using device:', device)
if device.type == 'cuda':
    print(torch.cuda.get_device_name(0))


## 1. Tensor Essentials
Tensors are multi-dimensional arrays with a shape, a dtype, and a device. Model inputs, parameters, and outputs are all tensors.

In [None]:

# Creating tensors
import torch
scalar = torch.tensor(3.14)
vector = torch.tensor([1,2,3])
matrix = torch.randn((3,3))
tensor4d = torch.zeros((2,3,4,5))

print('Scalar:', scalar)
print('Vector:', vector)
print('Matrix shape:', matrix.shape)
print('4‑D Tensor shape:', tensor4d.shape)

# Tensor math
a = torch.randn((2,2))
b = torch.randn((2,2))
print('Matrix product:\n', a @ b)
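
The cell above creates tensors and multiplies two matrices. As a small supplementary sketch (the variable names here are illustrative and not part of the notebook's later cells), the following shows reshaping, broadcasting, and the difference between the elementwise product `*` and the matrix product `@`:

```python
import torch

# Reshape a 1-D range into a 2x3 matrix; view() shares storage with `flat`.
flat = torch.arange(6)
grid = flat.view(2, 3)

# Broadcasting: a (2, 3) tensor plus a (3,) tensor adds `row` to every row of `grid`.
row = torch.tensor([10, 20, 30])
shifted = grid + row

# Elementwise product (*) versus matrix product (@) on the same operands.
m1 = torch.tensor([[1., 2.], [3., 4.]])
m2 = torch.tensor([[0., 1.], [1., 0.]])
elementwise = m1 * m2
matmul = m1 @ m2

print(shifted)
print(elementwise)
print(matmul)
```

Multiplying by `m2` with `@` swaps the columns of `m1`, while `*` keeps only the off-diagonal entries, which makes the two operations easy to tell apart.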


### Device placement
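Use `.to(device)` to move a tensor between CPU and GPU. It returns a tensor on the target device, and all operands of an operation must be on the same device: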

In [None]:

matrix_gpu = matrix.to(device)
print('matrix_gpu device ->', matrix_gpu.device)
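
A short sketch of common device-handling patterns, assuming nothing beyond the device selection used in the environment check. The names `dev`, `moved`, and `direct` are illustrative only:

```python
import torch

# Same device selection as in the environment check.
dev = torch.device('cuda' if torch.cuda.is_available() else 'cpu')

cpu_tensor = torch.ones(2, 2)           # created on the CPU by default
moved = cpu_tensor.to(dev)              # a tensor on `dev`; cpu_tensor stays on the CPU
direct = torch.ones(2, 2, device=dev)   # allocated on `dev` directly, no CPU copy

# Operands must live on the same device, so combine `moved` and `direct`.
total = moved + direct

# Bring the result back to the CPU before converting to NumPy.
as_numpy = total.cpu().numpy()
print(total.device, as_numpy.sum())
```

Creating tensors with `device=` avoids an extra host-to-device copy, which can matter inside a training loop.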


## 2. Autograd: Automatic Differentiation

In [None]:

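# requires_grad=True tells autograd to record operations on x
x = torch.tensor([2.0, 3.0], requires_grad=True)
y = (x**2).sum()             # y = x0^2 + x1^2
y.backward()                 # fills x.grad with dy/dx = 2x
print('Gradients:', x.grad)  # tensor([4., 6.])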
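
Two autograd behaviours are worth seeing in isolation: gradients accumulate across `backward()` calls, and operations inside `torch.no_grad()` are not tracked. The sketch below uses a separate tensor `w` (an illustrative name) so it does not disturb the cell above:

```python
import torch

w = torch.tensor([2.0, 3.0], requires_grad=True)

# First backward pass: d/dw sum(w^2) = 2w.
(w ** 2).sum().backward()
first = w.grad.clone()

# A second backward pass adds to the existing .grad instead of replacing it.
(w ** 2).sum().backward()
accumulated = w.grad.clone()

# Clear the gradient before the next pass; optimizer.zero_grad() plays
# this role in a training loop.
w.grad.zero_()

# Operations under no_grad() are not recorded in the autograd graph.
with torch.no_grad():
    frozen = w * 2

print(first, accumulated, frozen.requires_grad)
```

This accumulation is why the training loop later in the notebook calls `optimizer.zero_grad()` at the start of every batch.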


## 3. Build & Train a Simple Neural Network
We'll create a small MLP to classify two‑moon data.

In [None]:

from sklearn.datasets import make_moons
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
import torch.nn as nn
import torch.optim as optim

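# Generate data; split first so the scaler is fit on training data only (avoids test-set leakage)
X, y = make_moons(n_samples=2000, noise=0.25, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, stratify=y, random_state=42)
scaler_X = StandardScaler().fit(X_train)
X_train, X_test = scaler_X.transform(X_train), scaler_X.transform(X_test)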

X_train = torch.tensor(X_train, dtype=torch.float32)
y_train = torch.tensor(y_train, dtype=torch.long)
X_test  = torch.tensor(X_test , dtype=torch.float32)
y_test  = torch.tensor(y_test , dtype=torch.long)

train_ds = torch.utils.data.TensorDataset(X_train, y_train)
test_ds  = torch.utils.data.TensorDataset(X_test , y_test )

train_loader = torch.utils.data.DataLoader(train_ds, batch_size=64, shuffle=True)
test_loader  = torch.utils.data.DataLoader(test_ds , batch_size=64)

class MLP(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(2, 32),
            nn.ReLU(),
            nn.Linear(32,16),
            nn.ReLU(),
            nn.Linear(16,2)
        )
    def forward(self, x):
        return self.net(x)

model = MLP().to(device)
criterion = nn.CrossEntropyLoss()
optimizer = optim.Adam(model.parameters(), lr=1e-2)
scheduler = optim.lr_scheduler.StepLR(optimizer, step_size=20, gamma=0.5)

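# Training loop with mixed precision (autocast and loss scaling are active only on CUDA).
# torch.amp.GradScaler replaces the deprecated torch.cuda.amp.GradScaler (needs PyTorch >= 2.3).
scaler = torch.amp.GradScaler('cuda', enabled=(device.type=='cuda'))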

for epoch in range(1,51):
    model.train()
    running_loss=0.0
    for xb,yb in train_loader:
        xb,yb = xb.to(device), yb.to(device)
        optimizer.zero_grad()
        with torch.autocast(device_type=device.type, enabled=(device.type=='cuda')):
            preds = model(xb)
            loss = criterion(preds,yb)
        scaler.scale(loss).backward()
        scaler.step(optimizer)
        scaler.update()
        running_loss += loss.item()*xb.size(0)
    scheduler.step()
    if epoch%10==0:
        print(f'Epoch {epoch:02d} | Loss {running_loss/len(train_ds):.4f}')
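
To make the effect of `StepLR(step_size=20, gamma=0.5)` concrete, the sketch below records the learning rate used in each of 50 epochs. It uses a throwaway parameter and separate `demo_opt` / `demo_sched` objects (illustrative names) so the trained model and its optimizer are left untouched:

```python
import torch
import torch.nn as nn
import torch.optim as optim

# Throwaway parameter so the optimizer has something to manage (illustration only).
dummy = nn.Parameter(torch.zeros(1))
demo_opt = optim.Adam([dummy], lr=1e-2)
demo_sched = optim.lr_scheduler.StepLR(demo_opt, step_size=20, gamma=0.5)

lr_by_epoch = {}
for epoch in range(1, 51):
    lr_by_epoch[epoch] = demo_opt.param_groups[0]['lr']  # rate used during this epoch
    demo_opt.step()    # call the optimizer before the scheduler to avoid a warning
    demo_sched.step()  # halves the rate after every 20th epoch

for epoch in (1, 20, 21, 40, 41, 50):
    print(epoch, lr_by_epoch[epoch])
```

With these settings the rate stays at 1e-2 for epochs 1 to 20, drops to 5e-3 for epochs 21 to 40, and to 2.5e-3 for the remaining epochs.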


## 4. Evaluating the Model

In [None]:

model.eval()
correct,total = 0,0
with torch.no_grad():
    for xb,yb in test_loader:
        xb,yb = xb.to(device), yb.to(device)
        preds = model(xb).argmax(1)
        correct += (preds==yb).sum().item()
        total += yb.size(0)
print(f'Test Accuracy: {correct/total*100:.2f}%')
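
Accuracy alone hides which class the errors fall on. One possible complement is a confusion matrix; the helper `confusion_matrix` and the sample tensors below are hypothetical and written only for illustration, not taken from the notebook's results:

```python
import torch

def confusion_matrix(preds, targets, num_classes=2):
    """Count (true, predicted) pairs; rows are true classes, columns are predictions."""
    idx = targets * num_classes + preds
    counts = torch.bincount(idx, minlength=num_classes ** 2)
    return counts.reshape(num_classes, num_classes)

# Made-up labels and predictions, for illustration only.
demo_targets = torch.tensor([0, 0, 1, 1, 1, 0])
demo_preds = torch.tensor([0, 1, 1, 1, 0, 0])

cm = confusion_matrix(demo_preds, demo_targets)
accuracy = cm.diag().sum().item() / cm.sum().item()
print(cm)
print(f'Accuracy: {accuracy:.2f}')
```

In the evaluation loop above, the same helper could be applied to the concatenated `preds` and `yb` tensors after moving them to the CPU.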


## 5. Custom Autograd Function
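Implement the Swish activation, swish(x) = x * sigmoid(x), as a `torch.autograd.Function` with a hand-written backward pass. Its derivative is sigmoid(x) + x * sigmoid(x) * (1 - sigmoid(x)).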

In [None]:

class Swish(torch.autograd.Function):
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.sigmoid(x)
    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        sig = torch.sigmoid(x)
        return grad_output * (sig + x*sig*(1-sig))

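# Test: compare the manual backward with autograd's gradient for the same expression
x = torch.randn(5, requires_grad=True)
y = Swish.apply(x)
y.sum().backward()
x_ref = x.detach().clone().requires_grad_(True)
(x_ref * torch.sigmoid(x_ref)).sum().backward()
print('Custom gradient ok:', torch.allclose(x.grad, x_ref.grad))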
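
A hand-written backward can also be checked numerically with `torch.autograd.gradcheck`, which compares it against finite differences and expects double-precision inputs. The sketch below repeats the function under the name `SwishFn` so it runs on its own, and wraps it in a hypothetical `SwishLayer` module to show how it could be dropped into `nn.Sequential`. The tolerances are assumptions, not tuned values:

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class SwishFn(torch.autograd.Function):
    """Same forward/backward as the Swish function above, repeated for self-containment."""
    @staticmethod
    def forward(ctx, x):
        ctx.save_for_backward(x)
        return x * torch.sigmoid(x)

    @staticmethod
    def backward(ctx, grad_output):
        x, = ctx.saved_tensors
        sig = torch.sigmoid(x)
        return grad_output * (sig + x * sig * (1 - sig))

class SwishLayer(nn.Module):  # hypothetical wrapper name
    def forward(self, x):
        return SwishFn.apply(x)

# gradcheck returns True on success and raises an error on a mismatch.
x64 = torch.randn(5, dtype=torch.double, requires_grad=True)
grad_ok = torch.autograd.gradcheck(SwishFn.apply, (x64,), eps=1e-6, atol=1e-4)

# The built-in SiLU computes the same function, so the outputs should agree.
matches_silu = torch.allclose(SwishLayer()(x64), F.silu(x64))
print('gradcheck passed:', grad_ok, '| matches F.silu:', matches_silu)
```

In practice `nn.SiLU` is the simpler choice; a custom `Function` is mainly useful when the backward pass needs to differ from what autograd would derive.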


## 6. Saving, Loading, and TorchScript

In [None]:

torch.save(model.state_dict(), 'mlp_moons.pth')
# Recreate the architecture, then load the saved weights into it
loaded = MLP().to(device)
loaded.load_state_dict(torch.load('mlp_moons.pth', map_location=device))
loaded.eval()

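# TorchScript via scripting: torch.jit.script compiles the module's code
# (torch.jit.trace would instead record a single example run)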
scripted = torch.jit.script(loaded)
scripted.save('mlp_moons_scripted.pt')
print('Model scripted & saved.')
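
A saved TorchScript file can be reloaded with `torch.jit.load` without the Python class definition. The sketch below checks a save/load round trip on an untrained stand-in network with the same layer sizes as the MLP. The names `demo_net` and `demo_scripted.pt` and the temporary directory are assumptions made so the example leaves no files behind:

```python
import os
import tempfile

import torch
import torch.nn as nn

# Untrained stand-in with the same layer sizes as the MLP (illustration only).
demo_net = nn.Sequential(
    nn.Linear(2, 32), nn.ReLU(),
    nn.Linear(32, 16), nn.ReLU(),
    nn.Linear(16, 2),
).eval()

demo_scripted = torch.jit.script(demo_net)

with tempfile.TemporaryDirectory() as tmp:
    path = os.path.join(tmp, 'demo_scripted.pt')
    demo_scripted.save(path)
    # The archive contains code and weights, so no class definition is needed here.
    restored = torch.jit.load(path, map_location='cpu')

sample = torch.randn(4, 2)
with torch.no_grad():
    same = torch.allclose(demo_net(sample), restored(sample))
print('Round trip preserves outputs:', same)
```

By contrast, a `state_dict` file such as `mlp_moons.pth` holds only the weights and needs the `MLP` class to rebuild the model before loading.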


## 🎉 Conclusion
This notebook covered:
- **Tensors** & device placement
- **Autograd** and custom gradients
- **Adam optimizer**, LR scheduler, mixed precision
- **GPU acceleration**
- Saving, loading, and *TorchScript*

Experiment further by increasing model depth, trying different optimizers, or scaling up on more complex datasets!