# **PyTorch training script from scratch**

We'll use a simple example: **binary classification** on a synthetic dataset.

---

### Problem Setup

Let’s assume a toy dataset where we classify points in 2D as class 0 or 1 based on a simple linear boundary.

---

## 1. Imports & Dependencies

In [None]:
import torch
import torch.nn as nn
from torch.utils.data import Dataset, DataLoader
import torch.optim as optim
import numpy as np
import matplotlib.pyplot as plt

## 2. Sample Dataset

We sample 1000 points uniformly from the unit square and label a point as class 1 when its coordinates satisfy `x + y > 1`, and as class 0 otherwise.

In [None]:
np.random.seed(0)

# Generate 1000 2D points
X = np.random.rand(1000, 2)
y = (X[:, 0] + X[:, 1] > 1).astype(np.float32)  # label: 1 if x + y > 1

print(X.shape, y.shape)
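
As a quick sanity check on the labelling rule, the sketch below regenerates the same arrays and measures how many points land in class 1. The name `positive_fraction` is introduced here for illustration only. Because the line `x + y = 1` cuts the unit square into two triangles of equal area, the classes should come out roughly balanced.

```python
import numpy as np

np.random.seed(0)  # same seed as the cell above
X = np.random.rand(1000, 2)
y = (X[:, 0] + X[:, 1] > 1).astype(np.float32)

# Fraction of points labelled 1; expected to be close to 0.5 because the
# boundary splits the unit square into two equal-area triangles.
positive_fraction = float(y.mean())
print(f"Class 1 fraction: {positive_fraction:.3f}")
```

A roughly balanced dataset means plain accuracy is a reasonable metric for this toy problem.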

## 3. Custom Dataset Class

In [None]:
class CustomDataset(Dataset):
    def __init__(self, X, y):
        self.X = torch.tensor(X, dtype=torch.float32)
        self.y = torch.tensor(y, dtype=torch.float32).unsqueeze(1)  # shape: (N, 1)
    
    def __len__(self):
        return len(self.X)
    
    def __getitem__(self, idx):
        return self.X[idx], self.y[idx]

**Explanation:**

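* `__init__`: Converts the `numpy` arrays to `float32` tensors and reshapes the labels to `(N, 1)` so they match the model's output shape.
* `__len__`: Returns the number of samples.
* `__getitem__`: Returns the `(X, y)` pair at index `idx`.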
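
For data that already fits in memory as arrays, the same three behaviours can probably be obtained from PyTorch's built-in `TensorDataset` instead of a hand-written class. The sketch below is an alternative, not the approach used in this tutorial, and `X_demo`, `y_demo`, and `demo_dataset` are hypothetical names with made-up values.

```python
import numpy as np
import torch
from torch.utils.data import TensorDataset

# Small stand-in arrays (hypothetical values, not the tutorial's data)
X_demo = np.array([[0.2, 0.3], [0.9, 0.8], [0.6, 0.5]], dtype=np.float32)
y_demo = np.array([0.0, 1.0, 1.0], dtype=np.float32)

# TensorDataset indexes every tensor along dim 0 and returns one tuple per sample
demo_dataset = TensorDataset(
    torch.from_numpy(X_demo),
    torch.from_numpy(y_demo).unsqueeze(1),  # shape (N, 1), as in CustomDataset
)

features, label = demo_dataset[1]
print(len(demo_dataset), features.shape, label.shape)
```

A custom `Dataset` becomes more useful once samples need per-item work, such as loading from disk or applying transforms.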

---

## 4. Model Architecture

Let’s build a small fully connected neural net.

In [None]:
class SimpleClassifier(nn.Module):
    def __init__(self):
        super().__init__()
        self.fc1 = nn.Linear(2, 16)      # Input: 2 features
        self.relu = nn.ReLU()
        self.fc2 = nn.Linear(16, 1)      # Output: 1 logit for binary class
        self.sigmoid = nn.Sigmoid()      # Not called in forward(); available for inference

    def forward(self, x):
        x = self.fc1(x)
        x = self.relu(x)
        x = self.fc2(x)
        return x
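
To see what this architecture produces, the sketch below pushes a small batch through a stand-in with the same layers. `net` is an `nn.Sequential` equivalent assumed here so the snippet runs on its own; it is not the class defined above. The output has one logit per input row, and the layer sizes give (2·16 + 16) + (16·1 + 1) = 65 trainable parameters.

```python
import torch
import torch.nn as nn

# Stand-in with the same layers as SimpleClassifier.forward (an assumption
# made so this snippet is self-contained)
net = nn.Sequential(nn.Linear(2, 16), nn.ReLU(), nn.Linear(16, 1))

batch = torch.rand(4, 2)          # 4 points, 2 features each
logits = net(batch)               # raw scores, not probabilities
probs = torch.sigmoid(logits)     # map logits into (0, 1)

n_params = sum(p.numel() for p in net.parameters())
print(logits.shape, n_params)
```

Returning logits rather than probabilities from `forward` is what allows the loss in section 7 to apply the sigmoid itself.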

## 5. Hyperparameters

In [None]:
learning_rate = 0.01
batch_size = 32
num_epochs = 20

## 6. Dataloader

In [None]:
dataset = CustomDataset(X, y)
dataloader = DataLoader(dataset, batch_size=batch_size, shuffle=True)
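
With 1000 samples and a batch size of 32, the loader should yield 32 batches per epoch, the last one holding only the 8 leftover samples. The sketch below checks this on placeholder tensors of the same size; `demo` and `loader` are illustrative names.

```python
import torch
from torch.utils.data import DataLoader, TensorDataset

# Placeholder tensors with the tutorial's sizes: 1000 samples, 2 features
demo = TensorDataset(torch.rand(1000, 2), torch.zeros(1000, 1))
loader = DataLoader(demo, batch_size=32, shuffle=True)

# Collect the size of every batch in one pass over the data
batch_sizes = [xb.shape[0] for xb, _ in loader]
print(len(loader), batch_sizes[0], batch_sizes[-1])
```

If a smaller final batch is a problem, `DataLoader` accepts `drop_last=True` to discard it.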

## 7. Model, Loss, Optimizer

In [None]:
model = SimpleClassifier()
criterion = nn.BCEWithLogitsLoss()  # Applies the sigmoid internally; more stable than Sigmoid + BCELoss
optimizer = optim.Adam(model.parameters(), lr=learning_rate)
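
The relationship between `BCEWithLogitsLoss` and the two-step "sigmoid, then `BCELoss`" formulation can be checked directly. The sketch below uses made-up logits and targets; for moderate values like these the two losses should agree to within floating-point error, while the fused version is the safer choice when logits become very large or very small.

```python
import torch
import torch.nn as nn

logits = torch.tensor([[-2.0], [0.5], [3.0]])   # hypothetical model outputs
targets = torch.tensor([[0.0], [1.0], [1.0]])

# Fused version: sigmoid and binary cross entropy in a single operation
loss_with_logits = nn.BCEWithLogitsLoss()(logits, targets)

# Two-step version: explicit sigmoid followed by BCELoss
loss_two_step = nn.BCELoss()(torch.sigmoid(logits), targets)

print(loss_with_logits.item(), loss_two_step.item())
```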

## 8. Training Loop

In [None]:
model.train()  # Training mode; matters if this cell is rerun after model.eval()
for epoch in range(num_epochs):
    total_loss = 0
    for batch_X, batch_y in dataloader:
        # ---- Forward Pass ----
        logits = model(batch_X)
        loss = criterion(logits, batch_y)

        # ---- Backward Pass ----
        optimizer.zero_grad()
        loss.backward()
        optimizer.step()

        total_loss += loss.item()
    
    print(f"Epoch [{epoch+1}/{num_epochs}], Loss: {total_loss/len(dataloader):.4f}")
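
The reason the loop clears gradients on every step is that `backward()` adds to whatever is already stored in `.grad` rather than overwriting it. A minimal sketch on a single scalar parameter (`w` is a hypothetical stand-in for a model weight):

```python
import torch

w = torch.tensor(2.0, requires_grad=True)

(3 * w).backward()
first = w.grad.item()      # d(3w)/dw = 3

(3 * w).backward()         # no clearing in between
second = w.grad.item()     # 3 + 3 = 6: the new gradient was added to the old one

w.grad.zero_()             # clear the stored gradient, similar in effect to optimizer.zero_grad()
after_reset = w.grad.item()
print(first, second, after_reset)
```

Without the clearing step, each update would use the sum of all gradients seen so far instead of the current batch's gradient.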


## 9. Evaluation (Optional)

In [None]:
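# Inference mode. This scores the same points the model was trained on,
# so the number printed below is training accuracy, not held-out accuracy.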
model.eval()
with torch.no_grad():
    test_logits = model(torch.tensor(X, dtype=torch.float32))
    predictions = (torch.sigmoid(test_logits) > 0.5).float()
    acc = (predictions.squeeze() == torch.tensor(y)).float().mean()
    print(f"Accuracy: {acc:.2f}")
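
For a less optimistic estimate, the data could be split into training and validation subsets before training. The sketch below shows one way to do that with `torch.utils.data.random_split`; the 80/20 ratio, the seed, and the names `full`, `train_set`, and `val_set` are assumptions made for illustration, and the tensors are placeholders generated with the same labelling rule.

```python
import torch
from torch.utils.data import TensorDataset, random_split

# Placeholder data using the tutorial's labelling rule (x + y > 1 -> class 1)
X_all = torch.rand(1000, 2)
y_all = (X_all.sum(dim=1) > 1).float().unsqueeze(1)
full = TensorDataset(X_all, y_all)

# 80/20 split; the seeded generator makes the split reproducible
split_gen = torch.Generator().manual_seed(0)
train_set, val_set = random_split(full, [800, 200], generator=split_gen)

print(len(train_set), len(val_set))
```

The training `DataLoader` would then wrap `train_set`, and the accuracy computation would run over `val_set` only.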

## Final Thoughts

### Summary of Each Component:

| Component          | Purpose                                          |
| ------------------ | ------------------------------------------------ |
| **Dataset class**  | Wraps custom numpy arrays into a PyTorch dataset |
| **Dataloader**     | Feeds mini-batches into the model                |
| **Model**          | A basic 2-layer neural network                   |
| **Forward pass**   | Compute outputs from inputs                      |
| **Loss**           | Binary Cross Entropy with logits                 |
| **Backward pass**  | Compute gradients using `loss.backward()`        |
| **Optimizer step** | Updates weights via `optimizer.step()`           |
| **Epoch loop**     | Repeats training multiple times over data        |

