# Simple Neural Networks with PyTorch

In this assignment, you'll build your first neural network using **PyTorch**.

## What is a Neural Network?
A neural network stacks simple functions (layers) so that, together, they can learn complex patterns:
- **Input Layer**: Takes your features
- **Hidden Layers**: Process and transform data
- **Output Layer**: Makes predictions
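
To make the "stacked functions" idea concrete, here is a minimal sketch that passes a random batch through two layers by hand. The sizes (4 inputs, 3 hidden units, 1 output) and the variable names are illustrative assumptions, not part of the assignment.

```python
import torch
import torch.nn as nn

torch.manual_seed(0)  # fixed seed so the random batch is repeatable

# Illustrative sizes: 4 input features -> 3 hidden units -> 1 output.
layer1 = nn.Linear(4, 3)
activation = nn.ReLU()
layer2 = nn.Linear(3, 1)

x = torch.randn(5, 4)            # a batch of 5 samples, 4 features each
hidden = activation(layer1(x))   # hidden layer output, shape (5, 3)
output = layer2(hidden)          # one value per sample, shape (5, 1)

print(hidden.shape, output.shape)
```

Each layer is just a function applied to the previous layer's output; the network is their composition.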

## Why PyTorch?
- **Simple and Pythonic**: Feels like writing regular Python
- **Industry standard**: Used by researchers and companies
- **Flexible**: Easy to debug and customize

## Learning Objectives
- Build a simple neural network
- Understand layers, neurons, and activation functions
- Train and evaluate the model
- Visualize training progress

## Setup and Installation

In [None]:
!pip install torch scikit-learn matplotlib seaborn pandas numpy

## Import Libraries

In [None]:
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

import torch
import torch.nn as nn
import torch.optim as optim
from torch.utils.data import DataLoader, TensorDataset

print(f"PyTorch version: {torch.__version__}")

sns.set_style('whitegrid')
plt.rcParams['figure.figsize'] = (10, 6)

## 1. Load and Prepare Dataset

In [None]:
data = load_breast_cancer()
X = pd.DataFrame(data.data, columns=data.feature_names)
y = data.target

X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

print(f"Training set: {X_train.shape}")
print(f"Test set: {X_test.shape}")

## 2. Scale the Data (IMPORTANT for Neural Networks!)

**TODO:** Use `StandardScaler` to standardize the features (zero mean, unit variance per feature).

**Hints:**
- Create a `StandardScaler()` object
- Use `.fit_transform()` on training data
- Use `.transform()` on test data
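
The difference between `.fit_transform()` and `.transform()` can be seen on a tiny made-up array. This is only a sketch with hypothetical numbers: the scaler learns the mean and standard deviation from the training rows and reuses them for the test row, so no test-set statistics leak into preprocessing.

```python
import numpy as np
from sklearn.preprocessing import StandardScaler

# Made-up data: two features on very different scales.
train = np.array([[1.0, 100.0], [2.0, 200.0], [3.0, 300.0]])
test = np.array([[2.0, 400.0]])

toy_scaler = StandardScaler()
train_scaled = toy_scaler.fit_transform(train)  # learns mean/std from train only
test_scaled = toy_scaler.transform(test)        # reuses the training mean/std

print(train_scaled.mean(axis=0))  # close to 0 for each column
print(train_scaled.std(axis=0))   # close to 1 for each column
print(test_scaled)                # test row expressed in training units
```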

In [None]:
# TODO: Create scaler and scale data
# scaler = StandardScaler()
# X_train_scaled = scaler.fit_transform(X_train)
# X_test_scaled = scaler.transform(X_test)

# print("Data scaled!")

## 3. Convert to PyTorch Tensors

PyTorch works with tensors: multi-dimensional arrays similar to NumPy arrays that can also live on a GPU and track gradients.

**TODO:** Convert data to PyTorch tensors.

**Hints:**
- Use `torch.FloatTensor()` for features
- Use `torch.FloatTensor()` for labels (reshape to (-1, 1))
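
As a small illustration of the conversion, the sketch below uses made-up arrays and `torch.tensor(..., dtype=torch.float32)`, which should give the same result as the `torch.FloatTensor()` call in the hints. The array values and names are assumptions for demonstration only.

```python
import numpy as np
import torch

# Made-up data: 3 samples, 2 features, integer class labels.
features = np.array([[0.5, -1.2], [1.5, 0.3], [-0.7, 2.0]])  # float64
labels = np.array([0, 1, 1])                                  # int64

X_t = torch.tensor(features, dtype=torch.float32)
# Labels become a float column vector so they match the (N, 1) model output.
y_t = torch.tensor(labels, dtype=torch.float32).reshape(-1, 1)

print(X_t.shape, X_t.dtype)
print(y_t.shape, y_t.dtype)
```

The reshape matters because a binary cross-entropy loss expects predictions and targets to have the same shape.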

In [None]:
# TODO: Convert to tensors
# Note: y_train and y_test are already NumPy arrays, so no .values is needed.
# X_train_tensor = torch.FloatTensor(X_train_scaled)
# y_train_tensor = torch.FloatTensor(y_train).reshape(-1, 1)
# X_test_tensor = torch.FloatTensor(X_test_scaled)
# y_test_tensor = torch.FloatTensor(y_test).reshape(-1, 1)

# print(f"Tensor shapes: {X_train_tensor.shape}, {y_train_tensor.shape}")

## 4. Define the Neural Network

### Task: Create a neural network class

**Architecture:**
```
Input (30 features)
  ↓
Dense Layer (16 neurons, ReLU)
  ↓
Dense Layer (8 neurons, ReLU)
  ↓
Output Layer (1 neuron, Sigmoid)
```

**Key Concepts:**
- **nn.Module**: Base class for all neural networks
- **nn.Linear**: Fully connected layer
- **nn.ReLU**: ReLU activation
- **nn.Sigmoid**: Sigmoid activation

**TODO:** Complete the neural network class.

**Hints:**
1. Inherit from `nn.Module`
2. Define layers in `__init__`
3. Define forward pass in `forward` method
4. Use `nn.Linear(input_size, output_size)` for layers
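
The hints above follow a general pattern that can be sketched on a smaller, hypothetical network. `TinyNet` and its sizes are made up for illustration and are deliberately different from the architecture you are asked to build.

```python
import torch
import torch.nn as nn

class TinyNet(nn.Module):
    """Hypothetical 4 -> 3 -> 1 network, for illustration only."""

    def __init__(self, in_features=4, hidden=3):
        super().__init__()
        # Layers are created once, in __init__.
        self.fc1 = nn.Linear(in_features, hidden)
        self.fc2 = nn.Linear(hidden, 1)
        self.relu = nn.ReLU()
        self.sigmoid = nn.Sigmoid()

    def forward(self, x):
        # The forward pass describes how data flows through the layers.
        x = self.relu(self.fc1(x))
        return self.sigmoid(self.fc2(x))

tiny = TinyNet()
probs = tiny(torch.randn(6, 4))  # 6 samples in, 6 probabilities out
n_params = sum(p.numel() for p in tiny.parameters())
print(probs.shape, n_params)
```

Calling `tiny(...)` runs `forward` for you; the parameter count covers the weights and biases of both linear layers.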

In [None]:
# TODO: Define neural network class
# class SimpleNet(nn.Module):
#     def __init__(self):
#         super(SimpleNet, self).__init__()
#         # TODO: Define layers
#         # self.fc1 = nn.Linear(30, 16)
#         # self.fc2 = nn.Linear(16, 8)
#         # self.fc3 = nn.Linear(8, 1)
#         # self.relu = nn.ReLU()
#         # self.sigmoid = nn.Sigmoid()
#     
#     def forward(self, x):
#         # TODO: Define forward pass
#         # x = self.relu(self.fc1(x))
#         # x = self.relu(self.fc2(x))
#         # x = self.sigmoid(self.fc3(x))
#         # return x
#         pass

# TODO: Create model instance
# model = SimpleNet()
# print(model)

## 5. Define Loss Function and Optimizer

**TODO:** Set up loss function and optimizer.

**Hints:**
- Loss: `nn.BCELoss()` for binary classification
- Optimizer: `optim.Adam(model.parameters(), lr=0.001)`
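
To see what `nn.BCELoss()` computes, the sketch below compares it with the binary cross-entropy formula written out by hand. The probabilities and targets are made-up values.

```python
import torch
import torch.nn as nn

# Made-up predicted probabilities and true labels for 3 samples.
probs = torch.tensor([[0.9], [0.2], [0.6]])
targets = torch.tensor([[1.0], [0.0], [1.0]])

bce = nn.BCELoss()
loss_builtin = bce(probs, targets)

# Binary cross-entropy: -mean(y*log(p) + (1-y)*log(1-p))
loss_manual = -(
    targets * torch.log(probs) + (1 - targets) * torch.log(1 - probs)
).mean()

print(loss_builtin.item(), loss_manual.item())
```

Confident correct predictions (0.9 for a positive) contribute little to the loss, while uncertain ones (0.6 for a positive) contribute more.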

In [None]:
# TODO: Define loss and optimizer
# criterion = nn.BCELoss()
# optimizer = optim.Adam(model.parameters(), lr=0.001)

# print("Loss and optimizer ready!")

## 6. Train the Model

### Task: Write training loop

**Training Steps (for each epoch):**
1. Forward pass: `outputs = model(X_train_tensor)`
2. Calculate loss: `loss = criterion(outputs, y_train_tensor)`
3. Zero gradients: `optimizer.zero_grad()` (PyTorch accumulates gradients, so clear them before each backward pass)
4. Backward pass: `loss.backward()`
5. Update weights: `optimizer.step()`

**TODO:** Complete the training loop.

**Hints:**
- Train for 100 epochs
- Store loss history for plotting
- Print progress every 10 epochs
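
The loop structure can be tried out on a toy problem first. The sketch below is an assumption-laden example, not the assignment solution: the data, the `toy_*` names, the layer sizes, and the learning rate are all made up.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Toy data: the label is 1 when the two features sum to a positive number.
X_toy = torch.randn(200, 2)
y_toy = (X_toy.sum(dim=1, keepdim=True) > 0).float()

toy_model = nn.Sequential(
    nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 1), nn.Sigmoid()
)
toy_criterion = nn.BCELoss()
toy_optimizer = optim.Adam(toy_model.parameters(), lr=0.05)

toy_losses = []
for epoch in range(50):
    outputs = toy_model(X_toy)            # forward pass
    loss = toy_criterion(outputs, y_toy)  # compute loss
    toy_optimizer.zero_grad()             # clear old gradients
    loss.backward()                       # backpropagate
    toy_optimizer.step()                  # update weights
    toy_losses.append(loss.item())        # keep history for plotting

print(f"first loss: {toy_losses[0]:.4f}, last loss: {toy_losses[-1]:.4f}")
```

This uses full-batch training, as the assignment does: every epoch sees all samples in one forward pass.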

In [None]:
# TODO: Training loop
# epochs = 100
# losses = []

# for epoch in range(epochs):
#     # Forward pass
#     outputs = model(X_train_tensor)
#     loss = criterion(outputs, y_train_tensor)
#     
#     # Backward pass
#     optimizer.zero_grad()
#     loss.backward()
#     optimizer.step()
#     
#     # Store loss
#     losses.append(loss.item())
#     
#     # Print progress
#     if (epoch + 1) % 10 == 0:
#         print(f'Epoch [{epoch+1}/{epochs}], Loss: {loss.item():.4f}')

# print("\nTraining complete!")

## 7. Visualize Training Progress

**TODO:** Plot the training loss over epochs.

In [None]:
# TODO: Plot loss curve
# plt.figure(figsize=(10, 6))
# plt.plot(losses, linewidth=2)
# plt.xlabel('Epoch', fontsize=12)
# plt.ylabel('Loss', fontsize=12)
# plt.title('Training Loss Over Time', fontsize=14, fontweight='bold')
# plt.grid(True, alpha=0.3)
# plt.show()

## 8. Evaluate on Test Set

**TODO:** Make predictions and evaluate.

**Hints:**
- Use `model.eval()` for evaluation mode
- Use `with torch.no_grad()` to disable gradients
- Convert predictions to numpy for sklearn metrics
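
The three hints fit together roughly as in the sketch below. It uses an untrained, hypothetical one-layer model and made-up labels, so the accuracy it prints is meaningless; the point is the evaluation pattern.

```python
import numpy as np
import torch
import torch.nn as nn
from sklearn.metrics import accuracy_score

torch.manual_seed(0)

# Hypothetical untrained model and made-up data, for illustration only.
demo_model = nn.Sequential(nn.Linear(3, 1), nn.Sigmoid())
X_demo = torch.randn(5, 3)
y_demo = np.array([0, 1, 1, 0, 1])

demo_model.eval()                    # switch layers to evaluation behavior
with torch.no_grad():                # no gradient tracking during inference
    proba = demo_model(X_demo)       # probabilities in [0, 1], shape (5, 1)
    preds = (proba > 0.5).float()    # threshold at 0.5 to get class labels

preds_np = preds.numpy().flatten()   # 1-D NumPy array for sklearn
print(proba.requires_grad)
print(accuracy_score(y_demo, preds_np))
```

`model.eval()` has no visible effect on a network made only of linear layers and activations, but it matters once layers such as dropout or batch normalization are involved, so it is a good habit.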

In [None]:
# TODO: Make predictions
# model.eval()
# with torch.no_grad():
#     y_pred_proba = model(X_test_tensor)
#     y_pred = (y_pred_proba > 0.5).float()

# TODO: Convert to numpy
# y_pred_np = y_pred.numpy().flatten()
# y_test_np = y_test  # already a NumPy array, so no .values is needed

# TODO: Calculate metrics
# accuracy = accuracy_score(y_test_np, y_pred_np)

# TODO: Print results
# print(f"Test Accuracy: {accuracy:.4f}")
# print("\nClassification Report:")
# print(classification_report(y_test_np, y_pred_np, target_names=['Malignant', 'Benign']))

## 9. Compare with Traditional ML

**TODO:** Train a Random Forest and compare.

**Hints:**
- Import and train `RandomForestClassifier`
- Compare accuracy and complexity
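
"Complexity" can be measured in several ways. One rough option, sketched below, is to count trainable parameters for the network and decision nodes for the forest. The synthetic data from `make_classification`, the `n_estimators=10` setting, and all variable names are assumptions for illustration; the two counts are not directly equivalent, so treat them as a loose size comparison.

```python
import torch.nn as nn
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier

# Synthetic stand-in data with 30 features (not the assignment's dataset).
X_syn, y_syn = make_classification(n_samples=200, n_features=30, random_state=0)

# Layer sizes from the architecture diagram: 30 -> 16 -> 8 -> 1.
net = nn.Sequential(
    nn.Linear(30, 16), nn.ReLU(),
    nn.Linear(16, 8), nn.ReLU(),
    nn.Linear(8, 1), nn.Sigmoid(),
)
n_params = sum(p.numel() for p in net.parameters())

forest = RandomForestClassifier(n_estimators=10, random_state=0).fit(X_syn, y_syn)
n_nodes = sum(tree.tree_.node_count for tree in forest.estimators_)

print(f"Network parameters: {n_params}")
print(f"Forest nodes across {len(forest.estimators_)} trees: {n_nodes}")
```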

In [None]:
# TODO: Train Random Forest
# from sklearn.ensemble import RandomForestClassifier

# rf = RandomForestClassifier(random_state=42)
# rf.fit(X_train_scaled, y_train)
# rf_pred = rf.predict(X_test_scaled)
# rf_accuracy = accuracy_score(y_test, rf_pred)

# TODO: Compare
# print(f"PyTorch Neural Network: {accuracy:.4f}")
# print(f"Random Forest: {rf_accuracy:.4f}")