# Neural Networks Fundamentals with PyTorch: From Regression to Classification

**Mahmood Amintoosi, Spring 2024**
Computer Science Dept, Ferdowsi University of Mashhad

Original material adapted from [Tomas Beuzen's course](https://ubc-mds.github.io/DSCI_572_sup-learn-2)

## Learning Objectives
<hr>

- Understand PyTorch tensors and their operations
- Build and train neural networks for regression and classification
- Implement logistic regression from scratch
- Apply different activation functions in neural networks

## Imports
<hr>

In [13]:
import numpy as np
import pandas as pd
import torch
from torch import nn, optim
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import make_regression, make_circles, make_blobs, load_breast_cancer
from sklearn.preprocessing import StandardScaler
from sklearn.linear_model import LinearRegression, LogisticRegression
from sklearn.model_selection import train_test_split
from scipy.optimize import minimize
import matplotlib.pyplot as plt
%matplotlib inline

## 1. PyTorch Tensors Basics
<hr>

PyTorch tensors are similar to NumPy arrays but with GPU support and automatic differentiation capabilities.

In [14]:
# Creating tensors
x = torch.tensor([1, 2, 3], dtype=torch.float32)
y = torch.rand(2, 3)  # Random uniform tensor

# Operations
z = x + y[0]  # Element-wise addition: x and y[0] both have shape (3,), so no broadcasting is needed
print(f"Tensor z:\n{z}\nShape: {z.shape}")

Tensor z:
tensor([1.9004, 2.2558, 3.9370])
Shape: torch.Size([3])
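
The two features that set tensors apart from NumPy arrays, automatic differentiation and device placement, can be illustrated with a minimal sketch. The names `w`, `f`, `a`, `b`, and `w_dev` are illustrative and not part of the original notebook; the block also shows a case where broadcasting really does take place (shapes `(2, 3)` and `(3,)`).

```python
import torch

# Broadcasting: a (2, 3) tensor plus a (3,) tensor gives a (2, 3) result
a = torch.ones(2, 3)
b = torch.tensor([1.0, 2.0, 3.0])
print((a + b).shape)  # torch.Size([2, 3])

# requires_grad=True asks autograd to record operations on w
w = torch.tensor([1.0, 2.0, 3.0], requires_grad=True)

# Scalar function f(w) = sum(w_i^2); its gradient is 2 * w
f = (w ** 2).sum()
f.backward()  # populates w.grad with df/dw
print(w.grad)  # tensor([2., 4., 6.])

# Move a tensor to the GPU only if one is available; otherwise stay on the CPU
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
w_dev = w.detach().to(device)
print(w_dev.device)
```

Calling `backward()` a second time on a new loss would accumulate into `w.grad`, which is why training loops call `optimizer.zero_grad()` at the start of every step.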


## 2. Neural Networks for Regression
<hr>

In [15]:
# Simple regression model: a single linear layer.
# Note: this class name shadows sklearn's LinearRegression imported above.
class LinearRegression(nn.Module):
    def __init__(self, input_size, output_size):
        super().__init__()
        self.linear = nn.Linear(input_size, output_size)

    def forward(self, x):
        return self.linear(x)

# Create and train model
X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)
X_t = torch.tensor(X, dtype=torch.float32)
y_t = torch.tensor(y, dtype=torch.float32)

model = LinearRegression(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

# Training loop
for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(X_t)
    loss = criterion(outputs.flatten(), y_t)
    loss.backward()
    optimizer.step()
    
    if epoch % 20 == 0:
        print(f'Epoch {epoch}, Loss: {loss.item():.4f}')

Epoch 0, Loss: 1706.2048
Epoch 20, Loss: 910.1295
Epoch 40, Loss: 506.3514
Epoch 60, Loss: 299.8825
Epoch 80, Loss: 193.5540
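
The loop above uses the full dataset for every gradient step. A common alternative is mini-batch training with `DataLoader` and `TensorDataset`, which are imported at the top of the notebook. The following is a sketch under assumptions: the batch size of 20, the 50 epochs, and the use of a bare `nn.Linear(1, 1)` in place of the wrapper class are illustrative choices, not settings from the original material.

```python
import torch
from torch import nn
from torch.utils.data import DataLoader, TensorDataset
from sklearn.datasets import make_regression

torch.manual_seed(0)  # makes initialization and shuffling repeatable

X, y = make_regression(n_samples=100, n_features=1, noise=10, random_state=42)
X_t = torch.tensor(X, dtype=torch.float32)
# Reshape targets to (100, 1) so they match the model output shape
y_t = torch.tensor(y, dtype=torch.float32).unsqueeze(1)

# Assumed batch size: 20 samples per step, reshuffled every epoch
loader = DataLoader(TensorDataset(X_t, y_t), batch_size=20, shuffle=True)

model = nn.Linear(1, 1)
criterion = nn.MSELoss()
optimizer = torch.optim.SGD(model.parameters(), lr=0.01)

epoch_losses = []
for epoch in range(50):
    running = 0.0
    for X_batch, y_batch in loader:
        optimizer.zero_grad()
        loss = criterion(model(X_batch), y_batch)
        loss.backward()
        optimizer.step()
        running += loss.item() * len(X_batch)
    # Average loss over all samples seen in this epoch
    epoch_losses.append(running / len(loader.dataset))

print(f"First epoch loss: {epoch_losses[0]:.2f}, last epoch loss: {epoch_losses[-1]:.2f}")
print(f"Learned weight: {model.weight.item():.2f}, bias: {model.bias.item():.2f}")
```

With five batches per epoch the model takes five parameter updates per pass over the data, so it usually needs fewer epochs than the full-batch loop to reach a similar loss.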


## 3. Logistic Regression from Scratch
<hr>

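We'll implement logistic regression on the Breast Cancer dataset: standardize the features, add an intercept column, and minimize the mean logistic loss with `scipy.optimize.minimize`, supplying the analytic gradient.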

In [16]:
# Load dataset
data = load_breast_cancer()
X, y = data.data, data.target
X = StandardScaler().fit_transform(X)

# Add intercept term
X = np.hstack([np.ones((X.shape[0], 1)), X])

# Sigmoid function
def sigmoid(z):
    return 1 / (1 + np.exp(-z))

# Logistic loss
def logistic_loss(w, X, y):
    p = sigmoid(X @ w)
    return -(y * np.log(p) + (1-y) * np.log(1-p)).mean()

# Gradient of logistic loss
def logistic_grad(w, X, y):
    return (X.T @ (sigmoid(X @ w) - y)) / len(X)

# Optimize weights
w_opt = minimize(logistic_loss, np.zeros(X.shape[1]), 
                 jac=logistic_grad, args=(X, y)).x

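# Compare with sklearn. Note: LogisticRegression applies L2 regularization by
# default (C=1.0), while the fit above is unregularized, so the two sets of
# coefficients are not expected to match.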
lr = LogisticRegression().fit(X[:, 1:], y)
print(f"Our intercept: {w_opt[0]:.4f}, Sklearn: {lr.intercept_[0]:.4f}")
print(f"Our coefs mean: {w_opt[1:].mean():.4f}, Sklearn: {lr.coef_.mean():.4f}")

Our intercept: -1.4675, Sklearn: 0.2209
Our coefs mean: -0.8334, Sklearn: -0.4004
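
When the weights grow large, `sigmoid(X @ w)` saturates at exactly 0 or 1 in floating point, and `np.log(p)` or `np.log(1 - p)` then evaluates `log(0)`. One way to avoid this, sketched below, is to rewrite the loss with `np.logaddexp`, using the identities $-\log \sigma(z) = \log(1 + e^{-z})$ and $-\log(1 - \sigma(z)) = \log(1 + e^{z})$. The function names and the synthetic `X_demo`, `y_demo` data are hypothetical and only serve the illustration.

```python
import numpy as np

def naive_logistic_loss(w, X, y):
    # Same formula as in the cell above: breaks down when p is exactly 0 or 1
    p = 1 / (1 + np.exp(-(X @ w)))
    return -(y * np.log(p) + (1 - y) * np.log(1 - p)).mean()

def stable_logistic_loss(w, X, y):
    z = X @ w
    # -log(sigmoid(z))     = log(1 + exp(-z)) = logaddexp(0, -z)
    # -log(1 - sigmoid(z)) = log(1 + exp(z))  = logaddexp(0, z)
    return (y * np.logaddexp(0, -z) + (1 - y) * np.logaddexp(0, z)).mean()

# Small synthetic problem (hypothetical data, not the Breast Cancer set)
rng = np.random.default_rng(0)
X_demo = rng.normal(size=(50, 3))
y_demo = (X_demo[:, 0] > 0).astype(float)

# For moderate weights the two formulas agree
w_small = np.array([0.5, -0.2, 0.1])
agree = np.isclose(naive_logistic_loss(w_small, X_demo, y_demo),
                   stable_logistic_loss(w_small, X_demo, y_demo))
print(f"Formulas agree for small weights: {agree}")

# For very large weights the naive version evaluates 0 * log(0)
w_large = 1000 * np.array([1.0, 0.0, 0.0])
with np.errstate(all="ignore"):  # silence the expected overflow/divide warnings
    naive_val = naive_logistic_loss(w_large, X_demo, y_demo)
stable_val = stable_logistic_loss(w_large, X_demo, y_demo)
print(f"Naive: {naive_val}, stable: {stable_val}")
```

The gradient `X.T @ (sigmoid(X @ w) - y) / len(X)` is unchanged by this rewrite, so `stable_logistic_loss` could be passed to `minimize` in place of the original loss with the same `jac`.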




## 4. Neural Networks for Classification
<hr>

In [17]:
# Binary classification
class BinaryClassifier(nn.Module):
    def __init__(self, input_size, hidden_size):
        super().__init__()
        self.net = nn.Sequential(
            nn.Linear(input_size, hidden_size),
            nn.ReLU(),
            nn.Linear(hidden_size, 1)
        )

    def forward(self, x):
        return self.net(x)

# Create dataset
X, y = make_circles(n_samples=300, noise=0.1, factor=0.5, random_state=42)
X_t = torch.tensor(X, dtype=torch.float32)
y_t = torch.tensor(y, dtype=torch.float32)

# Train model
model = BinaryClassifier(2, 5)
criterion = nn.BCEWithLogitsLoss()
optimizer = torch.optim.Adam(model.parameters(), lr=0.1)

for epoch in range(100):
    optimizer.zero_grad()
    outputs = model(X_t).flatten()
    loss = criterion(outputs, y_t)
    loss.backward()
    optimizer.step()
    
    if epoch % 20 == 0:
        print(f'Epoch {epoch}, Loss: {loss.item():.4f}')

Epoch 0, Loss: 0.7048
Epoch 20, Loss: 0.4643
Epoch 40, Loss: 0.2252
Epoch 60, Loss: 0.1759
Epoch 80, Loss: 0.1479
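
Because the model is trained with `nn.BCEWithLogitsLoss`, its outputs are raw logits rather than probabilities. To obtain class predictions and an accuracy figure, the logits can be passed through a sigmoid and thresholded. The sketch below uses an untrained stand-in network and random data (`X_demo`, `y_demo` are hypothetical), so the accuracy it prints is only expected to be near chance; with the trained `model`, `X_t`, and `y_t` from the cell above the same steps apply.

```python
import torch
from torch import nn

torch.manual_seed(0)

# Stand-in for a trained classifier: any module returning one logit per row works
model = nn.Sequential(nn.Linear(2, 5), nn.ReLU(), nn.Linear(5, 1))
X_demo = torch.randn(8, 2)
y_demo = torch.randint(0, 2, (8,)).float()

model.eval()               # switch off training-only behaviour (e.g. dropout)
with torch.no_grad():      # no gradients are needed for evaluation
    logits = model(X_demo).flatten()
    probs = torch.sigmoid(logits)     # map logits to probabilities in [0, 1]
    preds = (probs >= 0.5).float()    # assumed decision threshold of 0.5

accuracy = (preds == y_demo).float().mean().item()
print(f"Accuracy: {accuracy:.2f}")
```

The 0.5 threshold is a convention rather than a requirement; a different cut-off may be preferable when the two kinds of error carry different costs.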


## 5. Key Takeaways
<hr>

1. PyTorch tensors are the building blocks for neural networks
2. Neural networks consist of:
   - Input layer (size determined by features)
   - Hidden layers (with activation functions)
   - Output layer (size determined by task)
3. For classification:
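   - Binary: Use sigmoid activation with BCE loss (in PyTorch, `nn.BCEWithLogitsLoss` applies the sigmoid internally, so the model should output raw logits)
   - Multiclass: Use softmax activation with cross-entropy loss (in PyTorch, `nn.CrossEntropyLoss` applies log-softmax internally, so the model should again output raw logits)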
4. Activation functions (ReLU, sigmoid, tanh) introduce non-linearity
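
The notebook only trains a binary classifier, so the multiclass case in point 3 can be sketched as follows. This is a minimal illustration under assumptions: the three-blob dataset from `make_blobs`, the hidden size of 8, the learning rate, and the epoch count are all illustrative choices rather than settings from the original material.

```python
import torch
from torch import nn
from sklearn.datasets import make_blobs

torch.manual_seed(0)

# Three well-separated clusters in 2D, one class per cluster
X, y = make_blobs(n_samples=300, centers=3, n_features=2, random_state=42)
X_t = torch.tensor(X, dtype=torch.float32)
y_t = torch.tensor(y, dtype=torch.long)  # CrossEntropyLoss expects integer class indices

# Output layer has one logit per class and no softmax layer
model = nn.Sequential(nn.Linear(2, 8), nn.ReLU(), nn.Linear(8, 3))
criterion = nn.CrossEntropyLoss()  # applies log-softmax internally
optimizer = torch.optim.Adam(model.parameters(), lr=0.05)

losses = []
for epoch in range(100):
    optimizer.zero_grad()
    loss = criterion(model(X_t), y_t)
    loss.backward()
    optimizer.step()
    losses.append(loss.item())

# Softmax is applied explicitly only when probabilities are needed
with torch.no_grad():
    probs = torch.softmax(model(X_t), dim=1)
    preds = probs.argmax(dim=1)

print(f"Initial loss: {losses[0]:.4f}, final loss: {losses[-1]:.4f}")
print(f"Training accuracy: {(preds == y_t).float().mean().item():.2f}")
```

Adding an explicit `nn.Softmax` layer before `nn.CrossEntropyLoss` would apply the normalization twice, which is a common source of slow or stalled training.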

## 6. True/False Questions
<hr>

1. Neural networks can be used for both regression and classification. (True)
2. The architecture of a neural network is a hyperparameter. (True)
3. Feature weights in neural networks are directly interpretable, as they are in linear models. (False)
4. More layers always lead to better performance. (False)