# Deep Learning – Classification (PyTorch)

This notebook is part of the **ML-Methods** project.

It introduces **Deep Learning for supervised classification**
using **PyTorch**, a low-level and flexible deep learning framework.

As with the other classification notebooks,
the first sections focus on data preparation
and are intentionally repeated.

This ensures consistency across models
and allows fair comparison of results.

-----------------------------------------------------

## Notebook Roadmap (standard ML-Methods)

1. Project setup and common pipeline  
2. Dataset loading  
3. Train-test split  
4. Feature scaling (why we do it)  

----------------------------------

5. What is this model? (Intuition)  
6. Model training  
7. Model behavior and key parameters  
8. Predictions  
9. Model evaluation  
10. When to use it and when not to  
11. Model persistence  
12. Mathematical formulation (deep dive)  
13. Final summary – Code only  

-----------------------------------------------------

## How this notebook should be read

This notebook is designed to be read **top to bottom**.

Before every code cell, you will find a short explanation describing:
- what we are about to do
- why this step is necessary
- how it fits into the overall process

Compared to scikit-learn,
this notebook exposes **more internal details**
of how a Deep Learning model is trained.

The goal is not only to run the code,
but to understand **what happens during training**
and how neural networks learn step by step.

-----------------------------------------------------

## What is Deep Learning (in this context)?

Deep Learning refers to a class of models
based on **neural networks with multiple layers**.

These models are designed to:
- learn complex, non-linear relationships
- build internal representations of the data
- remain effective as the relationships in the data become more complex

In this notebook, we focus on:
**Deep Learning for tabular classification**
using fully connected neural networks.
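
As a rough illustration of what "fully connected network for tabular classification" means in code, here is a minimal sketch. The layer sizes (16 and 8 hidden units) and the input width of 30 features are assumptions chosen for the example, not the architecture used later in this notebook.

```python
import torch
import torch.nn as nn

# Assumed sizes for illustration only: 30 input features, two small hidden layers.
n_features = 30

model = nn.Sequential(
    nn.Linear(n_features, 16),  # input features -> first hidden layer
    nn.ReLU(),                  # non-linearity
    nn.Linear(16, 8),           # first hidden layer -> second hidden layer
    nn.ReLU(),
    nn.Linear(8, 1),            # single output logit for binary classification
)

# A random batch of 4 rows stands in for real tabular data.
batch = torch.randn(4, n_features)
logits = model(batch)           # raw scores, shape (4, 1)
probs = torch.sigmoid(logits)   # map logits to probabilities in [0, 1]

print(logits.shape)
```

Each `nn.Linear` layer connects every input to every output, which is what "fully connected" refers to; the `ReLU` activations between them are what allow the model to represent non-linear relationships.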

-----------------------------------------------------

## Why PyTorch?

PyTorch is a **low-level deep learning framework**
that provides explicit control over:

- model architecture
- forward pass
- loss computation
- backpropagation
- parameter updates

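Unlike scikit-learn:
- the training loop is not hidden behind a single `fit()` call
- every step must be written explicitly (only the gradient computation itself is automated, by autograd)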

This makes PyTorch ideal for:
- learning how neural networks actually work
- understanding gradient-based optimization
- experimenting with custom architectures
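
The five items listed above (architecture, forward pass, loss, backpropagation, parameter update) can be sketched as a single training step. The data here is synthetic, and the model size, loss function, optimizer and learning rate are assumptions made for the example rather than the choices used later in this notebook.

```python
import torch
import torch.nn as nn
import torch.optim as optim

torch.manual_seed(0)

# Synthetic stand-in data (not the notebook's dataset): 64 samples, 5 features.
X = torch.randn(64, 5)
y = (X[:, 0] > 0).float().unsqueeze(1)  # binary labels, shape (64, 1)

# Model architecture: one hidden layer, one output logit.
model = nn.Sequential(nn.Linear(5, 8), nn.ReLU(), nn.Linear(8, 1))
criterion = nn.BCEWithLogitsLoss()            # binary cross-entropy on logits
optimizer = optim.SGD(model.parameters(), lr=0.1)

weights_before = model[0].weight.detach().clone()

optimizer.zero_grad()         # clear gradients left over from a previous step
logits = model(X)             # forward pass
loss = criterion(logits, y)   # loss computation
loss.backward()               # backpropagation: autograd fills .grad on each parameter
optimizer.step()              # parameter update

weights_after = model[0].weight.detach().clone()

print(f"loss: {loss.item():.4f}")
print("weights changed:", not torch.equal(weights_before, weights_after))
```

A full training run repeats these five calls in a loop; scikit-learn performs the equivalent work inside `fit()`.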

-----------------------------------------------------

## Execution model: eager execution

PyTorch uses **eager execution** by default.

This means:
- operations are executed immediately
- tensors can be printed and inspected like regular Python objects
- debugging is straightforward

Eager execution makes PyTorch:
- intuitive to learn
- flexible to experiment with
- closer to the mathematical description of the model
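
A small sketch of what eager execution looks like in practice: each line runs immediately, intermediate values can be printed, and ordinary Python control flow can depend on tensor values. The numbers are arbitrary and chosen only for illustration.

```python
import torch

x = torch.tensor([1.0, 2.0, 3.0])

# The result is computed as soon as this line runs; no separate graph-building step.
y = x * 2 + 1
print(y)  # tensor([3., 5., 7.])

# Ordinary Python control flow can branch on a tensor value.
if y.sum() > 10:
    y = y - 1

print(y.tolist())  # [2.0, 4.0, 6.0]
```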

-----------------------------------------------------

## What you should expect from the results

With Deep Learning (PyTorch), you should expect:

- non-linear decision boundaries
- strong performance when the data contain complex, non-linear patterns
- behavior similar to scikit-learn neural networks
- higher transparency during training

However:
- more code is required
- implementation errors are easier to make
- careful design is necessary

-----------------------------------------------------


## 1. Project setup and common pipeline

In this section we set up the common pipeline
used across classification models in this project.

Although this notebook uses **PyTorch**,
the overall workflow remains identical
to the scikit-learn Deep Learning notebook.

This allows us to:
- reuse the same data preparation steps
- compare models fairly
- isolate the effect of the framework choice


In [1]:
# Common imports used across classification models

import numpy as np
import pandas as pd

from sklearn.datasets import load_breast_cancer
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    classification_report,
    ConfusionMatrixDisplay
)

from pathlib import Path
import matplotlib.pyplot as plt

# ====================================
# PyTorch imports
# ====================================

import torch
import torch.nn as nn
import torch.optim as optim
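
Because results are compared across notebooks, it can help to fix the random seeds right after the imports so that weight initialization is repeatable between runs. This is an optional sketch; the seed value `42` is an arbitrary assumption, and seeding alone does not guarantee bit-identical results on every hardware setup.

```python
import numpy as np
import torch

SEED = 42  # arbitrary value chosen for this example

np.random.seed(SEED)     # NumPy-based randomness
torch.manual_seed(SEED)  # PyTorch weight initialization and other random ops

# Quick check: re-seeding reproduces the same random numbers.
first = torch.rand(3)
torch.manual_seed(SEED)
second = torch.rand(3)

print(torch.equal(first, second))  # True
```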


### What changes with PyTorch

Compared to scikit-learn:
- the pipeline structure remains the same
- data preparation and evaluation stay unchanged
- only the model implementation differs

With PyTorch, we explicitly define:
- how the model processes the input
- how the loss is computed
- how parameters are updated

Nothing is hidden behind a `fit()` call.

Every step of the training loop
is written manually in code.

This makes PyTorch ideal
for understanding what neural networks
are actually doing during training.

In the next section,
we will load the dataset
and prepare it for PyTorch training.
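
"Preparing data for PyTorch" mostly means converting NumPy arrays into tensors with suitable dtypes and shapes. The sketch below uses synthetic arrays as a stand-in for the real features and labels. The variable names are hypothetical, and the label shape `(n, 1)` assumes a single-logit output trained with `nn.BCEWithLogitsLoss`.

```python
import numpy as np
import torch

# Synthetic stand-in arrays (not the notebook's dataset).
rng = np.random.default_rng(0)
X_np = rng.normal(size=(8, 30))    # float64 features, as NumPy produces by default
y_np = rng.integers(0, 2, size=8)  # integer labels 0/1

# nn.Linear parameters are float32 by default, so features are cast to match.
X_tensor = torch.tensor(X_np, dtype=torch.float32)

# Float labels with shape (n, 1) line up with a single-logit model output.
y_tensor = torch.tensor(y_np, dtype=torch.float32).unsqueeze(1)

print(X_tensor.shape, X_tensor.dtype)
print(y_tensor.shape, y_tensor.dtype)
```

Skipping the `float32` cast is a common source of dtype mismatch errors, because NumPy defaults to `float64`.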


-----------------------------------------------------

## 2. Dataset loading

In this section we load the dataset
used for the Deep Learning classification task.

We intentionally use the **same dataset**
adopted in previous classification notebooks.

This ensures:
- direct comparison with classical ML models
- fair comparison across deep learning frameworks
- focus on implementation differences, not on data
