# Error Estimation

In this notebook, we will implement and test **error estimation approaches** for evaluating classifiers and/or learning algorithms.

First, we will implement $k$-fold cross-validation with and without stratification.

Subsequently, we will use nested $k$-fold cross-validation on an example dataset to perform model selection.

### **Table of Contents**
1. [$k$-fold Cross-validation](#k-fold-cross-validation)
2. [Model Selection](#model-selection)

In [None]:
%load_ext autoreload
%autoreload 2

import matplotlib.pyplot as plt
import numpy as np

### **1. $k$-fold Cross-validation** <a class="anchor" id="k-fold-cross-validation"></a>

We implement the function [`cross_validation`](../e2ml/evaluation/_error_estimation.py) in the [`e2ml.evaluation`](../e2ml/evaluation) subpackage. Once the implementation has been completed, we visualize and compare standard and stratified cross-validation.
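
To make the difference between the two variants concrete, here is a minimal sketch of how fold indices could be generated. It is not the `cross_validation` function of `e2ml`, whose signature is not shown here; the helper name `k_fold_indices` and its parameters are hypothetical and chosen only for illustration.

```python
import numpy as np


def k_fold_indices(y, n_folds=3, stratified=False, random_state=0):
    """Hypothetical helper: return a list of (train_indices, test_indices)."""
    y = np.asarray(y)
    rng = np.random.default_rng(random_state)
    fold_of_sample = np.empty(len(y), dtype=int)
    if stratified:
        # Shuffle the samples of each class separately and deal them
        # round-robin over the folds, so each fold roughly preserves
        # the overall class proportions.
        offset = 0
        for c in np.unique(y):
            idx = rng.permutation(np.flatnonzero(y == c))
            fold_of_sample[idx] = (np.arange(len(idx)) + offset) % n_folds
            offset += len(idx)
    else:
        # Shuffle all samples at once, ignoring the class labels.
        idx = rng.permutation(len(y))
        fold_of_sample[idx] = np.arange(len(y)) % n_folds
    all_idx = np.arange(len(y))
    return [
        (all_idx[fold_of_sample != f], all_idx[fold_of_sample == f])
        for f in range(n_folds)
    ]


# Same artificial labels as in the cell below: 30 / 60 / 10 samples per class.
y_demo = np.zeros(100, dtype=int)
y_demo[30:90] = 1
y_demo[90:] = 2

for stratified in (False, True):
    for f, (_, test) in enumerate(k_fold_indices(y_demo, 3, stratified)):
        # Class counts inside the test fold; these are the bar heights
        # one would plot per fold.
        print(stratified, f, np.bincount(y_demo[test], minlength=3))
```

With stratification, the per-class counts of any two folds differ by at most one, whereas the standard variant leaves them to chance.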

In [None]:
from e2ml.evaluation import cross_validation
# Generate artificial class labels.
y = np.zeros(100)
sample_indices = np.arange(len(y), dtype=int)
y[30:90] = 1
y[90:] = 2

# Visualize standard (k=3)-fold cross-validation via a bar plot showing the
# class distribution within each fold.
# TODO

# Visualize stratified (k=3)-fold cross-validation via a bar plot showing the
# class distribution within each fold.
# TODO


### **2. Model Selection** <a class="anchor" id="model-selection"></a>

In the following, we perform a small evaluation study including a model selection. Our goal is to compare the learning algorithms of a [*support vector classifier*](https://scikit-learn.org/stable/modules/generated/sklearn.svm.SVC.html#sklearn.svm.SVC) (SVC) and a [*multi-layer perceptron*](https://scikit-learn.org/stable/modules/generated/sklearn.neural_network.MLPClassifier.html#sklearn.neural_network.MLPClassifier) (MLP) on the [*breast cancer*](https://scikit-learn.org/stable/modules/generated/sklearn.datasets.load_breast_cancer.html#sklearn.datasets.load_breast_cancer) data set. In each run, we generate 20 hyperparameter configurations according to one of the popular experimentation methods. The studied hyperparameters are the regularization parameter $C \in (0, 1000)$ (`C`) and the so-called bandwidth $\gamma \in (0, 1]$ (`gamma`) for the SVC, and the learning rate $\eta \in (0, 1]$ (`learning_rate_init`) and another regularization parameter $\alpha \in (0, 1)$ (`alpha`) for the MLP. Further, we use nested stratified $k$-fold cross-validation with $k=5$ as the error estimation approach. The zero-one loss serves as the performance measure, and we report the empirical mean and standard deviation of the risk estimates.
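
One possible shape of such a study is sketched below for the SVC. Several choices are assumptions rather than part of the task description: uniform random sampling stands in for "one of the popular experimentation methods", a "run" is taken to be one outer fold, scikit-learn's `StratifiedKFold` replaces the `cross_validation` function from `e2ml`, and the names `sample_svc_params`, `cv_losses`, and `nested_cv` are hypothetical.

```python
import numpy as np
from sklearn.base import clone
from sklearn.metrics import zero_one_loss
from sklearn.model_selection import StratifiedKFold
from sklearn.svm import SVC


def sample_svc_params(rng):
    # Assumption: uniform random sampling. rng.random() lies in [0, 1),
    # so 1 - rng.random() lies in (0, 1], which keeps C and gamma positive.
    return {"C": 1000 * (1 - rng.random()), "gamma": 1 - rng.random()}


def cv_losses(estimator, X, y, n_folds, seed):
    """Zero-one loss on each test fold of a stratified k-fold split."""
    skf = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    losses = []
    for train, test in skf.split(X, y):
        model = clone(estimator).fit(X[train], y[train])
        losses.append(zero_one_loss(y[test], model.predict(X[test])))
    return losses


def nested_cv(X, y, estimator, sample_params, n_configs=20, n_folds=5, seed=0):
    """Return mean and standard deviation of the outer-fold risk estimates."""
    rng = np.random.default_rng(seed)
    outer = StratifiedKFold(n_splits=n_folds, shuffle=True, random_state=seed)
    outer_losses = []
    for train, test in outer.split(X, y):
        # Draw fresh configurations for this outer fold.
        configs = [sample_params(rng) for _ in range(n_configs)]
        # Inner loop: estimate the risk of each configuration using only
        # the outer training data.
        inner_risks = [
            np.mean(cv_losses(clone(estimator).set_params(**params),
                              X[train], y[train], n_folds, seed))
            for params in configs
        ]
        best = configs[int(np.argmin(inner_risks))]
        # Refit the selected configuration on the full outer training data
        # and evaluate it once on the held-out outer test fold.
        model = clone(estimator).set_params(**best).fit(X[train], y[train])
        outer_losses.append(zero_one_loss(y[test], model.predict(X[test])))
    # np.std uses the population formula (ddof=0) by default.
    return float(np.mean(outer_losses)), float(np.std(outer_losses))
```

With the data loaded as in the cell below, `nested_cv(X, y, SVC(), sample_svc_params)` would return the mean and standard deviation over the five outer folds. The MLP could be handled the same way by passing `MLPClassifier()` and a sampler that returns `learning_rate_init` and `alpha`. Because hyperparameters are selected only on the inner folds, the outer test folds never influence the selection.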

In [None]:
from sklearn.svm import SVC
from sklearn.neural_network import MLPClassifier
from sklearn.datasets import load_breast_cancer

# Load breast cancer data set.
X, y = load_breast_cancer(return_X_y=True)

# Perform evaluation study.
# TODO
