# Nested Cross-Validation   

Nested cross-validation (CV) is often used to train a model whose hyperparameters also need to be optimized. Nested CV estimates the generalization error of the underlying model together with its (hyper)parameter search.
  
Model selection without nested CV uses the same data both to tune model parameters and to evaluate model performance. Information may thus *leak* into the model, which then overfits the data and reports an optimistically biased score.
  
To avoid this problem, nested CV uses a series of **train/validation/test** set splits. In the inner loop, the score is approximately maximized by fitting a model to each training set, and then directly maximized by selecting the (hyper)parameters that score best on the validation set. In the outer loop, **generalization error** is estimated by averaging test set scores over several dataset splits.[[1]]()
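
The split structure described above can be sketched on a toy index array before any model is involved. The sizes used here (12 samples, 3 outer folds, 2 inner folds) and the names `outer_cv`, `inner_cv` and `splits` are illustrative assumptions, not values taken from this notebook.

```python
import numpy as np
from sklearn.model_selection import KFold

# Toy index array standing in for a dataset of 12 samples (illustrative only)
indices = np.arange(12)

outer_cv = KFold(n_splits=3)  # outer loop: (train + validation) vs. test
inner_cv = KFold(n_splits=2)  # inner loop: train vs. validation

splits = []
for trainval_pos, test_pos in outer_cv.split(indices):
    trainval_idx, test_idx = indices[trainval_pos], indices[test_pos]
    for train_pos, val_pos in inner_cv.split(trainval_idx):
        # Inner positions are relative to trainval_idx, so map them back
        train_idx, val_idx = trainval_idx[train_pos], trainval_idx[val_pos]
        splits.append((train_idx, val_idx, test_idx))
        print("train", train_idx, "| validation", val_idx, "| test", test_idx)
```

Each test set stays untouched while the inner loop chooses hyperparameters, which is what keeps the outer score free of tuning leakage.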

In [1]:
from sklearn.datasets import load_wine
from matplotlib import pyplot as plt
from sklearn.svm import SVC
from sklearn.model_selection import GridSearchCV, cross_val_score, KFold
import numpy as np

In [2]:
random=30

In [3]:
wine=load_wine()
X_wine = wine.data
y_wine = wine.target

In [5]:
param_grid = {
        "C":[1, 10, 100],
        "gamma": [.01, .1]
        }
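
To see what this grid asks the search to try, the candidate settings can be listed with scikit-learn's `ParameterGrid`. This is a small sketch; the variable name `candidates` is an assumption introduced for illustration.

```python
from sklearn.model_selection import ParameterGrid

param_grid = {
    "C": [1, 10, 100],
    "gamma": [0.01, 0.1],
}

# Expand the grid into every (C, gamma) combination the search will evaluate
candidates = list(ParameterGrid(param_grid))
for params in candidates:
    print(params)
print(len(candidates), "candidate settings")
```

With 3 values of `C` and 2 of `gamma`, every inner fold fits 6 candidate models, so the cost of nested CV grows with the grid size times the number of inner and outer folds.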

In [7]:
# We will use a Support Vector Classifier with "rbf" kernel
svm = SVC(kernel="rbf")

In [10]:
# Shuffle the folds: the wine samples are stored in class order
cv = KFold(n_splits=3, shuffle=True, random_state=random)
nested_cv_scores=[]
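
One way the outer loop might then be written is sketched below as a self-contained cell. It is not necessarily the loop the original notebook uses. The names `outer_cv`, `inner_cv` and `search` are assumptions, as is the choice to shuffle both splitters with the seed 30 (the value of `random` above). Shuffling matters here because the wine samples are stored in class order, so contiguous folds would hold out almost an entire class at once.

```python
import numpy as np
from sklearn.datasets import load_wine
from sklearn.model_selection import GridSearchCV, KFold
from sklearn.svm import SVC

X, y = load_wine(return_X_y=True)
param_grid = {"C": [1, 10, 100], "gamma": [0.01, 0.1]}

# Assumption: shuffled folds with a fixed seed, for reproducible mixed-class folds
outer_cv = KFold(n_splits=3, shuffle=True, random_state=30)
inner_cv = KFold(n_splits=3, shuffle=True, random_state=30)

nested_cv_scores = []
for train_idx, test_idx in outer_cv.split(X):
    # Inner loop: grid search sees only the outer training portion
    search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=inner_cv)
    search.fit(X[train_idx], y[train_idx])
    # Outer loop: score the refitted best model on the held-out test fold
    nested_cv_scores.append(search.score(X[test_idx], y[test_idx]))

print("per-fold test accuracy:", np.round(nested_cv_scores, 3))
print("nested CV estimate:", np.mean(nested_cv_scores))
```

The mean of `nested_cv_scores` estimates how well the whole procedure (grid search plus SVC) generalizes. It does not score any single hyperparameter setting, since each outer fold may select a different one.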

---

[Nested CV Pipeline Build Tutorial](https://github.com/IvanShakirov/Jupyter/blob/8ff431d55845026fd16f993d6f309b1fae8ca463/mlcourse.ai/jupyter_english/tutorials/nested_cross_validation_tatyana_kudasova.ipynb)

---