# Model Evaluation
Strategies for evaluating the quality of a model produced by a learning algorithm.
The main idea: do not train and test on the same data. A model is only useful and robust if it behaves well on data it has never seen before.

What we want to measure is how well a model makes predictions on data it has never seen before.

Simple solution: split the data into 80% training and 20% test.
Weaknesses of this approach:

1-. The performance of the model can be highly dependent on which few observations were selected for the test set.

2-. The model is not trained on all the available data, and it is not evaluated on all the available data.
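The first weakness can be seen by repeating the 80/20 split with different random seeds. The following is a minimal sketch, assuming the digits dataset used later in this notebook; the five seeds and `max_iter=1000` are arbitrary choices made for illustration, not part of the original analysis.

```python
from sklearn import datasets
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split
from sklearn.pipeline import make_pipeline
from sklearn.preprocessing import StandardScaler

X, y = datasets.load_digits(return_X_y=True)

holdout_scores = []
for seed in range(5):  # five arbitrary seeds, purely illustrative
    # A different seed selects a different 20% of rows as the test set
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    # max_iter=1000 is an assumption, used to avoid convergence warnings
    model = make_pipeline(StandardScaler(), LogisticRegression(max_iter=1000))
    model.fit(X_tr, y_tr)
    holdout_scores.append(model.score(X_te, y_te))  # accuracy on the held-out rows

print(holdout_scores)
print("spread:", max(holdout_scores) - min(holdout_scores))
```

Any spread between the printed scores comes only from which observations landed in the test set, since the model and the data are otherwise identical.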

K-Fold Cross Validation (KFCV): We split the data into k parts called folds. The model is trained on k-1 folds and the remaining fold is used as the test set. This is repeated k times, each time holding out a different fold as the test set.

The performance of the model for each of the k iterations is then averaged to produce an overall measurement. 
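The splitting procedure described above can be sketched by hand with NumPy. This is only an illustration of the idea, not how scikit-learn implements `KFold`; the sizes (`n_samples = 10`, `k = 5`) and the seed are hypothetical values chosen to keep the output small.

```python
import numpy as np

n_samples, k = 10, 5                    # toy sizes, chosen for illustration
rng = np.random.default_rng(1)
indices = rng.permutation(n_samples)    # shuffle the row indices once
folds = np.array_split(indices, k)      # k folds of (roughly) equal size

splits = []
for i in range(k):
    test_idx = folds[i]                 # fold i is held out as the test set
    # The remaining k-1 folds together form the training set
    train_idx = np.concatenate([folds[j] for j in range(k) if j != i])
    splits.append((train_idx, test_idx))
    print(f"iteration {i}: train on {len(train_idx)} rows, test on {len(test_idx)} rows")
```

Every observation appears in the test set exactly once and in the training set k-1 times, which is what addresses the second weakness of a single split.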

## Cross Validating Models
Check how well our model works.


In [31]:
# Load libraries

from sklearn import datasets
from sklearn import metrics
from sklearn.model_selection import KFold, cross_val_score
from sklearn.pipeline import make_pipeline
from sklearn.linear_model import LogisticRegression
from sklearn.preprocessing import StandardScaler

In [32]:
# Load data

digits = datasets.load_digits()

In [33]:
# Create Feature Matrix

features = digits.data

In [34]:
# Create target vector 

target = digits.target

In [35]:
# Create a Standardizer

standardizer = StandardScaler()

In [36]:
# Create Logistic Regression object

logit = LogisticRegression()

In [37]:
# Create a pipeline that standardizes, then runs logistic regression

pipeline  = make_pipeline(standardizer, logit)

In [38]:
# Create K-Fold cross-validation 

kf = KFold(n_splits = 10, shuffle = True, random_state = 1) 

In [39]:
# Conduct K-Fold cross-validation

cv_results = cross_val_score(pipeline, # Pipeline
                             features, # Feature Matrix
                             target,   # Target Vector
                             cv = kf,  # Cross Validation Technique
                             scoring = "accuracy", # Scoring metric (not a loss)
                             n_jobs = -1) # Use all CPU cores


In [40]:
# Calculate Mean

cv_results.mean()

0.9693916821849783

In [41]:
# Show results for 10 folds.
cv_results

array([0.97777778, 0.98888889, 0.96111111, 0.94444444, 0.97777778,
       0.98333333, 0.95555556, 0.98882682, 0.97765363, 0.93854749])

#### Points to consider when using KFCV

1-. It assumes the data is independent and identically distributed (IID). If the data is IID, shuffling the observations when assigning folds is a good idea (shuffle = True).

2-. When using KFCV to evaluate a classifier, it is beneficial for each fold to contain roughly the same percentage of observations from each of the target classes (Stratified K-Fold). For example, if the target is 80% men and 20% women, each fold should maintain that 80-20 ratio.

3-. Standardizer: fit the standardizer on the training set only, then apply the transformation to both the training and test sets.
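The second point can be illustrated with scikit-learn's `StratifiedKFold`. This is a small sketch on made-up data: the toy target `y_toy` (80 observations of class 0, 20 of class 1) and the placeholder features `X_toy` are assumptions that mirror the 80-20 example, not data from this notebook.

```python
import numpy as np
from sklearn.model_selection import StratifiedKFold

# Hypothetical imbalanced target: 80% class 0, 20% class 1
y_toy = np.array([0] * 80 + [1] * 20)
X_toy = np.zeros((100, 1))  # placeholder features; only the labels matter here

skf = StratifiedKFold(n_splits=5, shuffle=True, random_state=1)

minority_share = []
for _, test_idx in skf.split(X_toy, y_toy):
    # Fraction of class 1 in this fold's test set
    minority_share.append(y_toy[test_idx].mean())

print(minority_share)
```

Passing `cv = skf` to `cross_val_score` instead of a plain `KFold` object would apply the same stratification during evaluation.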



In [42]:
# Import libraries

from sklearn.model_selection import train_test_split

In [44]:
# Create training and test sets

features_train, features_test, target_train, target_test = train_test_split(features,target,test_size = 0.1, random_state = 1)

In [46]:
# Fit standardizer to training set

standardizer.fit(features_train)

StandardScaler(copy=True, with_mean=True, with_std=True)

If we fit our preprocessor using observations from both the training and test sets, some information from the test set leaks into training.
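A small sketch of what the leak looks like for a standardizer, using randomly generated data as a stand-in (the array `X_demo` and the names `scaler_ok` and `scaler_leaky` are hypothetical, introduced only for this illustration):

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

rng = np.random.default_rng(0)
X_demo = rng.normal(size=(200, 3))  # made-up data for illustration

X_tr, X_te = train_test_split(X_demo, test_size=0.1, random_state=1)

scaler_ok = StandardScaler().fit(X_tr)        # statistics from training rows only
scaler_leaky = StandardScaler().fit(X_demo)   # statistics also include test rows

# The leaky scaler's means differ because the test rows influenced them
print(scaler_ok.mean_)
print(scaler_leaky.mean_)
```

With the leaky scaler, the transformed training features already depend on the test observations, so the test score is no longer a measure of performance on truly unseen data. Placing the standardizer inside a pipeline, as this notebook does, lets `cross_val_score` refit it on the training folds of each iteration.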

In [48]:
# Apply to both training and test set

features_train_std = standardizer.transform(features_train)
features_test_std = standardizer.transform(features_test)

In [49]:
# Pipeline to preprocess data and then train a model 

pipeline = make_pipeline(standardizer, logit)

In [50]:
# Do K-Fold Cross Validation (KFCV)

cv_results = cross_val_score(pipeline, # Pipeline
                             features, # Feature Matrix
                             target,   # Target Vector
                             cv = kf,  # Cross Validation Technique
                             scoring = "accuracy", # Scoring metric (not a loss)
                             n_jobs = -1) # Use all CPU cores
