# Model Evaluation

### Why a simple train/test split isn’t always enough

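A single train/test split gives only one performance estimate, and that estimate depends on which samples happen to land in the test set. On a small dataset the score can change noticeably from one random split to another, and a small test set may under-represent some classes, so the reported number can be unreliable.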
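To make the point above concrete, here is a minimal sketch that repeats the same 80/20 split with different seeds and records the test accuracy each time. The seed range `0..9`, the `max_iter=1000` setting, and the variable names are illustrative assumptions, not part of the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

split_scores = []
for seed in range(10):  # arbitrary seeds, chosen only for illustration
    # Same split ratio every time; only the random shuffling changes
    X_tr, X_te, y_tr, y_te = train_test_split(
        X, y, test_size=0.2, random_state=seed
    )
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X_tr, y_tr)
    split_scores.append(clf.score(X_te, y_te))

print("Accuracy per split:", [round(s, 3) for s in split_scores])
print("Min:", min(split_scores), "Max:", max(split_scores))
```

If the minimum and maximum differ, the gap is a rough indication of how much a single split can mislead.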

### Cross Validation

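Cross-validation is a resampling technique for estimating how well a machine learning model will generalize to independent, unseen data. In k-fold cross-validation the dataset is divided into k parts (folds). The model is trained on k-1 folds and evaluated on the remaining fold, and this is repeated k times so that each fold serves as the test set exactly once. The k scores are then averaged.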
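The k-fold procedure can be sketched by hand, as below. This is an illustrative loop, not the notebook's own code: it assumes `StratifiedKFold` with 5 folds (the splitter scikit-learn uses for classifiers when `cv` is an integer) and `max_iter=1000`.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold

X, y = load_iris(return_X_y=True)
skf = StratifiedKFold(n_splits=5)

fold_scores = []
fold_sizes = []
for train_idx, test_idx in skf.split(X, y):
    # Fit a fresh model on the k-1 training folds
    clf = LogisticRegression(max_iter=1000)
    clf.fit(X[train_idx], y[train_idx])
    # Score on the single held-out fold
    fold_scores.append(clf.score(X[test_idx], y[test_idx]))
    fold_sizes.append(len(test_idx))

print("Held-out fold sizes:", fold_sizes)
print("Per-fold accuracy:", fold_scores)
print("Mean accuracy:", sum(fold_scores) / len(fold_scores))
```

Every sample appears in exactly one held-out fold, so the whole dataset contributes to the estimate.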

**Bias**

A high-bias model is too simple to capture the underlying patterns in the data. It consistently makes the same type of error because it rests on a mistaken assumption about the data's structure.

**Variance**

A high-variance model is too complex: it learns the noise and random fluctuations in the training data rather than the intended pattern, so it is overly sensitive to the specifics of the training set.

**Bias-Variance trade-off**

In machine learning, bias is the error from overly simplistic assumptions in the learning algorithm, while variance is the error from the model being too sensitive to the training data. Reducing one typically increases the other. This relationship, known as the bias-variance trade-off, explains the problems of underfitting (high bias, low variance) and overfitting (low bias, high variance).
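One way to see underfitting and overfitting is to compare training accuracy with cross-validated accuracy as model complexity grows. The sketch below uses a decision tree with different `max_depth` values purely as an illustration; the tree model, the depths `(1, 3, None)`, and `random_state=0` are assumptions that do not appear in the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.model_selection import cross_validate
from sklearn.tree import DecisionTreeClassifier

X, y = load_iris(return_X_y=True)

results = {}
for depth in (1, 3, None):  # None lets the tree grow without a depth limit
    tree = DecisionTreeClassifier(max_depth=depth, random_state=0)
    cv = cross_validate(tree, X, y, cv=5, return_train_score=True)
    # Store (mean training accuracy, mean held-out accuracy)
    results[depth] = (cv["train_score"].mean(), cv["test_score"].mean())
    print(f"max_depth={depth}: train={results[depth][0]:.3f}, "
          f"cv={results[depth][1]:.3f}")
```

A depth-1 tree can separate at most two groups, so it scores poorly even on the training data (high bias). A large gap between training and cross-validated accuracy for the deeper trees would point toward high variance.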

## Practical

#### Loading iris

In [1]:
from sklearn.datasets import load_iris
iris = load_iris()
X = iris.data
y = iris.target

#### Train Test Split

In [2]:
from sklearn.model_selection import train_test_split
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)
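The split above is not stratified, so the class counts in the test set can be uneven (the confusion matrix later in this notebook shows 10, 9 and 11 samples per class). As a hedged variation, not the notebook's own choice, passing `stratify=y` asks `train_test_split` to preserve class proportions. The variable names below are illustrative.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# stratify=y keeps the class proportions of y in both train and test sets
X_tr, X_te, y_tr, y_te = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

test_counts = np.bincount(y_te)  # number of test samples per class
print("Test samples per class:", test_counts)
```

Because iris has 50 samples in each of its three classes, a stratified 20% test set should contain 10 samples of each class.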

#### Training the model and predictions

In [12]:
from sklearn.linear_model import LogisticRegression
model = LogisticRegression(max_iter=200)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

## Evaluation

#### Evaluate Logistic Regression with accuracy, confusion matrix and classification report

In [6]:
from sklearn.metrics import accuracy_score, confusion_matrix, classification_report
print("\nAccuracy: ", accuracy_score(y_test, y_pred))
print("\nConfusion matrix: \n", confusion_matrix(y_test, y_pred))
print("\nClassification report: \n", classification_report(y_test, y_pred))


Accuracy:  1.0

Confusion matrix: 
 [[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]

Classification report: 
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30



#### Cross validation

In [18]:
from sklearn.model_selection import cross_val_score
import numpy as np

cv_scores = cross_val_score(model, X, y, cv=5)
print("\nCross Validation scores: ", cv_scores)
print("\nMean of Cv scores: ", np.mean(cv_scores))


Cross Validation scores:  [0.96666667 1.         0.93333333 0.96666667 1.        ]

Mean of Cv scores:  0.9733333333333334


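The mean CV accuracy is 0.973. Because it averages five folds in which every sample is used for testing exactly once, it is a more realistic estimate of model performance than the 1.0 obtained from a single 30-sample test set.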
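Reporting the spread of the fold scores alongside the mean gives a sense of how stable the estimate is. The sketch below reuses the five scores printed above as hard-coded values; the variable names are illustrative.

```python
import numpy as np

# Fold scores copied from the cross-validation output above
cv_scores = np.array([0.96666667, 1.0, 0.93333333, 0.96666667, 1.0])

mean_score = cv_scores.mean()
std_score = cv_scores.std()  # population standard deviation across folds

print(f"CV accuracy: {mean_score:.3f} +/- {std_score:.3f}")
```

A small standard deviation relative to the mean suggests the model behaves similarly across folds.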

## K-Nearest Neighbours (KNN)

In [19]:
from sklearn.neighbors import KNeighborsClassifier
model = KNeighborsClassifier(n_neighbors=5)
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

## Evaluation

#### Evaluate KNN with accuracy, confusion matrix and classification report

In [20]:
print("\nAccuracy: ", accuracy_score(y_test, y_pred))
print("\nConfusion matrix: \n", confusion_matrix(y_test, y_pred))
print("\nClassification report: \n", classification_report(y_test, y_pred))


Accuracy:  1.0

Confusion matrix: 
 [[10  0  0]
 [ 0  9  0]
 [ 0  0 11]]

Classification report: 
               precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      1.00      1.00         9
           2       1.00      1.00      1.00        11

    accuracy                           1.00        30
   macro avg       1.00      1.00      1.00        30
weighted avg       1.00      1.00      1.00        30



#### Cross Validation of KNN

In [22]:
knn_scores = cross_val_score(model, X, y, cv=5)
print("\nCross Validation scores: ", knn_scores)
print("\nMean of Knn scores: ", np.mean(knn_scores))


Cross Validation scores:  [0.96666667 1.         0.93333333 0.96666667 1.        ]

Mean of Knn scores:  0.9733333333333334
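To compare the two models on identical folds, one option is to pass the same splitter object to `cross_val_score` for each estimator. The sketch below is an assumption-laden variation on the cells above: the shuffled `StratifiedKFold` with `random_state=42`, the `max_iter=1000` setting, and the dictionary names are not from the original notebook.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score
from sklearn.neighbors import KNeighborsClassifier

X, y = load_iris(return_X_y=True)

# A fixed random_state makes the splitter yield the same folds on every call
folds = StratifiedKFold(n_splits=5, shuffle=True, random_state=42)

candidates = {
    "logreg": LogisticRegression(max_iter=1000),
    "knn": KNeighborsClassifier(n_neighbors=5),
}

summary = {}
for name, estimator in candidates.items():
    scores = cross_val_score(estimator, X, y, cv=folds)
    summary[name] = (scores.mean(), scores.std())
    print(f"{name}: mean={scores.mean():.3f}, std={scores.std():.3f}")
```

With only 150 samples, small differences between the two means are unlikely to be meaningful on their own.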
