# Step 1 — Imports & Sample Dataset

In [18]:
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split, cross_val_score, GridSearchCV
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report


# Step 2 — Load Dataset

In [2]:
X, y = load_iris(return_X_y=True)
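Before splitting, it can help to confirm the shape and class balance of the loaded arrays, since the later splits rely on `stratify`. The snippet below is a small sanity-check sketch; the use of `np.bincount` and the `class_counts` name are additions for illustration, not part of the original notebook.

```python
import numpy as np
from sklearn.datasets import load_iris

X, y = load_iris(return_X_y=True)

# 150 samples with 4 numeric features each
print("X shape:", X.shape)

# Number of samples per class label (0, 1, 2)
class_counts = np.bincount(y)
print("Class counts:", class_counts)
```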


# Step 3 — Train-Test Split (Test Set Holdout)

## First split into Train+Val and Test

In [3]:
X_train_val, X_test, y_train_val, y_test = train_test_split(
    X, y,
    test_size=0.2,
    random_state=42,
    stratify=y
)

print("Train+Val size:", X_train_val.shape)
print("Test size:", X_test.shape)


Train+Val size: (120, 4)
Test size: (30, 4)


# Step 4 — Train-Validation Split (for Tuning)

## Split Train+Val into Train and Validation

In [4]:
X_train, X_val, y_train, y_val = train_test_split(
    X_train_val, y_train_val,
    test_size=0.25,   # 0.25 * 0.8 = 0.20 of full data
    random_state=42,
    stratify=y_train_val
)

print("Train size:", X_train.shape)
print("Validation size:", X_val.shape)


Train size: (90, 4)
Validation size: (30, 4)


| Set        | % of Data |
| ---------- | --------- |
| Train      | 60%       |
| Validation | 20%       |
| Test       | 20%       |
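The percentages in the table can be checked directly. The sketch below repeats the two-stage split and prints each subset's share of the full data together with its per-class counts; the `fractions` dictionary is a helper introduced here for illustration only.

```python
import numpy as np
from sklearn.datasets import load_iris
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

# Same two-stage split as in Steps 3 and 4
X_train_val, X_test, y_train_val, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
X_train, X_val, y_train, y_val = train_test_split(
    X_train_val, y_train_val, test_size=0.25, random_state=42, stratify=y_train_val
)

# Share of the full dataset in each subset, plus per-class counts
fractions = {}
for name, labels in [("Train", y_train), ("Validation", y_val), ("Test", y_test)]:
    fractions[name] = len(labels) / len(y)
    print(f"{name}: {fractions[name]:.0%} of data, class counts {np.bincount(labels)}")
```

Because both splits are stratified, each subset keeps the same class proportions as the full dataset.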


# Step 5 — Train Model on Train Set

In [5]:
model = LogisticRegression(max_iter=200)

model.fit(X_train, y_train)


LogisticRegression(max_iter=200)


# Step 6 — Validate on Validation Set (Hyperparameter Tuning)

In [6]:
y_val_pred = model.predict(X_val)
val_acc = accuracy_score(y_val, y_val_pred)

print("Validation Accuracy:", val_acc)


Validation Accuracy: 0.9333333333333333


In [7]:
y_train_pred = model.predict(X_train)
train_acc = accuracy_score(y_train, y_train_pred)

print("Train Accuracy:", train_acc)

Train Accuracy: 0.9777777777777777


## This is where you’d try different models / hyperparameters.
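One possible shape for that tuning loop is sketched below: each candidate is fit on the train set only and compared on the validation set. The candidate values of `C` and `max_iter=1000` are assumptions chosen for illustration, and `val_scores` and `best_C` are names introduced here.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)
X_train_val, X_test, y_train_val, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
X_train, X_val, y_train, y_val = train_test_split(
    X_train_val, y_train_val, test_size=0.25, random_state=42, stratify=y_train_val
)

candidate_C = [0.01, 0.1, 1, 10, 100]  # assumed grid, for illustration only
val_scores = {}
for C in candidate_C:
    # Fit on the train set only; the validation set is used purely for scoring
    candidate = LogisticRegression(C=C, max_iter=1000)
    candidate.fit(X_train, y_train)
    val_scores[C] = accuracy_score(y_val, candidate.predict(X_val))

# Keep the C with the highest validation accuracy (the first one wins on ties)
best_C = max(val_scores, key=val_scores.get)
print("Validation accuracy per C:", val_scores)
print("Best C:", best_C)
```

The test set is never touched in this loop, so it stays available for an unbiased final check.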

# Step 7 — Cross-Validation (Robust Evaluation on Train Set)

### CV is done ONLY on training data

In [8]:
cv_scores = cross_val_score(
    model,
    X_train,
    y_train,
    cv=5,
    scoring='accuracy'
)

print("CV Scores:", cv_scores)
print("Mean CV Accuracy:", cv_scores.mean())


CV Scores: [1.         0.94444444 0.94444444 1.         1.        ]
Mean CV Accuracy: 0.9777777777777779
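For a classifier, passing an integer `cv` to `cross_val_score` uses stratified K-fold splitting without shuffling. The sketch below makes that splitter explicit with `StratifiedKFold` and reports the spread of the fold scores next to the mean; the variable names `scores_int` and `scores_explicit` are introduced here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import StratifiedKFold, cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)
X_train_val, X_test, y_train_val, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)
X_train, X_val, y_train, y_val = train_test_split(
    X_train_val, y_train_val, test_size=0.25, random_state=42, stratify=y_train_val
)

model = LogisticRegression(max_iter=200)

# Integer cv: the splitter is chosen implicitly
scores_int = cross_val_score(model, X_train, y_train, cv=5, scoring="accuracy")

# Same evaluation with the splitter spelled out
splitter = StratifiedKFold(n_splits=5)
scores_explicit = cross_val_score(model, X_train, y_train, cv=splitter, scoring="accuracy")

print("cv=5:           ", scores_int)
print("StratifiedKFold:", scores_explicit)
print("Mean +/- std: %.3f +/- %.3f" % (scores_explicit.mean(), scores_explicit.std()))
```

Reporting the standard deviation alongside the mean shows how much the estimate moves from fold to fold.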


# Step 8 — Final Training on Train+Val

## After tuning, retrain using all available non-test data

In [9]:
final_model = LogisticRegression(max_iter=200)
final_model.fit(X_train_val, y_train_val)


LogisticRegression(max_iter=200)


# Step 9 — Final Test Set Evaluation (Unseen Data)

### This is the estimate of REAL-world performance: the test set played no part in training or tuning

In [10]:
y_test_pred = final_model.predict(X_test)
test_acc = accuracy_score(y_test, y_test_pred)

print("Final Test Accuracy:", test_acc)


Final Test Accuracy: 0.9666666666666667


| Concept          | Where in Code              |
| ---------------- | -------------------------- |
| Train Set        | `X_train, y_train`         |
| Validation Set   | `X_val, y_val`             |
| Test Set         | `X_test, y_test`           |
| Train-Test Split | First `train_test_split()` |
| Cross-Validation | `cross_val_score()`        |


# Real-World Tip

There are 2 common approaches:

Approach 1 (What we did)

✔ Separate Validation Set + CV on Train
→ Best for large datasets

Approach 2 (No Validation Set)

✔ Use ONLY Train + CV
✔ Test Set at end
→ Common for small datasets

In [12]:
model = LogisticRegression(max_iter=500)

param_grid = {
    'C': [0.01, 0.1, 1, 10, 100],      # Regularization strength
    'penalty': ['l2'],                # l2 works with lbfgs
    'solver': ['lbfgs']
}


In [13]:
grid = GridSearchCV(
    estimator=model,
    param_grid=param_grid,
    cv=5,                  # 5-Fold Cross-Validation
    scoring='accuracy',
    n_jobs=-1
)

grid.fit(X_train_val, y_train_val)


GridSearchCV(cv=5, estimator=LogisticRegression(max_iter=500), n_jobs=-1,
             param_grid={'C': [0.01, 0.1, 1, 10, 100], 'penalty': ['l2'],
                         'solver': ['lbfgs']},
             scoring='accuracy')


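GridSearchCV internally does:

K-fold cross-validation (repeated train/validation splits) on the data passed to `fit`, once for every parameter combination, and then keeps the combination with the best mean score.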
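What the grid search does internally can be approximated by hand. The loop below is a simplified sketch, not the library's actual implementation: it cross-validates each value of `C` on Train+Val, keeps the best mean score, and refits on all of Train+Val. The names `mean_cv_scores`, `best_C` and `manual_best_model` are introduced here for illustration.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import cross_val_score, train_test_split

X, y = load_iris(return_X_y=True)
X_train_val, X_test, y_train_val, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

mean_cv_scores = {}
for C in [0.01, 0.1, 1, 10, 100]:
    candidate = LogisticRegression(C=C, max_iter=500)
    # 5-fold CV on Train+Val for this single parameter setting
    fold_scores = cross_val_score(
        candidate, X_train_val, y_train_val, cv=5, scoring="accuracy"
    )
    mean_cv_scores[C] = fold_scores.mean()

# Pick the setting with the best mean CV accuracy (the first one wins on ties)
best_C = max(mean_cv_scores, key=mean_cv_scores.get)

# Refit the winner on all of Train+Val, mirroring refit=True
manual_best_model = LogisticRegression(C=best_C, max_iter=500)
manual_best_model.fit(X_train_val, y_train_val)

print("Mean CV accuracy per C:", mean_cv_scores)
print("Best C:", best_C)
```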

In [15]:
print("Best Params:", grid.best_params_)
print("Best CV Accuracy:", grid.best_score_)


Best Params: {'C': 1, 'penalty': 'l2', 'solver': 'lbfgs'}
Best CV Accuracy: 0.9666666666666668
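Beyond the single best result, the fitted search object keeps per-candidate statistics in `cv_results_`. The sketch below prints the mean and standard deviation of the fold scores for each setting. It rebuilds the search with only `C` in the grid, which is a simplification made here to keep the example short.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import GridSearchCV, train_test_split

X, y = load_iris(return_X_y=True)
X_train_val, X_test, y_train_val, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42, stratify=y
)

grid = GridSearchCV(
    estimator=LogisticRegression(max_iter=500),
    param_grid={"C": [0.01, 0.1, 1, 10, 100]},
    cv=5,
    scoring="accuracy",
)
grid.fit(X_train_val, y_train_val)

# One entry per parameter combination, in the order they were tried
results = grid.cv_results_
for params, mean, std, rank in zip(
    results["params"],
    results["mean_test_score"],
    results["std_test_score"],
    results["rank_test_score"],
):
    print(f"rank {rank}: {params} -> {mean:.3f} +/- {std:.3f}")
```

Looking at the whole table shows whether the best setting is clearly ahead or only marginally better than its neighbours.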


grid.best_estimator_ is already refit on full Train+Val

In [16]:
best_model = grid.best_estimator_


In [19]:
y_test_pred = best_model.predict(X_test)

test_acc = accuracy_score(y_test, y_test_pred)
print("Final Test Accuracy:", test_acc)

print("\nClassification Report:\n")
print(classification_report(y_test, y_test_pred))


Final Test Accuracy: 0.9666666666666667

Classification Report:

              precision    recall  f1-score   support

           0       1.00      1.00      1.00        10
           1       1.00      0.90      0.95        10
           2       0.91      1.00      0.95        10

    accuracy                           0.97        30
   macro avg       0.97      0.97      0.97        30
weighted avg       0.97      0.97      0.97        30



# What’s REALLY happening

## Without GridSearchCV (manual):

Train → Validate → Tune → Repeat


## With GridSearchCV:

K-Fold CV on Train+Val for every parameter combination → Pick the best mean CV score

Then:

Retrain on FULL Train+Val

↓
Final Test Evaluation


**What is Cross-Validation (CV)?**

Cross-Validation = Repeated Train/Validation on different splits

Instead of trusting ONE train/validation split, CV:

✅ Uses multiple splits
✅ Trains multiple times
✅ Validates multiple times
✅ Averages performance
→ Gives a more reliable estimate

**The Problem with Single Validation Split**

One split:

Train (80%) | Validation (20%)


Problems:
❌ Result depends on random split
❌ Can be lucky/unlucky split
❌ High variance estimate
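The dependence on the random split can be observed by repeating a single 80/20 split with different seeds. This is an illustrative sketch: the seed range and `max_iter=1000` are arbitrary assumptions, and the exact scores depend on the installed library version.

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

X, y = load_iris(return_X_y=True)

split_scores = []
for seed in range(10):  # seeds chosen arbitrarily for illustration
    # A different random 80/20 split on every iteration
    X_tr, X_va, y_tr, y_va = train_test_split(
        X, y, test_size=0.2, random_state=seed, stratify=y
    )
    clf = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
    split_scores.append(clf.score(X_va, y_va))

print("Validation accuracy per split:", [round(s, 3) for s in split_scores])
print("Min:", min(split_scores), "Max:", max(split_scores))
```

Any gap between the minimum and maximum comes purely from which samples landed in the validation set, since the model and data are otherwise identical.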

# K-Fold Cross-Validation (Most Common)

Step-by-Step (K = 5 example)

Data is split into 5 equal parts (folds):

Fold1 | Fold2 | Fold3 | Fold4 | Fold5


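Iteration 1: Validate on Fold1 | Train on Fold2 + Fold3 + Fold4 + Fold5

Iteration 2: Validate on Fold2 | Train on Fold1 + Fold3 + Fold4 + Fold5

Iteration 3: Validate on Fold3 | Train on Fold1 + Fold2 + Fold4 + Fold5

Iteration 4: Validate on Fold4 | Train on Fold1 + Fold2 + Fold3 + Fold5

Iteration 5: Validate on Fold5 | Train on Fold1 + Fold2 + Fold3 + Fold4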

| Fold      | Role                 |
| --------- | -------------------- |
| Each fold | Validation ONCE      |
| Each fold | Training (K-1) times |


**Final CV Score**

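Mean CV Accuracy = (sum of the K fold accuracies) / K

This is your CV estimate. For example, with fold accuracies of 0.90, 0.92, 0.89, 0.91 and 0.93:

In [21]:
import numpy as np

# Average the five per-fold accuracies
np.mean([0.90, 0.92, 0.89, 0.91, 0.93])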

np.float64(0.9099999999999999)

**Why CV is Better (Intuition)**