# K-Fold Cross-Validation

### Why use K-Fold?

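A single train-test split gives a performance estimate that depends heavily on which rows happen to land in the test set, so it can make a model look better or worse than it really is. K-Fold Cross-Validation gives a more stable estimate by evaluating the model on several different splits of the same data.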

### How does it work?

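- Splits the data into K roughly equal parts, called folds (e.g., K=5).

- Trains the model on K-1 folds and tests it on the remaining fold.

- Repeats this K times, so that each fold serves as the test set exactly once, and averages the K scores.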
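The three steps above can be sketched from scratch in a few lines. This is only an illustration under stated assumptions, not how scikit-learn implements it: the helper names `k_fold_indices` and `cross_validate_mean_model` are hypothetical, the folds are contiguous and unshuffled, the target values are made up, and the "model" is a toy that simply predicts the mean of the training targets, scored with mean squared error.

```python
from statistics import mean


def k_fold_indices(n_samples, k):
    """Split range(n_samples) into k contiguous folds of near-equal size."""
    # The first (n_samples % k) folds get one extra sample each.
    fold_sizes = [n_samples // k + (1 if i < n_samples % k else 0) for i in range(k)]
    folds, start = [], 0
    for size in fold_sizes:
        folds.append(list(range(start, start + size)))
        start += size
    return folds


def cross_validate_mean_model(y, k):
    """Run K-fold CV with a toy mean-predicting model; return the MSE of each fold."""
    folds = k_fold_indices(len(y), k)
    scores = []
    for i, test_idx in enumerate(folds):
        # Train on every fold except fold i.
        train_idx = [j for f, fold in enumerate(folds) if f != i for j in fold]
        prediction = mean(y[j] for j in train_idx)  # "fit" the toy model
        # Test on the held-out fold i.
        scores.append(mean((y[j] - prediction) ** 2 for j in test_idx))
    return scores


# Made-up target values, for illustration only.
y = [5, 6, 5, 7, 6, 5, 4, 6, 7, 5]
fold_scores = cross_validate_mean_model(y, k=5)
print("MSE per fold:", fold_scores)
print("Mean MSE:", mean(fold_scores))
```

Every sample ends up in the test set exactly once, which is what makes the averaged score less sensitive to any one split.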

### Example Dataset - Wine Quality Prediction

Model: Decision Tree Regressor

import pandas as pd
from sklearn.model_selection import cross_val_score
from sklearn.tree import DecisionTreeRegressor

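# Load the dataset; 'quality' is the regression target, all other columns are features
data = pd.read_csv('winequality-red.csv')
X = data.drop(columns=['quality'])
y = data['quality']

# 5-fold cross-validation: cv=5 fits the model five times,
# each time scoring it (R²) on a different held-out fold
model = DecisionTreeRegressor()
scores = cross_val_score(model, X, y, cv=5, scoring='r2')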

print("R² Scores per fold:", scores)
print("Mean R² Score:", scores.mean())


R² Scores per fold: [-1.09821429 -0.77272727 -2.         -1.14559387 -1.1192053 ]
Mean R² Score: -1.227148145237321
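
The scores above are all negative, which can look surprising. R² is defined as 1 minus the ratio of the residual sum of squares to the total sum of squares, so it is 0 for a model that always predicts the mean of the test targets and negative for a model that does worse than that. The sketch below computes R² by hand to show this; the function name `r2_score_manual` and the sample values are hypothetical and are not taken from the wine dataset.

```python
def r2_score_manual(y_true, y_pred):
    """Compute R² = 1 - SS_res / SS_tot for two equal-length sequences."""
    mean_true = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))  # residual sum of squares
    ss_tot = sum((t - mean_true) ** 2 for t in y_true)          # total sum of squares
    return 1 - ss_res / ss_tot


# Made-up values, for illustration only.
y_true = [5, 6, 5, 7]

# Always predicting the mean of y_true (5.75) gives R² = 0.
r2_mean_baseline = r2_score_manual(y_true, [5.75] * 4)

# Predictions that are further off than the mean give a negative R².
r2_poor_model = r2_score_manual(y_true, [7, 4, 7, 5])

print("Mean baseline R²:", r2_mean_baseline)
print("Poor model R²:", r2_poor_model)
```

Read this way, a negative mean R² suggests the model generalises worse than a constant prediction on the held-out folds, so the data loading, the fold composition, and the model settings are all worth checking.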
