<a href="https://colab.research.google.com/github/Sumanasri02/ai-ml-learning-journey/blob/main/04_Model_Evaluation_and_Tuning/01_model_evaluation.ipynb" target="_parent"><img src="https://colab.research.google.com/assets/colab-badge.svg" alt="Open In Colab"/></a>

# Model Evaluation in Machine Learning

## Overview
Model evaluation estimates how well a machine learning model will
perform on data it did not see during training.
The right metric depends on the task: regression and classification are scored differently, so the metric has to match the problem.

## Why Model Evaluation?
- To measure model performance
- To compare different models
- To detect overfitting or underfitting
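
The third point can be illustrated by comparing the error on the training data with the error on held-out data. The sketch below is only an illustration under assumed conditions: the noisy linear data, the seed, the manual split, and the polynomial degrees 1 and 8 are all hypothetical choices, and NumPy's `polyfit` stands in for a model. A training error that is much lower than the test error hints at overfitting; high error on both hints at underfitting.

```python
import numpy as np

rng = np.random.default_rng(0)  # fixed seed, chosen only for repeatability

# Hypothetical noisy data: a linear trend plus Gaussian noise
x = np.linspace(0, 1, 30)
y = 2 * x + rng.normal(scale=0.2, size=x.size)

# Hold out every third point for testing (a simple manual split)
test_mask = np.arange(x.size) % 3 == 0
x_train, y_train = x[~test_mask], y[~test_mask]
x_test, y_test = x[test_mask], y[test_mask]


def mse(y_true, y_hat):
    """Mean squared error between two arrays."""
    return float(np.mean((y_true - y_hat) ** 2))


# Fit a simple (degree 1) and a flexible (degree 8) polynomial
results = {}
for degree in (1, 8):
    coeffs = np.polyfit(x_train, y_train, deg=degree)
    results[degree] = {
        "train_mse": mse(y_train, np.polyval(coeffs, x_train)),
        "test_mse": mse(y_test, np.polyval(coeffs, x_test)),
    }
    print("degree", degree, results[degree])
```

The more flexible fit can never do worse on the training points, which is exactly why the training error alone cannot reveal overfitting; the gap to the test error is the useful signal.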

## Train-Test Split
The data is split into two disjoint subsets:
- Training data → used to fit the model
- Testing data → held out and used only to evaluate the model
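
As a minimal sketch of what such a split produces, the snippet below applies scikit-learn's `train_test_split` to ten made-up samples and checks the resulting sizes. The data and the `random_state` value are assumptions chosen for illustration only.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical dataset: 10 samples with one feature each
X = np.arange(10).reshape(-1, 1)
y = 2 * X.ravel()

# Hold out 20% of the rows; random_state makes the shuffle repeatable
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=0
)

print(X_train.shape, X_test.shape)  # (8, 1) (2, 1)

# The two subsets are disjoint and together cover the whole dataset
train_ids = set(X_train.ravel().tolist())
test_ids = set(X_test.ravel().tolist())
print(train_ids.isdisjoint(test_ids), train_ids | test_ids == set(range(10)))
```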


In [41]:
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score

In [43]:
# Sample data
X = np.array([[1], [2], [3], [4], [5]])
y = np.array([2, 4, 6, 8, 10])

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

# Train model
model = LinearRegression()
model.fit(X_train, y_train)

# Predictions
y_pred = model.predict(X_test)

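# Evaluation metrics
# Caveat: 20% of 5 samples is a single test sample. R2 is undefined for
# fewer than two samples, so r2_score returns nan (with a warning) here.
mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("Mean Squared Error:", mse)
print("R2 Score:", r2)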


Mean Squared Error: 0.0
R2 Score: nan
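
The `nan` above is a side effect of the tiny dataset: with five samples and `test_size=0.2`, the test set holds one sample, and R2 compares the model's squared error with the variance of the test targets, which cannot be computed from a single point. The sketch below repeats the same workflow on a larger dataset so that both metrics are defined. The 50 noisy samples, the noise level, and the seed are hypothetical and not part of the original example.

```python
import numpy as np
from sklearn.linear_model import LinearRegression
from sklearn.metrics import mean_squared_error, r2_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(42)  # fixed seed, for repeatability only

# Hypothetical data: y = 2x plus Gaussian noise, 50 samples
X = np.linspace(1, 10, 50).reshape(-1, 1)
y = 2 * X.ravel() + rng.normal(scale=1.0, size=50)

# 20% of 50 samples leaves 10 test samples, enough for R2 to be defined
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.2, random_state=42
)

model = LinearRegression()
model.fit(X_train, y_train)
y_pred = model.predict(X_test)

mse = mean_squared_error(y_test, y_pred)
r2 = r2_score(y_test, y_pred)

print("Test samples:", len(y_test))
print("Mean Squared Error:", mse)
print("R2 Score:", r2)
```

Because the data now contain noise, the MSE is no longer zero and R2 falls slightly below 1, which is closer to what evaluation on real data looks like.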




## Classification Evaluation Metrics
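Common metrics for classification models:
- Accuracy: the fraction of all predictions that are correct
- Precision: of the samples predicted positive, the fraction that are truly positive
- Recall: of the truly positive samples, the fraction the model predicted as positive
- F1-score: the harmonic mean of precision and recall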
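
To make the four metrics concrete, here is a small sketch that computes them by hand for a binary problem from the counts of true/false positives and negatives. The helper `binary_metrics` and the label lists are hypothetical, and returning 0.0 when a denominator is zero is an assumed convention rather than a rule.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Compute accuracy, precision, recall and F1 for binary labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == positive and p == positive)
    fp = sum(1 for t, p in pairs if t != positive and p == positive)
    fn = sum(1 for t, p in pairs if t == positive and p != positive)
    tn = sum(1 for t, p in pairs if t != positive and p != positive)

    accuracy = (tp + tn) / len(pairs)
    # Assumed convention: report 0.0 when a denominator would be zero
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    f1 = (
        2 * precision * recall / (precision + recall)
        if (precision + recall)
        else 0.0
    )
    return {"accuracy": accuracy, "precision": precision,
            "recall": recall, "f1": f1}


# Hypothetical labels: 3 positives and 5 negatives
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 1, 0, 0, 0, 0]

metrics = binary_metrics(y_true, y_pred)
print(metrics)  # accuracy 0.75; precision, recall and F1 all equal 2/3
```

Here the model finds two of the three positives (recall 2/3) and two of its three positive predictions are right (precision 2/3), while accuracy is 6/8 because the four correct negatives also count.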


In [47]:
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score, classification_report

# Sample data
X = np.array([[20], [25], [30], [35], [40], [45]])
y = np.array([0, 0, 0, 1, 1, 1])

# Train-test split
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.3, random_state=42
)

# Train model
clf = LogisticRegression()
clf.fit(X_train, y_train)

# Predictions
y_pred = clf.predict(X_test)

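# Evaluation
# Caveat: 30% of 6 samples gives only 2 test samples. If both fall in the
# same class, the report covers only that class.
print("Accuracy:", accuracy_score(y_test, y_pred))
print(classification_report(y_test, y_pred))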


Accuracy: 1.0
              precision    recall  f1-score   support

           0       1.00      1.00      1.00         2

    accuracy                           1.00         2
   macro avg       1.00      1.00      1.00         2
weighted avg       1.00      1.00      1.00         2
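
The report above lists only class 0 because both test samples happened to fall in that class, so it says nothing about how well class 1 is predicted. One common remedy is the `stratify` parameter of `train_test_split`, which keeps the class proportions the same in both subsets. The sketch below uses a slightly larger made-up dataset (ten values in the same style as the example, five per class); the data and `test_size=0.4` are assumptions for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 10 samples, 5 per class
X = np.arange(20, 70, 5).reshape(-1, 1)  # 20, 25, ..., 65
y = np.array([0] * 5 + [1] * 5)

# stratify=y preserves the 50/50 class ratio in both subsets
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.4, stratify=y, random_state=42
)

print("Train class counts:", np.bincount(y_train))  # [3 3]
print("Test class counts:", np.bincount(y_test))    # [2 2]
```

With both classes present in the test set, precision, recall, and F1 can be reported for each class instead of only one.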



## Key Takeaways
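- Evaluation on held-out data is how we check whether a model is reliable
- Different problems need different metrics (MSE and R2 for regression; accuracy, precision, recall, and F1-score for classification)
- Always evaluate on unseen data, and keep the test set large enough for the metrics to be defined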
