## Model Evaluation: Evaluation Metrics and Cross-Validation Methods




### Agenda
- Exploring common evaluation metrics
- Cross-validation techniques
- Real-world scenarios to determine the best practices for model evaluation

### Introduction 
- Model evaluation is a critical step in machine learning model development

### Evaluation Metrics
### 1. Classification Metrics
- Accuracy
   - Definition: Ratio of correctly predicted instances to the total instances
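   - When to use: When the dataset is balanced (roughly equal numbers of samples per class). On an imbalanced dataset, accuracy can be misleading, because a model that always predicts the majority class still scores high.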
   - Example: 

In [10]:
from sklearn.metrics import accuracy_score  # accuracy_score is a function in the sklearn.metrics module
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.model_selection import train_test_split


In [22]:
# Create a synthetic binary classification dataset
X, y = make_classification(n_samples=1000, n_features=10, random_state=42)


In [24]:
# Split into training (80%) and test (20%) sets
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.2, random_state=42)

In [30]:
# Model Training
# Step-1: Create Classifier Instance
model = RandomForestClassifier(random_state=42)

# Step-2: Fit the model using X_train, and y_train
model.fit(X_train, y_train)

# Step-3: Predict the labels for X_test and store them in y_pred
y_pred = model.predict(X_test)

In [36]:
# Compute accuracy on the test set
accuracy = accuracy_score(y_test, y_pred)
print(f"Accuracy : {accuracy:.2f}")


Accuracy : 0.88
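To illustrate why accuracy is recommended only for balanced data, here is a minimal sketch on hypothetical labels (the arrays `y_true_imb` and `y_pred_majority` are made up for illustration and are not part of the dataset above). A "model" that always predicts the majority class still gets a high accuracy while finding none of the positives.

```python
import numpy as np
from sklearn.metrics import accuracy_score, recall_score

# Hypothetical imbalanced labels: 95 negatives (0) and 5 positives (1)
y_true_imb = np.array([0] * 95 + [1] * 5)

# A trivial "model" that always predicts the majority class (0)
y_pred_majority = np.zeros_like(y_true_imb)

acc_imb = accuracy_score(y_true_imb, y_pred_majority)
rec_imb = recall_score(y_true_imb, y_pred_majority)

print(f"Accuracy : {acc_imb:.2f}")  # 95 of 100 labels match
print(f"Recall   : {rec_imb:.2f}")  # none of the 5 positives are found
```

The accuracy looks strong even though the classifier is useless for the positive class, which is why the other metrics in this section matter on imbalanced data.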


### Precision
- Definition: Proportion of true positives among all predicted positives

In [39]:
from sklearn.metrics import precision_score
#Calculate Precision Score
precision = precision_score(y_test, y_pred)
print(f"Precision : {precision:.2f}")

Precision : 0.91


### Recall
- Definition: Proportion of actual positives that the model correctly identifies

In [45]:
from sklearn.metrics import recall_score
# Calculate Recall Score
recall = recall_score(y_test, y_pred)
# Print Recall
print(f"Recall : {recall:.2f}")

Recall : 0.86


### F1-Score
- Definition: Harmonic mean of Precision and Recall

In [51]:
from sklearn.metrics import f1_score
f1 = f1_score(y_test, y_pred)
print(f"f1 Score: {f1:.2f}")

f1 Score: 0.89
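As a small sketch of what "harmonic mean" implies, the snippet below compares it with the ordinary arithmetic mean. The helper `f1_from` and the precision/recall values are hypothetical and chosen only for illustration. The harmonic mean is pulled toward the smaller of the two values, so a model cannot get a high F1 by being strong on only one of them.

```python
def f1_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 if both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical, deliberately lopsided values
p, r = 0.9, 0.1

arithmetic_mean = (p + r) / 2
harmonic_mean = f1_from(p, r)

print(f"Arithmetic mean: {arithmetic_mean:.2f}")
print(f"F1 (harmonic)  : {harmonic_mean:.2f}")

# When precision and recall are equal, F1 equals that shared value
balanced_f1 = f1_from(0.5, 0.5)
print(f"F1 when both are 0.5: {balanced_f1:.2f}")
```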


### Confusion Matrix
![image.png](attachment:c1583ed1-8851-4304-ba91-84f7838ed260.png)

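Here TP, TN, FP, and FN are the counts of true positives, true negatives, false positives, and false negatives.

- Accuracy: the ratio of correctly predicted instances to the total instances.
  $\text{Accuracy} = \frac{TP + TN}{TP + TN + FP + FN}$
- Precision: the proportion of true positives among all predicted positives.
  $\text{Precision} = \frac{TP}{TP + FP}$
- Recall: the proportion of true positives captured among all actual positives.
  $\text{Recall} = \frac{TP}{TP + FN}$
- F1 Score: the harmonic mean of Precision and Recall.
  $F_1 = 2 \cdot \frac{\text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}}$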
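The formulas above can be checked by hand with a minimal sketch. It uses a small hypothetical label set (`y_true_demo`, `y_pred_demo` are made up for illustration), reads the four counts from scikit-learn's `confusion_matrix`, and compares the manual results with the library functions.

```python
from sklearn.metrics import (
    accuracy_score,
    confusion_matrix,
    f1_score,
    precision_score,
    recall_score,
)

# Hypothetical labels: 4 actual positives, 6 actual negatives
y_true_demo = [1, 1, 1, 1, 0, 0, 0, 0, 0, 0]
y_pred_demo = [1, 1, 1, 0, 1, 0, 0, 0, 0, 0]

# For binary labels {0, 1}, the matrix is [[TN, FP], [FN, TP]]
tn, fp, fn, tp = confusion_matrix(y_true_demo, y_pred_demo).ravel()
print(f"TN={tn} FP={fp} FN={fn} TP={tp}")

# Apply the formulas directly to the counts
accuracy_manual = (tp + tn) / (tp + tn + fp + fn)
precision_manual = tp / (tp + fp)
recall_manual = tp / (tp + fn)
f1_manual = 2 * precision_manual * recall_manual / (precision_manual + recall_manual)

# Compare with scikit-learn's implementations
print(f"Accuracy : {accuracy_manual:.2f} vs {accuracy_score(y_true_demo, y_pred_demo):.2f}")
print(f"Precision: {precision_manual:.2f} vs {precision_score(y_true_demo, y_pred_demo):.2f}")
print(f"Recall   : {recall_manual:.2f} vs {recall_score(y_true_demo, y_pred_demo):.2f}")
print(f"F1 Score : {f1_manual:.2f} vs {f1_score(y_true_demo, y_pred_demo):.2f}")
```

Note that scikit-learn puts actual classes on the rows and predicted classes on the columns, so `.ravel()` returns the counts in the order TN, FP, FN, TP.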

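### ROC-AUC
- Measures the trade-off between the true positive rate (TPR) and the false positive rate (FPR) across classification thresholds. The AUC summarizes the ROC curve as a single number: 1.0 is a perfect ranking and 0.5 is no better than random guessing.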

In [58]:
from sklearn.metrics import roc_auc_score

In [62]:
# Probability Predictions
y_prob = model.predict_proba(X_test)[:, 1]
roc_auc = roc_auc_score(y_test, y_prob)
print(f"ROC_AUC: {roc_auc: .2f}")

ROC_AUC:  0.95
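To see the TP rate / FP rate trade-off behind this single number, the sketch below uses scikit-learn's `roc_curve` on four hypothetical samples (`y_true_demo` and `y_score_demo` are illustrative values, not taken from the model above). Each threshold gives one (FPR, TPR) point, and the area under those points is the AUC.

```python
import numpy as np
from sklearn.metrics import auc, roc_auc_score, roc_curve

# Hypothetical true labels and predicted probabilities for the positive class
y_true_demo = np.array([0, 0, 1, 1])
y_score_demo = np.array([0.1, 0.4, 0.35, 0.8])

# One (FPR, TPR) point per decision threshold
fpr, tpr, thresholds = roc_curve(y_true_demo, y_score_demo)
for f, t, th in zip(fpr, tpr, thresholds):
    print(f"threshold={th:.2f}  FPR={f:.2f}  TPR={t:.2f}")

# Area under the curve, computed two equivalent ways
auc_from_curve = auc(fpr, tpr)
auc_direct = roc_auc_score(y_true_demo, y_score_demo)
print(f"AUC from curve: {auc_from_curve:.2f}")
print(f"AUC direct    : {auc_direct:.2f}")
```

Lowering the threshold raises the TP rate but also the FP rate. The first threshold that `roc_curve` returns is a placeholder above every score, so that the curve starts at the point (0, 0).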
