# ROC and AUC - Understanding Model Performance Metrics

## Introduction

This notebook explores Receiver Operating Characteristic (ROC) curves and the Area Under the Curve (AUC) metric. Together they describe how well a binary classifier separates the two classes across all decision thresholds, rather than at a single cutoff.

## What is ROC?

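The Receiver Operating Characteristic (ROC) curve is a graphical representation of a binary classifier's performance as its discrimination threshold varies. It plots the True Positive Rate (TPR, the fraction of actual positives that are predicted positive) against the False Positive Rate (FPR, the fraction of actual negatives that are predicted positive), with one point per threshold.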
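
To make the threshold sweep concrete, the sketch below computes TPR = TP / (TP + FN) and FPR = FP / (FP + TN) by hand at a few thresholds. The four labels and scores are made up for illustration, and `tpr_fpr_at` is a hypothetical helper, not a scikit-learn function.

```python
import numpy as np

# Toy labels and scores, made up for illustration (not the notebook's data).
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])


def tpr_fpr_at(threshold, y_true, y_score):
    """Return (TPR, FPR) when scores >= threshold are predicted positive."""
    y_pred = y_score >= threshold
    tp = np.sum(y_pred & (y_true == 1))   # positives correctly flagged
    fp = np.sum(y_pred & (y_true == 0))   # negatives incorrectly flagged
    fn = np.sum(~y_pred & (y_true == 1))  # positives missed
    tn = np.sum(~y_pred & (y_true == 0))  # negatives correctly rejected
    return tp / (tp + fn), fp / (fp + tn)


# Each threshold yields one (FPR, TPR) point on the ROC curve.
for threshold in [0.9, 0.8, 0.4, 0.35, 0.1]:
    tpr, fpr = tpr_fpr_at(threshold, y_true, y_score)
    print(f"threshold={threshold:.2f}  TPR={tpr:.2f}  FPR={fpr:.2f}")
```

Lowering the threshold can only add predicted positives, so both rates move from 0 toward 1 as the threshold drops. This is why the curve runs from the bottom-left to the top-right corner.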

## What is AUC?

The Area Under the Curve (AUC) summarizes the ROC curve of a binary classifier as a single number between 0 and 1. A value of 1 means the model ranks every positive example above every negative one, 0.5 is what random guessing achieves, and values below 0.5 indicate predictions that are systematically inverted, that is, worse than chance.
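
One way to build intuition for the number: AUC equals the probability that a randomly chosen positive example receives a higher score than a randomly chosen negative one, with ties counted as half. The sketch below checks this on made-up toy scores (an assumption for illustration) against scikit-learn's `roc_auc_score`.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

# Toy labels and scores, made up for illustration (not the notebook's data).
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])

pos = y_score[y_true == 1]
neg = y_score[y_true == 0]

# Compare every positive score with every negative score.
diff = pos[:, None] - neg[None, :]
# Fraction of pairs ranked correctly; ties count as half a correct pair.
pairwise_auc = (np.sum(diff > 0) + 0.5 * np.sum(diff == 0)) / diff.size

print("Pairwise ranking AUC:", pairwise_auc)
print("roc_auc_score:       ", roc_auc_score(y_true, y_score))
# Negating the scores reverses the ranking, so the AUC becomes 1 - AUC.
print("Inverted scores AUC: ", roc_auc_score(y_true, -y_score))
```

Here three of the four positive-negative pairs are ordered correctly, giving an AUC of 0.75. Negating the scores gives 0.25, which illustrates why an AUC near 0 reflects an inverted ranking rather than an uninformative one.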

## Example Implementation

In [None]:
# Import the libraries used in this notebook
import matplotlib.pyplot as plt
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_curve, roc_auc_score

In [None]:
# Generating Synthetic Data
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)

In [None]:
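# Fit a logistic regression classifier on the training split
clf = LogisticRegression()
clf.fit(X_train, y_train)
# Predicted probability of the positive class (column 1) for each test sample
y_pred_proba = clf.predict_proba(X_test)[:, 1]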

In [None]:
# ROC Curve

fpr, tpr, thresholds = roc_curve(y_test, y_pred_proba)

plt.figure()
plt.plot(fpr, tpr, label='Logistic Regression (AUC = %0.2f)' % roc_auc_score(y_test, y_pred_proba))
# Diagonal reference line: the expected ROC curve of a random classifier (AUC = 0.5)
plt.plot([0, 1], [0, 1], 'r--')
plt.xlim([0.0, 1.0])
plt.ylim([0.0, 1.05])
plt.xlabel('False Positive Rate')
plt.ylabel('True Positive Rate')
plt.title('Receiver Operating Characteristic')
plt.legend(loc="lower right")
plt.show()

In [None]:
# Calculate the AUC from the true labels and the predicted probabilities
auc = roc_auc_score(y_test, y_pred_proba)
print(f"AUC Score: {auc:.3f}")
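
As a sanity check, the same number can be recovered by integrating the ROC curve directly. The sketch below repeats the setup from the cells above so that it runs on its own, then compares `roc_auc_score` with a hand-written trapezoidal rule and with `sklearn.metrics.auc`. The variable name `manual_auc` is introduced here for illustration.

```python
import numpy as np
from sklearn import metrics
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.model_selection import train_test_split

# Same setup as the cells above, repeated so this cell is self-contained.
X, y = make_classification(n_samples=1000, n_features=20, n_classes=2, random_state=42)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=42)
clf = LogisticRegression().fit(X_train, y_train)
y_pred_proba = clf.predict_proba(X_test)[:, 1]

fpr, tpr, thresholds = metrics.roc_curve(y_test, y_pred_proba)

# Trapezoidal rule: sum of (segment width * average segment height).
manual_auc = np.sum(np.diff(fpr) * (tpr[1:] + tpr[:-1]) / 2)

print(f"roc_auc_score:      {metrics.roc_auc_score(y_test, y_pred_proba):.6f}")
print(f"metrics.auc:        {metrics.auc(fpr, tpr):.6f}")
print(f"manual trapezoids:  {manual_auc:.6f}")
```

All three values should agree up to floating-point error, since the area under the piecewise-linear ROC curve is exactly what the trapezoidal rule computes.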

## Conclusion

In this notebook, we covered how to generate an ROC curve and calculate the AUC for a binary classifier using Python's scikit-learn library. Because both are computed from the ranking of predicted scores rather than from a single cutoff, they are useful for comparing different models on a given classification problem independently of the choice of threshold.