# AUC-ROC Score 
### The Receiver Operating Characteristic (ROC) curve is an evaluation tool for binary classification problems. It plots the true positive rate (TPR) against the false positive rate (FPR) at various threshold values, showing how well the classifier separates the ‘signal’ from the ‘noise’. The Area Under the Curve (AUC) summarizes the ROC curve in a single number that measures the classifier's ability to distinguish between the classes.
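To make the idea of "TPR against FPR at various thresholds" concrete, here is a minimal sketch that computes a few ROC points by hand. The labels, scores, thresholds, and the helper name `tpr_fpr` are made up for illustration and are not part of the notebook's data.

```python
import numpy as np

# Hypothetical labels and scores, for illustration only
y_true = np.array([0, 0, 1, 1])
y_score = np.array([0.1, 0.4, 0.35, 0.8])


def tpr_fpr(y_true, y_score, threshold):
    """Return (TPR, FPR) when scores >= threshold are predicted positive."""
    y_pred = y_score >= threshold
    tp = np.sum(y_pred & (y_true == 1))  # positives correctly flagged
    fp = np.sum(y_pred & (y_true == 0))  # negatives wrongly flagged
    tpr = tp / np.sum(y_true == 1)
    fpr = fp / np.sum(y_true == 0)
    return float(tpr), float(fpr)


# Lowering the threshold moves the point from (0, 0) towards (1, 1)
roc_points = [tpr_fpr(y_true, y_score, t) for t in (0.9, 0.5, 0.3, 0.0)]
for tpr, fpr in roc_points:
    print(f"TPR = {tpr:.2f}, FPR = {fpr:.2f}")
```

Each threshold contributes one (FPR, TPR) point; joining the points gives the ROC curve.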

### The higher the AUC, the better the performance of the model at distinguishing between the positive and negative classes.

### When AUC = 1, the classifier perfectly distinguishes between all Positive and Negative class points. If the AUC were 0, the classifier would be scoring every Negative above every Positive, that is, predicting all Negatives as Positives and all Positives as Negatives.
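The two extremes can be checked on a toy example with `roc_auc_score`. The labels and scores below are hypothetical and chosen only so that the ranking is either perfect or perfectly inverted.

```python
from sklearn.metrics import roc_auc_score

# Hypothetical toy data, for illustration only
y_true = [0, 0, 1, 1]
perfect_scores = [0.1, 0.2, 0.8, 0.9]   # every positive scored above every negative
inverted_scores = [0.9, 0.8, 0.2, 0.1]  # every negative scored above every positive

auc_perfect = roc_auc_score(y_true, perfect_scores)
auc_inverted = roc_auc_score(y_true, inverted_scores)

# Flipping the inverted scores restores a perfect ranking
auc_flipped = roc_auc_score(y_true, [1 - s for s in inverted_scores])

print(auc_perfect, auc_inverted, auc_flipped)
```

An AUC near 0 therefore usually points to flipped labels or scores rather than an uninformative model.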

### When 0.5 < AUC < 1, the classifier is more likely than not to distinguish positive class values from negative class values. AUC equals the probability that a randomly chosen positive point receives a higher score than a randomly chosen negative point, so a value above 0.5 means the classifier ranks positives above negatives more often than chance would.
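One way to see why values between 0.5 and 1 mean "better than chance" is to read AUC as a ranking probability: the fraction of (positive, negative) pairs in which the positive gets the higher score, with ties counted as half. The sketch below computes that fraction directly; the data and the helper name `pairwise_auc` are illustrative assumptions.

```python
import numpy as np


def pairwise_auc(y_true, y_score):
    """Fraction of (positive, negative) pairs ranked correctly; ties count 0.5."""
    y_true = np.asarray(y_true)
    y_score = np.asarray(y_score)
    pos = y_score[y_true == 1]
    neg = y_score[y_true == 0]
    diff = pos[:, None] - neg[None, :]  # one entry per (positive, negative) pair
    wins = np.sum(diff > 0)
    ties = np.sum(diff == 0)
    return float((wins + 0.5 * ties) / diff.size)


# Hypothetical toy data: 3 of the 4 pairs are ranked correctly
auc_pairs = pairwise_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8])
print(auc_pairs)
```

This quadratic-time version is only meant to illustrate the interpretation; `roc_auc_score` is the practical choice on real data.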

### When AUC = 0.5, the classifier is not able to distinguish between Positive and Negative class points: its scores are no better than random guessing, or it predicts the same score for every data point.
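Both cases can be reproduced with a small sketch: a constant score gives an AUC of exactly 0.5, and scores drawn independently of the labels give a value close to 0.5. The labels and scores are randomly generated for illustration, and the seed is arbitrary.

```python
import numpy as np
from sklearn.metrics import roc_auc_score

rng = np.random.default_rng(0)  # arbitrary seed, for reproducibility only

# Hypothetical labels, roughly balanced between the two classes
y_true = rng.integers(0, 2, size=10_000)

# Case 1: the same score for every data point
constant_scores = np.full(y_true.shape, 0.3)
auc_constant = roc_auc_score(y_true, constant_scores)

# Case 2: scores that carry no information about the labels
random_scores = rng.random(y_true.shape[0])
auc_random = roc_auc_score(y_true, random_scores)

print(f"constant: {auc_constant:.3f}, random: {auc_random:.3f}")
```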

### So, the higher the AUC value for a classifier, the better its ability to distinguish between positive and negative classes.

### Let's use the AUC-ROC score to check the performance of our models.

![](https://cdn.analyticsvidhya.com/wp-content/uploads/2020/06/AUC3.png)
![](https://cdn.analyticsvidhya.com/wp-content/uploads/2020/06/AUC2.png)

In [None]:
from sklearn.model_selection import train_test_split
from sklearn.linear_model import LogisticRegression
from sklearn.neighbors import KNeighborsClassifier
from sklearn.ensemble import RandomForestClassifier
from sklearn.svm import SVC
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier


# Train/test split (X and y are assumed to be defined earlier in the notebook)
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.3, random_state=27)

# Models to compare
model1 = LogisticRegression(solver='liblinear')
model2 = KNeighborsClassifier(n_neighbors=4)
model3 = DecisionTreeClassifier(max_depth=3, random_state=42)
model4 = RandomForestClassifier(max_depth=2, random_state=0)
model5 = GaussianNB()


# Fit each model on the training data
model1.fit(X_train, y_train)
model2.fit(X_train, y_train)
model3.fit(X_train, y_train)
model4.fit(X_train, y_train)
model5.fit(X_train, y_train)

In [None]:
# Predicted class probabilities on the test set (one column per class)
pred_prob1 = model1.predict_proba(X_test)
pred_prob2 = model2.predict_proba(X_test)
pred_prob3 = model3.predict_proba(X_test)
pred_prob4 = model4.predict_proba(X_test)
pred_prob5 = model5.predict_proba(X_test)


from sklearn.metrics import roc_curve

# ROC curve for each model, using the predicted probability of the positive class
fpr1, tpr1, thresh1 = roc_curve(y_test, pred_prob1[:,1], pos_label=1)
fpr2, tpr2, thresh2 = roc_curve(y_test, pred_prob2[:,1], pos_label=1)
fpr3, tpr3, thresh3 = roc_curve(y_test, pred_prob3[:,1], pos_label=1)
fpr4, tpr4, thresh4 = roc_curve(y_test, pred_prob4[:,1], pos_label=1)
fpr5, tpr5, thresh5 = roc_curve(y_test, pred_prob5[:,1], pos_label=1)



# ROC curve of a no-skill baseline (TPR = FPR): the same score for every sample
random_probs = [0] * len(y_test)
p_fpr, p_tpr, _ = roc_curve(y_test, random_probs, pos_label=1)

from sklearn.metrics import roc_auc_score

# AUC score for each model
auc_score1 = roc_auc_score(y_test, pred_prob1[:,1])
auc_score2 = roc_auc_score(y_test, pred_prob2[:,1])
auc_score3 = roc_auc_score(y_test, pred_prob3[:,1])
auc_score4 = roc_auc_score(y_test, pred_prob4[:,1])
auc_score5 = roc_auc_score(y_test, pred_prob5[:,1])



print('The AUC-ROC score for the Logistic Regression model is -', auc_score1)
print("")
print('The AUC-ROC score for the K Neighbors Classifier model is -', auc_score2)
print("")
print('The AUC-ROC score for the Decision Tree model is -', auc_score3)
print("")
print('The AUC-ROC score for the Random Forest Classifier model is -', auc_score4)
print("")
print('The AUC-ROC score for the Naive Bayes Classifier model is -', auc_score5)



In [None]:
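# matplotlib
import matplotlib.pyplot as plt

# The 'seaborn' style was renamed 'seaborn-v0_8' in Matplotlib 3.6;
# fall back to the old name on older versions
style_name = 'seaborn-v0_8' if 'seaborn-v0_8' in plt.style.available else 'seaborn'
plt.style.use(style_name)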

# plot roc curves
plt.plot(fpr1, tpr1, linestyle='--',color='orange', label='Logistic Regression')
plt.plot(fpr2, tpr2, linestyle='--',color='red', label='KNN')
plt.plot(fpr3, tpr3, linestyle='--',color='yellow', label='DT')
plt.plot(fpr4, tpr4, linestyle='--',color='green', label='RF')
plt.plot(fpr5, tpr5, linestyle='--',color='pink', label='NaiveBayes')

# no-skill baseline (TPR = FPR)
plt.plot(p_fpr, p_tpr, linestyle='--', color='blue', label='No skill')

# title
plt.title('ROC curve')
# x label
plt.xlabel('False Positive Rate')
# y label
plt.ylabel('True Positive Rate')

plt.legend(loc='best')
plt.savefig('ROC', dpi=300)
plt.show()
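The cells above repeat the same fit, `predict_proba`, and `roc_auc_score` steps once per model. As a rough sketch, the same comparison can be written as a loop over a dictionary of models. Because `X` and `y` come from earlier in the notebook, this version uses synthetic data from `make_classification` as a stand-in; the names `X_demo`, `y_demo`, `models`, and `auc_scores` are illustrative, and only three of the five models are shown to keep it short.

```python
from sklearn.datasets import make_classification
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split
from sklearn.naive_bayes import GaussianNB
from sklearn.tree import DecisionTreeClassifier

# Synthetic stand-in data (an assumption; the notebook uses its own X and y)
X_demo, y_demo = make_classification(n_samples=500, n_features=10, random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(
    X_demo, y_demo, test_size=0.3, random_state=27
)

models = {
    'Logistic Regression': LogisticRegression(max_iter=1000),
    'Decision Tree': DecisionTreeClassifier(max_depth=3, random_state=42),
    'Naive Bayes': GaussianNB(),
}

auc_scores = {}
for name, model in models.items():
    model.fit(X_tr, y_tr)
    # Column 1 holds the predicted probability of the positive class
    pos_scores = model.predict_proba(X_te)[:, 1]
    auc_scores[name] = roc_auc_score(y_te, pos_scores)

for name, score in auc_scores.items():
    print(f'{name}: AUC = {score:.3f}')
```

Keeping the models in a dictionary means adding or removing a classifier is a one-line change, and the scores stay paired with their model names.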
