# Confusion Matrix Demo

In this demo, you will see how to create a confusion matrix to evaluate the accuracy of a model using scikit-learn's `confusion_matrix()` function. For more information, consult the online [documentation](https://scikit-learn.org/stable/modules/generated/sklearn.metrics.confusion_matrix.html).
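Before working with real data, it may help to see what `confusion_matrix()` returns on a handful of values. The sketch below uses made-up labels (`y_true` and `y_pred` are hypothetical and not part of the demo data) and assumes `True` means the customer left.

```python
from sklearn.metrics import confusion_matrix

# Hypothetical toy labels: True = customer left, False = customer stayed
y_true = [True, True, True, False, False, False, False, False]
y_pred = [True, False, False, False, False, False, True, False]

# labels=[True, False] places the True class in the first row and first column
toy_cm = confusion_matrix(y_true, y_pred, labels=[True, False])
print(toy_cm)
```

Rows correspond to the actual classes and columns to the predicted classes, so with this label order the top-left cell counts true positives and the bottom-right cell counts true negatives.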

### Import Packages

Before you get started, import a few packages. Run the code cell below. 

In [1]:
import pandas as pd
import numpy as np
import os 
import matplotlib.pyplot as plt
import seaborn as sns


Next, import the scikit-learn `DecisionTreeClassifier` class, the `train_test_split()` function for splitting the data into training and test sets, the `accuracy_score()` function for evaluating the model, and the `confusion_matrix()` function for creating a confusion matrix.

In [2]:
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score, confusion_matrix

## Step 1: Build Your DataFrame

We will work with the "cell2celltrain" data set. It is ready for modeling. Run the cell below to load the data set and save it to DataFrame `df`.

In [3]:
filename = os.path.join(os.getcwd(), "data", "cell2celltrain.csv")
df = pd.read_csv(filename, header=0)

## Step 2: Create Labeled Examples

In [4]:
y = df['Churn']  # label
X = df.drop(columns='Churn')  # features
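As a rough illustration of the label/feature split, the sketch below applies the same two lines to a tiny made-up DataFrame. The names `toy_df`, `feature_a`, and `feature_b` are hypothetical placeholders, not columns of the real data set.

```python
import pandas as pd

# Hypothetical miniature data set; only the 'Churn' column name comes from the demo
toy_df = pd.DataFrame({
    'feature_a': [120.0, 45.5, 300.2],
    'feature_b': [1, 4, 0],
    'Churn': [False, True, False],
})

toy_y = toy_df['Churn']               # label: a Series with one value per row
toy_X = toy_df.drop(columns='Churn')  # features: every column except the label

print(toy_X.columns.tolist())
print(toy_y.tolist())
```

Note that `drop()` returns a new DataFrame, so `toy_df` itself still contains the `Churn` column afterwards.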

## Step 3: Create Training and Test Data Sets

In [5]:
X_train, X_test, y_train, y_test = train_test_split(X, y, test_size=0.10, random_state=1234)
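To see what `test_size=0.10` and `random_state` do, here is a minimal sketch on synthetic arrays (`toy_X` and `toy_y` are hypothetical stand-ins for the real features and label). With 100 rows, 10% of the data should end up in the test set, and reusing the same `random_state` should reproduce the same split.

```python
import numpy as np
from sklearn.model_selection import train_test_split

# Hypothetical data: 100 rows, 2 feature columns, alternating boolean label
toy_X = np.arange(200).reshape(100, 2)
toy_y = np.arange(100) % 2 == 0

X_tr, X_te, y_tr, y_te = train_test_split(
    toy_X, toy_y, test_size=0.10, random_state=1234
)
print(X_tr.shape, X_te.shape)

# Splitting again with the same random_state yields the same test rows
_, X_te_again, _, _ = train_test_split(
    toy_X, toy_y, test_size=0.10, random_state=1234
)
print(np.array_equal(X_te, X_te_again))
```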

## Step 4: Train a Decision Tree Classifier and Make Predictions

In [6]:
# Create a DecisionTreeClassifier model object
model = DecisionTreeClassifier(max_depth=4, min_samples_leaf=50)

# Fit the model to the training data 
model.fit(X_train, y_train)

# Make predictions on the test data
class_label_predictions = model.predict(X_test)
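The fit-then-predict pattern can be tried on a toy problem where the answer is easy to verify by eye. In this sketch the data and the names `toy_model` and `toy_predictions` are hypothetical: there is a single feature, and the label is `True` whenever the feature exceeds 5, so a tree of depth 1 is enough to separate the classes.

```python
import numpy as np
from sklearn.tree import DecisionTreeClassifier

# Hypothetical toy data: one feature column with values 0..9
toy_X = np.arange(10).reshape(-1, 1)
toy_y = toy_X.ravel() > 5

# A single split is sufficient for this perfectly separable data
toy_model = DecisionTreeClassifier(max_depth=1)
toy_model.fit(toy_X, toy_y)

# predict() returns one class label per input row
toy_predictions = toy_model.predict(np.array([[2], [8]]))
print(toy_predictions)
```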

## Step 5: Check the Accuracy of Your Model

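Execute the code cell below to see the accuracy score of your model and the confusion matrix. Passing `labels=[True, False]` to `confusion_matrix()` places the churn class (`True`) in the first row and first column; rows correspond to actual classes and columns to predicted classes.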

In [7]:
# Compute and print model's accuracy score
acc_score = accuracy_score(y_test, class_label_predictions)
print('Accuracy score: {0}\n'.format(acc_score))

# Display a confusion matrix
print('Confusion Matrix for the model: ')

c_m = confusion_matrix(y_test, class_label_predictions, labels=[True, False])

# Create a Pandas DataFrame out of the confusion matrix for display purposes
pd.DataFrame(
    c_m,
    columns=['Predicted: Customer Will Leave', 'Predicted: Customer Will Stay'],
    index=['Actual: Customer Will Leave', 'Actual: Customer Will Stay']
)

Accuracy score: 0.7181194906953967

Confusion Matrix for the model: 


| | Predicted: Customer Will Leave | Predicted: Customer Will Stay |
|---|---|---|
| Actual: Customer Will Leave | 45 | 1418 |
| Actual: Customer Will Stay | 21 | 3621 |
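The four counts in the confusion matrix above are enough to recompute the accuracy score by hand and to derive per-class measures. The sketch below copies the counts from the output and treats "Customer Will Leave" as the positive class; the variable names are illustrative only.

```python
# Counts copied from the confusion matrix above (rows: actual, columns: predicted)
tp, fn = 45, 1418   # actual leavers: predicted leave, predicted stay
fp, tn = 21, 3621   # actual stayers: predicted leave, predicted stay

total = tp + fn + fp + tn

# Accuracy: share of all test examples that were classified correctly
accuracy = (tp + tn) / total

# Recall for the "leave" class: share of actual leavers the model caught
recall_leave = tp / (tp + fn)

# Precision for the "leave" class: share of "leave" predictions that were right
precision_leave = tp / (tp + fp)

print(f'Accuracy:  {accuracy:.4f}')
print(f'Recall:    {recall_leave:.4f}')
print(f'Precision: {precision_leave:.4f}')
```

The recomputed accuracy matches the score printed above. The recall value shows that the model identifies only about 3% of the customers who actually leave, something the accuracy score alone does not reveal.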
