## Churn detection - Classification Problem

In this use case, we look at how a mobile phone carrier company can proactively identify customers more likely to churn in the near term in order to improve the service and create custom outreach campaigns that help retain the customers.

Mobile phone carriers face an extremely competitive market. Many mobile carriers lose revenue from postpaid customers due to churn. Hence the ability to proactively and accurately identify customer churn at scale can be a huge competitive advantage. Some of the factors contributing to mobile phone customer churn include perceived frequent service disruptions, poor customer service experiences in online or retail stores, and offers from competing carriers (better family plan, data plan, etc.).



## 1. Understand data


```python
import numpy as np
import pandas as pd
import matplotlib.pyplot as plt
import seaborn as sns

# Load the churn dataset into a DataFrame
df = pd.read_csv("churn.csv")
```


## 2. Data preprocessing

## 3. Logistic Regression model


### Training Model

### Cutoff threshold for score

### Model inference

### Model evaluation


* accuracy: ratio of correct predictions to the total number of samples in the dataset. With imbalanced classes this metric can be misleading, because a model that always predicts the majority class still scores high.

* true positive (TP): the sample’s label is positive and it is classified as positive
* true negative (TN): the sample’s label is negative and it is classified as negative
* false positive (FP): the sample’s label is negative, but it is classified as positive
* false negative (FN): the sample’s label is positive, but it is classified as negative
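
The four counts and the accuracy caveat can be illustrated with a minimal sketch. The labels below are made up for illustration (they are not taken from `churn.csv`), and `confusion_counts` is a hypothetical helper; scikit-learn's `confusion_matrix` provides the same counts in practice.

```python
def confusion_counts(y_true, y_pred):
    """Return (TP, TN, FP, FN) for binary labels, where 1 = positive (churn)."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    return tp, tn, fp, fn

# Made-up imbalanced data: 95 loyal customers (0) and 5 churners (1).
y_true = [0] * 95 + [1] * 5
# A useless model that always predicts "no churn".
y_pred = [0] * 100

tp, tn, fp, fn = confusion_counts(y_true, y_pred)
accuracy = (tp + tn) / len(y_true)

print(tp, tn, fp, fn)  # 0 95 0 5
print(accuracy)        # 0.95, although not a single churner was found
```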

![](graphs/metrics.jpg)


#### 1. True Positive Rate (also sensitivity or recall)
The true positive rate (TPR) is the fraction of actually positive samples that the model classifies as positive. It is the same quantity as recall, whose formula is given in section 3.

#### 2. Precision
Precision is the fraction of samples predicted as positive that are actually positive. Low precision means the model raises many false alarms, for example flagging customers as churners when they are not.
$$\frac{TP}{TP+FP} $$

#### 3. Recall (TPR or sensitivity)
Recall is the fraction of actual positives that are correctly classified. It shows how many of the samples of interest (here, churning customers) the model manages to find.
$$\frac{TP}{TP+FN} $$

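#### 4. FPR
The false positive rate (FPR) is the fraction of actual negatives that are incorrectly classified as positive.
$$\frac{FP}{FP+TN} $$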
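
As a worked example of the three formulas above, the sketch below plugs in made-up confusion-matrix counts (assumed for illustration, not results from the churn model).

```python
# Hypothetical counts for 200 customers, 50 of whom actually churned.
TP, FP, FN, TN = 30, 10, 20, 140

precision = TP / (TP + FP)  # share of predicted churners who really churned
recall = TP / (TP + FN)     # share of real churners the model found (TPR)
fpr = FP / (FP + TN)        # share of loyal customers flagged by mistake

print(f"precision = {precision:.3f}")  # precision = 0.750
print(f"recall    = {recall:.3f}")     # recall    = 0.600
print(f"FPR       = {fpr:.3f}")        # FPR       = 0.067
```

Note that each ratio is undefined when its denominator is zero (for example, precision when the model predicts no positives), so real code needs a guard for that case.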


#### 5. ROC 
An ROC curve (receiver operating characteristic curve) is a graph showing the performance of a classification model at all classification thresholds. It plots two parameters against each other:
* True Positive Rate (y-axis)
* False Positive Rate (x-axis)

Lowering the classification threshold classifies more items as positive, thus increasing both False Positives and True Positives. The following figure shows a typical ROC curve.
![](graphs/roc.svg)
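
The threshold sweep behind an ROC curve can be sketched as follows. The labels and scores are made up for illustration, and `roc_points` is a hypothetical helper; in practice scikit-learn's `roc_curve` returns the same kind of FPR/TPR pairs.

```python
def roc_points(y_true, scores):
    """Return (FPR, TPR) pairs, from the strictest threshold to the loosest."""
    pos = sum(y_true)
    neg = len(y_true) - pos
    # A threshold above every score predicts nothing as positive.
    points = [(0.0, 0.0)]
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        points.append((fp / neg, tp / pos))
    return points

# Made-up labels (1 = churn) and model scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = roc_points(y_true, scores)
for fpr, tpr in points:
    print(f"FPR={fpr:.2f}  TPR={tpr:.2f}")
```

Each lower threshold can only add predicted positives, so both coordinates never decrease and the curve runs from (0, 0) to (1, 1).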


#### 6. AUC 
AUC refers to the Area Under the Curve of a Receiver Operating Characteristic curve (ROC-AUC). This metric is equal to the probability that a classifier will rank a randomly chosen positive sample higher than a randomly chosen negative sample.
![](graphs/auc.svg)
AUC ranges in value from 0 to 1. A model whose predictions are 100% wrong has an AUC of 0.0; one whose predictions are 100% correct has an AUC of 1.0. A model that ranks samples no better than chance has an AUC of about 0.5.
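
The ranking interpretation of AUC can be checked directly by comparing every positive sample with every negative one. This is a minimal sketch on made-up scores; `auc_by_ranking` is a hypothetical helper that is quadratic in the number of samples, so it is meant for illustration only (scikit-learn's `roc_auc_score` is the practical choice).

```python
def auc_by_ranking(y_true, scores):
    """Fraction of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))

# Made-up labels (1 = churn) and model scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

auc = auc_by_ranking(y_true, scores)
print(round(auc, 3))  # 0.889: 8 of the 9 positive/negative pairs are ordered correctly

# The two extremes: every pair ordered correctly vs. every pair ordered wrongly.
print(auc_by_ranking([1, 0], [0.9, 0.1]))  # 1.0
print(auc_by_ranking([1, 0], [0.1, 0.9]))  # 0.0
```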

AUC is desirable for the following two reasons:
* AUC is scale-invariant. It measures how well predictions are ranked, rather than their absolute values.
* AUC is classification-threshold-invariant. It measures the quality of the model's predictions irrespective of what classification threshold is chosen.
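
Scale invariance can be illustrated by transforming the scores with any order-preserving function and observing that AUC does not change. The sketch below uses made-up scores and a hypothetical pairwise helper, `pairwise_auc`.

```python
def pairwise_auc(y_true, scores):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    total = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return total / (len(pos) * len(neg))

# Made-up labels (1 = churn) and model scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

# Two order-preserving transformations of the same scores.
rescaled = [10 * s + 5 for s in scores]  # no longer valid probabilities
squared = [s ** 2 for s in scores]       # monotonic for non-negative scores

base_auc = pairwise_auc(y_true, scores)
print(base_auc == pairwise_auc(y_true, rescaled))  # True
print(base_auc == pairwise_auc(y_true, squared))   # True
```

Because only the ordering of the scores matters, AUC says nothing about whether the predicted probabilities are well calibrated.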


#### 7. AUPRC 
AUPRC refers to the Area Under the Precision-Recall Curve. A precision-recall curve (PR curve) plots precision (y-axis) against recall (x-axis) at different probability thresholds, and AUPRC summarizes that curve in a single number.
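
The precision-recall pairs can be sketched as below, on made-up labels and scores. The area is estimated here with the step-wise sum known as average precision, which is one common estimate of AUPRC (an assumption of this sketch, since the text does not say how the area is computed). `pr_points` and `average_precision` are hypothetical helpers; scikit-learn offers `precision_recall_curve` and `average_precision_score`.

```python
def pr_points(y_true, scores):
    """Return (recall, precision) pairs, from the strictest threshold to the loosest."""
    pos = sum(y_true)
    points = []
    for thr in sorted(set(scores), reverse=True):
        tp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 1)
        fp = sum(1 for t, s in zip(y_true, scores) if s >= thr and t == 0)
        # tp + fp >= 1 here, because thr is always one of the observed scores.
        points.append((tp / pos, tp / (tp + fp)))
    return points

def average_precision(points):
    """Step-wise area: sum of (increase in recall) * (precision at that step)."""
    ap, prev_recall = 0.0, 0.0
    for recall, precision in points:
        ap += (recall - prev_recall) * precision
        prev_recall = recall
    return ap

# Made-up labels (1 = churn) and model scores.
y_true = [1, 1, 0, 1, 0, 0]
scores = [0.9, 0.8, 0.7, 0.6, 0.4, 0.2]

points = pr_points(y_true, scores)
for recall, precision in points:
    print(f"recall={recall:.2f}  precision={precision:.2f}")

ap = average_precision(points)
print(round(ap, 4))  # 0.9167
```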




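#### 8. F1-score
The F1 metric is the harmonic mean of precision and recall:
$$F_1 = \frac{2 \cdot \text{Precision} \cdot \text{Recall}}{\text{Precision} + \text{Recall}} = \frac{2TP}{2TP+FP+FN} $$
This metric is a good choice for the imbalanced classification scenario because it ignores true negatives, which dominate accuracy when the negative class is large. The range of F1 is [0, 1], where 1 is perfect classification and 0 is total failure.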
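
A short sketch of F1 as the harmonic mean of precision and recall, using made-up counts (assumed for illustration, not results from the churn model). `f1_score_from` is a hypothetical helper.

```python
def f1_score_from(precision, recall):
    """Harmonic mean of precision and recall (0.0 when both are zero)."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)

# Hypothetical counts: 30 true positives, 10 false positives, 20 false negatives.
TP, FP, FN = 30, 10, 20
precision = TP / (TP + FP)  # 0.75
recall = TP / (TP + FN)     # 0.6

f1 = f1_score_from(precision, recall)
f1_from_counts = 2 * TP / (2 * TP + FP + FN)  # equivalent form using counts only
print(round(f1, 4), round(f1_from_counts, 4))  # 0.6667 0.6667

# The harmonic mean punishes an imbalance between the two metrics:
lopsided = f1_score_from(1.0, 0.1)
print(round(lopsided, 4))  # 0.1818, while the arithmetic mean would be 0.55
```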

