# Customer Churn Analysis
Churn rate, when applied to a customer base, refers to the proportion of contractual customers or subscribers who leave a supplier during a given time period. It is a possible indicator of customer dissatisfaction, cheaper and/or better offers from the competition, more successful sales and/or marketing by the competition, or reasons having to do with the customer life cycle.

Churn is closely related to the concept of average customer life time. For example, an annual churn rate of 25 percent implies an average customer life of four years. An annual churn rate of 33 percent implies an average customer life of three years. The churn rate can be minimized by creating barriers which discourage customers to change suppliers (contractual binding periods, use of proprietary technology, value-added services, unique business models, etc.), or through retention activities such as loyalty programs. It is possible to overstate the churn rate, as when a consumer drops the service but then restarts it within the same year. Thus, a clear distinction needs to be made between "gross churn", the total number of absolute disconnections, and "net churn", the overall loss of subscribers or members. The difference between the two measures is the number of new subscribers or members that have joined during the same period. Suppliers may find that if they offer a loss-leader "introductory special", it can lead to a higher churn rate and subscriber abuse, as some subscribers will sign on, let the service lapse, then sign on again to take continuous advantage of current specials. https://en.wikipedia.org/wiki/Churn_rate

In [5]:
%%capture

import numpy as np
import pandas as pd
import h2o
from h2o.automl import H2OAutoML
from __future__ import print_function
import pandas_profiling

# Suppress unwatned warnings
import warnings
warnings.filterwarnings('ignore')
import logging
logging.getLogger("requests").setLevel(logging.WARNING)

In [2]:
# Load our favorite visualization library
import os
import plotly
import plotly.plotly as py
import plotly.figure_factory as ff
import plotly.graph_objs as go
import cufflinks as cf
plotly.offline.init_notebook_mode(connected=True)

# Sign into Plotly with masked, encrypted API key

myPlotlyKey = os.environ['SECRET_ENV_BRETTS_PLOTLY_KEY']
py.sign_in(username='bretto777',api_key=myPlotlyKey)


### Load The Dataset

In [11]:
# Load some data
churnDF = pd.read_csv('https://s3-us-west-1.amazonaws.com/dsclouddata/home/jupyter/churn_train.csv', delimiter=',')
churnDF["Churn"] = churnDF["Churn"].replace(to_replace=False, value='Retain')
churnDF["Churn"] = churnDF["Churn"].replace(to_replace=True, value='Churn')
churnDFs = churnDF.sample(frac=0.07) # Sample for speedy viz
churnDF.head(5)

Unnamed: 0,State,Account Length,Area Code,Phone,Int'l Plan,VMail Plan,VMail Message,Day Mins,Day Calls,Day Charge,...,Eve Calls,Eve Charge,Night Mins,Night Calls,Night Charge,Intl Mins,Intl Calls,Intl Charge,CustServ Calls,Churn
0,ND,84,415,400-7253,no,yes,33,159.1,106,27.05,...,101,12.73,213.4,108,9.6,13.0,18,3.51,1,Retain
1,RI,117,408,370-5042,no,yes,13,207.6,65,35.29,...,77,12.98,232.8,95,10.48,9.7,3,2.62,1,Retain
2,VA,132,510,343-4696,no,no,0,81.1,86,13.79,...,72,20.84,237.0,115,10.67,10.3,2,2.78,0,Retain
3,OK,121,408,364-2495,no,yes,31,237.1,63,40.31,...,117,17.48,196.7,85,8.85,10.1,5,2.73,4,Retain
4,ME,205,510,413-4039,no,yes,24,175.8,139,29.89,...,98,13.18,180.7,64,8.13,7.8,5,2.11,2,Retain


In [8]:
pandas_profiling.ProfileReport(churnDF)

0,1
Number of variables,21
Number of observations,2666
Total Missing (%),0.0%
Total size in memory,437.5 KiB
Average record size in memory,168.0 B

0,1
Numeric,12
Categorical,4
Boolean,0
Date,0
Text (Unique),1
Rejected,4
Unsupported,0

0,1
Distinct count,211
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,101.38
Minimum,1
Maximum,243
Zeros (%),0.0%

0,1
Minimum,1
5-th percentile,36
Q1,74
Median,100
Q3,128
95-th percentile,169
Maximum,243
Range,242
Interquartile range,54

0,1
Standard deviation,40.165
Coef of variation,0.3962
Kurtosis,-0.10673
Mean,101.38
MAD,32.142
Skewness,0.12542
Sum,270271
Variance,1613.3
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
87,40,0.0%,
105,35,0.0%,
101,33,0.0%,
90,32,0.0%,
100,30,0.0%,
86,30,0.0%,
95,30,0.0%,
93,30,0.0%,
123,29,0.0%,
88,29,0.0%,

Value,Count,Frequency (%),Unnamed: 3
1,7,0.0%,
2,1,0.0%,
3,5,0.0%,
5,1,0.0%,
6,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
221,1,0.0%,
224,2,0.0%,
225,1,0.0%,
232,1,0.0%,
243,1,0.0%,

0,1
Distinct count,3
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,436.36
Minimum,408
Maximum,510
Zeros (%),0.0%

0,1
Minimum,408
5-th percentile,408
Q1,408
Median,415
Q3,415
95-th percentile,510
Maximum,510
Range,102
Interquartile range,7

0,1
Standard deviation,41.875
Coef of variation,0.095964
Kurtosis,-0.58232
Mean,436.36
MAD,35.851
Skewness,1.1797
Sum,1163348
Variance,1753.6
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
415,1346,0.0%,
408,671,0.0%,
510,649,0.0%,

Value,Count,Frequency (%),Unnamed: 3
408,671,0.0%,
415,1346,0.0%,
510,649,0.0%,

Value,Count,Frequency (%),Unnamed: 3
408,671,0.0%,
415,1346,0.0%,
510,649,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
Retain,2264
Churn,402

Value,Count,Frequency (%),Unnamed: 3
Retain,2264,0.0%,
Churn,402,0.0%,

0,1
Distinct count,10
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,1.5698
Minimum,0
Maximum,9
Zeros (%),0.0%

0,1
Minimum,0
5-th percentile,0
Q1,1
Median,1
Q3,2
95-th percentile,4
Maximum,9
Range,9
Interquartile range,1

0,1
Standard deviation,1.3214
Coef of variation,0.8418
Kurtosis,1.8085
Mean,1.5698
MAD,1.0578
Skewness,1.0978
Sum,4185
Variance,1.7462
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
1,943,0.0%,
2,601,0.0%,
0,556,0.0%,
3,348,0.0%,
4,139,0.0%,
5,51,0.0%,
6,17,0.0%,
7,7,0.0%,
9,2,0.0%,
8,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,556,0.0%,
1,943,0.0%,
2,601,0.0%,
3,348,0.0%,
4,139,0.0%,

Value,Count,Frequency (%),Unnamed: 3
5,51,0.0%,
6,17,0.0%,
7,7,0.0%,
8,2,0.0%,
9,2,0.0%,

0,1
Distinct count,118
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,100.46
Minimum,0
Maximum,165
Zeros (%),0.0%

0,1
Minimum,0
5-th percentile,67
Q1,87
Median,101
Q3,114
95-th percentile,134
Maximum,165
Range,165
Interquartile range,27

0,1
Standard deviation,20.253
Coef of variation,0.2016
Kurtosis,0.28903
Mean,100.46
MAD,16.089
Skewness,-0.13478
Sum,267832
Variance,410.18
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
102,63,0.0%,
105,61,0.0%,
97,58,0.0%,
107,57,0.0%,
95,57,0.0%,
88,57,0.0%,
108,57,0.0%,
110,55,0.0%,
104,54,0.0%,
114,53,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,2,0.0%,
30,1,0.0%,
35,1,0.0%,
36,1,0.0%,
40,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
157,1,0.0%,
158,2,0.0%,
160,1,0.0%,
163,1,0.0%,
165,1,0.0%,

0,1
Correlation,1

0,1
Distinct count,1495
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,180.79
Minimum,0
Maximum,350.8
Zeros (%),0.0%

0,1
Minimum,0.0
5-th percentile,92.225
Q1,144.12
Median,180.25
Q3,217.6
95-th percentile,271.58
Maximum,350.8
Range,350.8
Interquartile range,73.475

0,1
Standard deviation,54.8
Coef of variation,0.30312
Kurtosis,-0.018467
Mean,180.79
MAD,43.857
Skewness,-0.01346
Sum,481980
Variance,3003.1
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
175.4,7,0.0%,
216.0,6,0.0%,
174.5,6,0.0%,
162.3,6,0.0%,
178.7,6,0.0%,
197.0,6,0.0%,
203.4,6,0.0%,
185.0,6,0.0%,
146.3,6,0.0%,
159.5,6,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,2,0.0%,
2.6,1,0.0%,
7.8,1,0.0%,
7.9,1,0.0%,
12.5,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
335.5,1,0.0%,
337.4,1,0.0%,
345.3,1,0.0%,
346.8,1,0.0%,
350.8,1,0.0%,

0,1
Distinct count,123
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,100.15
Minimum,0
Maximum,170
Zeros (%),0.0%

0,1
Minimum,0
5-th percentile,66
Q1,87
Median,101
Q3,114
95-th percentile,133
Maximum,170
Range,170
Interquartile range,27

0,1
Standard deviation,20.145
Coef of variation,0.20114
Kurtosis,0.26876
Mean,100.15
MAD,16.021
Skewness,-0.083696
Sum,267006
Variance,405.8
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
105,67,0.0%,
94,60,0.0%,
97,58,0.0%,
108,58,0.0%,
103,57,0.0%,
111,55,0.0%,
102,55,0.0%,
98,54,0.0%,
101,53,0.0%,
109,52,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,1,0.0%,
12,1,0.0%,
36,1,0.0%,
37,1,0.0%,
42,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
157,1,0.0%,
159,1,0.0%,
164,1,0.0%,
168,1,0.0%,
170,1,0.0%,

0,1
Correlation,1

0,1
Distinct count,1431
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,200.95
Minimum,0
Maximum,363.7
Zeros (%),0.0%

0,1
Minimum,0.0
5-th percentile,119.75
Q1,167.1
Median,201.15
Q3,235.0
95-th percentile,283.17
Maximum,363.7
Range,363.7
Interquartile range,67.9

0,1
Standard deviation,50.01
Coef of variation,0.24886
Kurtosis,0.10225
Mean,200.95
MAD,39.87
Skewness,-0.040083
Sum,535740
Variance,2501
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
169.9,8,0.0%,
201.0,7,0.0%,
220.6,7,0.0%,
185.5,6,0.0%,
195.7,6,0.0%,
230.0,6,0.0%,
161.7,6,0.0%,
180.5,6,0.0%,
216.5,6,0.0%,
199.7,6,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,1,0.0%,
31.2,1,0.0%,
42.2,1,0.0%,
42.5,1,0.0%,
43.9,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
348.5,1,0.0%,
350.9,1,0.0%,
351.6,1,0.0%,
361.8,1,0.0%,
363.7,1,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
no,2404
yes,262

Value,Count,Frequency (%),Unnamed: 3
no,2404,0.0%,
yes,262,0.0%,

0,1
Distinct count,20
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,4.4677
Minimum,0
Maximum,20
Zeros (%),0.0%

0,1
Minimum,0
5-th percentile,1
Q1,3
Median,4
Q3,6
95-th percentile,9
Maximum,20
Range,20
Interquartile range,3

0,1
Standard deviation,2.4781
Coef of variation,0.55467
Kurtosis,3.0595
Mean,4.4677
MAD,1.8982
Skewness,1.3095
Sum,11911
Variance,6.141
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
3,541,0.0%,
4,489,0.0%,
2,389,0.0%,
5,365,0.0%,
6,266,0.0%,
7,177,0.0%,
1,136,0.0%,
8,99,0.0%,
9,80,0.0%,
10,43,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,17,0.0%,
1,136,0.0%,
2,389,0.0%,
3,541,0.0%,
4,489,0.0%,

Value,Count,Frequency (%),Unnamed: 3
15,5,0.0%,
16,1,0.0%,
18,3,0.0%,
19,1,0.0%,
20,1,0.0%,

0,1
Correlation,0.99999

0,1
Distinct count,161
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,10.255
Minimum,0
Maximum,20
Zeros (%),0.0%

0,1
Minimum,0.0
5-th percentile,5.7
Q1,8.5
Median,10.3
Q3,12.1
95-th percentile,14.7
Maximum,20.0
Range,20.0
Interquartile range,3.6

0,1
Standard deviation,2.8052
Coef of variation,0.27355
Kurtosis,0.73899
Mean,10.255
MAD,2.1875
Skewness,-0.28011
Sum,27339
Variance,7.8692
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
11.3,52,0.0%,
10.0,49,0.0%,
11.0,46,0.0%,
9.8,45,0.0%,
11.1,45,0.0%,
10.1,45,0.0%,
11.8,42,0.0%,
11.4,42,0.0%,
10.2,42,0.0%,
10.9,41,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0.0,17,0.0%,
1.1,1,0.0%,
1.3,1,0.0%,
2.0,2,0.0%,
2.1,2,0.0%,

Value,Count,Frequency (%),Unnamed: 3
18.2,2,0.0%,
18.3,1,0.0%,
18.4,1,0.0%,
18.9,1,0.0%,
20.0,1,0.0%,

0,1
Distinct count,116
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,99.978
Minimum,33
Maximum,175
Zeros (%),0.0%

0,1
Minimum,33
5-th percentile,68
Q1,87
Median,100
Q3,113
95-th percentile,131
Maximum,175
Range,142
Interquartile range,26

0,1
Standard deviation,19.462
Coef of variation,0.19467
Kurtosis,-0.077178
Mean,99.978
MAD,15.615
Skewness,0.053485
Sum,266541
Variance,378.78
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
105,70,0.0%,
104,67,0.0%,
91,65,0.0%,
98,55,0.0%,
100,55,0.0%,
94,54,0.0%,
103,53,0.0%,
106,52,0.0%,
102,52,0.0%,
92,52,0.0%,

Value,Count,Frequency (%),Unnamed: 3
33,1,0.0%,
36,1,0.0%,
38,1,0.0%,
44,1,0.0%,
48,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
156,1,0.0%,
157,2,0.0%,
158,1,0.0%,
166,1,0.0%,
175,1,0.0%,

0,1
Correlation,1

0,1
Distinct count,1430
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,200.6
Minimum,23.2
Maximum,377.5
Zeros (%),0.0%

0,1
Minimum,23.2
5-th percentile,118.3
Q1,167.2
Median,200.85
Q3,234.5
95-th percentile,283.12
Maximum,377.5
Range,354.3
Interquartile range,67.3

0,1
Standard deviation,50.289
Coef of variation,0.25069
Kurtosis,0.095734
Mean,200.6
MAD,40.009
Skewness,-0.018775
Sum,534810
Variance,2529
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
214.6,7,0.0%,
194.3,7,0.0%,
188.2,7,0.0%,
191.4,7,0.0%,
197.4,6,0.0%,
182.4,6,0.0%,
197.3,6,0.0%,
193.6,6,0.0%,
207.2,6,0.0%,
190.5,6,0.0%,

Value,Count,Frequency (%),Unnamed: 3
23.2,1,0.0%,
45.0,1,0.0%,
47.4,1,0.0%,
50.1,2,0.0%,
53.3,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
354.9,1,0.0%,
364.3,1,0.0%,
364.9,1,0.0%,
367.7,1,0.0%,
377.5,1,0.0%,

First 3 values
401-5012
343-1538
382-4952

Last 3 values
401-5169
390-3565
367-1062

Value,Count,Frequency (%),Unnamed: 3
327-1058,1,0.0%,
327-1319,1,0.0%,
327-3053,1,0.0%,
327-3954,1,0.0%,
327-4795,1,0.0%,

Value,Count,Frequency (%),Unnamed: 3
422-7728,1,0.0%,
422-8268,1,0.0%,
422-8333,1,0.0%,
422-8344,1,0.0%,
422-9964,1,0.0%,

0,1
Distinct count,51
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
WV,77
MN,73
NY,69
Other values (48),2447

Value,Count,Frequency (%),Unnamed: 3
WV,77,0.0%,
MN,73,0.0%,
NY,69,0.0%,
AL,68,0.0%,
VA,67,0.0%,
WI,65,0.0%,
OH,64,0.0%,
OR,63,0.0%,
CT,63,0.0%,
ID,62,0.0%,

0,1
Distinct count,44
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0
Infinite (%),0.0%
Infinite (n),0

0,1
Mean,8.0555
Minimum,0
Maximum,51
Zeros (%),0.0%

0,1
Minimum,0
5-th percentile,0
Q1,0
Median,0
Q3,19
95-th percentile,36
Maximum,51
Range,51
Interquartile range,19

0,1
Standard deviation,13.647
Coef of variation,1.6941
Kurtosis,-0.038274
Mean,8.0555
MAD,11.675
Skewness,1.2694
Sum,21476
Variance,186.24
Memory size,20.9 KiB

Value,Count,Frequency (%),Unnamed: 3
0,1932,0.0%,
31,48,0.0%,
33,41,0.0%,
28,40,0.0%,
27,38,0.0%,
30,37,0.0%,
24,36,0.0%,
32,35,0.0%,
29,34,0.0%,
23,31,0.0%,

Value,Count,Frequency (%),Unnamed: 3
0,1932,0.0%,
8,2,0.0%,
9,2,0.0%,
11,2,0.0%,
12,4,0.0%,

Value,Count,Frequency (%),Unnamed: 3
47,3,0.0%,
48,2,0.0%,
49,1,0.0%,
50,1,0.0%,
51,1,0.0%,

0,1
Distinct count,2
Unique (%),0.0%
Missing (%),0.0%
Missing (n),0

0,1
no,1932
yes,734

Value,Count,Frequency (%),Unnamed: 3
no,1932,0.0%,
yes,734,0.0%,

Unnamed: 0,State,Account Length,Area Code,Phone,Int'l Plan,VMail Plan,VMail Message,Day Mins,Day Calls,Day Charge,Eve Mins,Eve Calls,Eve Charge,Night Mins,Night Calls,Night Charge,Intl Mins,Intl Calls,Intl Charge,CustServ Calls,Churn
0,ND,84,415,400-7253,no,yes,33,159.1,106,27.05,149.8,101,12.73,213.4,108,9.6,13.0,18,3.51,1,Retain
1,RI,117,408,370-5042,no,yes,13,207.6,65,35.29,152.7,77,12.98,232.8,95,10.48,9.7,3,2.62,1,Retain
2,VA,132,510,343-4696,no,no,0,81.1,86,13.79,245.2,72,20.84,237.0,115,10.67,10.3,2,2.78,0,Retain
3,OK,121,408,364-2495,no,yes,31,237.1,63,40.31,205.6,117,17.48,196.7,85,8.85,10.1,5,2.73,4,Retain
4,ME,205,510,413-4039,no,yes,24,175.8,139,29.89,155.0,98,13.18,180.7,64,8.13,7.8,5,2.11,2,Retain


## Scatterplot Matrix

In [12]:
# separate the calls data for plotting

churnDFs = churnDFs[['Account Length','Day Calls','Eve Calls','CustServ Calls','Churn']]

# Create scatter plot matrix of call data
splom = ff.create_scatterplotmatrix(churnDFs, diag='histogram', index='Churn',  
                                  colormap= dict(
                                      Churn = '#9CBEF1',
                                      Retain = '#04367F'
                                      ),
                                  colormap_type='cat',
                                  height=560, width=650,
                                  size=4, marker=dict(symbol='circle'))
py.iplot(splom)

In [13]:
%%capture

#h2o.connect(ip="35.225.239.147")
h2o.init(nthreads=1, max_mem_size="768m")

In [14]:
%%capture

# Split data into training and testing frames

from sklearn import cross_validation
from sklearn.model_selection import train_test_split

training, testing = train_test_split(churnDF, train_size=0.8, stratify=churnDF["Churn"], random_state=9)
train = h2o.H2OFrame(python_obj=training).drop("State")
test = h2o.H2OFrame(python_obj=testing).drop("State")

# Set predictor and response variables
y = "Churn"
x = train.columns
x.remove(y)

# Automatic Machine Learning

The Automatic Machine Learning (AutoML) function automates the supervised machine learning model training process. The current version of AutoML trains and cross-validates a Random Forest, an Extremely-Randomized Forest, a random grid of Gradient Boosting Machines (GBMs), a random grid of Deep Neural Nets, and a Stacked Ensemble of all the models.

In [17]:
%%capture
# Run AutoML until 11 models are built
autoModel = H2OAutoML(max_models = 20)
autoModel.train(x = x, y = y,
          training_frame = train,
          validation_frame = test, 
          leaderboard_frame = test)

## Leaderboard

In [18]:
leaders = autoModel.leaderboard
leaders

model_id,auc,logloss
StackedEnsemble_AllModels_0_AutoML_20180216_175915,0.952321,0.125829
GBM_grid_0_AutoML_20180216_175915_model_1,0.946107,0.132584
GBM_grid_0_AutoML_20180216_175915_model_0,0.946039,0.144563
GBM_grid_0_AutoML_20180216_175915_model_3,0.944594,0.13859
GBM_grid_0_AutoML_20180216_175915_model_2,0.944036,0.136333
GBM_grid_0_AutoML_20180216_175427_model_2,0.942455,0.13896
GBM_grid_0_AutoML_20180216_175427_model_0,0.939362,0.141543
GBM_grid_0_AutoML_20180216_175427_model_3,0.938667,0.13965
GBM_grid_0_AutoML_20180216_175915_model_6,0.937945,0.149145
GBM_grid_0_AutoML_20180216_175427_model_1,0.937318,0.136583




# Variable Importances
Below we plot variable importances as reported by the best performing algo in the ensemble.

In [19]:
importances = h2o.get_model(leaders[2, 0]).varimp(use_pandas=True)
importances = importances.loc[:,['variable','relative_importance']].groupby('variable').mean()
importances.sort_values(by="relative_importance", ascending=False).iplot(kind='bar', colors='#5AC4F2', theme='white')

In [20]:
import matplotlib.pyplot as plt
plt.figure()
bestModel = h2o.get_model(leaders[2, 0])
plt = bestModel.partial_plot(data=test, cols=["Day Mins","CustServ Calls","Day Charge"])


PartialDependencePlot progress: |█████████████████████████████████████████| 100%


# Best Model vs the Base Learners
This plot shows the ROC curves for the Super Model, the Best Base Model, and 9 next best models in the ensemble. 

In [21]:
Model0 = np.array(h2o.get_model(leaders[0,0]).roc(xval=True))
Model1 = np.array(h2o.get_model(leaders[1,0]).roc(xval=True))
Model2 = np.array(h2o.get_model(leaders[2,0]).roc(xval=True))
Model3 = np.array(h2o.get_model(leaders[3,0]).roc(xval=True))
Model4 = np.array(h2o.get_model(leaders[4,0]).roc(xval=True))
Model5 = np.array(h2o.get_model(leaders[5,0]).roc(xval=True))
Model6 = np.array(h2o.get_model(leaders[6,0]).roc(xval=True))
Model7 = np.array(h2o.get_model(leaders[7,0]).roc(xval=True))
Model8 = np.array(h2o.get_model(leaders[8,0]).roc(xval=True))
Model9 = np.array(h2o.get_model(leaders[9,0]).roc(xval=True))

layout = go.Layout(autosize=False, width=725, height=575,  xaxis=dict(title='False Positive Rate', titlefont=dict(family='Arial, sans-serif', size=15, color='grey')), 
                                                           yaxis=dict(title='True Positive Rate', titlefont=dict(family='Arial, sans-serif', size=15, color='grey')))

traceChanceLine = go.Scatter(x = [0,1], y = [0,1], mode = 'lines+markers', name = 'chance', line = dict(color = ('rgb(136, 140, 150)'), width = 4, dash = 'dash'))
Model0Trace = go.Scatter(x = Model0[0], y = Model0[1], mode = 'lines', name = 'Model 0', line = dict(color = ('rgb(26, 58, 126)'), width = 3))
Model1Trace = go.Scatter(x = Model1[0], y = Model1[1], mode = 'lines', name = 'Model 1', line = dict(color = ('rgb(156, 190, 241))'), width = 1))
Model2Trace = go.Scatter(x = Model2[0], y = Model2[1], mode = 'lines', name = 'Model 2', line = dict(color = ('rgb(156, 190, 241)'), width = 1))
Model3Trace = go.Scatter(x = Model3[0], y = Model3[1], mode = 'lines', name = 'Model 3', line = dict(color = ('rgb(156, 190, 241)'), width = 1))
Model4Trace = go.Scatter(x = Model4[0], y = Model4[1], mode = 'lines', name = 'Model 4', line = dict(color = ('rgb(156, 190, 241)'), width = 1))
Model5Trace = go.Scatter(x = Model5[0], y = Model5[1], mode = 'lines', name = 'Model 5', line = dict(color = ('rgb(156, 190, 241)'), width = 1))
Model6Trace = go.Scatter(x = Model6[0], y = Model6[1], mode = 'lines', name = 'Model 6', line = dict(color = ('rgb(156, 190, 241)'), width = 1))
Model7Trace = go.Scatter(x = Model7[0], y = Model7[1], mode = 'lines', name = 'Model 7', line = dict(color = ('rgb(156, 190, 241)'), width = 1))
Model8Trace = go.Scatter(x = Model8[0], y = Model8[1], mode = 'lines', name = 'Model 8', line = dict(color = ('rgb(156, 190, 241)'), width = 1))
Model9Trace = go.Scatter(x = Model9[0], y = Model9[1], mode = 'lines', name = 'Model 9', line = dict(color = ('rgb(156, 190, 241)'), width = 1))

fig = go.Figure(data=[Model0Trace,Model1Trace,Model2Trace,Model3Trace,Model4Trace,Model5Trace,Model6Trace,Model8Trace,Model9Trace,traceChanceLine], layout=layout)

py.iplot(fig)

# Confusion Matrix

In [22]:
cm = h2o.get_model(leaders[1, 0]).confusion_matrix(xval=True)
cm = cm.table.as_data_frame()
cm
confusionMatrix = ff.create_table(cm)
confusionMatrix.layout.height=300
confusionMatrix.layout.width=800
confusionMatrix.layout.font.size=17
py.iplot(confusionMatrix)

# Business Impact Matrix

Weighting Predictions With a Dollar Value
-   Correctly predicting retain: `+$5`
-   Correctly predicting churn: `+$75`
-   Incorrectly predicting retain: `-$150`
-   Incorrectly predicting churn: `-$1.5`

    

In [23]:
CorrectPredictChurn = cm.loc[0,'Churn']
CorrectPredictChurnImpact = 75
cm1 = CorrectPredictChurn*CorrectPredictChurnImpact

IncorrectPredictChurn = cm.loc[1,'Churn']
IncorrectPredictChurnImpact = -5
cm2 = IncorrectPredictChurn*IncorrectPredictChurnImpact

IncorrectPredictRetain = cm.loc[0,'Retain']
IncorrectPredictRetainImpact = -150
cm3 = IncorrectPredictRetain*IncorrectPredictRetainImpact

CorrectPredictRetain = cm.loc[0,'Retain']
CorrectPredictRetainImpact = 5
cm4 = IncorrectPredictRetain*CorrectPredictRetainImpact


data_matrix = [['Business Impact', '($) Predicted Churn', '($) Predicted Retain', '($) Total'],
               ['($) Actual Churn', cm1, cm3, '' ],
               ['($) Actual Retain', cm2, cm4, ''],
               ['($) Total', cm1+cm2, cm3+cm4, cm1+cm2+cm3+cm4]]

impactMatrix = ff.create_table(data_matrix, height_constant=20, hoverinfo='weight')
impactMatrix.layout.height=300
impactMatrix.layout.width=800
impactMatrix.layout.font.size=17
py.iplot(impactMatrix)

In [24]:
print("Total customers evaluated: 2132")

Total customers evaluated: 2132


In [25]:
print("Total value created by the model: $" + str(cm1+cm2+cm3+cm4))

Total value created by the model: $3955.0


In [26]:
print("Total value per customer: $" +str(round(((cm1+cm2+cm3+cm4)/2132),3)))

Total value per customer: $1.855


In [48]:
%%capture
# Save the best model

path = h2o.save_model(model=h2o.get_model(leaders[0, 0]), force=True)
os.rename(h2o.get_model(leaders[0, 0]).model_id, "AutoML-leader")    

In [49]:
%%capture
LoadedEnsemble = h2o.load_model(path="AutoML-leader")
print(LoadedEnsemble)