### Introduction:

According to its documentation (https://evalml.featurelabs.com/en/v0.9.0/), EvalML is an AutoML library that builds, optimizes, and evaluates machine learning pipelines using domain-specific objective functions.

### Problem statement: 
This project is based on a customer churn dataset, available at https://www.kaggle.com/sakshigoyal7/credit-card-customers. I use EvalML to predict which customers are likely to churn, that is, to leave one service for another.
The project considers the following research questions:
+ How does the accuracy of the model found by EvalML compare with that of classical machine learning models?
+ What are the best scores, measured by confusion matrix, classification report, and ROC AUC score, for the EvalML model and for the other machine learning models?

### Problem analysis:
I use the EvalML machine learning library to build a churn classifier and measure its accuracy. I also train other models imported from sklearn and compare their results with the EvalML model. My aim is to find out whether EvalML gives better accuracy and classification quality than these models.

In [191]:
import pandas as pd
import numpy as np
import evalml

In [192]:
df=pd.read_csv('train.csv')

In [193]:
df.head()

| | state | account_length | area_code | international_plan | voice_mail_plan | number_vmail_messages | total_day_minutes | total_day_calls | total_day_charge | total_eve_minutes | total_eve_calls | total_eve_charge | total_night_minutes | total_night_calls | total_night_charge | total_intl_minutes | total_intl_calls | total_intl_charge | number_customer_service_calls | churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | OH | 107 | area_code_415 | no | yes | 26 | 161.6 | 123 | 27.47 | 195.5 | 103 | 16.62 | 254.4 | 103 | 11.45 | 13.7 | 3 | 3.7 | 1 | no |
| 1 | NJ | 137 | area_code_415 | no | no | 0 | 243.4 | 114 | 41.38 | 121.2 | 110 | 10.3 | 162.6 | 104 | 7.32 | 12.2 | 5 | 3.29 | 0 | no |
| 2 | OH | 84 | area_code_408 | yes | no | 0 | 299.4 | 71 | 50.9 | 61.9 | 88 | 5.26 | 196.9 | 89 | 8.86 | 6.6 | 7 | 1.78 | 2 | no |
| 3 | OK | 75 | area_code_415 | yes | no | 0 | 166.7 | 113 | 28.34 | 148.3 | 122 | 12.61 | 186.9 | 121 | 8.41 | 10.1 | 3 | 2.73 | 3 | no |
| 4 | MA | 121 | area_code_510 | no | yes | 24 | 218.2 | 88 | 37.09 | 348.5 | 108 | 29.62 | 212.6 | 118 | 9.57 | 7.5 | 7 | 2.03 | 3 | no |


#### Drop the columns that contribute little to the output, keeping only those that correlate directly with it.
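
The cells that follow do not show how feature-to-target correlation was measured, so here is one possible way to screen features, sketched on a tiny made-up frame. The `toy` data, the 0.6 cut-off, and the variable names are all assumptions for illustration and are not taken from the notebook.

```python
import pandas as pd

# Tiny made-up frame standing in for the churn data (values are illustrative only).
toy = pd.DataFrame({
    "total_day_minutes": [161.6, 243.4, 299.4, 166.7, 218.2, 310.0],
    "account_length": [107, 137, 84, 75, 121, 90],
    "churn": ["no", "no", "yes", "no", "no", "yes"],
})

# Encode the target as 0/1 so a correlation can be computed.
target = (toy["churn"] == "yes").astype(int)

# Absolute Pearson correlation of each numeric feature with the target.
features = toy.drop(columns=["churn"])
corr = features.corrwith(target).abs().sort_values(ascending=False)

# Keep features at or above a hypothetical cut-off; 0.6 is an arbitrary choice.
threshold = 0.6
kept = corr[corr >= threshold].index.tolist()
reduced = toy[kept + ["churn"]]

print(corr)
print(kept)
```

Pearson correlation only captures linear relationships, so a feature that scores low here can still be useful to a tree-based model.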

In [194]:
print(f'Feature types: {df.dtypes.unique()}')

Feature types: [dtype('O') dtype('int64') dtype('float64')]


In [209]:
df.info()

<class 'pandas.core.frame.DataFrame'>
Int64Index: 4250 entries, 0 to 4249
Data columns (total 20 columns):
 #   Column                         Non-Null Count  Dtype  
---  ------                         --------------  -----  
 0   state                          4250 non-null   uint8  
 1   account_length                 4250 non-null   int64  
 2   area_code                      4250 non-null   object 
 3   international_plan             4250 non-null   uint8  
 4   voice_mail_plan                4250 non-null   uint8  
 5   number_vmail_messages          4250 non-null   int64  
 6   total_day_minutes              4250 non-null   float64
 7   total_day_calls                4250 non-null   int64  
 8   total_day_charge               4250 non-null   float64
 9   total_eve_minutes              4250 non-null   float64
 10  total_eve_calls                4250 non-null   int64  
 11  total_eve_charge               4250 non-null   float64
 12  total_night_minutes            4250 non-null   f

In [210]:
df.isnull().sum().sum()

0

In [211]:
df=df.dropna()

In [212]:
df.head()

| | state | account_length | area_code | international_plan | voice_mail_plan | number_vmail_messages | total_day_minutes | total_day_calls | total_day_charge | total_eve_minutes | total_eve_calls | total_eve_charge | total_night_minutes | total_night_calls | total_night_charge | total_intl_minutes | total_intl_calls | total_intl_charge | number_customer_service_calls | churn |
|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|---|
| 0 | 0 | 107 | area_code_415 | 1 | 0 | 26 | 161.6 | 123 | 27.47 | 195.5 | 103 | 16.62 | 254.4 | 103 | 11.45 | 13.7 | 3 | 3.7 | 1 | no |
| 1 | 0 | 137 | area_code_415 | 1 | 1 | 0 | 243.4 | 114 | 41.38 | 121.2 | 110 | 10.3 | 162.6 | 104 | 7.32 | 12.2 | 5 | 3.29 | 0 | no |
| 2 | 0 | 84 | area_code_408 | 0 | 1 | 0 | 299.4 | 71 | 50.9 | 61.9 | 88 | 5.26 | 196.9 | 89 | 8.86 | 6.6 | 7 | 1.78 | 2 | no |
| 3 | 0 | 75 | area_code_415 | 0 | 1 | 0 | 166.7 | 113 | 28.34 | 148.3 | 122 | 12.61 | 186.9 | 121 | 8.41 | 10.1 | 3 | 2.73 | 3 | no |
| 4 | 0 | 121 | area_code_510 | 1 | 0 | 24 | 218.2 | 88 | 37.09 | 348.5 | 108 | 29.62 | 212.6 | 118 | 9.57 | 7.5 | 7 | 2.03 | 3 | no |


In [213]:
df['state']=pd.get_dummies(df['state'])

In [214]:
df['international_plan']=pd.get_dummies(df['international_plan'])

In [215]:
df['voice_mail_plan']=pd.get_dummies(df['voice_mail_plan'])
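
The three cells above assign the result of `pd.get_dummies` back to a single column. In the recorded `df.head()` output, `state` is 0 in every row and `international_plan` is 1 for `no`, which suggests only the first indicator column was kept; newer pandas versions may reject this assignment altogether. A hedged alternative, sketched on a made-up `toy` frame: map yes/no columns to 0/1 explicitly and expand multi-valued columns into one indicator per category.

```python
import pandas as pd

# Made-up rows for illustration; column names mirror the notebook's data.
toy = pd.DataFrame({
    "state": ["OH", "NJ", "OH", "OK"],
    "international_plan": ["no", "no", "yes", "yes"],
})

# Binary yes/no column: map to a single 0/1 column with an explicit meaning.
toy["international_plan"] = toy["international_plan"].map({"no": 0, "yes": 1})

# Multi-valued column: expand into one indicator column per category.
encoded = pd.get_dummies(toy, columns=["state"], prefix="state", dtype=int)

print(encoded)
```

Applying this to the real data would change the feature matrix, so the scores recorded later in this notebook would not be reproduced exactly.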

In [228]:
del df['area_code']

### Define the x and y variables

In [229]:
x=df.drop(['churn'], axis=1).values

In [230]:
from sklearn.preprocessing import StandardScaler

In [231]:
sc=StandardScaler()

In [232]:
sc.fit(x)

StandardScaler()

In [233]:
x=sc.fit_transform(x)

In [234]:
x.T

array([[ 0.1206729 ,  0.1206729 ,  0.1206729 , ...,  0.1206729 ,
         0.1206729 ,  0.1206729 ],
       [ 0.17039882,  0.92618569, -0.40903778, ..., -0.63577385,
        -1.26559624, -0.35865199],
       [-0.32054702, -0.32054702,  3.11966717, ..., -0.32054702,
        -0.32054702, -0.32054702],
       ...,
       [-0.57916393,  0.2329267 ,  1.04501732, ...,  1.04501732,
         0.2329267 ,  4.69942514],
       [ 1.24859124,  0.69834168, -1.32818716, ..., -1.2208214 ,
        -0.13374301, -0.34847454],
       [-0.42634613, -1.1889602 ,  0.33626795, ..., -0.42634613,
         0.33626795, -1.1889602 ]])

In [235]:
y=df['churn'].values

In [236]:
from sklearn.model_selection import train_test_split
x_train, x_test, y_train, y_test=train_test_split(x, y, test_size=.25, random_state=62)
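
The cells above scale the full feature matrix before splitting it, so the test rows influence the scaler's mean and variance. A common alternative, sketched here on synthetic data, is to split first and fit the scaler on the training portion only. The arrays `x_raw` and `y_raw` and the `stratify` argument are assumptions added for illustration.

```python
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.preprocessing import StandardScaler

# Synthetic stand-in for the churn features and labels (not the real data).
rng = np.random.default_rng(62)
x_raw = rng.normal(loc=100.0, scale=20.0, size=(200, 3))
y_raw = rng.choice(["no", "yes"], size=200, p=[0.85, 0.15])

# Split first; stratify keeps the class ratio similar in both parts.
x_tr, x_te, y_tr, y_te = train_test_split(
    x_raw, y_raw, test_size=0.25, random_state=62, stratify=y_raw
)

# Fit the scaler on the training rows only, then reuse it for the test rows.
scaler = StandardScaler()
x_tr_scaled = scaler.fit_transform(x_tr)
x_te_scaled = scaler.transform(x_te)

print(x_tr_scaled.mean(axis=0), x_te_scaled.mean(axis=0))
```

Tree-based models such as random forests and XGBoost are largely insensitive to feature scaling, so this mainly matters when scale-sensitive models are added to the comparison.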

#### Develop an ML model

In [237]:
from sklearn.ensemble import RandomForestClassifier

In [238]:
rf=RandomForestClassifier(n_estimators=25)

In [239]:
rf.fit(x_train, y_train)

RandomForestClassifier(n_estimators=25)

In [240]:
y_pred_rf=rf.predict(x_test)

In [241]:
from sklearn.metrics import accuracy_score, classification_report, confusion_matrix

In [242]:
accuracy_score(y_test, y_pred_rf)

0.9567262464722484

In [243]:
confusion_matrix(y_test, y_pred_rf)

array([[901,   9],
       [ 37, 116]], dtype=int64)
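
The confusion matrix above can be turned into the usual per-class metrics by hand. This worked example assumes the rows and columns are ordered `no`, `yes` (scikit-learn sorts labels), so `yes` is treated as the positive class.

```python
# Counts copied from the confusion matrix above (rows = actual, columns = predicted).
tn, fp = 901, 9    # actual "no":  predicted "no", predicted "yes"
fn, tp = 37, 116   # actual "yes": predicted "no", predicted "yes"

total = tn + fp + fn + tp

accuracy = (tp + tn) / total                        # share of all predictions that are correct
precision = tp / (tp + fp)                          # share of predicted churners who really churn
recall = tp / (tp + fn)                             # share of real churners that were found
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean of precision and recall

print(f"accuracy={accuracy:.4f} precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
```

The accuracy matches the `accuracy_score` value reported above. Recall on the churn class is noticeably lower than accuracy, which is why F1 is a more informative objective than accuracy for this imbalanced problem.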

In [244]:
from evalml import AutoMLSearch

In [245]:
automl=AutoMLSearch(x_train, y_train, problem_type='binary', objective='F1',
                    allowed_model_families=['random_forest', 'xgboost', 'lightgbm'],
                    additional_objectives=['accuracy binary'],
                    max_batches=5)

In [246]:
%%time
automl.search(data_checks=None)

Generating pipelines to search over...
*****************************
* Beginning pipeline search *
*****************************

Optimizing for F1. 
Greater score is better.

Searching up to 5 batches for a total of 24 pipelines. 
Allowed model families: lightgbm, random_forest, xgboost



Batch 1: (1/24) Mode Baseline Binary Classification P... Elapsed:00:00
	Starting cross validation
	Finished cross validation - mean F1: 0.000
Batch 1: (2/24) Random Forest Classifier w/ Imputer      Elapsed:00:00
	Starting cross validation
	Finished cross validation - mean F1: 0.711
Batch 1: (3/24) XGBoost Classifier w/ Imputer            Elapsed:00:00
	Starting cross validation
	Finished cross validation - mean F1: 0.823
Batch 1: (4/24) LightGBM Classifier w/ Imputer           Elapsed:00:01
	Starting cross validation
	Finished cross validation - mean F1: 0.821
Batch 2: (5/24) XGBoost Classifier w/ Imputer            Elapsed:00:02
	Starting cross validation
	Finished cross validation - mean F1: 0.744
Batch 2: (6/24) XGBoost Classifier w/ Imputer            Elapsed:00:03
	Starting cross validation
	Finished cross validation - mean F1: 0.532
Batch 2: (7/24) XGBoost Classifier w/ Imputer            Elapsed:00:04
	Starting cross validation
	Finished cross validation - mean F1: 0.798
Batch 

#### Pipeline review

In [247]:
automl.rankings

| | id | pipeline_name | score | validation_score | percent_better_than_baseline | high_variance_cv | parameters |
|---|---|---|---|---|---|---|---|
| 0 | 2 | XGBoost Classifier w/ Imputer | 0.823089 | 0.817844 | | False | {'Imputer': {'categorical_impute_strategy': 'm... |
| 1 | 3 | LightGBM Classifier w/ Imputer | 0.821449 | 0.815094 | | False | {'Imputer': {'categorical_impute_strategy': 'm... |
| 11 | 1 | Random Forest Classifier w/ Imputer | 0.710978 | 0.711297 | | False | {'Imputer': {'categorical_impute_strategy': 'm... |
| 23 | 0 | Mode Baseline Binary Classification Pipeline | 0.0 | 0.0 | | False | {'Baseline Classifier': {'strategy': 'mode'}} |


#### Pipeline description

In [248]:
automl.describe_pipeline(0)

************************************************
* Mode Baseline Binary Classification Pipeline *
************************************************

Problem Type: binary
Model Family: Baseline

Pipeline Steps
1. Baseline Classifier
	 * strategy : mode

Training
Training for binary problems.
Total training time (including CV): 0.1 seconds

Cross Validation
----------------
               F1  Accuracy Binary # Training # Validation
0           0.000            0.860   2124.000     1063.000
1           0.000            0.861   2125.000     1062.000
2           0.000            0.861   2125.000     1062.000
mean        0.000            0.860          -            -
std         0.000            0.000          -            -
coef of var   inf            0.001          -            -
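
The baseline rows above pair an accuracy of about 0.86 with an F1 of 0.000. A small sketch with scikit-learn's `DummyClassifier` shows why: a model that always predicts the majority class is right for every non-churner but never finds a churner. The 86/14 class split and the all-zero features are made up to roughly mirror the ratio seen above.

```python
import numpy as np
from sklearn.dummy import DummyClassifier
from sklearn.metrics import accuracy_score, f1_score

# Made-up labels with roughly the class ratio seen in the output above.
y_toy = np.array(["no"] * 86 + ["yes"] * 14)
x_toy = np.zeros((100, 1))  # features are ignored by the dummy model

# Always predict the most frequent class, like a mode baseline.
baseline = DummyClassifier(strategy="most_frequent").fit(x_toy, y_toy)
pred = baseline.predict(x_toy)

acc = accuracy_score(y_toy, pred)
# No "yes" is ever predicted, so precision is undefined; report F1 as 0.
f1 = f1_score(y_toy, pred, pos_label="yes", zero_division=0)

print(acc, f1)
```

This is the reason the search optimizes F1 rather than accuracy: any useful pipeline has to beat 0 on F1, whereas 0.86 accuracy is available for free.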


In [249]:
automl.get_pipeline(0)

GeneratedPipelineBinary(parameters={'Baseline Classifier':{'strategy': 'mode'},})

In [250]:
automl.best_pipeline

GeneratedPipelineBinary(parameters={'Imputer':{'categorical_impute_strategy': 'most_frequent', 'numeric_impute_strategy': 'mean', 'categorical_fill_value': None, 'numeric_fill_value': None}, 'XGBoost Classifier':{'eta': 0.1, 'max_depth': 6, 'min_child_weight': 1, 'n_estimators': 100},})

#### Best pipeline

In [251]:
best_pipeline=automl.best_pipeline

In [252]:
best_pipeline.fit(x_train, y_train)

GeneratedPipelineBinary(parameters={'Imputer':{'categorical_impute_strategy': 'most_frequent', 'numeric_impute_strategy': 'mean', 'categorical_fill_value': None, 'numeric_fill_value': None}, 'XGBoost Classifier':{'eta': 0.1, 'max_depth': 6, 'min_child_weight': 1, 'n_estimators': 100},})

In [253]:
predictions=best_pipeline.predict(x_test)

In [254]:
predictions

<DataColumn: None (Physical Type = category) (Logical Type = Categorical) (Semantic Tags = {'category'})>

In [255]:
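# Imported here because this cell runs before the import cell further down.
from evalml.model_understanding.graphs import graph_confusion_matrix
graph_confusion_matrix(y_test, predictions)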

In [256]:
print(best_pipeline)

Mode Baseline Binary Classification Pipeline


In [257]:
automl.describe_pipeline(automl.rankings.iloc[1]["id"])

**********************************
* LightGBM Classifier w/ Imputer *
**********************************

Problem Type: binary
Model Family: LightGBM

Pipeline Steps
1. Imputer
	 * categorical_impute_strategy : most_frequent
	 * numeric_impute_strategy : mean
	 * categorical_fill_value : None
	 * numeric_fill_value : None
2. LightGBM Classifier
	 * boosting_type : gbdt
	 * learning_rate : 0.1
	 * n_estimators : 100
	 * max_depth : 0
	 * num_leaves : 31
	 * min_child_samples : 20
	 * n_jobs : -1
	 * bagging_freq : 0
	 * bagging_fraction : 0.9

Training
Training for binary problems.
Total training time (including CV): 0.8 seconds

Cross Validation
----------------
               F1  Accuracy Binary # Training # Validation
0           0.815            0.954   2124.000     1063.000
1           0.813            0.953   2125.000     1062.000
2           0.836            0.959   2125.000     1062.000
mean        0.821            0.955          -            -
std         0.012            0.003

#### Graphical representation

In [258]:
from evalml.model_understanding.graphs import (
    graph_binary_objective_vs_threshold,
    graph_permutation_importance,
    graph_confusion_matrix
)

In [259]:
graph_binary_objective_vs_threshold(best_pipeline, x_test, y_test, 'F1')
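
The plot produced by the call above is not preserved in this export. As a rough illustration of the idea behind it, the sketch below sweeps the decision threshold and recomputes F1 at each step using scikit-learn only. It uses synthetic data and a random forest, not the EvalML pipeline, and the threshold grid is an arbitrary assumption.

```python
import numpy as np
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.metrics import f1_score
from sklearn.model_selection import train_test_split

# Synthetic, imbalanced binary problem (about 15% positives).
X, y = make_classification(n_samples=1000, n_features=8, weights=[0.85, 0.15], random_state=0)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=25, random_state=0).fit(X_tr, y_tr)
proba = clf.predict_proba(X_te)[:, 1]  # predicted probability of the positive class

# Recompute F1 for each candidate threshold instead of the default 0.5.
thresholds = np.linspace(0.1, 0.9, 9)
scores = [f1_score(y_te, (proba >= t).astype(int), zero_division=0) for t in thresholds]
best_threshold = thresholds[int(np.argmax(scores))]

for t, s in zip(thresholds, scores):
    print(f"threshold={t:.1f} F1={s:.3f}")
print("best threshold:", best_threshold)
```

Choosing the threshold on the test set, as done here for brevity, gives an optimistic estimate; a separate validation split would be the safer choice.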

In [260]:
graph_confusion_matrix(y_test, y_pred_rf)

In [261]:
graph_permutation_importance(best_pipeline, x_test, y_test, 'F1')
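
The permutation-importance plot is also missing from this export. The underlying idea can be sketched with scikit-learn's `permutation_importance`: shuffle one feature at a time and measure how much the score drops. The data below is synthetic and the model is a random forest, so the numbers say nothing about the churn features.

```python
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier
from sklearn.inspection import permutation_importance
from sklearn.model_selection import train_test_split

# Synthetic data: with shuffle=False the first two columns are the informative ones.
X, y = make_classification(
    n_samples=600, n_features=5, n_informative=2, n_redundant=0,
    shuffle=False, random_state=0,
)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, test_size=0.25, random_state=0, stratify=y)

clf = RandomForestClassifier(n_estimators=25, random_state=0).fit(X_tr, y_tr)

# Shuffle each feature several times on held-out data and record the F1 drop.
result = permutation_importance(clf, X_te, y_te, scoring="f1", n_repeats=5, random_state=0)
ranking = result.importances_mean.argsort()[::-1]

for idx in ranking:
    print(f"feature {idx}: mean F1 drop = {result.importances_mean[idx]:.3f}")
```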

In [262]:
print(classification_report(y_test, y_pred_rf))

              precision    recall  f1-score   support

          no       0.96      0.99      0.98       910
         yes       0.93      0.76      0.83       153

    accuracy                           0.96      1063
   macro avg       0.94      0.87      0.90      1063
weighted avg       0.96      0.96      0.95      1063



In [263]:
from xgboost import XGBClassifier

In [264]:
model = XGBClassifier()
model.fit(x_train, y_train)

XGBClassifier(base_score=0.5, booster='gbtree', colsample_bylevel=1,
              colsample_bynode=1, colsample_bytree=1, gamma=0, gpu_id=-1,
              importance_type='gain', interaction_constraints='',
              learning_rate=0.300000012, max_delta_step=0, max_depth=6,
              min_child_weight=1, missing=nan, monotone_constraints='()',
              n_estimators=100, n_jobs=0, num_parallel_tree=1, random_state=0,
              reg_alpha=0, reg_lambda=1, scale_pos_weight=1, subsample=1,
              tree_method='exact', validate_parameters=1, verbosity=None)

In [265]:
y_pred_xgb=model.predict(x_test)

In [266]:
y_pred_xgb

array(['no', 'yes', 'no', ..., 'no', 'no', 'no'], dtype=object)

In [267]:
accuracy_score(y_test, y_pred_xgb)

0.9604891815616181

In [268]:
graph_confusion_matrix(y_test, y_pred_xgb)

In [269]:
confusion_matrix(y_test, y_pred_xgb)

array([[902,   8],
       [ 34, 119]], dtype=int64)

In [270]:
accuracy = accuracy_score(y_test, y_pred_xgb)

In [271]:
print("Accuracy: %.2f%%" % (accuracy * 100.0))

Accuracy: 96.05%


In [272]:
from evalml.objectives.standard_metrics import AccuracyBinary, AUC, F1, PrecisionWeighted, Recall

In [273]:
evalml.objectives.get_all_objective_names()

['expvariance',
 'maxerror',
 'medianae',
 'mse',
 'mae',
 'r2',
 'mean squared log error',
 'root mean squared log error',
 'root mean squared error',
 'mean absolute percentage error',
 'mcc multiclass',
 'log loss multiclass',
 'auc weighted',
 'auc macro',
 'auc micro',
 'recall weighted',
 'recall macro',
 'recall micro',
 'precision weighted',
 'precision macro',
 'precision micro',
 'f1 weighted',
 'f1 macro',
 'f1 micro',
 'balanced accuracy multiclass',
 'accuracy multiclass',
 'mcc binary',
 'log loss binary',
 'auc',
 'recall',
 'precision',
 'f1',
 'balanced accuracy binary',
 'accuracy binary',
 'lead scoring',
 'fraud cost',
 'cost benefit matrix']

In [274]:
best_pipeline.score(x_test, y_test, objectives=["auc", 'F1'])

OrderedDict([('AUC', 0.9296487825899591), ('F1', 0.851063829787234)])
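
For readers unfamiliar with AUC, it can be read as the probability that a randomly chosen churner receives a higher score than a randomly chosen non-churner. The sketch below computes it by comparing every positive/negative pair; the labels and scores are made up, and `pairwise_auc` is a hypothetical helper, not an EvalML function.

```python
def pairwise_auc(labels, scores, positive="yes"):
    """AUC as the share of (positive, negative) pairs ranked correctly; ties count half."""
    pos = [s for lab, s in zip(labels, scores) if lab == positive]
    neg = [s for lab, s in zip(labels, scores) if lab != positive]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Made-up labels and predicted churn probabilities.
labels = ["no", "no", "yes", "no", "yes", "no"]
scores = [0.10, 0.40, 0.35, 0.20, 0.80, 0.05]

print(pairwise_auc(labels, scores))
```

Unlike F1, AUC does not depend on a decision threshold, which is why the two scores reported above can differ noticeably for the same pipeline.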

### References:
+ https://evalml.featurelabs.com/en/v0.9.0/
+ https://www.youtube.com/watch?v=94Yd_GaeOqM&list=PLfSLx4WE4q520hf7cVwgfn2sbAxxkHkt8