This notebook highlights **the ease of using AutoML on containers to solve a common classification problem**. It:
- Loads data to CAS
- Runs AutoML (dataSciencePilot) on CAS
- Generates the champion model
- Scores data with the champion model
- Saves the scoring results and model to the filesystem
- Spawns an H2O session
- Runs H2O AutoML against the data
- Generates the champion model
- Optionally, saves the champion model

In [1]:
import swat
from _config import container_cas35_login
from swat.render import render_html
import numpy as np
import pandas as pd
pd.set_option('display.float_format', lambda x: '%.2f' % x)
from matplotlib import pyplot as plt
from IPython.display import HTML
%matplotlib inline
# from plotnine import *
import os, time
#os.environ['CAS_CLIENT_SSL_CA_LIST']=r'/ext_str/beast_trustfiles/nBeast_trustfiles/trustedcerts.pem'
from IPython.display import IFrame
%reload_ext autoreload
%autoreload 2

In [2]:
# Call the login helper once; it returns (user, password, host, ...)
user, pswd, host = container_cas35_login()[:3]

In [3]:
sess = swat.CAS(host,5571,user,pswd)
sess.setsessopt(caslib="casuser")
sess.setsessopt(locale='en-US')
sess.loadactionset(actionset="dataSciencePilot")

NOTE: 'CASUSER(sasdemo)' is now the active caslib.
NOTE: Added action set 'dataSciencePilot'.


In [4]:
sess.help(actionset="dataSciencePilot")["dataSciencePilot"]

NOTE: Information for action set 'dataSciencePilot':
NOTE:    dataSciencePilot
NOTE:       exploreData - Exploration, automatic variable analysis and grouping using comprehensive statistical profiling of the variables.
NOTE:       screenVariables - Screens noise variables and variables that need special transformations to be useful in the downstream analytics.
NOTE:       analyzeMissingPatterns - Missing pattern analysis
NOTE:       exploreCorrelation - Explore linear and nonlinear correlation among the variables.
NOTE:       detectInteractions - Variable interaction detection and ranking
NOTE:       generateShadowFeatures - Generate shadow features.
NOTE:       featureMachine - Automated feature transformation and generation engine
NOTE:       selectFeatures - Feature selection
NOTE:       dsAutoMl - Automated machine learning pipeline exploration, execution and ranking.


Unnamed: 0,name,description
0,exploreData,"Exploration, automatic variable analysis and g..."
1,screenVariables,Screens noise variables and variables that nee...
2,analyzeMissingPatterns,Missing pattern analysis
3,exploreCorrelation,Explore linear and nonlinear correlation among...
4,detectInteractions,Variable interaction detection and ranking
5,generateShadowFeatures,Generate shadow features.
6,featureMachine,Automated feature transformation and generatio...
7,selectFeatures,Feature selection
8,dsAutoMl,Automated machine learning pipeline exploratio...


In [5]:
churn_df=pd.read_csv('churn.csv')
churn_df.columns = [i.replace(' ','_').replace("'",'').lower() for i in churn_df.columns]
# Check what the resulting dataset looks like
churn_df.head()

Unnamed: 0,account_length,vmail_message,day_mins,eve_mins,night_mins,intl_mins,custserv_calls,churn,intl_plan,vmail_plan,...,day_charge,eve_calls,eve_charge,night_calls,night_charge,intl_calls,intl_charge,state,area_code,phone
0,128,25,265.1,197.4,244.7,10.0,1,0,0,1,...,45.07,99,16.78,91,11.01,3,2.7,KS,415,382-4657
1,107,26,161.6,195.5,254.4,13.7,1,0,0,1,...,27.47,103,16.62,103,11.45,3,3.7,OH,415,371-7191
2,137,0,243.4,121.2,162.6,12.2,0,0,0,0,...,41.38,110,10.3,104,7.32,5,3.29,NJ,415,358-1921
3,84,0,299.4,61.9,196.9,6.6,2,0,1,0,...,50.9,88,5.26,89,8.86,7,1.78,OH,408,375-9999
4,75,0,166.7,148.3,186.9,10.1,3,0,1,0,...,28.34,122,12.61,121,8.41,3,2.73,OK,415,330-6626
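
The column clean-up in the cell above can be tried in isolation. This is a minimal sketch of the same three string operations; the raw header names below are hypothetical, since the original headers of `churn.csv` are not shown in this notebook, and `normalize_column` is an illustrative helper name.

```python
def normalize_column(name):
    """Replace spaces with underscores, drop apostrophes, and lowercase."""
    return name.replace(' ', '_').replace("'", '').lower()

# Hypothetical raw headers; the actual headers in churn.csv may differ.
raw_columns = ["Account Length", "Int'l Plan", "CustServ Calls"]
clean_columns = [normalize_column(c) for c in raw_columns]
print(clean_columns)
```

Names without spaces or apostrophes are only lowercased, so the helper can safely be applied to every column.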


In [6]:
# Load the table to CAS
out = sess.upload(churn_df, casout=dict(name='churn', caslib='casuser'))
sess.loadactionset('fedSQL')  # enable the FedSQL actions for distributed SQL
out = out.casTable  # get the CASTable handle
# Programmatically build the query: pass most columns through unchanged and
# cast the categorical columns (including the target) to character types
col_list = [i for i in out.columns if i not in ('area_code', 'churn', 'intl_plan', 'vmail_plan')]
cas_lib = 'casuser'
option_params = '{options replace=true}'
query = """create table {}.churn {} as select {}, 
cast(intl_plan as char) as intl_plan,
cast(vmail_plan as char) as vmail_plan,
cast(area_code as varchar) as area_code,
cast(churn as varchar) as churn
from casuser.churn """.format(cas_lib,
                              option_params,
                              ', '.join(col_list))  # join names directly instead of stripping list punctuation

#execute query and check the results
sess.fedsql.execdirect(query) # run the query
out = sess.CASTable('CHURN', caslib ='casuser') #get the results
render_html(out.fetch(to=5)) #view the results

NOTE: Cloud Analytic Services made the uploaded file available as table CHURN in caslib CASUSER(sasdemo).
NOTE: The table CHURN has been created in caslib CASUSER(sasdemo) from binary data uploaded to Cloud Analytic Services.
NOTE: Added action set 'fedSQL'.
NOTE: Table CHURN was created in caslib CASUSER(sasdemo) with 3333 rows returned.


Selected Rows from Table CHURN
account_length,vmail_message,day_mins,eve_mins,night_mins,intl_mins,custserv_calls,day_calls,day_charge,eve_calls,eve_charge,night_calls,night_charge,intl_calls,intl_charge,state,phone,INTL_PLAN,VMAIL_PLAN,AREA_CODE,CHURN
128,25,265.1,197.4,244.7,10.0,1,110,45.07,99,16.78,91,11.01,3,2.7,KS,382-4657,0,1,415,0
107,26,161.6,195.5,254.4,13.7,1,123,27.47,103,16.62,103,11.45,3,3.7,OH,371-7191,0,1,415,0
137,0,243.4,121.2,162.6,12.2,0,114,41.38,110,10.3,104,7.32,5,3.29,NJ,358-1921,0,0,415,0
84,0,299.4,61.9,196.9,6.6,2,71,50.9,88,5.26,89,8.86,7,1.78,OH,375-9999,1,0,408,0
75,0,166.7,148.3,186.9,10.1,3,113,28.34,122,12.61,121,8.41,3,2.73,OK,330-6626,1,0,415,0
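
The query construction above can also be wrapped in a small helper, so that the pass-through columns and the casts are stated explicitly. The sketch below only builds the SQL string and does not talk to CAS; `build_cast_query` and its parameters are illustrative names, not part of SWAT or FedSQL.

```python
def build_cast_query(caslib, table, passthrough, casts,
                     options='{options replace=true}'):
    """Return a FedSQL 'create table ... as select' statement.

    passthrough: column names copied unchanged.
    casts: mapping of column name -> target SQL type.
    """
    select_items = list(passthrough)
    select_items += ['cast({0} as {1}) as {0}'.format(col, sql_type)
                     for col, sql_type in casts.items()]
    return 'create table {lib}.{tbl} {opts} as select {cols} from {lib}.{tbl}'.format(
        lib=caslib, tbl=table, opts=options, cols=', '.join(select_items))

# Shortened column lists, for illustration only.
query = build_cast_query(
    'casuser', 'churn',
    passthrough=['account_length', 'state'],
    casts={'intl_plan': 'char', 'churn': 'varchar'})
print(query)
```

The resulting string could then be passed to `sess.fedsql.execdirect(query)` as in the cell above.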


In [7]:
effect_vars={'area_code','account_length', 'custserv_calls', 'day_calls', 'day_charge', 
             'day_mins', 'eve_calls','eve_charge', 'eve_mins', 'intl_calls', 
             'intl_charge', 'intl_mins', 'intl_plan','night_calls','night_charge',
             'night_mins', 'vmail_message', 'vmail_plan'}

In [8]:
out.columninfo()

Unnamed: 0,Column,Label,ID,Type,RawLength,FormattedLength,Format,NFL,NFD
0,account_length,,1,double,8,12,,0,0
1,vmail_message,,2,double,8,12,,0,0
2,day_mins,,3,double,8,12,,0,0
3,eve_mins,,4,double,8,12,,0,0
4,night_mins,,5,double,8,12,,0,0
5,intl_mins,,6,double,8,12,,0,0
6,custserv_calls,,7,double,8,12,,0,0
7,day_calls,,8,double,8,12,,0,0
8,day_charge,,9,double,8,12,,0,0
9,eve_calls,,10,double,8,12,,0,0


In [19]:
sess.dataSciencePilot.dsAutoMl(
    debugLevel              = 1,
    table                   = out,
    target                  = "CHURN",
    inputs                  = effect_vars,
#     explorationPolicy       = {},
#     screenPolicy            = {},
#     selectionPolicy         = {},
    transformationPolicy    = {"missing":True, "cardinality":True,
                               "entropy":True, "iqv":True,
                               "skewness":True, "kurtosis":True, "Outlier":True},
    modelTypes              = ["decisionTree", "GRADBOOST"],
    objective               = "AUC",
    sampleSize              = 20,
    topKPipelines           = 10,
    kFolds                  = 2,
    transformationOut       = {"name" : "TRANSFORMATION_OUT", "replace" : True},
    featureOut              = {"name" : "FEATURE_OUT", "replace" : True},
    pipelineOut             = {"name" : "PIPELINE_OUT", "replace" : True},
    saveState               = dict(modelNamePrefix='churn_model', replace = True)
)

NOTE: highCardinality
         { debugLevel=1, table={name='CHURN', caslib='CASUSER(sasdemo)'}, inputs={{name='CHURN'}, {name='VMAIL_PLAN'}, {name='INTL_PLAN'}, {name='AREA_CODE'}, {name='day_calls'}, {name='day_charge'}, {name='eve_mins'}, {name='eve_charge'}, {name='account_length'}, {name='day_mins'}, {name='intl_charge'}, {name='custserv_calls'}, {name='night_mins'}, {name='vmail_message'}, {name='eve_calls'}, {name='intl_mins'}, {name='night_charge'}, {name='intl_calls'}, {name='night_calls'}}, nthreads=24 }
NOTE: RUStats
         { debugLevel=1, table={name='CHURN', caslib='CASUSER(sasdemo)'}, inputs={'day_calls', 'day_charge', 'eve_mins', 'eve_charge', 'account_length', 'day_mins', 'intl_charge', 'custserv_calls', 'night_mins', 'vmail_message', 'eve_calls', 'intl_mins', 'night_charge', 'intl_calls', 'night_calls'}, requestPackages={{Location={'MEAN'}}}, miscValueStats=true, nthreads=24 }
NOTE: RUStats
         { inputs={{name='VMAIL_PLAN'}, {name='INTL_PLAN'}, {name='AREA_CODE'}

         { table={name='__temp_feature_machine_casout__'}, target='CHURN', nominals={'cpy_nom_mode_imp_lab_INTL_PLAN', 'lchehi_lab_custserv_calls', 'nhoks_nloks_dtree_10_day_mins', 'nhoks_nloks_dtree_5_day_charge', 'lcnhenhi_dtree10_vmail_message', 'cpy_nom_mode_imp_lab_VMAIL_PLAN', 'nhoks_nloks_dtree_10_intl_mins', 'CHURN'}, inputs={'cpy_nom_mode_imp_lab_INTL_PLAN', 'lchehi_lab_custserv_calls', 'nhoks_nloks_dtree_10_day_mins', 'nhoks_nloks_dtree_5_day_charge', 'lcnhenhi_dtree10_vmail_message', 'cpy_nom_mode_imp_lab_VMAIL_PLAN', 'hc_tar_frq_rat_account_length', 'hc_lbl_cnt_account_length', 'hc_tar_frq_rat_day_calls', 'hc_lbl_cnt_day_calls', 'nhoks_nloks_dtree_10_intl_mins', 'nhoks_nloks_pow_p0_5_intl_charge'}, tunerOptions={objective='AUC', searchMethod='grid', enableLocalSearch=false, validationPartitionFraction=0.3, seed=0, logLevel=0}, modelTypes={{modelType='decisionTree', tuningOptions={tuningParameters={{namepath='maxLevel', valuelist={5, 10, 15}}, {namepath='nBins', valuelist={1

         { table={name='__temp_feature_machine_casout__'}, target='CHURN', nominals={'cpy_nom_mode_imp_lab_INTL_PLAN', 'cpy_nom_mode_imp_lab_var_1_', 'nhoks_nloks_dtree_10_day_mins', 'lcnhenhi_dtree5_vmail_message', 'cpy_nom_mode_imp_lab_VMAIL_PLAN', 'nhoks_nloks_dtree_10_intl_mins', 'CHURN'}, inputs={'cpy_nom_mode_imp_lab_INTL_PLAN', 'cpy_nom_mode_imp_lab_var_1_', 'nhoks_nloks_dtree_10_day_mins', 'nhoks_nloks_pow_n0_5_day_charge', 'lcnhenhi_dtree5_vmail_message', 'cpy_nom_mode_imp_lab_VMAIL_PLAN', 'hc_tar_frq_rat_account_length', 'hc_tar_frq_rat_day_calls', 'nhoks_nloks_dtree_10_intl_mins', 'nhoks_nloks_pow_p0_5_intl_charge'}, tunerOptions={objective='AUC', searchMethod='grid', enableLocalSearch=false, validationPartitionFraction=0.3, seed=0, logLevel=0}, modelTypes={{modelType='decisionTree', tuningOptions={tuningParameters={{namepath='maxLevel', valuelist={5, 10, 15}}, {namepath='nBins', valuelist={100}}, {namepath='crit', valuelist={'gain', 'gainRatio'}}}}}, {modelType='gradBoost',

         { table={name='__temp_feature_machine_casout__'}, target='CHURN', nominals={'cpy_nom_mode_imp_lab_INTL_PLAN', 'lchehi_lab_custserv_calls', 'nhoks_nloks_dtree_10_day_mins', 'nhoks_nloks_dtree_10_day_charge', 'lcnhenhi_dtree10_vmail_message', 'cpy_nom_mode_imp_lab_VMAIL_PLAN', 'nhoks_nloks_dtree_5_intl_mins', 'nhoks_nloks_dtree_10_intl_charge', 'CHURN'}, inputs={'cpy_nom_mode_imp_lab_INTL_PLAN', 'lchehi_lab_custserv_calls', 'nhoks_nloks_dtree_10_day_mins', 'nhoks_nloks_dtree_10_day_charge', 'lcnhenhi_dtree10_vmail_message', 'cpy_nom_mode_imp_lab_VMAIL_PLAN', 'hc_tar_frq_rat_account_length', 'hc_tar_frq_rat_day_calls', 'nhoks_nloks_dtree_5_intl_mins', 'nhoks_nloks_dtree_10_intl_charge'}, tunerOptions={objective='AUC', searchMethod='grid', enableLocalSearch=false, validationPartitionFraction=0.3, seed=0, logLevel=0}, modelTypes={{modelType='gradBoost', tuningOptions={tuningParameters={{namepath='nTree', valuelist={100, 150}}, {namepath='m', valuelist={10}}, {namepath='learningRate

Unnamed: 0,Descr,Value
0,Number of Tree Nodes,29.0
1,Max Number of Branches,2.0
2,Number of Levels,5.0
3,Number of Leaves,15.0
4,Number of Bins,100.0
5,Minimum Size of Leaves,5.0
6,Maximum Size of Leaves,2236.0
7,Number of Variables,10.0
8,Confidence Level for Pruning,0.25
9,Number of Observations Used,3333.0

Unnamed: 0,Descr,Value
0,Number of Observations Read,3333.0
1,Number of Observations Used,3333.0
2,Misclassification Error (%),8.0708070807

Unnamed: 0,LEVNAME,LEVINDEX,VARNAME
0,0,0,P_CHURN0
1,1,1,P_CHURN1

Unnamed: 0,LEVNAME,LEVINDEX,VARNAME
0,,0,I_CHURN

Unnamed: 0,Variable,Event,CutOff,TP,FP,FN,TN,Sensitivity,Specificity,KS,...,F_HALF,FPR,ACC,FDR,F1,C,Gini,Gamma,Tau,MISCEVENT
0,P_CHURN0,0,0.00,2850.00,483.00,0.00,0.00,1.00,0.00,0.00,...,0.88,1.00,0.86,0.14,0.92,0.89,0.78,0.89,0.19,0.14
1,P_CHURN0,0,0.01,2850.00,435.00,0.00,48.00,1.00,0.10,0.00,...,0.89,0.90,0.87,0.13,0.93,0.89,0.78,0.89,0.19,0.13
2,P_CHURN0,0,0.02,2850.00,435.00,0.00,48.00,1.00,0.10,0.00,...,0.89,0.90,0.87,0.13,0.93,0.89,0.78,0.89,0.19,0.13
3,P_CHURN0,0,0.03,2850.00,435.00,0.00,48.00,1.00,0.10,0.00,...,0.89,0.90,0.87,0.13,0.93,0.89,0.78,0.89,0.19,0.13
4,P_CHURN0,0,0.04,2848.00,378.00,2.00,105.00,1.00,0.22,0.00,...,0.90,0.78,0.89,0.12,0.94,0.89,0.78,0.89,0.19,0.11
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,P_CHURN0,0,0.95,2226.00,62.00,624.00,421.00,0.78,0.87,0.00,...,0.93,0.13,0.79,0.03,0.87,0.89,0.78,0.89,0.19,0.21
96,P_CHURN0,0,0.96,2226.00,62.00,624.00,421.00,0.78,0.87,0.00,...,0.93,0.13,0.79,0.03,0.87,0.89,0.78,0.89,0.19,0.21
97,P_CHURN0,0,0.97,2226.00,62.00,624.00,421.00,0.78,0.87,0.00,...,0.93,0.13,0.79,0.03,0.87,0.89,0.78,0.89,0.19,0.21
98,P_CHURN0,0,0.98,52.00,0.00,2798.00,483.00,0.02,1.00,0.00,...,0.09,0.00,0.16,0.00,0.04,0.89,0.78,0.89,0.19,0.84

Unnamed: 0,NOBS,ASE,DIV,RASE,MCE,MCLL
0,3333.0,0.06,3333.0,0.25,0.08,0.23

Unnamed: 0,Parameter,Value
0,Model Type,Decision Tree
1,Tuner Objective Function,Area Under Curve
2,Search Method,GRID
3,Number of Grid Points,6
4,Maximum Tuning Time in Seconds,36000
5,Validation Type,Single Partition
6,Validation Partition Fraction,0.30
7,Log Level,0
8,Seed,1695368905
9,Number of Parallel Evaluations,4

Unnamed: 0,Evaluation,MAXLEVEL,NBINS,CRIT,AreaUnderCurve,EvaluationTime
0,0,11,20,gainRatio,0.81,0.38
1,2,5,100,gain,0.88,0.58
2,4,10,100,gainRatio,0.86,0.97
3,3,10,100,gain,0.82,0.84
4,5,15,100,gain,0.82,0.56
5,1,15,100,gainRatio,0.8,0.47
6,6,5,100,gainRatio,0.77,0.37

Unnamed: 0,Iteration,Evaluations,Best_obj,Time_sec
0,0,1,0.81,0.38
1,1,7,0.88,1.41

Unnamed: 0,Evaluation,Iteration,MAXLEVEL,NBINS,CRIT,AreaUnderCurve,EvaluationTime
0,0,0,11,20,gainRatio,0.81,0.38
1,1,1,15,100,gainRatio,0.8,0.47
2,2,1,5,100,gain,0.88,0.58
3,3,1,10,100,gain,0.82,0.84
4,4,1,10,100,gainRatio,0.86,0.97
5,5,1,15,100,gain,0.82,0.56
6,6,1,5,100,gainRatio,0.77,0.37

Unnamed: 0,Parameter,Name,Value
0,Evaluation,Evaluation,2
1,Maximum Tree Levels,MAXLEVEL,5
2,Maximum Bins,NBINS,100
3,Criterion,CRIT,gain
4,Area Under Curve,Objective,0.8804436378

Unnamed: 0,Parameter,Value
0,Initial Configuration Objective Value,0.81
1,Best Configuration Objective Value,0.88
2,Worst Configuration Objective Value,0.77
3,Initial Configuration Evaluation Time in Seconds,0.38
4,Best Configuration Evaluation Time in Seconds,0.41
5,Number of Improved Configurations,1.0
6,Number of Evaluated Configurations,7.0
7,Total Tuning Time in Seconds,1.66
8,Parallel Tuning Speedup,2.28

Unnamed: 0,Task,Time_sec,Time_percent
0,Model Training,2.39,63.03
1,Model Scoring,0.9,23.81
2,Total Objective Evaluations,3.3,86.93
3,Tuner,0.5,13.07
4,Total CPU Time,3.8,100.0

Unnamed: 0,Hyperparameter,RelImportance
0,CRIT,1.0
1,MAXLEVEL,0.43
2,NBINS,0.0

Unnamed: 0,Descr,Value
0,Number of Trees,150.0
1,Distribution,2.0
2,Learning Rate,0.1
3,Subsampling Rate,0.6
4,Number of Selected Variables (M),10.0
5,Number of Bins,57.0
6,Number of Variables,10.0
7,Max Number of Tree Nodes,31.0
8,Min Number of Tree Nodes,19.0
9,Max Number of Branches,2.0

Unnamed: 0,Progress,Metric
0,1.00,0.14
1,2.00,0.14
2,3.00,0.14
3,4.00,0.14
4,5.00,0.13
...,...,...
145,146.00,0.04
146,147.00,0.04
147,148.00,0.04
148,149.00,0.04

Unnamed: 0,Descr,Value
0,Number of Observations Read,3333.0
1,Number of Observations Used,3333.0
2,Misclassification Error (%),3.7203720372

Unnamed: 0,TreeID,Trees,NLeaves,MCR,LogLoss,ASE,RASE,MAXAE
0,0.00,1.00,14.00,0.14,0.37,0.11,0.33,0.87
1,1.00,2.00,29.00,0.14,0.34,0.10,0.32,0.88
2,2.00,3.00,43.00,0.14,0.32,0.09,0.31,0.89
3,3.00,4.00,58.00,0.14,0.31,0.09,0.30,0.89
4,4.00,5.00,74.00,0.13,0.29,0.08,0.29,0.90
...,...,...,...,...,...,...,...,...
145,145.00,146.00,2240.00,0.04,0.11,0.03,0.17,0.96
146,146.00,147.00,2256.00,0.04,0.11,0.03,0.17,0.96
147,147.00,148.00,2272.00,0.04,0.11,0.03,0.17,0.96
148,148.00,149.00,2288.00,0.04,0.11,0.03,0.17,0.96

Unnamed: 0,LEVNAME,LEVINDEX,VARNAME
0,0,0,P_CHURN0
1,1,1,P_CHURN1

Unnamed: 0,LEVNAME,LEVINDEX,VARNAME
0,,0,I_CHURN

Unnamed: 0,Variable,Event,CutOff,TP,FP,FN,TN,Sensitivity,Specificity,KS,...,F_HALF,FPR,ACC,FDR,F1,C,Gini,Gamma,Tau,MISCEVENT
0,P_CHURN0,0,0.00,2850.00,483.00,0.00,0.00,1.00,0.00,0.00,...,0.88,1.00,0.86,0.14,0.92,0.99,0.98,0.98,0.24,0.14
1,P_CHURN0,0,0.01,2850.00,479.00,0.00,4.00,1.00,0.01,0.00,...,0.88,0.99,0.86,0.14,0.92,0.99,0.98,0.98,0.24,0.14
2,P_CHURN0,0,0.02,2850.00,470.00,0.00,13.00,1.00,0.03,0.00,...,0.88,0.97,0.86,0.14,0.92,0.99,0.98,0.98,0.24,0.14
3,P_CHURN0,0,0.03,2850.00,447.00,0.00,36.00,1.00,0.07,0.00,...,0.89,0.93,0.87,0.14,0.93,0.99,0.98,0.98,0.24,0.13
4,P_CHURN0,0,0.04,2850.00,431.00,0.00,52.00,1.00,0.11,0.00,...,0.89,0.89,0.87,0.13,0.93,0.99,0.98,0.98,0.24,0.13
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,P_CHURN0,0,0.95,2200.00,1.00,650.00,482.00,0.77,1.00,0.00,...,0.94,0.00,0.80,0.00,0.87,0.99,0.98,0.98,0.24,0.20
96,P_CHURN0,0,0.96,2069.00,0.00,781.00,483.00,0.73,1.00,0.00,...,0.93,0.00,0.77,0.00,0.84,0.99,0.98,0.98,0.24,0.23
97,P_CHURN0,0,0.97,1868.00,0.00,982.00,483.00,0.66,1.00,0.00,...,0.90,0.00,0.71,0.00,0.79,0.99,0.98,0.98,0.24,0.29
98,P_CHURN0,0,0.98,1574.00,0.00,1276.00,483.00,0.55,1.00,0.00,...,0.86,0.00,0.62,0.00,0.71,0.99,0.98,0.98,0.24,0.38

Unnamed: 0,NOBS,ASE,DIV,RASE,MCE,MCLL
0,3333.0,0.03,3333.0,0.17,0.04,0.11

Unnamed: 0,Parameter,Value
0,Model Type,Gradient Boosting Tree
1,Tuner Objective Function,Area Under Curve
2,Search Method,GRID
3,Number of Grid Points,16
4,Maximum Tuning Time in Seconds,36000
5,Validation Type,Single Partition
6,Validation Partition Fraction,0.30
7,Log Level,0
8,Seed,1695369075
9,Number of Parallel Evaluations,4

Unnamed: 0,Evaluation,M,LEARNINGRATE,SUBSAMPLERATE,LASSO,RIDGE,NBINS,MAXLEVEL,AreaUnderCurve,EvaluationTime
0,0,10,0.1,0.5,0.0,1.0,50,5,0.88,0.44
1,12,10,0.1,0.6,0.0,0.0,57,5,0.92,6.1
2,15,10,0.1,0.8,0.5,0.0,57,5,0.91,3.3
3,10,10,0.1,0.6,0.5,0.0,57,5,0.91,4.8
4,7,10,0.1,0.8,0.0,0.0,57,7,0.91,4.96
5,14,10,0.1,0.8,0.5,0.0,57,7,0.91,4.57
6,11,10,0.1,0.6,0.0,0.0,57,7,0.91,5.85
7,8,10,0.1,0.6,0.5,0.0,57,7,0.91,4.79
8,5,10,0.1,0.8,0.0,0.0,57,5,0.9,2.75
9,2,10,0.05,0.6,0.5,0.0,57,5,0.89,0.98

Unnamed: 0,Iteration,Evaluations,Best_obj,Time_sec
0,0,1,0.88,0.44
1,1,17,0.92,12.89

Unnamed: 0,Evaluation,Iteration,M,LEARNINGRATE,SUBSAMPLERATE,LASSO,RIDGE,NBINS,MAXLEVEL,AreaUnderCurve,EvaluationTime
0,0,0,10,0.1,0.5,0.0,1.0,50,5,0.88,0.44
1,1,1,10,0.05,0.6,0.0,0.0,57,7,0.87,0.82
2,2,1,10,0.05,0.6,0.5,0.0,57,5,0.89,0.98
3,3,1,10,0.05,0.8,0.0,0.0,57,5,0.88,1.24
4,4,1,10,0.05,0.6,0.0,0.0,57,5,0.88,1.31
5,5,1,10,0.1,0.8,0.0,0.0,57,5,0.9,2.75
6,6,1,10,0.05,0.6,0.5,0.0,57,7,0.88,1.45
7,7,1,10,0.1,0.8,0.0,0.0,57,7,0.91,4.96
8,8,1,10,0.1,0.6,0.5,0.0,57,7,0.91,4.79
9,9,1,10,0.05,0.8,0.5,0.0,57,7,0.88,1.4

Unnamed: 0,Parameter,Name,Value
0,Evaluation,Evaluation,12.0
1,Number of Variables to Try,M,10.0
2,Learning Rate,LEARNINGRATE,0.1
3,Sampling Rate,SUBSAMPLERATE,0.6
4,Lasso,LASSO,0.0
5,Ridge,RIDGE,0.0
6,Number of Bins,NBINS,57.0
7,Maximum Tree Levels,MAXLEVEL,5.0
8,Area Under Curve,Objective,0.9165839887

Unnamed: 0,Parameter,Value
0,Initial Configuration Objective Value,0.88
1,Best Configuration Objective Value,0.92
2,Worst Configuration Objective Value,0.87
3,Initial Configuration Evaluation Time in Seconds,0.44
4,Best Configuration Evaluation Time in Seconds,6.1
5,Number of Improved Configurations,7.0
6,Number of Evaluated Configurations,17.0
7,Total Tuning Time in Seconds,18.67
8,Parallel Tuning Speedup,2.82

Unnamed: 0,Task,Time_sec,Time_percent
0,Model Training,48.98,93.03
1,Model Scoring,3.16,6.01
2,Total Objective Evaluations,52.15,99.05
3,Tuner,0.5,0.95
4,Total CPU Time,52.65,100.0

Unnamed: 0,Hyperparameter,RelImportance
0,LEARNINGRATE,1.0
1,MAXLEVEL,0.02
2,LASSO,0.02
3,SUBSAMPLERATE,0.0
4,M,0.0
5,RIDGE,0.0
6,NBINS,0.0

Unnamed: 0,Descr,Value
0,Number of Trees,150.0
1,Distribution,2.0
2,Learning Rate,0.1
3,Subsampling Rate,0.8
4,Number of Selected Variables (M),21.0
5,Number of Bins,57.0
6,Number of Variables,21.0
7,Max Number of Tree Nodes,111.0
8,Min Number of Tree Nodes,41.0
9,Max Number of Branches,2.0

Unnamed: 0,Progress,Metric
0,1.00,0.14
1,2.00,0.14
2,3.00,0.14
3,4.00,0.14
4,5.00,0.09
...,...,...
97,98.00,0.00
98,99.00,0.00
99,100.00,0.00
100,101.00,0.00

Unnamed: 0,Descr,Value
0,Number of Observations Read,3333.0
1,Number of Observations Used,3333.0
2,Misclassification Error (%),0.0300030003

Unnamed: 0,TreeID,Trees,NLeaves,MCR,LogLoss,ASE,RASE,MAXAE
0,0.00,1.00,33.00,0.14,0.35,0.11,0.32,0.87
1,1.00,2.00,63.00,0.14,0.31,0.09,0.30,0.88
2,2.00,3.00,96.00,0.14,0.29,0.08,0.28,0.89
3,3.00,4.00,133.00,0.14,0.26,0.07,0.27,0.90
4,4.00,5.00,169.00,0.09,0.24,0.06,0.25,0.91
...,...,...,...,...,...,...,...,...
97,97.00,98.00,3669.00,0.00,0.02,0.00,0.05,0.61
98,98.00,99.00,3704.00,0.00,0.02,0.00,0.05,0.60
99,99.00,100.00,3745.00,0.00,0.02,0.00,0.05,0.59
100,100.00,101.00,3784.00,0.00,0.02,0.00,0.05,0.58

Unnamed: 0,LEVNAME,LEVINDEX,VARNAME
0,0,0,P_CHURN0
1,1,1,P_CHURN1

Unnamed: 0,LEVNAME,LEVINDEX,VARNAME
0,,0,I_CHURN

Unnamed: 0,Variable,Event,CutOff,TP,FP,FN,TN,Sensitivity,Specificity,KS,...,F_HALF,FPR,ACC,FDR,F1,C,Gini,Gamma,Tau,MISCEVENT
0,P_CHURN0,0,0.00,2850.00,483.00,0.00,0.00,1.00,0.00,0.00,...,0.88,1.00,0.86,0.14,0.92,1.00,1.00,1.00,0.25,0.14
1,P_CHURN0,0,0.01,2850.00,450.00,0.00,33.00,1.00,0.07,0.00,...,0.89,0.93,0.86,0.14,0.93,1.00,1.00,1.00,0.25,0.14
2,P_CHURN0,0,0.02,2850.00,360.00,0.00,123.00,1.00,0.25,0.00,...,0.91,0.75,0.89,0.11,0.94,1.00,1.00,1.00,0.25,0.11
3,P_CHURN0,0,0.03,2850.00,285.00,0.00,198.00,1.00,0.41,0.00,...,0.93,0.59,0.91,0.09,0.95,1.00,1.00,1.00,0.25,0.09
4,P_CHURN0,0,0.04,2850.00,239.00,0.00,244.00,1.00,0.51,0.00,...,0.94,0.49,0.93,0.08,0.96,1.00,1.00,1.00,0.25,0.07
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
95,P_CHURN0,0,0.95,2729.00,0.00,121.00,483.00,0.96,1.00,0.00,...,0.99,0.00,0.96,0.00,0.98,1.00,1.00,1.00,0.25,0.04
96,P_CHURN0,0,0.96,2659.00,0.00,191.00,483.00,0.93,1.00,0.00,...,0.99,0.00,0.94,0.00,0.97,1.00,1.00,1.00,0.25,0.06
97,P_CHURN0,0,0.97,2549.00,0.00,301.00,483.00,0.89,1.00,0.00,...,0.98,0.00,0.91,0.00,0.94,1.00,1.00,1.00,0.25,0.09
98,P_CHURN0,0,0.98,2386.00,0.00,464.00,483.00,0.84,1.00,0.00,...,0.96,0.00,0.86,0.00,0.91,1.00,1.00,1.00,0.25,0.14

Unnamed: 0,NOBS,ASE,DIV,RASE,MCE,MCLL
0,3333.0,0.0,3333.0,0.05,0.0,0.02

Unnamed: 0,Parameter,Value
0,Model Type,Gradient Boosting Tree
1,Tuner Objective Function,Area Under Curve
2,Search Method,GRID
3,Number of Grid Points,16
4,Maximum Tuning Time in Seconds,36000
5,Validation Type,Single Partition
6,Validation Partition Fraction,0.30
7,Log Level,0
8,Seed,1695382851
9,Number of Parallel Evaluations,4

Unnamed: 0,Evaluation,M,LEARNINGRATE,SUBSAMPLERATE,LASSO,RIDGE,NBINS,MAXLEVEL,AreaUnderCurve,EvaluationTime
0,0,21,0.1,0.5,0.0,1.0,50,5,0.83,0.99
1,13,21,0.1,0.8,0.5,0.0,57,7,0.91,11.42
2,12,21,0.1,0.6,0.0,0.0,57,5,0.91,7.31
3,4,21,0.1,0.6,0.5,0.0,57,7,0.91,11.01
4,15,21,0.1,0.6,0.0,0.0,57,7,0.91,11.48
5,3,21,0.1,0.8,0.0,0.0,57,5,0.9,7.08
6,16,21,0.1,0.6,0.5,0.0,57,5,0.9,5.36
7,9,21,0.1,0.8,0.0,0.0,57,7,0.9,8.6
8,10,21,0.1,0.8,0.5,0.0,57,5,0.89,4.66
9,11,21,0.05,0.6,0.0,0.0,57,7,0.85,3.04

Unnamed: 0,Iteration,Evaluations,Best_obj,Time_sec
0,0,1,0.83,0.99
1,1,17,0.91,26.0

Unnamed: 0,Evaluation,Iteration,M,LEARNINGRATE,SUBSAMPLERATE,LASSO,RIDGE,NBINS,MAXLEVEL,AreaUnderCurve,EvaluationTime
0,0,0,21,0.1,0.5,0.0,1.0,50,5,0.83,0.99
1,1,1,21,0.05,0.8,0.0,0.0,57,5,0.84,1.27
2,2,1,21,0.05,0.6,0.5,0.0,57,7,0.85,2.83
3,3,1,21,0.1,0.8,0.0,0.0,57,5,0.9,7.08
4,4,1,21,0.1,0.6,0.5,0.0,57,7,0.91,11.01
5,5,1,21,0.05,0.8,0.5,0.0,57,7,0.83,2.79
6,6,1,21,0.05,0.6,0.5,0.0,57,5,0.82,2.05
7,7,1,21,0.05,0.8,0.5,0.0,57,5,0.84,1.88
8,8,1,21,0.05,0.8,0.0,0.0,57,7,0.84,2.5
9,9,1,21,0.1,0.8,0.0,0.0,57,7,0.9,8.6

Unnamed: 0,Parameter,Name,Value
0,Evaluation,Evaluation,13.0
1,Number of Variables to Try,M,21.0
2,Learning Rate,LEARNINGRATE,0.1
3,Sampling Rate,SUBSAMPLERATE,0.8
4,Lasso,LASSO,0.5
5,Ridge,RIDGE,0.0
6,Number of Bins,NBINS,57.0
7,Maximum Tree Levels,MAXLEVEL,7.0
8,Area Under Curve,Objective,0.9120750151

Unnamed: 0,Parameter,Value
0,Initial Configuration Objective Value,0.83
1,Best Configuration Objective Value,0.91
2,Worst Configuration Objective Value,0.82
3,Initial Configuration Evaluation Time in Seconds,0.99
4,Best Configuration Evaluation Time in Seconds,11.42
5,Number of Improved Configurations,6.0
6,Number of Evaluated Configurations,17.0
7,Total Tuning Time in Seconds,37.9
8,Parallel Tuning Speedup,2.57

Unnamed: 0,Task,Time_sec,Time_percent
0,Model Training,93.42,95.97
1,Model Scoring,3.4,3.49
2,Total Objective Evaluations,96.83,99.47
3,Tuner,0.52,0.53
4,Total CPU Time,97.34,100.0

Unnamed: 0,CAS_Library,Name,Rows,Columns
0,CASUSER(SASDEMO),churn_model_gradBoost_5,1,2

Unnamed: 0,Hyperparameter,RelImportance
0,LEARNINGRATE,1.0
1,MAXLEVEL,0.03
2,LASSO,0.0
3,SUBSAMPLERATE,0.0
4,M,0.0
5,RIDGE,0.0
6,NBINS,0.0

Unnamed: 0,casLib,Name,Rows,Columns,casTable
0,CASUSER(sasdemo),PIPELINE_OUT,10,49,"CASTable('PIPELINE_OUT', caslib='CASUSER(sasde..."
1,CASUSER(sasdemo),TRANSFORMATION_OUT,19,21,"CASTable('TRANSFORMATION_OUT', caslib='CASUSER..."
2,CASUSER(sasdemo),FEATURE_OUT,106,15,"CASTable('FEATURE_OUT', caslib='CASUSER(sasdem..."
3,CASUSER(sasdemo),churn_model_fm_,1,2,"CASTable('churn_model_fm_', caslib='CASUSER(sa..."
4,CASUSER(sasdemo),churn_model_gradBoost_1,1,2,"CASTable('churn_model_gradBoost_1', caslib='CA..."
5,CASUSER(sasdemo),churn_model_gradBoost_2,1,2,"CASTable('churn_model_gradBoost_2', caslib='CA..."
6,CASUSER(sasdemo),churn_model_gradBoost_3,1,2,"CASTable('churn_model_gradBoost_3', caslib='CA..."
7,CASUSER(sasdemo),churn_model_gradBoost_4,1,2,"CASTable('churn_model_gradBoost_4', caslib='CA..."
8,CASUSER(sasdemo),churn_model_gradBoost_5,1,2,"CASTable('churn_model_gradBoost_5', caslib='CA..."
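
The decision-tree tuner tables in the output above can be reproduced by hand. The sketch below enumerates the grid from the tuning values shown in the log (`maxLevel` in {5, 10, 15}, `nBins` in {100}, `crit` in {gain, gainRatio}) and then picks the best configuration from the evaluation history. The AUC values are copied from that table; the variable names are illustrative and this is not how CAS implements the search internally.

```python
from itertools import product

# Tuning values taken from the decision-tree entry in the log above.
grid = {
    'maxLevel': [5, 10, 15],
    'nBins': [100],
    'crit': ['gain', 'gainRatio'],
}
grid_points = [dict(zip(grid, combo)) for combo in product(*grid.values())]
print(len(grid_points))  # 3 * 1 * 2 combinations

# (MAXLEVEL, NBINS, CRIT) -> validation AUC, copied from the evaluation history table.
history = {
    (11, 20, 'gainRatio'): 0.81,   # evaluation 0: initial configuration, not on the grid
    (15, 100, 'gainRatio'): 0.80,
    (5, 100, 'gain'): 0.88,
    (10, 100, 'gain'): 0.82,
    (10, 100, 'gainRatio'): 0.86,
    (15, 100, 'gain'): 0.82,
    (5, 100, 'gainRatio'): 0.77,
}
best = max(history, key=history.get)
print(best, history[best])
```

The six grid points plus the initial configuration account for the seven evaluated configurations reported by the tuner, and the best entry matches the reported best configuration (`MAXLEVEL` 5, `NBINS` 100, `CRIT` gain).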


In [20]:
sess.tableinfo()

Unnamed: 0,Name,Rows,Columns,IndexedColumns,Encoding,CreateTimeFormatted,ModTimeFormatted,AccessTimeFormatted,JavaCharSet,CreateTime,...,Repeated,View,MultiPart,SourceName,SourceCaslib,Compressed,Creator,Modifier,SourceModTimeFormatted,SourceModTime
0,CHURN,3333,21,0,utf-8,2020-06-02T17:12:38+00:00,2020-06-02T17:12:38+00:00,2020-06-02T17:44:05+00:00,UTF8,1906737157.62,...,0,0,0,,,0,sasdemo,,,
1,CHURN_MODEL_FM_,1,2,0,utf-8,2020-06-02T17:44:06+00:00,2020-06-02T17:44:06+00:00,2020-06-02T17:51:17+00:00,UTF8,1906739045.52,...,0,0,0,,,0,sasdemo,,,
2,TRANSFORMATION_OUT,19,21,0,utf-8,2020-06-02T17:44:06+00:00,2020-06-02T17:44:06+00:00,2020-06-02T17:51:17+00:00,UTF8,1906739045.59,...,0,0,0,,,0,sasdemo,,,
3,FEATURE_OUT,106,15,0,utf-8,2020-06-02T17:44:06+00:00,2020-06-02T17:44:06+00:00,2020-06-02T17:51:17+00:00,UTF8,1906739045.59,...,0,0,0,,,0,sasdemo,,,
4,CHURN_MODEL_GRADBOOST_1,1,2,0,utf-8,2020-06-02T17:49:27+00:00,2020-06-02T17:49:27+00:00,2020-06-02T17:51:17+00:00,UTF8,1906739366.6,...,0,0,0,,,0,sasdemo,,,
5,CHURN_MODEL_GRADBOOST_2,1,2,0,utf-8,2020-06-02T17:49:41+00:00,2020-06-02T17:49:41+00:00,2020-06-02T17:51:17+00:00,UTF8,1906739381.34,...,0,0,0,,,0,sasdemo,,,
6,CHURN_MODEL_GRADBOOST_3,1,2,0,utf-8,2020-06-02T17:50:17+00:00,2020-06-02T17:50:17+00:00,2020-06-02T17:51:17+00:00,UTF8,1906739416.56,...,0,0,0,,,0,sasdemo,,,
7,CHURN_MODEL_GRADBOOST_4,1,2,0,utf-8,2020-06-02T17:50:38+00:00,2020-06-02T17:50:38+00:00,2020-06-02T17:51:17+00:00,UTF8,1906739438.41,...,0,0,0,,,0,sasdemo,,,
8,CHURN_MODEL_GRADBOOST_5,1,2,0,utf-8,2020-06-02T17:51:17+00:00,2020-06-02T17:51:17+00:00,2020-06-02T17:51:17+00:00,UTF8,1906739476.77,...,0,0,0,,,0,sasdemo,,,
9,PIPELINE_OUT,10,49,0,utf-8,2020-06-02T17:51:17+00:00,2020-06-02T17:51:17+00:00,2020-06-02T17:51:17+00:00,UTF8,1906739476.83,...,0,0,0,,,0,sasdemo,,,


In [21]:
churn_model_fm_ = sess.CASTable('CHURN_MODEL_FM_')

In [22]:
sess.loadactionset('astore')

NOTE: Added action set 'astore'.


In [23]:
sess.describe(churn_model_fm_)

Unnamed: 0,Key
0,9232E8B6FBDF1D5E33F226BFFE14710C507B2BD7

Unnamed: 0,Attribute,Value
0,Analytic Engine,fte
1,Time Created,02Jun2020:17:44:05

Unnamed: 0,Name,Length,Role,Type,RawType,FormatName
0,day_charge,8.0,Input,Interval,Num,
1,day_mins,8.0,Input,Interval,Num,
2,eve_charge,8.0,Input,Interval,Num,
3,eve_mins,8.0,Input,Interval,Num,
4,intl_charge,8.0,Input,Interval,Num,
5,intl_mins,8.0,Input,Interval,Num,
6,night_charge,8.0,Input,Interval,Num,
7,night_mins,8.0,Input,Interval,Num,
8,account_length,8.0,Input,Interval,Num,
9,custserv_calls,8.0,Input,Interval,Num,

Unnamed: 0,Name,Length,Type,Label
0,nhoks_nloks_dtree_10_day_charge,8.0,Num,"day_charge: Not high (outlier, kurtosis, skewn..."
1,nhoks_nloks_dtree_5_day_charge,8.0,Num,"day_charge: Not high (outlier, kurtosis, skewn..."
2,nhoks_nloks_log_day_charge,8.0,Num,"day_charge: Not high (outlier, kurtosis, skewn..."
3,nhoks_nloks_pow_n0_5_day_charge,8.0,Num,"day_charge: Not high (outlier, kurtosis, skewn..."
4,nhoks_nloks_dtree_10_day_mins,8.0,Num,"day_mins: Not high (outlier, kurtosis, skewnes..."
5,nhoks_nloks_dtree_5_day_mins,8.0,Num,"day_mins: Not high (outlier, kurtosis, skewnes..."
6,nhoks_nloks_log_day_mins,8.0,Num,"day_mins: Not high (outlier, kurtosis, skewnes..."
7,nhoks_nloks_pow_p0_5_day_mins,8.0,Num,"day_mins: Not high (outlier, kurtosis, skewnes..."
8,nhoks_nloks_dtree_10_eve_charge,8.0,Num,"eve_charge: Not high (outlier, kurtosis, skewn..."
9,nhoks_nloks_dtree_5_eve_charge,8.0,Num,"eve_charge: Not high (outlier, kurtosis, skewn..."


In [24]:
sess.score(table='CHURN',
           casout=dict(name='fm_result', caslib='casuser', replace=True),
           rstore=churn_model_fm_,
           copyvars=['CHURN'])

Unnamed: 0,casLib,Name,Rows,Columns,casTable
0,CASUSER(sasdemo),fm_result,3333,51,"CASTable('fm_result', caslib='CASUSER(sasdemo)')"

Unnamed: 0,Task,Seconds,Percent
0,Loading the Store,0.0,0.01
1,Creating the State,0.01,0.32
2,Scoring,0.01,0.66
3,Total,0.02,1.0


In [25]:
fm_result = sess.CASTable('fm_result')

In [26]:
fm_result.head()

Unnamed: 0,nhoks_nloks_dtree_10_day_charge,nhoks_nloks_dtree_5_day_charge,nhoks_nloks_log_day_charge,nhoks_nloks_pow_n0_5_day_charge,nhoks_nloks_dtree_10_day_mins,nhoks_nloks_dtree_5_day_mins,nhoks_nloks_log_day_mins,nhoks_nloks_pow_p0_5_day_mins,nhoks_nloks_dtree_10_eve_charge,nhoks_nloks_dtree_5_eve_charge,...,hc_tar_frq_rat_eve_calls,cpy_nom_mode_imp_lab_intl_calls,lchehi_lab_intl_calls,hc_lbl_cnt_night_calls,hc_tar_frq_rat_night_calls,lcnhenhi_dtree10_vmail_message,lcnhenhi_dtree5_vmail_message,cpy_nom_mode_imp_lab_INTL_PLAN,cpy_nom_mode_imp_lab_VMAIL_PLAN,CHURN
0,9.0,4.0,3.83,0.15,9.0,4.0,5.58,16.31,5.0,3.0,...,0.88,4.0,4.0,3.0,0.88,1.0,1.0,1.0,2.0,0
1,4.0,1.0,3.35,0.19,4.0,1.0,5.09,12.75,5.0,3.0,...,0.88,4.0,4.0,9.0,0.88,7.0,4.0,1.0,2.0,0
2,8.0,3.0,3.75,0.15,8.0,3.0,5.5,15.63,1.0,1.0,...,0.79,6.0,6.0,2.0,0.87,8.0,4.0,1.0,1.0,0
3,10.0,5.0,3.95,0.14,10.0,5.0,5.71,17.33,1.0,1.0,...,0.84,8.0,8.0,22.0,0.85,8.0,4.0,2.0,1.0,0
4,4.0,1.0,3.38,0.18,4.0,1.0,5.12,12.95,2.0,2.0,...,0.76,4.0,4.0,38.0,0.84,8.0,4.0,2.0,1.0,0


In [27]:
pipeline_out = sess.CASTable('PIPELINE_OUT')
pipeline_out.head(10)

Unnamed: 0,PipelineId,ModelType,MLType,Objective,ObjectiveType,Target,NFeatures,Feat1Id,Feat1IsNom,Feat2Id,...,Feat17Id,Feat17IsNom,Feat18Id,Feat18IsNom,Feat19Id,Feat19IsNom,Feat20Id,Feat20IsNom,Feat21Id,Feat21IsNom
0,10.0,binary classification,gradBoost,0.94,AUC,CHURN,15.0,105.0,1.0,86.0,...,,,,,,,,,,
1,20.0,binary classification,gradBoost,0.92,AUC,CHURN,10.0,105.0,1.0,86.0,...,,,,,,,,,,
2,2.0,binary classification,gradBoost,0.92,AUC,CHURN,21.0,105.0,1.0,85.0,...,94.0,0.0,93.0,1.0,95.0,1.0,63.0,1.0,73.0,1.0
3,18.0,binary classification,gradBoost,0.91,AUC,CHURN,10.0,105.0,1.0,86.0,...,,,,,,,,,,
4,8.0,binary classification,gradBoost,0.91,AUC,CHURN,21.0,105.0,1.0,85.0,...,94.0,0.0,93.0,1.0,95.0,1.0,62.0,1.0,79.0,0.0
5,12.0,binary classification,gradBoost,0.91,AUC,CHURN,12.0,105.0,1.0,85.0,...,,,,,,,,,,
6,14.0,binary classification,gradBoost,0.91,AUC,CHURN,10.0,105.0,1.0,85.0,...,,,,,,,,,,
7,16.0,binary classification,gradBoost,0.91,AUC,CHURN,12.0,105.0,1.0,86.0,...,,,,,,,,,,
8,6.0,binary classification,gradBoost,0.9,AUC,CHURN,10.0,105.0,1.0,85.0,...,,,,,,,,,,
9,17.0,binary classification,dtree,0.89,AUC,CHURN,10.0,105.0,1.0,86.0,...,,,,,,,,,,
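
The pipeline table above is ordered by the objective (AUC here), so the champion is its first row. As a minimal sketch, a few of those rows are copied by hand into plain Python and the same choice is made explicitly. The `leaderboard` list and the `pick_champion` helper are illustrative names, not part of SWAT or dataSciencePilot, and the tie-break on feature count is an assumption added for the example.

```python
# Rows copied by hand from the PIPELINE_OUT output above (the objective is
# rounded to two decimals by the notebook's display format).
leaderboard = [
    {"PipelineId": 10, "MLType": "gradBoost", "Objective": 0.94, "NFeatures": 15},
    {"PipelineId": 20, "MLType": "gradBoost", "Objective": 0.92, "NFeatures": 10},
    {"PipelineId": 2, "MLType": "gradBoost", "Objective": 0.92, "NFeatures": 21},
    {"PipelineId": 17, "MLType": "dtree", "Objective": 0.89, "NFeatures": 10},
]


def pick_champion(rows):
    """Hypothetical helper: highest objective wins, fewer features breaks ties."""
    return max(rows, key=lambda r: (r["Objective"], -r["NFeatures"]))


champion = pick_champion(leaderboard)
print(champion["PipelineId"], champion["MLType"], champion["Objective"])
```

Preferring the smaller pipeline on a tie is only one possible rule; with unrounded objectives a tie is unlikely to occur.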


In [28]:
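# Champion pipeline's model, stored as an analytic store table
champ_model = sess.CASTable('CHURN_MODEL_GRADBOOST_1')

# Score the engineered features with the champion model
sess.score(table=fm_result,
           casout=dict(name='model_result', caslib='casuser', replace=True),
           rstore=champ_model)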

Unnamed: 0,casLib,Name,Rows,Columns,casTable
0,CASUSER(sasdemo),model_result,3333,4,"CASTable('model_result', caslib='CASUSER(sasde..."

Unnamed: 0,Task,Seconds,Percent
0,Loading the Store,0.0,0.0
1,Creating the State,0.18,0.79
2,Scoring,0.05,0.21
3,Total,0.23,1.0


In [29]:
sess.save(table=champ_model, name='automl_churn_champ.sashdat', replace=True)

NOTE: Cloud Analytic Services saved the file automl_churn_champ.sashdat in caslib CASUSER(sasdemo).


In [30]:
result = sess.CASTable('model_result')
result.head()

Unnamed: 0,P_CHURN0,P_CHURN1,I_CHURN,_WARN_
0,1.0,0.0,0,
1,1.0,0.0,0,
2,0.98,0.02,0,
3,0.95,0.05,0,
4,0.99,0.01,0,
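
In the scored table, `P_CHURN0` and `P_CHURN1` are the predicted probabilities of the two target levels and `I_CHURN` is the predicted level. A minimal sketch of how the columns relate, using the five rows shown above: it assumes the predicted level is simply the more probable one (a 0.5 cutoff for a binary target). That rule and the `classify` helper are assumptions made for illustration, not something the scoring action reports.

```python
# (P_CHURN0, P_CHURN1, I_CHURN) copied from the head() output above.
scored = [
    (1.00, 0.00, "0"),
    (1.00, 0.00, "0"),
    (0.98, 0.02, "0"),
    (0.95, 0.05, "0"),
    (0.99, 0.01, "0"),
]


def classify(p_churn1, cutoff=0.5):
    """Hypothetical helper: label a row as churn ("1") at or above the cutoff."""
    return "1" if p_churn1 >= cutoff else "0"


predicted = [classify(p1) for _, p1, _ in scored]
matches = [pred == i_churn for pred, (_, _, i_churn) in zip(predicted, scored)]
print(predicted, all(matches))
```

Lowering `cutoff` flags more customers as likely churners, which may be preferable when missing a churner costs more than a false alarm.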


In [31]:
sess.save(table=result, name='model_scoring_result.sashdat', replace=True)

NOTE: Cloud Analytic Services saved the file model_scoring_result.sashdat in caslib CASUSER(sasdemo).


In [32]:
sess.fileinfo(allfiles=True)

Unnamed: 0,Permission,Owner,Group,Name,Size,Encryption,Time,ModTime
0,-rwxr-xr-x,sasdemo,sas,automl_churn_champ.sashdat,6916880,NONE,2020-06-02T18:14:33+00:00,1906740873.45
1,-rwxr-xr-x,sasdemo,sas,model_scoring_result.sashdat,197592,NONE,2020-06-02T18:14:38+00:00,1906740878.45


In [34]:
sess.close()

In [14]:
import h2o
from h2o.automl import H2OAutoML
h2o.init('http://ccbu-vidk.dlviyacluster.sashq-d.openstack.sas.com:54321/')

Checking whether there is an H2O instance running at http://ccbu-vidk.dlviyacluster.sashq-d.openstack.sas.com:54321 . connected.


0,1
H2O_cluster_uptime:,3 mins 32 secs
H2O_cluster_timezone:,Etc/UTC
H2O_data_parsing_timezone:,UTC
H2O_cluster_version:,3.30.0.2
H2O_cluster_version_age:,1 month and 3 days
H2O_cluster_name:,root
H2O_cluster_total_nodes:,1
H2O_cluster_free_memory:,938 Mb
H2O_cluster_total_cores:,24
H2O_cluster_allowed_cores:,24


In [15]:
# Convert the CAS table to a client-side DataFrame and upload it to the H2O cluster
churn_h2o = h2o.H2OFrame(out.to_frame())

Parse progress: |█████████████████████████████████████████████████████████| 100%


In [16]:
churn_h2o.head()

account_length,vmail_message,day_mins,eve_mins,night_mins,intl_mins,custserv_calls,day_calls,day_charge,eve_calls,eve_charge,night_calls,night_charge,intl_calls,intl_charge,state,phone,INTL_PLAN,VMAIL_PLAN,AREA_CODE,CHURN
128,25,265.1,197.4,244.7,10.0,1,110,45.07,99,16.78,91,11.01,3,2.7,KS,382-4657,0,1,415,0
107,26,161.6,195.5,254.4,13.7,1,123,27.47,103,16.62,103,11.45,3,3.7,OH,371-7191,0,1,415,0
137,0,243.4,121.2,162.6,12.2,0,114,41.38,110,10.3,104,7.32,5,3.29,NJ,358-1921,0,0,415,0
84,0,299.4,61.9,196.9,6.6,2,71,50.9,88,5.26,89,8.86,7,1.78,OH,375-9999,1,0,408,0
75,0,166.7,148.3,186.9,10.1,3,113,28.34,122,12.61,121,8.41,3,2.73,OK,330-6626,1,0,415,0
118,0,223.4,220.6,203.9,6.3,0,98,37.98,101,18.75,118,9.18,6,1.7,AL,391-8027,1,0,510,0
121,24,218.2,348.5,212.6,7.5,3,88,37.09,108,29.62,118,9.57,7,2.03,MA,355-9993,0,1,510,0
147,0,157.0,103.1,211.8,7.1,0,79,26.69,94,8.76,96,9.53,6,1.92,MO,329-9001,1,0,415,0
117,0,184.5,351.6,215.8,8.7,1,97,31.37,80,29.89,90,9.71,4,2.35,LA,335-4719,0,0,408,0
141,37,258.6,222.0,326.4,11.2,0,84,43.96,111,18.87,97,14.69,5,3.02,WV,330-8173,1,1,415,0




In [17]:
x = churn_h2o.columns
y = 'CHURN'
x.remove(y)
x.remove('AREA_CODE')
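
The two `remove` calls edit the column list in place and raise `ValueError` if a name is missing. A non-mutating sketch of the same selection follows, using the column names of the frame shown above. Here `columns` is a hand-copied stand-in for `churn_h2o.columns`, and `excluded` is an illustrative name for the set of columns to drop.

```python
# Column names copied from the churn_h2o.head() output above.
columns = [
    "account_length", "vmail_message", "day_mins", "eve_mins", "night_mins",
    "intl_mins", "custserv_calls", "day_calls", "day_charge", "eve_calls",
    "eve_charge", "night_calls", "night_charge", "intl_calls", "intl_charge",
    "state", "phone", "INTL_PLAN", "VMAIL_PLAN", "AREA_CODE", "CHURN",
]

y = "CHURN"
excluded = {y, "AREA_CODE"}  # target plus the column the notebook drops

# Build the predictor list without modifying `columns`.
x = [name for name in columns if name not in excluded]
print(len(x), x[:3])
```

Keeping the exclusions in one set makes it easy to see, and later extend, what is withheld from training.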

In [18]:
# Treat the target as categorical so that AutoML trains classifiers
churn_h2o[y] = churn_h2o[y].asfactor()

In [19]:
# Run AutoML for up to 20 base models, using AUC for early stopping and leaderboard ranking
aml = H2OAutoML(max_models=20, seed=1, stopping_metric='auc', sort_metric='auc')
aml.train(x=x, y=y, training_frame=churn_h2o)

# View the AutoML Leaderboard
lb = aml.leaderboard
lb.head(rows=lb.nrows)

AutoML progress: |████████████████████████████████████████████████████████| 100%


model_id,auc,logloss,aucpr,mean_per_class_error,rmse,mse
XGBoost_grid__1_AutoML_20200602_000138_model_4,0.919038,0.170703,0.858957,0.122468,0.20374,0.04151
XGBoost_grid__1_AutoML_20200602_000138_model_3,0.916229,0.182804,0.859997,0.121415,0.211499,0.044732
XGBoost_3_AutoML_20200602_000138,0.914528,0.171909,0.858757,0.132592,0.203327,0.041342
XGBoost_grid__1_AutoML_20200602_000138_model_1,0.913543,0.188334,0.837277,0.117239,0.216441,0.0468466
StackedEnsemble_BestOfFamily_AutoML_20200602_000138,0.913092,0.153979,0.869004,0.110221,0.193898,0.0375964
StackedEnsemble_AllModels_AutoML_20200602_000138,0.9127,0.150351,0.873603,0.097132,0.191726,0.0367587
XGBoost_1_AutoML_20200602_000138,0.912105,0.185976,0.846952,0.12424,0.2134,0.0455396
XGBoost_2_AutoML_20200602_000138,0.910856,0.205302,0.817516,0.135697,0.23115,0.0534304
GBM_grid__1_AutoML_20200602_000138_model_2,0.907632,0.228659,0.814674,0.147225,0.24948,0.0622404
XGBoost_grid__1_AutoML_20200602_000138_model_2,0.907014,0.24491,0.852866,0.124363,0.21243,0.0451265
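
Because `sort_metric='auc'` was passed, the leaderboard is ranked by AUC, and the other columns can favour a different model. A small sketch with four rows copied from the leaderboard above shows that the AUC leader is not the model with the lowest log loss. The `rows` list and the `best_by` helper are illustrative and not part of the H2O API.

```python
# (model_id, auc, logloss, mean_per_class_error) copied from the leaderboard above.
rows = [
    ("XGBoost_grid__1_AutoML_20200602_000138_model_4", 0.919038, 0.170703, 0.122468),
    ("XGBoost_grid__1_AutoML_20200602_000138_model_3", 0.916229, 0.182804, 0.121415),
    ("StackedEnsemble_BestOfFamily_AutoML_20200602_000138", 0.913092, 0.153979, 0.110221),
    ("StackedEnsemble_AllModels_AutoML_20200602_000138", 0.912700, 0.150351, 0.097132),
]
AUC, LOGLOSS, MPCE = 1, 2, 3  # tuple positions of each metric


def best_by(rows, column, higher_is_better):
    """Hypothetical helper: return the model_id that is best on one metric."""
    pick = max if higher_is_better else min
    return pick(rows, key=lambda r: r[column])[0]


print("best AUC:                 ", best_by(rows, AUC, True))
print("best logloss:             ", best_by(rows, LOGLOSS, False))
print("best mean_per_class_error:", best_by(rows, MPCE, False))
```

Which metric should decide the champion depends on how the predictions are used; AUC measures ranking quality, while log loss also penalises poorly calibrated probabilities.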




In [162]:
h2o.cluster().shutdown()


H2O session _sid_92b0 closed.
