# ABSTRACT

Hyperparameters are parameters that are set before a machine learning algorithm is run, and they can have a large effect on the predictive power of the resulting model. Knowing how important each hyperparameter is to an algorithm, and which range of values is sensible, is crucial to hyperparameter tuning and to building effective models. For experts and non-experts alike, finding hyperparameters that optimize model performance can be a tedious and difficult task. We therefore develop a hyperparameter database that lets users visualize and understand how to choose hyperparameters that maximize the predictive power of their models.

The database is created by evaluating millions of hyperparameter values over thousands of public datasets and calculating the individual conditional expectation of model quality with respect to every hyperparameter.

We analyze the **effect of hyperparameters** on algorithms such as Distributed Random Forest (DRF), Generalized Linear Model (GLM), Gradient Boosting Machine (GBM), Extreme Gradient Boosting (XGBoost), and several more. The database thus aims to provide a one-stop platform where data scientists can identify the hyperparameters that most affect their models, speeding up the development of effective predictive models. The same public datasets will also be used to build models that can predict hyperparameters without search, and to visualize and teach concepts such as statistical power and the bias/variance tradeoff. The raw data will also be publicly available to the research community.
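
As a rough illustration of the individual conditional expectation (ICE) idea mentioned above, the sketch below sweeps one hyperparameter over a grid while holding the remaining values of each observed configuration fixed, and records the predicted model quality at each grid point. The `score` function is a made-up stand-in for a surrogate fitted on (configuration, metric) pairs, and the hyperparameter names and grid values are hypothetical; this is not the project's actual pipeline.

```python
# Hypothetical surrogate: maps a hyperparameter configuration to a quality score.
# In practice this would be a model fitted on (configuration, metric) pairs.
def score(config):
    return -(config["learn_rate"] - 0.1) ** 2 - 0.001 * abs(config["max_depth"] - 6)


def ice_curves(configs, name, grid, predict):
    """Return one curve per configuration: predictions as `name` sweeps over `grid`."""
    curves = []
    for config in configs:
        curve = []
        for value in grid:
            varied = dict(config)   # copy so the observed configuration is untouched
            varied[name] = value    # change only the hyperparameter of interest
            curve.append(predict(varied))
        curves.append(curve)
    return curves


configs = [
    {"learn_rate": 0.05, "max_depth": 3},
    {"learn_rate": 0.30, "max_depth": 10},
]
grid = [0.01, 0.1, 0.5]
curves = ice_curves(configs, "learn_rate", grid, score)

# Averaging the ICE curves pointwise gives the partial dependence curve.
partial_dependence = [sum(col) / len(col) for col in zip(*curves)]
print(curves)
print(partial_dependence)
```

Keeping one curve per configuration, rather than only the average, makes it possible to see whether a hyperparameter behaves differently depending on the other settings.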


In [1]:
# import h2o package and specific estimator 
import h2o
from h2o.automl import H2OAutoML
import random, os, sys
from datetime import datetime
import pandas as pd
import logging
import csv
import optparse
import time
import json
from distutils.util import strtobool
import psutil

import warnings
warnings.filterwarnings('ignore')

In [2]:
h2o.init(strict_version_check=False) # start h2o

Checking whether there is an H2O instance running at http://localhost:54321 ..... not found.
Attempting to start a local H2O server...
  Java Version: java version "11.0.2" 2019-01-15 LTS; Java(TM) SE Runtime Environment 18.9 (build 11.0.2+9-LTS); Java HotSpot(TM) 64-Bit Server VM 18.9 (build 11.0.2+9-LTS, mixed mode)
  Starting server from /anaconda3/lib/python3.7/site-packages/h2o/backend/bin/h2o.jar
  Ice root: /var/folders/8c/swkczny90t36p2798_2khxfh0000gn/T/tmpyhfnuok5
  JVM stdout: /var/folders/8c/swkczny90t36p2798_2khxfh0000gn/T/tmpyhfnuok5/h2o_newzysharma_started_from_python.out
  JVM stderr: /var/folders/8c/swkczny90t36p2798_2khxfh0000gn/T/tmpyhfnuok5/h2o_newzysharma_started_from_python.err
  Server is running at http://127.0.0.1:54321
Connecting to H2O server at http://127.0.0.1:54321 ... successful.


0,1
H2O cluster uptime:,02 secs
H2O cluster timezone:,America/New_York
H2O data parsing timezone:,UTC
H2O cluster version:,3.24.0.1
H2O cluster version age:,16 days
H2O cluster name:,H2O_from_python_newzysharma_azqram
H2O cluster total nodes:,1
H2O cluster free memory:,2 Gb
H2O cluster total cores:,4
H2O cluster allowed cores:,4


In [4]:
#importing data to the server
hp = h2o.import_file(path="hour.csv")

Parse progress: |█████████████████████████████████████████████████████████| 100%


In [5]:
#Displaying the head
hp.head()

instant,dteday,season,yr,mnth,hr,holiday,weekday,workingday,weathersit,temp,atemp,hum,windspeed,casual,registered,cnt
1,2011-01-01 00:00:00,1,0,1,0,0,6,0,1,0.24,0.2879,0.81,0.0,3,13,16
2,2011-01-01 00:00:00,1,0,1,1,0,6,0,1,0.22,0.2727,0.8,0.0,8,32,40
3,2011-01-01 00:00:00,1,0,1,2,0,6,0,1,0.22,0.2727,0.8,0.0,5,27,32
4,2011-01-01 00:00:00,1,0,1,3,0,6,0,1,0.24,0.2879,0.75,0.0,3,10,13
5,2011-01-01 00:00:00,1,0,1,4,0,6,0,1,0.24,0.2879,0.75,0.0,0,1,1
6,2011-01-01 00:00:00,1,0,1,5,0,6,0,2,0.24,0.2576,0.75,0.0896,0,1,1
7,2011-01-01 00:00:00,1,0,1,6,0,6,0,1,0.22,0.2727,0.8,0.0,2,0,2
8,2011-01-01 00:00:00,1,0,1,7,0,6,0,1,0.2,0.2576,0.86,0.0,1,2,3
9,2011-01-01 00:00:00,1,0,1,8,0,6,0,1,0.24,0.2879,0.75,0.0,1,7,8
10,2011-01-01 00:00:00,1,0,1,9,0,6,0,1,0.32,0.3485,0.76,0.0,8,6,14




In [6]:
hp.describe()

Rows:17379
Cols:17




Unnamed: 0,instant,dteday,season,yr,mnth,hr,holiday,weekday,workingday,weathersit,temp,atemp,hum,windspeed,casual,registered,cnt
type,int,time,int,int,int,int,int,int,int,int,real,real,real,real,int,int,int
mins,1.0,1293840000000.0,1.0,0.0,1.0,0.0,0.0,0.0,0.0,1.0,0.02,0.0,0.0,0.0,0.0,0.0,1.0
mean,8690.0,1325477314552.0461,2.501639910236492,0.5025605615973301,6.5377754761493785,11.546751826917548,0.028770355026181024,3.003682605443351,0.6827205247712756,1.425283387997008,0.4969871684216584,0.47577510213476026,0.6272288394038784,0.1900976063064618,35.67621842453536,153.78686920996606,189.4630876345014
maxs,17379.0,1356912000000.0,4.0,1.0,12.0,23.0,1.0,6.0,1.0,4.0,1.0,1.0,1.0,0.8507,367.0,886.0,977.0
sigma,5017.029499614288,18150225217.779854,1.1069181394480765,0.5000078290910197,3.438775713750168,6.914405095264493,0.16716527638437123,2.005771456110988,0.4654306335238829,0.6393568777542534,0.19255612124972193,0.17185021563535932,0.19292983406291514,0.1223402285727905,49.30503038705309,151.35728591258314,181.38759909186476
zeros,0,0,0,8645,0,726,16879,2502,5514,0,0,2,22,2180,1581,24,0
missing,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0,0
0,1.0,2011-01-01 00:00:00,1.0,0.0,1.0,0.0,0.0,6.0,0.0,1.0,0.24,0.2879,0.81,0.0,3.0,13.0,16.0
1,2.0,2011-01-01 00:00:00,1.0,0.0,1.0,1.0,0.0,6.0,0.0,1.0,0.22,0.2727,0.8,0.0,8.0,32.0,40.0
2,3.0,2011-01-01 00:00:00,1.0,0.0,1.0,2.0,0.0,6.0,0.0,1.0,0.22,0.2727,0.8,0.0,5.0,27.0,32.0


In [7]:
# Functions

def alphabet(n):
  alpha='0123456789abcdefghijklmnopqrstuvwxyzABCDEFGHIJKLMNOPQRSTUVWXYZ'    
  s=''  # avoid shadowing the built-in str
  r=len(alpha)-1   
  while len(s)<n:
    i=random.randint(0,r)
    s+=alpha[i]   
  return s
  
  
def set_meta_data(analysis,run_id,server,data,test,model_path,target,run_time,classification,scale,model,balance,balance_threshold,name,path,nthreads,min_mem_size):
  m_data={}
  m_data['start_time'] = time.time()
  m_data['target']=target
  m_data['server_path']=server
  m_data['data_path']=data 
  m_data['test_path']=test
  m_data['max_models']=model
  m_data['run_time']=run_time
  m_data['run_id'] =run_id
  m_data['scale']=scale
  m_data['classification']=classification
  m_data['model_path']=model_path
  m_data['balance']=balance
  m_data['balance_threshold']=balance_threshold
  m_data['project'] =name
  m_data['end_time'] = time.time()
  m_data['execution_time'] = 0.0
  m_data['run_path'] =path
  m_data['nthreads'] = nthreads
  m_data['min_mem_size'] = min_mem_size
  m_data['analysis'] = analysis
  return m_data


def dict_to_json(dct,n):
  j = json.dumps(dct, indent=4)
  # the context manager closes the file even if writing fails
  with open(n, 'w') as f:
    print(j, file=f)
  
  
def stackedensemble(mod):
    coef_norm=None
    try:
      metalearner = h2o.get_model(mod.metalearner()['name'])
      coef_norm=metalearner.coef_norm()
    except:
      pass        
    return coef_norm

def stackedensemble_df(df):
    bm_algo={ 'GBM': None,'GLM': None,'DRF': None,'XRT': None,'Dee': None}
    for index, row in df.iterrows():
      if len(row['model_id'])>3:
        key=row['model_id'][0:3]
        if key in bm_algo:
          if bm_algo[key] is None:
                bm_algo[key]=row['model_id']
    # keep only the families for which a model was found
    bm=[m for m in bm_algo.values() if m is not None]
    return bm

def se_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['auc']=modl.auc()   
    d['roc']=modl.roc()
    d['mse']=modl.mse()   
    d['null_degrees_of_freedom']=modl.null_degrees_of_freedom()
    d['null_deviance']=modl.null_deviance()
    d['residual_degrees_of_freedom']=modl.residual_degrees_of_freedom()   
    d['residual_deviance']=modl.residual_deviance()
    d['rmse']=modl.rmse()
    return d

def get_model_by_algo(algo,models_dict):
    mod=None
    mod_id=None    
    for m in list(models_dict.keys()):
        if m[0:3]==algo:
            mod_id=m
            mod=h2o.get_model(m)      
    return mod,mod_id     
    
    
def gbm_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['varimp']=modl.varimp()  
    return d
    
    
def dl_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['varimp']=modl.varimp()  
    return d
    
    
def drf_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['varimp']=modl.varimp()  
    d['roc']=modl.roc()      
    return d
    
def xrt_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['varimp']=modl.varimp()  
    d['roc']=modl.roc()      
    return d
    
    
def glm_stats(modl):
    d={}
    d['algo']=modl.algo
    d['model_id']=modl.model_id   
    d['coef']=modl.coef()  
    d['coef_norm']=modl.coef_norm()      
    return d
    
def model_performance_stats(perf):
    d={}
    try:    
      d['mse']=perf.mse()
    except:
      pass      
    try:    
      d['rmse']=perf.rmse() 
    except:
      pass      
    try:    
      d['null_degrees_of_freedom']=perf.null_degrees_of_freedom()
    except:
      pass      
    try:    
      d['residual_degrees_of_freedom']=perf.residual_degrees_of_freedom()
    except:
      pass      
    try:    
      d['residual_deviance']=perf.residual_deviance() 
    except:
      pass      
    try:    
      d['null_deviance']=perf.null_deviance() 
    except:
      pass      
    try:    
      d['aic']=perf.aic() 
    except:
      pass      
    try:
      d['logloss']=perf.logloss() 
    except:
      pass    
    try:
      d['auc']=perf.auc()
    except:
      pass  
    try:
      d['gini']=perf.gini()
    except:
      pass    
    return d
    
def impute_missing_values(df, x, scal=False):
    # determine column types
    ints, reals, enums = [], [], []
    for key, val in df.types.items():
        if key in x:
            if val == 'enum':
                enums.append(key)
            elif val == 'int':
                ints.append(key)            
            else: 
                reals.append(key)    
    _ = df[reals].impute(method='mean')
    _ = df[ints].impute(method='median')
    if scal:
        df[reals] = df[reals].scale()
        df[ints] = df[ints].scale()    
    return


def get_independent_variables(df, targ):
    C = [name for name in df.columns if name != targ]
    # determine column types
    ints, reals, enums = [], [], []
    for key, val in df.types.items():
        if key in C:
            if val == 'enum':
                enums.append(key)
            elif val == 'int':
                ints.append(key)            
            else: 
                reals.append(key)    
    x=ints+enums+reals
    return x
    
def get_all_variables_csv(i):
    ivd={}
    try:
      iv = pd.read_csv(i,header=None)
    except:
      sys.exit(1)    
    col=iv.values.tolist()[0]
    dt=iv.values.tolist()[1]
    i=0
    for c in col:
      ivd[c.strip()]=dt[i].strip()
      i+=1        
    return ivd
    
    

def check_all_variables(df,dct,y=None):     
    targ=list(dct.keys())     
    for key, val in df.types.items():
        if key in targ:
          if dct[key] not in ['real','int','enum']:                      
            targ.remove(key)  
    for key, val in df.types.items():
        if key in targ:            
          if dct[key] != val:
            print('convert ',key,' ',dct[key],' ',val)
            if dct[key]=='enum':
                try:
                  df[key] = df[key].asfactor() 
                except:
                  targ.remove(key)                 
            if dct[key]=='int': 
                try:                
                  df[key] = df[key].asnumeric() 
                except:
                  targ.remove(key)                  
            if dct[key]=='real':
                try:                
                  df[key] = df[key].asnumeric()  
                except:
                  targ.remove(key)                  
    if y is None:
      y=df.columns[-1] 
    if y in targ:
      targ.remove(y)
    else:
      y=targ.pop()            
    return targ    
    
def predictions(mod,data,run_id):
    test = h2o.import_file(data)
    mod_perf=mod.model_performance(test)  # use the model passed in, not the global mod_best
              
    stats_test={}
    stats_test=model_performance_stats(mod_perf)

    n=run_id+'_test_stats.json'
    dict_to_json(stats_test,n) 

    try:    
      cf=mod_perf.confusion_matrix(metrics=["f1","f2","f0point5","accuracy","precision","recall","specificity","absolute_mcc","min_per_class_accuracy","mean_per_class_accuracy"])
      cf_df=cf[0].table.as_data_frame()
      cf_df.to_csv(run_id+'_test_confusion_matrix.csv')
    except:
      pass

    predictions = mod.predict(test)  # use the model passed in, not the global mod_best
    predictions_df=test.cbind(predictions).as_data_frame() 
    predictions_df.to_csv(run_id+'_predictions.csv')
    return

def predictions_test(mod,test,run_id):
    mod_perf=mod.model_performance(test)  # use the model passed in, not the global mod_best
    stats_test={}
    stats_test=model_performance_stats(mod_perf)
    n=run_id+'_test_stats.json'
    dict_to_json(stats_test,n) 
    try:
      cf=mod_perf.confusion_matrix()
#      cf=mod_perf.confusion_matrix(metrics=["f1","f2","f0point5","accuracy","precision","recall","specificity","absolute_mcc","min_per_class_accuracy","mean_per_class_accuracy"])
      cf_df=cf.table.as_data_frame()
      cf_df.to_csv(run_id+'_test_confusion_matrix.csv')
    except:
      pass
    predictions = mod.predict(test)  # use the model passed in, not the global mod_best
    predictions_df=test.cbind(predictions).as_data_frame() 
    predictions_df.to_csv(run_id+'_predictions.csv')
    return predictions

def check_X(x,df):
    # rebuild in place; removing items while iterating would skip elements
    x[:] = [name for name in x if name in df.columns]
    return x    
    
    
def get_stacked_ensemble(lst):
    se=None
    # search the list passed in rather than the global model_set
    for model in lst:
      if 'BestOfFamily' in model:
        se=model
    if se is None:     
      for model in lst:
        if 'AllModels' in model:
          se=model           
    return se       
    
def get_variables_types(df):
    d={}
    for key, val in df.types.items():
        d[key]=val           
    return d    
    
#  End Functions
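
The run-ID and JSON helpers defined above can be exercised without an H2O cluster. The following is a minimal standard-library sketch of the same two ideas, under stated assumptions: the names `random_run_id` and `write_json`, the `_meta_data.json` suffix, and the metadata fields shown are hypothetical and chosen only for illustration.

```python
import json
import os
import random
import string
import tempfile
import time


def random_run_id(n, rng=random):
    """Return an n-character ID drawn from digits and ASCII letters."""
    alphabet = string.digits + string.ascii_letters
    return "".join(rng.choice(alphabet) for _ in range(n))


def write_json(record, path):
    """Serialize a dictionary to an indented JSON file."""
    with open(path, "w") as handle:
        json.dump(record, handle, indent=4)


run_id = random_run_id(9)
meta = {"run_id": run_id, "target": "cnt", "run_time": 500, "start_time": time.time()}

# Write to a temporary directory so the sketch leaves the working directory untouched.
out_path = os.path.join(tempfile.mkdtemp(), run_id + "_meta_data.json")
write_json(meta, out_path)

# Read the file back to confirm the round trip.
with open(out_path) as handle:
    restored = json.load(handle)
print(restored["run_id"], restored["target"])
```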

In [8]:
all_variables=None

# Model with 500 seconds

In [11]:
# Assume the following are passed by the user from the web interface

'''
Need a user id and project id?

'''
target='cnt' 
data_file='hour.csv'
run_time=500
run_id='500_' # Just some arbitrary ID
server_path='/Users/newzysharma/Desktop/Desktop/info6105/INFO6105-FinalProject/HyperparametersDB/500'
classification=False
scale=False
max_models=None
balance_y=False # balance_classes=balance_y
balance_threshold=0.2
project ="HyperparameterDB_Project"  # project_name = project

In [12]:
# assign target and inputs (cnt is numeric, so this is a regression problem)
y = target
X = [name for name in hp.columns if name != y]
print(y)
print(X)

cnt
['instant', 'dteday', 'season', 'yr', 'mnth', 'hr', 'holiday', 'weekday', 'workingday', 'weathersit', 'temp', 'atemp', 'hum', 'windspeed', 'casual', 'registered']


In [13]:
# determine column types
ints, reals, enums = [], [], []
for key, val in hp.types.items():
    if key in X:
        if val == 'enum':
            enums.append(key)
        elif val == 'int':
            ints.append(key)            
        else: 
            reals.append(key)

print(ints)
print(enums)
print(reals)

['instant', 'season', 'yr', 'mnth', 'hr', 'holiday', 'weekday', 'workingday', 'weathersit', 'casual', 'registered']
[]
['dteday', 'temp', 'atemp', 'hum', 'windspeed']
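
Note that `dteday` is reported with type `time` by `hp.describe()`, yet it appears in the `reals` list because the final `else` branch catches every type that is neither `enum` nor `int`. The sketch below reproduces that bucketing on a plain dictionary so the behavior can be seen without a running cluster; the function name `split_by_type` is hypothetical, and only a subset of the columns is included.

```python
def split_by_type(types, columns):
    """Group column names into int, enum, and 'everything else' buckets."""
    ints, enums, reals = [], [], []
    for name, kind in types.items():
        if name not in columns:
            continue
        if kind == "enum":
            enums.append(name)
        elif kind == "int":
            ints.append(name)
        else:
            # Any other type, including 'time', lands here.
            reals.append(name)
    return ints, enums, reals


# A subset of the column types reported by hp.describe() above.
types = {"dteday": "time", "season": "int", "temp": "real", "cnt": "int"}
predictors = [name for name in types if name != "cnt"]

ints, enums, reals = split_by_type(types, predictors)
print(ints, enums, reals)
```

If timestamps should not be imputed or scaled like ordinary real-valued columns, one option would be to add an explicit branch for `time`.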


In [14]:
# impute missing values
_ = hp[reals].impute(method='mean')
_ = hp[ints].impute(method='median')

if scale:
    hp[reals] = hp[reals].scale()  # df is undefined in this notebook; the frame is hp
    hp[ints] = hp[ints].scale()
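
For this dataset the `missing` row of `hp.describe()` is zero for every column, so the imputation step changes nothing here. To show what mean and median imputation followed by standardization do in general, here is a small pure-Python sketch on made-up values. The helper names are hypothetical, and using the sample standard deviation for scaling is an assumption rather than a statement about H2O internals.

```python
from statistics import mean, median, stdev


def impute(values, method="mean"):
    """Replace None entries with the mean or median of the observed values."""
    observed = [v for v in values if v is not None]
    fill = mean(observed) if method == "mean" else median(observed)
    return [fill if v is None else v for v in values]


def standardize(values):
    """Center to zero mean and scale to unit sample standard deviation."""
    center, spread = mean(values), stdev(values)
    return [(v - center) / spread for v in values]


# Made-up columns with one missing entry each.
temp = impute([0.24, None, 0.22, 0.32], method="mean")
hr = impute([0, 1, None, 3, 4], method="median")
scaled = standardize(temp)
print(temp)
print(hr)
print(scaled)
```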

In [15]:
# # set target to factor for classification by default or if user specifies classification
# if classification:
#     hp[y] = hp[y].asfactor()

In [16]:
hp[y].levels()

[]


## Cross-validate rather than use a train/test split (500-second run)

In [17]:
# automl
# runs for run_time seconds then builds a stacked ensemble

aml = H2OAutoML(max_runtime_secs=run_time,project_name = project) # init automl, run for 500 seconds
aml.train(x=X,  
           y=y,
           training_frame=hp)

AutoML progress: |████████████████████████████████████████████████████████| 100%
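
Because only a training frame is passed, model quality is estimated by cross-validation; the leader's summary in this notebook lists five folds (`cv_1_valid` to `cv_5_valid`). The sketch below shows the general k-fold idea on row indices: every row is held out exactly once. It is a generic illustration with a hypothetical helper name, not a description of how H2O assigns rows to folds.

```python
import random


def kfold_indices(n_rows, k, seed=0):
    """Yield (train_idx, valid_idx) pairs so every row is held out exactly once."""
    order = list(range(n_rows))
    random.Random(seed).shuffle(order)
    folds = [order[i::k] for i in range(k)]   # k nearly equal-sized folds
    for i, valid in enumerate(folds):
        train = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        yield train, valid


# 17379 is the row count reported by hp.describe().
splits = list(kfold_indices(n_rows=17379, k=5))
for train, valid in splits:
    print(len(train), len(valid))
```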


## Leaderboard

In [18]:
# view leaderboard
lb = aml.leaderboard
lb.head()

model_id,mean_residual_deviance,rmse,mse,mae,rmsle
XGBoost_1_AutoML_20190417_141543,20.836,4.56465,20.836,2.76727,0.0559362
XGBoost_3_AutoML_20190417_141543,21.7471,4.66338,21.7471,3.07955,0.0911628
GBM_1_AutoML_20190417_141543,26.9473,5.19108,26.9473,3.26028,0.0735548
XGBoost_2_AutoML_20190417_141543,34.1898,5.84721,34.1898,3.41192,0.0725628
GLM_grid_1_AutoML_20190417_141543_model_1,38.387,6.19573,38.387,4.45197,
DRF_1_AutoML_20190417_141543,61.2881,7.82867,61.2881,4.36407,0.0801306
StackedEnsemble_AllModels_AutoML_20190417_141543,65.6652,8.10341,65.6652,5.71673,
StackedEnsemble_BestOfFamily_AutoML_20190417_141543,111.489,10.5588,111.489,7.89211,0.382145
GBM_2_AutoML_20190417_141543,3124.34,55.8958,3124.34,39.6476,0.896506
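
In the leaderboard above, `mean_residual_deviance` and `mse` show the same value in every row, and `rmse` is the square root of `mse`. A quick sanity check on three rows copied from the table (the tolerance accounts for rounding in the printed values):

```python
import math

# (model_id, mse, rmse) copied from the leaderboard above.
rows = [
    ("XGBoost_1_AutoML_20190417_141543", 20.836, 4.56465),
    ("GBM_1_AutoML_20190417_141543", 26.9473, 5.19108),
    ("DRF_1_AutoML_20190417_141543", 61.2881, 7.82867),
]

checks = []
for model_id, mse, rmse in rows:
    # Tolerance allows for the rounding in the printed leaderboard.
    consistent = math.isclose(math.sqrt(mse), rmse, rel_tol=1e-4)
    checks.append(consistent)
    print(model_id, round(math.sqrt(mse), 5), consistent)
```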




In [19]:
aml.leader

Model Details
H2OXGBoostEstimator :  XGBoost
Model Key:  XGBoost_1_AutoML_20190417_141543


ModelMetricsRegression: xgboost
** Reported on train data. **

MSE: 1.4326776184714103
RMSE: 1.1969451192395624
MAE: 0.8309335654566193
RMSLE: 0.020809386114201345
Mean Residual Deviance: 1.4326776184714103

ModelMetricsRegression: xgboost
** Reported on cross-validation data. **

MSE: 20.83599849530585
RMSE: 4.56464659040608
MAE: 2.7672702805724585
RMSLE: 0.055936167243912635
Mean Residual Deviance: 20.83599849530585
Cross-Validation Metrics Summary: 


0,1,2,3,4,5,6,7
,mean,sd,cv_1_valid,cv_2_valid,cv_3_valid,cv_4_valid,cv_5_valid
mae,2.7672753,0.0532250,2.628191,2.775974,2.7770326,2.7998576,2.8553214
mean_residual_deviance,20.836102,1.5239158,17.106339,19.807367,21.755531,22.89566,22.615606
mse,20.836102,1.5239158,17.106339,19.807367,21.755531,22.89566,22.615606
r2,0.9993665,0.0000473,0.9994807,0.9994026,0.999335,0.9993054,0.9993089
residual_deviance,20.836102,1.5239158,17.106339,19.807367,21.755531,22.89566,22.615606
rmse,4.5582676,0.1707292,4.135981,4.4505467,4.6642823,4.784941,4.7555866
rmsle,0.0558277,0.0024628,0.0562938,0.0572713,0.0490943,0.0590697,0.0574094


Scoring History: 


0,1,2,3,4,5,6
,timestamp,duration,number_of_trees,training_rmse,training_mae,training_deviance
,2019-04-17 14:20:01,4 min 18.231 sec,0.0,261.9286476,188.9630876,68606.6164192
,2019-04-17 14:20:01,4 min 18.367 sec,5.0,203.5481229,146.4994800,41431.8383331
,2019-04-17 14:20:01,4 min 18.465 sec,10.0,158.4862240,113.6445494,25117.8831919
,2019-04-17 14:20:01,4 min 18.566 sec,15.0,123.8277791,88.1928762,15333.3188879
,2019-04-17 14:20:02,4 min 18.670 sec,20.0,96.5556989,68.4392085,9323.0029909
---,---,---,---,---,---,---
,2019-04-17 14:20:39,4 min 55.807 sec,380.0,1.2372287,0.8556211,1.5307348
,2019-04-17 14:20:40,4 min 56.675 sec,385.0,1.2256590,0.8484709,1.5022400
,2019-04-17 14:20:40,4 min 57.566 sec,390.0,1.2129220,0.8407993,1.4711798



See the whole table with table.as_data_frame()
Variable Importances: 


0,1,2,3
variable,relative_importance,scaled_importance,percentage
registered,2585197824.0000000,1.0,0.7403071
casual,535572896.0000000,0.2071690,0.1533687
hr,230089488.0000000,0.0890027,0.0658893
workingday,39991820.0000000,0.0154695,0.0114522
instant,35030364.0000000,0.0135504,0.0100314
dteday,29959240.0000000,0.0115888,0.0085792
atemp,12163877.0000000,0.0047052,0.0034833
temp,9519879.0,0.0036825,0.0027261
weekday,9429757.0,0.0036476,0.0027003
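
The importances above are dominated by `registered` and `casual`. In the rows displayed by `hp.head()`, `casual + registered` equals `cnt` exactly, which would explain why these two predictors carry almost all of the importance and why the cross-validated error is so low. The sketch below checks the identity on the displayed rows only; whether it holds for the whole file is an assumption that should be verified on the full frame. The predictor list in the second half is a shortened, hypothetical example.

```python
# (casual, registered, cnt) copied from the first rows shown by hp.head().
head_rows = [
    (3, 13, 16),
    (8, 32, 40),
    (5, 27, 32),
    (3, 10, 13),
    (0, 1, 1),
]

mismatches = [row for row in head_rows if row[0] + row[1] != row[2]]
print("rows where casual + registered != cnt:", len(mismatches))

# If the identity holds on the full data, these columns reveal the target.
# Hypothetical follow-up: drop them from the predictor list before training.
predictors = ["season", "hr", "temp", "casual", "registered"]
leaky = {"casual", "registered"}
safe_predictors = [name for name in predictors if name not in leaky]
print(safe_predictors)
```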




In [20]:
aml.leader.algo

'xgboost'

## Ensemble Exploration

In [21]:
aml_leaderboard_df=aml.leaderboard.as_data_frame()
aml_leaderboard_df

Unnamed: 0,model_id,mean_residual_deviance,rmse,mse,mae,rmsle
0,XGBoost_1_AutoML_20190417_141543,20.835998,4.564647,20.835998,2.76727,0.055936
1,XGBoost_3_AutoML_20190417_141543,21.747068,4.663375,21.747068,3.079545,0.091163
2,GBM_1_AutoML_20190417_141543,26.947297,5.191079,26.947297,3.260282,0.073555
3,XGBoost_2_AutoML_20190417_141543,34.189808,5.847205,34.189808,3.411921,0.072563
4,GLM_grid_1_AutoML_20190417_141543_model_1,38.387049,6.195728,38.387049,4.451969,
5,DRF_1_AutoML_20190417_141543,61.288108,7.828672,61.288108,4.364072,0.080131
6,StackedEnsemble_AllModels_AutoML_20190417_141543,65.665224,8.103408,65.665224,5.716734,
7,StackedEnsemble_BestOfFamily_AutoML_20190417_1...,111.488617,10.558817,111.488617,7.892115,0.382145
8,GBM_2_AutoML_20190417_141543,3124.342041,55.895814,3124.342041,39.647593,0.896506


## Generating a JSON file for each model with a for loop

In [23]:
aml_leaderboard_df=aml.leaderboard.as_data_frame()
model_set=aml_leaderboard_df['model_id']
mod_best=h2o.get_model(model_set[0])

In [25]:
aml_leaderboard_df.shape

(9, 6)

In [26]:
model_set.shape

(9,)

In [24]:
##iterating over number of rows(all model_id)
for i in range(model_set.shape[0]):
    mod = h2o.get_model(model_set[i])  # separate name so mod_best keeps pointing at the leader
    hy_parameter = mod.params
    n = run_id + '_' + model_set[i] + '.json'
    dict_to_json(hy_parameter, n)
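
Once the per-model JSON files exist, the hyperparameter values can be flattened for storage in a database table. The sketch below assumes each file maps a hyperparameter name to an object with `default` and `actual` entries; that layout is an assumption about what `params` returns and should be checked against a real exported file. The values are invented for illustration and are not results from this run, and `actual_values` is a hypothetical helper.

```python
import json


def actual_values(params):
    """Keep only the value each hyperparameter actually took in the run."""
    return {name: entry.get("actual") for name, entry in params.items()
            if isinstance(entry, dict)}


# Hand-written stand-in for the contents of one exported JSON file.
# The {'default': ..., 'actual': ...} layout is an assumption about model.params.
exported = json.loads("""
{
    "ntrees":     {"default": 50,  "actual": 390},
    "max_depth":  {"default": 6,   "actual": 6},
    "learn_rate": {"default": 0.3, "actual": 0.05}
}
""")

flat = actual_values(exported)

# Hyperparameters whose actual value differs from the default.
changed = {k: v for k, v in flat.items() if v != exported[k]["default"]}
print(flat)
print(changed)
```

Storing only the values that differ from the defaults is one way to keep the database compact, at the cost of needing the defaults for each H2O version to reconstruct a full configuration.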

# CONCLUSION

<table style="width:50%">
  <tr>
    <th>AutoML runtime (seconds)</th>
    <th>Models Generated</th> 
  </tr>
    
   <tr>
    <td>500</td>
    <td>7</td> 
  </tr>
    
  <tr>
    <td>1000</td>
    <td>13</td> 
  </tr>
  
  <tr>
    <td>1350</td>
    <td>33</td> 
  </tr>
  
   <tr>
    <td>1500</td>
    <td>56</td> 
  </tr>
  
   <tr>
    <td>1850</td>
    <td>84</td> 
  </tr>
    
</table>

# CONTRIBUTION

# CITATIONS

https://github.com/nikbearbrown/CSYE_7245/blob/master/H2O/H2O_automl_model.ipynb

# LICENSE


Copyright 2019 Newzy Sharma 

Permission is hereby granted, free of charge, to any person obtaining a copy of this software and associated 
documentation files (the "Software"), to deal in the Software without restriction, including without limitation the 
rights to use, copy, modify, merge, publish, distribute, sublicense, and/or sell copies of the Software, and to permit
persons to whom the Software is furnished to do so, subject to the following conditions:

The above copyright notice and this permission notice shall be included in all copies or substantial portions of the 
Software.

THE SOFTWARE IS PROVIDED "AS IS", WITHOUT WARRANTY OF ANY KIND, EXPRESS OR IMPLIED, INCLUDING BUT NOT LIMITED TO THE 
WARRANTIES OF MERCHANTABILITY, FITNESS FOR A PARTICULAR PURPOSE AND NONINFRINGEMENT. IN NO EVENT SHALL THE AUTHORS OR 
COPYRIGHT HOLDERS BE LIABLE FOR ANY CLAIM, DAMAGES OR OTHER LIABILITY, WHETHER IN AN ACTION OF CONTRACT, TORT OR 
OTHERWISE, ARISING FROM, OUT OF OR IN CONNECTION WITH THE SOFTWARE OR THE USE OR OTHER DEALINGS IN THE SOFTWARE.