## Modep Tabular AutoML

https://github.com/modep-ai

We'll run all of these on the Kaggle Otto Group Product Classification Challenge dataset and make a submission with each:

- [AutoGluon](https://auto.gluon.ai): Amazon's AutoML framework
- AutoGluon_bestquality: above with extra setting for maximum accuracy
- [auto-sklearn](https://www.automl.org/automl/auto-sklearn/): the most popular by GitHub stars, winner of the ChaLearn AutoML challenge
- [auto-sklearn 2.0](https://www.automl.org/auto-sklearn-2-0-the-next-generation/): the newer 2.0 version of the above
- [Auto-WEKA](https://www.cs.ubc.ca/labs/beta/Projects/autoweka/): one of the oldest, but it's a JAR if you're into that
- [FLAML](https://github.com/microsoft/FLAML): Microsoft's version of AutoML
- [GAMA](https://github.com/PGijsbers/gama): AutoML project from OpenML
- [H2O AutoML](https://docs.h2o.ai/h2o/latest-stable/h2o-docs/automl.html): by H2O.ai
- [hyperopt-sklearn](http://hyperopt.github.io/hyperopt-sklearn/): uses hyperopt to search scikit-learn models
- [mljar-supervised](https://supervised.mljar.com/): Python AutoML package by MLJAR (not a Java JAR, despite the name)
- mljar-supervised_compete: above with extra setting for maximum accuracy
- [MLNet](https://docs.microsoft.com/en-us/dotnet/machine-learning/reference/ml-net-cli-reference): command line AutoML tool by Microsoft
- [TPOT](https://github.com/EpistasisLab/tpot): optimizes scikit-learn pipelines using genetic programming

In addition, the following non-AutoML baseline frameworks are available for comparison:

- Constant Predictor: predicts empirical target class probabilities for classification or the target median for regression
- Decision Tree: scikit-learn Decision Tree with default parameters
- Random Forest: scikit-learn Random Forest with default parameters except `n_estimators = 2000`
- Tuned Random Forest: above with tuned `max_features` parameter
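
To make the weakest baseline concrete, here is a minimal sketch of what a constant predictor does, written with the standard library only. The function names and toy labels are illustrative assumptions, not code from the benchmark itself.

```python
from collections import Counter
from statistics import median


def constant_class_probabilities(labels):
    """Return the empirical frequency of each class in the training labels."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {cls: n / total for cls, n in sorted(counts.items())}


def constant_regression_prediction(targets):
    """Return the training-target median, used as the prediction for every row."""
    return median(targets)


# Toy labels for illustration only (not the Otto data).
toy_labels = ["Class_1", "Class_2", "Class_2", "Class_3"]
priors = constant_class_probabilities(toy_labels)
print(priors)
print(constant_regression_prediction([1.0, 2.0, 10.0]))
```

Every test row receives the same probability vector, so any framework that learns something from the features should beat this score.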

### Python client example 

### Install the modep.ai client

In [1]:
pip install git+https://github.com/jimgoo/modep-client.git

Collecting git+https://github.com/jimgoo/modep-client.git
  Cloning https://github.com/jimgoo/modep-client.git to /tmp/pip-req-build-8quonbdt
  Running command git clone -q https://github.com/jimgoo/modep-client.git /tmp/pip-req-build-8quonbdt
Note: you may need to restart the kernel to use updated packages.


### Install the kaggle client 

This will be used to download data and make submissions. Set up your credentials by following the guide at https://github.com/Kaggle/kaggle-api

In [2]:
pip install kaggle

Note: you may need to restart the kernel to use updated packages.


In [3]:
import os
import time
import numpy as np
import pandas as pd
from IPython.display import display, HTML

from modep_client import ModepClient

In [4]:
client = ModepClient('<YOUR API KEY>', url='http://localhost:5000/v1/', ensure_https=False)
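
Rather than pasting the API key into the notebook, one option is to read it from an environment variable. The sketch below assumes a variable named `MODEP_API_KEY`; that name and the `load_api_key` helper are hypothetical and not part of the client.

```python
import os


def load_api_key(env_var="MODEP_API_KEY"):
    """Read the API key from an environment variable (the name is an assumption)."""
    key = os.environ.get(env_var)
    if not key:
        raise RuntimeError(f"Set the {env_var} environment variable first")
    return key


# The client call from the cell above would then become:
# client = ModepClient(load_api_key(), url='http://localhost:5000/v1/', ensure_https=False)
```

This keeps the key out of saved notebook outputs and version control.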

### Download kaggle data

https://www.kaggle.com/c/otto-group-product-classification-challenge

> For this competition, we have provided a dataset with 93 features for more than 200,000 products. The objective is to build a predictive model which is able to distinguish between our main product categories. The winning models will be open sourced.

In [6]:
dataset = 'otto-group'

In [7]:
if not os.path.exists(dataset):
    !kaggle competitions download -p {dataset} -q otto-group-product-classification-challenge
    !unzip -d {dataset} {dataset}/otto-group-product-classification-challenge.zip

In [8]:
!ls $dataset

otto-group-product-classification-challenge.zip  submissions  train.csv
sampleSubmission.csv				 test.csv


In [9]:
# training set
train = pd.read_csv(dataset + '/train.csv')

In [10]:
train.head()

Unnamed: 0,id,feat_1,feat_2,feat_3,feat_4,feat_5,feat_6,feat_7,feat_8,feat_9,...,feat_85,feat_86,feat_87,feat_88,feat_89,feat_90,feat_91,feat_92,feat_93,target
0,1,1,0,0,0,0,0,0,0,0,...,1,0,0,0,0,0,0,0,0,Class_1
1,2,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,Class_1
2,3,0,0,0,0,0,0,0,1,0,...,0,0,0,0,0,0,0,0,0,Class_1
3,4,1,0,0,1,6,1,5,0,0,...,0,1,2,0,0,0,0,0,0,Class_1
4,5,0,0,0,0,0,0,0,0,0,...,1,0,0,0,0,1,0,0,0,Class_1


In [11]:
train = train.drop(columns=['id'])

In [12]:
# name of the target column
target = 'target'

In [13]:
# test set for public leaderboard (no target column)
test = pd.read_csv(dataset + '/test.csv')

In [14]:
test.head()

Unnamed: 0,id,feat_1,feat_2,feat_3,feat_4,feat_5,feat_6,feat_7,feat_8,feat_9,...,feat_84,feat_85,feat_86,feat_87,feat_88,feat_89,feat_90,feat_91,feat_92,feat_93
0,1,0,0,0,0,0,0,0,0,0,...,0,0,11,1,20,0,0,0,0,0
1,2,2,2,14,16,0,0,0,0,0,...,0,0,0,0,0,4,0,0,2,0
2,3,0,1,12,1,0,0,0,0,0,...,0,0,0,0,2,0,0,0,0,1
3,4,0,0,0,1,0,0,0,0,0,...,0,3,1,0,0,0,0,0,0,0
4,5,1,0,0,1,0,0,1,2,0,...,0,0,0,0,0,0,0,9,0,0


In [15]:
# we will use the ID column later when submitting our predictions
test_ids = test.id

# drop the ID column
test = test.drop(columns=['id'])

In [16]:
# add a dummy target column for the test set
if target not in test.columns:
    test[target] = train[target].values[0]

### Upload datasets

In [17]:
client.list_datasets()

In [18]:
# dsets = client.list_datasets().sort_values(by='mbytes')
# train_data_id = dsets.index[-1]
# test_data_id = dsets.index[0]
# dsets

In [19]:
# upload the training set
train_dset = client.upload_dataset(train)

In [20]:
train_dset

{'id': 'f23b01c1-a77e-46bc-a55d-ca43e84a67d6',
 'path': '/tmp/tmp_xmyayjd.csv',
 'name': 'tmp_xmyayjd.csv',
 'ext': 'csv',
 'mbytes': 11.5139217376709,
 'created': '2021-08-05T21:35:38.372441'}

In [21]:
train_data_id = train_dset['id']

In [22]:
# upload the test set
test_dset = client.upload_dataset(test)

In [23]:
test_dset

{'id': '9897bb9b-f9f8-49a8-b581-927a330eb1df',
 'path': '/tmp/tmpg71axo7v.csv',
 'name': 'tmpg71axo7v.csv',
 'ext': 'csv',
 'mbytes': 26.8632764816284,
 'created': '2021-08-05T21:35:49.258719'}

In [24]:
test_data_id = test_dset['id']

In [25]:
client.list_datasets()

id,path,name,ext,mbytes,created
9897bb9b-f9f8-49a8-b581-927a330eb1df,/tmp/tmpg71axo7v.csv,tmpg71axo7v.csv,csv,26.863276,2021-08-05T21:35:49.258719
f23b01c1-a77e-46bc-a55d-ca43e84a67d6,/tmp/tmp_xmyayjd.csv,tmp_xmyayjd.csv,csv,11.513922,2021-08-05T21:35:38.372441


In [26]:
# get information on each framework
frameworks = client.list_framework_info()
frameworks

framework_name,description,project,params
AutoGluon,AutoGluon-Tabular: Unlike existing AutoML fram...,https://auto.gluon.ai,{'_save_artifacts': ['leaderboard']}
AutoGluon_bestquality,AutoGluon with 'best_quality' preset provides ...,,"{'_save_artifacts': ['leaderboard'], 'presets'..."
autosklearn,auto-sklearn frees a machine learning user fro...,https://automl.github.io/auto-sklearn/,{'_save_artifacts': ['models']}
autosklearn2,,,"{'_askl2': True, '_save_artifacts': ['models']}"
AutoWEKA,Auto-WEKA considers the problem of simultaneou...,https://www.cs.ubc.ca/labs/beta/Projects/autow...,
constantpredictor,Fast dummy classifier mainly used to test the ...,https://scikit-learn.org/stable/modules/genera...,
DecisionTree,A simple decision tree implementation (scikit-...,https://scikit-learn.org/stable/modules/genera...,
flaml,FLAML is a lightweight Python library that fin...,https://github.com/microsoft/FLAML,
GAMA,GAMA tries to find a good machine learning pip...,https://github.com/PGijsbers/gama,
H2OAutoML,"H2O AutoML is a highly scalable, fully-automat...",http://docs.h2o.ai/h2o/latest-stable/h2o-docs/...,{'_save_artifacts': ['leaderboard']}


### Train each AutoML framework

In [27]:
# maximum amount of time to run each AutoML framework for
max_runtime_seconds = 3600

# experiment ID to tag each framework run with
exp_id = 'otto-v1'

In [None]:
# train each AutoML framework on the training set and get predictions for the test set
for framework_id, row in frameworks.iterrows(): 
    display(HTML(f"<h4>{framework_id}</h4>"))
    
    model_task = client.train_framework(framework_id, train_data_id, test_data_id, 
                                        target, max_runtime_seconds, experiment_id=exp_id)
    
    # this waits for training to complete
    run = model_task.result()
    
    print(pd.Series(run))

id                                  b5e3fa7c-f5e2-44dd-a7dc-17e9c65f9917
framework_name                                                 AutoGluon
version                                                            0.2.0
train_ids                       ["f23b01c1-a77e-46bc-a55d-ca43e84a67d6"]
test_ids                        ["9897bb9b-f9f8-49a8-b581-927a330eb1df"]
target                                                            target
max_runtime_seconds                                                 3600
created                                       2021-08-05T21:36:14.727742
status                                                           SUCCESS
problem_type                                                  multiclass
metric_name                                                  neg_logloss
metric_value                                                    -8.05542
other_metrics          {'logloss': 8.05542, 'acc': 0.0227197, 'balacc...
duration                                           

id                                  253ae835-e451-4678-9290-8fe2a0b9bf89
framework_name                                     AutoGluon_bestquality
version                                                            0.2.0
train_ids                       ["f23b01c1-a77e-46bc-a55d-ca43e84a67d6"]
test_ids                        ["9897bb9b-f9f8-49a8-b581-927a330eb1df"]
target                                                            target
max_runtime_seconds                                                 3600
created                                       2021-08-05T21:46:09.683072
status                                                           SUCCESS
problem_type                                                  multiclass
metric_name                                                  neg_logloss
metric_value                                                    -7.30574
other_metrics          {'logloss': 7.30574, 'acc': 0.0261485, 'balacc...
duration                                           
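
The `logloss` of about 8 and accuracy of about 2% in the run summaries above most likely do not reflect model quality: the test set's `target` column was filled with a placeholder (the first training label), so the metrics are presumably computed against that placeholder. The sketch below uses toy probabilities and a hand-written log loss (both illustrative assumptions) to show how the same predictions score against real labels versus an all-`class_1` placeholder.

```python
import math


def multiclass_log_loss(true_labels, prob_rows, eps=1e-15):
    """Mean negative log-probability assigned to the true class of each row."""
    total = 0.0
    for label, probs in zip(true_labels, prob_rows):
        p = min(max(probs[label], eps), 1 - eps)  # clip to avoid log(0)
        total -= math.log(p)
    return total / len(true_labels)


# Toy predictions for three products (not real Otto predictions).
prob_rows = [
    {"class_1": 0.8, "class_2": 0.1, "class_3": 0.1},
    {"class_1": 0.1, "class_2": 0.8, "class_3": 0.1},
    {"class_1": 0.1, "class_2": 0.1, "class_3": 0.8},
]
real_labels = ["class_1", "class_2", "class_3"]
placeholder_labels = ["class_1"] * 3  # mimics the dummy target column

real_loss = multiclass_log_loss(real_labels, prob_rows)
placeholder_loss = multiclass_log_loss(placeholder_labels, prob_rows)
print(f"against real labels:        {real_loss:.4f}")
print(f"against placeholder labels: {placeholder_loss:.4f}")
```

The scores that matter for this comparison are the ones Kaggle reports after submission.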

In [44]:
# get all framework runs from the experiment
runs = client.list_framework_runs().query(f"experiment_id == '{exp_id}'")

runs[['framework_name', 'version', 'status', 'experiment_id']]

id,framework_name,version,status,experiment_id
6cefd03d-436b-432b-8167-d4654e820873,TunedRandomForest,0.24.2,SUCCESS,otto-v1
f55ab336-a63e-483f-94a3-9a2545c7efc8,TPOT,0.11.7,SUCCESS,otto-v1
2a9de9a4-70ba-46a6-a427-6991be3e6728,RandomForest,0.24.2,SUCCESS,otto-v1
a9560114-226d-4918-96fa-4d79cf423141,MLNet,latest,FAIL,otto-v1
8ce2aa86-bf05-4c03-9c73-f927a29a9755,mljarsupervised_compete,0.10.4,SUCCESS,otto-v1
fc3df818-a0f9-43f5-95f8-b0197f4ff1e3,mljarsupervised,0.10.4,SUCCESS,otto-v1
831e2bdd-cde7-47ac-8692-13e638c25ac3,hyperoptsklearn,latest,SUCCESS,otto-v1
9f4bc6fc-9fe4-4320-8df4-ad8c88853000,H2OAutoML,3.32.1.3,SUCCESS,otto-v1
69c68137-feeb-4a82-87bf-03ef3adbf4c4,GAMA,21.0.0,SUCCESS,otto-v1
7456f14a-d5c8-4d12-935f-be2c04bcbc73,flaml,0.5.3,SUCCESS,otto-v1


### Inspect created models

Some frameworks, such as AutoGluon and H2O, produce leaderboards, while others, such as TPOT and auto-sklearn, produce text descriptions.

In [45]:
for idx, row in runs.iterrows():
    if row.fold_leaderboard:
        display(HTML(f'<h4>{row.framework_name} leaderboard</h4>'))
        # [0] is for the first and only test set fold
        lb = pd.DataFrame(row.fold_leaderboard[0])
        display(HTML(lb.to_html()))

Unnamed: 0,model_id,logloss,mean_per_class_error,rmse,mse,auc,aucpr
0,StackedEnsemble_AllModels_AutoML_20210805_225324,0.467361,0.223162,0.387139,0.149877,,
1,StackedEnsemble_BestOfFamily_AutoML_20210805_225324,0.471185,0.225353,0.388839,0.151196,,
2,XGBoost_grid__1_AutoML_20210805_225324_model_3,0.479894,0.236635,0.385185,0.148367,,
3,GBM_grid__1_AutoML_20210805_225324_model_1,0.492117,0.246393,0.38995,0.152061,,
4,XGBoost_grid__1_AutoML_20210805_225324_model_1,0.493685,0.242592,0.392907,0.154376,,
5,XGBoost_grid__1_AutoML_20210805_225324_model_2,0.497113,0.239849,0.392999,0.154448,,
6,XGBoost_1_AutoML_20210805_225324,0.498158,0.250752,0.396303,0.157056,,
7,XGBoost_2_AutoML_20210805_225324,0.503283,0.247462,0.398668,0.158936,,
8,XGBoost_3_AutoML_20210805_225324,0.511631,0.258881,0.406398,0.165159,,
9,XGBoost_grid__1_AutoML_20210805_225324_model_4,0.537923,0.26431,0.415823,0.172908,,


Unnamed: 0,model,score_val,pred_time_val,fit_time,pred_time_val_marginal,fit_time_marginal,stack_level,can_infer,fit_order
0,WeightedEnsemble_L3,-0.411892,80.572402,3325.451152,0.017344,24.811585,3,True,21
1,LightGBMXT_BAG_L2,-0.416483,59.52254,2436.966243,0.846717,113.560937,2,True,16
2,LightGBM_BAG_L2,-0.423612,59.280401,2449.176803,0.604577,125.771497,2,True,17
3,NeuralNetFastAI_BAG_L2,-0.4282,60.880225,3012.58104,2.204401,689.175733,2,True,15
4,WeightedEnsemble_L2,-0.437446,45.625245,1733.399005,0.017288,63.279117,2,True,12
5,CatBoost_BAG_L2,-0.45192,59.455119,2436.863344,0.779295,113.458037,2,True,20
6,RandomForestEntr_BAG_L2,-0.459155,64.167467,2355.730778,5.491643,32.325472,2,True,19
7,RandomForestGini_BAG_L2,-0.466616,64.213904,2339.455117,5.538081,16.04981,2,True,18
8,LightGBM_BAG_L1,-0.467133,4.063668,237.912334,4.063668,237.912334,1,True,5
9,LightGBMXT_BAG_L1,-0.480973,28.523388,745.386145,28.523388,745.386145,1,True,4


Unnamed: 0,model,score_val,pred_time_val,fit_time,pred_time_val_marginal,fit_time_marginal,stack_level,can_infer,fit_order
0,WeightedEnsemble_L2,-0.406633,1.136555,216.655553,0.001781,2.756016,2,True,14
1,LightGBMLarge,-0.421393,0.216485,31.677305,0.216485,31.677305,1,True,13
2,LightGBM,-0.425818,0.314181,21.856558,0.314181,21.856558,1,True,5
3,XGBoost,-0.432559,0.084537,82.87762,0.084537,82.87762,1,True,11
4,LightGBMXT,-0.450137,1.395996,59.264752,1.395996,59.264752,1,True,4
5,CatBoost,-0.459812,0.018013,124.325804,0.018013,124.325804,1,True,8
6,NeuralNetFastAI,-0.465241,0.090122,71.085798,0.090122,71.085798,1,True,3
7,RandomForestGini,-0.535397,0.107455,5.983227,0.107455,5.983227,1,True,6
8,RandomForestEntr,-0.550734,0.213394,6.559559,0.213394,6.559559,1,True,7
9,ExtraTreesGini,-0.57664,0.231412,6.89521,0.231412,6.89521,1,True,9


In [46]:
for idx, row in runs.iterrows():
    if row.fold_model_txt:
        display(HTML(f'<h4>{row.framework_name} description</h4>'))
        # [0] is for the first and only test set fold
        display(row.fold_model_txt[0][:10])

["{'fitness': '(1.0, -0.5787632796617306)',\n",
 " 'model': 'ExtraTreesClassifier(input_matrix, '\n",
 "          'ExtraTreesClassifier__bootstrap=False, '\n",
 "          'ExtraTreesClassifier__criterion=gini, '\n",
 "          'ExtraTreesClassifier__max_features=0.7000000000000001, '\n",
 "          'ExtraTreesClassifier__min_samples_leaf=5, '\n",
 "          'ExtraTreesClassifier__min_samples_split=5, '\n",
 "          'ExtraTreesClassifier__n_estimators=100)',\n",
 " 'pipeline': Pipeline(steps=[('extratreesclassifier',\n",
 '                 ExtraTreesClassifier(max_features=0.7000000000000001,\n']

["[(0.380000, SimpleClassificationPipeline({'balancing:strategy': 'none', 'classifier:__choice__': 'gradient_boosting', 'data_preprocessing:categorical_transformer:categorical_encoding:__choice__': 'no_encoding', 'data_preprocessing:categorical_transformer:category_coalescence:__choice__': 'no_coalescense', 'data_preprocessing:numerical_transformer:imputation:strategy': 'most_frequent', 'data_preprocessing:numerical_transformer:rescaling:__choice__': 'minmax', 'feature_preprocessor:__choice__': 'no_preprocessing', 'classifier:gradient_boosting:early_stop': 'valid', 'classifier:gradient_boosting:l2_regularization': 4.834606545261537e-08, 'classifier:gradient_boosting:learning_rate': 0.15062492227512742, 'classifier:gradient_boosting:loss': 'auto', 'classifier:gradient_boosting:max_bins': 255, 'classifier:gradient_boosting:max_depth': 'None', 'classifier:gradient_boosting:max_leaf_nodes': 169, 'classifier:gradient_boosting:min_samples_leaf': 67, 'classifier:gradient_boosting:scoring': 'l

["[(0.200000, SimpleClassificationPipeline({'balancing:strategy': 'weighting', 'classifier:__choice__': 'gradient_boosting', 'data_preprocessing:categorical_transformer:categorical_encoding:__choice__': 'one_hot_encoding', 'data_preprocessing:categorical_transformer:category_coalescence:__choice__': 'minority_coalescer', 'data_preprocessing:numerical_transformer:imputation:strategy': 'mean', 'data_preprocessing:numerical_transformer:rescaling:__choice__': 'robust_scaler', 'feature_preprocessor:__choice__': 'no_preprocessing', 'classifier:gradient_boosting:early_stop': 'train', 'classifier:gradient_boosting:l2_regularization': 0.8458957961294217, 'classifier:gradient_boosting:learning_rate': 0.07363874651150268, 'classifier:gradient_boosting:loss': 'auto', 'classifier:gradient_boosting:max_bins': 255, 'classifier:gradient_boosting:max_depth': 'None', 'classifier:gradient_boosting:max_leaf_nodes': 999, 'classifier:gradient_boosting:min_samples_leaf': 178, 'classifier:gradient_boosting:sc

In [47]:
# show information on failed runs
for idx, row in runs.iterrows():
    if row.status != 'SUCCESS':
        print(row.framework_name)
        print(row.info)

MLNet
ParserError: Error tokenizing data. C error: Expected 1 fields in line 3, saw 3

AutoWEKA
NoResultError: AutoWEKA failed producing any prediction.


### Submit framework predictions to kaggle

In [48]:
# select members for submission to kaggle
mems = runs.query(f'status == "SUCCESS" & experiment_id == "{exp_id}" & framework_name != "constantpredictor"')
len(mems)

14

In [49]:
# get test set predictions for each framework
preds = [client.get_framework_predictions(run_id) for run_id in mems.index]

In [50]:
# preds[0][0] holds the first run's predictions for the first and only test set fold
pd.DataFrame(preds[0][0]).head()

Unnamed: 0,class_1,class_2,class_3,class_4,class_5,class_6,class_7,class_8,class_9,predictions,truth
0,0.0075,0.159,0.241,0.5415,0.0,0.0115,0.0315,0.0065,0.0015,class_4,class_1
1,0.033,0.061,0.0525,0.0475,0.0005,0.3865,0.0055,0.3915,0.022,class_8,class_1
2,0.0,0.0,0.0,0.0,0.0,1.0,0.0,0.0,0.0,class_6,class_1
3,0.005,0.591,0.3355,0.04,0.0,0.0055,0.0025,0.002,0.0185,class_2,class_1
4,0.044,0.0,0.0,0.0,0.0,0.0315,0.047,0.057,0.8205,class_9,class_1


In [51]:
classes = pd.DataFrame(preds[0][0]).columns[:-2]
classes

Index(['class_1', 'class_2', 'class_3', 'class_4', 'class_5', 'class_6',
       'class_7', 'class_8', 'class_9'],
      dtype='object')

In [52]:
!head {dataset}/sampleSubmission.csv

id,Class_1,Class_2,Class_3,Class_4,Class_5,Class_6,Class_7,Class_8,Class_9
1,1,0,0,0,0,0,0,0,0
2,1,0,0,0,0,0,0,0,0
3,1,0,0,0,0,0,0,0,0
4,1,0,0,0,0,0,0,0,0
5,1,0,0,0,0,0,0,0,0
6,1,0,0,0,0,0,0,0,0
7,1,0,0,0,0,0,0,0,0
8,1,0,0,0,0,0,0,0,0
9,1,0,0,0,0,0,0,0,0
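
Note that the sample submission header uses `Class_1` ... `Class_9`, while the prediction columns are lowercase (`class_1` ... `class_9`). The submissions in this notebook were accepted with the lowercase names, so the following is optional: a small sketch, with a hypothetical helper name, that maps prediction columns onto the sample header regardless of case.

```python
def align_to_sample_header(pred_columns, sample_header):
    """Map prediction column names to the submission header, ignoring case."""
    by_lower = {name.lower(): name for name in sample_header}
    missing = [c for c in pred_columns if c.lower() not in by_lower]
    if missing:
        raise ValueError(f"No matching submission column for: {missing}")
    return [by_lower[c.lower()] for c in pred_columns]


sample_header = [f"Class_{i}" for i in range(1, 10)]  # as in sampleSubmission.csv
pred_columns = [f"class_{i}" for i in range(1, 10)]   # as in the predictions frame
aligned = align_to_sample_header(pred_columns, sample_header)
print(aligned)
```

In the notebook, the result could be assigned to `submission.columns` before writing the CSV.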


In [53]:
!mkdir -p $dataset/submissions

In [54]:
for i, (idx, run) in enumerate(mems.iterrows()):
    print('\n' + run.framework_name)
    
    probs = pd.DataFrame(preds[i][0]).iloc[:, :-2]
    submission = pd.DataFrame(probs.values, columns=classes, index=test_ids)
    
    fname = dataset + f'/submissions/{exp_id}-{run.framework_name}.csv'
    submission.to_csv(fname)
    
    !kaggle competitions submit otto-group-product-classification-challenge -f {fname} -m {fname}


TunedRandomForest
100%|██████████████████████████████████████| 7.70M/7.70M [00:06<00:00, 1.31MB/s]
Successfully submitted to Otto Group Product Classification Challenge
TPOT
100%|██████████████████████████████████████| 18.9M/18.9M [00:14<00:00, 1.36MB/s]
Successfully submitted to Otto Group Product Classification Challenge
RandomForest
100%|██████████████████████████████████████| 8.48M/8.48M [00:06<00:00, 1.30MB/s]
Successfully submitted to Otto Group Product Classification Challenge
mljarsupervised_compete
100%|██████████████████████████████████████| 24.9M/24.9M [00:18<00:00, 1.38MB/s]
Successfully submitted to Otto Group Product Classification Challenge
mljarsupervised
100%|██████████████████████████████████████| 25.7M/25.7M [00:19<00:00, 1.37MB/s]
Successfully submitted to Otto Group Product Classification Challenge
hyperoptsklearn
100%|██████████████████████████████████████| 5.81M/5.81M [00:04<00:00, 1.29MB/s]
Successfully submitted to Otto Group Product Classification Challenge
H

In [55]:
!kaggle competitions submissions -q otto-group-product-classification-challenge --csv > {dataset}/submissions/table.csv

### Takeaways

- Amazon's AutoGluon with the extra setting for best quality performs best, achieving top 1% on the leaderboard.
- mljar-supervised with the compete setting is second best. Despite the name, it is a Python package, not a JVM framework.
- The two auto-sklearn versions take third and fourth place, with the newer 2.0 version performing slightly better than the original.
- Microsoft's FLAML and H2O's AutoML round out the top six.

In [56]:
pd.read_csv(f"{dataset}/submissions/table.csv").head(len(mems)).sort_values(by='privateScore')

Unnamed: 0,fileName,date,description,status,publicScore,privateScore
1,otto-v1-AutoGluon_bestquality.csv,2021-08-06 14:11:22,otto-group/submissions/otto-v1-AutoGluon_bestq...,complete,0.40223,0.40443
10,otto-v1-mljarsupervised_compete.csv,2021-08-06 14:08:33,otto-group/submissions/otto-v1-mljarsupervised...,complete,0.42885,0.42976
3,otto-v1-autosklearn2.csv,2021-08-06 14:10:37,otto-group/submissions/otto-v1-autosklearn2.csv,complete,0.44334,0.44535
2,otto-v1-autosklearn.csv,2021-08-06 14:10:59,otto-group/submissions/otto-v1-autosklearn.csv,complete,0.44757,0.44985
5,otto-v1-flaml.csv,2021-08-06 14:10:06,otto-group/submissions/otto-v1-flaml.csv,complete,0.45448,0.45521
7,otto-v1-H2OAutoML.csv,2021-08-06 14:09:27,otto-group/submissions/otto-v1-H2OAutoML.csv,complete,0.45529,0.45536
9,otto-v1-mljarsupervised.csv,2021-08-06 14:08:57,otto-group/submissions/otto-v1-mljarsupervised...,complete,0.46939,0.47
6,otto-v1-GAMA.csv,2021-08-06 14:09:44,otto-group/submissions/otto-v1-GAMA.csv,complete,0.54496,0.54814
11,otto-v1-RandomForest.csv,2021-08-06 14:08:10,otto-group/submissions/otto-v1-RandomForest.csv,complete,0.55121,0.55345
13,otto-v1-TunedRandomForest.csv,2021-08-06 14:07:43,otto-group/submissions/otto-v1-TunedRandomFore...,complete,0.56053,0.56354


### Cleanup

In [57]:
# change False to True to delete everything created in this notebook
if False:
    ## delete all datasets
    for idx in client.list_datasets().index:
        print(client.delete_dataset(idx))

    ## delete all framework runs
    for idx in client.list_framework_runs().index:
        print(client.delete_framework_run(idx))        