# TabZilla Metadataset Tutorial

This notebook demonstrates how to analyze our experimental results, including some of the results from our paper.

### First Things First

1. Please download the TabZilla results dataset `metadataset_clean.csv`, and the dataset meta-features `metafeatures_clean.csv` from our Google Drive folder [here](https://drive.google.com/drive/folders/1cHisTmruPHDCYVOYnaqvTdybLngMkB8R?usp=sharing), and place them in the same directory as this notebook.
2. You need to run this notebook in a Python (3.11+) environment with `pandas` installed.

### Read the datasets

In [1]:
import pandas as pd

metadataset_df = pd.read_csv("./metadataset_clean.csv")
metafeatures_df = pd.read_csv("./metafeatures_clean.csv")

# 1. Explore our experiment results (`metadataset_clean.csv`)

The most important columns in this dataset are:
- `dataset_fold_id`: the name of the "dataset fold". Each dataset is split into 10 train/test/validation splits for these experiments.
- `dataset_name`: the name of the dataset, not including the fold.
- `alg_name`: the name of the algorithm.
- `hparam_source`: the set of hyperparameters used with the algorithm.
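
As a quick orientation, the key columns listed above can be summarized with a few `pandas` calls. The sketch below is illustrative only: the helper name `summarize_key_columns`, the toy rows, and the algorithm name `ExampleAlg` are assumptions made so that the snippet runs without the download. Passing `metadataset_df` instead of `toy_df` should give the corresponding counts for the real results.

```python
import pandas as pd


def summarize_key_columns(df: pd.DataFrame) -> dict:
    """Count the distinct values in each of the key identifier columns."""
    key_columns = ["dataset_fold_id", "dataset_name", "alg_name", "hparam_source"]
    return {col: int(df[col].nunique()) for col in key_columns}


# Illustrative stand-in rows (not real results); "ExampleAlg" is a made-up name.
toy_df = pd.DataFrame(
    {
        "dataset_fold_id": [
            "openml__wdbc__9946__fold_0",
            "openml__wdbc__9946__fold_1",
            "openml__wdbc__9946__fold_0",
        ],
        "dataset_name": ["openml__wdbc__9946"] * 3,
        "alg_name": ["CatBoost", "CatBoost", "ExampleAlg"],
        "hparam_source": ["default", "default", "default"],
    }
)

summary = summarize_key_columns(toy_df)
print(summary)
```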

Each row contains results for a single algorithm trained on the training set (80% of the full dataset) and then evaluated on both the validation and test sets (10% each).

This file includes the following metrics, each reported for the three splits (train, test, and validation):
- Log Loss
- AUC
- Accuracy
- F1 Score
- runtime ("time")

These columns follow the naming convention `{metric}__{split}`. For example, the column `Log Loss__val` is the Log Loss calculated on the validation set, and `time__test` is the runtime to evaluate the test set.
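
The naming convention can be handled programmatically, which is convenient when selecting all columns for one split. This is a minimal sketch: the helper names `metric_column` and `parse_metric_column` are hypothetical, and only column names quoted in this tutorial are used.

```python
def metric_column(metric: str, split: str) -> str:
    """Build a column name following the "{metric}__{split}" convention."""
    return f"{metric}__{split}"


def parse_metric_column(column: str) -> tuple[str, str]:
    """Split a column name into (metric, split) at the last double underscore."""
    metric, split = column.rsplit("__", 1)
    return metric, split


# Column names quoted in this tutorial.
example_columns = ["Log Loss__val", "time__test", "Accuracy__test"]

# Keep only the columns measured on the test split.
test_columns = [c for c in example_columns if parse_metric_column(c)[1] == "test"]

print(metric_column("Log Loss", "val"))
print(parse_metric_column("time__test"))
print(test_columns)
```

Note that `parse_metric_column` raises a `ValueError` for columns that contain no double underscore (for example `alg_name`), so apply it only to metric columns.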

For example, here are the test accuracy and training time results for CatBoost using default hyperparameters, for all folds of the dataset `openml__wdbc__9946`:

In [4]:
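# Select the CatBoost runs with default hyperparameters on one dataset,
# keeping only the identifying columns and two result columns.
metadataset_df.loc[
    (metadataset_df["alg_name"] == "CatBoost")
    & (metadataset_df["hparam_source"] == "default")
    & (metadataset_df["dataset_name"] == "openml__wdbc__9946"),
    [
        "dataset_fold_id",
        "alg_name",
        "hparam_source",
        "Accuracy__test",
        "training_time",
    ],
]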

| | dataset_fold_id | alg_name | hparam_source | Accuracy__test | training_time |
|---|---|---|---|---|---|
| 965990 | openml__wdbc__9946__fold_0 | CatBoost | default | 0.982456 | 4.462216 |
| 966623 | openml__wdbc__9946__fold_1 | CatBoost | default | 0.947368 | 0.772709 |
| 967256 | openml__wdbc__9946__fold_2 | CatBoost | default | 0.947368 | 0.798852 |
| 967889 | openml__wdbc__9946__fold_3 | CatBoost | default | 0.929825 | 1.079144 |
| 968522 | openml__wdbc__9946__fold_4 | CatBoost | default | 1.0 | 1.082342 |
| 969155 | openml__wdbc__9946__fold_5 | CatBoost | default | 0.982456 | 0.644586 |
| 969788 | openml__wdbc__9946__fold_6 | CatBoost | default | 0.964912 | 0.8657 |
| 970421 | openml__wdbc__9946__fold_7 | CatBoost | default | 0.964912 | 1.075601 |
| 971054 | openml__wdbc__9946__fold_8 | CatBoost | default | 0.947368 | 1.08744 |
| 971687 | openml__wdbc__9946__fold_9 | CatBoost | default | 0.982143 | 1.073569 |
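
Per-fold results like these are often summarized across folds. The sketch below is one possible way to do that, not necessarily the aggregation used in our paper: it rebuilds the ten rows shown above as a small DataFrame so it runs on its own, then computes the mean and standard deviation of the test accuracy and the mean training time. The output column names (`mean_accuracy`, `std_accuracy`, `mean_training_time`) are chosen here for illustration.

```python
import pandas as pd

# The ten fold-level rows shown above, re-entered by hand so the example is self-contained.
fold_results = pd.DataFrame(
    {
        "dataset_fold_id": [f"openml__wdbc__9946__fold_{i}" for i in range(10)],
        "alg_name": "CatBoost",
        "hparam_source": "default",
        "Accuracy__test": [
            0.982456, 0.947368, 0.947368, 0.929825, 1.0,
            0.982456, 0.964912, 0.964912, 0.947368, 0.982143,
        ],
        "training_time": [
            4.462216, 0.772709, 0.798852, 1.079144, 1.082342,
            0.644586, 0.8657, 1.075601, 1.08744, 1.073569,
        ],
    }
)

# One summary row per (algorithm, hyperparameter source) pair.
summary = (
    fold_results.groupby(["alg_name", "hparam_source"])
    .agg(
        mean_accuracy=("Accuracy__test", "mean"),
        std_accuracy=("Accuracy__test", "std"),
        mean_training_time=("training_time", "mean"),
    )
    .reset_index()
)
print(summary)
```

The same `groupby` call should work on the filtered slice of `metadataset_df` from the cell above; adding `dataset_name` to the grouping keys would give one row per dataset.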
