# Task A: Creating a Performance Predictor

In this task, you will use training data from 2000 configurations on a single OpenML dataset to train a performance predictor. You should split the data into training, test and validation sets and use only the first 10 epochs of the learning curves when making predictions. You are provided with the full benchmark logs for Fashion-MNIST (learning curves, config parameters and gradient statistics), and you can use them freely.

Note: This notebook is meant to show how to use the API. You can choose which data you use for your predictions and should create your own data loading and splits; however, you are free to use code from here.

## Specifications:

* Data: fashion_mnist.json
* Number of datasets: 1
* Number of configurations: 2000
* Number of epochs seen during prediction: 10
* Available data: Learning curves, architecture parameters and hyperparameters, gradient statistics
* Target: Final validation accuracy
* Evaluation metric: MSE
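
For reference, the evaluation metric can be written out directly. This is a minimal sketch: `mse` is a hypothetical helper (the notebook itself uses scikit-learn's `mean_squared_error` further down), and the numbers are made up for illustration.

```python
import numpy as np

def mse(y_true, y_pred):
    """Mean squared error between true and predicted final accuracies."""
    y_true = np.asarray(y_true, dtype=float)
    y_pred = np.asarray(y_pred, dtype=float)
    return float(np.mean((y_true - y_pred) ** 2))

# Made-up values: errors of 2 and 3 give (2**2 + 3**2) / 2 = 6.5
example_mse = mse([80.0, 90.0], [78.0, 93.0])
print(example_mse)
```

Because the errors are squared, a few badly mispredicted configurations can dominate the score.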

## Importing and splitting data

Note: There are 51 steps logged: 50 epochs plus the 0th epoch, which is recorded prior to any weight updates.
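
Because step 0 is logged before training starts, the first 10 epochs correspond to the first 11 logged values. A minimal sketch of that indexing, using a made-up curve (`curve` and the two constants are illustrative and not part of the benchmark API):

```python
import numpy as np

N_LOGGED_STEPS = 51      # epoch 0 (before any weight update) plus epochs 1..50
N_OBSERVED_EPOCHS = 10   # epochs the predictor is allowed to see

# Hypothetical stand-in for one logged validation-accuracy curve.
curve = np.linspace(10.0, 90.0, N_LOGGED_STEPS)

# Indices 0..10: the untrained network plus the first 10 training epochs.
observed = curve[: N_OBSERVED_EPOCHS + 1]

# The prediction target is the value logged after the last epoch.
target = curve[-1]

print(observed.shape, target)
```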

In [1]:
%%capture
%cd ..
import numpy as np
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

from api import Benchmark

In [2]:
bench_dir = "cached/fashion_mnist.json"
bench = Benchmark(bench_dir, cache=False)

==> Loading data...
==> No cached data found or cache set to False.
==> Reading json data...
==> Done.


In [3]:
# Read data
dataset_names = bench.get_dataset_names()
n_configs = bench.get_number_of_configs(dataset_names[0])

# One validation-accuracy learning curve and one config entry per (dataset, config) pair
data = [bench.query(dataset_name=dn, tag="Train/val_accuracy", config_id=ind) for dn in dataset_names for ind in range(n_configs)]
configs = [bench.query(dataset_name=dn, tag="config", config_id=ind) for dn in dataset_names for ind in range(n_configs)]

# Inputs: all logged steps except the last; target: the final validation accuracy
X = np.array([curve[:-1] for curve in data])
y = np.array([curve[-1] for curve in data])
configs = np.array(configs)

In [4]:
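# Create train, test and validation split
class TrainTestValSplitter():
    """Random 70/30 train/test split, then a 70/30 train/validation split of the training part."""
    
    def __init__(self):
        self.ind_train, self.ind_test = train_test_split(np.arange(len(X)), test_size=0.3)
        # Sub-indices are positions within ind_train, not indices into X
        self.subind_train, self.subind_val = train_test_split(np.arange(len(self.ind_train)), test_size=0.3)
        
    def split(self, a):
        """Return the (train, test, validation) parts of array a."""
        return a[self.ind_train][self.subind_train], a[self.ind_test], a[self.ind_train][self.subind_val]
    
    def cut(self, a, outlength=11):
        """Truncate each curve to its first outlength steps (by default epoch 0 plus 10 epochs)."""
        return np.array([curve[:outlength] for curve in a])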
    
ttv_splitter = TrainTestValSplitter()

X_train, X_test, X_val = ttv_splitter.split(X)
y_train, y_test, y_val = ttv_splitter.split(y)
configs_train, configs_test, configs_val = ttv_splitter.split(configs)

X_test, X_val = ttv_splitter.cut(X_test), ttv_splitter.cut(X_val)

print("X_train:", X_train.shape)
print("X_test:", X_test.shape)
print("X_val:", X_val.shape)

X_train: (980, 51)
X_test: (600, 11)
X_val: (420, 11)
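
The splitter above draws a new random split on every run, so the exact sets, and any scores computed on them, change between runs. One way to make the split repeatable is to fix a seed. The following is a minimal sketch using only NumPy; `split_indices` and its parameters are hypothetical names, not part of the provided API.

```python
import numpy as np

def split_indices(n_samples, test_frac=0.3, val_frac=0.3, seed=0):
    """Return disjoint (train, test, val) index arrays.

    val_frac is applied to what remains after the test set is removed,
    mirroring the nested split used in the notebook.
    """
    rng = np.random.default_rng(seed)
    perm = rng.permutation(n_samples)
    n_test = int(round(test_frac * n_samples))
    n_val = int(round(val_frac * (n_samples - n_test)))
    ind_test = perm[:n_test]
    ind_val = perm[n_test:n_test + n_val]
    ind_train = perm[n_test + n_val:]
    return ind_train, ind_test, ind_val

ind_train, ind_test, ind_val = split_indices(2000)
print(len(ind_train), len(ind_test), len(ind_val))
```

With 2000 configurations this yields the same 980/600/420 sizes as above. Passing `random_state` to `train_test_split` is an alternative that leaves the original class otherwise unchanged.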


## A simple baseline

In [5]:
class SimpleLearningCurvePredictor():
    """A learning curve predictor that predicts the last observed epoch as final performance"""
    
    def __init__(self):
        pass
        
    def fit(self, X, y):
        pass
    
    def predict(self, X):
        predictions = []
        for curve in X:
            predictions.append(curve[-1])
        return predictions
    
def score(y_true, y_pred):
    return mean_squared_error(y_true, y_pred)

In [6]:
predictor = SimpleLearningCurvePredictor()
preds = predictor.predict(X_val)
mse = score(y_val, preds)
print(mse)

33.580107630975185
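
The last-value baseline ignores the shape of the curve. As one possible next step (not a reference solution), a least-squares linear map from the 11 observed values to the final accuracy can be fitted. The sketch below uses synthetic saturating curves so that it runs without the benchmark; `LinearCurvePredictor` and the synthetic data are assumptions made purely for illustration.

```python
import numpy as np

class LinearCurvePredictor:
    """Least-squares linear map from an observed curve prefix to the final value."""

    def __init__(self, n_observed=11):
        self.n_observed = n_observed
        self.coef_ = None

    def _features(self, X):
        # Keep only the observed prefix and append a bias column.
        X = np.asarray(X, dtype=float)[:, : self.n_observed]
        return np.hstack([X, np.ones((len(X), 1))])

    def fit(self, X, y):
        A = self._features(X)
        self.coef_, *_ = np.linalg.lstsq(A, np.asarray(y, dtype=float), rcond=None)
        return self

    def predict(self, X):
        return self._features(X) @ self.coef_

# Synthetic saturating curves, used only so the sketch runs without the benchmark.
rng = np.random.default_rng(0)
epochs = np.arange(51)
plateau = rng.uniform(60.0, 95.0, size=(500, 1))
rate = rng.uniform(0.05, 0.3, size=(500, 1))
curves = plateau * (1.0 - np.exp(-rate * epochs)) + rng.normal(0.0, 0.5, size=(500, 51))

X_demo, y_demo = curves[:, :-1], curves[:, -1]
model = LinearCurvePredictor().fit(X_demo[:400], y_demo[:400])

# Compare against the last-observed-value baseline on the held-out synthetic curves.
mse_linear = np.mean((model.predict(X_demo[400:]) - y_demo[400:]) ** 2)
mse_last = np.mean((X_demo[400:, 10] - y_demo[400:]) ** 2)
print(mse_linear, mse_last)
```

With the real data, `fit` would be called on `X_train, y_train` and `predict` on `X_val`; the class truncates its inputs to the first `n_observed` steps itself, so full-length training curves can be passed in directly. Results on the synthetic curves say nothing about how the model performs on the benchmark.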
