# Task B: Meta-Learning Performance Prediction

In this task, you will use training parameters and metadata from multiple OpenML datasets to train a performance predictor that performs well even on unseen datasets. You are provided with configuration parameters and metafeatures for six datasets. The datasets are split into training and test datasets, and you should train only on the training datasets.

For questions, you can contact zimmerl@informatik.uni-freiburg.de

__Note: Please use the dataloading and splits you are provided with in this notebook.__

## Specifications:

* Data: six_datasets_lw.json
* Number of datasets: 6
* Training datasets: higgs, vehicle, adult, volkert
* Test datasets: Fashion-MNIST, jasmine
* Number of configurations: 2000
* Available data: architecture parameters and hyperparameters, metafeatures 
* Target: final validation accuracy
* Evaluation metric: MSE
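
As a reminder of what the evaluation metric measures, here is a minimal sketch of the mean squared error between predicted and true final validation accuracies. The helper name `mean_squared_error` and the numbers are made up for illustration only; in the notebook itself the loss is computed with `nn.MSELoss`.

```python
def mean_squared_error(y_true, y_pred):
    """Average of the squared differences between targets and predictions."""
    if len(y_true) != len(y_pred):
        raise ValueError("y_true and y_pred must have the same length")
    squared_errors = [(t - p) ** 2 for t, p in zip(y_true, y_pred)]
    return sum(squared_errors) / len(squared_errors)


# Hypothetical accuracies, not taken from the provided data
example_targets = [0.80, 0.65, 0.90]
example_predictions = [0.75, 0.70, 0.90]

example_mse = mean_squared_error(example_targets, example_predictions)
print("MSE:", example_mse)
```

Because the errors are squared, a few badly mispredicted configurations weigh more heavily than many small deviations.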

## Importing

Note: 51 steps are logged: the 0th epoch, recorded before any weight updates, followed by 50 training epochs.

In [None]:
%%capture
%cd ..
#external
import numpy as np
import json
import sys
import matplotlib.pyplot as plt

#pytorch
import torch
import torch.optim as optim
import torch.nn as nn
from torch.utils.data import TensorDataset, DataLoader

#local
sys.path.append("../")
from func.networks.FNN_meta_WO_HPO_5Layer import FNN_meta_WO_HPO_5Layer
from func.train_eval import train_model, eval_model
from func.load_data import prepare_dataloaders, load_data_from_file
from func.number_neurons import number_neurons, Capturing

In [None]:
with Capturing() as output:
    number_neurons(0, 100, 5, 5)
print(output)

## CUDA config

In [None]:
torch.cuda.empty_cache()

## Load data

In [None]:
(
    X_train, X_metafeatures_train, y_train,
    X_val, X_metafeatures_val, y_val,
    X_test, X_metafeatures_test, y_test,
) = load_data_from_file("cached/six_datasets_lw.json", "cached/metafeatures_6.json")

In [None]:
print("X_train:", X_train.shape)
print("X_val:", X_val.shape)
print("X_test:", X_test.shape)
print()
print("y_train:", y_train.shape)
print("y_val:", y_val.shape)
print("y_test:", y_test.shape)
print()
print("X_metafeatures_train:", X_metafeatures_train.shape)
print("X_metafeatures_val:", X_metafeatures_val.shape)
print("X_metafeatures_test:", X_metafeatures_test.shape)

## Prepare data
### Preprocess the data and create dataloaders for the preprocessed tensors

In [None]:
batch_size = 20

train_dataloader = prepare_dataloaders(X_hp=X_train, X_mf=X_metafeatures_train, y=y_train, scaling="minmax", batch_size=batch_size)
validation_dataloader = prepare_dataloaders(X_hp=X_val, X_mf=X_metafeatures_val, y=y_val, scaling="minmax", batch_size=batch_size)
test_dataloader = prepare_dataloaders(X_hp=X_test, X_mf=X_metafeatures_test, y=y_test, scaling="minmax", batch_size=batch_size)
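
To make the `scaling="minmax"` option concrete, the sketch below shows column-wise min-max scaling on a tiny hand-made table. It is an illustration of the general technique, not the code inside `prepare_dataloaders`, whose internals are not shown in this notebook. The helper names `fit_min_max` and `apply_min_max` are hypothetical, and reusing the training statistics for other splits is one common convention, assumed here rather than confirmed.

```python
def fit_min_max(rows):
    """Return the per-column minima and maxima of a list of feature rows."""
    columns = list(zip(*rows))
    return [min(col) for col in columns], [max(col) for col in columns]


def apply_min_max(rows, mins, maxs):
    """Scale each column with (x - min) / (max - min); constant columns map to 0."""
    scaled = []
    for row in rows:
        scaled.append([
            (value - lo) / (hi - lo) if hi > lo else 0.0
            for value, lo, hi in zip(row, mins, maxs)
        ])
    return scaled


# Hypothetical feature rows: [learning_rate, number_of_units]
toy_train = [[0.001, 64.0], [0.010, 128.0], [0.100, 512.0]]
toy_val = [[0.050, 256.0]]

mins, maxs = fit_min_max(toy_train)
scaled_train = apply_min_max(toy_train, mins, maxs)
scaled_val = apply_min_max(toy_val, mins, maxs)  # reuses the training statistics

print(scaled_train)
print(scaled_val)
```

Scaling every split with the same statistics keeps a given raw value mapped to the same scaled value across training, validation, and test data.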

## Check the data in the tensors

In [None]:
for x,  y in train_dataloader:
    print("X- minibatched: ", x)
    print("y- minibatched: ", y)
    break

## Training and scoring

In [None]:
model = FNN_meta_WO_HPO_5Layer(55, 1)
print("Model:")
print(model)

In [None]:
epochs = 70
optimizer  = optim.Adam(model.parameters())
scheduler = optim.lr_scheduler.CosineAnnealingWarmRestarts(optimizer, T_0=50)
criterion = nn.MSELoss()
train_model(train_dataloader, validation_dataloader, epochs, model, optimizer, scheduler, criterion)
#eval_model(test_dataloader, model, criterion)
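
With `T_0=50` and `epochs = 70`, the learning rate follows one full cosine cycle and then restarts. The sketch below reproduces the documented cosine annealing formula in plain Python to show where that restart falls. It assumes the scheduler is stepped once per epoch inside `train_model` (not shown here), that `T_mult` and `eta_min` keep their defaults, and that Adam uses its default learning rate of `1e-3`. The function name `cosine_warm_restart_lr` is hypothetical.

```python
import math


def cosine_warm_restart_lr(epoch, base_lr=1e-3, t_0=50, eta_min=0.0):
    """Learning rate at a given epoch for a cosine schedule restarting every t_0 epochs."""
    t_cur = epoch % t_0  # epochs elapsed since the last restart
    return eta_min + 0.5 * (base_lr - eta_min) * (1 + math.cos(math.pi * t_cur / t_0))


schedule = [cosine_warm_restart_lr(epoch) for epoch in range(70)]

# Inspect the start, the midpoint of the first cycle, and the restart at epoch 50
for epoch in (0, 25, 49, 50, 69):
    print(f"epoch {epoch:2d}: lr = {schedule[epoch]:.6f}")
```

Under these assumptions the last 20 epochs run in a fresh cycle that starts again at the full learning rate, so training ends partway through the second decay rather than at its minimum.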

## Custom neural network tuned with BOHB

In [None]:
from networks.NN_HPO import PyTorchWorker

In [None]:
import os

working_dir = os.curdir
# minimum budget that BOHB uses
min_budget = 1
# largest budget BOHB will use
max_budget = 9
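
To see which budgets lie between `min_budget` and `max_budget`, the sketch below builds the geometric ladder that successive-halving style methods such as BOHB typically use. It assumes a reduction factor of 3 (`eta`), which is an assumption because the optimizer call is not shown in this notebook; the helper name `budget_ladder` is hypothetical. What one unit of budget means (for example, a number of epochs) depends on `PyTorchWorker` and is not shown here.

```python
def budget_ladder(min_budget, max_budget, eta=3):
    """Divide max_budget by eta repeatedly while the result stays >= min_budget."""
    budgets = []
    budget = float(max_budget)
    while budget >= min_budget:
        budgets.append(budget)
        budget /= eta
    return budgets[::-1]  # smallest budget first


ladder = budget_ladder(min_budget=1, max_budget=9)
print(ladder)  # [1.0, 3.0, 9.0]
```

Under this assumption, many configurations are evaluated at the smallest budget and only the most promising ones are promoted to the larger budgets.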

In [None]:
worker = PyTorchWorker(run_id='0', input_size=55, output_size=1, train_loader=train_dataloader, validation_loader=validation_dataloader, test_loader=test_dataloader)
cs = worker.get_configspace()

config = cs.sample_configuration().get_dictionary()
print(config)

res = worker.compute(config=config, budget=min_budget, working_directory=working_dir)
print(res)