# Using CAVE with AutoPyTorch

AutoPyTorch aims to provide a framework for automated neural network configuration. It currently supports [BOHB](https://github.com/automl/HpBandSter) for hyperparameter search.
CAVE integrates AutoPyTorch and builds on its functionality to provide further insights and visualizations.
This notebook provides an example pipeline for using CAVE with AutoPyTorch.

First, we generate some AutoPyTorch output. You can substitute your own AutoPyTorch routine here; we use an OpenML task, following [AutoPyTorch's tutorial notebook](https://github.com/automl/Auto-PyTorch/blob/master/examples/basics/Auto-PyTorch%20Tutorial.ipynb).

In [1]:
import shutil
log_dir = "logs/apt-cave-notebook/"
shutil.rmtree(log_dir, ignore_errors=True)

In [2]:
from autoPyTorch import AutoNetClassification
import pandas as pd
import numpy as np
import os
import openml
import json
from ConfigSpace.read_and_write import json as pcs_json
# Logging
from autoPyTorch.components.metrics.additional_logs import *
from autoPyTorch.pipeline.nodes import LogFunctionsSelector

autopytorch = AutoNetClassification(config_preset="medium_cs",
                                    result_logger_dir=log_dir,
                                    #log_every_n_datapoints=10,
                                    additional_logs=[test_result.__name__,
                                                     test_cross_entropy.__name__,
                                                     test_balanced_accuracy.__name__],
                                   )

# Get data from the openml task "Supervised Classification on credit-g (https://www.openml.org/t/31)"
task = openml.tasks.get_task(task_id=31)
X, y = task.get_X_and_y()
ind_train, ind_test = task.get_train_test_split_indices()
X_train, Y_train = X[ind_train], y[ind_train]
X_test, Y_test = X[ind_test], y[ind_test]

In [3]:
# Equip autopytorch with additional logs
gl = GradientLogger()
lw_gl = LayerWiseGradientLogger()
additional_logs = [gradient_max(gl), gradient_mean(gl), gradient_median(gl), gradient_std(gl),
                   gradient_q10(gl), gradient_q25(gl), gradient_q75(gl), gradient_q90(gl),
                   layer_wise_gradient_max(lw_gl), layer_wise_gradient_mean(lw_gl),
                   layer_wise_gradient_median(lw_gl), layer_wise_gradient_std(lw_gl),
                   layer_wise_gradient_q10(lw_gl), layer_wise_gradient_q25(lw_gl),
                   layer_wise_gradient_q75(lw_gl), layer_wise_gradient_q90(lw_gl),
                   gradient_norm()]

for additional_log in additional_logs:
    autopytorch.pipeline[LogFunctionsSelector.get_name()].add_log_function(name=type(additional_log).__name__,
                                                                       log_function=additional_log)

    #sampling_space["additional_logs"].append(type(additional_log).__name__)

autopytorch.pipeline[LogFunctionsSelector.get_name()].add_log_function(name=test_result.__name__, 
                                                                   log_function=test_result(autopytorch, X[ind_test], y[ind_test]))
autopytorch.pipeline[LogFunctionsSelector.get_name()].add_log_function(name=test_cross_entropy.__name__,
                                                                   log_function=test_cross_entropy(autopytorch, X[ind_test], y[ind_test]))
autopytorch.pipeline[LogFunctionsSelector.get_name()].add_log_function(name=test_balanced_accuracy.__name__,
                                                                   log_function=test_balanced_accuracy(autopytorch, X[ind_test], y[ind_test]))
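The gradient loggers registered above are named after summary statistics (max, mean, median, std and the 10/25/75/90 percent quantiles). As a rough illustration of what such a summary looks like, here is a minimal sketch in plain Python. It does not reproduce AutoPyTorch's implementation; `percentile` and `summarize_gradients` are hypothetical helpers, and the linear interpolation between neighbouring values is an assumption.

```python
import statistics


def percentile(sorted_values, q):
    """Linearly interpolated percentile (q in [0, 100]) of a pre-sorted list."""
    if not sorted_values:
        raise ValueError("need at least one value")
    position = (len(sorted_values) - 1) * q / 100.0
    lower = int(position)
    upper = min(lower + 1, len(sorted_values) - 1)
    fraction = position - lower
    return sorted_values[lower] * (1 - fraction) + sorted_values[upper] * fraction


def summarize_gradients(gradient_values):
    """Collapse a flat list of gradient entries into one scalar per statistic.

    Hypothetical helper: mirrors the *names* of the loggers above,
    not their actual code.
    """
    ordered = sorted(gradient_values)
    return {
        "max": ordered[-1],
        "mean": statistics.mean(ordered),
        "median": statistics.median(ordered),
        "std": statistics.pstdev(ordered),
        "q10": percentile(ordered, 10),
        "q25": percentile(ordered, 25),
        "q75": percentile(ordered, 75),
        "q90": percentile(ordered, 90),
    }


if __name__ == "__main__":
    # Toy gradient entries standing in for the flattened gradients of a network.
    for name, value in summarize_gradients([0.0, 1.0, 2.0, 3.0, 4.0]).items():
        print(name, value)
```

Logging one scalar per statistic and epoch keeps the log files small while still showing whether gradients vanish or explode during training.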


In [4]:
# Fit to find an incumbent configuration with BOHB
results_fit = autopytorch.fit(X_train=X_train,
                              Y_train=Y_train,
                              validation_split=0.3,
                              max_runtime=500,
                              min_budget=10,
                              max_budget=100,
                              refit=True,
                             )

Process pynisher function call:
Traceback (most recent call last):
  File "/usr/lib/python3.6/multiprocessing/process.py", line 258, in _bootstrap
    self.run()
  File "/usr/lib/python3.6/multiprocessing/process.py", line 93, in run
    self._target(*self._args, **self._kwargs)
  File "/home/shuki/VirtualEnvs/CAVE_dev/lib/python3.6/site-packages/pynisher/limit_function_call.py", line 93, in subprocess_func
    return_value = ((func(*args, **kwargs), 0))
  File "/home/shuki/Repos/Auto-PyTorch/autoPyTorch/core/worker.py", line 124, in optimize_pipeline
    raise e
  File "/home/shuki/Repos/Auto-PyTorch/autoPyTorch/core/worker.py", line 118, in optimize_pipeline
    refit=False, rescore=False, hyperparameter_config_id=config_id, dataset_info=self.dataset_info)
  File "/home/shuki/Repos/Auto-PyTorch/autoPyTorch/pipeline/base/pipeline.py", line 60, in fit_pipeline
    return self.root.fit_traverse(**kwargs)
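The `min_budget` and `max_budget` arguments passed to `fit` above bound the budgets on which BOHB evaluates configurations. The sketch below shows how a Hyperband-style schedule spaces budgets geometrically between the two. It assumes a reduction factor `eta=3` (the notebook does not set it) and follows our understanding of the schedule used by HpBandSter; `successive_halving_budgets` is a hypothetical helper, not part of either library.

```python
import math


def successive_halving_budgets(min_budget, max_budget, eta=3):
    """Return the geometric budget ladder of a Hyperband-style schedule.

    Assumption: budgets are max_budget * eta**(-k) for k = num_rungs-1, ..., 0,
    where num_rungs is the number of such steps that fit above min_budget.
    """
    if not 0 < min_budget <= max_budget:
        raise ValueError("need 0 < min_budget <= max_budget")
    num_rungs = -int(math.log(min_budget / max_budget) / math.log(eta)) + 1
    return [max_budget * eta ** -(num_rungs - 1 - i) for i in range(num_rungs)]


if __name__ == "__main__":
    # Values taken from the fit call in this notebook.
    print([round(b, 2) for b in successive_halving_budgets(10, 100)])
```

Under these assumptions the lowest budget actually used is about 11.1 rather than exactly `min_budget`, because the ladder is anchored at `max_budget` and divided by `eta` at each step.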
 

In [5]:
# Save fit results as json
with open(os.path.join(log_dir, "results_fit.json"), "w") as f:
    json.dump(results_fit, f, indent=2)
    
# Also save the additional information CAVE needs (this could be migrated to CAVE or, preferably, to AutoPyTorch)
with open(os.path.join(log_dir, 'configspace.json'), 'w') as f:
    f.write(pcs_json.write(autopytorch.get_hyperparameter_search_space(X_train=X_train,
                                                                   Y_train=Y_train)))
with open(os.path.join(log_dir, 'autonet_config.json'), 'w') as f:
    json.dump(autopytorch.get_current_autonet_config(), f, indent=2)
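To check what was written, the three JSON files can be read back from the log directory. This is a minimal sketch that only assumes the file names used in the cell above; `load_run_artifacts` is a hypothetical helper and not part of CAVE or AutoPyTorch.

```python
import json
import os


def load_run_artifacts(run_dir):
    """Load the JSON files written next to the AutoPyTorch logs.

    Returns a dict mapping the file stem to the parsed JSON content.
    """
    names = ("results_fit", "configspace", "autonet_config")
    artifacts = {}
    for name in names:
        path = os.path.join(run_dir, name + ".json")
        with open(path) as f:
            artifacts[name] = json.load(f)
    return artifacts


if __name__ == "__main__":
    run_dir = "logs/apt-cave-notebook/"
    if os.path.isdir(run_dir):
        artifacts = load_run_artifacts(run_dir)
        # Print the top-level keys of each file to get a quick overview.
        for name, content in artifacts.items():
            print(name, sorted(content) if isinstance(content, dict) else type(content))
```

Reading the files back before starting CAVE is a cheap way to confirm that the run finished and that the dumps are valid JSON.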
    


We can then spin up CAVE and hand it the output as well as the autonet instance. That way, CAVE can refit the incumbents and we can investigate the evolution of the network more closely.

In [6]:
from cave.cavefacade import CAVE

cave_output_dir = "cave_output"

autopytorch.update_autonet_config(autonet_config={'result_logger_dir': cave_output_dir})

# The information in the autonet bundle eventually needs to be logged and loaded (or all necessary logging reliably triggered in AutoPyTorch itself)
autonet_bundle = {'autopytorch': autopytorch,
                  'X_train': X_train,
                  'Y_train': Y_train,
                 }

cave = CAVE([log_dir],        # List of folders holding results
            cave_output_dir,  # Output directory
            ['.'],            # Target Algorithm Directory (only relevant for SMAC)
            file_format="APT",
            autopytorch=autonet_bundle,
            verbose="DEBUG")

09:48:36 Getting attr __spec__ of LazyModule instance of emcee
09:48:36 Getting attr Kernel of LazyModule instance of skopt.learning.gaussian_process.kernels
09:48:36 Getting attr __name__ of LazyModule instance of skopt.learning.gaussian_process.kernels
09:48:36 Getting attr GaussianProcessRegressor of LazyModule instance of skopt.learning.gaussian_process
09:48:36 Getting attr __name__ of LazyModule instance of skopt.learning.gaussian_process
09:48:36 Getting attr Kernel of LazyModule instance of skopt.learning.gaussian_process.kernels
09:48:36 Getting attr __name__ of LazyModule instance of skopt.learning.gaussian_process.kernels
09:48:36 Getting attr GaussianProcessRegressor of LazyModule instance of skopt.learning.gaussian_process
09:48:36 Getting attr __name__ of LazyModule instance of skopt.learning.gaussian_process
09:48:36 Loaded backend agg version unknown.


Q: Should CAVE even receive an autonet instance? Is all relevant information saved with the info about the autonet instance? It would be nicer to simply have some sort of scenario file (which is partly or mostly covered by the results dump).

In [7]:
cave.apt_overview()

| Parameter | Value |
|---|---|
| embeddings | [none] |
| lr_scheduler | [cosine_annealing, plateau] |
| networks | [shapedresnet] |
| over_sampling_methods | [smote] |
| preprocessors | [none, truncated_svd, power_transformer] |
| target_size_strategies | [none, upsample, median] |
| result_logger_dir | logs/apt-cave-notebook/ |
| additional_logs | [test_result, test_cross_entropy, test_balanced_accuracy] |
| validation_split | 0.3 |
| max_runtime | 500 |


<cave.analyzer.apt.apt_overview.APTOverview at 0x7f70302d6828>

Other analyzers also run on the APT-data:

In [8]:
cave.apt_tensorboard()



<cave.analyzer.apt.apt_tensorboard.APTTensorboard at 0x7f70302d6c88>