# 4 - Transfer Learning for Hyperparameter Optimization

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/deephyper/anl-22-summer-workshop/blob/main/notebooks/4-Transfer-Learning-for-Hyperparameter-Search.ipynb)


In this example, we show how to apply transfer learning to hyperparameter search. Assume you have a set of similar tasks, for example tuning the hyperparameters of neural networks on different datasets. Similar hyperparameter choices are likely to perform well across these datasets, even if some light additional tuning can further improve performance. Therefore, you can perform an expensive search once, then reuse the hyperparameter configurations it explored to bias the following searches. Here, we use a cheap-to-compute and easy-to-understand example where we maximize the function $f(x) = -\sum_{i=0}^{n-1} x_i^2$, whose maximum is $0$ at $x = 0$. The size of the problem is defined by the variable $n$. We start by optimizing the small problem where $n=1$, then apply transfer learning from it to optimize the larger problem where $n=2$, and visualize the difference with a search that does not use transfer learning on this larger problem.

In [None]:
# Test if notebook is executed from Google Colab
try:
    import google.colab  # only importable inside Google Colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False
print("In Colab:", IN_COLAB)

# Install dependencies if running in Google Colab
if IN_COLAB:
    !pip install deephyper sdv
    !pip install matplotlib==3.5.2
    !git clone https://github.com/deephyper/anl-22-summer-workshop.git

# Download the data if running in Google Colab
if IN_COLAB:
    %cd /content/anl-22-summer-workshop/data
    !gdown 1J4kU3j49B9xWRpALgr8d90BjJCnwhAOS
    !gdown 1fuHM93OUcu536Ux6p2Oandbi3BrGl8vh
    !gdown 1fXHrFpM21LMUFj-S7jXLI4yZcJi3oAaU

**If running in Google Colab, restart the runtime to load installed packages**. You can also ask for a TPU.

* `Runtime > Change runtime type (select TPU)`
* `Runtime > Restart runtime`

In [None]:
# Test if notebook is executed from Google Colab
try:
    import google.colab  # only importable inside Google Colab
    IN_COLAB = True
except ImportError:
    IN_COLAB = False
print("In Colab:", IN_COLAB)

if not IN_COLAB:
    import os
    root_dir = os.path.dirname(os.getcwd())
    %cd $root_dir

import gzip

import deephyper
import matplotlib.pyplot as plt
import numpy as np
import tensorflow as tf

from data.utils import load_sst_data, load_data_prepared
from sklearn.decomposition import PCA

from deephyper.nas.metrics import r2, mse

print(f"DeepHyper Version: {deephyper.__version__}")


Let us start by defining the run-functions of the small and large scale problems:

In [None]:
import functools


def run(config: dict, N: int) -> float:
    y = -sum([config[f"x{i}"] ** 2 for i in range(N)])
    return y


run_small = functools.partial(run, N=1)
run_large = functools.partial(run, N=2)
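
As a quick sanity check, the run-functions can be called by hand on a few configurations before handing them to a search. The sketch below redefines `run` exactly as above so that it is self-contained; the input values are arbitrary and chosen only for illustration.

```python
import functools


def run(config: dict, N: int) -> float:
    # Same objective as above: negative sum of squares of the first N variables
    return -sum(config[f"x{i}"] ** 2 for i in range(N))


run_small = functools.partial(run, N=1)
run_large = functools.partial(run, N=2)

print(run_small({"x0": 3.0}))              # -9.0
print(run_large({"x0": 3.0, "x1": -4.0}))  # -25.0

# The objective is never positive and reaches its maximum (zero) at the origin
best = run_large({"x0": 0.0, "x1": 0.0})
```

Both problems share the variable `x0`, and values of `x0` close to zero are good in both. This shared structure is what transfer learning can exploit.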

Then, we define the hyperparameter search space of each problem based on $n$:

In [None]:
from deephyper.problem import HpProblem


N = 1
problem_small = HpProblem()
for i in range(N):
    problem_small.add_hyperparameter((-10.0, 10.0), f"x{i}")
problem_small

In [None]:
N = 2
problem_large = HpProblem()
for i in range(N):
    problem_large.add_hyperparameter((-10.0, 10.0), f"x{i}")
problem_large

Then, we set up the searches and execute them:

In [None]:
from deephyper.evaluator import Evaluator
from deephyper.evaluator.callback import TqdmCallback
from deephyper.search.hps import CBO

results = {}
max_evals = 20
evaluator_small = Evaluator.create(
    run_small, method="serial", method_kwargs={"callbacks": [TqdmCallback(max_evals)]}
)
search_small = CBO(problem_small, evaluator_small, random_state=42)
results["Small"] = search_small.search(max_evals)

In [None]:
# Baseline: search the large problem from scratch, without transfer learning
evaluator_large = Evaluator.create(
    run_large, method="serial", method_kwargs={"callbacks": [TqdmCallback(max_evals)]}
)
search_large = CBO(problem_large, evaluator_large, random_state=42)
results["Large"] = search_large.search(max_evals)

In [None]:
evaluator_large_tl = Evaluator.create(
    run_large, method="serial", method_kwargs={"callbacks": [TqdmCallback(max_evals)]}
)
search_large_tl = CBO(problem_large, evaluator_large_tl, random_state=42)
# Bias the large search with the configurations explored by the small search
search_large_tl.fit_generative_model(results["Small"])
results["Large+TL"] = search_large_tl.search(max_evals)
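
To build intuition for what biasing a search with earlier results means, here is a minimal standard-library sketch. It is not DeepHyper's implementation: it assumes the idea is to fit a generative model to the best configurations of the earlier search and to sample the shared hyperparameters from it, and it swaps in a simple Gaussian for that model. The helper names (`fit_top_quantile`, `sample_uniform`, `sample_transfer`) and the use of random search are hypothetical and serve only as illustration.

```python
import random
import statistics


def objective(config: dict, n: int) -> float:
    # Same toy objective as in this notebook: f(x) = -sum_i x_i^2
    return -sum(config[f"x{i}"] ** 2 for i in range(n))


def fit_top_quantile(history: list, name: str, q: float = 0.9):
    # Hypothetical helper: fit a Gaussian to `name` over the top (1 - q) share of records
    ranked = sorted(history, key=lambda rec: rec["objective"], reverse=True)
    n_top = max(2, round(len(ranked) * (1 - q)))
    values = [rec[name] for rec in ranked[:n_top]]
    return statistics.mean(values), statistics.stdev(values)


def clip(value: float, low: float = -10.0, high: float = 10.0) -> float:
    # Keep sampled values inside the search bounds
    return max(low, min(high, value))


rng = random.Random(42)

# Source task (n=1): plain random search over [-10, 10]
history = []
for _ in range(200):
    config = {"x0": rng.uniform(-10.0, 10.0)}
    history.append({**config, "objective": objective(config, n=1)})

# Summarize where good values of the shared variable x0 were found
mu, sigma = fit_top_quantile(history, "x0")


def sample_uniform() -> dict:
    # No transfer: both variables are drawn uniformly
    return {"x0": rng.uniform(-10.0, 10.0), "x1": rng.uniform(-10.0, 10.0)}


def sample_transfer() -> dict:
    # Transfer: x0 comes from the fitted Gaussian, the new variable x1 stays uniform
    return {"x0": clip(rng.gauss(mu, sigma)), "x1": rng.uniform(-10.0, 10.0)}


# Target task (n=2): compare the best of 20 samples with and without the prior
best_uniform = max(objective(sample_uniform(), n=2) for _ in range(20))
best_transfer = max(objective(sample_transfer(), n=2) for _ in range(20))
print(f"x0 prior: mean={mu:.2f}, std={sigma:.2f}")
print(f"best without transfer: {best_uniform:.2f}, with transfer: {best_transfer:.2f}")
```

Only `x0` can be transferred because it is the only variable the two problems share; `x1` must still be explored from scratch on the larger problem.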

Finally, we compare the results and see that transfer learning substantially speeds up the search on the large problem:

In [None]:
plt.figure()

for strategy, df in results.items():
    x = list(range(len(df)))
    plt.scatter(x, df.objective, label=strategy)
    plt.plot(x, df.objective.cummax())

plt.xlabel("Evaluations")
plt.ylabel("Objective")
plt.grid()
plt.legend()
plt.show()
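
The line drawn on top of each scatter plot is the running best objective, which is what `cummax` computes. The sketch below reproduces that computation with the standard library on made-up objective values; the numbers are purely illustrative and are not results from the searches above.

```python
from itertools import accumulate

# Hypothetical objective values, in evaluation order (illustrative only)
objectives = [-50.0, -12.0, -30.0, -3.0, -8.0, -0.5]

# Running maximum: the best objective seen up to and including each evaluation
best_so_far = list(accumulate(objectives, max))
print(best_so_far)  # [-50.0, -12.0, -12.0, -3.0, -3.0, -0.5]
```

A search that benefits from transfer learning should show this curve approaching zero in fewer evaluations.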