# Factorial design with Empirical Bayes and Thompson Sampling

<markdowncell>
This tutorial illustrates how to run a factorial experiment.
We use Empirical Bayes to model the data and Thompson Sampling to generate
arms.

In [1]:
import pandas as pd
from typing import Dict, Optional, Tuple, Union
from ax.api import (
    ChoiceParameter,
    Arm,
    ParameterType,
    SearchSpace,
    SimpleExperiment,
    modelbridge,
)
from ax.metrics.factorial import (
    evaluation_function as factorial_metric_evaluation_function,
)
from ax.plot.scatter import plot_fitted
from ax.utils.notebook.plotting import render, init_notebook_plotting

In [2]:
init_notebook_plotting()

[INFO 03-15 16:33:48] ipy_plotting: Injecting Plotly library into cell. Do not overwrite or delete cell.


<markdowncell>
First, we define our search space. In this example we have three parameters,
each of which has a set of possible string values.

In [3]:
search_space = SearchSpace(
    parameters=[
        ChoiceParameter(
            name="factor1",
            parameter_type=ParameterType.STRING,
            values=["level11", "level12", "level13"],
        ),
        ChoiceParameter(
            name="factor2",
            parameter_type=ParameterType.STRING,
            values=["level21", "level22"],
        ),
        ChoiceParameter(
            name="factor3",
            parameter_type=ParameterType.STRING,
            values=["level31", "level32", "level33", "level34"],
        ),
    ]
)
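
<markdowncell>
To get a feel for the size of this design, the sketch below enumerates every combination of levels using only the standard library. It does not use Ax; the `levels` dict and `full_factorial` list are illustrative names that mirror the search space above.

```python
from itertools import product

# Levels copied from the search space defined above.
levels = {
    "factor1": ["level11", "level12", "level13"],
    "factor2": ["level21", "level22"],
    "factor3": ["level31", "level32", "level33", "level34"],
}

# Every combination of one level per factor is one candidate arm.
full_factorial = [
    dict(zip(levels.keys(), combo)) for combo in product(*levels.values())
]

print(len(full_factorial))  # 3 * 2 * 4 = 24
print(full_factorial[0])
```

A full factorial over these three factors therefore contains 24 candidate arms.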

<markdowncell>
Second, we define an evaluation function, which is responsible for computing
the mean and standard error of a given parameterization. In this example,
each possible parameter value is given a weight, and the mean of a parameterization
is determined by the weights of its values. The higher the weights, the greater
the mean.

In [4]:
def factorial_evaluation_function(
    # `parameterization` maps each parameter name to its chosen value.
    parameterization: Dict[str, Optional[Union[str, bool, float]]],
    # `weight` is the arm's weight within its trial; it determines the
    # variance of the estimate.
    weight: Optional[float] = None,
) -> Dict[str, Tuple[float, float]]:  # Maps metric name to (mean, standard error).
    coefficients = {
        "factor1": {"level11": 0.1, "level12": 0.2, "level13": 0.3},
        "factor2": {"level21": 0.1, "level22": 0.2},
        "factor3": {"level31": 0.1, "level32": 0.2, "level33": 0.3, "level34": 0.4},
    }
    return {
        "success_metric": factorial_metric_evaluation_function(
            parameterization=parameterization, coefficients=coefficients, weight=weight
        )
    }
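
<markdowncell>
The sketch below illustrates one way that per-level weights could be turned into a mean and a standard error. It is a simplified stand-in and not the implementation of `ax.metrics.factorial.evaluation_function`: the logistic link, the simulated Bernoulli outcomes, the `batch_size` default, and the helper names `success_probability` and `simulate_arm` are all assumptions made for illustration.

```python
import math
import random
from typing import Dict, Optional, Tuple

COEFFICIENTS = {
    "factor1": {"level11": 0.1, "level12": 0.2, "level13": 0.3},
    "factor2": {"level21": 0.1, "level22": 0.2},
    "factor3": {"level31": 0.1, "level32": 0.2, "level33": 0.3, "level34": 0.4},
}


def success_probability(parameterization: Dict[str, str]) -> float:
    """Sum the coefficient of each chosen level, then map the sum into (0, 1)."""
    z = sum(COEFFICIENTS[factor][level] for factor, level in parameterization.items())
    return 1.0 / (1.0 + math.exp(-z))  # assumed logistic link


def simulate_arm(
    parameterization: Dict[str, str],
    weight: float = 1.0,
    batch_size: int = 10_000,  # hypothetical number of units per unit of weight
    rng: Optional[random.Random] = None,
) -> Tuple[float, float]:
    """Simulate binary outcomes and return (observed mean, standard error)."""
    if rng is None:
        rng = random.Random(0)
    n = max(1, int(batch_size * weight))  # more weight means more observations
    p = success_probability(parameterization)
    successes = sum(rng.random() < p for _ in range(n))
    mean = successes / n
    sem = math.sqrt(mean * (1.0 - mean) / n)
    return mean, sem


worst = {"factor1": "level11", "factor2": "level21", "factor3": "level31"}
best = {"factor1": "level13", "factor2": "level22", "factor3": "level34"}
for name, arm in [("worst", worst), ("best", best)]:
    mean, sem = simulate_arm(arm)
    print(name, round(success_probability(arm), 4), round(mean, 4), round(sem, 4))
```

Under these assumptions, an arm with a larger weight receives more simulated observations, so its standard error shrinks.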

<markdowncell>
We now set up our experiment and define the status quo arm.

In [5]:
exp = SimpleExperiment(
    name="my_factorial_closed_loop_experiment",
    search_space=search_space,
    evaluation_function=factorial_evaluation_function,
    objective_name="success_metric",
)
exp.status_quo = Arm(
    params={"factor1": "level11", "factor2": "level21", "factor3": "level31"}
)

<markdowncell>
We generate an initial batch that explores the full space of the factorial
design, including the status quo. Note that the status quo ends up with 5x the
weight of each other arm: it lies within the search space, so the FullFactorial
generator already includes it once (multiplier 1), and we then add it again
with a multiplier of 4.

In [6]:
factorial_generator = modelbridge.get_factorial(search_space=exp.search_space)
factorial_run = factorial_generator.gen(n=-1)
trial = (
    exp.new_batch_trial()
    .add_generator_run(factorial_run, multiplier=1)
    .add_arms_and_weights(arms=[exp.status_quo], multiplier=4)
)
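
<markdowncell>
The weight arithmetic described above can be checked by hand. The sketch below does not call Ax; it assumes, consistent with the 5x figure, that weights for a repeated arm simply add up. The names `weights` and `normalized` are illustrative.

```python
from collections import Counter
from itertools import product

levels = {
    "factor1": ["level11", "level12", "level13"],
    "factor2": ["level21", "level22"],
    "factor3": ["level31", "level32", "level33", "level34"],
}
status_quo = ("level11", "level21", "level31")

weights = Counter()
# Step 1: the factorial generator run contributes every arm once (multiplier=1).
for combo in product(*levels.values()):
    weights[combo] += 1
# Step 2: the status quo is added again with multiplier=4.
weights[status_quo] += 4

total = sum(weights.values())
normalized = {arm: w / total for arm, w in weights.items()}

print(weights[status_quo])               # 1 + 4 = 5
print(total)                             # 24 + 4 = 28
print(round(normalized[status_quo], 4))  # 5 / 28
```

Under this assumption the status quo receives 5/28 of the batch and every other arm receives 1/28.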

<markdowncell>
We use Thompson sampling with batched updates to give more weight to the best-performing arms while still balancing exploration.

In [7]:
generators = []
for i in range(4):
    print("Running iteration {}...".format(i))
    data = exp.eval_trial(trial)
    thompson_generator = modelbridge.get_thompson(
        experiment=exp, data=data, min_weight=0.01
    )
    generators.append(thompson_generator)
    thompson_run = thompson_generator.gen(n=-1)
    trial = exp.new_batch_trial().add_generator_run(thompson_run)

Running iteration 0...
Running iteration 1...
Running iteration 2...
Running iteration 3...
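
<markdowncell>
To make the loop above less of a black box, here is a minimal standard-library sketch of the two ideas it combines: shrinking noisy arm means toward the grand mean (an empirical Bayes step) and turning the shrunk estimates into arm weights by Thompson sampling. It is not Ax's implementation. The positive-part James-Stein style shrinkage, the normal approximation to each arm's posterior, the function names `shrink_means` and `thompson_weights`, and the example numbers are all assumptions made for illustration.

```python
import random
from typing import List


def shrink_means(means: List[float], sems: List[float]) -> List[float]:
    """Pull each observed mean toward the grand mean. Requires more than 3 arms."""
    k = len(means)
    grand_mean = sum(means) / k
    # Spread of the observed means; k - 3 is the usual James-Stein normalizer.
    s2 = sum((m - grand_mean) ** 2 for m in means) / (k - 3)
    shrunk = []
    for m, sem in zip(means, sems):
        # Noisier arms (larger sem) are shrunk more; the factor is capped at 1.
        phi = 1.0 if s2 == 0 else min(1.0, sem ** 2 / s2)
        shrunk.append(m + phi * (grand_mean - m))
    return shrunk


def thompson_weights(
    means: List[float],
    sems: List[float],
    min_weight: float = 0.01,
    num_samples: int = 10_000,
    seed: int = 0,
) -> List[float]:
    """Weight each arm by how often it wins when all arms are sampled jointly."""
    rng = random.Random(seed)
    wins = [0] * len(means)
    for _ in range(num_samples):
        draws = [rng.gauss(m, s) for m, s in zip(means, sems)]
        wins[draws.index(max(draws))] += 1
    weights = [w / num_samples for w in wins]
    # Drop arms that rarely win, then renormalize the survivors.
    weights = [w if w >= min_weight else 0.0 for w in weights]
    total = sum(weights)
    if total == 0:
        raise ValueError("min_weight removed every arm")
    return [w / total for w in weights]


# Made-up observations for five arms (illustrative only).
observed_means = [0.57, 0.60, 0.62, 0.66, 0.71]
observed_sems = [0.02] * 5

shrunk = shrink_means(observed_means, observed_sems)
weights = thompson_weights(shrunk, observed_sems)
print([round(m, 4) for m in shrunk])
print([round(w, 4) for w in weights])
```

In this sketch, arms whose weight falls below `min_weight` are dropped from the next batch, which is one way the set of arms can narrow from one iteration to the next.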


<markdowncell>
The following plots show the predicted value of our objective for the arms in each trial. Over time, we home in on the parameterizations with the highest values.

In [8]:
for generator in generators:
    render(plot_fitted(generator, metric="success_metric", rel=False))

<markdowncell>
As expected given our evaluation function, parameterizations with higher levels
perform better and are given higher weight. Below we see the parameterizations 
that made it to the final trial (along with the status quo).

In [9]:
results = pd.DataFrame(
    [
        {"values": ",".join(arm.params.values()), "weight": weight}
        for arm, weight in trial.normalized_arm_weights().items()
    ]
)
print(results)

                    values   weight
0  level12,level22,level34  0.29200
1  level13,level22,level34  0.25144
2  level12,level22,level33  0.18488
3  level13,level22,level33  0.07168
4  level11,level21,level31  0.20000
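
<markdowncell>
As a quick sanity check, the sketch below copies the printed weights by hand, confirms that they sum to 1, and compares each surviving arm with the sum of its coefficients from the evaluation function. The variable names are illustrative, and the numbers are transcribed from the output above rather than read from the experiment.

```python
import math

coefficients = {
    "factor1": {"level11": 0.1, "level12": 0.2, "level13": 0.3},
    "factor2": {"level21": 0.1, "level22": 0.2},
    "factor3": {"level31": 0.1, "level32": 0.2, "level33": 0.3, "level34": 0.4},
}

# Weights copied from the printed table above.
final_weights = {
    ("level12", "level22", "level34"): 0.29200,
    ("level13", "level22", "level34"): 0.25144,
    ("level12", "level22", "level33"): 0.18488,
    ("level13", "level22", "level33"): 0.07168,
    ("level11", "level21", "level31"): 0.20000,  # status quo
}

total = sum(final_weights.values())
print(math.isclose(total, 1.0))

for arm, weight in final_weights.items():
    # Sum of the coefficients of the arm's levels, in factor order.
    score = sum(coefficients[f][lvl] for f, lvl in zip(coefficients, arm))
    print(arm, weight, round(score, 1))
```

In this particular run, the arm with the largest coefficient sum (`level13,level22,level34`) has the second-largest weight. That ordering is plausible when the evaluations are noisy.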
