[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/sparks-baird/self-driving-lab-demo/blob/main/notebooks/3.3-random-vs-grid-vs-bayesian-liquid.ipynb)

# Random vs. Grid vs. Bayesian Optimization

Using the liquid color-matching demo.

## Setup

Imports and the main class.

### Imports

In [None]:
try:
  import google.colab
  IN_COLAB = True
  %pip install git+https://github.com/sparks-baird/self-driving-lab-demo.git[ax-platform]
  # %pip install self-driving-lab-demo[ax-platform]
except:
  IN_COLAB = False

In [1]:
%load_ext autoreload
%autoreload 2 # just some IPython magic to recognize changes to installed packages
import pandas as pd
from self_driving_lab_demo import SelfDrivingLabDemoLiquid

### SelfDrivingLabDemo

We'll instantiate the class and verify some of the functionality described in the random
search tutorial ([`2.0-random-search.ipynb`](2.0-random-search.ipynb)).

#### Instantiation

Now, we instantiate the `SelfDrivingLabDemo` class with `autoload=True` so that records
a target to optimize against. This involves selecting a set of target measurements as the "true" input values (i.e. the RYB values that define the target spectrum) based on a random seed,
setting the pump powers to those values, and then recording the spectrum intensities.

> Note: Instantiating with autoload=True will run a color-mixing experiment.

In [5]:
from self_driving_lab_demo.demos.liquid import liquid_observe_sensor_data
from self_driving_lab_demo.utils.observe import get_paho_client
from uuid import uuid4

dummy = False


def liquid_observe_sensor_data_2(R, Y, B, **kwargs):
    # gain = 32 instead of 128 to avoid saturation of the sensor data
    return liquid_observe_sensor_data(
        R, Y, B, prerinse_power=1.0, prerinse_time=10.0, runtime=5.0, gain=32, **kwargs
    )


def liquid_dummy_observe_sensor_data(R, Y, B, **kwargs):
    # return a fixed set of values (no interaction with real hardware)
    return {
        "utc_timestamp": 1671675884,
        "ch470": 21288,
        "ch410": 5835,
        "ch440": 65535,
        "sd_card_ready": False,
        "ch510": 21632,
        "ch550": 6760,
        "ch670": 8970,
        "utc_time_str": "2022-12-22 02:24:44",
        "onboard_temperature_K": 297.8537,
        "ch620": 2901,
        "ch583": 2057,
    }


observe_sensor_data_fn = (
    liquid_observe_sensor_data_2 if not dummy else liquid_dummy_observe_sensor_data
)

PICO_ID = "test"  # @param {type:"string"}

sensor_topic = f"sdl-demo/picow/{PICO_ID}/as7341/"
paho_client = get_paho_client(sensor_topic)

session_id = f"grid-random-bayes-liquid-{str(uuid4())[:4]}"

sdl = SelfDrivingLabDemoLiquid(
    autoload=True,
    max_power=0.5,
    observe_sensor_data_fn=observe_sensor_data_fn,
    observe_sensor_data_kwargs=dict(
        pico_id=PICO_ID, session_id=session_id, client=paho_client
    ),
)


In [6]:
sdl.parameters

[{'name': 'R', 'type': 'range', 'bounds': [0.0, 0.5]},
 {'name': 'Y', 'type': 'range', 'bounds': [0.0, 0.5]},
 {'name': 'B', 'type': 'range', 'bounds': [0.0, 0.5]},
 {'name': 'w', 'type': 'range', 'bounds': [0.0, 0.5]},
 {'name': 'prerinse_power', 'type': 'range', 'bounds': [0.0, 0.5]},
 {'name': 'prerinse_time', 'type': 'range', 'bounds': [1.0, 20.0]},
 {'name': 'runtime', 'type': 'range', 'bounds': [1.0, 20.0]},
 {'name': 'atime', 'type': 'range', 'bounds': [0, 255]},
 {'name': 'astep', 'type': 'range', 'bounds': [0, 65534]},
 {'name': 'gain',
  'type': 'choice',
  'is_ordered': True,
  'values': [0.5, 1.0, 2.0, 4.0, 8.0, 16.0, 32.0, 64.0, 128.0, 256.0, 512.0]}]

#### Functionality

We can do similar things to what was done in `2.0-random-search.ipynb`. For example, getting
random inputs, observing the sensor data, and evaluating the objective function.

In [7]:
[sdl.get_random_inputs(), sdl.get_random_inputs()]

[{'R': 0.38697802427798167,
  'Y': 0.21943921987602616,
  'B': 0.42929895995569123},
 {'R': 0.34868401452968195,
  'Y': 0.047088673943824766,
  'B': 0.48781117581837796}]

In [8]:
sdl.observe_sensor_data(sdl.get_random_inputs())

{'utc_timestamp': 1671723211,
 'ch470': 4270,
 'ch410': 2562,
 'ch440': 9662,
 'sd_card_ready': False,
 'ch510': 6687,
 'ch550': 21507,
 'ch670': 7507,
 'utc_time_str': '2022-12-22 15:33:31',
 'onboard_temperature_K': 297.3855,
 'ch620': 15811,
 'ch583': 25632}

In [9]:
sdl.evaluate(sdl.get_random_inputs())

{'utc_timestamp': 1671723237,
 'ch470': 10480,
 'ch410': 4005,
 'ch440': 30225,
 'sd_card_ready': False,
 'ch510': 17545,
 'ch550': 17416,
 'ch670': 5508,
 'utc_time_str': '2022-12-22 15:33:57',
 'onboard_temperature_K': 297.3855,
 'ch620': 5490,
 'ch583': 8695,
 'mae': 9144.25,
 'rmse': 10616.122291119296,
 'frechet': 12252.036728642304}

We can also flush out the chamber manually if we'll leave the experiment off for a while.

In [10]:
sdl.clear()

## Optimization

While there are great numerical tutorials comparing [grid search vs. random search vs.
Bayesian optimization](https://towardsdatascience.com/grid-search-vs-random-search-vs-bayesian-optimization-2e68f57c3c46), here, we'll compare these three search methods in a way that perhaps you've never seen before,
namely a self-driving laboratory demo!

### Setup

We define our optimization task parameters and take care of imports.

### Optimization Task Parameters

We'll use 125 iterations repeated 5 times. The use of 125 iterations instead of something
"cleaner" like 50 or 100 is due to constraints of doing uniform (full-factorial) grid
search. $n^d$ number of points are required for uniform grid search, where $n$ and $d$
represent number of points per dimension (`n_pts_per_dim`) and number of dimensions
(`3`), respectively.

In [11]:
num_iter = 27
num_repeats = 5
SEEDS = range(10, 10 + num_repeats)

We also instantiate multiple `SelfDrivingLabDemo` instances, each with their own
unique target spectrum, and then turn off the LED.

In [12]:
sdls = [
    SelfDrivingLabDemoLiquid(
        autoload=True,
        target_seed=seed,
        max_power=0.5,
        observe_sensor_data_fn=observe_sensor_data_fn,
        observe_sensor_data_kwargs=dict(pico_id=secrets["PICO_ID_3"]),
    )
    for seed in SEEDS
]
sdls[0].clear()


Notice that the target_data is different for each.

In [13]:
df = pd.DataFrame([sdl.target_results for sdl in sdls])
df.loc[:, sdl.channel_names] # sort columns by wavelength

Unnamed: 0,ch410,ch440,ch470,ch510,ch550,ch583,ch620,ch670
0,3405,18379,4635,5359,14778,12439,6048,6724
1,4054,24751,13880,22651,24324,12761,5118,8041
2,3487,13987,12110,20583,33495,29376,15051,8870
3,2910,14831,6146,8436,19239,16061,6811,7525
4,3637,19663,4970,5480,15719,13565,5647,7340




### Imports

We'll be using `scikit-learn`'s `ParameterGrid` for grid search, `self_driving_lab_demo`'s built-in
`get_random_inputs` for random search, and `ax-platform`'s Gaussian Process Expected
Improvement (GPEI) model for Bayesian
optimization. To help with defining the grid search space, we will also use the
`bounds` and `parameters` class property of `SelfDrivingLabDemo` for convenience. Note
that 89 is the upper limit for RGB values instead of 255 since 255 is very bright.

In [14]:
import numpy as np
from tqdm.notebook import trange, tqdm
from sklearn.model_selection import ParameterGrid
from ax import optimize

In [15]:
bounds = sdls[0].bounds
bounds = dict(R=bounds["R"], Y=bounds["Y"], B=bounds["B"])
bounds

{'R': [0.0, 0.5], 'Y': [0.0, 0.5], 'B': [0.0, 0.5]}

### Grid Search

First, we need to define our parameter grid. We'll divide up the 3-dimensional parameter
space as evenly as possible (see `num_pts_per_dim` below).

In [16]:
param_grid = {}
num_pts_per_dim = round(num_iter ** (1 / len(bounds)))
for name, bnd in bounds.items():
    param_grid[name] = np.linspace(bnd[0], bnd[1], num=num_pts_per_dim)
print(f"num_pts_per_dim: {num_pts_per_dim}")

num_pts_per_dim: 3


Notice how many distinct values are along each dimension.

In [17]:
param_grid

{'R': array([0.  , 0.25, 0.5 ]),
 'Y': array([0.  , 0.25, 0.5 ]),
 'B': array([0.  , 0.25, 0.5 ])}

After assembling the full grid, notice that the total number of points is $5^3 = 125$.

In [18]:
grid = list(ParameterGrid(param_grid))
print("grid:\n", grid[0:4], "...", grid[-1:])
print("\nNumber of grid points: ", len(grid))

grid:
 [{'B': 0.0, 'R': 0.0, 'Y': 0.0}, {'B': 0.0, 'R': 0.0, 'Y': 0.25}, {'B': 0.0, 'R': 0.0, 'Y': 0.5}, {'B': 0.0, 'R': 0.25, 'Y': 0.0}] ... [{'B': 0.5, 'R': 0.5, 'Y': 0.5}]

Number of grid points:  27


Now, we can start the actual search. The grid search locations are fixed
for each of the repeat optimization campaigns; however the observed sensor data will be
stochastic and the target spectrum is different for each repeat run. An alternative approach to setting a
fixed budget and varying the target solution would be to see how many iterations it takes to meet a criteria for the
objective function similar to [this post](https://towardsdatascience.com/grid-search-vs-random-search-vs-bayesian-optimization-2e68f57c3c46); however, a fixed budget seems more characteristic of a real chemistry
or materials optimization campaign due to limits on funding, time, and other resources:
(i.e. we'll search until we find what we're looking for, until we run out of
resources, or until we decide it's no longer worth the expense, whichever comes first).

In [19]:
grid_data = [[sdl.evaluate(pt) for pt in grid] for sdl in tqdm(sdls)]
sdls[0].clear()


  0%|          | 0/5 [00:00<?, ?it/s]

### Random Search

Now, let's perform random search as we did before in
[`2.0-random-search.ipynb`](2.0-random-search.ipynb), storing the inputs and outputs as we go.

In [20]:
%%time
random_inputs = []
random_data = []
for _ in tqdm(range(num_repeats)):
    random_input = []
    random_datum = []
    for i in range(num_iter):
        random_input.append(sdl.get_random_inputs())
        random_datum.append(sdl.evaluate(random_input[i]))
    random_inputs.append(random_input)
    random_data.append(random_datum)
sdls[0].clear()

  0%|          | 0/5 [00:00<?, ?it/s]

CPU times: total: 6.11 s
Wall time: 56min 11s


### Bayesian Optimization

Now, we'll use an optimization algorithm that learns from prior information. Once a
small set of initialization points have been evaluated, the algorithm will leverage the
previously observed information to intelligently select the next point to evaluate. The
selected point will be a trade-off between exploiting the highest performance and
exploring uncertain regions (i.e. exploitation/exploration trade-off). We'll also use
a discretized Frechet distance in place of mean absolute error as a more robust
comparison between discrete distributions.

In [27]:
%%time
bo_results = []
objective_name = "frechet"

for sdl in tqdm(sdls):
    def evaluation_function(parameters):
        data = sdl.evaluate(parameters)
        return data[objective_name]

    bo_results.append(optimize(
        parameters=sdl.parameters[:3], # R, Y, B
        evaluation_function=evaluation_function,
        minimize=True,
        total_trials = num_iter,
    ))

best_parameters, values, experiment, model = zip(*bo_results)
sdls[0].clear()

  0%|          | 0/5 [00:00<?, ?it/s]

[INFO 12-22 13:24:02] ax.service.utils.instantiation: Inferred value type of ParameterType.FLOAT for parameter R. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 12-22 13:24:02] ax.service.utils.instantiation: Inferred value type of ParameterType.FLOAT for parameter Y. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 12-22 13:24:02] ax.service.utils.instantiation: Inferred value type of ParameterType.FLOAT for parameter B. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 12-22 13:24:02] ax.service.utils.instantiation: Created search space: SearchSpace(parameters=[RangeParameter(name='R', parameter_type=FLOAT, range=[0.0, 0.5]), RangeParameter(name='Y', parameter_type=FLOAT, range=[0.0, 0.5]), RangeParameter(name='B', param

CPU times: total: 8min 10s
Wall time: 58min 54s


### Analysis

Now that we've run our three optimizations, let's compare the performance in tabular
form and visually.

### Preparing the data

In [28]:
grid_obj = [[g[objective_name] for g in gd] for gd in grid_data]
random_obj = [[r[objective_name] for r in rd] for rd in random_data]
bayesian_obj = [exp.fetch_data().df["mean"].tolist() for exp in experiment]

In [29]:
obj = np.array([grid_obj, random_obj, bayesian_obj])
obj.shape

(3, 5, 27)

### Tabular

In [44]:
cum_obj = np.minimum.accumulate(obj, axis=2)
avg_obj = np.mean(cum_obj, axis=1)
std_obj = np.std(cum_obj, axis=1)
print(avg_obj.shape)
print(std_obj.shape)

(3, 27)
(3, 27)


In [45]:
np.mean(random_obj)

18483.474166954922

In [46]:
best_avg_obj = np.min(avg_obj, axis=1)
best_avg_obj

array([10583.55585801,  8239.4       ,  4916.        ])

### Best Objective vs. Iteration

In [47]:
names = ["grid", "random", "bayesian"]
df = pd.DataFrame({
    **{f"{n}_{objective_name}": m for n, m in zip(names, avg_obj)},
    **{f"{n}_std": s for n, s in zip(names, std_obj)},
})


In [48]:
obj_df = pd.melt(df.reset_index(), id_vars=["index"], value_vars = [f"grid_{objective_name}", f"random_{objective_name}", f"bayesian_{objective_name}"], var_name="method", value_name=objective_name)

std_df = pd.melt(df.reset_index(), id_vars=["index"], value_vars = ["grid_std", "random_std", "bayesian_std"], var_name="method", value_name="std")

obj_df.loc[:, "method"] = obj_df.loc[:, "method"].apply(lambda x: x.replace(f"_{objective_name}", ""))
std_df.loc[:, "method"] = std_df.loc[:, "method"].apply(lambda x: x.replace("_std", ""))

In [49]:
results_df = obj_df.merge(std_df, on=["method", "index"]).rename(columns=dict(index="iteration"))
results_df.to_csv("clslab-liquid-grid-random-bayes-comparison.csv", index=False)
results_df

Unnamed: 0,iteration,method,frechet,std
0,0,grid,38102.319947,8121.454181
1,1,grid,35976.401214,11727.675433
2,2,grid,35818.341867,12012.545831
3,3,grid,30889.270616,10553.165042
4,4,grid,26537.170294,10197.708516
...,...,...,...,...
76,22,bayesian,5099.000000,955.318586
77,23,bayesian,5059.600000,913.351980
78,24,bayesian,5059.600000,913.351980
79,25,bayesian,4922.800000,759.119068


### Visualization
As we might expect, Bayesian optimization outperforms random search while grid and
random search are on par with each other.

In [50]:
# import plotly.express as px
from self_driving_lab_demo.utils.plotting import line

fig = line(
    data_frame=results_df,
    x="iteration",
    y=objective_name,
    error_y="std",
    error_y_mode="band",
    color="method",
)
max_y = (results_df[objective_name] + results_df["std"]).max()
fig.update_yaxes(range=[0.0, max_y*1.02])
fig

#### Example Output

![](clslab-liquid-grid-random-bayes-comparison.png?)