[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/AccelerationConsortium/ac-microcourses/blob/main/notebooks/sdl-demo.ipynb)

### Getting Started
The following code lets you run the SDL demo remotely from your browser. You can run cells one at a time using the play button to the left of each cell, or you can run all cells sequentially using "Runtime" --> "Run all" via the menu bar.

<!-- Right now, multiple people running will throw things off. I should make sure the requested input parameters are in the payload and then make sure mqtt_observe_sensor_data waits for the input parameters that correspond to experiment that was requested and ignores anything else. If two people request the same experiment at the same exact time, then the experiment will be run twice and only one of the experiments will be reported to both parties. Kind of a limitation, but could probably be dealt with by including a client_id or similar as a kwarg. -->

First, let's install the `self_driving_lab_demo` Python package.

In [2]:
try:
  import google.colab
  IN_COLAB = True
  %pip install self-driving-lab-demo
except ImportError:  # not running in Colab
  IN_COLAB = False

Collecting git+https://github.com/sparks-baird/self-driving-lab-demo.git
  Cloning https://github.com/sparks-baird/self-driving-lab-demo.git to /tmp/pip-req-build-oe_40pzu
  Running command git clone --filter=blob:none --quiet https://github.com/sparks-baird/self-driving-lab-demo.git /tmp/pip-req-build-oe_40pzu
  Resolved https://github.com/sparks-baird/self-driving-lab-demo.git to commit e270af0ad0199bc88d5039339cffc6e5ca2863f5
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Installing backend dependencies ... done
  Preparing metadata (pyproject.toml) ... done
Collecting kaleido (from self-driving-lab-demo==0.8.3.post1.dev7+ge270af0)
  Downloading kaleido-0.2.1-py2.py3-none-manylinux1_x86_64.whl (79.9 MB)
     ━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━ 79.9/79.9 MB 12.2 MB/s eta 0:00:00
Collecting scikit-optimize (from self-driving-lab-demo==0.8.3.post1.dev7

Next, we'll import the `SelfDrivingLabDemoLight` class and instantiate it. We'll pass
an observation function compatible with MQTT (i.e. the interface that makes this
demo cloud-accessible) and the needed parameters for that function.

The Pico ID (`PICO_ID`) is a unique identifier for the microcontroller (e.g.,
`a123b456`), but it is hardcoded to `"test"` for both the remotely
accessible SDL-Demo and this notebook (i.e., just for this public demonstration). If you
run into problems with the physical demo, you can also check the box
for `dummy` (i.e., set `dummy=True`), which runs a very basic simulation in
place of a physical experiment on the hardware.

To make sure that experiments that are requested at the same time don't get
mixed up, an experiment ID (`experiment_id`) is generated internally for each
experiment. We can also pass a session ID (`session_id`) to make it easier to
distinguish experiments from multiple sessions.

**Update `PICO_ID` with the one corresponding to your microcontroller.** You may also replace the values for `hostname`, `username`, and `password` if you created your own HiveMQ instance and credentials and *are using the same ones on the microcontroller*; otherwise, leave the default credentials* in place. Then, run the following code.

<sup>*Public credentials are provided for this demonstration, but generally these should be kept private.</sup>

<!-- Calling it `client_id` could be confusing, since this has a separate meaning in the MQTT style of things. Maybe I should call this something else, like `client_key`, or `username`. -->

In [4]:
from uuid import uuid4  # universally unique identifier
from self_driving_lab_demo import (
    SelfDrivingLabDemoLight,
    mqtt_observe_sensor_data,
    get_paho_client,
)

PICO_ID = "test"  # @param {type:"string"}
dummy = False  # @param {type:"boolean"}
log_to_database = False  # @param {type:"boolean"}
SESSION_ID = str(uuid4())  # random session ID
print(f"session ID: {SESSION_ID}")

R_target = 70  # @param {type:"integer"}
G_target = 0  # @param {type:"integer"}
B_target = 70  # @param {type:"integer"}

target_inputs = {"R": R_target, "G": G_target, "B": B_target}

# instantiate client once and reuse (to avoid opening too many connections)
client = get_paho_client(f"sdl-demo/picow/{PICO_ID}/as7341/")

sdl = SelfDrivingLabDemoLight(
    autoload=True,  # perform target data experiment automatically
    target_inputs=target_inputs,  # if None, then defaults to random color using `target_seed` attribute
    simulation=dummy,  # run simulation instead of physical experiment
    observe_sensor_data_fn=mqtt_observe_sensor_data,  # (default)
    observe_sensor_data_kwargs=dict(
        pico_id=PICO_ID,
        session_id=SESSION_ID,
        client=client,
        mongodb=log_to_database,
        hostname="248cc294c37642359297f75b7b023374.s2.eu.hivemq.cloud", # default
        username="sgbaird", # default
        password="D.Pq5gYtejYbU#L", # default
    ),
)

session ID: 5fa35e37-b503-4931-a0d8-4550c0ea02d1
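
To illustrate why the session and experiment IDs matter, here is a minimal sketch of how a reply could be matched to the request that triggered it. The field names `_session_id` and `_experiment_id` and the helpers `make_request` and `is_reply_to` are hypothetical and chosen only for illustration; the package handles this internally and may do it differently. The sensor counts in the replies are toy values.

```python
from uuid import uuid4


def make_request(rgb, session_id):
    """Attach the session ID and a fresh experiment ID to an RGB command.

    The key names are hypothetical, not the package's actual payload format.
    """
    return {**rgb, "_session_id": session_id, "_experiment_id": str(uuid4())}


def is_reply_to(request, reply):
    """Return True only if the reply echoes both IDs of the request."""
    keys = ("_session_id", "_experiment_id")
    return all(reply.get(key) == request[key] for key in keys)


session_id = str(uuid4())
request = make_request({"R": 0, "G": 55, "B": 0}, session_id)

# A reply that echoes our IDs, and one that belongs to someone else's experiment.
own_reply = {
    "ch510": 9453,
    "_session_id": request["_session_id"],
    "_experiment_id": request["_experiment_id"],
}
other_reply = {
    "ch510": 120,
    "_session_id": str(uuid4()),
    "_experiment_id": str(uuid4()),
}

print(is_reply_to(request, own_reply))    # True
print(is_reply_to(request, other_reply))  # False
```

Filtering on both IDs means that a notebook listening on a shared topic can ignore results from experiments it did not request.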


Next, we'll observe the sensor data for the following red/green/blue (RGB) values.

In [5]:
sdl.observe_sensor_data({"R": 0, "G": 55, "B": 0})

{'utc_time_str': '2024-1-5 22:04:14',
 'utc_timestamp': 1704492254,
 'ch470': 6796,
 'ch550': 1278,
 'ch670': 323,
 'ch410': 213,
 'logged_to_mongodb': False,
 'background': {'ch583': 57,
  'ch670': 80,
  'ch510': 310,
  'ch410': 35,
  'ch620': 63,
  'ch470': 2236,
  'ch550': 284,
  'ch440': 440},
 'ch620': 311,
 'sd_card_ready': True,
 'ch510': 9453,
 'ch583': 274,
 'device_nickname': 'For MongoDB, enter whatever name you want here (optional)',
 'ch440': 527,
 'onboard_temperature_K': 305.8121,
 'encrypted_device_id_truncated': 'test'}

The microcontroller will briefly turn the LED green
and collect the data from the spectrophotometer, then publish this data to the HiveMQ MQTT server, the go-between for the microcontroller and this notebook.

<p align="center">
<img src="https://github.com/sparks-baird/self-driving-lab-demo/blob/main/notebooks/green-led.jpg?raw=1" width=500>
</p>
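
The dictionary returned by `observe_sensor_data` mixes channel readings with metadata, so it can help to pull out just the spectrum. The sketch below uses a shortened copy of the reading shown above and assumes that the number in each `chXXX` key is a nominal wavelength; `spectrum_from_reading` is a hypothetical helper, not part of the package.

```python
# Shortened copy of the reading shown above (metadata mostly omitted).
reading = {
    "utc_timestamp": 1704492254,
    "ch410": 213,
    "ch440": 527,
    "ch470": 6796,
    "ch510": 9453,
    "ch550": 1278,
    "ch583": 274,
    "ch620": 311,
    "ch670": 323,
    "sd_card_ready": True,
}


def spectrum_from_reading(reading):
    """Return (channel number, count) pairs sorted by channel number."""
    channels = {
        int(key[2:]): value
        for key, value in reading.items()
        if key.startswith("ch") and key[2:].isdigit()
    }
    return sorted(channels.items())


spectrum = spectrum_from_reading(reading)
peak_channel, peak_count = max(spectrum, key=lambda pair: pair[1])
print(spectrum)
print(f"strongest channel: ch{peak_channel} ({peak_count} counts)")
```

For the green command sent above, the strongest response is in `ch510`, which is consistent with a green LED.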

## "Hello, World!" of Optimization

Now, let's do the "Hello, World!" of optimization tasks and compare grid search vs.
random search vs. Bayesian optimization. If you don't know what those are, see [this
Towards Data Science
Post](https://towardsdatascience.com/grid-search-vs-random-search-vs-bayesian-optimization-2e68f57c3c46).
These algorithms are the artificial intelligence (though grid and random search are "uninformed" methods) that suggests which experiment to run next. We will use the predefined search space shown below, in which the RGB values are capped at 35% power (89 out of 255) because a NeoPixel LED at 100% power can be painful to look at directly. You can still manually send commands that use RGB values up to 255. Note that `atime`, `astep`, and `gain` (sensor parameters) are fixed for the following search campaigns.

In [None]:
sdl.bounds

{'R': [0, 89],
 'G': [0, 89],
 'B': [0, 89],
 'atime': [0, 255],
 'astep': [0, 65534],
 'gain': [0.5, 512]}
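
To make the two "uninformed" methods concrete, here is a minimal sketch of how candidate points could be generated within the RGB bounds above. It is an illustration under stated assumptions, not the package's implementation: `grid_points` and `random_points` are hypothetical helpers, and the grid uses three evenly spaced levels per axis, endpoints included, which gives 3 × 3 × 3 = 27 points. Bayesian optimization is not sketched here because it needs a surrogate model fitted to the observed data.

```python
import random
from itertools import product

# The RGB cap: 35% of the 8-bit maximum, truncated to an integer.
cap = int(0.35 * 255)
bounds = {"R": [0, cap], "G": [0, cap], "B": [0, cap]}


def grid_points(bounds, points_per_axis):
    """Evenly spaced integer levels on each axis, combined into a full grid."""
    axes = []
    for low, high in bounds.values():
        step = (high - low) / (points_per_axis - 1)
        axes.append([round(low + i * step) for i in range(points_per_axis)])
    return [dict(zip(bounds, combo)) for combo in product(*axes)]


def random_points(bounds, num_points, seed=0):
    """Independent uniform integer draws within the bounds (seeded)."""
    rng = random.Random(seed)
    return [
        {name: rng.randint(low, high) for name, (low, high) in bounds.items()}
        for _ in range(num_points)
    ]


grid = grid_points(bounds, points_per_axis=3)
rand = random_points(bounds, num_points=27)
print(cap, len(grid), len(rand))
print(grid[0], grid[-1])
```

A grid spends its budget on a fixed lattice regardless of what it observes, whereas random search spreads the same budget without structure; neither uses earlier results to choose the next point.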

### Run Search Algorithms

Next, we'll use some convenience functions to run each of the searches. The following cell may take approximately 20 minutes to run.

In [None]:
%%time
from self_driving_lab_demo.utils.search import (
    grid_search,
    random_search,
    ax_bayesian_optimization,
)

num_iter = 27

grid, grid_data = grid_search(sdl, num_iter)
random_inputs, random_data = random_search(sdl, num_iter)
best_parameters, values, experiment, model = ax_bayesian_optimization(sdl, num_iter)

[INFO 03-06 22:48:23] ax.service.utils.instantiation: Inferred value type of ParameterType.INT for parameter R. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 03-06 22:48:23] ax.service.utils.instantiation: Inferred value type of ParameterType.INT for parameter G. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 03-06 22:48:23] ax.service.utils.instantiation: Inferred value type of ParameterType.INT for parameter B. If that is not the expected value type, you can explicity specify 'value_type' ('int', 'float', 'bool' or 'str') in parameter dict.
[INFO 03-06 22:48:23] ax.service.utils.instantiation: Created search space: SearchSpace(parameters=[RangeParameter(name='R', parameter_type=INT, range=[0, 89]), RangeParameter(name='G', parameter_type=INT, range=[0, 89]), RangeParameter(name='B', parameter_type=INT, r

CPU times: total: 1min 46s
Wall time: 19min 37s


## Results

### Best error so far vs. iteration

In [None]:
#@markdown Let's compare how each optimization algorithm did as a function of the number of
#@markdown iterations. The faster the error goes down, the better.
import plotly.express as px
import pandas as pd
grid_input_df = pd.DataFrame(grid)
grid_output_df = pd.DataFrame(grid_data)[["frechet"]]
grid_df = pd.concat([grid_input_df, grid_output_df], axis=1)
grid_df["best_so_far"] = grid_df["frechet"].cummin()

random_input_df = pd.DataFrame(random_inputs, columns=["R", "G", "B"])
random_output_df = pd.DataFrame(random_data)[["frechet"]]
random_df = pd.concat([random_input_df, random_output_df], axis=1)
random_df["best_so_far"] = random_df["frechet"].cummin()

trials = list(experiment.trials.values())
bayes_input_df = pd.DataFrame([t.arm.parameters for t in trials])
bayes_output_df = pd.Series([t.objective_mean for t in trials], name="frechet").to_frame()
bayes_df = pd.concat([bayes_input_df, bayes_output_df], axis=1)
bayes_df["best_so_far"] = bayes_df["frechet"].cummin()

grid_df["type"] = "grid"
random_df["type"] = "random"
bayes_df["type"] = "bayesian"
df = pd.concat([grid_df, random_df, bayes_df], axis=0)
px.line(df, x=df.index, y="best_so_far", color="type").update_layout(
    xaxis_title="iteration",
    yaxis_title="Best error so far",
)

#### Example Output

![](https://github.com/sparks-baird/self-driving-lab-demo/blob/main/notebooks/mqtt-optimization-comparison.png?raw=1)
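
The "best error so far" curve is a running minimum of the error at each iteration, which is what `cummin()` computes in the cell above. The same logic in plain Python, using toy error values rather than measured ones:

```python
from itertools import accumulate

# Toy error values, one per iteration (not measured data).
errors = [5.0, 7.5, 3.2, 4.1, 3.2, 1.0]

# Running minimum: each entry is the smallest error seen up to that iteration.
best_so_far = list(accumulate(errors, min))
print(best_so_far)  # [5.0, 5.0, 3.2, 3.2, 3.2, 1.0]
```

Because the curve can only stay flat or decrease, a flat stretch means that no new experiment improved on the current best.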

### Observed Points and Corresponding Errors

Let's take a look at the points that were observed for each of the search algorithms. The axes correspond to the red (R), green (G), and blue (B) input values, and the color corresponds to the "Fréchet distance" (pronounced "freh-shay"). Fréchet distance is a measure of how close the measured spectrum is to the target spectrum and
should be considered simply as an error metric for this demo. Lower Fréchet distance is better,
and a Fréchet distance of zero is a perfect match.
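
For intuition, here is a minimal sketch of the discrete Fréchet distance between two curves given as lists of 2D points, using the standard dynamic-programming formulation. This is an illustration only: the package may compute its `frechet` value differently (for example, with a different variant or scaling), and the curves below are toy data.

```python
from math import dist  # Euclidean distance, Python 3.8+


def discrete_frechet(curve_p, curve_q):
    """Discrete Fréchet distance between two sequences of 2D points."""
    n, m = len(curve_p), len(curve_q)
    # table[i][j] holds the distance for the prefixes curve_p[:i+1], curve_q[:j+1].
    table = [[0.0] * m for _ in range(n)]
    for i in range(n):
        for j in range(m):
            d = dist(curve_p[i], curve_q[j])
            if i == 0 and j == 0:
                table[i][j] = d
            elif i == 0:
                table[i][j] = max(table[0][j - 1], d)
            elif j == 0:
                table[i][j] = max(table[i - 1][0], d)
            else:
                best_previous = min(
                    table[i - 1][j], table[i - 1][j - 1], table[i][j - 1]
                )
                table[i][j] = max(best_previous, d)
    return table[n - 1][m - 1]


# Toy curves: a peaked target, a near match, and a flat line.
target = [(0, 0.0), (1, 1.0), (2, 0.0)]
near = [(0, 0.1), (1, 0.9), (2, 0.1)]
flat = [(0, 0.0), (1, 0.0), (2, 0.0)]

print(discrete_frechet(target, target))  # 0.0
print(discrete_frechet(target, near) < discrete_frechet(target, flat))  # True
```

Identical curves give a distance of zero, and the distance grows as the shapes diverge, which is the behavior the error metric in this demo relies on.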

In [None]:
#@markdown Visualize the grid points that were used for searching.
px.scatter_3d(grid_df, x="R", y="G", z="B", color="frechet", title="grid")

#### Grid Example Output

![](https://github.com/sparks-baird/self-driving-lab-demo/blob/main/notebooks/grid-observations.png?raw=1)

In [None]:
#@markdown Visualize the random points that were used for searching.
px.scatter_3d(random_df, x="R", y="G", z="B", color="frechet", title="random")

#### Random Example Output

![](https://github.com/sparks-baird/self-driving-lab-demo/blob/main/notebooks/random-observations.png?raw=1)

In [None]:
#@markdown Visualize the points that were explored during Bayesian optimization.
px.scatter_3d(bayes_df, x="R", y="G", z="B", color="frechet", title="Bayesian")

#### Bayesian Example Output

![](https://github.com/sparks-baird/self-driving-lab-demo/blob/main/notebooks/bayesian-observations.png?raw=1)

In [None]:
# @markdown Finally, we can take a look at how closely the best experiments from each algorithm
# @markdown match the true target inputs. You may need to rotate the image to get a
# @markdown better view.

target_inputs = sdl.get_target_inputs()
true_inputs = pd.DataFrame(
    {key: target_inputs[key] for key in target_inputs}, index=[0]
)
true_inputs["type"] = "true"
# idxmin() returns an index label, so select the row with .loc rather than .iloc
best_grid_inputs = grid_df.loc[grid_df["frechet"].idxmin()][["R", "G", "B", "type"]]
best_random_inputs = random_df.loc[random_df["frechet"].idxmin()][
    ["R", "G", "B", "type"]
]
best_bayes_inputs = bayes_df.loc[bayes_df["frechet"].idxmin()][["R", "G", "B", "type"]]

best_df = pd.concat([best_grid_inputs, best_random_inputs, best_bayes_inputs], axis=1).T
best_df["marker"] = "observed"
true_inputs["marker"] = "target"
best_df = pd.concat([best_df, true_inputs], axis=0)
bnds = sdl.bounds
fig = px.scatter_3d(
    best_df, x="R", y="G", z="B", color="type", symbol="marker", title="best"
).update_layout(
    scene=dict(
        xaxis=dict(
            nticks=4,
            range=[bnds["R"][0], bnds["R"][1]],
        ),
        yaxis=dict(
            nticks=4,
            range=[bnds["G"][0], bnds["G"][1]],
        ),
        zaxis=dict(
            nticks=4,
            range=[bnds["B"][0], bnds["B"][1]],
        ),
    ),
)
fig.update_traces(marker={"opacity": 0.75})
fig.data[-1].marker.symbol = "diamond-open"
fig

#### Best Points vs. True RGB

![](https://github.com/sparks-baird/self-driving-lab-demo/blob/main/notebooks/best-point-visualization.png?raw=1)

That's it! You may now go back to the assignment page.
