<!--- header table --->
<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/statmike/vertex-ai-mlops/blob/main/Applied%20Optimization/Vertex%20AI%20Vizier%20-%20Getting%20Started.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo">
      <br>Run in<br>Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2Fstatmike%2Fvertex-ai-mlops%2Fmain%2FApplied%2520Optimization%2FVertex%2520AI%2520Vizier%2520-%2520Getting%2520Started.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo">
      <br>Run in<br>Colab Enterprise
    </a>
  </td>      
  <td style="text-align: center">
    <a href="https://github.com/statmike/vertex-ai-mlops/blob/main/Applied%20Optimization/Vertex%20AI%20Vizier%20-%20Getting%20Started.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      <br>View on<br>GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/statmike/vertex-ai-mlops/main/Applied%20Optimization/Vertex%20AI%20Vizier%20-%20Getting%20Started.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      <br>Open in<br>Vertex AI Workbench
    </a>
  </td>
</table>

# Vertex AI Vizier - Getting Started


---
## Colab Setup

To run this notebook in Colab, run the cells in this section.  Otherwise, skip this section.

This cell will authenticate to GCP (follow prompts in the popup).

In [None]:
PROJECT_ID = 'statmike-mlops-349915' # replace with project ID

In [None]:
try:
    from google.colab import auth
    auth.authenticate_user()
    !gcloud config set project {PROJECT_ID}
    print('Colab authorized to GCP')
except Exception:
    print('Not a Colab Environment')
    pass

---
## Installs

The list `packages` contains tuples of package import names and install names, with an optional minimum version.  If the import name is not found then the install name is used to install quietly for the current user.

In [None]:
# tuples of (import name, install name, min_version)
packages = [
    ('numpy', 'numpy'),
    ('plotly', 'plotly'),
    ('scipy', 'scipy'),
    ('matplotlib', 'matplotlib'),
    ('google.cloud.aiplatform', 'google-cloud-aiplatform')
]

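import importlib.util      # find_spec lives in the importlib.util submodule
import importlib.metadata  # version lives in the importlib.metadata submodule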
install = False
for package in packages:
    if not importlib.util.find_spec(package[0]):
        print(f'installing package {package[1]}')
        install = True
        !pip install {package[1]} -U -q --user
    elif len(package) == 3:
        # importlib.metadata.version expects the distribution (install) name, not the import name
        if importlib.metadata.version(package[1]) < package[2]:
            print(f'updating package {package[1]}')
            install = True
            !pip install {package[1]} -U -q --user
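
Note that the `min_version` check above compares version strings, and string comparison orders `'1.10.0'` before `'1.9.0'`. A minimal sketch of a numeric comparison follows. The helper name `version_tuple` is hypothetical (it is not part of this notebook or of any library), and it assumes plain dotted numeric versions, so suffixes such as `rc1` are dropped.

```python
import re

def version_tuple(version):
    """Turn a dotted version string into a tuple of ints for comparison.

    Hypothetical helper: keeps only the leading digits of each dotted piece.
    """
    parts = []
    for piece in version.split('.'):
        match = re.match(r'\d+', piece)
        if match is None:
            break
        parts.append(int(match.group()))
    return tuple(parts)

# String comparison gets the order wrong for multi-digit components
print('1.10.0' < '1.9.0')                                # True
# Tuple-of-int comparison orders them numerically
print(version_tuple('1.10.0') < version_tuple('1.9.0'))  # False
```

If the `packaging` library happens to be available, its version parser is a more complete alternative for real version strings.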

### Restart Kernel (If Installs Occurred)

After the kernel restarts, resume execution from the cell that follows this one.

In [None]:
if install:
    import IPython
    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
    IPython.display.display(IPython.display.Markdown("""<div class=\"alert alert-block alert-warning\">
        <b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. The previous cells do not need to be run again⚠️</b>
        </div>"""))

---
## Setup

inputs:

In [None]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

In [None]:
REGION = 'us-central1'
SERIES = 'applied-optimizaton'
EXPERIMENT = 'vizier-getting-started'

packages:

In [None]:
from google.cloud import aiplatform

import scipy.optimize

import plotly.graph_objects as go
from plotly.subplots import make_subplots
import numpy as np

clients:

Create the [Vizier service client](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform_v1.services.vizier_service) for the region:

In [None]:
vizier = aiplatform.gapic.VizierServiceClient(
    client_options = {"api_endpoint" : f"{REGION}-aiplatform.googleapis.com"}
)

Parameters:

In [None]:
PARENT = f"projects/{PROJECT_ID}/locations/{REGION}"

---
## The Problem

In real-world scenarios we have parameters we can control, or at least monitor, but we don't always understand the underlying system.  That leaves us guessing which parameter values best meet an objective, such as minimizing or maximizing an output.  Some scenarios like this:
- ML Hyperparameters
    - Best setting for hyperparameters to optimize accuracy (and minimize complexity...)
- AutoML
    - Best model type and setting for a given dataset
- System Design
    - Maximize speed while minimizing resource consumption
- Hardware Design
    - Minimize power consumption while maximizing performance
- Business Insights - The Pareto Frontier
    - optimal solution for minimizing cost while maximizing revenue

### The Knobs

Imagine we have two knobs (or 100) that can be adjusted:
- `knob_1` is [-5, 5]
- `knob_2` is [-5, 5]

In [None]:
# 101 evenly spaced values covering [-5, 5] with both endpoints included
knob_1 = np.linspace(-5, 5, 101)
knob_2 = np.linspace(-5, 5, 101)

### The Hidden Process(es)

We don't know the hidden processes, but imagine that we do in this case.  Here, each function defines a process.  If we knew these functions, we could use calculus or numerical optimization to find the minimum or maximum values, but remember that in practice we don't know them.

In [None]:
def hidden_1(knob_1, knob_2):
    """
    This function has a global maximum and several local maxima.
    """
    x = knob_1
    y = knob_2
    return -(x**2 + y**2) * np.sin(x) * np.cos(0.5 * y) + 2

def hidden_2(knob_1, knob_2):
    """
    This function has a global maximum and a few local maxima.
    """
    x = knob_1
    y = knob_2
    return -(x**2 + y**2) * np.cos(0.5 * x) * np.sin(y) - 1

def hidden_3(knob_1, knob_2):
    """
    This function has a global minimum and several local minima.
    """
    x = knob_1
    y = knob_2
    return (x**2 + y**2) * np.sin(x) * np.sin(y)
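
One property of these three functions is worth checking before optimizing them: each is flat at the origin, so a gradient-based search that starts at `(0, 0)` has no direction to move in. The sketch below restates the functions with the standard `math` module so it runs on scalars without NumPy, and estimates the gradient with central differences. The helper `numerical_gradient` and its step size `h` are assumptions made for illustration.

```python
import math

# Scalar restatements of the hidden functions defined above
def hidden_1(x, y):
    return -(x**2 + y**2) * math.sin(x) * math.cos(0.5 * y) + 2

def hidden_2(x, y):
    return -(x**2 + y**2) * math.cos(0.5 * x) * math.sin(y) - 1

def hidden_3(x, y):
    return (x**2 + y**2) * math.sin(x) * math.sin(y)

def numerical_gradient(f, x, y, h=1e-6):
    """Central-difference estimate of (df/dx, df/dy) at the point (x, y)."""
    dfdx = (f(x + h, y) - f(x - h, y)) / (2 * h)
    dfdy = (f(x, y + h) - f(x, y - h)) / (2 * h)
    return dfdx, dfdy

# The estimated gradient at the origin is numerically zero for every function
for f in (hidden_1, hidden_2, hidden_3):
    print(f.__name__, numerical_gradient(f, 0.0, 0.0))
```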

### Numerical Optimization: What We Wish We Could Do In Real Life!

In [None]:
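# A gradient-based local search started at x0 = [0, 0] stalls because the origin is a stationary point of each hidden function.
# Differential evolution is a global optimizer that searches the whole bounded region instead.
answer_1 = scipy.optimize.differential_evolution(lambda x: hidden_1(x[0], x[1]), bounds = [(-5, 5), (-5, 5)], seed = 0).fun
answer_1

In [None]:
answer_2 = -1*scipy.optimize.differential_evolution(lambda x: -1*hidden_2(x[0], x[1]), bounds = [(-5, 5), (-5, 5)], seed = 0).fun
answer_2

In [None]:
answer_3 = -1*scipy.optimize.differential_evolution(lambda x: -1*hidden_3(x[0], x[1]), bounds = [(-5, 5), (-5, 5)], seed = 0).fun
answer_3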
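
A simple way to sanity-check an optimizer's answer on a problem this small is to evaluate the function on a dense grid and take the best grid value. The sketch below does this for `hidden_1` in plain Python. The helper `grid_minimum` and the grid resolution are assumptions chosen for illustration, and a grid only approximates the true optimum to within the grid spacing.

```python
import math

# Scalar restatement of hidden_1 from above
def hidden_1(x, y):
    return -(x**2 + y**2) * math.sin(x) * math.cos(0.5 * y) + 2

def grid_minimum(f, low=-5.0, high=5.0, steps=101):
    """Evaluate f on a steps x steps grid over [low, high]^2 and return (value, x, y) of the smallest value."""
    best = None
    for i in range(steps):
        x = low + (high - low) * i / (steps - 1)
        for j in range(steps):
            y = low + (high - low) * j / (steps - 1)
            value = f(x, y)
            if best is None or value < best[0]:
                best = (value, x, y)
    return best

value, x, y = grid_minimum(hidden_1)
print(f'grid minimum {value:.3f} at knob_1={x:.1f}, knob_2={y:.1f}')
print(f'value at the origin: {hidden_1(0.0, 0.0):.3f}')
```

An optimizer result that is far above the grid minimum is a sign that the search stopped at a local or stationary point.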

### Visualize The Hidden Process(es): If Only We Could!

In [None]:
# Create the meshgrid
X, Y = np.meshgrid(knob_1, knob_2)

# Create the surface plots
surface1 = go.Surface(x=X, y=Y, z=hidden_1(X, Y), colorscale="reds", name="Hidden 1 (Min)", visible=True, showscale=False)
surface2 = go.Surface(x=X, y=Y, z=hidden_2(X, Y), colorscale="blues", name="Hidden 2 (Max)", visible=True, showscale=False)
surface3 = go.Surface(x=X, y=Y, z=hidden_3(X, Y), colorscale="greens", name="Hidden 3 (Max)", visible=True, showscale=False)

In [None]:
# Create the figure with subplots
fig = make_subplots(
    rows=1, cols=3,
    specs=[[{"is_3d": True}, {"is_3d": True}, {"is_3d": True}]],
    subplot_titles=("Hidden 1 (Min)", "Hidden 2 (Max)", "Hidden 3 (Max)"),
)

# Create the surface plots and add them to the subplots
fig.add_trace(surface1, row=1, col=1)
fig.add_trace(surface2, row=1, col=2)
fig.add_trace(surface3, row=1, col=3)

# Update layout to label the axes of each subplot and set the figure size
fig.update_layout(
    scene1=dict(xaxis_title="Knob 1", yaxis_title="Knob 2", zaxis_title="Result"),
    scene2=dict(xaxis_title="Knob 1", yaxis_title="Knob 2", zaxis_title="Result"),
    scene3=dict(xaxis_title="Knob 1", yaxis_title="Knob 2", zaxis_title="Result"),
    autosize=False,
    width=1200,  # Adjust width as needed
    height=500,  # Adjust height as needed
)

# Display the plot
fig.show()

In [None]:
fig.show('png')

### Competing Objectives

To make things more difficult, we might need to balance these hidden systems simultaneously, minimizing one while maximizing the others.   
Once these processes are overlaid, it becomes clear that there is a lot of nuance in the combined solution:

In [None]:
# Create the figure and add the surfaces
fig = go.Figure(data=[surface1, surface2, surface3])

# Configure the layout: size, axis titles, and plot title
fig.update_layout(
    showlegend=False,  # Hide the legend
    autosize=False,
    width=1200,
    height=800,
    scene=dict(xaxis_title="Knob 1", yaxis_title="Knob 2", zaxis_title="Result"),
    title="Interactive Plot Competing Hidden Processes<br><sup>Red (Min), Blue (Max), Green (Max)</sup>",    
)

# Display the plot
fig.show()

In [None]:
fig.show('png')

### Brute Force Exploration

We have the knobs that we can adjust.  Maybe we can 'get lucky' and find a good answer by adjusting them?

Build a function to evaluate parameter suggestions:

In [None]:
def process_evaluation(knob_1, knob_2):
    results = []
    results.append(dict(metric_id = 'red', value = hidden_1(knob_1, knob_2)))
    results.append(dict(metric_id = 'blue', value = hidden_2(knob_1, knob_2)))
    results.append(dict(metric_id = 'green', value = hidden_3(knob_1, knob_2)))
    return results

In [None]:
process_evaluation(0, 0)
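
As a baseline for the "get lucky" approach, the knobs can be set at random many times while keeping the best result seen. The sketch below does this for the red objective only (minimize `hidden_1`). The function `random_search`, its seed, and the use of Python's `random` module are assumptions for illustration; they are not part of the Vizier workflow that follows.

```python
import math
import random

# Scalar restatement of hidden_1 (the 'red' metric) from above
def hidden_1(x, y):
    return -(x**2 + y**2) * math.sin(x) * math.cos(0.5 * y) + 2

def random_search(n_trials, seed=0):
    """Sample knob settings uniformly in [-5, 5] and keep the setting with the lowest 'red' value."""
    rng = random.Random(seed)
    best = None
    for _ in range(n_trials):
        k1 = rng.uniform(-5, 5)
        k2 = rng.uniform(-5, 5)
        red = hidden_1(k1, k2)
        if best is None or red < best['red']:
            best = dict(knob_1=k1, knob_2=k2, red=red)
    return best

# More random trials can only match or improve the best value found with the same seed
print(random_search(10))
print(random_search(100))
```

Random search treats every trial independently, so it learns nothing from earlier results, and it says nothing about how to trade off several competing metrics at once.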

---
## Vertex AI Vizier: Intelligent Exploration

### Design A Study

Studies are defined by a [StudySpec](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform_v1.types.StudySpec):
- [parameters](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform_v1.types.StudySpec.ParameterSpec)
- [metrics](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform_v1.types.StudySpec.MetricSpec)
- [algorithm](https://cloud.google.com/python/docs/reference/aiplatform/latest/google.cloud.aiplatform_v1.types.StudySpec.Algorithm) (Vizier, Grid, Random)
- more:
    - study stopping configuration
    - observation noise (if objective is reproducible (LOW) or variable (HIGH))
    - measurement selection type (last(default) or best)
    - stopping spec: decay curve, median, convex


In [None]:
parameter_spec = [
    dict(
        parameter_id = 'knob_1',
        double_value_spec = dict(min_value = np.min(knob_1), max_value = np.max(knob_1))
    ),
    dict(
        parameter_id = 'knob_2',
        double_value_spec = dict(min_value = np.min(knob_2), max_value = np.max(knob_2))
    )
]

In [None]:
metric_spec = [
    dict(metric_id = 'red', goal = 'MINIMIZE'),
    dict(metric_id = 'blue', goal = 'MAXIMIZE'),
    dict(
        metric_id = 'green',
        goal = 'MAXIMIZE',
        safety_config = dict(safety_threshold = 0, desired_min_safe_trials_fraction = 0.9)
    ),
]

In [None]:
study_spec = dict(
    display_name = f'{SERIES}-{EXPERIMENT}'.replace('-', '_'),
    study_spec = dict(
        algorithm = 'ALGORITHM_UNSPECIFIED',
        parameters = parameter_spec,
        metrics = metric_spec
    )
)

In [None]:
study_spec

### Create A Vertex AI Vizier Study

Check for existing study:

In [None]:
study = None
for s in vizier.list_studies(parent = PARENT):
    if s.display_name == study_spec['display_name']:
        study = s
        break

Create (or Retrieve) study:

In [None]:
if not study:
    study = vizier.create_study(parent = PARENT, study = study_spec)
study.name, study.display_name, study.state, study.create_time.strftime("%m-%d-%Y-%H:%M:%S")

View the study in the console:

In [None]:
print(f"https://console.cloud.google.com/vertex-ai/experiments/locations/{REGION}/studies/{study.name.split('/')[-1]}?project={PROJECT_ID}")

### Get A Suggestion: Trial

In [None]:
suggestions = vizier.suggest_trials(
    dict(
        parent = study.name,
        suggestion_count = 1,
        client_id = 'client_1'
    )
).result()

Format suggestions:

In [None]:
trial_inputs = []
for s in suggestions.trials:
    parms = dict(name = s.name)
    for p in s.parameters:
        parms[p.parameter_id] = p.value
    trial_inputs.append(parms)
trial_inputs

### Evaluate The Trial Suggestions

Turn the knobs to the suggestion and measure the outcome!  In this case we need the output from each function representing the hidden process to return to the Vizier service as measurements.

In [None]:
trial_results = [
    dict(
        name = trial['name'],
        final_measurement = dict(
            metrics = process_evaluation(trial['knob_1'], trial['knob_2'])
        )
    ) for trial in trial_inputs
]
trial_results

### Return The Measurement To The Study

In [None]:
responses = [vizier.complete_trial(trial) for trial in trial_results]

In [None]:
responses[0].state

### Iterate Up To 100 Trials

In [None]:
trial_count = 100
n_trials = 1 # complete 1 above
batch_size = 1

while n_trials < trial_count:
    # get suggestion(s)
    suggestions = vizier.suggest_trials(
        dict(
            parent = study.name,
            suggestion_count = batch_size,
            client_id = 'client_1'
        )
    ).result()

    # format suggestion
    trial_inputs = []
    for s in suggestions.trials:
        parms = dict(name = s.name)
        for p in s.parameters:
            parms[p.parameter_id] = p.value
        trial_inputs.append(parms)

    # evaluate suggestion
    trial_results = [
        dict(
            name = trial['name'],
            final_measurement = dict(
                metrics = process_evaluation(trial['knob_1'], trial['knob_2'])
            )
        ) for trial in trial_inputs
    ]

    # register result to study
    responses = [vizier.complete_trial(trial) for trial in trial_results]

    # increment counter
    n_trials += batch_size

### Get Optimal Trials

In [None]:
optimal_trials = vizier.list_optimal_trials(dict(parent = study.name))

In [None]:
len(optimal_trials.optimal_trials)

In [None]:
# Look up each metric by its metric_id rather than by position so red, blue and green hold their own values
optimal_values = dict(
    names = [trial.name for trial in optimal_trials.optimal_trials],
    knob_1 = [trial.parameters[0].value for trial in optimal_trials.optimal_trials],
    knob_2 = [trial.parameters[1].value  for trial in optimal_trials.optimal_trials],
    red = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['red'] for trial in optimal_trials.optimal_trials],
    blue = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['blue'] for trial in optimal_trials.optimal_trials],
    green = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['green'] for trial in optimal_trials.optimal_trials]
)
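
With more than one metric there is usually no single best trial. For a multi-objective study the optimal trials are the Pareto-optimal ones: trials that no other trial beats on every metric at once. The sketch below illustrates that idea on made-up values. The functions `dominates` and `pareto_front` and the sample trials are hypothetical, and the sketch ignores the safety configuration on the green metric, so it is an illustration of the concept rather than a reproduction of the service's computation.

```python
def dominates(a, b, goals):
    """True if trial a is at least as good as trial b on every metric and strictly better on at least one."""
    strictly_better = False
    for metric, goal in goals.items():
        sign = 1 if goal == 'MAXIMIZE' else -1
        diff = sign * (a[metric] - b[metric])
        if diff < 0:
            return False  # a is worse than b on this metric
        if diff > 0:
            strictly_better = True
    return strictly_better

def pareto_front(trials, goals):
    """Keep the trials that are not dominated by any other trial."""
    return [t for t in trials if not any(dominates(o, t, goals) for o in trials)]

# Same goals as the study's metric spec
goals = dict(red='MINIMIZE', blue='MAXIMIZE', green='MAXIMIZE')

# Made-up measurements for illustration only
trials = [
    dict(name='a', red=1.0, blue=5.0, green=2.0),
    dict(name='b', red=2.0, blue=4.0, green=1.0),  # worse than 'a' on every metric
    dict(name='c', red=0.5, blue=3.0, green=2.0),  # best red
    dict(name='d', red=3.0, blue=9.0, green=0.0),  # best blue
]

print([t['name'] for t in pareto_front(trials, goals)])  # ['a', 'c', 'd']
```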

### Get All Trials

In [None]:
# list_trials returns a pager: iterate it so trials from every page are collected
all_trials = list(vizier.list_trials(dict(parent = study.name)))

In [None]:
len(all_trials)

In [None]:
# Trials that are not in the optimal set, with each metric looked up by its metric_id
non_optimal_trials = [trial for trial in all_trials if trial.name not in optimal_values['names']]
non_optimal_values = dict(
    names = [trial.name for trial in non_optimal_trials],
    knob_1 = [trial.parameters[0].value for trial in non_optimal_trials],
    knob_2 = [trial.parameters[1].value for trial in non_optimal_trials],
    red = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['red'] for trial in non_optimal_trials],
    blue = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['blue'] for trial in non_optimal_trials],
    green = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['green'] for trial in non_optimal_trials]
)

In [None]:
len(non_optimal_values['names'])

### Add Trials To Visualization

In [None]:
# Create the surface plots (uncomment opacity to make them transparent)
surface1 = go.Surface(x=X, y=Y, z=hidden_1(X, Y), colorscale="reds", name="Hidden 1 (Min)", visible=True, showscale=False)#, opacity = 0.95)
surface2 = go.Surface(x=X, y=Y, z=hidden_2(X, Y), colorscale="blues", name="Hidden 2 (Max)", visible=True, showscale=False)#, opacity = 0.95)
surface3 = go.Surface(x=X, y=Y, z=hidden_3(X, Y), colorscale="greens", name="Hidden 3 (Max)", visible=True, showscale=False)#, opacity = 0.95)

# Create the figure and add the surfaces
fig = go.Figure(data=[surface1, surface2, surface3])

# add trials: optimal
for k1, k2, r, b, g in zip(optimal_values['knob_1'], optimal_values['knob_2'], optimal_values['red'], optimal_values['blue'], optimal_values['green']):
    fig.add_trace(go.Scatter3d(x = [k1, k1, k1], y = [k2, k2, k2], z = [r, b, g], mode = 'lines+markers', line = dict(color = "#00ff00"), marker = dict(size = 5)))

# add trials: non-optimal
for k1, k2, r, b, g in zip(non_optimal_values['knob_1'], non_optimal_values['knob_2'], non_optimal_values['red'], non_optimal_values['blue'], non_optimal_values['green']):
    fig.add_trace(go.Scatter3d(x = [k1, k1, k1], y = [k2, k2, k2], z = [r, b, g], mode = 'lines+markers', line = dict(color = "#000000"), marker = dict(size = 5)))

# Configure the layout: size, axis titles, and plot title
fig.update_layout(
    showlegend=False,  # Hide the legend
    autosize=False,
    width=1200,
    height=800,
    scene=dict(xaxis_title="Knob 1", yaxis_title="Knob 2", zaxis_title="Result"),
    title="Interactive Plot Competing Hidden Processes<br><sup>Red (Min), Blue (Max), Green (Max)</sup>",    
)

# Display the plot
fig.show()

In [None]:
fig.show('png')

---
## Continue The Study

### Iterate Up To 150 Trials

Continue in batches of 5 up to 150 total trials:

In [None]:
n_trials

In [None]:
trial_count = 150
batch_size = 5

while n_trials < trial_count:
    # get suggestion(s)
    suggestions = vizier.suggest_trials(
        dict(
            parent = study.name,
            suggestion_count = batch_size,
            client_id = 'client_1'
        )
    ).result()

    # format suggestion
    trial_inputs = []
    for s in suggestions.trials:
        parms = dict(name = s.name)
        for p in s.parameters:
            parms[p.parameter_id] = p.value
        trial_inputs.append(parms)

    # evaluate suggestion
    trial_results = [
        dict(
            name = trial['name'],
            final_measurement = dict(
                metrics = process_evaluation(trial['knob_1'], trial['knob_2'])
            )
        ) for trial in trial_inputs
    ]

    # register result to study
    responses = [vizier.complete_trial(trial) for trial in trial_results]

    # increment counter
    n_trials += batch_size
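
The loop above requests a full batch on every pass, which works here because the 50 remaining trials divide evenly into batches of 5. When the remainder is not a multiple of the batch size, the last request can be clamped so the total does not overshoot the target. A minimal sketch follows; the generator `batch_sizes` is a hypothetical helper, not part of the Vizier client.

```python
def batch_sizes(n_done, trial_count, batch_size):
    """Yield the size of each suggestion request so that the total stops exactly at trial_count."""
    while n_done < trial_count:
        size = min(batch_size, trial_count - n_done)
        yield size
        n_done += size

print(list(batch_sizes(100, 150, 5)))  # ten batches of 5
print(list(batch_sizes(0, 12, 5)))     # [5, 5, 2]
```

Each yielded value could be passed as `suggestion_count` in place of the fixed `batch_size`.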

### Get Optimal Trials

In [None]:
optimal_trials = vizier.list_optimal_trials(dict(parent = study.name))

In [None]:
len(optimal_trials.optimal_trials)

In [None]:
# Look up each metric by its metric_id rather than by position so red, blue and green hold their own values
optimal_values = dict(
    names = [trial.name for trial in optimal_trials.optimal_trials],
    knob_1 = [trial.parameters[0].value for trial in optimal_trials.optimal_trials],
    knob_2 = [trial.parameters[1].value  for trial in optimal_trials.optimal_trials],
    red = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['red'] for trial in optimal_trials.optimal_trials],
    blue = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['blue'] for trial in optimal_trials.optimal_trials],
    green = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['green'] for trial in optimal_trials.optimal_trials]
)

### Get All Trials

In [None]:
# list_trials returns a pager: iterate it so trials from every page are collected
all_trials = list(vizier.list_trials(dict(parent = study.name)))

In [None]:
len(all_trials)

In [None]:
# Trials that are not in the optimal set, with each metric looked up by its metric_id
non_optimal_trials = [trial for trial in all_trials if trial.name not in optimal_values['names']]
non_optimal_values = dict(
    names = [trial.name for trial in non_optimal_trials],
    knob_1 = [trial.parameters[0].value for trial in non_optimal_trials],
    knob_2 = [trial.parameters[1].value for trial in non_optimal_trials],
    red = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['red'] for trial in non_optimal_trials],
    blue = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['blue'] for trial in non_optimal_trials],
    green = [{m.metric_id: m.value for m in trial.final_measurement.metrics}['green'] for trial in non_optimal_trials]
)

In [None]:
len(non_optimal_values['names'])

### Add Trials To Visualization

In [None]:
# Create the surface plots (uncomment opacity to make them transparent)
surface1 = go.Surface(x=X, y=Y, z=hidden_1(X, Y), colorscale="reds", name="Hidden 1 (Min)", visible=True, showscale=False)#, opacity = 0.95)
surface2 = go.Surface(x=X, y=Y, z=hidden_2(X, Y), colorscale="blues", name="Hidden 2 (Max)", visible=True, showscale=False)#, opacity = 0.95)
surface3 = go.Surface(x=X, y=Y, z=hidden_3(X, Y), colorscale="greens", name="Hidden 3 (Max)", visible=True, showscale=False)#, opacity = 0.95)

# Create the figure and add the surfaces
fig = go.Figure(data=[surface1, surface2, surface3])

# add trials: optimal
for k1, k2, r, b, g in zip(optimal_values['knob_1'], optimal_values['knob_2'], optimal_values['red'], optimal_values['blue'], optimal_values['green']):
    fig.add_trace(go.Scatter3d(x = [k1, k1, k1], y = [k2, k2, k2], z = [r, b, g], mode = 'lines+markers', line = dict(color = "#00ff00"), marker = dict(size = 5)))

# add trials: non-optimal
for k1, k2, r, b, g in zip(non_optimal_values['knob_1'], non_optimal_values['knob_2'], non_optimal_values['red'], non_optimal_values['blue'], non_optimal_values['green']):
    fig.add_trace(go.Scatter3d(x = [k1, k1, k1], y = [k2, k2, k2], z = [r, b, g], mode = 'lines+markers', line = dict(color = "#000000"), marker = dict(size = 5)))

# Configure the layout: size, axis titles, and plot title
fig.update_layout(
    showlegend=False,  # Hide the legend
    autosize=False,
    width=1200,
    height=800,
    scene=dict(xaxis_title="Knob 1", yaxis_title="Knob 2", zaxis_title="Result"),
    title="Interactive Plot Competing Hidden Processes<br><sup>Red (Min), Blue (Max), Green (Max)</sup>",    
)

# Display the plot
fig.show()

In [None]:
fig.show('png')

---
## Delete Study

Uncomment the line below to delete the study from Vertex AI.

In [None]:
#vizier.delete_study(dict(name = study.name))