<!--- header table --->
<table align="left">
  <td style="text-align: center">
    <a href="https://colab.research.google.com/github/statmike/vertex-ai-mlops/blob/main/Applied%20GenAI/Evaluation/Optimize%20Prompts%20Using%20Evaluation%20Metrics.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/colab-logo-32px.png" alt="Google Colaboratory logo">
      <br>Run in<br>Colab
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/colab/import/https%3A%2F%2Fraw.githubusercontent.com%2Fstatmike%2Fvertex-ai-mlops%2Fmain%2FApplied%2520GenAI%2FEvaluation%2FOptimize%2520Prompts%2520Using%2520Evaluation%2520Metrics.ipynb">
      <img width="32px" src="https://lh3.googleusercontent.com/JmcxdQi-qOpctIvWKgPtrzZdJJK-J3sWE1RsfjZNwshCFgE_9fULcNpuXYTilIR2hjwN" alt="Google Cloud Colab Enterprise logo">
      <br>Run in<br>Colab Enterprise
    </a>
  </td>      
  <td style="text-align: center">
    <a href="https://github.com/statmike/vertex-ai-mlops/blob/main/Applied%20GenAI/Evaluation/Optimize%20Prompts%20Using%20Evaluation%20Metrics.ipynb">
      <img src="https://cloud.google.com/ml-engine/images/github-logo-32px.png" alt="GitHub logo">
      <br>View on<br>GitHub
    </a>
  </td>
  <td style="text-align: center">
    <a href="https://console.cloud.google.com/vertex-ai/workbench/deploy-notebook?download_url=https://raw.githubusercontent.com/statmike/vertex-ai-mlops/main/Applied%20GenAI/Evaluation/Optimize%20Prompts%20Using%20Evaluation%20Metrics.ipynb">
      <img src="https://lh3.googleusercontent.com/UiNooY4LUgW_oTvpsNhPpQzsstV5W8F7rYgxgGBD85cWJoLmrOzhVs_ksK_vgx40SHs7jCqkTkCk=e14-rj-sc0xffffff-h130-w32" alt="Vertex AI logo">
      <br>Open in<br>Vertex AI Workbench
    </a>
  </td>
</table>

# Optimize Prompts Using Evaluation Metrics

Prompt optimization rewrites system instructions to improve the performance of a set of prompts on one or more evaluation metrics.

First, it is helpful to understand the Vertex AI GenAI evaluation service as covered in this workflow: [Evaluation For GenAI](./Evaluation%20For%20GenAI.ipynb).  Evaluation is the comparison of a model's output to a baseline or ground truth using a metric to quantify the performance.  Vertex AI offers pointwise metrics, pairwise metrics, and computed metrics for GenAI evaluation.  These same evaluation metrics can be used as the optimization goal for prompt optimization, which is the focus of this workflow.

The Vertex AI Prompt Optimization service is a tool:
- Documentation Link: [Optimize Prompts](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer)
- This service provides code: [GitHub Link to .py file](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/prompts/prompt_optimizer/vapo_lib.py)
    - And [example notebooks](https://github.com/GoogleCloudPlatform/generative-ai/tree/main/gemini/prompts/prompt_optimizer) 
- Use this code to start a prompt optimization job, which runs as a Vertex AI Custom Training Job
- The inputs are:
    - The current **System Instructions** - [Documentation Link](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer#template-si)
    - The **Prompt Template** for the sample prompts - [Documentation Link](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer#template-si)
        - if you have ground truth responses, or responses from a known good model, then the template can include a `{target}` variable that maps to the `target` values in the input file
        - if you don't have these responses, you can provide the `source_model` parameter in the configuration and the optimization job will use that model to generate responses for comparison
    - A **file of input data** for each sample prompt to be used with the prompt template - [Documentation Link](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer#prepare-sample-prompts)
        - either a JSONL file or a CSV stored in a GCS bucket
    - **Configuration parameters** for the job - [Documentation Link](https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer#configuration)
        - Which metrics to use including custom metrics
        - Which target and source LLM to use
        - many more optional parameters
- The outputs are:
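    - Files written to the `output_path` in GCS, including an `optimized_results.json` file with the selected system instructions and their evaluation metric scores (reviewed at the end of this workflow)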
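
As a minimal sketch of the input file described above, the rows below pair each template variable with a `target` value and serialize them as JSONL. It assumes a template with `{type}` and `{topic}` variables like the one used later in this workflow. The `target` text is a made-up placeholder, and the exact layout the service expects should be confirmed in the linked documentation.

```python
import json

# Hypothetical sample rows: the keys match the template variables,
# and `target` holds a placeholder ground truth response.
rows = [
    {"type": "Haiku", "topic": "Lego", "target": "<ground truth haiku here>"},
    {"type": "Limerick", "topic": "Lego", "target": "<ground truth limerick here>"},
]

# JSONL is one JSON object per line.
jsonl_text = "\n".join(json.dumps(row) for row in rows) + "\n"

# Reading the text back line by line recovers the same rows.
parsed = [json.loads(line) for line in jsonl_text.splitlines()]
print(parsed[0]["target"])
```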



---
## Colab Setup

To run this notebook in Colab run the cells in this section.  Otherwise, skip this section.

This cell will authenticate to GCP (follow prompts in the popup).

In [1]:
PROJECT_ID = 'statmike-mlops-349915' # replace with project ID

In [2]:
try:
    from google.colab import auth
    auth.authenticate_user()
    !gcloud config set project {PROJECT_ID}
    print('Colab authorized to GCP')
except Exception:
    print('Not a Colab Environment')
    pass

Not a Colab Environment


---
## Installs

The list `packages` contains tuples of package import names, install names, and an optional minimum version.  If the import name is not found, or the installed version is lower than the minimum, then the install name is used to install quietly for the current user.

In [3]:
# tuples of (import name, install name, min_version)
packages = [
    ('google.cloud.aiplatform', 'google-cloud-aiplatform', '1.78.0'),
    ('google.cloud.storage', 'google-cloud-storage'),
    ('pandas', 'pandas')
]

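import importlib.util, importlib.metadata

def version_tuple(version):
    # compare numeric parts so that '1.100.0' is newer than '1.78.0'
    return tuple(int(part) for part in version.split('.') if part.isdigit())

install = False
for package in packages:
    try:
        found = importlib.util.find_spec(package[0]) is not None
    except ModuleNotFoundError: # raised when a parent package is missing
        found = False
    if not found:
        print(f'installing package {package[1]}')
        install = True
        !pip install {package[1]} -U -q --user
    elif len(package) == 3:
        # installed versions are looked up by install (distribution) name
        if version_tuple(importlib.metadata.version(package[1])) < version_tuple(package[2]):
            print(f'updating package {package[1]}')
            install = True
            !pip install {package[1]} -U -q --user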
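
The two checks in this section, whether an import name resolves and whether a version meets a minimum, can be tried in isolation. The sketch below is illustrative only: `is_importable` and `version_tuple` are hypothetical helper names, and the version parsing ignores pre-release suffixes. It also shows why plain string comparison of versions can mislead.

```python
import importlib.util


def is_importable(import_name):
    """Return True when the import name resolves to an installed module."""
    try:
        return importlib.util.find_spec(import_name) is not None
    except ModuleNotFoundError:
        # Raised for dotted names when a parent package is missing.
        return False


def version_tuple(version):
    """Turn '1.78.0' into (1, 78, 0), ignoring non-numeric parts."""
    return tuple(int(part) for part in version.split(".") if part.isdigit())


print(is_importable("json"))                       # standard library module
print(is_importable("no_such_pkg_xyz.submodule"))  # parent package is missing
print("1.100.0" < "1.78.0")                        # character-by-character comparison
print(version_tuple("1.100.0") < version_tuple("1.78.0"))  # numeric comparison
```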

### API Enablement

In [4]:
!gcloud services enable aiplatform.googleapis.com

### Restart Kernel (If Installs Occurred)

After a kernel restart, continue running from the next cell after this one.

In [5]:
if install:
    import IPython
    app = IPython.Application.instance()
    app.kernel.do_shutdown(True)
    IPython.display.display(IPython.display.Markdown("""<div class=\"alert alert-block alert-warning\">
        <b>⚠️ The kernel is going to restart. Please wait until it is finished before continuing to the next step. The previous cells do not need to be run again⚠️</b>
        </div>"""))

---
## Setup

inputs:

In [6]:
project = !gcloud config get-value project
PROJECT_ID = project[0]
PROJECT_ID

'statmike-mlops-349915'

In [7]:
REGION = 'us-central1'
SERIES = 'applied-genai'
EXPERIMENT = 'prompt-optimization'

BUCKET = PROJECT_ID # change to Bucket name if not the same as the Project ID

packages:

In [34]:
# Python standard library imports:
import json, io, sys, types, datetime, time

# package imports
import requests
from IPython.display import Markdown, HTML, display

# vertex ai imports
from google.cloud import aiplatform
from google.cloud import storage
import vertexai

In [9]:
aiplatform.__version__

'1.78.0'

clients:

In [10]:
vertexai.init(project = PROJECT_ID, location = REGION)
gcs = storage.Client(project = PROJECT_ID)

---
## Optimize Prompts

### Load The Code

The code is on [GitHub](https://github.com/GoogleCloudPlatform/generative-ai/blob/main/gemini/prompts/prompt_optimizer/vapo_lib.py) as a `.py` file.  The following cell loads it into this session as a module named `vapo_lib`, which is then imported with the alias `prompt_opt`:

In [11]:
module_name = 'vapo_lib'
url = 'https://raw.githubusercontent.com/GoogleCloudPlatform/generative-ai/main/gemini/prompts/prompt_optimizer/vapo_lib.py'
response = requests.get(url)
response.raise_for_status() # stop here if the download failed
vapo_lib = types.ModuleType(module_name)
vapo_lib.__file__ = f'<remote>/{module_name}.py'
sys.modules[module_name] = vapo_lib
exec(response.text, vapo_lib.__dict__)

2025-01-29 22:15:01.754682: E external/local_xla/xla/stream_executor/cuda/cuda_fft.cc:477] Unable to register cuFFT factory: Attempting to register factory for plugin cuFFT when one has already been registered
E0000 00:00:1738188901.791165 3313079 cuda_dnn.cc:8310] Unable to register cuDNN factory: Attempting to register factory for plugin cuDNN when one has already been registered
E0000 00:00:1738188901.802344 3313079 cuda_blas.cc:1418] Unable to register cuBLAS factory: Attempting to register factory for plugin cuBLAS when one has already been registered
2025-01-29 22:15:01.842139: I tensorflow/core/platform/cpu_feature_guard.cc:210] This TensorFlow binary is optimized to use available CPU instructions in performance-critical operations.
To enable the following instructions: AVX2 FMA, in other operations, rebuild TensorFlow with the appropriate compiler flags.


In [12]:
import vapo_lib as prompt_opt
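
The loading pattern above can be tried without a network call. This is a minimal sketch in which `source_text` and the module name `demo_remote_module` are hypothetical stand-ins for the downloaded file and for `vapo_lib`:

```python
import sys
import types

# Stand-in for the text downloaded from GitHub (hypothetical content).
source_text = "def greet(name):\n    return f'Hello, {name}!'\n"

module_name = "demo_remote_module"  # hypothetical module name
module = types.ModuleType(module_name)
module.__file__ = f"<remote>/{module_name}.py"

# Register the module first so that a later import statement finds it,
# then execute the source inside the module's namespace.
sys.modules[module_name] = module
exec(source_text, module.__dict__)

import demo_remote_module as demo
print(demo.greet("Lego"))
```

Because `exec` runs whatever text was downloaded, it may be worth pointing the URL at a specific commit rather than `main` when reproducibility matters.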

### Define Inputs

In [14]:
SYSTEM_INSTRUCTIONS = "Write poems in the style requested based on the topic provided"

In [15]:
PROMPT_TEMPLATE = "Write a {type} about {topic}."

Create data for the variables defined in the prompt template:

In [16]:
prompt_parameters = [
    dict(type = 'Haiku', topic = 'Lego'),
    dict(type = 'Sonnet', topic = 'Lego'),
    dict(type = 'Limerick', topic = 'Lego'),
    dict(type = 'Acrostic', topic = 'Lego'),
    dict(type = 'Ode', topic = 'Lego')
]
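
To preview what the sample prompts look like, each row can be substituted into the template. This sketch uses Python's `str.format` purely for illustration, and the optimization service performs its own substitution, which may differ in details:

```python
PROMPT_TEMPLATE = "Write a {type} about {topic}."
prompt_parameters = [
    dict(type="Haiku", topic="Lego"),
    dict(type="Sonnet", topic="Lego"),
]

# Fill the template with each row to preview the sample prompts.
prompts = [PROMPT_TEMPLATE.format(**row) for row in prompt_parameters]
for prompt in prompts:
    print(prompt)
```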

Store the data in GCS as either JSONL or CSV:

In [17]:
bucket = gcs.bucket(BUCKET)
blob = bucket.blob(f'{SERIES}/{EXPERIMENT}/prompt_parameters.jsonl')
with io.StringIO() as jsonl_file:
    for item in prompt_parameters:
        json.dump(item, jsonl_file)
        jsonl_file.write('\n')
    blob.upload_from_string(jsonl_file.getvalue(), content_type = 'application/jsonl')
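
For the CSV option, a minimal sketch with the standard library `csv` module is shown here. It assumes the file should have one header row naming the template variables, which should be checked against the linked documentation:

```python
import csv
import io

prompt_parameters = [
    dict(type="Haiku", topic="Lego"),
    dict(type="Sonnet", topic="Lego"),
]

# Write a header row followed by one row per sample prompt.
with io.StringIO(newline="") as csv_file:
    writer = csv.DictWriter(csv_file, fieldnames=["type", "topic"])
    writer.writeheader()
    writer.writerows(prompt_parameters)
    csv_text = csv_file.getvalue()

print(csv_text)
```

The resulting text could then be uploaded the same way as the JSONL version, for example with `blob.upload_from_string(csv_text, content_type = 'text/csv')`.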

### Define Input Parameters And Validate Data

In [18]:
SOURCE_MODEL = "gemini-1.5-flash-001" # or provide ground truth
TARGET_MODEL = "gemini-1.5-flash-002" # the model for which the optimized system instructions are created
OPTIMIZATION_MODE = "instruction" # choices are instruction, demonstration, instruction_and_demo
EVAL_METRICS = ['coherence', 'fluency'] # https://cloud.google.com/vertex-ai/generative-ai/docs/learn/prompts/prompt-optimizer#supported-evaluation-metrics

In [19]:
prompt_opt.is_run_target_required(
    eval_metric_types = EVAL_METRICS,
    source_model = SOURCE_MODEL
)

False

In [20]:
prompt_opt.validate_prompt_and_data(
    template = '\n'.join([SYSTEM_INSTRUCTIONS, PROMPT_TEMPLATE]),
    dataset_path = f'gs://{bucket.name}/{blob.name}',
    placeholder_to_content = '{}',
    label_enforced = prompt_opt.is_run_target_required(
        eval_metric_types = EVAL_METRICS,
        source_model = SOURCE_MODEL
    )
)
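
One kind of check that is useful before submitting a job is confirming that every template variable has a value in each data row. The sketch below illustrates that idea with the standard library only. `missing_variables` is a hypothetical helper and is not a description of what `validate_prompt_and_data` does internally:

```python
from string import Formatter


def missing_variables(template, row):
    """Return template variables that have no value in the data row."""
    names = {field for _, field, _, _ in Formatter().parse(template) if field}
    return sorted(names - set(row))


template = "Write a {type} about {topic}."
print(missing_variables(template, {"type": "Haiku", "topic": "Lego"}))
print(missing_variables(template, {"type": "Haiku"}))
```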

### Run Optimization Job On Vertex AI Training

In [27]:
job_name = 'lego_lyrics_' + datetime.datetime.now().strftime("%Y-%m-%dT%H:%M:%S")
vertex_job = prompt_opt.run_apd(
    config = dict(
        project = PROJECT_ID,
        system_instruction = SYSTEM_INSTRUCTIONS,
        prompt_template = PROMPT_TEMPLATE,
        target_model = TARGET_MODEL,
        target_model_location = REGION, 
        eval_metrics_types = EVAL_METRICS,
        eval_metrics_weights = [.5, .5],
        aggregation_type = 'weighted_sum',
        source_model = SOURCE_MODEL,
        optimization_mode = OPTIMIZATION_MODE,
        input_data_path = f'gs://{bucket.name}/{blob.name}',
        output_path = f'gs://{bucket.name}/{SERIES}/{EXPERIMENT}/{job_name}'
    ),
    bucket_uri = f'gs://{bucket.name}/{SERIES}/{EXPERIMENT}/{job_name}',
    display_name = job_name
)



Job display name: lego_lyrics_2025-01-29T22:37:33
Creating CustomJob
CustomJob created. Resource name: projects/1026793852137/locations/us-central1/customJobs/8303196768523255808
To use this CustomJob in another session:
custom_job = aiplatform.CustomJob.get('projects/1026793852137/locations/us-central1/customJobs/8303196768523255808')
View Custom Job:
https://console.cloud.google.com/ai/platform/locations/us-central1/training/8303196768523255808?project=1026793852137


In [28]:
vertex_job.name, vertex_job.display_name, vertex_job.resource_name

('8303196768523255808',
 'lego_lyrics_2025-01-29T22:37:33',
 'projects/1026793852137/locations/us-central1/customJobs/8303196768523255808')

In [35]:
JobState = aiplatform.gapic.JobState
# wait while the job is queued, pending, or running
while (state := vertex_job.state) in [JobState.JOB_STATE_QUEUED, JobState.JOB_STATE_PENDING, JobState.JOB_STATE_RUNNING]:
    print(f"Job state: {state}, checking again in 30 seconds...")
    time.sleep(30)

Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...
Job state: 3, checking again in 30 seconds...


In [39]:
vertex_job.state.name

'JOB_STATE_SUCCEEDED'
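
A polling loop like the one above can also be given a timeout so that it cannot wait forever. This is a minimal sketch with a simulated job. `wait_for` is a hypothetical helper, and the state names are plain strings rather than Vertex AI enum values:

```python
import time


def wait_for(get_state, active_states, interval=30, timeout=3600, sleep=time.sleep):
    """Poll get_state() until it leaves active_states or the timeout passes."""
    waited = 0
    state = get_state()
    while state in active_states:
        if waited >= timeout:
            raise TimeoutError(f"still {state} after {waited} seconds")
        sleep(interval)
        waited += interval
        state = get_state()
    return state


# Simulated job that reports PENDING and RUNNING before succeeding.
states = iter(["PENDING", "RUNNING", "RUNNING", "SUCCEEDED"])
final = wait_for(
    lambda: next(states),
    {"QUEUED", "PENDING", "RUNNING"},
    interval=0,
    sleep=lambda seconds: None,  # skip real sleeping in this demo
)
print(final)
```

With the job in this notebook, the same helper could be called with `lambda: vertex_job.state` and the corresponding `aiplatform.gapic.JobState` values as the active states.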

### Review The Optimized Result

In [None]:
# location of the result file in GCS:
# gs://statmike-mlops-349915/applied-genai/prompt-optimization/lego_lyrics_2025-01-29T22:37:33/instruction/optimized_results.json


In [42]:
result = json.loads(
    bucket.blob(f'{SERIES}/{EXPERIMENT}/{job_name}/instruction/optimized_results.json').download_as_text()
)

In [43]:
result

{'step': 0,
 'metrics': {'coherence/mean': 4.6,
  'fluency/mean': 4.4,
  'composite_metric/mean': 4.5},
 'prompt': 'Write poems in the style requested based on the topic provided'}
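
The reported `composite_metric/mean` is consistent with the `weighted_sum` aggregation and the `[.5, .5]` weights set in the job configuration. The arithmetic below is a sketch under the assumption that the composite is the weighted sum of the per-metric means. The service's exact computation is not shown in this notebook.

```python
metrics = {"coherence/mean": 4.6, "fluency/mean": 4.4}
weights = {"coherence/mean": 0.5, "fluency/mean": 0.5}

# Weighted sum of the per-metric means.
composite = sum(weights[name] * value for name, value in metrics.items())
print(round(composite, 2))
```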

### Review The Evaluation Process

In [44]:
results_ui = prompt_opt.ResultsUI(path = f'gs://{bucket.name}/{SERIES}/{EXPERIMENT}/{job_name}')

In [45]:
results_df_html = """
<style>
  .scrollable {
    width: 100%;
    height: 80px;
    overflow-y: auto;
    overflow-x: hidden;  /* Hide horizontal scrollbar */
  }
  tr:nth-child(odd) {
    background: var(--colab-highlighted-surface-color);
  }
  tr:nth-child(even) {
    background-color: var(--colab-primary-surface-color);
  }
  th {
    background-color: var(--colab-highlighted-surface-color);
  }
</style>
"""

display(HTML(results_df_html))
display(results_ui.get_container())

VBox(children=(Label(value='Select Run:'), Dropdown(layout=Layout(width='200px'), options=('gs://statmike-mlop…