# Prompt Optimization

## Load API Keys


In [None]:
import warnings
warnings.filterwarnings('ignore')

In [None]:
import os
from utils import get_llama_api_key, get_llama_base_url, get_together_api_key

llama_api_key = get_llama_api_key()
llama_base_url = get_llama_base_url()
together_api_key = get_together_api_key()

In [None]:
#!pip install llama-prompt-ops==0.0.7


## Creating a sample project

In [None]:
# Create the project only if it does not already exist: [ -d "my-project" ] succeeds when the
# directory is present, and || ("or") runs `llama-prompt-ops create my-project` only when that test fails.
![ -d "my-project" ] || llama-prompt-ops create my-project
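
The same "create only if missing" check can also be written in Python. The sketch below is a rough equivalent of the shell one-liner above, not something llama-prompt-ops provides: the `create_project_if_missing` helper and its `run` parameter are hypothetical, and the only CLI call it makes is the `llama-prompt-ops create <name>` command already used in this notebook.

```python
import subprocess
from pathlib import Path

def create_project_if_missing(name, run=subprocess.run):
    """Run `llama-prompt-ops create <name>` only when the directory is absent.

    Returns True if the create command was invoked, False if it was skipped.
    `run` is a hypothetical injection point so the command can be stubbed out.
    """
    if Path(name).is_dir():
        return False
    run(["llama-prompt-ops", "create", name], check=True)
    return True
```

Calling `create_project_if_missing("my-project")` would then behave like the shell cell: a no-op when the folder exists, a `create` call otherwise.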

In [None]:
!ls ./my-project/

## System prompt and dataset

In [None]:
!cat my-project/prompts/prompt.txt

In [None]:
!head -15 my-project/data/dataset.json

In [None]:
!cat my-project/config.yaml

In [None]:
%%writefile my-project/config.yaml
system_prompt:
  file: prompts/prompt.txt
  inputs:
  - question
  outputs:
  - answer
dataset:
  path: data/dataset.json
  input_field:
  - fields
  - input
  golden_output_field: answer
model:
  task_model: together_ai/meta-llama/Llama-4-Scout-17B-16E-Instruct
  proposer_model: together_ai/meta-llama/Llama-3.3-70B-Instruct-Turbo
  api_base: https://api.together.xyz/v1
metric:
  class: llama_prompt_ops.core.metrics.FacilityMetric
  strict_json: false
  output_field: answer
optimization:
  strategy: llama

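In the config above, `input_field` is a two-element list (`fields`, `input`). It appears to be read as a path into each dataset record, which matches the `entry["fields"]["input"]` lookups used later in this notebook. A minimal sketch of that lookup follows; the `resolve_field` helper and the sample record are hypothetical illustrations and are not part of llama-prompt-ops.

```python
from functools import reduce

def resolve_field(record, path):
    """Follow a key path (a single key or a list of keys) into a nested dict."""
    keys = [path] if isinstance(path, str) else list(path)
    return reduce(lambda node, key: node[key], keys, record)

# Made-up record for illustration only; real entries live in data/dataset.json.
sample_record = {
    "fields": {"input": "The elevator on floor 3 is stuck."},
    "answer": '{"urgency": "high"}',
}

question = resolve_field(sample_record, ["fields", "input"])  # nested path
golden = resolve_field(sample_record, "answer")               # top-level key
print(question)
print(golden)
```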

## Running prompt optimization

<div style="background-color:#fff6ff; padding:13px; border-width:3px; border-color:#efe6ef; border-style:solid; border-radius:6px">
<p>Running prompt optimization can take a long time. To keep this notebook fast, we load saved results by default. Set <code>run_optimization</code> to <code>True</code> to run the optimization yourself.</p>
</div>

In [None]:
run_optimization = False     # change to True to run the optimization

if run_optimization:
    !cd my-project && llama-prompt-ops migrate --api-key-env TOGETHERAI_API_KEY

## Analyzing the results

In [None]:
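import glob
import json

# Sort so the chosen file is deterministic, and fail clearly if no results exist.
json_files = sorted(glob.glob("my-project/results/*.json"))
if not json_files:
    raise FileNotFoundError("No result files found in my-project/results/")

with open(json_files[0], "r", encoding="utf-8") as f:
    data = json.load(f)
optimized_prompt = data['prompt']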

In [None]:
with open("my-project/prompts/prompt.txt", "r", encoding="utf-8") as file:
    original_prompt = file.read()

In [None]:
from html import escape
from IPython.display import display, HTML

def compare_strings_side_by_side(text1, text2):
    """Render two strings as a two-column HTML table (original vs. optimized)."""
    cell_style = "width: 50%; padding: 10px; vertical-align: top;"
    pre_style = "white-space: pre-wrap; word-wrap: break-word;"
    html = '<table style="width: 100%; border-collapse: collapse;"><tr><th>Original Prompt</th><th>Optimized Prompt</th></tr>'
    # Escape the prompts so characters such as < and & display literally.
    html += f'<tr><td style="{cell_style}"><pre style="{pre_style}">{escape(text1)}</pre></td><td style="{cell_style}"><pre style="{pre_style}">{escape(text2)}</pre></td></tr></table>'

    display(HTML(html))

compare_strings_side_by_side(original_prompt, optimized_prompt)

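A side-by-side table can make small wording changes hard to spot. As a complementary sketch, the standard library's `difflib.unified_diff` lists only the lines that changed. The `prompt_diff` helper and the two sample strings are hypothetical stand-ins; in the notebook you would pass `original_prompt` and `optimized_prompt` instead.

```python
import difflib

def prompt_diff(original, optimized):
    """Return a unified diff of two prompts, compared line by line."""
    diff_lines = difflib.unified_diff(
        original.splitlines(),
        optimized.splitlines(),
        fromfile="original",
        tofile="optimized",
        lineterm="",
    )
    return "\n".join(diff_lines)

# Made-up prompts for illustration only.
old_prompt = "Classify the message.\nReturn JSON."
new_prompt = "Classify the message.\nReturn valid JSON with an urgency field."

print(prompt_diff(old_prompt, new_prompt))
```

Lines starting with `-` appear only in the original prompt and lines starting with `+` only in the optimized one; identical prompts produce an empty diff.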
## Few-shot examples

In [None]:
data['few_shots'][0]['question']

In [None]:
data['few_shots'][0]['answer']

In [None]:
len(data['few_shots'])

In [None]:
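few_shots = "\n\nFew shot examples\n\n"
for i, shot in enumerate(data['few_shots']):
    # Build each example from single-line f-strings so no stray newline or
    # source indentation leaks into the prompt before the question text.
    few_shots += (
        f"Example {i+1}\n=================\n"
        f"Question:\n{shot['question']}\n\nAnswer:\n{shot['answer']}\n\n"
    )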

## Compare optimized and original prompt

In [None]:
with open("my-project/data/dataset.json", "r", encoding="utf-8") as f:
    ds = json.load(f)

len(ds)

In [None]:
ds_test = ds[int(len(ds)*0.7):]
len(ds_test)
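
The slice above keeps the tail of the dataset as the test set: the cut point is `int(len(ds) * 0.7)`, which rounds down, so the test set gets the remaining records. A small sketch of the same arithmetic on made-up lists (the `holdout_split` helper is hypothetical and only mirrors the slicing used here):

```python
def holdout_split(records, train_fraction=0.7):
    """Split a list at int(len * train_fraction); the tail is the held-out part."""
    cut = int(len(records) * train_fraction)
    return records[:cut], records[cut:]

head, tail = holdout_split(list(range(10)))
print(len(head), len(tail))   # 7 3

head11, tail11 = holdout_split(list(range(11)))
print(len(head11), len(tail11))   # 7 4
```

Because the cut rounds down, dataset sizes that are not a multiple of 10 give the test set slightly more than 30% of the records.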

In [None]:
from utils import evaluate

In [None]:
from together import Together
from tqdm.auto import tqdm

result_original = []
client = Together()

for entry in tqdm(ds_test):
    # Original system prompt, with the dataset input as the user turn.
    messages = [
        {"role": "system", "content": original_prompt},
        {"role": "user", "content": entry["fields"]["input"]},
    ]

    response = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
        messages=messages,
        temperature=0,
    )

    # Score the model output against the golden answer.
    prediction = response.choices[0].message.content
    result_original.append(evaluate(entry["answer"], prediction))

In [None]:
result_optimized = []

for entry in tqdm(ds_test):
    # Optimized system prompt with the few-shot examples appended.
    messages = [
        {"role": "system", "content": optimized_prompt + few_shots},
        {"role": "user", "content": entry["fields"]["input"]},
    ]

    response = client.chat.completions.create(
        model="meta-llama/Llama-3.3-70B-Instruct-Turbo",
        messages=messages,
        temperature=0,
    )

    # Score the model output against the golden answer.
    prediction = response.choices[0].message.content
    result_optimized.append(evaluate(entry["answer"], prediction))

In [None]:
result_optimized[0]

In [None]:
result_optimized[1]

In [None]:
# Average every numeric field of the per-example results (bool is a subclass of int, so it is covered).
float_keys = [k for k, v in result_original[0].items() if isinstance(v, (int, float))]
{k: sum(e[k] for e in result_original) / len(result_original) for k in float_keys}

In [None]:
{k: sum([e[k] for e in result_optimized])/len(result_optimized) for k in float_keys}
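
The two cells above average each numeric field separately for the original and optimized runs. To read the change per metric directly, the averages can be subtracted. The sketch below assumes `evaluate` returns a dict mixing numeric and non-numeric fields; the helper names and the toy results, including their field names, are made up for illustration.

```python
def mean_scores(results):
    """Average every numeric (int, float, bool) field across a list of result dicts."""
    if not results:
        return {}
    keys = [k for k, v in results[0].items() if isinstance(v, (int, float))]
    return {k: sum(r[k] for r in results) / len(results) for k in keys}

def score_deltas(before, after):
    """Per-metric difference (after - before) for keys present in both."""
    return {k: after[k] - before[k] for k in before if k in after}

# Made-up results for illustration; the field names are hypothetical.
toy_original = [{"correct": True, "score": 0.5, "raw": "a"},
                {"correct": False, "score": 0.25, "raw": "b"}]
toy_optimized = [{"correct": True, "score": 0.75, "raw": "a"},
                 {"correct": True, "score": 0.5, "raw": "b"}]

avg_before = mean_scores(toy_original)
avg_after = mean_scores(toy_optimized)
print(avg_before)
print(avg_after)
print(score_deltas(avg_before, avg_after))
```

In the notebook, `mean_scores(result_original)` and `mean_scores(result_optimized)` would take the place of the toy lists; a positive delta means the optimized prompt scored higher on that metric.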