## Tutorial: Optimizing a Prompt

![TextGrad](https://github.com/vinid/data/blob/master/logo_full.png?raw=true)

An autograd engine -- for textual gradients!

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/github/zou-group/TextGrad/blob/main/examples/notebooks/Prompt-Optimization.ipynb)
[![GitHub license](https://img.shields.io/badge/License-MIT-blue.svg)](https://lbesson.mit-license.org/)
[![Arxiv](https://img.shields.io/badge/arXiv-2406.07496-B31B1B.svg)](https://arxiv.org/abs/2406.07496)
[![Documentation Status](https://readthedocs.org/projects/textgrad/badge/?version=latest)](https://textgrad.readthedocs.io/en/latest/?badge=latest)
[![PyPI - Python Version](https://img.shields.io/pypi/pyversions/textgrad)](https://pypi.org/project/textgrad/)
[![PyPI](https://img.shields.io/pypi/v/textgrad)](https://pypi.org/project/textgrad/)

**Objectives:**

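* In this tutorial, we optimize a system prompt with TextGrad: candidate prompts are generated from textual feedback ("textual gradients") and kept only if validation accuracy does not drop.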

**Requirements:**

* You need an OpenAI API key to run this tutorial. Set it as the environment variable OPENAI_API_KEY.
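
Before running any cell that calls the API, it can help to confirm that the key is visible to the Python process. The snippet below is a minimal sketch; `has_api_key` is a hypothetical helper written for this tutorial and is not part of TextGrad.

```python
import os


def has_api_key(env=os.environ, name="OPENAI_API_KEY"):
    """Return True if the variable is set to a non-empty value."""
    return bool(env.get(name, "").strip())


# Report only whether the key is present; never print the key itself.
if not has_api_key():
    print("OPENAI_API_KEY is not set; export it before running the cells below.")
```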


In [27]:
!pip install textgrad # you might need to restart the notebook after installing textgrad

import argparse
import concurrent.futures  # the futures submodule must be imported explicitly
from dotenv import load_dotenv
from tqdm import tqdm
import textgrad as tg
from textgrad.tasks import load_task
import numpy as np
import random
load_dotenv(override=True)




False

Let's first define some helper functions.

In [28]:
def set_seed(seed):
    np.random.seed(seed)
    random.seed(seed)

In [29]:
def eval_sample(item, eval_fn, model):
    """
    Evaluate one (query, answer) pair: run the model on the query and
    return the evaluation function's verdict on the response as an integer.
    """
    x, y = item
    x = tg.Variable(x, requires_grad=False, role_description="query to the language model")
    y = tg.Variable(y, requires_grad=False, role_description="correct answer for the query")
    response = model(x)
    try:
        # Try the keyword-style call first.
        eval_output_variable = eval_fn(inputs=dict(prediction=response, ground_truth_answer=y))
        return int(eval_output_variable.value)
    except Exception:
        # Fall back to the list-style call, whose output must be parsed.
        eval_output_variable = eval_fn([x, y, response])
        eval_output_parsed = eval_fn.parse_output(eval_output_variable)
        return int(eval_output_parsed)

In [30]:
def eval_dataset(test_set, eval_fn, model, max_samples: int = None):
    if max_samples is None:
        max_samples = len(test_set)
    accuracy_list = []
    with concurrent.futures.ThreadPoolExecutor(max_workers=2) as executor:
        futures = []
        # Submit one evaluation job per sample, up to max_samples.
        for sample in test_set:
            future = executor.submit(eval_sample, sample, eval_fn, model)
            futures.append(future)
            if len(futures) >= max_samples:
                break
        tqdm_loader = tqdm(concurrent.futures.as_completed(futures), total=len(futures), position=0)
        for future in tqdm_loader:
            acc_item = future.result()
            accuracy_list.append(acc_item)
            tqdm_loader.set_description(f"Accuracy: {np.mean(accuracy_list)}")
    return accuracy_list
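
The thread-pool pattern in `eval_dataset` can be tried without any API calls. The following is a minimal sketch under stated assumptions: `evaluate_concurrently` and `toy_score` are hypothetical stand-ins for `eval_dataset` and `eval_sample`, and the toy data is illustrative only.

```python
import concurrent.futures


def evaluate_concurrently(samples, score_fn, max_samples=None, max_workers=2):
    """Score samples in a thread pool and return the per-sample scores."""
    if max_samples is None:
        max_samples = len(samples)
    with concurrent.futures.ThreadPoolExecutor(max_workers=max_workers) as executor:
        futures = [executor.submit(score_fn, s) for s in samples[:max_samples]]
        # as_completed yields futures in completion order, not submission order.
        return [f.result() for f in concurrent.futures.as_completed(futures)]


def toy_score(sample):
    """Hypothetical scorer: 1 if the text length equals the expected number."""
    text, expected = sample
    return int(len(text) == expected)


scores = evaluate_concurrently([("ab", 2), ("abc", 5), ("", 0)], toy_score)
print(sum(scores) / len(scores))
```

Because results arrive in completion order, the returned list is suitable for computing a mean but should not be matched back to samples by position.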

In [31]:
def run_validation_revert(system_prompt: tg.Variable, results, model, eval_fn, val_set):
    val_performance = np.mean(eval_dataset(val_set, eval_fn, model))
    previous_performance = np.mean(results["validation_acc"][-1])
    print("val_performance: ", val_performance)
    print("previous_performance: ", previous_performance)
    previous_prompt = results["prompt"][-1]

    if val_performance < previous_performance:
        print(f"rejected prompt: {system_prompt.value}")
        system_prompt.set_value(previous_prompt)
        val_performance = previous_performance

    results["validation_acc"].append(val_performance)
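
The accept-or-revert rule in `run_validation_revert` can be illustrated in isolation. This is a simplified sketch, not the function above: `accept_or_revert` is a hypothetical name, prompts are plain strings instead of `tg.Variable` objects, and (unlike the original, which appends the prompt in the training loop) it records both the score and the prompt in one place.

```python
def accept_or_revert(candidate_prompt, candidate_score, history):
    """Keep the candidate unless it scores below the last accepted score."""
    previous_score = history["validation_acc"][-1]
    previous_prompt = history["prompt"][-1]
    if candidate_score < previous_score:
        # Reject: fall back to the previously accepted prompt and score.
        kept_prompt, kept_score = previous_prompt, previous_score
    else:
        kept_prompt, kept_score = candidate_prompt, candidate_score
    history["validation_acc"].append(kept_score)
    history["prompt"].append(kept_prompt)
    return kept_prompt


# Illustrative values only.
history = {"prompt": ["starting prompt"], "validation_acc": [0.6]}
first = accept_or_revert("worse candidate", 0.1, history)
second = accept_or_revert("better candidate", 0.7, history)
```

Note that a candidate with the same score as the previous prompt is accepted, since the comparison is strict.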

In [32]:
import os
os.environ['OPENAI_API_KEY'] = ''  # insert your key here; never commit or share a real key

In [33]:
!echo $OPENAI_API_KEY

sk-proj-[REDACTED]
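
Echoing an API key writes it into the notebook output, where it is saved and shared with the file. A safer check is to print a masked version. The helper below is a minimal sketch; `mask_secret` is a hypothetical function, not part of any library.

```python
import os


def mask_secret(value, visible=4):
    """Show only the first few characters of a secret and mask the rest."""
    if not value:
        return "<not set>"
    if len(value) <= visible:
        return "*" * len(value)
    return value[:visible] + "*" * (len(value) - visible)


print(mask_secret(os.environ.get("OPENAI_API_KEY", "")))
```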


In [34]:
set_seed(12)
llm_api_eval = tg.get_engine(engine_name="gpt-3.5-turbo-0125")
llm_api_test = tg.get_engine(engine_name="gpt-3.5-turbo-0125")
tg.set_backward_engine(llm_api_eval, override=True)

# Load the BBH object-counting task. Only its evaluation function (eval_fn) is reused below;
# the train/val/test splits are later replaced with a summarization dataset.
train_set, val_set, test_set, eval_fn = load_task("BBH_object_counting", evaluation_api=llm_api_eval)
print("Train/Val/Test Set Lengths: ", len(train_set), len(val_set), len(test_set))
STARTING_SYSTEM_PROMPT = 'Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.'


Train/Val/Test Set Lengths:  50 100 100


In [35]:
train_set[0]

('I have a flute, a piano, a trombone, four stoves, a violin, an accordion, a clarinet, a drum, two lamps, and a trumpet. How many musical instruments do I have?',
 8)

Next, we define the system prompt for the text-based loss, which critiques a summary instead of producing one:

In [36]:
loss_system_prompt = tg.Variable("""You will evaluate a summary to the text.
Do not attempt to summarize it yourself, only identify errors and ways to improve. Be super concise.""",
                                 requires_grad=False,
                                 role_description="system prompt")

loss_fn = tg.TextLoss(loss_system_prompt)


In [37]:
# Load the data
import pandas as pd

df = pd.read_excel('for_summarization_mbart_2048_chunks_summaries (1).xlsx')
df.shape

(3824, 10)

In [38]:
# Use 10 (text, summary) pairs each for training, validation, and testing.
train_set = list(zip(df.text.iloc[:10].tolist(), df.summary.iloc[:10].tolist()))
val_set = list(zip(df.text.iloc[10:20].tolist(), df.summary.iloc[10:20].tolist()))
test_set = list(zip(df.text.iloc[20:30].tolist(), df.summary.iloc[20:30].tolist()))

In [39]:
train_loader = tg.tasks.DataLoader(train_set, batch_size=3, shuffle=True)


# Model driven by the evaluation engine. It is not used in the optimization loop below.
system_prompt = tg.Variable(STARTING_SYSTEM_PROMPT,
                            requires_grad=True,
                            role_description="system prompt to the language model")
model_evaluation = tg.BlackboxLLM(llm_api_eval, system_prompt)

# Model whose system prompt is optimized. This rebinds system_prompt to a new Variable.
system_prompt = tg.Variable(STARTING_SYSTEM_PROMPT,
                            requires_grad=True,
                            role_description="structured system prompt to a somewhat capable language model that specifies the behavior and strategies for the QA task")
model = tg.BlackboxLLM(llm_api_test, system_prompt)

optimizer = tg.TextualGradientDescent(engine=llm_api_eval, parameters=[system_prompt])

# Zero-shot baseline: evaluate the starting prompt before any optimization step.
results = {"test_acc": [], "prompt": [], "validation_acc": []}
results["test_acc"].append(eval_dataset(test_set, eval_fn, model))
results["validation_acc"].append(eval_dataset(val_set, eval_fn, model))
results["prompt"].append(system_prompt.get_value())


INFO:textgrad:LLMCall function forward
  0%|          | 0/10 [00:00<?, ?it/s]INFO:textgrad:StringBasedFunction
INFO:textgrad:LLMCall function forward
Accuracy: 0.0:   0%|          | 0/10 [00:00<?, ?it/s]INFO:textgrad:StringBasedFunction
INFO:textgrad:LLMCall function forward
Accuracy: 0.5:   0%|          | 0/10 [00:00<?, ?it/s]INFO:textgrad:LLMCall function forward
INFO:textgrad:StringBasedFunction
Accuracy: 0.6666666666666666:   0%|          | 0/10 [00:00<?, ?it/s]INFO:textgrad:StringBasedFunction
INFO:textgrad:LLMCall function forward
Accuracy: 0.5:   0%|          | 0/10 [00:00<?, ?it/s]               INFO:textgrad:LLMCall function forward
INFO:textgrad:StringBasedFunction
INFO:textgrad:StringBasedFunction
Accuracy: 0.6:   0%|          | 0/10 [00:00<?, ?it/s]INFO:textgrad:LLMCall function forward
Accuracy: 0.6666666666666666:   0%|          | 0/10 [00:00<?, ?it/s]INFO:textgrad:StringBasedFunction
INFO:textgrad:LLMCall function forward
Accuracy: 0.7142857142857143:   0%|          | 0/

In [40]:
for epoch in range(3):
    for steps, (batch_x, batch_y) in enumerate((pbar := tqdm(train_loader, position=0))):
        pbar.set_description(f"Training step {steps}. Epoch {epoch}")
        optimizer.zero_grad()
        losses = []
        for (x, y) in zip(batch_x, batch_y):
            x = tg.Variable(x, requires_grad=False, role_description="query to the language model")
            y = tg.Variable(y, requires_grad=False, role_description="correct answer for the query")
            response = model(x)
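            try:
                # Try the keyword-style call first.
                eval_output_variable = eval_fn(inputs=dict(prediction=response, ground_truth_answer=y))
            except Exception:
                # Fall back to the list-style call.
                eval_output_variable = eval_fn([x, y, response])
            losses.append(eval_output_variable)
        # Sum the per-sample losses, backpropagate textual feedback, and update the prompt.
        total_loss = tg.sum(losses)
        total_loss.backward()
        optimizer.step()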

        run_validation_revert(system_prompt, results, model, eval_fn, val_set)

        print("sys prompt: ", system_prompt)
        test_acc = eval_dataset(test_set, eval_fn, model)
        results["test_acc"].append(test_acc)
        results["prompt"].append(system_prompt.get_value())
        if steps == 3:
            break

Training step 0. Epoch 0: : 0it [00:00, ?it/s]INFO:textgrad:LLMCall function forward
INFO:textgrad:StringBasedFunction
INFO:textgrad:LLMCall function forward
INFO:textgrad:StringBasedFunction
INFO:textgrad:LLMCall function forward
INFO:textgrad:StringBasedFunction
INFO:textgrad:Idempotent backward
INFO:textgrad:Idempotent backward
INFO:textgrad:Idempotent backward
INFO:textgrad:_backward_through_string_fn prompt
INFO:textgrad:_backward_through_string_fn gradient
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:_backward_through_string_fn prompt
INFO:textgrad:_backward_through_string_fn gradient
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:_backward_through_string_fn prompt
INFO:textgrad:_backward_through_string_fn gradient
INFO:textgrad:_backward_through_llm prompt
INFO:textgrad:_backward_through_llm gradient
INFO:textgrad:TextualGradientDescent prompt for update
INFO:textgrad:

val_performance:  0.1
previous_performance:  0.6
rejected prompt: Palun koosta järgneva teksti põhjal kokkuvõte. Kokkuvõte peaks sisaldama kuni 3 punkti. Kasuta ainult eesti keelt.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.1
previous_performance:  0.6
rejected prompt: Summeerige järgmine tekst. Kokkuvõte peaks koosnema kuni 3 punktist. Kasutage ainult eesti keelt.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.0
previous_performance:  0.6
rejected prompt: Summeerige järgnev tekst. Kokkuvõte peaks koosnema kuni 3 punktist. Kasutage ainult eesti keelt.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.4
previous_performance:  0.6
rejected prompt: Provide a detailed summary of the following text in Estonian using legal terminology where applicable. The summary should consist of exactly three bullet points and should cover key topics such as academic leave policies, quality versus efficiency in education, and support for students with disabilities.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.0
previous_performance:  0.6
rejected prompt: Palun koosta järgneva teksti kokkuvõte. Kokkuvõte peaks koosnema kuni 3 punktist. Kasuta ainult eesti keelt.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.0
previous_performance:  0.6
rejected prompt: Palun koosta järgneva teksti kokkuvõte. Kokkuvõte peaks koosnema kuni 3 punktist. Kasuta ainult eesti keelt.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.1
previous_performance:  0.6
rejected prompt: Generate a detailed summary with up to 3 bullet points that encapsulate the main proposals discussed by Urmas Klaas in the text. Ensure the summary includes references to proposal numbers, key individuals, and specific policy points highlighted in the conversation.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.0
previous_performance:  0.6
rejected prompt: Summeerige järgnev tekst. Kokkuvõte peaks koosnema kuni 3 punktist. Kasutage ainult eesti keelt.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.5
previous_performance:  0.6
rejected prompt: Provide a detailed summary of the text focusing on key points such as coalition agreements, education funding, and legislative proposals. Ensure accuracy, clarity, and relevance in the summary. Use specific context clues like "coalition agreement" and "education minister" to guide the summary.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.5
previous_performance:  0.6
rejected prompt: Provide a detailed summary of the text focusing on key topics discussed by the commission. Ensure accuracy, coherence, and logical flow in the summary. Use specific keywords related to academic policies and regulations. Include up to 3 bullet points in Estonian.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.0
previous_performance:  0.6
rejected prompt: Summeerige järgnev tekst. Kokkuvõte peaks koosnema kuni 3 punktist. Kasutage ainult eesti keelt.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.



val_performance:  0.0
previous_performance:  0.6
rejected prompt: Palun genereeri kokkuvõte järgnevast tekstist, mis on haridus- ja teadusministri Jaak Aaviksoo kõne Riigikogus kõrgharidusreformi teemal. Kokkuvõte peaks sisaldama kuni 3 punkti. Veendu, et kokkuvõte kajastab olulisi aspekte nagu reformi eesmärk, ministri peamised argumendid ning tasuta kõrghariduse olulisus väikerahva jaoks. Kasuta grammatiliselt korrektset eesti keelt.
sys prompt:  Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian.
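
Once the loop has finished, the `results` dictionary holds one prompt and one list of per-sample test scores for each evaluation. The sketch below shows one way to tabulate it; `summarize_results` is a hypothetical helper and the dictionary passed to it here contains made-up values, not the results of this run.

```python
from statistics import mean


def summarize_results(results):
    """Return (step, mean test score, prompt) for every recorded evaluation."""
    rows = []
    for step, (prompt, scores) in enumerate(zip(results["prompt"], results["test_acc"])):
        rows.append((step, mean(scores), prompt))
    return rows


# Illustrative data only: two evaluations of two samples each.
example = {"prompt": ["prompt A", "prompt B"], "test_acc": [[1, 0], [1, 1]]}
for step, score, prompt in summarize_results(example):
    print(step, score, prompt)
```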



In [43]:
model.system_prompt

Variable(value=Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian., role=structured system prompt to a somewhat capable language model that specifies the behavior and strategies for the QA task, grads=Here is a conversation:

<CONVERSATION><LM_SYSTEM_PROMPT> Please provide summary of the following text. summary should consist of up to 3 bullet points. Use only Estonian. </LM_SYSTEM_PROMPT>

<LM_INPUT>  
 Haridus- ja teadusminister Jaak Aaviksoo: Hea eesistuja! Lugupeetud rahvasaadikud! Alustan sellest, et tänan Riigikogu ja kultuurikomisjoni mulle antud võimaluse eest esineda ka teisel lugemisel. See ei ole tavapärane, aga ma arvan, et see on oluline. Üks hea kolleeg eelnevas arutelus ütles, et kavandamisel on suurim muudatus kõrghariduskorralduses viimase 60 aasta jooksul. Tundub, et see oli veidi ülepingutatud hinnang, aga muudatus see tõesti on. Ma olen veendunud ja loodetavasti olete ka teie veendunud, et see muudatus on

In [53]:
model.system_prompt.get_gradient_text()

'The structured system prompt for the language model could be improved in the following ways to address the feedback provided in the <OBJECTIVE_FUNCTION>:\n\n1. **Context Awareness**: Include specific context cues in the prompt to guide the language model towards generating a summary that captures the key details of the conversation. For example, mention that the text is a speech delivered by a minister in the Riigikogu regarding a kõrgharidusreform (higher education reform).\n\n2. **Fact-Checking**: Encourage the language model to verify the accuracy of the summary by incorporating fact-checking instructions in the prompt. This can involve prompting the model to cross-reference the summary with the original text to ensure factual correctness.\n\n3. **Completeness**: Specify in the prompt that the generated summary should cover essential aspects of the speech, such as the purpose of the reform, key arguments presented by the minister, and the importance of free higher education. This w

In [49]:
tg.BlackboxLLM

In [51]:
tg.autograd.function.Module??

## Test with some text

In [57]:
text="""
 Helmen Kütt:  Suur tänu, austatud eesistuja! Lugupeetud peaminister! Ma pean vajalikuks siiski tuua siia ka selle mõtte, mille kohta ma küsisin. Nimelt, kas kaaluti ka üksi elava pensionäri toetust? Seda ei saaks ei teie vanemad ega paljud teised, kelle pension on aastal 2022 suurem kui 669 eurot. Seal on pensionilagi ja piir pandud. Need on registri järgi üksi elavad pensionärid. Tõsi, kui on abielupaar, pensionärid, kellest üks on kirjutatud ühte omavalitsusse või ühte majja ja teine teise, siis saavad mõlemad toetust. Sel puhul on registri baasil võimalik toetust maksta. Minu küsimus puudutas neid 80 000 inimest. Võib-olla oleks see [nende olukorda] kergendanud. Kuna meid vaatavad väga paljud inimesed, siis oli see oluline täpsustus, mida öelda. Teiseks, te tõite näite, et valitsus oli juba andnud riigieelarve Riigikogusse menetlusse. No ma ei usu, et kui valitsusel oleks olnud soov seda asja lahendada, ka Riigikogus seda lahendada, siis koalitsioonierakonda kuuluvad saadikud siin saalis ei oleks seda teinud. Sest see meede – baasosa tõus 20 euro võrra – ei oleks tähendanud mitte ühegi elektroonilise süsteemi juurdetegemist. See oleks olnud väga kiiresti teostatav. Selle kohta vastas meile ka sotsiaalkomisjonis sotsiaalkindlustusametnik, kes ütles, et see ei nõua mingi arvutisüsteemi loomist, erinevalt sellest, kui näiteks tuleks keskmine pension vabastada tulumaksust. Seda oleks võinud teha kohe aasta alguses. Tõepoolest, kuna inimesed jälgivad seda saadet, on mul näide. On tulnud kiri, kus inimene kirjutab, et mõtleb hoolega sel teemal kaasa. Peale eilset "AK-d" ja "Esimest stuudiot" ning tänaseid Delfi uudiseid on mul ka sellised küsimused. Kas ei võiks koalitsioonipartnerid avalikkusele edastada ühiseid seisukohti? Ma tõesti ei mäleta sellist olukorda riigis. Ma ei kritiseeri meetmete paketti, aga ma tahaksin, et valitsejad jagaksid selget ühist infot selle kohta, mida mina kodanikuna peaks tegema.

 Peaminister Kaja Kallas:  Jaa, ma olen selles teiega väga nõus, et eriti kriisis on väga oluline, et kõik räägiksid ühel häälel ega üritaks kuidagi tekki enda peale tõmmata või punkti võtta. Oluline on see, et inimesed saaksid ikkagi selgust, kuidas nad saavad abi, ja ärevust tuleb sellisel raskel ajal leevendada. Üksi elavate pensionäride kohta see küsimus, et kas saab lihtsamalt. Tõesti, info on ju olemas. Aga teisest küljest info, mida ei ole, on elektriarvete info. Nagu ma ütlesin, ega kõik, ka üksi elavad pensionärid, ei pruugi toetust vajada – just sellel põhjusel, et nende arve võib-olla ei ole kasvanud, sest neil on fikseeritud hind või ei ole arve märkimisväärselt suurenenud. See on see koht, mis tuleb kokku viia. Aga ma võtan selle info siit kaasa ja räägin kindlasti riigihalduse ministriga. Me arutame, kas tõesti oleks võimalik teha see kuidagi lihtsamalt. Tõesti, see mure ka, mis te ütlete omikrontüve või COVID‑i leviku kohta – seda on väga oluline arvesse võtta. Nii et ma võtan selle siit kaasa ning me räägime sellest ja vaatame, kas saame midagi lihtsamaks.

 Aseesimees Hanno Pevkur: Lõpetan selle küsimuse käsitlemise."""

x = tg.Variable(text, requires_grad=False, role_description="query to the language model")
model(x)

INFO:textgrad:LLMCall function forward


Variable(value=- Helmen Kütt tõi välja mure üksi elavate pensionäride toetusega seoses, kus ta tõstatas küsimuse, kas kaaluti ka nende toetamist, kelle pension ületab 669 eurot aastas.
- Peaminister Kaja Kallas nõustus, et oluline on kriisi ajal jagada selget ja ühist infot, et leevendada inimeste ärevust ning kaaluda võimalusi, kuidas toetusi lihtsustada.
- Arutelu lõpetas aseesimees Hanno Pevkur., role=response from the language model, grads=)
