# Optimizing Prompts with **Automatic Prompt Engineer** (APE)

This notebook demonstrates how to use Automatic Prompt Engineer (APE) (arxiv link) to optimize prompts for text generation. In its simplest form, APE takes a dataset (a list of inputs and a list of outputs) and a prompt template, and optimizes the prompt template so that the model generates the outputs given the inputs.

APE accomplishes this in two steps. First, it uses a language model to generate a set of candidate prompts. Second, it scores each candidate with a prompt evaluation function. It then returns the candidate with the highest evaluation score.
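
The generate-then-score loop described above can be sketched in a few lines. This is only an illustration of the idea, not the library's implementation: `toy_propose`, `toy_score`, and `rank_prompts` are hypothetical stand-ins for the language-model proposal step and the evaluation function, and the deduplication step mirrors the "Deduplicating..." message that appears in the output later in this notebook.

```python
from typing import Callable, List, Tuple


def toy_propose() -> List[str]:
    # Hypothetical stand-in for the language model that proposes candidates.
    return [
        "write the antonym of the word.",
        "repeat the word.",
        "write the antonym of the word.",  # duplicate on purpose
    ]


def toy_score(prompt: str) -> float:
    # Hypothetical stand-in for the evaluation function: higher is better.
    return 1.0 if "antonym" in prompt else 0.0


def rank_prompts(
    candidates: List[str], score_fn: Callable[[str], float]
) -> List[Tuple[float, str]]:
    # Score every candidate and sort from best to worst.
    scored = [(score_fn(p), p) for p in candidates]
    scored.sort(key=lambda pair: pair[0], reverse=True)
    return scored


# Step 1: generate candidates and drop exact duplicates (order preserved).
candidates = list(dict.fromkeys(toy_propose()))

# Step 2: evaluate each candidate, then keep the highest-scoring one.
ranked = rank_prompts(candidates, toy_score)
best_score, best_prompt = ranked[0]
print(best_score, best_prompt)
```

In the real library both steps call a language model, so they are far more expensive than this toy version; the control flow is the part the sketch is meant to convey.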

In [9]:
from automatic_prompt_engineer import generate, config, template

def test_generate_instruction():
    prompt_gen_template = "I gave a friend an instruction. Based on the instruction they produced the following input-output pairs:\n\n[full_DEMO]\n\nThe instruction was to [APE]"
    demos_template = "Input: [INPUT]\nOutput: [OUTPUT]"

    prompt_gen_template = template.GenerationTemplate(prompt_gen_template)
    demos_template = template.DemosTemplate(demos_template)

    inputs = ['a', 'b', 'c', 'd', 'e', 'f', 'g', 'h', 'i', 'j']
    outputs = ['1', '2', '3', '4', '5', '6', '7', '8', '9', '10']
    data = inputs, outputs

    user_config = {
        'generation': {
            'num_subsamples': 2,
            'num_demos': 5,
            'num_prompts_per_subsample': 2,
            'model': {
                'gpt_config': {
                    'model': 'gpt-35-turbo-instruct'
                }
            }
        }
    }

    conf = config.update_config(user_config)

    # Each of the 2 subsamples yields 2 prompts, so 4 candidates are expected.
    prompts = generate.generate_prompts(
        prompt_gen_template, demos_template, data, conf['generation'])

    for p in prompts:
        print(p)
    assert len(prompts) == 4

test_generate_instruction()

[GPT_forward] Generating 4 completions, split into 1 batches of size 1000


100%|██████████| 1/1 [00:01<00:00,  1.26s/it]

 take the letter, convert it to its corresponding number in the alphabet, and then multiply it by 2. For example, e is the 5th letter in the alphabet, so when multiplied by 2 it becomes 10. However, since
 assign each letter of the alphabet a numerical value based on its position in the alphabet. So, for example, "a" would have a value of 1, "b" would have a value of 2, and so on. In this
 assign a numerical value to each letter of the alphabet, starting with A=1 and ending with Z=26. In this case, the input corresponds to the letter's position in the alphabet, with A=1, B=2, C=
 assign a numerical value to each letter of the alphabet, starting with "a" being equal to 1 and increasing by 1 for each subsequent letter. Therefore, "b" would be equal to 2, "c" to 3,

In [2]:
# First, let's define a simple dataset consisting of words and their antonyms.
words = ["sane", "direct", "informally", "unpopular", "subtractive", "nonresidential",
    "inexact", "uptown", "incomparable", "powerful", "gaseous", "evenly", "formality",
    "deliberately", "off"]
antonyms = ["insane", "indirect", "formally", "popular", "additive", "residential",
    "exact", "downtown", "comparable", "powerless", "solid", "unevenly", "informality",
    "accidentally", "on"]

In [3]:
# Now, we need to define the format of the prompt that we are using.

eval_template = \
"""Instruction: [PROMPT]
Input: [INPUT]
Output: [OUTPUT]"""
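
To make the role of the placeholders concrete, here is a small sketch of what a filled-in evaluation template looks like for one example. It uses plain string replacement; how the library performs the substitution internally is not shown in this notebook, and `fill_template` is a hypothetical helper written only for this illustration.

```python
eval_template = """Instruction: [PROMPT]
Input: [INPUT]
Output: [OUTPUT]"""


def fill_template(template: str, prompt: str, input_: str, output: str = "") -> str:
    # Hypothetical helper: substitute each placeholder with plain str.replace.
    return (
        template.replace("[PROMPT]", prompt)
        .replace("[INPUT]", input_)
        .replace("[OUTPUT]", output)
    )


filled = fill_template(
    eval_template,
    prompt="Write an antonym to the following word.",
    input_="sane",
    output="insane",
)
print(filled)
```

Each candidate prompt takes the place of `[PROMPT]`, and each (word, antonym) pair from the dataset takes the places of `[INPUT]` and `[OUTPUT]`.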

In [4]:
# Now, let's use APE to find prompts that generate antonyms for each word.
from automatic_prompt_engineer import ape

result, demo_fn = ape.simple_ape(
    dataset=(words, antonyms),
    eval_template=eval_template,
)

Generating prompts...
[GPT_forward] Generating 50 completions, split into 1 batches of size 2000


100%|██████████| 1/1 [00:01<00:00,  1.31s/it]


Model returned 50 prompts. Deduplicating...
Deduplicated to 37 prompts.
Evaluating prompts...


Evaluating prompts: 100%|██████████| 20/20 [00:13<00:00,  1.46it/s]

Finished evaluating.

In [5]:
# Let's see the results.
print(result)

score: prompt
----------------
-0.10:  create an opposite or contrasting word.
-0.13:  find the opposite or inverse of each word. 
-0.14:  find the opposite or inverse of the given word.
-0.14:  change the word by making it an antonym.
-0.15:  find the opposite or contrasting word for each given word.
-0.15:  find the opposite or antonym of each given word.
-0.15:  create an antonym, or a word that has the opposite meaning, of the given input.
-0.17:  create an opposite or contrasting word to the given input.
-0.17:  create an opposite or contrasting word for each input.
-0.17:  change the word to its opposite or antonym.



Let's compare with a prompt written by a human:

"*Write an antonym to the following word.*"

In [6]:
from automatic_prompt_engineer import ape

# Score a hand-written prompt on the same dataset and template for comparison.
manual_prompt = "Write an antonym to the following word."

human_result = ape.simple_eval(
    dataset=(words, antonyms),
    eval_template=eval_template,
    prompts=[manual_prompt],
)

In [7]:
print(human_result)

log(p): prompt
----------------
-0.59: Write an antonym to the following word.
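
As a rough way to read the gap between the two results, the snippet below compares the best APE score (-0.10) with the human prompt's score (-0.59). It assumes both numbers are log-probabilities on the same scale, as the `log(p)` header suggests; the notebook does not state how the scores are normalized, so treat the ratio as indicative only.

```python
import math

ape_best = -0.10  # top score from the APE results above
human = -0.59     # score of the hand-written prompt

# Assumption: both values are log-probabilities computed the same way.
gap = ape_best - human
ratio = math.exp(gap)  # probability ratio implied by the log-score gap

print(f"log-score gap: {gap:.2f}")
print(f"implied probability ratio: {ratio:.2f}")
```

Under that assumption, the best APE prompt makes the reference antonyms noticeably more likely than the hand-written prompt does.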

