# Structured Output Example

## Installation

In [1]:
# !pip install --quiet --force-reinstall prompttools

## Set up imports and API keys

First, we'll set our API key. In DEBUG mode we don't need a real OpenAI key, so for now we'll set it to an empty string.

In [2]:
import os
os.environ['DEBUG'] = "1"  # Set to "1" if you want to use debug mode.
os.environ['OPENAI_API_KEY'] = ""

Then we'll import the relevant `prompttools` modules to set up our experiment.

In [3]:
from prompttools.harness import PromptTemplateExperimentationHarness
from prompttools.experiment import OpenAICompletionExperiment

## Run experiments

Next, we create our test inputs. For this example, we'll use prompt templates, which use [Jinja](https://jinja.palletsprojects.com/en/3.1.x/) for templating. Each `{{input}}` placeholder is filled with the `input` value from one of the user inputs.

In [4]:
prompt_templates = [
    "Generate valid JSON from the following input: {{input}}",
    "Generate valid python to complete the following task: {{input}}",
]
user_inputs = [
    {"input": "The task is to count all the words in a string"},
    {"input": "The task is to add up numbers 1 to 100"},
]
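
To make the pairing concrete, here is a minimal sketch of how two templates and two inputs expand into four prompts. The `render` helper is a hypothetical stand-in for Jinja rendering (it only does plain `{{name}}` substitution), and the pairing order is an assumption chosen to match the results table further down; it is not taken from the `prompttools` source.

```python
from itertools import product

prompt_templates = [
    "Generate valid JSON from the following input: {{input}}",
    "Generate valid python to complete the following task: {{input}}",
]
user_inputs = [
    {"input": "The task is to count all the words in a string"},
    {"input": "The task is to add up numbers 1 to 100"},
]


def render(template: str, variables: dict) -> str:
    """Hypothetical stand-in for Jinja: fill each {{name}} placeholder."""
    for name, value in variables.items():
        template = template.replace("{{" + name + "}}", str(value))
    return template


# Pair every template with every input: 2 templates x 2 inputs = 4 prompts.
prompts = [render(t, u) for t, u in product(prompt_templates, user_inputs)]
for prompt in prompts:
    print(prompt)
```

Real Jinja templates support much more than simple substitution (filters, conditionals, loops), so treat this only as an illustration of the template-by-input expansion.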

Now we can define an experimentation harness for our inputs and model. We also pass model arguments, here to set the model temperature to 0.

In [5]:
harness = PromptTemplateExperimentationHarness(
    OpenAICompletionExperiment,
    "text-davinci-003",
    prompt_templates,
    user_inputs,
    # Zero temperature is better for structured outputs
    model_arguments={"temperature": 0},
)

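We can then run the experiment to get results. Because DEBUG mode is enabled, the responses shown are placeholder values rather than real model output.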

In [6]:
harness.run()
harness.visualize()

| | prompt | response(s) | latency |
|---|---|---|---|
| 0 | Generate valid JSON from the following input: The task is to count all the words in a string | \n\nGeorge Washington | 4e-06 |
| 1 | Generate valid JSON from the following input: The task is to add up numbers 1 to 100 | \n\nGeorge Washington | 2e-06 |
| 2 | Generate valid python to complete the following task: The task is to count all the words in a string | \n\nGeorge Washington | 2e-06 |
| 3 | Generate valid python to complete the following task: The task is to add up numbers 1 to 100 | \n\nGeorge Washington | 2e-06 |


You can use the `pivot` keyword argument to view results by the template and inputs that created them: each row is a user input and each column is a prompt template.

In [7]:
harness.visualize(pivot=True)

| user_input (rows) / prompt_template (columns) | Generate valid JSON from the following input: {{input}} | Generate valid python to complete the following task: {{input}} |
|---|---|---|
| {'input': 'The task is to add up numbers 1 to 100'} | \n\nGeorge Washington | \n\nGeorge Washington |
| {'input': 'The task is to count all the words in a string'} | \n\nGeorge Washington | \n\nGeorge Washington |
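
As a rough illustration of what a pivoted view does, the sketch below regroups flat result rows into a nested mapping keyed first by user input and then by template. The row data and names here are hypothetical placeholders, and this is not how `prompttools` implements `pivot`; it only shows the reshaping idea.

```python
from collections import defaultdict

# Hypothetical flat results: (template, user_input, response) triples.
rows = [
    ("template A: {{input}}", "input 1", "response A1"),
    ("template A: {{input}}", "input 2", "response A2"),
    ("template B: {{input}}", "input 1", "response B1"),
    ("template B: {{input}}", "input 2", "response B2"),
]

# Rows of the pivoted view are user inputs; columns are templates.
pivot = defaultdict(dict)
for template, user_input, response in rows:
    pivot[user_input][template] = response

for user_input, by_template in pivot.items():
    print(user_input, by_template)
```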


## Evaluate the model response

To evaluate the results, we need an eval function for each metric. Rather than writing our own, we can use the JSON and Python validation utilities that ship with `prompttools`.

In [8]:
from prompttools.utils import validate_json
from prompttools.utils import validate_python
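
For intuition about what such checks do, here is a simplified, standalone sketch of two validators. The function names `is_valid_json` and `is_valid_python` are hypothetical, and the real `prompttools` utilities may use a different signature and a different validation strategy; this version relies only on the standard library's `json.loads` and `ast.parse`.

```python
import ast
import json


def is_valid_json(text: str) -> float:
    """Return 1.0 if the text parses as JSON, otherwise 0.0."""
    try:
        json.loads(text)
    except (TypeError, ValueError):  # JSONDecodeError subclasses ValueError
        return 0.0
    return 1.0


def is_valid_python(text: str) -> float:
    """Return 1.0 if the text is syntactically valid Python, otherwise 0.0."""
    try:
        ast.parse(text)
    except (SyntaxError, ValueError):
        return 0.0
    return 1.0


mock_response = "\n\nGeorge Washington"
print(is_valid_json(mock_response), is_valid_python(mock_response))  # 0.0 0.0
print(is_valid_json('{"count": 3}'))  # 1.0
print(is_valid_python("total = sum(range(1, 101))"))  # 1.0
```

Note that `ast.parse` only checks syntax. It does not run the code or confirm that the code solves the task.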

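Finally, we can evaluate and visualize the results. Each call to `evaluate` adds a column of scores under the given name. In DEBUG mode every score is 0.0, because the placeholder response is neither valid JSON nor valid Python.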

In [9]:
harness.evaluate("is_json", validate_json.evaluate)
harness.evaluate("is_python", validate_python.evaluate)
harness.visualize()

| | prompt | response(s) | latency | is_json | is_python |
|---|---|---|---|---|---|
| 0 | Generate valid JSON from the following input: The task is to count all the words in a string | \n\nGeorge Washington | 4e-06 | 0.0 | 0.0 |
| 1 | Generate valid JSON from the following input: The task is to add up numbers 1 to 100 | \n\nGeorge Washington | 2e-06 | 0.0 | 0.0 |
| 2 | Generate valid python to complete the following task: The task is to count all the words in a string | \n\nGeorge Washington | 2e-06 | 0.0 | 0.0 |
| 3 | Generate valid python to complete the following task: The task is to add up numbers 1 to 100 | \n\nGeorge Washington | 2e-06 | 0.0 | 0.0 |
