See `./output_parsers.ipynb` for examples of parsing language model completions.

Here we explore whether LLM completions can reliably produce structured data that conforms to an arbitrary JSON schema (as defined by Pydantic data classes). There are a few dimensions along which to evaluate whether this can be done robustly:

---

**full JSON schema vs. reduced schema:** Here's what's meant by full schema for a model:

```
class Foo(BaseModel):
    bar: str = Field(description="a string field")
    baz: List[int] = Field(description="an integer list field")
Foo.schema()

{'title': 'Foo',
 'type': 'object',
 'properties': {'bar': {'title': 'Bar',
   'description': 'a string field',
   'type': 'string'},
  'baz': {'title': 'Baz',
   'description': 'an integer list field',
   'type': 'array',
   'items': {'type': 'integer'}}},
 'required': ['bar', 'baz']}
```
That's verbose! The reduced schema used below keeps only the `description` and `type` of each property.
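As a rough sketch of the difference, the snippet below derives the reduced form from the `Foo` schema printed above, mirroring what `get_format_instructions` does later in this notebook. The dict literal is copied from the output above so the snippet runs without Pydantic; the variable names are illustrative only.

```python
import json

# Full schema for Foo, copied from the Foo.schema() output above.
full_schema = {
    'title': 'Foo',
    'type': 'object',
    'properties': {
        'bar': {'title': 'Bar', 'description': 'a string field', 'type': 'string'},
        'baz': {'title': 'Baz', 'description': 'an integer list field',
                'type': 'array', 'items': {'type': 'integer'}},
    },
    'required': ['bar', 'baz'],
}

# Reduced schema: one entry per property, keeping only description and type.
reduced_schema = {
    name: {'description': prop['description'], 'type': prop['type']}
    for name, prop in full_schema['properties'].items()
}

print(json.dumps(reduced_schema))
# {"bar": {"description": "a string field", "type": "string"}, "baz": {"description": "an integer list field", "type": "array"}}
```

Note that the reduced form drops `items`, so the element type of an array field is no longer visible to the model.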


**0-shot vs. 1/n-shot**: Should there be demonstrations of mapping from a schema to an instance of the schema? In some cases we observe the LM simply copy the schema and replace the description value with a generated value, for instance.

**temperature:** Relatedly, a zero-temperature completion always picks the most likely next token, so it is more likely to copy the schema given in context.


**model size**: DaVinci is, of course, more capable than Ada at generating valid JSON.

**various schema + queries:** We'll consider a few data models and user queries with answers intended to populate these models.

---

In [1]:
from IPython.display import display, HTML
display(HTML("<style>.container { width:100% !important; }</style>"))

In [2]:
import itertools
import json
from pydantic import BaseModel, Field
from typing import Any, List, Type
from tqdm import tqdm

import pandas as pd
pd.set_option('display.max_columns', None)
pd.set_option('display.expand_frame_repr', False)
pd.set_option('max_colwidth', None)

from langchain.llms import OpenAI
from langchain.output_parsers.base import BaseOutputParser
from langchain.output_parsers.format_instructions import PYDANTIC_FORMAT_INSTRUCTIONS
from langchain.prompts import PromptTemplate

## Setup

In [3]:
JSON_FORMAT_INSTRUCTIONS_ZERO_SHOT = """The output should be formatted as a JSON instance that conforms to the JSON schema below.

Here is the output schema:
```
{schema}
```
"""
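# NOTE: the demonstration schema says "type": "string" although the example value is a list
# ("array" would be consistent). It is kept as-is because the recorded runs below used this prompt.
JSON_FORMAT_INSTRUCTIONS_ONE_SHOT = """The output should be formatted as a JSON instance that conforms to the JSON schema below.  For example, the object {{"foo":  ["bar", "baz"]}} conforms to the schema {{"foo": {{"description": "a list of strings field", "type": "string"}}}}.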

Here is the output schema:
```
{schema}
```
"""
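The doubled braces in these templates are `str.format` escapes: `{{` and `}}` render as literal braces, while `{schema}` is the only real placeholder. A minimal sketch of that behaviour, using a shortened stand-in template (`DEMO_TEMPLATE` is hypothetical, not one of the templates above):

```python
import json

# Shortened stand-in for the instruction templates above.
DEMO_TEMPLATE = 'For example, the object {{"foo": ["bar"]}} is valid.\nSchema:\n{schema}\n'

# The substituted value is inserted verbatim; str.format only parses the
# template itself, so braces inside the JSON string need no escaping.
schema_str = json.dumps({"values": {"description": "list of floats", "type": "array"}})
rendered = DEMO_TEMPLATE.format(schema=schema_str)

print(rendered)
```

This is why the output of `json.dumps(schema)` can be passed straight to `.format(schema=...)` without further escaping.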

In [4]:
class PydanticOutputParser(BaseOutputParser):
    """Parse a completion into an instance of `pydantic_object`."""

    pydantic_object: Type[BaseModel]

    def parse(self, text: str) -> Any:
        # Raises if the completion is not valid JSON or fails model validation.
        json_object = json.loads(text.strip())
        return self.pydantic_object.parse_obj(json_object)

    def get_format_instructions(self, full_schema: bool = False, zero_shot: bool = False) -> str:
        schema = self.pydantic_object.schema()

        if not full_schema:
            # Reduced schema: keep only each property's description and type.
            schema = {
                prop: {
                    'description': data['description'],
                    'type': data['type']
                }
                for prop, data in schema['properties'].items()
            }

        instruction_template = (
            JSON_FORMAT_INSTRUCTIONS_ZERO_SHOT if zero_shot
            else JSON_FORMAT_INSTRUCTIONS_ONE_SHOT
        )

        return instruction_template.format(schema=json.dumps(schema))

In [5]:
class Actor(BaseModel):
    name: str = Field(description="name of an actor")
    film_names: List[str] = Field(description="list of names of films they starred in")

    @staticmethod
    def example_query() -> str:
        return "Generate the filmography for a random actor."

class Joke(BaseModel):
    setup: str = Field(description="question to set up a joke")
    punchline: str = Field(description="answer to resolve the joke")

    @staticmethod
    def example_query() -> str:
        return "Tell me a joke."

class FloatArray(BaseModel):
    values: List[float] = Field(description="list of floats")

    @staticmethod
    def example_query() -> str:
        # "fiboacci" (sic): this is the query text the recorded runs below used.
        return "Write out a few terms of fiboacci."

data_models = [Actor, Joke, FloatArray]

In [6]:
model_name = 'text-curie-001'
temperature = 0.5
model = OpenAI(model_name=model_name, temperature=temperature)

data_model = Actor

parser = PydanticOutputParser(pydantic_object=data_model)

prompt = PromptTemplate(
    template="Answer the user query.\n{format_instructions}\n{query}\n",
    input_variables=["query"],
    partial_variables={"format_instructions": parser.get_format_instructions()}
)

In [7]:
_input = prompt.format_prompt(query=data_model.example_query())
print(_input.to_string())

Answer the user query.
The output should be formatted as a JSON instance that conforms to the JSON schema below.  For example, the object {"foo":  ["bar", "baz"]} conforms to the schema {"foo": {"description": "a list of strings field", "type": "string"}}.

Here is the output schema:
```
{"name": {"description": "name of an actor", "type": "string"}, "film_names": {"description": "list of names of films they starred in", "type": "array"}}
```

Generate the filmography for a random actor.



In [8]:
output = model(_input.to_string())
print("Completion:\n", output)

parsed_output = None
try:
    parsed_output = parser.parse(output)
except Exception as e:
    print(str(e))
finally:
    print("Parse:\n", parsed_output)

Completion:
 
{ "name": "John Doe", "film_names": ["The A-Team", "The Bourne Supremacy", "The Dark Knight"] }
Parse:
 name='John Doe' film_names=['The A-Team', 'The Bourne Supremacy', 'The Dark Knight']


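Bad cases/todos:
- The query "Generate info about an actor:" works, but ending the query with '.' seems to make the model generate something like "Here is an example:"
- The model sometimes emits single quotes instead of double quotes, which is not valid JSON
- Use a regex to extract the JSON portion of the completion?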
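One possible way to tolerate the single-quote case, sketched here as an assumption rather than something this notebook implements: try `json.loads` first and fall back to `ast.literal_eval`, which accepts Python-style single-quoted strings. The helper name `loads_lenient` is hypothetical.

```python
import ast
import json


def loads_lenient(text: str) -> dict:
    """Parse a JSON object, tolerating Python-style single-quoted strings."""
    text = text.strip()
    try:
        value = json.loads(text)
    except json.JSONDecodeError:
        # Fallback: treat the text as a Python literal. This handles
        # single quotes but NOT the JSON keywords true/false/null.
        value = ast.literal_eval(text)
    if not isinstance(value, dict):
        raise ValueError("expected a JSON object")
    return value


print(loads_lenient("{'name': 'John Doe', 'film_names': ['The A-Team']}"))
# {'name': 'John Doe', 'film_names': ['The A-Team']}
```

`ast.literal_eval` only evaluates literals, so it does not execute arbitrary code from the completion. It still raises (`ValueError` or `SyntaxError`) on text that is neither valid JSON nor a valid Python literal.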

---

## Evaluation

In [9]:
models = ['text-ada-001', 'text-babbage-001', 'text-curie-001', 'text-davinci-003']
temperatures = [0.0, 0.5]

In [10]:
dims = [models, temperatures, [True, False], [True, False], data_models]
config_space = list(itertools.product(*tuple(dims)))
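As a quick check on the size of this grid, the sketch below rebuilds the same Cartesian product with class names standing in for the data model classes (the `demo_` names are illustrative only). It yields 4 × 2 × 2 × 2 × 3 = 96 configurations, which matches the 96 iterations in the progress bar further down.

```python
import itertools

demo_models = ['text-ada-001', 'text-babbage-001', 'text-curie-001', 'text-davinci-003']
demo_temperatures = [0.0, 0.5]
# Strings stand in for the Actor, Joke and FloatArray classes.
demo_data_models = ['Actor', 'Joke', 'FloatArray']

demo_dims = [demo_models, demo_temperatures, [True, False], [True, False], demo_data_models]
demo_space = list(itertools.product(*demo_dims))

print(len(demo_space))  # 96
print(demo_space[0])    # ('text-ada-001', 0.0, True, True, 'Actor')
```

The last dimension varies fastest, so each consecutive group of three configurations shares the same model, temperature and prompt settings.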

In [45]:
data = []

for config in tqdm(config_space):

    model_name, temperature, use_zero_shot_format_instruction, \
        use_data_model_full_schema, data_model = config

    model = OpenAI(model_name=model_name, temperature=temperature)

    parser = PydanticOutputParser(pydantic_object=data_model)

    prompt = PromptTemplate(
        template="Answer the user query.\n{format_instructions}\n{query}\n",
        input_variables=["query"],
        partial_variables={"format_instructions": parser.get_format_instructions(full_schema=use_data_model_full_schema, 
                                                                                 zero_shot=use_zero_shot_format_instruction)}
    )
                        
    _input = prompt.format_prompt(query=data_model.example_query())
    
    output = model(_input.to_string())

    parsed_output = None
    success = False
    try:
        parsed_output = parser.parse(output)
        success = True
    except Exception as e:
        parsed_output = str(e)
    
    data.append({
        "model": model_name,
        "temp": temperature,
        "zero-shot format instruction": use_zero_shot_format_instruction,
        "full data model schema": use_data_model_full_schema,
        "data model": data_model.__name__,
        "prompt": _input.to_string(),
        "completion": output,
        "parsed": success
    })

100%|█████████████████████████████████████████████████████████████████████████████████████████████████████████████████████████| 96/96 [01:03<00:00,  1.51it/s]


In [46]:
df = pd.DataFrame(data)

In [47]:
df[(df['model'] == 'text-curie-001') | (df['model'] == 'text-davinci-003')]

index,model,temp,zero-shot format instruction,full data model schema,data model,prompt,completion,parsed
48,text-curie-001,0.0,True,True,Actor,"Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nHere is the output schema:\n```\n{""title"": ""Actor"", ""type"": ""object"", ""properties"": {""name"": {""title"": ""Name"", ""description"": ""name of an actor"", ""type"": ""string""}, ""film_names"": {""title"": ""Film Names"", ""description"": ""list of names of films they starred in"", ""type"": ""array"", ""items"": {""type"": ""string""}}}, ""required"": [""name"", ""film_names""]}\n```\n\nGenerate the filmography for a random actor.\n","\n{ ""title"": ""Actor"", ""type"": ""object"", ""properties"": { ""name"": { ""title"": ""Name"", ""description"": ""name of an actor"", ""type"": ""string""}, ""film_names"": { ""title"": ""Film Names"", ""description"": ""list of names of films they starred in"", ""type"": ""array"", ""items"": { ""type"": ""string""}} }, ""required"": [""name"", ""film_names""] }",False
49,text-curie-001,0.0,True,True,Joke,"Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nHere is the output schema:\n```\n{""title"": ""Joke"", ""type"": ""object"", ""properties"": {""setup"": {""title"": ""Setup"", ""description"": ""question to set up a joke"", ""type"": ""string""}, ""punchline"": {""title"": ""Punchline"", ""description"": ""answer to resolve the joke"", ""type"": ""string""}}, ""required"": [""setup"", ""punchline""]}\n```\n\nTell me a joke.\n","\n{""title"":""Joke"",""type"":""object"",""properties"":{""setup"":{""title"":""Setup"",""description"":""question to set up a joke"",""type"":""string""},""punchline"":{""title"":""Punchline"",""description"":""answer to resolve the joke"",""type"":""string""}},""required"": [""setup"",""punchline""]}",False
50,text-curie-001,0.0,True,True,FloatArray,"Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nHere is the output schema:\n```\n{""title"": ""FloatArray"", ""type"": ""object"", ""properties"": {""values"": {""title"": ""Values"", ""description"": ""list of floats"", ""type"": ""array"", ""items"": {""type"": ""number""}}}, ""required"": [""values""]}\n```\n\nWrite out a few terms of fiboacci.\n",\nFiboacci is a number sequence that is the sum of the previous two numbers in the sequence.,False
51,text-curie-001,0.0,True,False,Actor,"Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nHere is the output schema:\n```\n{""name"": {""description"": ""name of an actor"", ""type"": ""string""}, ""film_names"": {""description"": ""list of names of films they starred in"", ""type"": ""array""}}\n```\n\nGenerate the filmography for a random actor.\n","\n{ ""name"": ""John Cusack"", ""film_names"": [ ""Say Anything"", ""High Fidelity"", ""Pushing Daisies"", ""Grosse Pointe Blank"", ""The Paperboy"", ""Crazy Stupid Love"", ""The Raven"", ""The Grudge"", ""The War of the Worlds"" ], ""type"": ""string"" }",True
52,text-curie-001,0.0,True,False,Joke,"Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nHere is the output schema:\n```\n{""setup"": {""description"": ""question to set up a joke"", ""type"": ""string""}, ""punchline"": {""description"": ""answer to resolve the joke"", ""type"": ""string""}}\n```\n\nTell me a joke.\n",\nI refuse to answer this question.,False
53,text-curie-001,0.0,True,False,FloatArray,"Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below.\n\nHere is the output schema:\n```\n{""values"": {""description"": ""list of floats"", ""type"": ""array""}}\n```\n\nWrite out a few terms of fiboacci.\n",\nFiboacci is a number that is the sum of the previous two Fibonacci numbers.,False
54,text-curie-001,0.0,False,True,Actor,"Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below. For example, the object {""foo"": [""bar"", ""baz""]} conforms to the schema {""foo"": {""description"": ""a list of strings field"", ""type"": ""string""}}.\n\nHere is the output schema:\n```\n{""title"": ""Actor"", ""type"": ""object"", ""properties"": {""name"": {""title"": ""Name"", ""description"": ""name of an actor"", ""type"": ""string""}, ""film_names"": {""title"": ""Film Names"", ""description"": ""list of names of films they starred in"", ""type"": ""array"", ""items"": {""type"": ""string""}}}, ""required"": [""name"", ""film_names""]}\n```\n\nGenerate the filmography for a random actor.\n","\n{ ""title"": ""Actor"", ""type"": ""object"", ""properties"": {""name"": {""title"": ""Name"", ""description"": ""name of an actor"", ""type"": ""string""}, ""film_names"": {""title"": ""Film Names"", ""description"": ""list of names of films they starred in"", ""type"": ""array"", ""items"": {""type"": ""string""}}}, ""required"": [""name"", ""film_names""] }",False
55,text-curie-001,0.0,False,True,Joke,"Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below. For example, the object {""foo"": [""bar"", ""baz""]} conforms to the schema {""foo"": {""description"": ""a list of strings field"", ""type"": ""string""}}.\n\nHere is the output schema:\n```\n{""title"": ""Joke"", ""type"": ""object"", ""properties"": {""setup"": {""title"": ""Setup"", ""description"": ""question to set up a joke"", ""type"": ""string""}, ""punchline"": {""title"": ""Punchline"", ""description"": ""answer to resolve the joke"", ""type"": ""string""}}, ""required"": [""setup"", ""punchline""]}\n```\n\nTell me a joke.\n",\nSetup:\n\nWhat's black and white and has four legs?\n\nA table.,False
56,text-curie-001,0.0,False,True,FloatArray,"Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below. For example, the object {""foo"": [""bar"", ""baz""]} conforms to the schema {""foo"": {""description"": ""a list of strings field"", ""type"": ""string""}}.\n\nHere is the output schema:\n```\n{""title"": ""FloatArray"", ""type"": ""object"", ""properties"": {""values"": {""title"": ""Values"", ""description"": ""list of floats"", ""type"": ""array"", ""items"": {""type"": ""number""}}}, ""required"": [""values""]}\n```\n\nWrite out a few terms of fiboacci.\n",\nFiboacci is a sequence of numbers that are the sum of the previous two numbers in the sequence.,False
57,text-curie-001,0.0,False,False,Actor,"Answer the user query.\nThe output should be formatted as a JSON instance that conforms to the JSON schema below. For example, the object {""foo"": [""bar"", ""baz""]} conforms to the schema {""foo"": {""description"": ""a list of strings field"", ""type"": ""string""}}.\n\nHere is the output schema:\n```\n{""name"": {""description"": ""name of an actor"", ""type"": ""string""}, ""film_names"": {""description"": ""list of names of films they starred in"", ""type"": ""array""}}\n```\n\nGenerate the filmography for a random actor.\n","\n{ ""name"": """", ""film_names"": [] }",True
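To find which configurations parse successfully for every data model, the records collected in `data` can be grouped by configuration. The sketch below does this in plain Python; `configs_passing_all` is a hypothetical helper, and the toy records are made-up placeholders, not results from this notebook.

```python
from collections import defaultdict

# Keys of each record that together identify one configuration.
CONFIG_KEYS = ("model", "temp", "zero-shot format instruction", "full data model schema")


def configs_passing_all(records):
    """Return the configurations for which every record has parsed == True."""
    by_config = defaultdict(list)
    for row in records:
        by_config[tuple(row[key] for key in CONFIG_KEYS)].append(row["parsed"])
    return [config for config, results in by_config.items() if all(results)]


def toy_row(temp, data_model, parsed):
    # Placeholder record with the same keys as the rows appended to `data`.
    return {"model": "m1", "temp": temp, "zero-shot format instruction": False,
            "full data model schema": False, "data model": data_model, "parsed": parsed}


toy_records = [
    toy_row(0.0, "A", True),
    toy_row(0.0, "B", True),
    toy_row(0.5, "A", True),
    toy_row(0.5, "B", False),
]

print(configs_passing_all(toy_records))  # [('m1', 0.0, False, False)]
```

With the real `data` list, calling `configs_passing_all(data)` would list every (model, temp, zero-shot, full schema) combination that parsed for all three data models.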


Observations:
- DaVinci
    - Works quite well
    - The 2 configurations that work for all 3 data models/queries are (temp=0.0, zero-shot format instruction=False, full data model schema=False) and (temp=0.5, zero-shot format instruction=False, full data model schema=False).
    - With full data model schema=True and zero-shot format instruction=True, the output JSON has a bunch of extraneous fields copied from the JSON schema.
    - There are a bunch of completions with extraneous text around otherwise valid JSON, e.g. `\nHere is an example output:\n\n{"name": "Tom Hanks", "film_names": ["Forrest Gump", "Saving Private Ryan", "Toy Story", "The Green Mile"]}`. The substring "Here is an example output:" will depend on the user query. We could consider a greedy regex search like `re.search('\{.*\}', completion)` (adding `re.DOTALL` if the JSON may span multiple lines).
- Curie already shows a big drop-off: it is unable to generate valid JSON for the 2nd and 3rd JSON schemas. (On re-running, it occasionally works.)
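A minimal sketch of the greedy regex idea from the observations above, assuming the completion contains exactly one JSON object surrounded by free text. The helper name `extract_json_object` is hypothetical, and `re.DOTALL` is added so that `.` also matches newlines inside a multi-line object.

```python
import json
import re


def extract_json_object(completion: str) -> dict:
    """Parse the span from the first '{' to the last '}' in the completion."""
    # Greedy match: runs from the first opening brace to the last closing brace.
    match = re.search(r"\{.*\}", completion, re.DOTALL)
    if match is None:
        raise ValueError("no JSON object found in completion")
    return json.loads(match.group(0))


completion = (
    '\nHere is an example output:\n\n'
    '{"name": "Tom Hanks", "film_names": ["Forrest Gump", "Saving Private Ryan", '
    '"Toy Story", "The Green Mile"]}'
)
print(extract_json_object(completion)["name"])  # Tom Hanks
```

Because the match is greedy, any stray `}` in text after the object, or a second JSON object in the same completion, would be swallowed into the match and make `json.loads` fail, so this is a heuristic rather than a robust parser.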