# Guaranteeing Custom Pydantic Schema

It is increasingly important that language models can produce output that is guaranteed to conform to a custom schema. This notebook presents a framework for generating output that matches an arbitrary schema defined with Pydantic's `BaseModel`.

In [1]:
# Creating our Pydantic BaseModel class
from pydantic import BaseModel, Field
from typing import List

class CookInstructions(BaseModel):
    """ Steak cooking instructions """
    doneness: str = Field(description="How much to cook the steak, i.e. well-done, medium, etc.")
    char_bool: bool = Field(description="Whether or not to char the steak")
    cook_time_minutes: int = Field(description="How long to cook the steak in minutes")

class Steak(BaseModel):
    cook_instructions: CookInstructions = Field()
    size_in_oz: int = Field(description="Size of the cut of steak in ounces")

class SteakhouseOrder(BaseModel):
    """Inputs for create_steak_order function for a fictional steakhouse"""

    table_number: int = Field(description="What table the order will be going to")
    customer_name: str = Field(description="Name of the customer")
    steak: Steak = Field(description="Steak attributes")
    sides: List[str] = Field(
        description="Any sides that they may have ordered"
    )
SteakhouseOrder.schema()

{'title': 'SteakhouseOrder',
 'description': 'Inputs for create_steak_order function for a fictional steakhouse',
 'type': 'object',
 'properties': {'table_number': {'title': 'Table Number',
   'description': 'What table the order will be going to',
   'type': 'integer'},
  'customer_name': {'title': 'Customer Name',
   'description': 'Name of the customer',
   'type': 'string'},
  'steak': {'title': 'Steak',
   'description': 'Steak attributes',
   'allOf': [{'$ref': '#/definitions/Steak'}]},
  'sides': {'title': 'Sides',
   'description': 'Any sides that they may have ordered',
   'type': 'array',
   'items': {'type': 'string'}}},
 'required': ['table_number', 'customer_name', 'steak', 'sides'],
 'definitions': {'CookInstructions': {'title': 'CookInstructions',
   'description': 'Steak cooking instructions ',
   'type': 'object',
   'properties': {'doneness': {'title': 'Doneness',
     'description': 'How much to cook the steak, i.e. well-done, medium, etc.',
     'type': 'string'},
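
In the schema output, the nested `Steak` model is not inlined: the `steak` property points at it through `'allOf': [{'$ref': '#/definitions/Steak'}]`. As a rough illustration of how such local references can be followed, here is a minimal sketch that inlines them. The helper name `resolve_refs` and the abbreviated schema fragment are hypothetical and hand-written for this example (they are not the full output of `SteakhouseOrder.schema()`), and the sketch assumes the models are not self-referential.

```python
def resolve_refs(node, definitions):
    """Return a copy of `node` with local '#/definitions/...' refs inlined.

    Assumes no recursive (self-referencing) definitions.
    """
    if isinstance(node, dict):
        if "$ref" in node:
            # '#/definitions/Steak' -> 'Steak'
            name = node["$ref"].split("/")[-1]
            return resolve_refs(definitions[name], definitions)
        if len(node.get("allOf", [])) == 1:
            # A single-element allOf wraps a reference to a nested model.
            merged = resolve_refs(node["allOf"][0], definitions)
            # Keys set on the property itself (e.g. its description) win.
            for key, value in node.items():
                if key != "allOf":
                    merged[key] = value
            return merged
        return {key: resolve_refs(value, definitions) for key, value in node.items()}
    if isinstance(node, list):
        return [resolve_refs(item, definitions) for item in node]
    return node


# Hypothetical, abbreviated fragment in the same shape as the output above.
schema = {
    "title": "SteakhouseOrder",
    "type": "object",
    "properties": {
        "steak": {
            "title": "Steak",
            "description": "Steak attributes",
            "allOf": [{"$ref": "#/definitions/Steak"}],
        },
    },
    "definitions": {
        "Steak": {
            "title": "Steak",
            "type": "object",
            "properties": {
                "size_in_oz": {"title": "Size In Oz", "type": "integer"},
            },
        },
    },
}

inlined = resolve_refs(schema["properties"], schema["definitions"])
print(inlined["steak"])
```

After inlining, the `steak` property carries the nested model's `type` and `properties` directly, alongside its own description.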


### Guidance Functions to Generate Values

Below, guidance's `json` function constrains generation so that the model's output follows the JSON schema derived from the Pydantic model. The user's request is passed as a plain prompt string, and `json(schema=SteakhouseOrder)` is appended to it, so the values are generated from both the user input and the Pydantic schema definition.

In [2]:
import sys
sys.path.insert(0, "/home/telnyxuser/guidance")
import guidance
print(guidance.__file__)

/home/telnyxuser/guidance/guidance/__init__.py


In [3]:
from guidance import models, gen, json

# Load our model
model = 'mistralai/Mistral-7B-v0.1'
lm = models.Transformers(model, device_map='auto')



In [4]:
generation = lm + "Create an order for John at table 9 for a medium-rare steak with a side of greenbeans and potatoes. He requested no char. He has an almond allergy so be careful.\n" + json(schema=SteakhouseOrder)

### Extracting Generated JSON and Pydantic output

Now that we have generated the values for our `SteakhouseOrder`, we can extract them by parsing the generated JSON with Python's `json` library. We can then create a new instance of our Pydantic class from the parsed values.
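
As a rough sketch of the extraction step, the standard library's `json.JSONDecoder.raw_decode` can parse the first JSON object found in a mixed string and ignore anything that follows it. The helper name `extract_first_json_object` and the sample text are made up for illustration, and the sketch assumes the prompt itself contains no `{` before the generated JSON.

```python
import json


def extract_first_json_object(text: str) -> dict:
    """Parse the JSON object that starts at the first '{' in `text`."""
    start = text.find("{")
    if start == -1:
        raise ValueError("no JSON object found in text")
    # raw_decode returns the parsed value and the index where parsing stopped,
    # so any text after the object is simply left unread.
    obj, _end = json.JSONDecoder().raw_decode(text, start)
    return obj


# Hypothetical sample: a prompt, then JSON, then some trailing text.
sample = 'Create an order for John at table 9.\n{"table_number": 9, "customer_name": "John"} done'
order = extract_first_json_object(sample)
print(order)
```

Like `json.loads`, this is strict: malformed JSON still raises `json.JSONDecodeError`.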

In [7]:
import json as json_lib  # aliased so it does not shadow `json` imported from guidance

# Extract the generated JSON object from the model output
def parse_lm_output(lm_out):
    gen_str = str(lm_out)

    # The prompt precedes the JSON, so start at the first opening brace
    json_start_index = gen_str.find("{")
    full_json = gen_str[json_start_index:]

    print(full_json)

    # The generated keys already match the SteakhouseOrder fields,
    # so the parsed dictionary can be returned as-is
    return json_lib.loads(full_json)

final_order = parse_lm_output(generation)

# Resulting output in dictionary
print(final_order)

# Resulting output in our original class
print(SteakhouseOrder(**final_order))

{
"table_number": 9,
"customer_name": "John",
"steak": {
"cook_instructions": {
"doneness": "medium-rare",
"char_bool": false,
"cook_time_minutes": 10
},
"size_in_oz": 12
},
"sides": ["greenbeans", "potatoes"],
}


JSONDecodeError: Expecting property name enclosed in double quotes: line 13 column 1 (char 211)

In [8]:
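# The call above fails because of the trailing comma after the "sides" array:
# strict JSON parsing does not allow a comma directly before the closing brace.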
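
One possible workaround for a trailing comma like the one in the output above is to remove it before parsing. The following is a minimal sketch, not part of guidance or Pydantic: the helper name `strip_trailing_commas` is hypothetical, and it only handles commas that sit directly before a closing `}` or `]` (ignoring whitespace) while leaving commas inside string values alone.

```python
import json


def strip_trailing_commas(text: str) -> str:
    """Drop commas followed only by whitespace and then '}' or ']'."""
    out = []
    in_string = False
    escaped = False
    for i, ch in enumerate(text):
        if in_string:
            # Inside a string literal: copy everything, track escapes.
            out.append(ch)
            if escaped:
                escaped = False
            elif ch == "\\":
                escaped = True
            elif ch == '"':
                in_string = False
            continue
        if ch == '"':
            in_string = True
            out.append(ch)
            continue
        if ch == ",":
            # Look past whitespace to the next significant character.
            rest = text[i + 1:].lstrip()
            if rest[:1] in ("}", "]"):
                continue  # trailing comma: skip it
        out.append(ch)
    return "".join(out)


# Shortened version of the failing output, with the same trailing comma.
raw = '{\n"table_number": 9,\n"sides": ["greenbeans", "potatoes"],\n}'
cleaned = json.loads(strip_trailing_commas(raw))
print(cleaned)
```

The look-ahead re-slices the string at every comma, which is fine for small model outputs like this one but would be worth rewriting for very large inputs.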