This is an example of using Azure OpenAI with the gpt-3.5-turbo model (also
known as ChatGPT) to generate a YAML file from a textual description. The output
follows a new format, ZooYAML, which describes the animals in a zoo. A new format
was chosen deliberately, to avoid an open-source or otherwise publicly available
format that the model may already have seen during training.

The prompting technique is simple: the prompt gives a clear explanation of the
format, followed by two examples (few-shot learning).
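
The prompt layout described above (a format description, then worked examples,
then the new instruction) could also be assembled programmatically. The
following is a minimal sketch, not the code used in this notebook, which relies
on a single hand-written template. `build_few_shot_prompt` and its parameters
are hypothetical names introduced here for illustration.

```python
def build_few_shot_prompt(format_description, examples, instructions):
    """Assemble a few-shot prompt from (instruction, output) example pairs.

    Hypothetical helper: it only mirrors the layout of the hand-written
    template used in this notebook.
    """
    fence = '`' * 3  # Delimiter wrapped around each example output
    parts = [format_description.strip(), '']
    for number, (instruction, output) in enumerate(examples, start=1):
        parts += [
            f'Example {number}:',
            f'Instruction: {instruction}',
            'ZooYAML:',
            fence,
            output.strip(),
            fence,
            '',
        ]
    # End with the new instruction and an open "ZooYAML:" cue for the model.
    parts += [f'Instructions: {instructions}', 'ZooYAML:']
    return '\n'.join(parts)


prompt = build_few_shot_prompt(
    'ZooYAML describes the animals in a zoo.',
    [('A zoo with one small parrot.', 'zoo:\n  - name: parrot\n    size: s')],
    'A zoo with one large zebra.')
print(prompt)
```

Keeping the examples as data makes it easy to vary how many are shown, which
is one way to test how much the model depends on them.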

The model seems to cope well with the task. In addition to manual inspection,
the output is parsed as YAML to verify that it is syntactically correct.

It would be interesting, in a future iteration, to use a more complex format
and to see whether the output can also be verified semantically using a schema.
In addition, it would be good to know whether the model can learn the format
from a schema rather than from a free-form description.
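
As a rough illustration of the semantic verification mentioned above, the
sketch below checks an already-parsed document (for example, the result of
`yaml.safe_load`) against the rules stated in the prompt. It uses plain Python
checks rather than a schema language, and `ALLOWED` and `validate_zoo` are
hypothetical names, not part of this notebook.

```python
# Allowed values, copied from the format description in the prompt.
ALLOWED = {
    'size': ('xs', 's', 'm', 'l', 'xl'),
    'family': ('mammal', 'bird', 'fish', 'reptile'),
    'nutrition': ('vegetarian', 'carnivorous'),
}


def validate_zoo(document):
    """Return a list of human-readable problems; an empty list means valid."""
    animals = document.get('zoo') if isinstance(document, dict) else None
    if not isinstance(animals, list):
        return ['top-level "zoo" key must hold a list of animals']
    names = {a['name'] for a in animals
             if isinstance(a, dict) and isinstance(a.get('name'), str)}
    problems = []
    for index, animal in enumerate(animals):
        if not isinstance(animal, dict):
            problems.append(f'animal #{index}: not a mapping')
            continue
        label = animal.get('name', f'#{index}')
        if not isinstance(animal.get('name'), str):
            problems.append(f'animal {label}: missing or non-string name')
        for key, allowed in ALLOWED.items():
            if animal.get(key) not in allowed:
                problems.append(
                    f'animal {label}: invalid {key} {animal.get(key)!r}')
        # predator_of is optional and may be a single name or a list of names.
        prey = animal.get('predator_of', [])
        if isinstance(prey, str):
            prey = [prey]
        if not isinstance(prey, list):
            problems.append(
                f'animal {label}: predator_of must be a string or a list')
            continue
        for target in prey:
            if not isinstance(target, str) or target not in names:
                problems.append(
                    f'animal {label}: predator_of names unknown animal '
                    f'{target!r}')
    return problems


zoo = {'zoo': [
    {'name': 'zebra', 'size': 'l', 'family': 'mammal',
     'nutrition': 'vegetarian'},
    {'name': 'wolf', 'size': 'huge', 'family': 'mammal',
     'nutrition': 'carnivorous', 'predator_of': ['zebra', 'unicorn']},
]}
for problem in validate_zoo(zoo):
    print(problem)
```

The sample document is deliberately wrong in two places (an unknown size and a
prey animal that is not in the zoo), so both kinds of check are exercised.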

In [53]:
!pip install "openai<1" pyyaml

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/


In [54]:
from textwrap import dedent
import yaml
from pprint import pprint
import openai

openai.api_type = 'azure'
openai.api_version = '2023-03-15-preview'
openai.api_base = ' ... ' # Replace with the URL of an Azure OpenAI gpt-3.5-turbo deployment
openai.api_key = ' ... ' # Replace with the corresponding API key

deployment_id = 'gpt-35-turbo' # Replace if using a different deployment name


In [55]:
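def get_completion(prompt):
  # Uses the Completion endpoint of the legacy (pre-1.0) openai SDK against
  # the Azure deployment configured above.
  completion = openai.Completion.create(
      deployment_id=deployment_id,
      prompt=prompt,
      temperature=0.0,  # Keep the output as deterministic as possible
      max_tokens=1234)
  # Strip the end-of-message marker the model may append to its reply.
  return completion.choices[0].text.replace('<|im_end|>', '')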

In [56]:
PROMPT_TEMPLATE = dedent("""
  You are a masterful and accurate configurator of the ZooYAML format.
  Given clear instructions in human language, you produce a YAML file
  in the ZooYAML format, triple quoted.

  Never invent any information that was not provided in the
  instructions. If you are unable to complete the task without
  additional information, output "ERROR: " followed by the reason
  you are unable to complete the task.

  Only provide a single YAML output, corresponding to the instructions. Don't
  generate any additional text after it.

  The ZooYAML format:
  ZooYAML describes the animals in a zoo. Each animal has the following
  properties:
  - name: string - name of the animal
  - size: string - one of ("xs" / "s" / "m" / "l" / "xl")
  - family: string - one of ("mammal", "bird", "fish", "reptile")
  - nutrition: string - one of ("vegetarian" / "carnivorous")
  - predator_of: string or list of strings - name of animal or multiple animals
    this animal eats. Only include this parameter if the animal it eats is one
    of the other animals in the zoo.

  Example 1:
  Instruction: This zoo has an elephant, a very large vegetarian mammal, a
  parrot, which is a small bird eating grains, and a tiger, a medium-sized
  carnivorous mammal who can hunt and eat elephants.
  ZooYAML:
  ```
  zoo:
    - name: elephant
      size: xl
      family: mammal
      nutrition: vegetarian
    - name: parrot
      size: s
      family: bird
      nutrition: vegetarian
    - name: tiger
      size: m
      family: mammal
      nutrition: carnivorous
      predator_of: elephant
  ```

  Example 2:
  Instruction: This an interesting zoo where all animals are predators and may
  eat each other. It houses a shark - medium size fish, a dolphin, which is a
  medium sized marine mammal, a hyiena which is a small mamal that eats
  anything, and also an albatros, a bird of medium size and big appetite.
  ZooYAML:
  ```
  zoo:
    - name: shark
      size: m
      family: fish
      nutrition: carnivorous
      predator_of:
        - dolphin
        - hyiena
        - albatros
    - name: dolphin
      size: m
      family: mammal
      nutrition: carnivorous
      predator_of:
        - shark
        - hyiena
        - albatros
    - name: hyiena
      size: s
      family: mammal
      nutrition: carnivorous
      predator_of:
        - shark
        - dolphin
        - albatros
    - name: albatros
      size: m
      family: bird
      nutrition: carnivorous
      predator_of:
        - shark
        - dolphin
        - hyiena
  ```

  Instructions: {instructions}
  ZooYAML:
""")

In [57]:
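def get_yaml(instructions):
  completion = get_completion(PROMPT_TEMPLATE.format(instructions=instructions))
  # The YAML is expected inside a triple-backtick fence. A reply without one
  # (for example an "ERROR: ..." message) has nothing to extract.
  parts = completion.split('```')
  if len(parts) < 2:
    raise ValueError(f'No fenced YAML block in completion: {completion!r}')
  return parts[1]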
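
The prompt asks the model to answer `ERROR: ` followed by a reason when it
cannot complete the task, in which case the reply contains no fenced block. A
minimal sketch of telling the two kinds of reply apart is shown below. It
assumes the reply is either an error line or a single fenced block, possibly
with a language tag; `ZooYAMLError` and `extract_yaml` are hypothetical names
that are not used elsewhere in this notebook.

```python
import re

FENCE = '`' * 3
# Matches a fenced block, tolerating an optional language tag such as "yaml".
FENCED_BLOCK = re.compile(FENCE + r'[a-zA-Z]*\n(.*?)' + FENCE, re.DOTALL)


class ZooYAMLError(Exception):
    """Raised when the reply is an error message or has no fenced block."""


def extract_yaml(completion):
    """Return the text inside the first fenced block of a model reply."""
    stripped = completion.strip()
    if stripped.startswith('ERROR:'):
        # Surface the reason given by the model as the exception message.
        raise ZooYAMLError(stripped[len('ERROR:'):].strip())
    match = FENCED_BLOCK.search(completion)
    if match is None:
        raise ZooYAMLError(f'no fenced block in reply: {completion!r}')
    return match.group(1)


reply = FENCE + 'yaml\nzoo: []\n' + FENCE
print(extract_yaml(reply))

try:
    extract_yaml('ERROR: the size of the zebra was not given')
except ZooYAMLError as error:
    print('Model declined:', error)
```

Raising a dedicated exception lets the caller distinguish a model refusal from
a YAML syntax error raised later by the parser.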

In [58]:
yaml_text = get_yaml(
    "A zoo with a zebra - a large vegetarian mamml, a chameleon - small "
    "carnivorous reptile, a comodo dragon - medium sized carnivorous reptile "
    "who eats zebras, and a wolf, which is a medium sized mammal who is "
    "carnivorous and feasts on chameleons.")

print(yaml_text)
pprint(yaml.safe_load(yaml_text))


zoo:
  - name: zebra
    size: l
    family: mammal
    nutrition: vegetarian
  - name: chameleon
    size: s
    family: reptile
    nutrition: carnivorous
  - name: comodo dragon
    size: m
    family: reptile
    nutrition: carnivorous
    predator_of: zebra
  - name: wolf
    size: m
    family: mammal
    nutrition: carnivorous
    predator_of: chameleon

{'zoo': [{'family': 'mammal',
          'name': 'zebra',
          'nutrition': 'vegetarian',
          'size': 'l'},
         {'family': 'reptile',
          'name': 'chameleon',
          'nutrition': 'carnivorous',
          'size': 's'},
         {'family': 'reptile',
          'name': 'comodo dragon',
          'nutrition': 'carnivorous',
          'predator_of': 'zebra',
          'size': 'm'},
         {'family': 'mammal',
          'name': 'wolf',
          'nutrition': 'carnivorous',
          'predator_of': 'chameleon',
          'size': 'm'}]}


In [59]:
yaml_text = get_yaml(
    "Here we have a case of a very strange zoo, where all animals are "
    "mammals. There is a lion, which is of medium sized and can eat all "
    "other animals. Then there's a gazelle, which is also medium-size, and "
    "is vegetarian, as well as giant rat, which is really medium-sized (it is "
    "only giant by rat standards) and is carnivorous, but can only eat "
    "the gazelles, since lions are too big for it.")

print(yaml_text)
pprint(yaml.safe_load(yaml_text))


zoo:
  - name: lion
    size: m
    family: mammal
    nutrition: carnivorous
    predator_of:
      - gazelle
      - giant rat
  - name: gazelle
    size: m
    family: mammal
    nutrition: vegetarian
  - name: giant rat
    size: m
    family: mammal
    nutrition: carnivorous
    predator_of: gazelle

{'zoo': [{'family': 'mammal',
          'name': 'lion',
          'nutrition': 'carnivorous',
          'predator_of': ['gazelle', 'giant rat'],
          'size': 'm'},
         {'family': 'mammal',
          'name': 'gazelle',
          'nutrition': 'vegetarian',
          'size': 'm'},
         {'family': 'mammal',
          'name': 'giant rat',
          'nutrition': 'carnivorous',
          'predator_of': 'gazelle',
          'size': 'm'}]}
