# Using Ollama to get structured output

Asking the model for structured output instead of free text makes the extracted values easier to parse and validate.

First, set up the notebook.

In [21]:
#load "load.fsx"

open Informedica.OpenAI.Lib
open Ollama.Operators

// Print the extracted value, or a failure notice when the call returned an error
let extraction = function
| Ok x -> printfn $"## Extracted:\n{x}"
| Error _ -> printfn "## Extraction failed"

## Define a schema and a type for the output

The json function returns a value of the type passed as its type parameter. However, due to limitations of Ollama, the schema must also be spelled out in the prompt.
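
Because the schema has to be repeated in the prompt text, a small helper can keep that repetition in one place. The sketch below is an assumption, not part of Informedica.OpenAI.Lib: `withSchema` is a hypothetical name, and it only builds the prompt string in the same layout the cells in this notebook use.

```fsharp
// Hypothetical helper (not part of Informedica.OpenAI.Lib):
// prepends a schema hint to a question and asks for a JSON reply.
let withSchema (schema: string) (question: string) =
    $"Use schema: {schema}\n{question}\n\nReply in JSON."

let prompt =
    withSchema
        "{ number: int; unit: string }"
        "What is the minimal corrected gestational age mentioned in the text?"

printfn "%s" prompt
```

The resulting string could then be piped into Message.user in the same way as the literal prompts in the cells.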

In [22]:

"""
Use schema: { number: int; unit: string }
What is the minimal corrected gestational age mentioned in the text between '''

'''A neonate 28 weeks to 32 weeks corrected gestational age.'''

Reply in JSON."""
|> Message.user
|> Ollama.json<{| number: int; unit: string |}>
    Ollama.Models.llama2
    []
|> Async.RunSynchronously
|> extraction

ℹ INFO: 
EndPoint: http://localhost:11434/api/chat
Payload:
{"format":"json","messages":[{"content":"\nUse schema: { number: int; unit: string }\nWhat is the minimal corrected gestational age mentioned in the text between '''\n\n'''A neonate 28 weeks to 32 weeks corrected gestational age.'''\n\nReply in JSON.","role":"user"}],"model":"llama2","options":{"num_keep":null,"seed":101,"num_predict":null,"top_k":null,"top_p":null,"tfs_z":null,"typical_p":null,"repeat_last_n":64,"temperature":0.0,"repeat_penalty":null,"presence_penalty":null,"frequency_penalty":null,"mirostat":0,"mirostat_tau":null,"mirostat_eta":null,"penalize_newline":null,"stop":[],"numa":null,"num_ctx":2048,"num_batch":null,"num_gqa":null,"num_gpu":null,"main_gpu":null,"low_vram":null,"f16_kv":null,"vocab_only":null,"use_mmap":null,"use_mlock":null,"rope_frequency_base":null,"rope_frequency_scale":null,"num_thread":null},"response_format":{"schema":{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "<>f_

## Getting the maximum age as a JSON structure

The above output is correct. Now try to get the maximum age.

In [23]:
"""
Use schema: { number: int; unit: string }
What is the maximum corrected gestational age mentioned in the text between '''

'''A neonate 28 weeks to 32 weeks corrected gestational age.'''

Reply in JSON."""
|> Message.user
|> Ollama.json<{| number: int; unit: string |}>
    Ollama.Models.llama2
    []
|> Async.RunSynchronously
|> extraction

ℹ INFO: 
EndPoint: http://localhost:11434/api/chat
Payload:
{"format":"json","messages":[{"content":"\nUse schema: { number: int; unit: string }\nWhat is the maximum corrected gestational age mentioned in the text between '''\n\n'''A neonate 28 weeks to 32 weeks corrected gestational age.'''\n\nReply in JSON.","role":"user"}],"model":"llama2","options":{"num_keep":null,"seed":101,"num_predict":null,"top_k":null,"top_p":null,"tfs_z":null,"typical_p":null,"repeat_last_n":64,"temperature":0.0,"repeat_penalty":null,"presence_penalty":null,"frequency_penalty":null,"mirostat":0,"mirostat_tau":null,"mirostat_eta":null,"penalize_newline":null,"stop":[],"numa":null,"num_ctx":2048,"num_batch":null,"num_gqa":null,"num_gpu":null,"main_gpu":null,"low_vram":null,"f16_kv":null,"vocab_only":null,"use_mmap":null,"use_mlock":null,"rope_frequency_base":null,"rope_frequency_scale":null,"num_thread":null},"response_format":{"schema":{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title": "<>f_

## Try a different structured output

Somehow the model misunderstands the prompt and returns the minimum age instead of the maximum age.
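
A mistake like this can be caught with a plain sanity check on the extracted value. The sketch below is an illustration only: `isPlausibleMax` is a hypothetical function, and the check simply assumes that a "maximum" answer should equal the largest integer that literally occurs in the source text.

```fsharp
open System.Text.RegularExpressions

let sourceText = "A neonate 28 weeks to 32 weeks corrected gestational age."

// All integers that literally occur in the source text
let numbersInText =
    Regex.Matches(sourceText, @"\d+")
    |> Seq.cast<Match>
    |> Seq.map (fun m -> int m.Value)
    |> Seq.toList

// Hypothetical check: a "maximum" answer should be the largest number found
let isPlausibleMax (answer: {| number: int; unit: string |}) =
    answer.number = List.max numbersInText

isPlausibleMax {| number = 28; unit = "weeks" |} |> printfn "%b" // false
isPlausibleMax {| number = 32; unit = "weeks" |} |> printfn "%b" // true
```

Such a check does not fix the model's answer, but it flags a reply that cannot be right before it is used further.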

Let's try again using a more complex structure.

In [24]:
"""
Use schema: { minAge: int; maxAge: int; unit: string }
What is corrected gestational age range mentioned in the text between '''

'''A neonate 28 weeks to 32 weeks corrected gestational age.'''

Reply in JSON."""
|> Message.user
|> Ollama.json<{| minAge: int; maxAge: int; unit: string |}>
    Ollama.Models.llama2
    []
|> Async.RunSynchronously
|> extraction

ℹ INFO: 
EndPoint: http://localhost:11434/api/chat
Payload:
{"format":"json","messages":[{"content":"\nUse schema: { minAge: int; maxAge: int; unit: string }\nWhat is corrected gestational age range mentioned in the text between '''\n\n'''A neonate 28 weeks to 32 weeks corrected gestational age.'''\n\nReply in JSON.","role":"user"}],"model":"llama2","options":{"num_keep":null,"seed":101,"num_predict":null,"top_k":null,"top_p":null,"tfs_z":null,"typical_p":null,"repeat_last_n":64,"temperature":0.0,"repeat_penalty":null,"presence_penalty":null,"frequency_penalty":null,"mirostat":0,"mirostat_tau":null,"mirostat_eta":null,"penalize_newline":null,"stop":[],"numa":null,"num_ctx":2048,"num_batch":null,"num_gqa":null,"num_gpu":null,"main_gpu":null,"low_vram":null,"f16_kv":null,"vocab_only":null,"use_mmap":null,"use_mlock":null,"rope_frequency_base":null,"rope_frequency_scale":null,"num_thread":null},"response_format":{"schema":{
  "$schema": "http://json-schema.org/draft-04/schema#",
  "title"

## Extraction with different units

Now try a Dutch text with different units for the minimum and maximum age.

In [25]:
"""
Use schema: { minAge: int, maxAge: int, minAgeUnit: string, maxAgeUnit: string }
What is age range mentioned in the text between '''

'''
paracetamol
Oraal: Bij milde tot matige pijn en/of koorts: volgens het Kinderformularium van het NKFK bij een leeftijd van 1 maand–18 jaar: 10–15 mg/kg lichaamsgewicht per keer, zo nodig 4×/dag, max. 60 mg/kg/dag en max. 4 g/dag.
'''

Respond in JSON
"""
|> Message.user
|> Ollama.json<{| minAge: int; maxAge: int; minAgeUnit: string; maxAgeUnit: string |}>
    Ollama.Models.llama2
    []
|> Async.RunSynchronously
|> extraction

ℹ INFO: 
EndPoint: http://localhost:11434/api/chat
Payload:
{"format":"json","messages":[{"content":"\nUse schema: { minAge: int, maxAge: int, minAgeUnit: string, maxAgeUnit: string }\nWhat is age range mentioned in the text between '''\n\n'''\nparacetamol\nOraal: Bij milde tot matige pijn en/of koorts: volgens het Kinderformularium van het NKFK bij een leeftijd van 1 maand–18 jaar: 10–15 mg/kg lichaamsgewicht per keer, zo nodig 4×/dag, max. 60 mg/kg/dag en max. 4 g/dag.\n'''\n\nRespond in JSON\n","role":"user"}],"model":"llama2","options":{"num_keep":null,"seed":101,"num_predict":null,"top_k":null,"top_p":null,"tfs_z":null,"typical_p":null,"repeat_last_n":64,"temperature":0.0,"repeat_penalty":null,"presence_penalty":null,"frequency_penalty":null,"mirostat":0,"mirostat_tau":null,"mirostat_eta":null,"penalize_newline":null,"stop":[],"numa":null,"num_ctx":2048,"num_batch":null,"num_gqa":null,"num_gpu":null,"main_gpu":null,"low_vram":null,"f16_kv":null,"vocab_only":null,"use_mmap":null,"u

## Use a more explicit structure

A more explicit structure also carries more semantic meaning. The structure below is an explicit range: a min and a max object, each containing an age with its own unit.
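
To make the nesting concrete, the same shape can be written as named record types. This is a sketch under assumptions: the type names `Age`, `AgeRange` and `Extraction` are hypothetical, the JSON string is a hand-written sample rather than model output, and System.Text.Json is used only for illustration (how the library itself maps the reply is not shown here).

```fsharp
open System.Text.Json

// Hypothetical named types mirroring the nested schema
type Age = { age: int; unit: string }
type AgeRange = { minAge: Age; maxAge: Age }
type Extraction = { ageRange: AgeRange }

// Hand-written sample for illustration; NOT actual model output
let sample =
    """{ "ageRange": { "minAge": { "age": 1, "unit": "maand" }, "maxAge": { "age": 18, "unit": "jaar" } } }"""

let parsed = JsonSerializer.Deserialize<Extraction>(sample)

printfn "%d %s - %d %s"
    parsed.ageRange.minAge.age
    parsed.ageRange.minAge.unit
    parsed.ageRange.maxAge.age
    parsed.ageRange.maxAge.unit
```

Giving each bound its own unit field lets a range such as 1 maand to 18 jaar be represented without forcing both ends into the same unit.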

In [26]:
"""
Use schema: { ageRange : { minAge: { age: int, unit: string }, maxAge: { age: int, unit: string } } }
What is age range mentioned in the text between '''

'''
paracetamol
Oraal: Bij milde tot matige pijn en/of koorts: volgens het Kinderformularium van het NKFK bij een leeftijd van 1 maand–18 jaar: 10–15 mg/kg lichaamsgewicht per keer, zo nodig 4×/dag, max. 60 mg/kg/dag en max. 4 g/dag.
'''

Respond in JSON
"""
|> Message.user
|> Ollama.json<{| ageRange : {| minAge: {| age: int; unit: string |}; maxAge: {| age: int; unit: string |} |} |} >
    Ollama.Models.llama2
    []
|> Async.RunSynchronously
|> extraction

ℹ INFO: 
EndPoint: http://localhost:11434/api/chat
Payload:
{"format":"json","messages":[{"content":"\nUse schema: { ageRange : { minAge: { age: int, unit: string }, maxAge: { age: int, unit: string } } }\nWhat is age range mentioned in the text between '''\n\n'''\nparacetamol\nOraal: Bij milde tot matige pijn en/of koorts: volgens het Kinderformularium van het NKFK bij een leeftijd van 1 maand–18 jaar: 10–15 mg/kg lichaamsgewicht per keer, zo nodig 4×/dag, max. 60 mg/kg/dag en max. 4 g/dag.\n'''\n\nRespond in JSON\n","role":"user"}],"model":"llama2","options":{"num_keep":null,"seed":101,"num_predict":null,"top_k":null,"top_p":null,"tfs_z":null,"typical_p":null,"repeat_last_n":64,"temperature":0.0,"repeat_penalty":null,"presence_penalty":null,"frequency_penalty":null,"mirostat":0,"mirostat_tau":null,"mirostat_eta":null,"penalize_newline":null,"stop":[],"numa":null,"num_ctx":2048,"num_batch":null,"num_gqa":null,"num_gpu":null,"main_gpu":null,"low_vram":null,"f16_kv":null,"vocab_only":nu

## Extract a dose structure

Let's try to extract a dose from a text.

In [27]:
"""
Use schema: { maxDose: float, unit: string }
What is the max dose mentioned in the text between '''

'''
paracetamol
Oraal: Bij milde tot matige pijn en/of koorts: volgens het Kinderformularium van het NKFK bij een leeftijd van 1 maand–18 jaar: 10–15 mg/kg lichaamsgewicht per keer, zo nodig 4×/dag, max. 60 mg/kg/dag en max. 4 g/dag.
'''

Respond in JSON
"""
|> Message.user
|> Ollama.json<{| maxDose: float; unit: string |}>
    Ollama.Models.llama2
    []
|> Async.RunSynchronously
|> extraction

ℹ INFO: 
EndPoint: http://localhost:11434/api/chat
Payload:
{"format":"json","messages":[{"content":"\nUse schema: { maxDose: float, unit: string }\nWhat is the max dose mentioned in the text between '''\n\n'''\nparacetamol\nOraal: Bij milde tot matige pijn en/of koorts: volgens het Kinderformularium van het NKFK bij een leeftijd van 1 maand–18 jaar: 10–15 mg/kg lichaamsgewicht per keer, zo nodig 4×/dag, max. 60 mg/kg/dag en max. 4 g/dag.\n'''\n\nRespond in JSON\n","role":"user"}],"model":"llama2","options":{"num_keep":null,"seed":101,"num_predict":null,"top_k":null,"top_p":null,"tfs_z":null,"typical_p":null,"repeat_last_n":64,"temperature":0.0,"repeat_penalty":null,"presence_penalty":null,"frequency_penalty":null,"mirostat":0,"mirostat_tau":null,"mirostat_eta":null,"penalize_newline":null,"stop":[],"numa":null,"num_ctx":2048,"num_batch":null,"num_gqa":null,"num_gpu":null,"main_gpu":null,"low_vram":null,"f16_kv":null,"vocab_only":null,"use_mmap":null,"use_mlock":null,"rope_frequency_ba