# Getting Started with Ollama: Structured Outputs

By default, models return responses in plain text format.

Structured Outputs is a feature that forces a model to generate its response as JSON that conforms to a JSON schema you provide.

Structured Outputs is available in two forms:
- Function Calling: Demonstrated in the next example.
- JSON Schema Response Format: Specify a `format` to directly control the structure of the model's output.

In this demo, we'll focus on using the JSON Schema Response Format.

## Steps:
1. Define your schema: Write Pydantic classes to define the object schema that represents the structure of the desired output.
2. Supply your schema to the API call: Pass the object schema to the model using the `format` parameter.
3. Handle edge cases: In some cases, the model might not generate a valid response that matches the provided JSON schema.
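
Step 3 can be sketched as follows. This is a minimal illustration, assuming Pydantic v2. The `Event` model and the `parse_event` helper are hypothetical names introduced here, and the raw strings stand in for model output.

```python
from typing import List, Optional

from pydantic import BaseModel, ValidationError


class Event(BaseModel):
    # Hypothetical, simplified schema used only for this illustration
    name: str
    participants: List[str]


def parse_event(raw: str) -> Optional[Event]:
    """Return an Event if `raw` is valid JSON matching the schema, else None."""
    try:
        return Event.model_validate_json(raw)
    except ValidationError as err:
        # Raised for malformed JSON as well as for missing or mistyped fields
        print(f"Invalid response ({err.error_count()} error(s))")
        return None


good = parse_event('{"name": "science fair", "participants": ["Vijay", "Venu"]}')
bad = parse_event('{"name": "science fair"}')  # "participants" is missing
```

Returning `None` is only one possible design. Depending on the application, retrying the request or logging the raw response may be more useful.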

## Differences from OpenAI responses API
1. Instead of `response_format`, the Ollama chat API has a `format` parameter.
2. `response_format` accepts the Pydantic class itself, while `format` accepts the JSON schema of the Pydantic class.
3. The Ollama API returns the model output as a plain JSON string. To use the output in your Python code, you'll need to convert it into an instance of the appropriate Pydantic model.
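
Differences 2 and 3 can be illustrated without calling a model. The sketch below assumes Pydantic v2. The `City` model and the sample JSON string are made up for illustration, with the string standing in for `response.message.content`.

```python
from pydantic import BaseModel


class City(BaseModel):
    # Hypothetical model used only for this illustration
    name: str
    population: int


# A plain dict: this is the kind of value passed to the `format` parameter
schema = City.model_json_schema()
print(schema["required"])  # field names the output must contain

# Stand-in for the JSON string returned by the model
raw_json = '{"name": "Springfield", "population": 30000}'

# Convert the JSON string into a Pydantic model instance
city = City.model_validate_json(raw_json)
print(city.name, city.population)
```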

## Important notes:
- Structured output does not work with the gpt-oss model (Issue: https://github.com/ollama/ollama/issues/11691).
- Structured output is not great with the llama3.2:3b model. It's a bit better with gemma3:4b and deepseek-r1:8b, but still unreliable.
- The qwen2.5:7b model seems to work best with structured output.

## Prerequisites
1. Make sure that python3 is installed on your system.
2. Make sure Ollama is installed and "running" on your system.
3. Create a `.env` file and add the following line:
   ```
   OLLAMA_MODEL=<model_name>
   ```
   Replace `<model_name>` with the name of the local model you want to use.
4. Create and activate a virtual environment:
   ```bash
   python3 -m venv venv
   source venv/bin/activate
   ```
5. The required libraries are listed in the requirements.txt file. Use the following command to install them:
   ```bash
   pip3 install -r requirements.txt
   ```

## Import Required Modules

In [1]:
# Import the chat API from Ollama (the equivalent of OpenAI's chat completions API)
from ollama import chat, ResponseError, pull

# The `dotenv` library is used to load environment variables from a .env file
from dotenv import load_dotenv                  

# Used to get the values from environment variables
import os                                       

# Pydantic is used to define the structure of the output we want
from pydantic import BaseModel, Field           

# Used for type hints in our Pydantic models
from typing import List                         

## Load Environment Variables

In [2]:
load_dotenv()
MODEL = os.environ['OLLAMA_MODEL']
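
If `OLLAMA_MODEL` is missing from the environment, the lookup above raises a bare `KeyError`. A minimal sketch of a friendlier check is shown below. The `get_model_name` helper is a hypothetical name, not part of Ollama or `dotenv`.

```python
import os


def get_model_name(env=os.environ) -> str:
    """Return the configured model name or raise a descriptive error."""
    model = env.get("OLLAMA_MODEL", "").strip()
    if not model:
        raise RuntimeError(
            "OLLAMA_MODEL is not set. "
            "Add OLLAMA_MODEL=<model_name> to your .env file."
        )
    return model
```

After `load_dotenv()`, it could be used as `MODEL = get_model_name()`.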

## Define Output Structure
We'll define the output structure we want by writing Pydantic classes:

In [3]:
class LLMConfidence(BaseModel):
    confidence: float = Field(
        description=(
            "Confidence level in the prediction. "
            "Value between 0 lowest to 100 highest. "  # trailing space keeps the sentences separated
            "Highest confidence - when all values are clearly mentioned in the input. "
            "More the assumptions made by the model, lower the confidence."
        )
    )
    confidence_reason: str = Field(description="Reasoning behind the confidence level.")
    assumptions: List[str] = Field(description="List of assumptions made by the model.")

class CalendarEvent(BaseModel):
    name: str = Field(description="The name of the event")
    date: str = Field(description="The date of the event")
    participants: List[str] = Field(description="List of participants attending the event")
    
    llm_confidence: LLMConfidence = Field(description="Confidence information from the model")

## Define Example Inputs
Let's define some example inputs for which we'll generate JSON output in our defined format:

In [4]:
inputs = [
    "Mike will attend the Chris Rock Concert on 24 Jan 2025",
    "Vijay and Venu are going to a science fair on Friday.",
    "The project deadline is next Monday.",
    "Vijay and Venu are going to a science fair",
    "Build Team is planning a team outing first week of August",
    "Solve 2+2"
]

## Process Inputs and Generate Structured Output
Now let's process each input and generate structured JSON output using our defined schema:

In [5]:
for user_input in inputs:  # named `user_input` to avoid shadowing the built-in `input`
    print(f"Input: {user_input}")
    try:
        response = chat(
            model = MODEL,
            messages = [
                {"role": "system", "content": "Extract the event information from the provided user input"},
                {"role": "user", "content": user_input}
            ],
            format = CalendarEvent.model_json_schema(), # Use Pydantic to generate the JSON schema of the Class
            options = {
                "temperature": 0, # Make responses more deterministic
            }
        )

        # Extract answer and print it
        print("\nLLM Response:")

        # The response content is a JSON string.
        response_json = response.message.content

        # Use `model_validate_json` class method to convert
        # the response JSON into a Pydantic model instance.
        calendarEvent = CalendarEvent.model_validate_json(response_json)
        print(calendarEvent)
        print("\nExtracted Event Information:")
        print(f"Name: {calendarEvent.name}")
        print(f"Date: {calendarEvent.date}")
        print(f"Participants: {', '.join(calendarEvent.participants)}")
        print(f"Confidence: {calendarEvent.llm_confidence.confidence}")
        print(f"Confidence Reason: {calendarEvent.llm_confidence.confidence_reason}")
        print(f"Assumptions: {', '.join(calendarEvent.llm_confidence.assumptions)}")
        print("-------\n")
        
    except ResponseError as e:
        print('Error getting answer from AI:', e)
        if e.status_code == 404: # Model not installed
            try:
                print('Pulling model:', MODEL)
                pull(MODEL)
                print('Model pulled successfully:', MODEL)
                print('Rerun the program to use the model ...')

            except Exception as pull_error:
                print('Error pulling model. Error:', pull_error)

    except Exception as e:
        # Also reached when the response does not match the schema (Pydantic validation error)
        print('Error processing the response:', e)

Input: Mike will attend the Chris Rock Concert on 24 Jan 2025

LLM Response:
name='Chris Rock Concert' date='2025-01-24' participants=['Mike'] llm_confidence=LLMConfidence(confidence=1.0, confidence_reason='The input clearly states the event name, date and participant.', assumptions=[])

Extracted Event Information:
Name: Chris Rock Concert
Date: 2025-01-24
Participants: Mike
Confidence: 1.0
Confidence Reason: The input clearly states the event name, date and participant.
Assumptions: 
-------

Input: Vijay and Venu are going to a science fair on Friday.

LLM Response:
name='science fair' date='Friday' participants=['Vijay', 'Venu'] llm_confidence=LLMConfidence(confidence=1.0, confidence_reason='The sentence clearly states the event name and participants.', assumptions=[])

Extracted Event Information:
Name: science fair
Date: Friday
Participants: Vijay, Venu
Confidence: 1.0
Confidence Reason: The sentence clearly states the event name and participants.
Assumptions: 
-------

Input: Th

#### Note 
Surprisingly, none of the models I tried (llama3.2:3b, gemma3:4b, deepseek-r1:8b and qwen2.5:7b) respected the instruction "Value between 0 lowest to 100 highest." The outputs above report a confidence of 1.0 instead.
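
One thing that could be tried is stating the range in the schema itself rather than only in the description. The sketch below assumes Pydantic v2. `BoundedConfidence` is a hypothetical variant of `LLMConfidence`, and whether Ollama enforces numeric bounds during generation is not verified here.

```python
from pydantic import BaseModel, Field, ValidationError


class BoundedConfidence(BaseModel):
    # Hypothetical variant of LLMConfidence with explicit numeric bounds
    confidence: float = Field(
        ge=0,
        le=100,
        description="Confidence from 0 (lowest) to 100 (highest).",
    )


# The bounds appear in the JSON schema as "minimum" and "maximum"
props = BoundedConfidence.model_json_schema()["properties"]["confidence"]
print(props["minimum"], props["maximum"])

# Out-of-range values are rejected when the response is validated
try:
    BoundedConfidence.model_validate_json('{"confidence": 250}')
    rejected = False
except ValidationError:
    rejected = True

# A value on a 0-1 scale still passes, because 1.0 lies inside [0, 100]
in_range = BoundedConfidence.model_validate_json('{"confidence": 1.0}')
```

Validation of this kind only catches values outside the range. It cannot tell whether a model answered on a 0 to 1 scale, which is what the outputs above suggest.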