# Structured Output with Kamiwaza Models

This notebook demonstrates how to use Kamiwaza's OpenAI-compatible interface to generate structured outputs from language models. With this feature, you can ensure model responses adhere to a specific JSON schema, making it easier to integrate model outputs directly into your applications.

Kamiwaza's implementation is compatible with OpenAI's structured output feature, so you can use the same code patterns you're already familiar with.

## Setup

First, we'll import the necessary libraries and set up our client. We'll need:
- `KamiwazaClient` to connect to our local Kamiwaza server
- OpenAI's Python SDK for the structured output functionality
- Pydantic for defining our data models

In [11]:
from kamiwaza_client import KamiwazaClient
import openai
from pydantic import BaseModel

## Deploy a Model

We'll download and deploy a Qwen model using Kamiwaza, then create an OpenAI-compatible client to interact with it.

In [12]:
client = KamiwazaClient("http://localhost:7777/api/")
hf_repo = 'Qwen/Qwen2.5-7B-Instruct-GGUF'
client.models.download_and_deploy_model(hf_repo)
openai_client = client.openai.get_client(repo_id=hf_repo)

Initiating download for Qwen/Qwen2.5-7B-Instruct-GGUF with quantization q6_k...
Model files for Qwen/Qwen2.5-7B-Instruct-GGUF are already downloaded.
Deploying model Qwen/Qwen2.5-7B-Instruct-GGUF...
Model Qwen/Qwen2.5-7B-Instruct-GGUF successfully deployed!


## Basic Testing

Let's first confirm the model is working with a simple test: the same "How many r's in strawberry?" question we used in our evaluation notebook.

In [13]:
# Create a streaming chat completion
response = openai_client.chat.completions.create(
    messages=[
        {"role": "user", "content": "How many r's are in the word 'strawberry'? ONLY RESPOND WITH A SINGLE NUMBER"}
    ],
    model="model",
    stream=True
)

# display the stream
for chunk in response:
    if chunk.choices[0].delta.content is not None:
        print(chunk.choices[0].delta.content, end="", flush=True)


2025-03-07 15:06:36,780 - httpx - INFO - HTTP Request: POST http://localhost:51135/v1/chat/completions "HTTP/1.1 200 OK"


3

## Structured Output Example

Now let's try using structured output. We'll define a `CalendarEvent` class using Pydantic that specifies the schema we want our response to follow.

When we call `beta.chat.completions.parse()` instead of the regular `chat.completions.create()` and pass our Pydantic class as `response_format`, the model is asked to return a response that fits that schema. The OpenAI-compatible client then parses the response into an instance of the class.

This feature is particularly useful when you need to extract specific information from text and want to ensure it follows a consistent structure for downstream processing.

> Note: This is the same API format used by OpenAI's structured output feature. For more details, see the [OpenAI Documentation on Structured Outputs](https://platform.openai.com/docs/guides/structured-outputs).
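
To see what "fits our specified schema" means in practice, it can help to print the JSON Schema that Pydantic derives from the class. This is a minimal sketch that runs without a server. It assumes Pydantic v2 (the version that provides `model_json_schema()`), and the schema the OpenAI SDK actually sends may differ, since the SDK can adjust it before making the request.

```python
import json

from pydantic import BaseModel


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


# Pydantic v2 derives a JSON Schema dictionary from the field annotations.
schema = CalendarEvent.model_json_schema()
print(json.dumps(schema, indent=2))
```

Each annotated field becomes a required property of the schema, which is why a response missing `participants` would not validate.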

In [14]:


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]

completion = openai_client.beta.chat.completions.parse(
    model="model",
    messages=[
        {"role": "system", "content": "Extract the event information."},
        {"role": "user", "content": "Alice and Bob are going to a science fair on Friday."},
    ],
    response_format=CalendarEvent,
)

event = completion.choices[0].message.parsed

2025-03-07 15:06:37,532 - httpx - INFO - HTTP Request: POST http://localhost:51135/v1/chat/completions "HTTP/1.1 200 OK"


## Examining the Result

Let's print out the structured output to see what the model extracted from our input text. Notice how the response is now a proper Python object with typed attributes rather than raw text.

In [19]:
# Show the entire object
print(f"Full event object: {event}")

# Access individual attributes
print(f"\nEvent name: {event.name}")
print(f"Event date: {event.date}")
print(f"Participants: {', '.join(event.participants)}")

# We can use it like any Python object
if "Alice" in event.participants:
    print("\nAlice is attending!")

# We can modify it
event.participants.append("Charlie")
print(f"\nUpdated participants: {event.participants}")

# We can convert to dict or JSON
import json
print(f"\nAs dictionary: {event.model_dump()}")
print(f"As JSON: {json.dumps(event.model_dump())}")

Full event object: name='science fair' date='Friday' participants=['Alice', 'Bob']

Event name: science fair
Event date: Friday
Participants: Alice, Bob

Alice is attending!

Updated participants: ['Alice', 'Bob', 'Charlie']

As dictionary: {'name': 'science fair', 'date': 'Friday', 'participants': ['Alice', 'Bob', 'Charlie']}
As JSON: {"name": "science fair", "date": "Friday", "participants": ["Alice", "Bob", "Charlie"]}
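
The conversion also works in the other direction, which is useful when the JSON has been stored or passed between services. The sketch below assumes Pydantic v2 and builds a `CalendarEvent` by hand instead of calling the model, so it runs without a server. The name `sample` is used for illustration and does not appear elsewhere in the notebook.

```python
from pydantic import BaseModel, ValidationError


class CalendarEvent(BaseModel):
    name: str
    date: str
    participants: list[str]


sample = CalendarEvent(name="science fair", date="Friday", participants=["Alice", "Bob"])

# Serialize to a JSON string, then validate it back into a typed object.
as_json = sample.model_dump_json()
restored = CalendarEvent.model_validate_json(as_json)
print(restored == sample)

# JSON that is missing a required field is rejected instead of silently accepted.
try:
    CalendarEvent.model_validate_json('{"name": "science fair", "date": "Friday"}')
except ValidationError as exc:
    print(f"Rejected with {exc.error_count()} validation error(s)")
```

Validating on the way back in gives downstream code the same guarantee the structured output call provides: every `CalendarEvent` it receives has all three fields with the expected types.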
