# Getting Started With OpenAI Structured Outputs

## Introduction

In August 2024, OpenAI announced Structured Outputs, a powerful new feature of its API. As the name suggests, Structured Outputs ensures that an LLM generates responses only in a format you specify. This makes it significantly easier to build applications that require precise data formatting. In this tutorial, you will learn how to get started with the feature, understand its syntax, and explore its key applications.

## Importance of Structured Outputs in AI Applications

Deterministic responses, that is, responses that always arrive in the same format, are crucial for many tasks, such as data entry, information retrieval, question answering, and multi-step workflows. As you have probably seen, LLMs can generate outputs in wildly different formats even when the prompt is the same.

For example, consider this simple hotel reviews sentiment classifier:

```python
from openai import OpenAI

client = OpenAI()
MODEL = "gpt-4o"

SYSTEM_PROMPT = "You are a sentiment classifier assistant."
PROMPT_TEMPLATE = """
    Classify the sentiment of the following hotel review as positive, negative, or neutral:\n\n{review}
"""

# Function to classify sentiment using OpenAI's chat completions API
def classify_sentiment(review):
    response = client.chat.completions.create(
        model=MODEL,
        messages=[
            {"role": "system", "content": SYSTEM_PROMPT},
            {"role": "user", "content": PROMPT_TEMPLATE.format(review=review)}
        ],
    )
    return response.choices[0].message.content.strip()

# List of hotel reviews
reviews = [
    "The room was clean and the staff was friendly.",
    "The location was terrible and the service was slow.",
    "The food was amazing but the room was too small.",
]

# Classify sentiment for each review and print the results
for review in reviews:
    sentiment = classify_sentiment(review)
    print(f"Review: {review}\nSentiment: {sentiment}\n")
```

```text
Review: The room was clean and the staff was friendly.
Sentiment: Positive

Review: The location was terrible and the service was slow.
Sentiment: Negative

Review: The food was amazing but the room was too small.
Sentiment: The sentiment of the review is neutral.
```

Even though the first two responses are in the same single-word format, the last one is an entire sentence. If a downstream application depended on this output, it could crash, because it would be expecting a single-word response.
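To make that failure mode concrete, here is a minimal sketch of a strict downstream consumer. The `parse_label` helper and the `ALLOWED_LABELS` set are hypothetical names introduced only for illustration; they assume the consumer accepts nothing but a bare one-word label.

```python
# Hypothetical downstream consumer that expects a bare one-word label.
ALLOWED_LABELS = {"positive", "negative", "neutral"}


def parse_label(raw: str) -> str:
    """Return the normalized label, or raise ValueError if the text is not one."""
    label = raw.strip().lower()
    if label not in ALLOWED_LABELS:
        raise ValueError(f"Unexpected sentiment format: {raw!r}")
    return label


# The three model responses from the example run above.
responses = [
    "Positive",
    "Negative",
    "The sentiment of the review is neutral.",
]

for raw in responses:
    try:
        print(parse_label(raw))
    except ValueError as err:
        print(f"Failed: {err}")
```

The first two responses pass, while the third raises `ValueError` even though it carries the right answer, because the answer is wrapped in a sentence the parser was never written to handle.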

We can fix this problem with some prompt engineering, but that is a time-consuming, iterative process. Even with a perfect prompt, we can't be 100% sure that future responses will conform to our format. Unless, of course, we use Structured Outputs:

```python
def classify_sentiment_with_structured_outputs(review):
    ...

# Classify sentiment for each review with Structured Outputs
for review in reviews:
    sentiment = classify_sentiment_with_structured_outputs(review)
    print(f"Review: {review}\nSentiment: {sentiment}\n")
```


```text
Review: The room was clean and the staff was friendly.
Sentiment: {"sentiment":"positive"}

Review: The location was terrible and the service was slow.
Sentiment: {"sentiment":"negative"}

Review: The food was amazing but the room was too small.
Sentiment: {"sentiment":"neutral"}
```

With the new function, `classify_sentiment_with_structured_outputs`, every response arrives in the same JSON format.
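The body of `classify_sentiment_with_structured_outputs` is elided above. As a rough sketch of what such a function could look like, the code below assumes the Chat Completions `response_format` parameter with the `json_schema` type in strict mode. The schema name `sentiment_response`, the helpers `build_request` and `extract_label`, the prompt wording, and the model snapshot are all illustrative assumptions rather than the tutorial's actual implementation.

```python
import json

# Assumption: a model snapshot that supports Structured Outputs.
MODEL = "gpt-4o-2024-08-06"

# JSON Schema describing the only shape the response may take.
# Strict mode expects every property to be required and no extra properties.
SENTIMENT_SCHEMA = {
    "type": "object",
    "properties": {
        "sentiment": {
            "type": "string",
            "enum": ["positive", "negative", "neutral"],
        }
    },
    "required": ["sentiment"],
    "additionalProperties": False,
}


def build_request(review: str) -> dict:
    """Assemble keyword arguments for client.chat.completions.create."""
    return {
        "model": MODEL,
        "messages": [
            {"role": "system", "content": "You are a sentiment classifier assistant."},
            {"role": "user", "content": f"Classify the sentiment of this hotel review:\n\n{review}"},
        ],
        "response_format": {
            "type": "json_schema",
            "json_schema": {
                "name": "sentiment_response",  # hypothetical schema name
                "strict": True,
                "schema": SENTIMENT_SCHEMA,
            },
        },
    }


def classify_sentiment_with_structured_outputs(review: str) -> str:
    """Call the API and return the raw JSON string produced by the model."""
    from openai import OpenAI  # imported here so the helpers above work offline

    client = OpenAI()
    response = client.chat.completions.create(**build_request(review))
    return response.choices[0].message.content


def extract_label(raw_json: str) -> str:
    """Pull the label out of a response such as '{"sentiment":"neutral"}'."""
    return json.loads(raw_json)["sentiment"]
```

Because the response is constrained to the schema, downstream code can read the label with `extract_label` instead of guessing at free-form text.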

## Getting Started With Structured Outputs

In this section, we will explain how Structured Outputs work by continuing the example of the review sentiment classifier.

### Setting Up Your Environment

#### Prerequisites

Before you begin, ensure you have the following:

- Python 3.7 or later installed on your system.
- An OpenAI API key. You can obtain this by signing up on the [OpenAI website](https://openai.com/).

#### Setting Up the OpenAI API

1. **Install the OpenAI Python package**:
   Open your terminal and run the following command to install or update the OpenAI Python package to the latest version:
   ```bash
   pip install -U openai
   ```

2. **Set up your API key**:
   You can set your API key as an environment variable or directly in your code. To set it as an environment variable, run:
   ```bash
   export OPENAI_API_KEY='your-api-key'
   ```

3. **Verify the installation**:
   Create a simple Python script to verify the installation:

   ```python
   from openai import OpenAI

   client = OpenAI()

   response = client.chat.completions.create(
       model="gpt-4o-mini",
       messages=[
           {"role": "system", "content": "You are a helpful assistant."},
           {"role": "user", "content": "Say hello!"}
       ],
       max_tokens=5
   )

   print(response.choices[0].message.content.strip())
   ```

   ```text
   Hello! How can I
   ```

Run the script to ensure everything is set up correctly. You should see the model's response printed in the terminal. The reply is cut off mid-sentence because `max_tokens=5` limits it to five tokens.
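If the script fails with an authentication error instead, the key is most likely not visible to the Python process. The sketch below shows one way to check for it before creating a client. `require_api_key` and `make_client` are hypothetical helper names; the sketch relies only on the `OPENAI_API_KEY` environment variable set earlier and on the `api_key` argument of the `OpenAI` constructor.

```python
import os


def require_api_key(env=os.environ) -> str:
    """Return the key from OPENAI_API_KEY, or raise a readable error."""
    key = env.get("OPENAI_API_KEY", "").strip()
    if not key:
        raise RuntimeError(
            "OPENAI_API_KEY is not set. Export it in your shell before running the script."
        )
    return key


def make_client():
    """Create a client with the key passed explicitly rather than read implicitly."""
    from openai import OpenAI  # imported lazily so the check above runs without the package

    return OpenAI(api_key=require_api_key())
```

Calling `OpenAI()` with no arguments reads the same environment variable on its own, so the explicit version is mainly useful for producing a clearer error message.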

#### Installing Required Libraries

In addition to the OpenAI package, you will need the Pydantic library to define and validate JSON schemas for Structured Outputs. Install it using pip:

```bash
pip install pydantic
```

With these steps, your environment is now set up to use OpenAI's Structured Outputs feature. 
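As a quick check that Pydantic is installed, and as a preview of how it relates to JSON schemas, the sketch below defines a small model and validates two payloads against it. It assumes Pydantic v2 (`model_json_schema` and `model_validate_json`); the `SentimentResponse` model is a hypothetical example based on the sentiment classifier, not a definition taken from the OpenAI documentation.

```python
from typing import Literal

from pydantic import BaseModel, ValidationError


class SentimentResponse(BaseModel):
    """Hypothetical model for the classifier output used in this tutorial."""

    sentiment: Literal["positive", "negative", "neutral"]


# Pydantic can emit the JSON Schema that corresponds to the model.
schema = SentimentResponse.model_json_schema()
print(schema["properties"]["sentiment"]["enum"])

# A conforming payload parses into a typed object.
parsed = SentimentResponse.model_validate_json('{"sentiment": "neutral"}')
print(parsed.sentiment)

# A non-conforming payload is rejected with a ValidationError.
try:
    SentimentResponse.model_validate_json('{"sentiment": "mostly fine"}')
except ValidationError as err:
    print(f"Rejected: {err.error_count()} error(s)")
```

Declaring the field as a `Literal` is what restricts it to the three labels; a plain `str` annotation would accept any text.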

### Defining the Schema Using Pydantic

### Using the `parse` Helper

### Handling Refusals

## Function Calling with Structured Outputs

## Best Practices

## Conclusion

### Table of Contents

1. **Introduction**
   - Overview of Structured Outputs
   - Importance of Structured Outputs in AI Applications

2. **Environment Setup**
   - Prerequisites
   - Setting Up the OpenAI API
   - Installing Required Libraries (Python, Node.js)

3. **Understanding Structured Outputs**
   - What are Structured Outputs?
   - Key Features and Benefits

4. **Getting Started with Structured Outputs**
   - Enabling Structured Outputs in API Calls
   - Using JSON Schemas
   - Function Calling with Structured Outputs

5. **Common Use Cases**
   - Extracting Structured Data from Unstructured Inputs
   - Generating User Interfaces Based on User Intent
   - Populating Databases with Extracted Content

6. **Best Practices**
   - Ensuring Schema Compliance
   - Handling Refusals and Errors

7. **Conclusion**
   - Recap of Key Points
   - Additional Resources and Documentation