<center>
<img src="https://supportvectors.ai/logo-poster-transparent.png" width="400px" style="opacity:0.7">
</center>

In [1]:
%run supportvectors-common.ipynb


<div style="color:#aaa;font-size:8pt">
<hr/>
&copy; SupportVectors. All rights reserved. <blockquote>This notebook is the intellectual property of SupportVectors, and part of its training material. 
Only participants in SupportVectors workshops are currently allowed to study the notebooks for educational purposes; copying or using them for any other purpose without written permission is prohibited.

<b> These notebooks are chapters and sections from the textbook on Data Science that Asif Qamar is writing. We therefore request that you do not circulate the material to others.</b>
 </blockquote>
 <hr/>
</div>



# Using Instructor Framework

## Instructor Framework for LLMs: OpenAI and Ollama



Instructor is a Python library that wraps the client for OpenAI's API (and other OpenAI-compatible LLM endpoints) so that each response is parsed and validated into a Pydantic model, which enforces structured output.
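
To make "structured output" concrete, the sketch below runs only the validation step, with no LLM call: raw JSON text is parsed into a Pydantic model, and malformed input is rejected. The `Person` model and both JSON strings are hypothetical stand-ins for an LLM reply, and the snippet assumes Pydantic v2 (for `model_validate_json`).

```python
from pydantic import BaseModel, ValidationError


class Person(BaseModel):
    """Hypothetical schema, used only for this illustration."""
    name: str
    age: int


# A well-formed reply parses into a typed object.
good = Person.model_validate_json('{"name": "Ada", "age": 36}')
print(good.name, good.age)

# A reply with a missing field is rejected instead of passing through silently.
try:
    Person.model_validate_json('{"name": "Ada"}')
    error_count = 0
except ValidationError as exc:
    error_count = len(exc.errors())
    print(f"rejected with {error_count} validation error(s)")
```

Instructor applies this kind of check to the LLM's reply, so downstream code receives a typed object rather than free-form text.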

Install Instructor together with Pydantic and the OpenAI client (all three are already included in `svlearn-bootcamp`):

`uv add instructor pydantic openai`


In [2]:
# Import `patch` from Instructor, plus the OpenAI client, Pydantic's BaseModel, and rich's print

from instructor import patch
from openai import OpenAI
from pydantic import BaseModel
from rich import print as rprint



### 1. Connecting to OpenAI API with Instructor


In [3]:

# First, create a patched OpenAI client. OpenAI() reads OPENAI_API_KEY from the environment,
# so the key stored in your .env file must already be loaded when this cell runs.
client = patch(OpenAI())


In [None]:
from pydantic import Field
# Define a structured output model
class ResponseModel(BaseModel):
    summary: str = Field(description="A concise summary of the input text.")
    keywords: list[str] = Field(description="A list of keywords extracted from the input text.")
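
The `Field` descriptions are more than documentation: they become part of the JSON schema that Pydantic generates for the model. Instructor builds its request from that schema, so clear descriptions should give the LLM better guidance about what each field expects. The sketch below only inspects the schema locally; it assumes Pydantic v2 (for `model_json_schema`) and makes no API call.

```python
import json

from pydantic import BaseModel, Field


# Same model as above, repeated so this snippet runs on its own.
class ResponseModel(BaseModel):
    summary: str = Field(description="A concise summary of the input text.")
    keywords: list[str] = Field(description="A list of keywords extracted from the input text.")


# The generated schema carries field names, types, and descriptions.
schema = ResponseModel.model_json_schema()
print(json.dumps(schema, indent=2))
```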


In [5]:

# Query OpenAI with structured response enforcement
response = client.chat.completions.create(
    model="gpt-4o-mini",  # alternative: "gpt-4-turbo"
    messages=[{"role": "user", "content": "Summarize the theory of relativity and provide keywords."}],
    response_model=ResponseModel  # Enforces structured response
)


In [6]:
rprint(response)
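
The returned `response` is an ordinary `ResponseModel` instance, so its fields can be used directly in Python. The sketch below builds a stand-in instance by hand (the text is made up for illustration and is not actual model output) to show attribute access and serialization with Pydantic v2's `model_dump` and `model_dump_json`.

```python
from pydantic import BaseModel


# Same fields as ResponseModel above, repeated so this snippet runs on its own.
class ResponseModel(BaseModel):
    summary: str
    keywords: list[str]


# Hand-written stand-in for what the patched client would return.
example = ResponseModel(
    summary="Relativity relates space, time, and gravity.",
    keywords=["spacetime", "gravity", "speed of light"],
)

first_keyword = example.keywords[0]      # typed attribute access
as_dict = example.model_dump()           # plain Python dict
as_json = example.model_dump_json()      # JSON string, e.g. for logging or storage

# The JSON form can be validated back into an equal model instance.
round_trip = ResponseModel.model_validate_json(as_json)
print(first_keyword, as_dict, round_trip == example)
```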


### 2. Using Instructor with a Locally Hosted Ollama Model


In [7]:
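import instructor
# To use Ollama, make sure the Ollama server is running locally.
# Point base_url at Ollama's OpenAI-compatible endpoint. Ollama ignores the API key,
# but the OpenAI client requires one, so pass a placeholder.
ollama_client = patch(
    OpenAI(base_url="http://localhost:11434/v1/", api_key="ollama"),
    mode=instructor.Mode.JSON,
)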
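
Before sending requests to the local client, it can help to confirm that Ollama is reachable and that the model you plan to use has been pulled. The sketch below uses only the standard library and Ollama's native `/api/tags` endpoint, which lists locally available models. The helper name `list_ollama_models` and the two-second timeout are illustrative choices, not part of Instructor or Ollama.

```python
import http.client
import json
import urllib.error
import urllib.request
from typing import Optional


def list_ollama_models(base_url: str = "http://localhost:11434",
                       timeout: float = 2.0) -> Optional[list[str]]:
    """Return the names of locally available Ollama models, or None if the server is unreachable."""
    try:
        with urllib.request.urlopen(f"{base_url}/api/tags", timeout=timeout) as resp:
            payload = json.load(resp)
    except (urllib.error.URLError, http.client.HTTPException, OSError, ValueError):
        # Connection refused, timeout, or a non-JSON reply: treat as "not available".
        return None
    if not isinstance(payload, dict):
        return None
    return [m.get("name", "") for m in payload.get("models", [])]


models = list_ollama_models()
if models is None:
    print("Ollama does not appear to be running on localhost:11434.")
else:
    print("Available models:", models)
```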


In [8]:

# Request a structured response from a local LLM (e.g., Mistral or Llama 3.2).
# The model named here must already be pulled in Ollama.
response = ollama_client.chat.completions.create(
    model="mistral",  # alternative: "llama3.2"
    messages=[{"role": "user", "content": "Explain quantum entanglement and list key principles."}],
    response_model=ResponseModel,
)


In [9]:

rprint(response)



### 3. Handling More Complex Outputs with Pydantic Models


In [16]:

# Define a more complex structured model
class DetailedResponseModel(BaseModel):
    summary: str
    keywords: list[str]
    references: list[str]
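
Fields can also be Pydantic models themselves, which gives each list entry its own typed structure. The sketch below is a hypothetical variation on `DetailedResponseModel`: the `Reference` and `NestedResponseModel` names, their fields, and the placeholder data are assumptions for illustration and are not part of this notebook. It validates a hand-written dictionary rather than calling an LLM.

```python
from typing import Optional

from pydantic import BaseModel


class Reference(BaseModel):
    """Hypothetical structured reference entry."""
    title: str
    author: str
    year: Optional[int] = None  # optional: omitted values default to None


class NestedResponseModel(BaseModel):
    """Hypothetical variant of DetailedResponseModel with structured references."""
    summary: str
    keywords: list[str]
    references: list[Reference]


# Placeholder data standing in for an LLM reply.
raw = {
    "summary": "Black holes are regions where gravity prevents light from escaping.",
    "keywords": ["event horizon", "singularity"],
    "references": [
        {"title": "Example Title A", "author": "Example Author", "year": 1999},
        {"title": "Example Title B", "author": "Example Author"},
    ],
}

# Nested dictionaries are validated into Reference instances.
nested = NestedResponseModel.model_validate(raw)
print(nested.references[0].title, nested.references[1].year)
```

A model like this can be passed as `response_model` in the same way as the flat models in this notebook.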


In [17]:

# Query OpenAI with a detailed structured response
response = client.chat.completions.create(
    model="gpt-4o-mini",  # alternative: "gpt-4-turbo"
    messages=[{"role": "user", "content": "Describe black holes and list references for further reading."}],
    response_model=DetailedResponseModel
)


In [18]:

rprint(response)



### 4. Customizing Behavior with Instructor


In [13]:
from pydantic import field_validator
# Validation rules are declared on the Pydantic model; Instructor applies them when it parses the LLM response.
# For example, require that the output lists a minimum number of keywords:
class StrictResponseModel(BaseModel):
    summary: str
    keywords: list[str]
    references: list[str]
    
    @field_validator("keywords")
    @classmethod
    def check_keywords_length(cls, v):
        if len(v) < 3:
            raise ValueError("At least 3 keywords required.")
        return v
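
The validator can be exercised without any LLM call, which makes it easy to see what happens when a reply falls short. The sketch below repeats `StrictResponseModel` so that it runs on its own and feeds it hand-written data. Instructor's `create` call also accepts a `max_retries` argument that re-asks the model when validation fails; it is mentioned here only as a pointer and is not exercised by this snippet.

```python
from pydantic import BaseModel, ValidationError, field_validator


# Same model as above, repeated so this snippet runs on its own.
class StrictResponseModel(BaseModel):
    summary: str
    keywords: list[str]
    references: list[str]

    @field_validator("keywords")
    @classmethod
    def check_keywords_length(cls, v):
        if len(v) < 3:
            raise ValueError("At least 3 keywords required.")
        return v


# Two keywords: the validator rejects the data.
try:
    StrictResponseModel(summary="Too few keywords.", keywords=["a", "b"], references=[])
    message = ""
except ValidationError as exc:
    message = exc.errors()[0]["msg"]
    print(message)

# Three keywords: the data passes validation.
ok = StrictResponseModel(summary="Enough keywords.", keywords=["a", "b", "c"], references=[])
print(ok.keywords)
```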


In [14]:

response = client.chat.completions.create(
    model="gpt-4o-mini",  # alternative: "gpt-4-turbo"
    messages=[{"role": "user", "content": "Explain superconductivity with key takeaways."}],
    response_model=StrictResponseModel
)


In [15]:

rprint(response)



### 5. Summary
- Instructor helps enforce structured LLM output using Pydantic.
- It works with both OpenAI's API and local models via Ollama.
- Using Pydantic, we can enforce validation, typing, and custom logic.

This notebook demonstrates practical usage of Instructor for structured AI output.
