## Summary

Repo: https://github.com/pgahq/instructor-groq-openai-llm-examples

This notebook shows how to use Instructor to extract structured information from unstructured text when each field must take one of a fixed list of values. Instructor supports both [Enum and Literal](https://jxnl.github.io/instructor/concepts/enums/) for this, and handles them differently. This notebook uses Literal, which seems simpler.
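
For comparison, here is a minimal sketch of the two ways to declare the same closed set of values in plain Python. It uses only the standard library. The names `SentimentEnum` and `SentimentLiteral` are illustrative and do not appear in the notebook, and the sketch does not show how Instructor itself treats each form.

```python
from enum import Enum
from typing import Literal, get_args


# Enum style: a class with one named member per allowed value.
# (Illustrative name, not part of the notebook.)
class SentimentEnum(str, Enum):
    GOOD = "good vibes"
    OK = "ok-ish"
    BAD = "not incredible"


# Literal style: the allowed values are listed inline in the type itself.
# (Illustrative name, not part of the notebook.)
SentimentLiteral = Literal["good vibes", "ok-ish", "not incredible"]

# Both forms describe the same set of strings.
enum_values = [member.value for member in SentimentEnum]
literal_values = list(get_args(SentimentLiteral))
same_values = enum_values == literal_values

# With Enum, a raw string is converted into a member object;
# with Literal, the value stays a plain string.
member = SentimentEnum("ok-ish")
```

The Literal form needs no extra class and keeps values as plain strings, which is likely part of why it feels simpler here.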

Note: this notebook assumes you're using Google Colab. You can safely edit and experiment here, or go to `File` -> `Save a copy in Google Drive` to make your own version.

In [1]:
!pip install --quiet instructor groq openai jsonref


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m24.1.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In the left sidebar, click the key icon and add two secrets containing your API keys. Be sure to enable "Notebook access" for each one. This is how Google Colab stores secrets; you're not sharing your keys with anyone.

OPENAI_API_KEY - get a key from https://platform.openai.com/api-keys

GROQ_API_KEY - get a key from https://console.groq.com/keys

In [2]:
import instructor
import openai
import groq
from pydantic import BaseModel, Field
from typing import Optional, List, Literal
import os

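try:
    # In Colab, read the keys from the Secrets panel.
    from google.colab import userdata
    os.environ['OPENAI_API_KEY'] = '' or userdata.get('OPENAI_API_KEY')  # or put your key in the '' on this line
    os.environ['GROQ_API_KEY'] = '' or userdata.get('GROQ_API_KEY')  # or put your key in the '' on this line
except Exception:
    # Not running in Colab, or a secret is unavailable:
    # fall back to whatever is already set in the environment.
    pass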

if not os.environ.get('OPENAI_API_KEY') or not os.environ.get('GROQ_API_KEY'):
    raise ValueError("Both OPENAI_API_KEY and GROQ_API_KEY environment variables must be set and non-empty. Read the text in the notebook (above this block) for more info.")


In [15]:
inference_provider = "openai"   # "openai" or "groq"
client = instructor.from_openai(openai.OpenAI()) if inference_provider == "openai" else instructor.from_groq(groq.Groq())

class Review(BaseModel):
    business_type: Literal[
        "dining establishment", 
        "service business", 
        "hotel", 
        "other"
    ] = Field(description="Type of business.")

    sentiment: Literal[
        "good vibes", 
        "ok-ish", 
        "not incredible"
    ] = Field(description="Sentiment of the review.")

messages = [
    {"role": "user", "content": "Amazing biscuits"},
    {"role": "user", "content": "A shower in every room"},
    {"role": "user", "content": "Inaccurate appointment reminders"}
]

for message in messages:
    response = client.chat.completions.create(
        model="llama-3.1-70b-versatile" if inference_provider == "groq" else "gpt-4o-mini",
        response_model=Review, # this is Instructor at work!
        temperature=0.0,
        messages=[message]
    )
    
    print(f"For message: '{message['content']}'")
    print(response.model_dump_json(indent=4))
    print("\n")


For message: 'Amazing biscuits'
{
    "business_type": "dining establishment",
    "sentiment": "good vibes"
}


For message: 'A shower in every room'
{
    "business_type": "hotel",
    "sentiment": "good vibes"
}


For message: 'Inaccurate appointment reminders'
{
    "business_type": "service business",
    "sentiment": "not incredible"
}
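
To see why each output above lands on one of the listed values, here is a minimal sketch that uses Pydantic alone, with no API call. It repeats the notebook's `Review` model and validates two hand-written payloads. The payloads are hypothetical, and the value `"great"` is an invented example of an out-of-list answer. Instructor validates model output against `response_model`, so a value outside the `Literal` list would fail validation like this instead of reaching your code.

```python
from typing import Literal

from pydantic import BaseModel, Field, ValidationError


class Review(BaseModel):
    business_type: Literal[
        "dining establishment",
        "service business",
        "hotel",
        "other",
    ] = Field(description="Type of business.")

    sentiment: Literal[
        "good vibes",
        "ok-ish",
        "not incredible",
    ] = Field(description="Sentiment of the review.")


# Hypothetical payload that stays inside both Literal lists: accepted.
valid = Review.model_validate(
    {"business_type": "hotel", "sentiment": "good vibes"}
)

# Hypothetical payload with a sentiment outside the list: rejected.
try:
    Review.model_validate({"business_type": "hotel", "sentiment": "great"})
    rejected = False
except ValidationError:
    rejected = True

print(valid.model_dump_json(indent=4))
print(f"Out-of-list value rejected: {rejected}")
```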


