## Overview

Repo: https://github.com/pgahq/instructor-groq-openai-llm-examples

This notebook shows how to use Instructor to extract structured data from text and then validate the result. When validation fails, Instructor feeds a text description of the validation issues back to the LLM (up to the specified number of retries) so the model can correct its own response.
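
The re-ask loop described above can be sketched without any network calls. This is a simplified illustration of the idea, not Instructor's actual implementation: `fake_llm`, `validate_name`, and `extract_with_retries` are hypothetical names, and the stand-in model simply upper-cases its answer once it sees an error message in the conversation.

```python
def validate_name(name: str) -> str:
    """Raise ValueError with a message the model can act on."""
    if not name.isupper():
        raise ValueError("each character must be uppercase.")
    return name


def fake_llm(messages: list[dict]) -> str:
    """Hypothetical stand-in for a chat model: it fixes its answer after seeing feedback."""
    saw_feedback = any("Validation error" in m["content"] for m in messages)
    return "ERIC" if saw_feedback else "Eric"


def extract_with_retries(prompt: str, max_retries: int = 4) -> tuple[str, int]:
    """Return the validated answer and the number of attempts it took."""
    messages = [{"role": "user", "content": prompt}]
    for attempt in range(1, max_retries + 2):  # first try plus max_retries retries
        answer = fake_llm(messages)
        try:
            return validate_name(answer), attempt
        except ValueError as err:
            # Feed the failed answer and the error text back so the model can correct itself.
            messages.append({"role": "assistant", "content": answer})
            messages.append({"role": "user", "content": f"Validation error: {err}"})
    raise RuntimeError("validation still failing after all retries")


result, attempts = extract_with_retries("Eric Smith is 12 years old.")
print(result, attempts)
```

The key design point is that the error message is written for the model to read, so a short, specific sentence tends to work better than a stack trace.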

Note: this notebook assumes you're using Google Colab. You can safely edit and experiment here, or go to `File` -> `Save a copy in Google Drive` to make your own version.

In [1]:
!pip install --quiet instructor groq openai


[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m A new release of pip is available: [0m[31;49m24.0[0m[39;49m -> [0m[32;49m24.1.2[0m
[1m[[0m[34;49mnotice[0m[1;39;49m][0m[39;49m To update, run: [0m[32;49mpip install --upgrade pip[0m


In the left sidebar, click the key icon and add two secrets containing your API keys. Be sure to enable "Notebook access" for each of them. This is simply how Google Colab handles secrets; you're not sharing your keys with anyone.

OPENAI_API_KEY - get a key from https://platform.openai.com/api-keys

GROQ_API_KEY - get a key from https://console.groq.com/keys

In [3]:
import openai
import groq
import instructor
from pydantic import BaseModel, Field, field_validator
import os

try:
    from google.colab import userdata
    os.environ['OPENAI_API_KEY'] = '' or userdata.get('OPENAI_API_KEY') # or put your key in the '' on this line
    os.environ['GROQ_API_KEY'] = '' or userdata.get('GROQ_API_KEY')
except Exception as e:
    # print(e)
    pass

if not os.environ.get('OPENAI_API_KEY') or not os.environ.get('GROQ_API_KEY'):
    raise ValueError("Both OPENAI_API_KEY and GROQ_API_KEY environment variables must be set and non-empty. Read the text in the notebook (above this block) for more info.")
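
Only the key for the provider you actually select is used at call time, so a check that names the missing variable can be easier to act on. A minimal sketch, assuming a hypothetical helper `missing_keys` that is not part of the original notebook:

```python
import os


def missing_keys(required, env=None):
    """Return the names in `required` that are unset or empty in `env`."""
    env = os.environ if env is None else env
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping, so nothing is read from the real environment.
# The key value below is a placeholder, not a real credential.
example_env = {"OPENAI_API_KEY": "sk-example", "GROQ_API_KEY": ""}
print(missing_keys(["OPENAI_API_KEY", "GROQ_API_KEY"], example_env))
# → ['GROQ_API_KEY']
```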

Now to the cool stuff...

In [7]:
inference_provider = "openai"   # "openai" or "groq"
client = instructor.from_openai(openai.OpenAI()) if inference_provider == "openai" else instructor.from_groq(groq.Groq())

class UserDetail(BaseModel):
    """
    Details about the user
    """
    name: str = Field(description="First name (only) of the user.")
    age: int = Field(description="Age of the user.")

    @field_validator("name")
    @classmethod
    def name_must_be_uppercase(cls, v: str) -> str:
        # Named to avoid shadowing BaseModel's own (deprecated) `validate` classmethod.
        print(f"\033[90mValidating: {v}\033[0m")  # Grey text output
        if not v.isupper():
            error_message = "each character must be uppercase."  # this is the text that gets fed back to the LLM on the retry
            print(f"\033[90mError: {error_message}\033[0m")  # Grey text output
            raise ValueError(error_message)
        print("\033[90mSuccess\033[0m")  # Grey text output
        return v

user = client.chat.completions.create(
    model="llama-3.1-70b-versatile" if inference_provider == "groq" else "gpt-4o-mini",
    response_model=UserDetail,
    max_retries=4,  # re-ask the LLM up to 4 times if validation fails
    temperature=0,  # don't be creative
    messages=[{"role": "user", "content": "Eric Smith is 12 years old."}],
)

print("\nFinal result: ", user.name)

Validating: Eric
Error: each character must be uppercase.
Validating: ERIC
Success

Final result:  ERIC
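
The validator can also be exercised without calling an LLM, which is a quick way to see the error text that would be sent back on a retry. A minimal sketch, assuming Pydantic v2 (the version that provides `field_validator`); the model is redefined here without the print statements so the block runs on its own, and the validator name `name_must_be_uppercase` is an illustrative choice:

```python
from pydantic import BaseModel, Field, ValidationError, field_validator


class UserDetail(BaseModel):
    """Details about the user"""
    name: str = Field(description="First name (only) of the user.")
    age: int = Field(description="Age of the user.")

    @field_validator("name")
    @classmethod
    def name_must_be_uppercase(cls, v: str) -> str:
        if not v.isupper():
            raise ValueError("each character must be uppercase.")
        return v


# A mixed-case name fails validation; inspect the structured error.
try:
    UserDetail(name="Eric", age=12)
except ValidationError as exc:
    first_error = exc.errors()[0]
    print(first_error["loc"], first_error["msg"])

# An all-uppercase name passes and is returned unchanged.
print(UserDetail(name="ERIC", age=12).name)
```

Running this locally before spending API calls makes it easy to check that the message is clear enough for the model to act on.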
