I came across a YouTube video about 'Make your Agents teachable!'.

I thought I had missed something...

In reality, it was extracting data from a response and then storing it in a DB so that it could be retrieved as needed in the future.

Again, this is traditional Python, not some new programming structure.


We saw in 05 that tool calling can give us not just the function we need to run but also its arguments.

This example goes further with data extraction.

In [1]:
import os
import json
from openai import OpenAI
from dotenv import load_dotenv
from pprint import pprint

In [2]:
def get_llm_client(llm_choice):
    if llm_choice == "GROQ":
        client = OpenAI(
            base_url="https://api.groq.com/openai/v1",
            api_key=os.environ.get("GROQ_API_KEY"),
        )
        return client
    elif llm_choice == "OPENAI":
        load_dotenv()  # load environment variables from .env file
        client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))
        return client
    else:
        raise ValueError("Invalid LLM choice. Please choose 'GROQ' or 'OPENAI'.")

In [3]:
# Load environment variables from a file called .env
# Print the key prefixes to help with any debugging
load_dotenv()

OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
GROQ_API_KEY = os.getenv("GROQ_API_KEY")

# Switch provider by swapping which of these two lines is commented out
# LLM_CHOICE = "OPENAI"
LLM_CHOICE = "GROQ"

if OPENAI_API_KEY:
    print(f"OPENAI_API_KEY exists and begins {OPENAI_API_KEY[:14]}...")
else:
    print("OPENAI_API_KEY not set")

if GROQ_API_KEY:
    print(f"GROQ_API_KEY exists and begins {GROQ_API_KEY[:14]}...")
else:
    print("GROQ_API_KEY not set")


client = get_llm_client(LLM_CHOICE)
if LLM_CHOICE == "GROQ":
    MODEL = "llama-3.3-70b-versatile"
else:
    MODEL = "gpt-4o-mini"

print(f"LLM_CHOICE: {LLM_CHOICE} - MODEL: {MODEL}")

OPENAI_API_KEY exists and begins sk-proj-1WUVgv...
GROQ_API_KEY exists and begins gsk_11hFN1EMfj...
LLM_CHOICE: GROQ - MODEL: llama-3.3-70b-versatile


Here we instruct the Agent to extract data from a conversation, and we use some XML-style tags as part of our prompt. Many LLMs also use tags like these in their responses.


I am going to use some different syntax just to show that we can create our own output formats. We can use CAPS and Markdown styling to emphasise important points.


Remember to ask: can we, as humans/developers, understand what the prompt requires? If so, the LLM should be able to understand it too, as it has been trained on enough similar examples to predict what to do.

In [4]:
system_message = """
You are a teachability agent. You examine a conversation listed between <conv> and </conv> and output a list of pertinent facts,
as well as a concise summary. 
The OUTPUT FORMAT *must be* in the following JSON FORMAT:

{
    "summary": <SUMMARY>,
    "number_of_facts": <NUMBER_OF_FACTS>,
    "facts": [<FACTS>]
}

A fact is a dictionary with the following keys: "fact" and "category".

Here are some examples of facts:

{"fact": "Charles is a vegan and won't eat any meat.", "category": "personal"}
{"fact": "Charles works in Brighton", "category": "work"}
{"fact": "They have four dogs", "category": "pets"}

## NOT A FACT
This is not a fact as it does not refer to a person:
"London is a city and the capital of England"

## NUMBER_OF_FACTS
<NUMBER_OF_FACTS> stores the number of facts in the "facts" list
*Be as specific as you can about the categories*
"""

In [5]:
data = """Paul is a vegan and won't eat any meat. He has been a vegan for over five years when he met his current wife Angela. They have two dogs, Roxy and Petra, and they both eat meat.

They both work in London but live in Brighton near Seven Dials.

Their job description is Django Developers.

They travel a lot and have visited the following countries in the last year - Italy, France and Germany.

They have an active YouTube channel where they post videos about their travels.

They have a cat named Marmalade.

Brighton is based on the South Coast of England and is known for its beaches and nightlife."""

In [6]:
system_message += "<conv>" + data + "</conv>"

In [7]:
prompts = [
    {"role": "system", "content": system_message},
    {"role": "user", "content": ""},
]

In [8]:
response = client.chat.completions.create(model=MODEL, messages=prompts, temperature=0)
res = response.choices[0].message.content.replace("\n", "")

In [9]:
output = json.loads(res)
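
The `json.loads` call above assumes the model returns bare JSON. Some models wrap the reply in Markdown code fences or miscount the facts, so a slightly more defensive version might look like the sketch below. `parse_llm_json` and `check_extraction` are hypothetical helper names, not part of the notebook or of any library, and the sample reply is made up for illustration.

```python
import json
import re

FENCE = "`" * 3  # a Markdown code fence, built here to keep this example readable


def parse_llm_json(raw: str) -> dict:
    """Parse a JSON object from an LLM reply, tolerating Markdown code fences."""
    text = raw.strip()
    # Strip a leading/trailing fence (optionally tagged "json") if present.
    fenced = re.match(r"^`{3}(?:json)?\s*(.*?)\s*`{3}$", text, flags=re.DOTALL)
    if fenced:
        text = fenced.group(1)
    return json.loads(text)


def check_extraction(output: dict) -> list[str]:
    """Return a list of problems found in the extracted output (empty if none)."""
    problems = []
    for key in ("summary", "number_of_facts", "facts"):
        if key not in output:
            problems.append(f"missing key: {key}")
    facts = output.get("facts", [])
    # The prompt asks the model to count its own facts, so verify the count.
    if output.get("number_of_facts") != len(facts):
        problems.append("number_of_facts does not match len(facts)")
    for i, fact in enumerate(facts):
        if not {"fact", "category"} <= set(fact):
            problems.append(f"fact {i} lacks 'fact' or 'category'")
    return problems


# Made-up reply, fenced the way some models return JSON.
sample_reply = (
    FENCE + "json\n"
    + '{"summary": "s", "number_of_facts": 1, '
    + '"facts": [{"fact": "Paul is a vegan", "category": "diet"}]}'
    + "\n" + FENCE
)
parsed = parse_llm_json(sample_reply)
print(check_extraction(parsed))
```

In the notebook, `parse_llm_json(response.choices[0].message.content)` could stand in for the `replace` plus `json.loads` pair, with `check_extraction` run before anything is stored.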

Having extracted this information, we can store it in 'short-term memory' (an app-level cache) or in long-term memory (a DB, for example), then retrieve it as needed and add it to the SYSTEM MESSAGE.
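
As a rough illustration of the short-term option, the sketch below keeps facts in a plain dict and appends them to a system message. The names `memory_cache`, `remember` and `build_system_message`, the user id and the `## KNOWN FACTS ABOUT THE USER` heading are all assumptions made for this example.

```python
# Short-term memory: a plain dict keyed by user id (hypothetical app-level cache).
memory_cache: dict[str, list[dict]] = {}


def remember(user_id: str, facts: list[dict]) -> None:
    """Add facts for a user, skipping exact duplicates."""
    known = memory_cache.setdefault(user_id, [])
    for fact in facts:
        if fact not in known:
            known.append(fact)


def build_system_message(base_prompt: str, user_id: str) -> str:
    """Append the user's remembered facts to a base system prompt."""
    facts = memory_cache.get(user_id, [])
    if not facts:
        return base_prompt
    lines = [f"- ({f['category']}) {f['fact']}" for f in facts]
    return base_prompt + "\n\n## KNOWN FACTS ABOUT THE USER\n" + "\n".join(lines)


# Two facts in the same shape as output["facts"].
remember("user-1", [
    {"fact": "Paul is a vegan and won't eat any meat.", "category": "personal"},
    {"fact": "Paul and Angela live in Brighton", "category": "residence"},
])
print(build_system_message("You are a helpful assistant.", "user-1"))
```

The cache disappears when the process ends, which is what makes it 'short term'.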


In [10]:
pprint(output)

{'facts': [{'category': 'personal',
            'fact': "Paul is a vegan and won't eat any meat."},
           {'category': 'personal',
            'fact': 'Paul has been a vegan for over five years'},
           {'category': 'personal', 'fact': 'Paul is married to Angela'},
           {'category': 'pets', 'fact': 'They have two dogs, Roxy and Petra'},
           {'category': 'pets', 'fact': 'They have a cat named Marmalade'},
           {'category': 'work',
            'fact': 'Paul and Angela work in London as Django Developers'},
           {'category': 'residence',
            'fact': 'Paul and Angela live in Brighton'}],
 'number_of_facts': 7,
 'summary': 'Paul is a vegan who works as a Django Developer in London, lives '
            'in Brighton with his wife Angela, and has pets including two dogs '
            'and a cat. They enjoy traveling and have a YouTube channel '
            'documenting their adventures.'}


In [11]:

print("SUMMARY")
pprint(output["summary"])
print("FACTS")
pprint(output["facts"])

SUMMARY
('Paul is a vegan who works as a Django Developer in London, lives in Brighton '
 'with his wife Angela, and has pets including two dogs and a cat. They enjoy '
 'traveling and have a YouTube channel documenting their adventures.')
FACTS
[{'category': 'personal', 'fact': "Paul is a vegan and won't eat any meat."},
 {'category': 'personal', 'fact': 'Paul has been a vegan for over five years'},
 {'category': 'personal', 'fact': 'Paul is married to Angela'},
 {'category': 'pets', 'fact': 'They have two dogs, Roxy and Petra'},
 {'category': 'pets', 'fact': 'They have a cat named Marmalade'},
 {'category': 'work',
  'fact': 'Paul and Angela work in London as Django Developers'},
 {'category': 'residence', 'fact': 'Paul and Angela live in Brighton'}]


We could store this in a DB keyed by user id, which would allow more personalisation the next time the user logs in.
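
One way this might look, using only the standard library's `sqlite3`, is sketched below. The table name `facts`, the helpers `save_facts` and `load_facts`, and the user id `user-42` are made up for this example; a real app would more likely use its existing database and user model.

```python
import sqlite3


def save_facts(conn: sqlite3.Connection, user_id: str, facts: list[dict]) -> None:
    """Store extracted facts for a user, ignoring ones already saved."""
    conn.execute(
        """CREATE TABLE IF NOT EXISTS facts (
               user_id  TEXT NOT NULL,
               category TEXT NOT NULL,
               fact     TEXT NOT NULL,
               UNIQUE (user_id, fact)
           )"""
    )
    conn.executemany(
        "INSERT OR IGNORE INTO facts (user_id, category, fact) VALUES (?, ?, ?)",
        [(user_id, f["category"], f["fact"]) for f in facts],
    )
    conn.commit()


def load_facts(conn: sqlite3.Connection, user_id: str) -> list[dict]:
    """Return the stored facts for a user in insertion order."""
    rows = conn.execute(
        "SELECT category, fact FROM facts WHERE user_id = ? ORDER BY rowid",
        (user_id,),
    ).fetchall()
    return [{"category": category, "fact": fact} for category, fact in rows]


# In-memory database for the demo; pass a file path instead to persist between sessions.
conn = sqlite3.connect(":memory:")
facts_to_store = [
    {"fact": "Paul is married to Angela", "category": "personal"},
    {"fact": "They have a cat named Marmalade", "category": "pets"},
]
save_facts(conn, "user-42", facts_to_store)
save_facts(conn, "user-42", facts_to_store)  # second call adds nothing new
pprint_ready = load_facts(conn, "user-42")
print(pprint_ready)
```

At login, the result of `load_facts` can be appended to the SYSTEM MESSAGE in the same way as the cached facts.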