# First LLM Classifier scratchpad

A rough draft of the class

## Getting started

Get your API key

- Go to groq.com
- Click on "Dev console" and jump to https://console.groq.com/playground
- Create an account. I logged in with GitHub. You can do whatever you'd like.
- Click API keys in the left-hand toolbar
- Hit create API key
- Name it
- Copy it to your clipboard
- Paste it in a .env file

Open a notebook and install the Python tools we'll use.

In [49]:
!uv pip install pandas
!uv pip install groq
!uv pip install scikit-learn
!uv pip install rich
!uv pip install retry

Using Python 3.12.0 environment at: /home/palewire/Code/first-llm-classifier/.venv
Audited 1 package in 1ms
Using Python 3.12.0 environment at: /home/palewire/Code/first-llm-classifier/.venv
Audited 1 package in 1ms
Using Python 3.12.0 environment at: /home/palewire/Code/first-llm-classifier/.venv
Audited 1 package in 2ms
Using Python 3.12.0 environment at: /home/palewire/Code/first-llm-classifier/.venv
Audited 1 package in 1ms
Using Python 3.12.0 environment at: /home/palewire/Code/first-llm-classifier/.venv
Audited 1 package in 1ms


## First Python prompt

Import Python tools

In [14]:
import os
from rich import print
from groq import Groq

Get the api_key

In [5]:
api_key = os.environ.get("GROQ_API_KEY")
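One thing to watch: `os.environ.get` only sees variables that are already in the process environment, so a key pasted into a `.env` file has to be loaded first. The `python-dotenv` package is a common tool for that. The sketch below sticks to the standard library instead; `load_env_file` and `require_env` are hypothetical helper names made up for this example, and the parser assumes plain `KEY=value` lines.

```python
import os
from pathlib import Path


def load_env_file(path=".env"):
    """Copy simple KEY=value lines from a file into os.environ.

    Hypothetical helper for illustration. It skips blank lines and
    comments and does not handle multi-line values or `export` prefixes.
    """
    env_path = Path(path)
    if not env_path.exists():
        return False
    for line in env_path.read_text().splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, value = line.split("=", 1)
        # Strip optional surrounding quotes; variables already set are left alone
        os.environ.setdefault(key.strip(), value.strip().strip("'\""))
    return True


def require_env(name):
    """Return the variable's value or fail early with a readable error."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is not set. Check your .env file.")
    return value
```

In the notebook you would call `load_env_file()` once and then `api_key = require_env("GROQ_API_KEY")`, so a missing key fails here rather than on the first request.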

Log in to Groq and save the client for reuse.

In [7]:
client = Groq(api_key=api_key)

Let's make our first prompt.

In [17]:
response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Explain the importance of data journalism in a concise sentence",
        }
    ],
    model="llama-3.3-70b-versatile",
)

In [18]:
print(response)

In [19]:
print(response.choices[0].message.content)

Show how you can substitute in a different model and use the same code.

In [28]:
response = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Explain the importance of data journalism in a concise sentence",
        }
    ],
    model="gemma2-9b-it",
)

In [29]:
print(response.choices[0].message.content)
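Since only the `model` argument changes between those two cells, the call can be wrapped in a small function and looped over a list of models. This is a rough sketch: `ask`, `compare_models` and `FakeClient` are names invented here, and `FakeClient` is a stand-in that mimics just enough of the client to run without an API key. With the real thing you would pass in the `client` created above.

```python
from types import SimpleNamespace


def ask(client, question, model="llama-3.3-70b-versatile"):
    """Send one user message and return the text of the first choice."""
    response = client.chat.completions.create(
        messages=[{"role": "user", "content": question}],
        model=model,
    )
    return response.choices[0].message.content


def compare_models(client, question, models):
    """Ask every model the same question and collect answers by model name."""
    return {model: ask(client, question, model=model) for model in models}


class FakeClient:
    """Stand-in for the Groq client so the sketch runs offline.

    It mimics only the attribute path used above and echoes the model name.
    """

    def __init__(self):
        self.chat = SimpleNamespace(
            completions=SimpleNamespace(create=self._create)
        )

    def _create(self, messages, model):
        text = f"[{model}] {messages[-1]['content']}"
        message = SimpleNamespace(content=text)
        return SimpleNamespace(choices=[SimpleNamespace(message=message)])


answers = compare_models(
    FakeClient(),
    "Explain the importance of data journalism in a concise sentence",
    ["llama-3.3-70b-versatile", "gemma2-9b-it"],
)
```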

Show how you can make a system prompt to prime the LLM

In [20]:
response = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "you are an enthusiastic nerd who believes data journalism is the future."

        },
        {
            "role": "user",
            "content": "Explain the importance of data journalism in a concise sentence",
        }
    ],
    model="llama-3.3-70b-versatile",
)

In [21]:
print(response.choices[0].message.content)

In [24]:
response = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "you are a crusty, ill-tempered editor who hates math and thinks data journalism is a waste of time and resources."

        },
        {
            "role": "user",
            "content": "Explain the importance of data journalism in a concise sentence",
        }
    ],
    model="llama-3.3-70b-versatile",
)

In [25]:
print(response.choices[0].message.content)

## Structured responses

You don't have to ask for essays, poems or chitchat. You can ask an LLM to make very simple decisions and code data.

In [36]:
response = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a text classifier that categorizes text. I will provide the name of a professional sports team. You will reply with the sports league in which they compete."
        },
        {
            "role": "user",
            "content": "Chicago Cubs",
        }
    ],
    model="llama-3.3-70b-versatile",
)

In [37]:
print(response.choices[0].message.content)

In [38]:
response = client.chat.completions.create(
    messages=[
        {
            "role": "system",
            "content": "You are a text classifier that categorizes text. I will provide the name of a professional sports team. You will reply with the sports league in which they compete."
        },
        {
            "role": "user",
            "content": "Minnesota Vikings",
        }
    ],
    model="llama-3.3-70b-versatile",
)

In [39]:
print(response.choices[0].message.content)

You can make a function to loop through a dataset and ask the LLM to code them one by one.

In [42]:
def classify_team(name):
    prompt = """You are a text classifier that categorizes text.
    
I will provide the name of a professional sports team.

You will reply with the sports league in which they compete."""
    
    response = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": prompt,
            },
            {
                "role": "user",
                "content": name,
            }
        ],
        model="llama-3.3-70b-versatile",
    )

    return response.choices[0].message.content

In [48]:
team_list = ["Minnesota Twins", "Minnesota Vikings", "Minnesota Timberwolves"]
for team in team_list:
    league = classify_team(team)
    print([team, league])

Sometimes the LLM will get weird and return something you don't want. You can improve this by adding validation.

In [45]:
def classify_team(name):
    prompt = """You are a text classifier that categorizes text.
    
I will provide the name of a professional sports team.

You will reply with the sports league in which they compete.

Your responses must come from the following list:
- Major League Baseball (MLB)
- National Football League (NFL)
- National Basketball Association (NBA)
"""
    
    response = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": prompt,
            },
            {
                "role": "user",
                "content": name,
            }
        ],
        model="llama-3.3-70b-versatile",
    )

    answer = response.choices[0].message.content

    acceptable_answers = [
        "Major League Baseball (MLB)",
        "National Football League (NFL)",
        "National Basketball Association (NBA)",
    ]
    if answer not in acceptable_answers:
        raise ValueError(f"{answer} not in list of acceptable answers")

    return answer

In [47]:
classify_team("Indiana Fever")

ValueError: Women's National Basketball Association (WNBA) 

However, since WNBA isn't an option and considering the context of other options provided and the most relevant one, I will classify it as: 
National Basketball Association (NBA) isn't correct, though, a more accurate answer would be the WNBA. not in list of acceptable answers
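The validation step can also be pulled out into its own helper, which makes it easier to test without calling the API. The sketch below is one possible version, not the notebook's code: `validate_answer` is a made-up name, and forgiving stray whitespace and letter case is an extra assumption on my part. A rambling answer like the one above is still rejected.

```python
ACCEPTABLE_ANSWERS = [
    "Major League Baseball (MLB)",
    "National Football League (NFL)",
    "National Basketball Association (NBA)",
]


def validate_answer(answer, acceptable=ACCEPTABLE_ANSWERS):
    """Return the canonical label for an answer or raise ValueError.

    Leading/trailing whitespace and letter case are ignored. Anything
    longer, like an explanation wrapped around a label, is rejected.
    """
    cleaned = answer.strip().casefold()
    for label in acceptable:
        if cleaned == label.casefold():
            return label
    raise ValueError(f"{answer!r} not in list of acceptable answers")
```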

There are different strategies you can take to deal with this issue. In some cases, if you observe that the issue isn't due to your coding options but is instead the result of the LLM giving a rare odd response, you have a couple of options.

The first one, which you should consider making routine, is to lower the "temperature" of the model, which is a way to dial down its creativity and make it more consistent.

In [59]:
def classify_team(name):
    prompt = """You are a text classifier that categorizes text.
    
I will provide the name of a professional sports team.

You will reply with the sports league in which they compete.

Your responses must come from the following list:
- Major League Baseball (MLB)
- National Football League (NFL)
- National Basketball Association (NBA)
"""
    
    response = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": prompt,
            },
            {
                "role": "user",
                "content": name,
            }
        ],
        model="llama-3.3-70b-versatile",
        temperature=0
    )

    answer = response.choices[0].message.content

    acceptable_answers = [
        "Major League Baseball (MLB)",
        "National Football League (NFL)",
        "National Basketball Association (NBA)",
    ]
    if answer not in acceptable_answers:
        raise ValueError(f"{answer} not in list of acceptable answers")

    return answer

In [60]:
classify_team("Chicago White Sox")

'Major League Baseball (MLB)'
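Why a lower temperature gives more consistent answers can be illustrated with the textbook definition of sampling temperature: the model's raw scores are divided by the temperature before they are turned into probabilities. The sketch below uses made-up scores for three candidate answers. It shows the general idea only and is not a description of how Groq or Llama implement it. A temperature of exactly 0 can't be divided by, so the sketch assumes the common convention of always picking the top score.

```python
import math


def softmax_with_temperature(scores, temperature):
    """Turn raw scores into probabilities, sharpened or flattened by temperature."""
    if temperature <= 0:
        # Treat zero as greedy: all probability goes to the highest score
        top = max(range(len(scores)), key=lambda i: scores[i])
        return [1.0 if i == top else 0.0 for i in range(len(scores))]
    scaled = [s / temperature for s in scores]
    biggest = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - biggest) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up scores for three candidate answers
scores = [2.0, 1.0, 0.5]
for t in (1.5, 1.0, 0.2, 0):
    probs = softmax_with_temperature(scores, t)
    print(t, [round(p, 3) for p in probs])
```

As the temperature drops, more of the probability piles onto the top-scoring answer, which is why the same prompt is more likely to return the same label each time.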

You can also provide some sample responses to the LLM, a technique known as "few-shot" prompting.

In [99]:
def classify_team(name):
    prompt = """You are a text classifier that categorizes text.
    
I will provide the name of a professional sports team.

You will reply with the sports league in which they compete.

Your responses must come from the following list:
- Major League Baseball (MLB)
- National Football League (NFL)
- National Basketball Association (NBA)
"""
    
    response = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": prompt,
            },
            {
                "role": "user",
                "content": "Los Angeles Rams",
            },
            {
                "role": "assistant",
                "content": "National Football League (NFL)",
            },
            {
                "role": "user",
                "content": "Los Angeles Dodgers",
            },
            {
                "role": "assistant",
                "content": "Major League Baseball (MLB)",
            },
            {
                "role": "user",
                "content": "Los Angeles Lakers",
            },
            {
                "role": "assistant",
                "content": "National Basketball Association (NBA)",
            },
            {
                "role": "user",
                "content": name,
            },
        ],
        model="llama-3.3-70b-versatile",
        temperature=0
    )

    answer = response.choices[0].message.content

    acceptable_answers = [
        "Major League Baseball (MLB)",
        "National Football League (NFL)",
        "National Basketball Association (NBA)",
    ]
    if answer not in acceptable_answers:
        raise ValueError(f"{answer} not in list of acceptable answers")

    return answer

In [100]:
classify_team("Chicago Bulls")

'National Basketball Association (NBA)'
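Writing out every example as a pair of dictionaries gets long fast. One way to tidy that up, sketched here under the assumption that the examples live in a list of (input, output) tuples, is to build the message list with a helper. `build_few_shot_messages` is a hypothetical name, not something from the Groq library.

```python
def build_few_shot_messages(system_prompt, examples, new_input):
    """Build a chat message list from (input, expected output) pairs."""
    messages = [{"role": "system", "content": system_prompt}]
    for user_text, assistant_text in examples:
        messages.append({"role": "user", "content": user_text})
        messages.append({"role": "assistant", "content": assistant_text})
    # The real question always goes last
    messages.append({"role": "user", "content": new_input})
    return messages


examples = [
    ("Los Angeles Rams", "National Football League (NFL)"),
    ("Los Angeles Dodgers", "Major League Baseball (MLB)"),
    ("Los Angeles Lakers", "National Basketball Association (NBA)"),
]
messages = build_few_shot_messages(
    "You are a text classifier that categorizes text.",
    examples,
    "Chicago Bulls",
)
```

The resulting `messages` list can be passed straight to `client.chat.completions.create`, and adding another example becomes a one-line change.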

You can also force the function to retry when there's an error. Here's a way to do that with the decorator from the retry package we installed earlier.

In [50]:
from retry import retry

In [51]:
@retry(ValueError, tries=2, delay=2)
def classify_team(name):
    prompt = """You are a text classifier that categorizes text.
    
I will provide the name of a professional sports team.

You will reply with the sports league in which they compete.

Your responses must come from the following list:
- Major League Baseball (MLB)
- National Football League (NFL)
- National Basketball Association (NBA)
"""
    
    response = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": prompt,
            },
            {
                "role": "user",
                "content": "Los Angeles Rams",
            },
            {
                "role": "assistant",
                "content": "National Football League (NFL)",
            },
            {
                "role": "user",
                "content": "Los Angeles Dodgers",
            },
            {
                "role": "assistant",
                "content": "Major League Baseball (MLB)",
            },
            {
                "role": "user",
                "content": "Los Angeles Lakers",
            },
            {
                "role": "assistant",
                "content": "National Basketball Association (NBA)",
            },
            {
                "role": "user",
                "content": name,
            }
        ],
        model="llama-3.3-70b-versatile",
        temperature=0
    )

    answer = response.choices[0].message.content

    acceptable_answers = [
        "Major League Baseball (MLB)",
        "National Football League (NFL)",
        "National Basketball Association (NBA)",
    ]
    if answer not in acceptable_answers:
        raise ValueError(f"{answer} not in list of acceptable answers")

    return answer
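To make it clearer what the decorator is doing, here is a stripped-down version written with only the standard library. It is a teaching sketch, not the `retry` package's actual implementation, and `simple_retry` and `flaky_classifier` are names invented for the example.

```python
import functools
import time


def simple_retry(exceptions, tries=2, delay=0):
    """Retry the wrapped function when it raises one of `exceptions`."""

    def decorator(func):
        @functools.wraps(func)
        def wrapper(*args, **kwargs):
            for attempt in range(1, tries + 1):
                try:
                    return func(*args, **kwargs)
                except exceptions:
                    if attempt == tries:
                        raise  # out of attempts, let the error surface
                    time.sleep(delay)

        return wrapper

    return decorator


calls = {"count": 0}


@simple_retry(ValueError, tries=2, delay=0)
def flaky_classifier(name):
    """Fail on the first call and succeed on the second."""
    calls["count"] += 1
    if calls["count"] == 1:
        raise ValueError("malformed answer")
    return "Major League Baseball (MLB)"


result = flaky_classifier("Chicago White Sox")
```

With `temperature=0` a second attempt may well return the same answer, so retries tend to help most with occasional glitches rather than with inputs the prompt can't handle.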

That can solve for malformed responses, but sometimes there just isn't an answer in your validation list. One way to manage that is to allow an "other" category.

In [101]:
@retry(ValueError, tries=2, delay=2)
def classify_team(name):
    prompt = """You are a text classifier that categorizes text.
    
I will provide the name of a professional sports team.

You will reply with the sports league in which they compete.

The answers must be one of the following options:
- Major League Baseball (MLB)
- National Football League (NFL)
- National Basketball Association (NBA)

If the team is not a member of any of the three leagues on the list, you should return "Other"
"""
    
    response = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": prompt,
            },
            {
                "role": "user",
                "content": "Los Angeles Rams",
            },
            {
                "role": "assistant",
                "content": "National Football League (NFL)",
            },
            {
                "role": "user",
                "content": "Los Angeles Dodgers",
            },
            {
                "role": "assistant",
                "content": "Major League Baseball (MLB)",
            },
            {
                "role": "user",
                "content": "Los Angeles Lakers",
            },
            {
                "role": "assistant",
                "content": "National Basketball Association (NBA)",
            },
            {
                "role": "user",
                "content": "Los Angeles Kings",
            },
            {
                "role": "assistant",
                "content": "Other",
            },            
            {
                "role": "user",
                "content": name,
            },
        ],
        model="llama-3.3-70b-versatile",
        temperature=0
    )

    answer = response.choices[0].message.content

    acceptable_answers = [
        "Major League Baseball (MLB)",
        "National Football League (NFL)",
        "National Basketball Association (NBA)",
        "Other",
    ]
    if answer not in acceptable_answers:
        raise ValueError(f"{answer} not in list of acceptable answers")

    return answer

In [102]:
classify_team("Indiana Fever")

'Other'

## Bulk prompts

Requesting answers one by one can take a long time. And it can end up costing you money. 

The solution is to submit your requests in batches and then get the answers back from the LLM as JSON you can parse.

In [69]:
import json

In [103]:
# @retry(ValueError, tries=2, delay=2)
def classify_teams(name_list):
    prompt = """You are a text classifier that categorizes items.

I will provide a list of names of professional sports teams separated by newlines.

You will determine the sports league in which each one of them competes.

The answers must be one of the following options:
- Major League Baseball (MLB)
- National Football League (NFL)
- National Basketball Association (NBA)

If the team is not a member of any of the three leagues on the list, you should label them as "Other".

Your answers should be returned as a flat JSON list. Do not return a dictionary or any kind of nested data.

If I were to submit:

"Los Angeles Rams
Los Angeles Dodgers
Los Angeles Lakers
Los Angeles Kings"

You should return the following:

["National Football League (NFL)", "Major League Baseball (MLB)", "National Basketball Association (NBA)", "Other"]
"""
    
    response = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": prompt,
            },
            {
                "role": "user",
                "content": "Chicago Bears\nChicago Cubs\nChicago Bulls\nChicago Blackhawks",
            },
            {
                "role": "assistant",
                "content": '["National Football League (NFL)", "Major League Baseball (MLB)", "National Basketball Association (NBA)", "Other"]',
            },   
            {
                "role": "user",
                "content": "\n".join(name_list),
            }
        ],
        model="llama-3.3-70b-versatile",
        temperature=0,
    )

    answer_str = response.choices[0].message.content
    answer_list = json.loads(answer_str)

    acceptable_answers = [
        "Major League Baseball (MLB)",
        "National Football League (NFL)",
        "National Basketball Association (NBA)",
        "Other",
    ]
    for answer in answer_list:
        if answer not in acceptable_answers:
            raise ValueError(f"{answer} not in list of acceptable answers")
   
    return dict(zip(name_list, answer_list))

In [104]:
classify_teams(team_list)

{'Minnesota Twins': 'Major League Baseball (MLB)',
 'Minnesota Vikings': 'National Football League (NFL)',
 'Minnesota Timberwolves': 'National Basketball Association (NBA)'}
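One gap worth noting in the batch version: `zip` stops at the shorter of its two inputs, so if the model returns too few or too many labels the mismatch goes unnoticed. The sketch below is one way to guard against that. `parse_batch_answer` is a hypothetical helper, and it raises `ValueError` throughout so it would still work with the retry decorator.

```python
import json


def parse_batch_answer(answer_str, name_list, acceptable_answers):
    """Parse a JSON list of labels and pair each one with its input name."""
    answer_list = json.loads(answer_str)  # JSONDecodeError is a ValueError
    if not isinstance(answer_list, list):
        raise ValueError("Expected a flat JSON list")
    if len(answer_list) != len(name_list):
        raise ValueError(
            f"Got {len(answer_list)} answers for {len(name_list)} names"
        )
    for answer in answer_list:
        if answer not in acceptable_answers:
            raise ValueError(f"{answer} not in list of acceptable answers")
    return dict(zip(name_list, answer_list))
```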

### This time ... for journalism

Let's get some real data in here.

In [106]:
import pandas as pd

In [107]:
df = pd.read_csv("Form460ScheduleESubItem.csv")

In [109]:
df.sample(10)

Unnamed: 0,payee
778,ANTIOCH CHAMBER OF COMMERCE
16029,WIX
11816,RANCHO BERNARDO CHAMBER
10221,NEUMANN ENTERPRISES
5081,FIVE STAR TOURS & CHARTER BUS COMPANY
2157,CALIFORNIA DEMOCRATIC PARTY - NON FEDERAL ACCOUNT
14349,THE KITCHEN CORPORATION
8820,LONGHI'S
10784,OSSEN CARDENAS
8350,LA FOCUS PUBLICATIONS


In [128]:
@retry(ValueError, tries=2, delay=2)
def classify_payees(name_list):
    prompt = """You are a text classifier that categorizes items.

I will provide a list of business names separated by newlines.

I want you to examine them individually and determine if they are one of the following types:
- Restaurant
- Bar
- Nightclub
- Hotel

If they are, return the name of the type exactly as it appears in the list above.

If they are not one of those four types, you must return "Other"

Your answers should be returned as a flat JSON list. Do not return a dictionary or any kind of nested data.

If I were to submit:

"Intercontinental Hotel
Pizza Hut
Musso and Frank's
Studio 54
KTLA
Direct Mailing"

You should return the following:

["Hotel", "Restaurant", "Bar", "Nightclub", "Other", "Other"]
"""
    
    response = client.chat.completions.create(
        messages=[
            {
                "role": "system",
                "content": prompt,
            },
            {
                "role": "user",
                "content": "Subway Sandwiches\nRuth Chris Steakhouse\nPolitical Consulting Co\nThe Lamb's Club",
            },
            {
                "role": "assistant",
                "content": '["Restaurant", "Restaurant", "Other", "Bar"]',
            },   
            {
                "role": "user",
                "content": "\n".join(name_list),
            }
        ],
        model="llama-3.3-70b-versatile",
        temperature=0,
    )

    answer_str = response.choices[0].message.content
    answer_list = json.loads(answer_str)

    acceptable_answers = [
        "Restaurant",
        "Bar",
        "Nightclub",
        "Hotel",
        "Other",
    ]
    for answer in answer_list:
        if answer not in acceptable_answers:
            raise ValueError(f"{answer} not in list of acceptable answers")
   
    return dict(zip(name_list, answer_list))

In [116]:
sample_list = list(df.sample(10).payee)

In [117]:
classify_payees(sample_list)

{'HILTON LOS ANGELES/UNIVERSAL CITY': 'Hotel',
 'DODGER STADIUM': 'Other',
 'SIERRA TRADING POST': 'Other',
 'PALM MOUNTAIN RESORT & SPA': 'Hotel',
 'FRESCO': 'Restaurant',
 'HILTON HAWAIIAN VILLAGE BEACH RESORT & SPA': 'Hotel',
 'KQKE AM': 'Other',
 'HILTON HOTELS NEW YORK': 'Hotel',
 'EL PORTAL DEL ANGEL': 'Restaurant',
 'EL TORITO MEXICAN RESTAURANT': 'Restaurant'}

In [118]:
len(df.payee)

16448

In [120]:
bigger_sample = list(df.sample(1000).payee)

In [121]:
import time

In [None]:
# Set the batch size
batch_size = 50

# Store the results
all_results = {}

# Loop through the list in batches
for i in range(0, len(bigger_sample), batch_size):
    # Get the batch
    batch = bigger_sample[i : i + batch_size]
    # Classify it
    batch_results = classify_payees(batch)
    # Add it to the combined results dictionary
    all_results.update(batch_results)
    # Tap the brakes
    time.sleep(2)
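The slicing logic in that loop can be split into a reusable generator. Because the results are stored in a dictionary keyed by payee name, repeated names collapse into one entry anyway, so this sketch drops duplicates up front to avoid spending requests on them. `chunked`, `classify_in_batches` and `fake_classify` are made-up names; `fake_classify` stands in for `classify_payees` so the example runs offline, and the pause between batches is left out for brevity.

```python
def chunked(items, size):
    """Yield successive slices of `items`, each at most `size` long."""
    if size < 1:
        raise ValueError("size must be at least 1")
    for start in range(0, len(items), size):
        yield items[start : start + size]


def classify_in_batches(items, classify, batch_size=50):
    """Run `classify` on each batch and merge the returned dictionaries."""
    unique_items = list(dict.fromkeys(items))  # drop repeats, keep order
    results = {}
    for batch in chunked(unique_items, batch_size):
        results.update(classify(batch))
    return results


def fake_classify(batch):
    """Stand-in for classify_payees so the sketch runs offline."""
    return {name: "Other" for name in batch}


demo = classify_in_batches(["A", "B", "A", "C", "D"], fake_classify, batch_size=2)
```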

## Evaluate

Here we hammer the supervised sample...

I import a training set that I've prepared ahead of time...

We use scikit-learn to evaluate how well the LLM does on the supervised sample...

We feed the training set into few-shot pre-prompts and see if it improves the results...

We compare those results against a pre-written old school sklearn version that is trained on our sample...
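As a placeholder for the evaluation step outlined above, here is a rough sketch of the kind of scoring involved, written in plain Python so it runs without extra packages. scikit-learn's `accuracy_score` and `classification_report` cover the same measures. The labels below are toy values invented for illustration, not results from the payee data.

```python
from collections import Counter


def accuracy(truth, predicted):
    """Share of predictions that match the hand-labeled answer."""
    if not truth or len(truth) != len(predicted):
        raise ValueError("truth and predicted must be non-empty and equal length")
    matches = sum(t == p for t, p in zip(truth, predicted))
    return matches / len(truth)


def precision_recall(truth, predicted, label):
    """Precision and recall for one label; 0.0 when the denominator is zero."""
    pairs = Counter(zip(truth, predicted))
    true_pos = pairs[(label, label)]
    predicted_pos = sum(n for (t, p), n in pairs.items() if p == label)
    actual_pos = sum(n for (t, p), n in pairs.items() if t == label)
    precision = true_pos / predicted_pos if predicted_pos else 0.0
    recall = true_pos / actual_pos if actual_pos else 0.0
    return precision, recall


# Toy labels invented for illustration
truth = ["Hotel", "Restaurant", "Other", "Bar", "Restaurant"]
predicted = ["Hotel", "Restaurant", "Other", "Restaurant", "Restaurant"]

overall = accuracy(truth, predicted)
restaurant_precision, restaurant_recall = precision_recall(
    truth, predicted, "Restaurant"
)
```

Looking at precision and recall per label, not just overall accuracy, helps show whether the classifier is overusing a category such as "Restaurant".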