# Zero-shot text classification with OpenAI's GPT models

This notebook illustrates how to use different GPT models provided by OpenAI for text classification.

In [1]:
import os
from dotenv import load_dotenv
from openai import OpenAI

import re

In [2]:
load_dotenv()
client = OpenAI(api_key=os.environ.get("OPENAI_API_KEY"))

## Define the task

In this example, we adapt the instructions for one of the tweet classification tasks examined in Gilardi et al. ([2023](https://www.pnas.org/doi/10.1073/pnas.2305016120)), "ChatGPT outperforms crowd workers for text-annotation tasks".

- see [this README file](../data/labeled/gilardi_chatgpt_2023/README.md) for a description of the data and tasks covered in the paper
- see [this file](../data/labeled/gilardi_chatgpt_2023/instructions.md) for a copy of their original task instructions

In [3]:
instructions = """
For each tweet in the sample, follow these instructions:

1. Carefully read the text of the tweet, paying close attention to details.
2. Classify the tweet as either relevant (1) or irrelevant (0)
"""

categories = ["Relevant", "Irrelevant"]

defintions = """
Tweets should be coded as RELEVANT when they directly relate to content moderation, as defined above. This includes tweets that discuss: social media platforms’ content moderation rules and practices, governments’ regulation of online content moderation, and/or mild forms of content moderation like flagging.
Tweets should be coded as IRRELEVANT if they do not refer to content moderation, as defined above, or if they are themselves examples of moderated content. This would include, for example, a Tweet by Donald Trump that Twitter has labeled as “disputed”, a tweet claiming that something is false, or a tweet containing sensitive content. Such tweets might be subject to content moderation, but are not discussing content moderation. Therefore, they should be coded as irrelevant for our purposes.
"""

In [4]:
texts = [
    # negative examples ("irrelevant")
    "\"Turns out Mike Bloomberg is exactly what Elizabeth Warren needed to break through in the 2020 Democratic primary. And he’s not just a foil for her on the campaign trail — this is something she believes in, and it shows.\" https://t.co/1SyaHXrZlO",
    "@blackhat___05 ye raha new user name change kiya kamine ne😡🗡️😡🗡️😡🗡️😡 karo abhi FNfollow reopt aur block",
    "The Kid!\n \nRETWEET for a chance at a @RawlingsSports baseball signed by Ken Griffey Jr. and tune in to #Junior tonight at 8pm ET/5pm PT on MLB Network.\n \nRules: https://t.co/MdkXLh1CdN | NoPurNec, US 18+, Ends 6/22 https://t.co/8Xw0HpHz2G",
    "TW / gore \n\nif you come across an account and want to block them, make sure to cover the bottom half of your screen. the gore is normally at the bottom of the screen. again, stay safe, and take precaution",
    "@Godlesswh_re Blocked.  Is this another Nick account?",
    # positive examples ("relevant")
    "Twitter we want you to suspend Marcon's account.\n#twitterSuspendMacronAccount #TwitterSuspendMarcon @verified @Twitter @TwitterSupport",
    "Twitter needs to permanently suspend @realDonaldTrump account.  Who's with me?",
    "Toei is one of the most active reporters of content on Youtube and everything runs through an auto filter. Today, Toei dropped a ridiculous volume of their own series onto an official Youtube channel and GOT BANNED AND REPORTED BY THEMSELVES, TOEI.",
    "Marsha Blackburn: We Are Looking at Antitrust Laws and Section 230 on Tech Censorship https://t.co/lsOWzD0Yri",
    "#Facebook has banned the iconic photograph of a #Soviet solider waving the #USSR flag over the #Reichstag in May 1945. The social network claims the image violates its community guidelines for dangerous people and organizations...\n\nMORE: https://t.co/arpDN9Ss0P https://t.co/KGtGwE4D5J"
]

## With ChatGPT

In [10]:
MODEL = 'gpt-4o-2024-08-06' # currently the latest version of GPT-4o

In [7]:
# Let's format the prompt
prompt = f"Classify the following text into one of the given categories: {categories}\n{defintions}\nOnly include the selected category in your response and no further text."
print(prompt)

Classify the following text into one of the given categories: ['Relevant', 'Irrelevant']

Tweets should be coded as RELEVANT when they directly relate to content moderation, as defined above. This includes tweets that discuss: social media platforms’ content moderation rules and practices, governments’ regulation of online content moderation, and/or mild forms of content moderation like flagging.
Tweets should be coded as IRRELEVANT if they do not refer to content moderation, as defined above, or if they are themselves examples of moderated content. This would include, for example, a Tweet by Donald Trump that Twitter has labeled as “disputed”, a tweet claiming that something is false, or a tweet containing sensitive content. Such tweets might be subject to content moderation, but are not discussing content moderation. Therefore, they should be coded as irrelevant for our purposes.

Only include the selected category in your response and no further text.


### A single text example

In [8]:
text_input = texts[0]

In [9]:
# convert to conversation history
messages = [
  # system prompt
  {"role": "system", "content": prompt},
  # user input
  {"role": "user", "content": text_input},
]

In [11]:
response = client.chat.completions.create(
  model=MODEL,
  messages=messages,
  # for reproducibility
  temperature=0.0,
  seed=42,
)

In [12]:
# parse the response
response.choices[0].message.content

'Irrelevant'

### Iterate over multiple examples

Let's first define a custom function to classify tweets:

In [13]:
def classify_tweet(text, system_message, model):

  # clean the text 
  text = re.sub(r'\s+', ' ', text).strip()

  # construct input

  messages = [
    # system prompt
    {"role": "system", "content": system_message},
    # user input
    {"role": "user", "content": text},
  ]

  response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=0.0,
    seed=42
  )
  
  result = response.choices[0].message.content
  
  return result
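
The clean-up step inside `classify_tweet` collapses newlines, tabs, and repeated spaces into single spaces, so that each tweet reaches the model as one line of text. A minimal sketch of that step in isolation (the helper name `clean_text` is ours for illustration and is not used elsewhere in the notebook):

```python
import re

def clean_text(text: str) -> str:
    """Collapse every run of whitespace into one space and trim both ends."""
    return re.sub(r"\s+", " ", text).strip()

raw = "The Kid!\n \nRETWEET for a chance"
print(clean_text(raw))  # The Kid! RETWEET for a chance
```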



Now we can iterate over example texts:

In [17]:
# with GPT 3.5 turbo (legacy)
classifications_gpt35 = [classify_tweet(text, prompt, model='gpt-3.5-turbo-0125') for text in texts]
classifications_gpt35

['Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Relevant',
 'Irrelevant',
 'Irrelevant',
 'Relevant',
 'Relevant']

- 5/5 negative examples classified correctly
- 3/5 positive examples classified correctly

In [15]:
# with GPT 4 turbo
classifications_gpt4 = [classify_tweet(text, prompt, model='gpt-4-turbo-2024-04-09') for text in texts]
classifications_gpt4

['Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Relevant',
 'Relevant',
 'Relevant']

- 5/5 negative examples classified correctly
- 3/5 positive examples classified correctly
- but disagreement with GPT 3.5 turbo on 2/5 positive examples

In [16]:
# with GPT-4o
classifications_gpt4o = [classify_tweet(text, prompt, model=MODEL) for text in texts]
classifications_gpt4o

['Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Irrelevant',
 'Relevant',
 'Relevant',
 'Relevant']

- 5/5 negative examples classified correctly
- 3/5 positive examples classified correctly
- perfect agreement with GPT 4 turbo
- but disagreement with GPT 3.5 turbo on 2/5 positive examples
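
The counts in the bullet points above can be reproduced programmatically. The sketch below hard-codes the three output lists shown in the cells above and assumes, as in the `texts` list, that the first five tweets are irrelevant and the last five relevant. The helper names `per_class_correct` and `disagreements` are ours.

```python
from collections import Counter

# Gold labels: the first five example tweets are irrelevant, the last five relevant.
gold = ["Irrelevant"] * 5 + ["Relevant"] * 5

# Model outputs copied from the cells above.
gpt35 = ["Irrelevant"] * 5 + ["Relevant", "Irrelevant", "Irrelevant", "Relevant", "Relevant"]
gpt4 = ["Irrelevant"] * 5 + ["Irrelevant", "Irrelevant", "Relevant", "Relevant", "Relevant"]
gpt4o = list(gpt4)  # identical to the GPT 4 turbo output

def per_class_correct(gold, pred):
    """Return {label: (n_correct, n_total)} for each gold label."""
    totals = Counter(gold)
    correct = Counter(g for g, p in zip(gold, pred) if g == p)
    return {label: (correct[label], totals[label]) for label in totals}

def disagreements(a, b):
    """Return the 0-based positions at which two label lists differ."""
    return [i for i, (x, y) in enumerate(zip(a, b)) if x != y]

print(per_class_correct(gold, gpt35))  # {'Irrelevant': (5, 5), 'Relevant': (3, 5)}
print(disagreements(gpt35, gpt4))      # [5, 7]
print(disagreements(gpt4, gpt4o))      # []
```

With only ten examples these numbers are an illustration rather than an evaluation.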

## Multiple inputs per request

In theory, we can also combine several texts in one user message.

But as demonstrated below, this can cause problems: the model may return fewer classifications than there are inputs, and the classifications can depend on the order of texts in the input.

In [18]:
from typing import List

def classify_tweets(texts: List[str], model: str):

  # clean the text 
  texts = [re.sub(r'\s+', ' ', text).strip() for text in texts]

  # construct input

  messages = [
    # system prompt (modified to handle multiple inputs)
    {"role": "system", "content": (
      "Act as a text classification system. "
      "Each line in the input is a separate tweet. "
      f"Classify each tweet into one of the given categories: {categories}\n{defintions}\n"
      "Only include the selected category in your response and no further text. "
      "Separate the classifications of individual tweets by newline characters."
    )},
    # user input
    {"role": "user", "content": "\n".join(texts)},
  ]

  response = client.chat.completions.create(
    model=model,
    messages=messages,
    temperature=0.0,
    seed=42,
    frequency_penalty=0,
    presence_penalty=0
  )
  
  result = response.choices[0].message.content
  
  return result.split("\n")

In [19]:
classifications = classify_tweets(texts, model=MODEL)


In [23]:
# but this can lead to erroneous outputs
len(texts), len(classifications) # one classification is missing

(10, 9)
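
When the number of labels does not match the number of inputs, a plain newline-separated response gives no way to tell which tweet was skipped. One possible mitigation, sketched here under assumptions, is to number the tweets in the user message and to ask the model to answer in an `ID: label` format. The helpers `format_numbered` and `parse_numbered` are hypothetical, and the system prompt would have to be adapted to request this format; whether the model follows it reliably is not tested here.

```python
import re
from typing import List, Optional

CATEGORIES = ["Relevant", "Irrelevant"]

def format_numbered(texts: List[str]) -> str:
    """Put each tweet on its own line, prefixed with a 1-based ID."""
    cleaned = [re.sub(r"\s+", " ", t).strip() for t in texts]
    return "\n".join(f"{i}: {t}" for i, t in enumerate(cleaned, start=1))

def parse_numbered(response_text: str, n_texts: int) -> List[Optional[str]]:
    """Map 'ID: label' lines back to positions; missing or invalid entries stay None."""
    labels: List[Optional[str]] = [None] * n_texts
    for line in response_text.splitlines():
        match = re.match(r"\s*(\d+)\s*[:.)]\s*(\w+)", line)
        if not match:
            continue
        idx, label = int(match.group(1)), match.group(2).capitalize()
        if 1 <= idx <= n_texts and label in CATEGORIES:
            labels[idx - 1] = label
    return labels

# Made-up reply that skips tweet 2 (for illustration only, not model output).
fake_response = "1: Irrelevant\n3: Relevant"
print(parse_numbered(fake_response, n_texts=3))  # ['Irrelevant', None, 'Relevant']
```

With this layout a skipped tweet shows up as `None` at its own position instead of silently shortening the list.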

In [24]:
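import random

from tqdm.auto import tqdm

# indexes of the example texts (reshuffled in each iteration)
idxs = list(range(len(texts)))

# set the seed
random.seed(42)

results = []
n_iter = 5
for _ in tqdm(range(n_iter), total=n_iter, desc="Iteration"):
    random.shuffle(idxs)
    inputs = [texts[idx] for idx in idxs]
    outputs = classify_tweets(inputs, model=MODEL)
    # restore the original order of the texts
    # NOTE: if the model returns fewer labels than inputs, zip() drops the last
    # shuffled index and the sorted labels may no longer line up with the texts
    sorted_outputs = [label for _, label in sorted(zip(idxs, outputs))]
    results.append(sorted_outputs)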


In [25]:
import pandas as pd

pd.DataFrame(results, columns=[f"text{i:02d}" for i, _ in enumerate(texts, start=1)])

| | text01 | text02 | text03 | text04 | text05 | text06 | text07 | text08 | text09 | text10 |
|---|---|---|---|---|---|---|---|---|---|---|
| 0 | Irrelevant | Irrelevant | Irrelevant | Irrelevant | Relevant | Relevant | Relevant | Relevant | Irrelevant | |
| 1 | Irrelevant | Irrelevant | Irrelevant | Irrelevant | Irrelevant | Irrelevant | Irrelevant | Relevant | Relevant | Relevant |
| 2 | Irrelevant | Irrelevant | Irrelevant | Irrelevant | Irrelevant | Relevant | Irrelevant | Relevant | Relevant | |
| 3 | Irrelevant | Relevant | Irrelevant | Irrelevant | Irrelevant | Relevant | Relevant | Relevant | Relevant | |
| 4 | Irrelevant | Irrelevant | Irrelevant | Irrelevant | Irrelevant | Irrelevant | Relevant | Relevant | Relevant | |


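As you can see,

- in four of five attempts, the model outputs one classification too few, and
- the classifications of some texts are sensitive to the order of texts in the input 🤷‍♂️ (e.g., texts 2 and 5)

Note that in the four runs with a missing classification, `zip` pairs the nine outputs with only the first nine shuffled indexes. The labels in those rows may therefore be shifted relative to the texts, so part of the apparent order sensitivity could be an alignment artifact.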
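
To compare runs text by text, each label has to be written back to the position of the text it belongs to. The sketch below does this by index instead of by sorting, so a short response leaves a `None` gap rather than shifting later labels. It rests on an assumption that cannot be checked from the output alone: that the model answers the inputs in order and that any missing answer is the last one. The function names `restore_order` and `n_distinct_labels` are ours, and the example data is made up.

```python
from typing import List, Optional

def restore_order(idxs: List[int], outputs: List[str], n_texts: int) -> List[Optional[str]]:
    """Place each output at the original position of the text it was presumably produced for.

    Assumes answers come back in input order and that missing answers are at the end.
    """
    restored: List[Optional[str]] = [None] * n_texts
    for idx, label in zip(idxs, outputs):  # zip stops at the shorter list
        restored[idx] = label
    return restored

def n_distinct_labels(runs: List[List[Optional[str]]]) -> List[int]:
    """For each text, count the different non-missing labels it received across runs."""
    return [len({label for label in column if label is not None}) for column in zip(*runs)]

# Toy data for illustration only (not actual model output).
run_a = restore_order([2, 0, 1], ["Relevant", "Irrelevant"], n_texts=3)
run_b = restore_order([0, 1, 2], ["Irrelevant", "Irrelevant", "Irrelevant"], n_texts=3)
print(run_a)                              # ['Irrelevant', None, 'Relevant']
print(n_distinct_labels([run_a, run_b]))  # [1, 1, 2]
```

A count above 1 marks a text whose label changed between runs; runs with gaps should still be read with caution.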

# Appendix

## Legacy: Example with text generation model

In [None]:

text = "@connybush Sorry hun, Ive removed the tags on IG d person handling my account thought you are my friend dats why u were tagged on both posts."

# clean the text 
text = re.sub(r'\s+', ' ', text).strip()

prompt = f"Classify the following text into one of the given categories: {categories}\n{defintions}\nOnly include the selected category in your response and no further text.\n\nText: {text}\n\nClassification:"

In [None]:
print(prompt)

Classify the following text into one of the given categories: ['Relevant', 'Irrelevant']

Tweets should be coded as RELEVANT when they directly relate to content moderation, as defined above. This includes tweets that discuss: social media platforms’ content moderation rules and practices, governments’ regulation of online content moderation, and/or mild forms of content moderation like flagging.
Tweets should be coded as IRRELEVANT if they do not refer to content moderation, as defined above, or if they are themselves examples of moderated content. This would include, for example, a Tweet by Donald Trump that Twitter has labeled as “disputed”, a tweet claiming that something is false, or a tweet containing sensitive content. Such tweets might be subject to content moderation, but are not discussing content moderation. Therefore, they should be coded as irrelevant for our purposes.

Only include the selected category in your response and no further text.

Text: @connybush Sorry hun, Iv

### Make the API Call

In [None]:
response = client.completions.create(
  model="davinci-002",
  prompt=prompt,
  max_tokens=2,
  top_p=1,
  temperature=0.0,
  seed=42,
  frequency_penalty=0,
  presence_penalty=0
)

### Parse the result

In [None]:
result = response.choices[0].text.strip()
result

'Relevant'

### Iterate over several examples

In [None]:
def classify_tweet(text):

  # clean the text 
  text = re.sub(r'\s+', ' ', text).strip()

  # construct the prompt
  prompt = f"Classify the following text into one of the given categories: {categories}\n{defintions}\nOnly include the selected category in your response and no further text.\n\nText: {text}\n\nClassification:"
  
  response = client.completions.create(
    model="davinci-002",
    prompt=prompt,
    max_tokens=2,
    top_p=1,
    temperature=0.0,
    seed=42,
    frequency_penalty=0,
    presence_penalty=0
  )
  
  result = response.choices[0].text.strip()
  
  return result

In [None]:
classifications = [classify_tweet(text) for text in texts]

In [None]:
classifications

['Relevant',
 'Relevant',
 'Relevant',
 'Relevant',
 'RELEV',
 'Relevant',
 'Relevant',
 'Relevant',
 'Relevant',
 'Relevant']

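This doesn't look great =(

The model labels every tweet as relevant, including the five irrelevant ones, and one output ('RELEV') is not even a valid category.

The chat models used in the main part of this notebook (GPT 3.5 turbo, GPT 4 turbo, and GPT-4o) do better on these examples.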
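
Outputs such as 'RELEV' show that raw completions should be validated before they are used as labels. A minimal sketch of such a check, assuming that case-insensitive prefix matching is acceptable for this two-category task (the helper `normalize_label` is hypothetical):

```python
from typing import List, Optional

def normalize_label(raw: str, categories: List[str]) -> Optional[str]:
    """Map a raw model output to one of the categories, or None if there is no unique match."""
    cleaned = raw.strip().strip(".'\"").lower()
    if not cleaned:
        return None
    # Prefix matching in both directions handles truncated ('RELEV') and padded
    # ('Relevant tweet') outputs. Substring matching would be wrong here,
    # because 'irrelevant' contains 'relevant'.
    matches = [
        c for c in categories
        if c.lower().startswith(cleaned) or cleaned.startswith(c.lower())
    ]
    return matches[0] if len(matches) == 1 else None

categories = ["Relevant", "Irrelevant"]
print(normalize_label("RELEV", categories))       # Relevant
print(normalize_label("Irrelevant.", categories)) # Irrelevant
print(normalize_label("unsure", categories))      # None
```

Outputs that map to `None` can then be counted or re-requested instead of being silently treated as a class.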