# Using GPT-4 Turbo for Section 230 stance classification

In this notebook, we use data analyzed by Gilardi et al. ([2023]()) to illustrate how to classify the stance of tweets with GPT-4 Turbo through the OpenAI chat completions API.

In [1]:
import os
from openai import OpenAI
import tiktoken

import pandas as pd
import json

client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))



In [2]:
MODEL = 'gpt-4-turbo-preview'

In [3]:
from typing import Union, List

class TokenCounter:
    def __init__(self, encoding_name: Union[str, None] = None, model: Union[str, None] = None):
        """
        Initialize the tokenizer with either a model or an encoding name.

        Args:
            encoding_name (Union[str, None]): The name of the encoding to use. Default is None.
            model (Union[str, None]): The model to use for encoding. Default is None.

        Raises:
            ValueError: If neither model nor encoding_name is provided.
            ValueError: If both model and encoding_name are provided.
        """
        # ensure that either model or encoding_name is provided
        if model is None and encoding_name is None:
            raise ValueError("Either `model` or `encoding_name` must be provided.")
        if model is not None and encoding_name is not None:
            raise ValueError("Only one of `model` or `encoding_name` can be provided.")
        if encoding_name:
            self.encoding = tiktoken.get_encoding(encoding_name)
        else:
            self.encoding = tiktoken.encoding_for_model(model)
    
    def count_tokens(self, input: Union[str, List[str]]) -> Union[int, List[int]]:
        """
        Count the number of tokens in the input.

        Args:
            input (Union[str, List[str]]): The input to tokenize. Can be a string or a list of strings.

        Returns:
            Union[int, List[int]]: The number of tokens in the input. If the input is a list, returns a list of token counts.
        """
        if isinstance(input, str):
            return len(self.encoding.encode(input))
        else:
            toks = self.encoding.encode_batch(input)
            return [len(t) for t in toks]

    def __call__(self, input: Union[str, List[str]]) -> Union[int, List[int]]:
        """
        Call the tokenizer on the input. This is equivalent to calling count_tokens.

        Args:
            input (Union[str, List[str]]): The input to tokenize. Can be a string or a list of strings.

        Returns:
            Union[int, List[int]]: The number of tokens in the input. If the input is a list, returns a list of token counts.
        """
        return self.count_tokens(input)

In [4]:
token_counter = TokenCounter(model=MODEL)

In [5]:
instructions = """
Your task is to read tweets about content moderation and classify what stance they take on Section 230 (if any).

In the context of content moderation, Section 230 is a law in the United States that protects websites and other online platforms from being held legally responsible for the content posted by their users. This means that if someone posts something illegal or harmful on a website, the website itself cannot be sued for allowing it to be posted. However, websites can still choose to moderate content and remove anything that violates their own policies. 

For each tweet in the sample, follow these instructions: 

1. Carefully read the text of the tweet, paying close attention to details.
2. Classify the tweet as having a positive stance towards Section 230, a negative stance, or a neutral stance.

For each tweet, choose one of the following categories: "negative", "neutral", "positive"
"""

In [6]:
token_counter(instructions)/1000*0.01 # cost in US dollars per request for the instructions alone ($0.01 per 1K input tokens)

0.00181

In [7]:
fp = "../../data/gilardi_chatgpt_2023/gilardi_chatgpt_2023_section230_stance.csv"
df = pd.read_csv(fp)
len(df)

780

In [8]:
df.label.value_counts()

label
neutral     420
negative    327
positive     33
Name: count, dtype: int64

In [10]:
i = 0
df.label.values[i], df['text'].values[i]

('negative',
 'Isn’t it fascinating that my Twitter followers went from 1K to 23.9K in a matter of days and now are suddenly starting to decline over the last 48 hours? Big Tech censorship is a clear and present danger to America. Section 230 protection must go away one way or another. #USA')

In [12]:
messages = [ 
    {"role": "system", "content": instructions},
    {"role": "user", "content": df['text'].values[0]}
]

response = client.chat.completions.create(
    model=MODEL,
    messages=messages,
    seed=42,
    temperature=0.0,
)

results = response.choices[0].message.content
results

'negative'
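
The reply here is exactly one of the three category names, but nothing in the API call enforces that. A reply could include quotes, capitalization, or extra words. The following is a minimal sketch of how raw replies might be mapped back to the allowed categories. The helper `normalize_label` and the constant `LABELS` are hypothetical names introduced for illustration and are not part of the original notebook.

```python
import re

# Allowed categories, copied from the instructions above.
LABELS = ("negative", "neutral", "positive")


def normalize_label(raw, labels=LABELS):
    """Map a raw model reply to one of `labels`, or None if it is ambiguous."""
    # lowercase and keep only alphabetic words, which drops quotes and punctuation
    words = re.findall(r"[a-z]+", raw.lower())
    found = {w for w in words if w in labels}
    # accept the reply only if it mentions exactly one category
    return found.pop() if len(found) == 1 else None


print(normalize_label("'Negative'."))
print(normalize_label("positive or neutral"))
```

Returning `None` for ambiguous replies keeps them visible, so they can be inspected or re-queried rather than silently counted as one of the classes.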

In [13]:
def classify_text(text):
    messages = [ 
        {"role": "system", "content": instructions},
        {"role": "user", "content": text}
    ]

    response = client.chat.completions.create(
        model=MODEL,
        messages=messages,
        seed=42,
        temperature=0.0,
    )

    results = response.choices[0].message.content
    return results
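
Applying `classify_text` to many tweets means many separate API requests, and any one of them can fail temporarily (for example because of rate limits or timeouts). One possible way to handle this is a generic retry wrapper with exponential backoff, sketched below. `with_retries` and its parameters are hypothetical and not part of the original notebook. The sketch catches every `Exception` for brevity; in practice one would probably restrict this to the error types that are worth retrying.

```python
import time


def with_retries(fn, max_attempts=3, base_delay=1.0, sleep=time.sleep):
    """Wrap `fn` so that failed calls are retried with exponential backoff."""
    def wrapped(*args, **kwargs):
        for attempt in range(max_attempts):
            try:
                return fn(*args, **kwargs)
            except Exception:
                # after the last attempt, give up and re-raise the error
                if attempt == max_attempts - 1:
                    raise
                # wait base_delay, 2 * base_delay, 4 * base_delay, ... seconds
                sleep(base_delay * 2 ** attempt)
    return wrapped
```

Assuming `classify_text` is defined as above, `classify_text_safe = with_retries(classify_text)` could then be used in its place.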

In [14]:
samples = df.groupby('label').sample(25, random_state=42).reset_index(drop=True)
samples

Unnamed: 0,status_id,text,label
0,1316498227399720960,BigTech finally went too far censoring conserv...,negative
1,1347788874890715138,A lot of big section 230 talk over the last ye...,negative
2,1275813106212655104,.@Twitter is at it again — censoring @realDona...,negative
3,1336723788759752704,".@Google/@YouTube, under false and non-sensica...",negative
4,1340367091904491520,@zackfox This is why we need to abolish sectio...,negative
...,...,...,...
70,1313573294516428804,@realDonaldTrump If you repeal section 230 of ...,positive
71,1348388767032242176,@generativist Section 230 literally exists to ...,positive
72,1346763207344599041,"$2,000 checks, monthly\r\nRestore net neutrali...",positive
73,1266180983155494913,This EO is a reactionary and politicized appro...,positive


In [16]:
# tokens in inputs
n_tokens = samples.text.apply(token_counter.count_tokens).sum()
# add token count for instructions
n_tokens += token_counter(instructions) * len(samples)
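# add token count for outputs: assumes about one output token per tweet,
# multiplied by 3 as the cost factor for output vs. input tokens
n_tokens += len(samples) * 3

# compute cost (see https://openai.com/pricing)
n_tokens/1000*0.01 # estimated cost in US dollars at $0.01 per 1K input tokens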

0.17465
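
The estimate above folds output tokens into the input token count by weighting them with a factor of 3. An equivalent and perhaps more readable formulation keeps input and output tokens separate, each with its own price. The sketch below assumes $0.01 per 1K input tokens (the rate used above) and $0.03 per 1K output tokens (an assumption that matches the factor of 3). The function `estimate_cost` is hypothetical, and the token counts are back-calculated from the result above rather than recomputed from the data.

```python
def estimate_cost(n_input_tokens, n_output_tokens,
                  input_price_per_1k=0.01, output_price_per_1k=0.03):
    """Estimated cost in US dollars; the default prices are assumptions."""
    return (n_input_tokens / 1000 * input_price_per_1k
            + n_output_tokens / 1000 * output_price_per_1k)


# The estimate above implies 17,465 weighted tokens in total,
# of which 75 * 3 = 225 stand for 75 output tokens (one per tweet).
n_output = 75
n_input = 17465 - n_output * 3

print(round(estimate_cost(n_input, n_output), 5))
```

Keeping the two prices as parameters makes it easier to redo the estimate when pricing changes or when a different model is used.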

In [18]:
# classify: apply custom classification function to all inputs
results = samples.text.apply(classify_text)

In [19]:
# evaluate: compute performance metrics
from sklearn.metrics import classification_report

print(classification_report(samples.label, results.values))

              precision    recall  f1-score   support

    negative       0.55      0.96      0.70        25
     neutral       0.77      0.40      0.53        25
    positive       0.89      0.64      0.74        25

    accuracy                           0.67        75
   macro avg       0.73      0.67      0.66        75
weighted avg       0.73      0.67      0.66        75
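
To see how the columns of the report relate to each other, the F1 scores can be recomputed from the reported precision and recall values. F1 is the harmonic mean of the two, and the macro average is the unweighted mean over classes. The sketch below uses the rounded numbers copied from the report, so small rounding differences are possible.

```python
# precision and recall per class, copied from the report above
reported = {
    "negative": (0.55, 0.96),
    "neutral": (0.77, 0.40),
    "positive": (0.89, 0.64),
}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


f1_scores = {label: f1(p, r) for label, (p, r) in reported.items()}
macro_f1 = sum(f1_scores.values()) / len(f1_scores)

for label, score in f1_scores.items():
    print(f"{label:>8}: {score:.2f}")
print(f"macro F1: {macro_f1:.2f}")
```

The combination of high recall (0.96) and low precision (0.55) for the negative class suggests that the model assigns "negative" to many tweets that were annotated as neutral or positive.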

