**PART 2: GPT as a coder for the training dataset**

Note: This notebook codes the sample dataset we created earlier. Coding can be done manually; here we instead use GPT, through the OpenAI API, as a stand-in for a human coder working on the sanitized text of a sample of tweets. The notebook defines a function that sends each tweet in the sample to the model together with a codebook prompt, collects the codes the model returns, and saves them as a new dataset. We pass the 1,000-tweet sample through this function, and the resulting codes serve as the training data for the machine learning classifier. The OpenAI API key is read from a JSON file so that it does not appear in the notebook; anyone replicating this code will need to supply their own key.
Update: my OpenAI quota was exhausted at 93% of the sample. We have 926 tweets classified with OpenAI. The API cost for this run was USD $8.64.
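
As a rough check on the figures in the update, the cost per labeled tweet and an estimate for the unlabeled remainder can be worked out as below. This is only a sketch: it assumes cost scales linearly with the number of tweets and that the failed requests were not billed, neither of which is confirmed by the notebook.

```python
# Figures taken from the update note above.
total_cost_usd = 8.64
labeled = 926
sample_size = 1000

cost_per_tweet = total_cost_usd / labeled
remaining = sample_size - labeled

# Assumption: the remaining tweets cost about the same on average.
estimated_remaining_cost = remaining * cost_per_tweet

print(f"Cost per labeled tweet: ${cost_per_tweet:.4f}")
print(f"Tweets still unlabeled: {remaining}")
print(f"Estimated cost to finish: ${estimated_remaining_cost:.2f}")
```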

**Structure:**

Library: imports the required libraries.

API key: loads the OpenAI API key from a secrets.json file and initializes the OpenAI client.

Prompt: polarization_prompt instructs GPT-4 to act as a political communication expert, gives it the codebook, and specifies a JSON output format whose fields hold our dependent variables: polar_score, pole_label, valid, and comments.

label_polarization_with_gpt function: reads the 1,000-tweet sample CSV we built earlier and takes the tweet text from the santext column. For each tweet, it constructs a user prompt and sends it to the GPT-4 model via the API. Any exception, whether an API error (such as an exhausted quota) or a reply that cannot be parsed as JSON, is caught, and the tweet is recorded as "INV" (invalid).

Execution and output: the function is called to begin the annotation process. The resulting lists of scores, labels, validity flags, and comments are added as new columns to the sample DataFrame.

The final annotated DataFrame is saved to a new CSV file: debate_tweets_sample_1000_for_annotation_labeled.csv.
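
The function parses each reply with a plain json.loads call and trusts whatever fields come back. A slightly more defensive parser might look like the sketch below. The helper name parse_label_reply and the validation rules are illustrative assumptions derived from the prompt's field descriptions; they are not part of the notebook.

```python
import json

ALLOWED_LABELS = {"left", "right", "neutral", "inv"}

# Same fallback values the notebook stores when a tweet cannot be labeled.
FALLBACK = {"polar_score": "INV", "pole_label": "inv", "valid": 0,
            "comments": "Error or parsing failure"}


def parse_label_reply(answer: str) -> dict:
    """Parse one model reply into the four codebook fields, or return FALLBACK."""
    text = answer.strip()
    # Keep only the outermost {...}, in case the model adds text around the JSON.
    start, end = text.find("{"), text.rfind("}")
    if start == -1 or end <= start:
        return dict(FALLBACK)
    try:
        raw = json.loads(text[start:end + 1])
    except json.JSONDecodeError:
        return dict(FALLBACK)

    score = raw.get("polar_score", "INV")
    label = str(raw.get("pole_label", "inv")).lower()
    valid = raw.get("valid", 0)

    # polar_score must be 'INV' or a number between -7 and +7.
    score_ok = score == "INV" or (
        isinstance(score, (int, float))
        and not isinstance(score, bool)
        and -7 <= score <= 7
    )
    if not score_ok or label not in ALLOWED_LABELS or valid not in (0, 1):
        return dict(FALLBACK)

    return {"polar_score": score, "pole_label": label,
            "valid": int(valid), "comments": str(raw.get("comments", ""))}


# The example reply from the prompt parses cleanly; free text does not.
print(parse_label_reply('{ "polar_score": -4, "pole_label": "left", "valid": 1, "comments": "" }'))
print(parse_label_reply("I cannot classify this tweet."))
```

Rejecting out-of-range scores here, rather than later, keeps malformed labels out of the training data, at the cost of marking a few more tweets as invalid.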

In [None]:
# Libraries required
import pandas as pd
import json
import time
import os
from tqdm import tqdm
from openai import OpenAI

In [4]:
# Load the API key from secrets.json (this file holds my own key; anyone following along should supply theirs)
with open("secrets.json") as f:
    secrets = json.load(f)

# Initialize OpenAI client
client = OpenAI(api_key=secrets["openai_api_key"])

# Polarization prompt based on U.S. polarization adapted codebook
polarization_prompt = """
You are a political communication expert labeling U.S. tweets based on political polarization. For each tweet, return the following in JSON format:

- polar_score: a number from -7 (extreme left) to +7 (extreme right), or 'INV' if irrelevant/off-topic
- pole_label: one of 'left', 'right', 'neutral', or 'inv'
- valid: 1 if relevant to U.S. political debate, 0 if irrelevant
- comments: a short optional comment or "" if unnecessary

Use tone, intent, political figures, partisanship, and U.S. ideological cues (e.g., MAGA, neoliberal, woke). Interpret sarcasm or humor contextually.

Output only a JSON like this:
{ "polar_score": -4, "pole_label": "left", "valid": 1, "comments": "" }

Now classify this tweet:
"""

# input file path
df_path = 'data/sample/debate_tweets_sample_1000_for_annotation.csv'

def label_polarization_with_gpt(df_path, text_column='santext', model='gpt-4', delay=1):
    df = pd.read_csv(df_path)

    polar_scores = []
    pole_labels = []
    valids = []
    comments_list = []

    for text in tqdm(df[text_column], desc="Classifying tweets"):
        try:
            prompt = polarization_prompt + f'Tweet: "{text.strip()}" →'

            response = client.chat.completions.create(
                model=model,
                messages=[
                    {"role": "system", "content": "You are a helpful assistant for political tweet labeling."},
                    {"role": "user", "content": prompt}
                ],
                temperature=0,
                max_tokens=100
            )

            # Extract model's reply
            answer = response.choices[0].message.content.strip()
            result = json.loads(answer)

            # Store results
            polar_scores.append(result.get("polar_score", "INV"))
            pole_labels.append(result.get("pole_label", "inv"))
            valids.append(result.get("valid", 0))
            comments_list.append(result.get("comments", ""))

        except Exception as e:
            print(f"Error processing tweet: {e}")
            polar_scores.append("INV")
            pole_labels.append("inv")
            valids.append(0)
            comments_list.append("Error or parsing failure")

        time.sleep(delay)

    # Add GPT labels to dataframe
    df['polar_score'] = polar_scores
    df['pole_label'] = pole_labels
    df['valid'] = valids
    df['comments'] = comments_list

    # Save results
    output_path = df_path.replace(".csv", "_labeled.csv")
    df.to_csv(output_path, index=False)
    print(f"Labeled file saved to: {output_path}")
    return df
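
Because the function writes "Error or parsing failure" into comments whenever a call fails, those rows can be found and retried later without paying again for the tweets that were already labeled. The sketch below works on plain dictionaries; relabel_failed, fake_classify, and the sample rows are hypothetical names and data used only for illustration.

```python
from typing import Callable

# The comment the labeling function stores when a tweet could not be processed.
ERROR_SENTINEL = "Error or parsing failure"


def relabel_failed(rows: list[dict], classify: Callable[[str], dict],
                   text_key: str = "santext") -> int:
    """Re-run classify only on rows marked with the error sentinel.

    rows is modified in place; the return value is the number of rows retried.
    """
    retried = 0
    for row in rows:
        if row.get("comments") == ERROR_SENTINEL:
            row.update(classify(row[text_key]))
            retried += 1
    return retried


# Stand-in classifier (hypothetical); a real one would call the API.
def fake_classify(text: str) -> dict:
    return {"polar_score": 0, "pole_label": "neutral", "valid": 1, "comments": ""}


rows = [
    {"santext": "tweet a", "polar_score": -4, "pole_label": "left",
     "valid": 1, "comments": ""},
    {"santext": "tweet b", "polar_score": "INV", "pole_label": "inv",
     "valid": 0, "comments": ERROR_SENTINEL},
]
print(relabel_failed(rows, fake_classify))
```

With pandas, the rows could come from `df.to_dict("records")` and be turned back into a DataFrame with `pd.DataFrame(rows)`.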

In [5]:
label_polarization_with_gpt(df_path)

Classifying tweets:  93%|██████████████████▌ | 929/1000 [43:39<03:24,  2.89s/it]

Error processing tweet: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


[... output truncated: progress lines 930/1000 through 996/1000, each followed by the same 429 insufficient_quota error ...]


Classifying tweets: 100%|███████████████████▉| 997/1000 [46:45<00:08,  2.72s/it]

Error processing tweet: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Classifying tweets: 100%|███████████████████▉| 998/1000 [46:47<00:05,  2.67s/it]

Error processing tweet: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Classifying tweets: 100%|███████████████████▉| 999/1000 [46:50<00:02,  2.65s/it]

Error processing tweet: Error code: 429 - {'error': {'message': 'You exceeded your current quota, please check your plan and billing details. For more information on this error, read the docs: https://platform.openai.com/docs/guides/error-codes/api-errors.', 'type': 'insufficient_quota', 'param': None, 'code': 'insufficient_quota'}}


Classifying tweets: 100%|███████████████████| 1000/1000 [46:53<00:00,  2.81s/it]

✅ Done! Labeled file saved to: data/sample/debate_tweets_sample_1000_for_annotation_labeled.csv
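
The log above shows the loop still calling the API after the quota was exhausted, so each remaining tweet produced the same 429 error. A minimal sketch of stopping early instead, assuming a hypothetical `classify` callable that wraps the API request and raises an exception whose message contains `insufficient_quota` (as in the log). The names `label_until_quota`, `classify`, and `QUOTA_MARKER` are illustrative and are not part of the original function.

```python
QUOTA_MARKER = "insufficient_quota"  # substring seen in the 429 error in the log


def label_until_quota(texts, classify):
    """Label texts in order and stop at the first quota error.

    `classify` is a hypothetical callable (text -> label) standing in for
    the API request. Returns (labels, n_done): `labels` has one entry per
    text, with "INV" for anything that failed or was never attempted, and
    `n_done` is the number of texts processed before the quota error.
    """
    labels = []
    for text in texts:
        try:
            labels.append(classify(text))
        except Exception as exc:
            if QUOTA_MARKER in str(exc):
                break  # every later call would fail the same way
            labels.append("INV")  # other errors: mark invalid and continue
    n_done = len(labels)
    # Pad the rows that were never sent so the list matches the input length.
    labels.extend(["INV"] * (len(texts) - n_done))
    return labels, n_done
```

With this shape, the rows after the quota error are marked `INV` without spending time on further requests, and `n_done` indicates where a later run could resume.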





Unnamed: 0,Id,Date,Time,Media Type,Site Domain,Mention URL,Publisher Name,Publisher Username,title,Mention Content,...,Twitter Retweets,Twitter Favorites,Twitter replies,Media URL,Debate,santext,polar_score,pole_label,valid,comments
0,287683906109,2024-09-10,20:56:00 -0500 CDT,Twitter,http://www.twitter.com,http://twitter.com/fuckftrump9000/status/18336...,fucktrump2024,fuckftrump9000,RT @RyanShead: Donald Trump is dumb as FUUUUUU...,RT @RyanShead: Donald Trump is dumb as FUUUUUU...,...,,,,,2,rt ryanshead donald trump is dumb as fuuuuuuuuck,-5,left,1,Uses derogatory language towards a right-wing ...
1,287682844708,2024-09-10,20:45:00 -0500 CDT,Twitter,http://www.twitter.com,http://twitter.com/david_darmofal/status/18336...,David Darmofal,david_darmofal,RT @politico: Woof. Why is Trump claiming immi...,RT @politico: Woof. Why is Trump claiming immi...,...,,,,,2,rt politico woof why is trump claiming immigra...,-2,left,1,"The tweet criticizes Trump, a figure associate..."
2,281373629110,2024-06-27,20:48:00 -0500 CDT,Twitter,http://www.twitter.com,http://twitter.com/CurtisHouck/status/18065048...,Curtis Houck,CurtisHouck,"Nevermind! RT @robbysoave: Trump: ""My retribut...","Nevermind! RT @robbysoave: Trump: ""My retribut...",...,,,,,1,nevermind rt robbysoave trump my retribution i...,3,right,1,"The tweet is supportive of Trump, a figure ass..."
3,287685923517,2024-09-10,21:14:00 -0500 CDT,Twitter,http://www.twitter.com,http://twitter.com/notnuts50/status/1833690587...,Joan Thompson,notnuts50,RT @LeadingReport: BREAKING: Former President ...,RT @LeadingReport: BREAKING: Former President ...,...,,,,,2,rt leadingreport breaking former president tru...,5,right,1,"The tweet is about former President Trump, a R..."
4,287694936672,2024-09-10,22:48:00 -0500 CDT,Twitter,http://www.twitter.com,http://twitter.com/JulesCalamity/status/183371...,Jules,JulesCalamity,RT @ProjectLincoln: ABC: JD Vance says you wou...,RT @ProjectLincoln: ABC: JD Vance says you wou...,...,,,,,2,rt projectlincoln abc jd vance says you would ...,-3,left,1,"The tweet refers to Project Lincoln, an anti-T..."
...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...,...
995,287689423608,2024-09-10,21:48:00 -0500 CDT,Twitter,http://www.twitter.com,http://twitter.com/ReneeV17542118/status/18336...,Renee V.,ReneeV17542118,RT @JoeyMannarinoUS: Kamala Harris didn’t have...,RT @JoeyMannarinoUS: Kamala Harris didn’t have...,...,,,,,2,rt joeymannarinous kamala harris didnt have a ...,INV,inv,0,Error or parsing failure
996,287687695013,2024-09-10,21:32:00 -0500 CDT,Twitter,http://www.twitter.com,http://twitter.com/peteyboo1966/status/1833695...,Shirley J Roark,peteyboo1966,RT @michaeljknowles: If they would just “fact-...,RT @michaeljknowles: If they would just “fact-...,...,,,,,2,rt michaeljknowles if they would just factchec...,INV,inv,0,Error or parsing failure
997,281369262213,2024-06-27,19:30:00 -0500 CDT,Twitter,http://www.twitter.com,http://twitter.com/JacquelineRuby1/status/1806...,Jacqueline Ruby,JacquelineRuby1,RT @FaithRubPol: Travis Kelce SLAMS Trump for ...,RT @FaithRubPol: Travis Kelce SLAMS Trump for ...,...,,,,,1,rt faithrubpol travis kelce slams trump for wh...,INV,inv,0,Error or parsing failure
998,287690289336,2024-09-10,21:56:00 -0500 CDT,Twitter,http://www.twitter.com,http://twitter.com/NubletNub/status/1833701132...,Nub,NubletNub,"RT @MartyTheElder: Ok, let me get this straigh...","RT @MartyTheElder: Ok, let me get this straigh...",...,,,,,2,rt martytheelder ok let me get this straight k...,INV,inv,0,Error or parsing failure
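
The preview shows that rows which failed carry `INV` / `inv` and `valid == 0`. One way to see how many tweets were actually labeled, and which ones would need a second pass, is sketched below on a toy frame that mimics those columns. The values are invented for illustration and `summarize_labels` is a hypothetical helper.

```python
import pandas as pd


def summarize_labels(df):
    """Return (n_valid, n_invalid, retry_index) for a labeled frame.

    Assumes a `valid` column with 1 for usable labels and 0 for failures,
    as in the preview above.
    """
    is_valid = df["valid"] == 1
    retry_index = df.index[~is_valid].tolist()  # rows to send again later
    return int(is_valid.sum()), int((~is_valid).sum()), retry_index


# Toy data with the same column names as the labeled CSV (values invented).
toy = pd.DataFrame({
    "santext": ["tweet a", "tweet b", "tweet c"],
    "pole_label": ["left", "right", "inv"],
    "valid": [1, 1, 0],
})
n_valid, n_invalid, retry_index = summarize_labels(toy)
print(n_valid, n_invalid, retry_index)
```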


In [None]:
# Let's check the labeled data and build the train and test sets (X and y).
# The file created is data/sample/debate_tweets_sample_1000_for_annotation_labeled.csv

In [3]:
import pandas as pd
from sklearn.model_selection import train_test_split

# Load the labeled tweet data
df = pd.read_csv("data/sample/debate_tweets_sample_1000_for_annotation_labeled.csv")

# Problem encountered when using GPT as the coder:
# the API quota ran out before all 1000 tweets were coded,
# so we filter to valid labels only, then select the first 920 rows (GPT actually coded 934 tweets)
df = df[df['valid'] == 1].copy().head(920)

# Convert categorical labels to numeric
label_map = {'left': 0, 'neutral': 1, 'right': 2}
df['label'] = df['pole_label'].map(label_map)

# Train-test split (80/20)
X_train, X_test, y_train, y_test = train_test_split(
    df['santext'], df['label'], test_size=0.2, random_state=42, stratify=df['label'])
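
Because the split passes `stratify=df['label']`, the left/neutral/right shares in the train and test sets should closely match those of the full sample. A small check, sketched with the standard library so that it works on any iterable of labels (including `y_train` and `y_test`). `label_shares` is a hypothetical helper and the toy labels are invented.

```python
from collections import Counter


def label_shares(labels):
    """Map each label to its share of the total, rounded to 3 decimals."""
    counts = Counter(labels)
    total = sum(counts.values())
    return {label: round(n / total, 3) for label, n in sorted(counts.items())}


# Toy example with numeric labels as in label_map (0=left, 1=neutral, 2=right).
full = [0, 0, 0, 0, 1, 1, 2, 2, 2, 2]
print(label_shares(full))
```

In the notebook this would be called as `label_shares(y_train)` and `label_shares(y_test)` and the two results compared.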