
<h1><center> PPOL 6801 Text as Data <br><br> 
<font color='grey'> LLMs: Outsourcing Text-as-Data Tasks  <br><br>
Tiago Ventura </font> </center> </h1> 

---

### Outsourcing Text-as-Data Tasks to Generative Text-Based Models: GPT's API

As you know, ChatGPT is a large language model (as we just saw) developed by OpenAI and based on the GPT architecture. The model was trained on a word-prediction task, and it has impressed the world with its capacity to engage in conversational interactions.

You should be familiar with using the ChatGPT tool to solve a variety of tasks. Here I will show you how to do that at scale, by using prompts to interact with the model via the OpenAI API. 

The whole process requires access to the OpenAI API, which allows us to query the GPT models repeatedly from code. Notice that this is not free: you pay for every query. For small tasks it is not very expensive, but for tasks with millions of predictions the cost can add up. 
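To get a feel for the bill before launching a large job, a back-of-the-envelope estimate helps. The sketch below is only illustrative: the function name `estimate_cost`, the price per 1,000 tokens, and the token count per query are hypothetical placeholders, not OpenAI's actual prices, so look up the current pricing before relying on it.

```python
def estimate_cost(n_queries, tokens_per_query, price_per_1k_tokens):
    """Rough cost in dollars for n_queries API calls.

    tokens_per_query: prompt + completion tokens for one call (assumed average).
    price_per_1k_tokens: hypothetical price in dollars; check the real one.
    """
    total_tokens = n_queries * tokens_per_query
    return total_tokens / 1000 * price_per_1k_tokens

# hypothetical numbers: 350 tokens per call at $0.002 per 1,000 tokens
small_job = estimate_cost(1_000, 350, 0.002)
large_job = estimate_cost(5_000_000, 350, 0.002)
print(f"1,000 queries: ${small_job:.2f}")
print(f"5,000,000 queries: ${large_job:,.2f}")
```

Under these made-up numbers the small job costs well under a dollar while the large one runs into the thousands, which is the gap to keep in mind when scaling up.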



## Tasks and Prompts

Before we try to replicate the tasks behind the papers we read in class, let's see some simple tasks we can ask GPT models to perform. 

In [3]:
#!pip install openai

In [36]:
# load the API key
# python-dotenv reads environment variables from a .env file
import os
from dotenv import load_dotenv
import requests 


# load the key from the environment variable
load_dotenv() # .env file in cwd
gpt_key = os.environ.get("gpt") 

In [99]:
# simple query

# define headers
headers = {
        "Authorization": f"Bearer {gpt_key}",
        "Content-Type": "application/json",
    }

# define the question and the request payload (model, temperature, messages)
question = "Please, tell me more about the Data Science and Public Policy Program at Georgetown's McCourt School"

data = {
        "model": "gpt-3.5-turbo-0301",
        "temperature": 0,
        "messages": [{"role": "user", "content": question}]
    }



# send a post request
response = requests.post("https://api.openai.com/v1/chat/completions", 
                             json=data, 
                             headers=headers)
# convert to json
response_json = response.json()

In [100]:
response_json['choices'][0]['message']['content'].strip()

"The Data Science and Public Policy Program at Georgetown's McCourt School is a unique program that combines the fields of data science and public policy. The program is designed to equip students with the skills and knowledge needed to use data science to solve complex policy problems.\n\nThe program is interdisciplinary in nature, drawing on expertise from the fields of statistics, computer science, economics, and political science. Students in the program learn how to collect, analyze, and interpret data to inform policy decisions.\n\nThe curriculum includes courses in data science, statistics, machine learning, and policy analysis. Students also have the opportunity to work on real-world policy projects, collaborating with government agencies, non-profit organizations, and private sector companies.\n\nGraduates of the program are well-equipped to pursue careers in a variety of fields, including government, non-profit organizations, and the private sector. They are able to use data 

## Sentiment Classification

In Rathje et al., we saw the use of GPT models for sentiment classification using zero-shot prompts. 

This is a super simple task. Let's see some code below on how to go about it.  

In [101]:
# Function to interact with the ChatGPT API
def hey_chatGPT(question_text, api_key):
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

    data = {
        "model": "gpt-3.5-turbo-0301",
        "temperature": 0,
        "messages": [{"role": "user", "content": question_text}]
    }

    response = requests.post("https://api.openai.com/v1/chat/completions", 
                             json=data, 
                             headers=headers, timeout=5)
    
    response_json = response.json()
    return response_json['choices'][0]['message']['content'].strip()
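When a request fails (bad key, rate limit, retired model name), the JSON that comes back has no `choices` entry, so the last line of the function raises a `KeyError` that hides the real problem. The sketch below shows one way to surface the message instead. It assumes failed requests return a JSON object with an `error` key holding a `message`; the helper name `extract_content` and the toy payloads are made up for illustration.

```python
def extract_content(response_json):
    """Return the assistant's text, or raise if the API sent back an error."""
    if "error" in response_json:
        # assumed shape of an error payload: {"error": {"message": ...}}
        message = response_json["error"].get("message", "unknown API error")
        raise RuntimeError(message)
    return response_json["choices"][0]["message"]["content"].strip()

# toy payloads that mimic the two shapes, so no API call is needed
ok = {"choices": [{"message": {"role": "assistant", "content": " -1 "}}]}
bad = {"error": {"message": "Rate limit reached", "type": "rate_limit_error"}}

print(extract_content(ok))
try:
    extract_content(bad)
except RuntimeError as err:
    print("API error:", err)
```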

In [102]:
import pandas as pd
# let's open some twitter data
pd_test = pd.read_csv("../data/incivility.csv")

# sample
pd_test = pd_test.sample(n=10).reset_index()

# see
pd_test.head()

Unnamed: 0,index,comment_id,comment_likes_count,comment_message,attacks
0,953,10154314327932068_10154317256602068,0,my thats a lot of black folks,1
1,2894,435099756687754_435248720006191,4,But he did give us a better understanding of s...,0
2,2104,885263181519105_886390928072997,1,VETO IT!!!,0
3,917,1254553344571289_1257252654301358,0,"Correction * million, not millions Time for a ...",1
4,1570,904974722925935_905135986243142,0,Enough liberal catholics running the house. We...,1


In [103]:
import time
import numpy as np  # needed for np.nan below

output = []
# Run a loop over the comments and prompt ChatGPT for each one
for i in range(len(pd_test)):
    try: 
        print(i)
        question = "Is the sentiment of this text positive, neutral, or negative? \
        Answer only with a number: 1 if positive, 0 if neutral and -1 if negative. \
        Here is the text: "
        text = pd_test.loc[i, "comment_message"]
        full_question = question + str(text)
        output.append(hey_chatGPT(full_question, gpt_key))
    except Exception:
        # keep the output aligned with the data when a request fails
        output.append(np.nan)

0
1
2
3
4
5
6
7
8
9


In [104]:
# save the output
pd_test["sentiment"]= output
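The API returns text, so the `sentiment` column holds strings, and a model may occasionally wrap the number in extra words. A small parser makes the labels numeric and flags anything unexpected. This is a sketch under that assumption; `parse_sentiment` is a hypothetical helper, not part of the original code.

```python
import re

def parse_sentiment(raw):
    """Map a raw model answer to -1, 0, or 1; return None if no valid label is found."""
    if not isinstance(raw, str):
        return None  # e.g. NaN stored after a failed request
    # look for a standalone -1, 0, or 1 (not part of a longer number)
    match = re.search(r"(?<!\d)(-1|0|1)(?!\d)", raw)
    return int(match.group(1)) if match else None

examples = ["-1", "Sentiment: 0", " 1.", "negative", float("nan")]
print([parse_sentiment(x) for x in examples])
```

Applied to the data frame, this would look like `pd_test["sentiment"].map(parse_sentiment)`; rows that come back as `None` are the ones worth reading by hand.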

In [105]:
# see
with pd.option_context('display.max_rows', None, 'display.max_columns', None):  # more options can be specified also
    print(pd_test[["comment_message", "sentiment"]])


                                     comment_message sentiment
0                      my thats a lot of black folks        -1
1  But he did give us a better understanding of s...         0
2                                         VETO IT!!!        -1
3  Correction * million, not millions Time for a ...        -1
4  Enough liberal catholics running the house. We...        -1
5  My mother sang old shep to make me cry!!  What...        -1
6  why are you allowing toxic Chemtrial spaying i...        -1
7  While we were all looking the other way, sacre...        -1
8                                      Shame on you.        -1
9  Chris 100% of the guns used in those crimes ar...        -1


In [106]:
for i in range(len(pd_test)):
    print("Text:" + str(pd_test["comment_message"][i]) + " \nSentiment: " + str(pd_test["sentiment"][i]))

Text:my thats a lot of black folks 
Sentiment: -1
Text:But he did give us a better understanding of such legal terms as "applesauce," and "jiggerly-piggerly," as well as the unique concept of executing innocent people simply because none of the Founding Fathers thought it necessary to spell it out in the Constitution. 
Sentiment: 0
Text:VETO IT!!! 
Sentiment: -1
Text:Correction * million, not millions Time for a real tax and stopping the ripoff of Ohio's resources. 
Sentiment: -1
Text:Enough liberal catholics running the house. We want a principled conservative who is not cowered down by the liberal bullies and Obama. 
Sentiment: -1
Text:My mother sang old shep to make me cry!!  What mean people!!  LOL 
Sentiment: -1
Text:why are you allowing toxic Chemtrial spaying in California? 
Sentiment: -1
Text:While we were all looking the other way, sacred Apache land (that was also property of the American people as a whole) was sold to an Australian-British mining company that will soon have 

## Scaling via pair-wise comparison

Now let's see how we can use GPT to do pairwise comparison. Notice, we saw in the paper that pairwise comparisons can be used as input for scaling models of ideology. But, this type of labeled data can be used for many different tasks, for example, readability and sophistication scores, as we saw earlier in the semester. 

Together with Lisa Singh and Leticia Bode, we are actually using a similar approach, but with human labelling, to understand levels of humanness of social media content in the AI era. Next year, you can email me and I can show you some results!

The code below was actually provided by Patrick Wu. So thanks to him!

In [107]:
# import libraries
import pandas as pd
import numpy as np
import os
import time
from openai import OpenAI
from itertools import combinations
from random import sample, choices
import random
import re
from tqdm import tqdm
from joblib import delayed, Parallel

In [108]:
# create a client to interact with the API
client = OpenAI(
    # This is the default and can be omitted
    api_key=gpt_key,
)

In [109]:
chat_completion = client.chat.completions.create(
    messages=[
        {
            "role": "user",
            "content": "Say this is a test",
        }
    ],
    model="gpt-3.5-turbo",
)


In [110]:
'''
p: the prompt
system_prompt: the system prompt. The default is what is used on ChatGPT's web interface.
temp: temperature parameter. 1.0 is the default (for GPT-3.5, temperature ranges from 0 to 2.0).
      Note: the API call below fixes temperature at 0, so this argument is currently ignored.
'''
def prompting_openai_comparison(p,
                                system_prompt='You are ChatGPT, a large language model trained by OpenAI, based on the GPT-3.5 architecture.\nKnowledge cutoff: 2021-09\nCurrent date: 2023-10-28',
                                temp=1.0):
    # seconds to sleep after each failed attempt; these values approximate exponential backoff
    sleepy_times = [1, 2]
    response = None  # returned as None if every attempt fails

    for i in range(len(sleepy_times)):
        try:
            response = client.chat.completions.create(model="gpt-3.5-turbo",
                                              messages=[{"role": "system", 
                                                         "content": system_prompt},
                                                        {"role": "user", 
                                                         "content": p}],
                                              temperature=0)
            break
        except Exception:
            # if OpenAI's API returns an error, this lets you know and backs off for the set time, determined using the sleepy_times list
            print('uh oh, ' + str(sleepy_times[i]))
            time.sleep(sleepy_times[i])
    return response
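The function above waits 1 and then 2 seconds between its two attempts. For longer jobs, the same idea can be written as a general retry helper whose wait doubles after each failure. The sketch below is one possible version, not the code used in the paper; `call_with_backoff` and its parameters are hypothetical names, and the `sleep` argument exists only so the toy example can run without actually waiting.

```python
import time

def call_with_backoff(fn, max_tries=4, base_delay=1.0, sleep=time.sleep):
    """Call fn() until it succeeds, doubling the wait after each failure."""
    for attempt in range(max_tries):
        try:
            return fn()
        except Exception as err:
            if attempt == max_tries - 1:
                raise  # out of retries: surface the last error
            delay = base_delay * 2 ** attempt  # 1, 2, 4, ...
            print(f"attempt {attempt + 1} failed ({err}); waiting {delay}s")
            sleep(delay)

# toy example: a function that fails twice and then succeeds
calls = {"n": 0}

def flaky():
    calls["n"] += 1
    if calls["n"] < 3:
        raise RuntimeError("temporary error")
    return "ok"

waits = []
result = call_with_backoff(flaky, sleep=waits.append)  # record waits instead of sleeping
print(result, waits)
```

With the real client, `fn` would be a small function or `lambda` that wraps the `client.chat.completions.create(...)` call.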

In [32]:
# get a list of S116 members
#!wget https://voteview.com/static/data/out/members/S116_members.csv

In [111]:
df = pd.read_csv('S116_members.csv')

In [112]:
# let's get an ordinary version of some members of the congress
# Add "ordinary" versions of senators' names
df['bioname_ordinary'] = ['Donald Trump',
'Doug Jones',
'Richard Shelby',
'Lisa Murkowski',
'Dan Sullivan',
'Kyrsten Sinema',
'Martha McSally',
'Mark Kelly',
'John Boozman',
'Tom Cotton',
'Kamala Harris',
'Dianne Feinstein',
'Cory Gardner',
'Michael Bennet',
'Chris Murphy',
'Richard Blumenthal',
'Tom Carper',
'Chris Coons',
'Marco Rubio',
'Rick Scott',
'Johnny Isakson',
'David Perdue',
'Kelly Loeffler',
'Mazie Hirono',
'Brian Schatz',
'Mike Crapo',
'James Risch',
'Dick Durbin',
'Tammy Duckworth',
'Todd Young',
'Mike Braun',
'Chuck Grassley',
'Joni Ernst',
'Pat Roberts',
'Jerry Moran',
'Mitch McConnell',
'Rand Paul',
'Bill Cassidy',
'John Kennedy',
'Angus King',
'Susan Collins',
'Ben Cardin',
'Chris Van Hollen',
'Ed Markey',
'Elizabeth Warren',
'Gary Peters',
'Debbie Stabenow',
'Amy Klobuchar',
'Tina Smith',
'Roger Wicker',
'Cindy Hyde-Smith',
'Roy Blunt',
'Josh Hawley',
'Steve Daines',
'Jon Tester',
'Deb Fischer',
'Ben Sasse',
'Jacky Rosen',
'Catherine Cortez Masto',
'Jeanne Shaheen',
'Maggie Hassan',
'Bob Menendez',
'Cory Booker',
'Martin Heinrich',
'Tom Udall',
'Chuck Schumer',
'Kirsten Gillibrand',
'Richard Burr',
'Thom Tillis',
'Kevin Cramer',
'John Hoeven',
'Rob Portman',
'Sherrod Brown',
'Jim Inhofe',
'James Lankford',
'Ron Wyden',
'Jeff Merkley',
'Pat Toomey',
'Bob Casey',
'Jack Reed',
'Sheldon Whitehouse',
'Tim Scott',
'Lindsey Graham',
'John Thune',
'Mike Rounds',
'Marsha Blackburn',
'Lamar Alexander',
'John Cornyn',
'Ted Cruz',
'Mike Lee',
'Mitt Romney',
'Patrick Leahy',
'Bernie Sanders',
'Mark Warner',
'Tim Kaine',
'Maria Cantwell',
'Patty Murray',
'Shelley Moore Capito',
'Joe Manchin',
'Tammy Baldwin',
'Ron Johnson',
'John Barrasso',
'Mike Enzi']

In [113]:
# Delete Donald Trump
df = df.iloc[1:,]

# sample just a few
df = df.sample(n=10).reset_index()


In [114]:
df

Unnamed: 0,index,congress,chamber,icpsr,state_icpsr,district_code,state_abbrev,party_code,occupancy,last_means,...,nominate_dim1,nominate_dim2,nominate_log_likelihood,nominate_geo_mean_probability,nominate_number_of_votes,nominate_number_of_errors,conditional,nokken_poole_dim1,nokken_poole_dim2,bioname_ordinary
0,100,116,Senate,41111,25,0.0,WI,200,,,...,0.636,-0.14,-45.55103,0.93161,643.0,25.0,,0.61,0.017,Ron Johnson
1,97,116,Senate,20146,56,0.0,WV,200,,,...,0.278,0.132,-24.25495,0.96229,631.0,6.0,,0.326,0.177,Shelley Moore Capito
2,30,116,Senate,41900,22,0.0,IN,200,,,...,0.8,0.6,-61.55663,0.91133,663.0,29.0,,0.837,0.335,Mike Braun
3,3,116,Senate,40300,81,0.0,AK,200,,,...,0.21,-0.316,-47.85393,0.92606,623.0,12.0,,0.283,-0.424,Lisa Murkowski
4,42,116,Senate,20330,52,0.0,MD,100,,,...,-0.392,-0.208,-104.86962,0.85451,667.0,39.0,,-0.38,-0.333,Chris Van Hollen
5,102,116,Senate,49706,68,0.0,WY,200,,,...,0.545,0.199,-42.601,0.93551,639.0,16.0,,0.587,0.414,Mike Enzi
6,56,116,Senate,41503,35,0.0,NE,200,,,...,0.669,-0.25,-41.30191,0.93788,644.0,15.0,,0.717,-0.012,Ben Sasse
7,76,116,Senate,40908,72,0.0,OR,100,,,...,-0.445,-0.734,-76.21439,0.89141,663.0,26.0,,-0.409,-0.724,Jeff Merkley
8,78,116,Senate,40703,14,0.0,PA,100,,,...,-0.313,0.165,-100.28959,0.86059,668.0,50.0,,-0.339,0.197,Bob Casey
9,35,116,Senate,14921,51,0.0,KY,200,,,...,0.404,0.02,-63.71061,0.90903,668.0,16.0,,0.297,0.201,Mitch McConnell


In [115]:
# Then get dictionaries that obtain the party and state for each senator by name
names = list(df['bioname_ordinary'])
state = list(df['state_abbrev'])
party = ['R' if j==200 else 'D' if j==100 else 'I' for j in list(df['party_code'])]

name_party_dict = {n: p for n,p in zip(names,party)}
name_state_dict = {n: s for n,s in zip(names,state)}

In [116]:
# This function samples matchups for each senator. It does not cap each senator at sample_size matchups;
# it guarantees that each senator appears in at least sample_size matchups.
# Because pairs are sampled separately for each senator, the same pair can be drawn more than once.
def generate_pairwise_matchups(items, sample_size=20, seed_value=42):
  random.seed(seed_value)

  if sample_size >= len(items) or sample_size < 1:
    raise ValueError("Sample size must be between 1 and one less than the total number of items")

  all_matchups = []

  # Generate all possible pairings
  all_combinations = list(combinations(items, 2))

  for i in items:
    # Filter matchups containing the current item
    relevant_matchups = [pair for pair in all_combinations if i in pair]

    # Shuffle the matchups
    random.shuffle(relevant_matchups)

    # Sample from these matchups up to the specified sample size
    all_matchups.extend(relevant_matchups[:sample_size])

  return all_matchups
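Since the sampler draws pairs separately for each item, the same pair can show up twice, which means paying for the same prompt twice. If that matters, one option is to drop repeats after sampling. The variant below is a sketch with a hypothetical name (`sample_unique_matchups`); it uses its own random generator, so it will not reproduce the exact draws of the function above.

```python
from itertools import combinations
import random

def sample_unique_matchups(items, sample_size=1, seed_value=42):
    """Sample at least sample_size pairs per item, keeping each pair only once."""
    rng = random.Random(seed_value)
    all_pairs = list(combinations(items, 2))
    chosen = []
    for item in items:
        relevant = [pair for pair in all_pairs if item in pair]
        rng.shuffle(relevant)
        chosen.extend(relevant[:sample_size])
    # dict.fromkeys drops repeated pairs while keeping first-seen order
    return list(dict.fromkeys(chosen))

toy = ["A", "B", "C", "D"]
pairs = sample_unique_matchups(toy, sample_size=2)
print(pairs)
```

Every item still appears in at least `sample_size` pairs, because removing a repeated pair never removes its first copy.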

In [117]:
matchups = generate_pairwise_matchups(names, sample_size=1, seed_value=42)

In [118]:
len(matchups)

10

Here, we note the direction of comparison. We have to use liberal and conservative differently in these prompts because, when comparing two Republicans, if I prompt ChatGPT with "who is more liberal," it will often fail to answer this and reply that both senators are conservative.

In [119]:
prompts = []
comparison_direction = []

for j in matchups:
    # D vs. D
    if (name_party_dict[j[0]]=='D' or name_party_dict[j[0]]=='I') and (name_party_dict[j[1]]=='D' or name_party_dict[j[1]]=='I'):
        sent = 'Based on past voting records and statements, which senator is more liberal: ' + j[0] + ' (' + name_party_dict[j[0]] + '-' + name_state_dict[j[0]] + ') or ' + j[1] + ' (' + name_party_dict[j[1]] + '-' + name_state_dict[j[1]] + ')?'
        comparison_direction.append('liberal')
    # D vs. R
    elif (name_party_dict[j[0]]=='D' or name_party_dict[j[0]]=='I') and (name_party_dict[j[1]]=='R'):
        sent = 'Based on past voting records and statements, which senator is more liberal: ' + j[0] + ' (' + name_party_dict[j[0]] + '-' + name_state_dict[j[0]] + ') or ' + j[1] + ' (' + name_party_dict[j[1]] + '-' + name_state_dict[j[1]] + ')?'
        comparison_direction.append('liberal')
    # R vs. D
    elif (name_party_dict[j[0]]=='R') and (name_party_dict[j[1]]=='D' or name_party_dict[j[1]]=='I'):
        sent = 'Based on past voting records and statements, which senator is more liberal: ' + j[0] + ' (' + name_party_dict[j[0]] + '-' + name_state_dict[j[0]] + ') or ' + j[1] + ' (' + name_party_dict[j[1]] + '-' + name_state_dict[j[1]] + ')?'
        comparison_direction.append('liberal')
    # R vs. R
    elif (name_party_dict[j[0]]=='R') and (name_party_dict[j[1]]=='R'):
        sent = 'Based on past voting records and statements, which senator is more conservative: ' + j[0] + ' (' + name_party_dict[j[0]] + '-' + name_state_dict[j[0]] + ') or ' + j[1] + ' (' + name_party_dict[j[1]] + '-' + name_state_dict[j[1]] + ')?'
        comparison_direction.append('conservative')
    else:
        print('OH NO!')
        break
    prompts.append(sent)
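The four branches above differ only in one word: the prompt asks "more conservative" when both senators are Republicans and "more liberal" otherwise. A more compact way to express the same rule is sketched below; `label` and `build_prompt` are hypothetical helper names, and the toy dictionaries stand in for `name_party_dict` and `name_state_dict`.

```python
def label(name, party, state):
    """Format a senator as 'Name (P-ST)'."""
    return f"{name} ({party[name]}-{state[name]})"

def build_prompt(pair, party, state):
    """Return (prompt, direction) for one matchup."""
    first, second = pair
    both_republican = party[first] == "R" and party[second] == "R"
    direction = "conservative" if both_republican else "liberal"
    prompt = (
        "Based on past voting records and statements, which senator is more "
        f"{direction}: {label(first, party, state)} or {label(second, party, state)}?"
    )
    return prompt, direction

# toy lookup tables for three senators
party = {"Ron Johnson": "R", "Chris Van Hollen": "D", "Mike Braun": "R"}
state = {"Ron Johnson": "WI", "Chris Van Hollen": "MD", "Mike Braun": "IN"}

print(build_prompt(("Ron Johnson", "Chris Van Hollen"), party, state))
print(build_prompt(("Mike Braun", "Ron Johnson"), party, state))
```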

In [120]:
#Set the system prompt for the pairwise comparison.
system_prompt = 'You are ChatGPT, a large language model trained by OpenAI, based on the GPT-3.5 architecture.\nKnowledge cutoff: 2021-09\nCurrent date: 2023-09-11'

In [122]:
print(prompts[0:10])

['Based on past voting records and statements, which senator is more liberal: Ron Johnson (R-WI) or Chris Van Hollen (D-MD)?', 'Based on past voting records and statements, which senator is more liberal: Shelley Moore Capito (R-WV) or Bob Casey (D-PA)?', 'Based on past voting records and statements, which senator is more conservative: Mike Braun (R-IN) or Lisa Murkowski (R-AK)?', 'Based on past voting records and statements, which senator is more conservative: Lisa Murkowski (R-AK) or Mitch McConnell (R-KY)?', 'Based on past voting records and statements, which senator is more liberal: Chris Van Hollen (D-MD) or Mike Enzi (R-WY)?', 'Based on past voting records and statements, which senator is more conservative: Mike Braun (R-IN) or Mike Enzi (R-WY)?', 'Based on past voting records and statements, which senator is more conservative: Shelley Moore Capito (R-WV) or Ben Sasse (R-NE)?', 'Based on past voting records and statements, which senator is more liberal: Ben Sasse (R-NE) or Jeff Me

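Now we're finally ready to run the pairwise comparisons. With many prompts, you would want to run them in parallel (for example with joblib, imported above), because iterating one prompt at a time takes a very long time. With only ten prompts, a simple loop is fine.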

In [123]:
# create a container
comparison_results = []

# iterate
for p in prompts:
    results = prompting_openai_comparison(p, system_prompt, 1.0)
    comparison_results.append(results)
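For larger sets of prompts, the calls can be issued concurrently. The sketch below uses the standard library's `concurrent.futures` rather than joblib, purely to keep the example dependency-free; `run_in_parallel` and `fake_api_call` are hypothetical names, and the fake function stands in for the real API call so the example runs offline.

```python
from concurrent.futures import ThreadPoolExecutor

def run_in_parallel(fn, prompts, max_workers=4):
    """Apply fn to every prompt using a pool of threads; results keep the input order."""
    with ThreadPoolExecutor(max_workers=max_workers) as pool:
        return list(pool.map(fn, prompts))

# stand-in for the API call so the sketch runs without a key
def fake_api_call(prompt):
    return "answer to: " + prompt

demo = run_in_parallel(fake_api_call, ["q1", "q2", "q3"])
print(demo)
```

With the real function, the call would look like `run_in_parallel(lambda p: prompting_openai_comparison(p, system_prompt), prompts)`. Keeping `max_workers` small is a reasonable precaution against hitting the API's rate limits.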

In [124]:
# let's look at it 
comparison_results[0]

ChatCompletion(id='chatcmpl-9F70pvXb7yJnFJfzKilUYkNbi7PNE', choices=[Choice(finish_reason='stop', index=0, logprobs=None, message=ChatCompletionMessage(content="As of my last knowledge update in September 2021, Chris Van Hollen, a Democrat from Maryland, is generally considered to be more liberal than Ron Johnson, a Republican from Wisconsin. \n\nChris Van Hollen has a voting record and policy positions that align more closely with liberal or progressive ideals, as he is a member of the Democratic Party, which is typically associated with more liberal stances on various issues.\n\nRon Johnson, on the other hand, is a Republican senator from Wisconsin and is generally associated with conservative positions and policies. \n\nPlease note that political positions and affiliations can evolve over time, so it's advisable to check more recent sources for the most up-to-date information on the political leanings of these senators.", role='assistant', function_call=None, tool_calls=None))], cre

In [125]:
# Extract the text answer from ChatGPT responses
def get_text_from_chatgpt(responses):
  return [responses[i].choices[0].message.content for i in range(len(responses))]

In [126]:
comparisons_text = get_text_from_chatgpt(comparison_results)

In [127]:
# This is what a pairwise comparison looks like.
print(comparisons_text[0])

As of my last knowledge update in September 2021, Chris Van Hollen, a Democrat from Maryland, is generally considered to be more liberal than Ron Johnson, a Republican from Wisconsin. 

Chris Van Hollen has a voting record and policy positions that align more closely with liberal or progressive ideals, as he is a member of the Democratic Party, which is typically associated with more liberal stances on various issues.

Ron Johnson, on the other hand, is a Republican senator from Wisconsin and is generally associated with conservative positions and policies. 

Please note that political positions and affiliations can evolve over time, so it's advisable to check more recent sources for the most up-to-date information on the political leanings of these senators.


Now we need to extract the answers from our comparisons. We'll append our answers from before and ask ChatGPT to extract the answer.

In [128]:
extracting_answer_prompt = []

for i in range(len(comparisons_text)):
    if comparison_direction[i]=='liberal':
        sent = 'Text: "' + comparisons_text[i] + '"\n\nIn the above Text, who is described to be the more liberal, more progressive, or less conservative senator: ' + matchups[i][0] + ' or ' + matchups[i][1] + '? Return only the full name without party affiliation or state information. If one senator is described as more conservative, return the other senator\'s name. If one senator is described as more moderate, return the other senator\'s name. If neither senators are described to be more liberal, more progressive, less conservative, more conservative, or more moderate, reply with "Tie."'
    elif comparison_direction[i]=='conservative':
        sent = 'Text: "' + comparisons_text[i] + '"\n\nIn the above Text, who is described to be the more conservative or less liberal senator: ' + matchups[i][0] + ' or ' + matchups[i][1] + '? Return only the full name without party affiliation or state information. If one senator is described as more liberal, return the other senator\'s name. If one senator is described as more moderate, return the other senator\'s name. If neither senators are described to be more conservative, less liberal, more liberal, or more moderate, reply with "Tie."'
    extracting_answer_prompt.append(sent)

In [129]:
system_prompt_extraction = 'You are reading a Text and extracting information from it according to the prompt. Follow the directions exactly.'

In [130]:
# let's see this!
extracting_answer_prompt[0]

'Text: "As of my last knowledge update in September 2021, Chris Van Hollen, a Democrat from Maryland, is generally considered to be more liberal than Ron Johnson, a Republican from Wisconsin. \n\nChris Van Hollen has a voting record and policy positions that align more closely with liberal or progressive ideals, as he is a member of the Democratic Party, which is typically associated with more liberal stances on various issues.\n\nRon Johnson, on the other hand, is a Republican senator from Wisconsin and is generally associated with conservative positions and policies. \n\nPlease note that political positions and affiliations can evolve over time, so it\'s advisable to check more recent sources for the most up-to-date information on the political leanings of these senators."\n\nIn the above Text, who is described to be the more liberal, more progressive, or less conservative senator: Ron Johnson or Chris Van Hollen? Return only the full name without party affiliation or state informati

In [131]:
# create a container
extraction = []

# iterate
for p in extracting_answer_prompt:
    results = prompting_openai_comparison(p, system_prompt_extraction, 1.0)
    extraction.append(results)

In [132]:
extraction_text = get_text_from_chatgpt(extraction)

In [133]:
extraction_text

['Chris Van Hollen',
 'Bob Casey',
 'Mike Braun',
 'Mitch McConnell',
 'Chris Van Hollen',
 'Tie.',
 'Ben Sasse',
 'Jeff Merkley',
 'Bob Casey',
 'Bob Casey']

In [134]:
# this function simply removes the period at the end of the sentence
def remove_period(sentence):
    if sentence.endswith('.'):
        sentence = sentence[:-1]
    return sentence

# this function simply removes the 'Senator ' prefix. For example, it returns "Dianne Feinstein" if the input text is "Senator Dianne Feinstein"
def remove_senator_prefix(input_string):
    if input_string.startswith("Senator "):
        return input_string[8:]
    else:
        return input_string

In [135]:
extraction_text = [remove_period(t) for t in extraction_text]
extraction_text = [remove_senator_prefix(t) for t in extraction_text]

In [136]:
print(extraction_text[0])

Chris Van Hollen


We then use a function to check that every extraction was correct. Sometimes it will still not correctly extract the answer, which means we have to step in and manually fix it. If the function prints nothing, great!
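The checking function is not shown in this notebook. A minimal sketch of what such a check could look like is below; `check_extractions` is a hypothetical name, and the rule it applies (each answer must be one of the two senators in the matchup or `"Tie"`) is inferred from the extraction prompts above.

```python
def check_extractions(matchups, answers):
    """Print and return the positions whose answer is not a valid outcome."""
    problems = []
    for i, (pair, answer) in enumerate(zip(matchups, answers)):
        if answer not in (pair[0], pair[1], "Tie"):
            print(f"{i}: got {answer!r}, expected one of {pair} or 'Tie'")
            problems.append(i)
    return problems

# toy data: the second answer is not a valid outcome
toy_matchups = [("Ron Johnson", "Chris Van Hollen"), ("Mike Braun", "Mike Enzi")]
toy_answers = ["Chris Van Hollen", "Both senators"]
bad_rows = check_extractions(toy_matchups, toy_answers)
```

In the notebook this would be called as `check_extractions(matchups, extraction_text)`, and any index it prints is a row to fix by hand.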

This step will make the final dataframe with the resultant matchups.

In [137]:
def make_final_df(matchups, chatgpt_answers, final_answers, comparison_direction):
    sen1 = [j[0] for j in matchups]
    sen2 = [j[1] for j in matchups]

    matchup_results = pd.DataFrame({'matchup': matchups,
                                    'senator1': sen1,
                                    'senator2': sen2,
                                    'chatgpt_response': chatgpt_answers,
                                    'final_answers': final_answers,
                                    'comparison_direction': comparison_direction})

    # a "win" (1.0) goes to the senator judged more conservative; ties split 0.5 / 0.5
    sen1_win = []
    sen2_win = []

    for i in range(len(matchup_results['matchup'])):
        if matchup_results['comparison_direction'][i]=='liberal':
            if matchup_results['final_answers'][i]==matchup_results['senator1'][i]:
                sen1_win.append(0.0)
                sen2_win.append(1.0)
            elif matchup_results['final_answers'][i]==matchup_results['senator2'][i]:
                sen1_win.append(1.0)
                sen2_win.append(0.0)
            elif matchup_results['final_answers'][i]=='Tie':
                sen1_win.append(0.5)
                sen2_win.append(0.5)
            else:
                # answer matches neither senator nor 'Tie': flag it and keep the columns aligned
                print(str(i) + ' is a defective outcome')
                sen1_win.append(np.nan)
                sen2_win.append(np.nan)
        elif matchup_results['comparison_direction'][i]=='conservative':
            if matchup_results['final_answers'][i]==matchup_results['senator1'][i]:
                sen1_win.append(1.0)
                sen2_win.append(0.0)
            elif matchup_results['final_answers'][i]==matchup_results['senator2'][i]:
                sen1_win.append(0.0)
                sen2_win.append(1.0)
            elif matchup_results['final_answers'][i]=='Tie':
                sen1_win.append(0.5)
                sen2_win.append(0.5)
            else:
                print(str(i) + ' is a defective outcome')
                sen1_win.append(np.nan)
                sen2_win.append(np.nan)
        else:
            print(str(i) + ' is a defective outcome')
            sen1_win.append(np.nan)
            sen2_win.append(np.nan)

    matchup_results['win1'] = sen1_win
    matchup_results['win2'] = sen2_win

    return matchup_results

In [138]:
final_df = make_final_df(matchups=matchups,
                         chatgpt_answers=comparisons_text,
                         final_answers=extraction_text,
                         comparison_direction=comparison_direction)

In [139]:
final_df

Unnamed: 0,matchup,senator1,senator2,chatgpt_response,final_answers,comparison_direction,win1,win2
0,"(Ron Johnson, Chris Van Hollen)",Ron Johnson,Chris Van Hollen,As of my last knowledge update in September 20...,Chris Van Hollen,liberal,1.0,0.0
1,"(Shelley Moore Capito, Bob Casey)",Shelley Moore Capito,Bob Casey,As of my last knowledge update in September 20...,Bob Casey,liberal,1.0,0.0
2,"(Mike Braun, Lisa Murkowski)",Mike Braun,Lisa Murkowski,As of my last knowledge update in September 20...,Mike Braun,conservative,1.0,0.0
3,"(Lisa Murkowski, Mitch McConnell)",Lisa Murkowski,Mitch McConnell,As of my last knowledge update in September 20...,Mitch McConnell,conservative,0.0,1.0
4,"(Chris Van Hollen, Mike Enzi)",Chris Van Hollen,Mike Enzi,"Chris Van Hollen, a Democrat from Maryland, is...",Chris Van Hollen,liberal,0.0,1.0
5,"(Mike Braun, Mike Enzi)",Mike Braun,Mike Enzi,As of my last knowledge update in September 20...,Tie,conservative,0.5,0.5
6,"(Shelley Moore Capito, Ben Sasse)",Shelley Moore Capito,Ben Sasse,As of my last knowledge update in September 20...,Ben Sasse,conservative,0.0,1.0
7,"(Ben Sasse, Jeff Merkley)",Ben Sasse,Jeff Merkley,As of my last knowledge update in September 20...,Jeff Merkley,liberal,1.0,0.0
8,"(Ron Johnson, Bob Casey)",Ron Johnson,Bob Casey,As of my last knowledge update in September 20...,Bob Casey,liberal,1.0,0.0
9,"(Bob Casey, Mitch McConnell)",Bob Casey,Mitch McConnell,"Bob Casey, a Democrat from Pennsylvania, is ge...",Bob Casey,liberal,0.0,1.0


From here, we would just need to move to R to fit the Bradley-Terry model. Happy to share code for this as well, but that would be too much for today!
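For readers who want to stay in Python, the Bradley-Terry scores can also be approximated with a few lines of NumPy. This is a rough sketch, not the R code mentioned above: it uses the standard iterative (minorization-maximization) update, counts a tie as half a win for each side, and adds a small pseudo-count (`prior`, an assumption of this sketch) so that a senator with no wins still gets a finite score. The function name and the toy data are hypothetical.

```python
import numpy as np

def fit_bradley_terry(pairs, wins_first, n_iter=200, prior=0.1):
    """pairs: list of (a, b); wins_first: share won by a in each pair (1.0, 0.5, or 0.0)."""
    items = sorted({x for pair in pairs for x in pair})
    idx = {name: k for k, name in enumerate(items)}
    n = len(items)
    wins = np.zeros((n, n))  # wins[i, j] = (fractional) wins of i over j
    for (a, b), w in zip(pairs, wins_first):
        wins[idx[a], idx[b]] += w
        wins[idx[b], idx[a]] += 1.0 - w
    wins += prior * (1 - np.eye(n))  # pseudo-count on every off-diagonal pair
    games = wins + wins.T            # comparisons played between i and j
    strength = np.ones(n)
    for _ in range(n_iter):
        denom = (games / (strength[:, None] + strength[None, :])).sum(axis=1)
        strength = wins.sum(axis=1) / denom
        strength /= strength.sum()   # fix the scale, which is otherwise arbitrary
    return dict(zip(items, np.log(strength)))

# toy data: A beats B twice, B beats C, A beats C
toy_pairs = [("A", "B"), ("A", "B"), ("B", "C"), ("A", "C")]
toy_wins = [1.0, 1.0, 1.0, 1.0]
scores = fit_bradley_terry(toy_pairs, toy_wins)
print(sorted(scores, key=scores.get, reverse=True))
```

With `final_df`, the inputs would be `list(final_df["matchup"])` and `list(final_df["win1"])`. Because a win here means "more conservative", higher scores correspond to more conservative senators.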

## Generating survey responses

In [141]:
# Function to interact with the ChatGPT API
def survey_chatGPT(profile, prompt, api_key):
    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }

    data = {
        "model": "gpt-3.5-turbo-0301",
        "temperature": 0.2,
        "messages": [{"role": "system", 
                      "content": profile}, 
                    {"role":"user", 
                    "content":prompt}]
    }

    response = requests.post("https://api.openai.com/v1/chat/completions", 
                             json=data, 
                             headers=headers, timeout=5)
    
    response_json = response.json()
    return response_json

In [155]:
def gen_profile(age, race, gender, educ, inc, pid):
    profile = "You are a " + str(age) + " year old " + race + " "+ gender + " with a " + educ + ", earning $" + inc + " per year . " + "You are a registered " + pid + " living in the USA in 2019. "
    return profile

In [156]:
# test one profile
profile = gen_profile(18, "latino", "female", "post-graduate", "100,000", "Democrat")
profile

'You are a 18 year old latino female with a post-graduate, earning $100,000 per year . You are a registered Democrat living in the USA in 2019. '

In [157]:
prompt = "Provide responses from this person's perspective.\n"\
         "Use only knowledge about politics that they would have.\n"\
         "Format the output as a csv table with the following format:\n"\
         "group, thermometer\n"\
         "The following questions ask about individuals' feelings "\
         "toward different groups.\n"\
         "Responses should be given on a scale from 0 (meaning cold "\
         "feelings) to 100 (meaning warm feelings).\n"\
         "Ratings between 50 degrees and 100 degrees mean that\n"\
         "you feel favorable and warm toward the group. Ratings "\
         "between 0\n"\
         "degrees and 50 degrees mean that you don't feel "\
         "favorable toward\n"\
         "the group and that you don't care too much for that "\
         "group. You\n"\
         "would rate the group at the 50 degree mark if you don't feel\n"\
         "particularly warm or cold toward the group.\n"\
         "How do you feel toward the following groups?\n"\
         "The Democratic Party?\n"\
         "The Republican Party?\n"\
         "Democrats?\n"\
         "Republicans?\n"\
         "Black Americans?\n"\
         "White Americans?\n"\
         "Hispanic Americans?\n"\
         "Asian Americans?\n"\
         "Muslims?\n"\
         "Christians?\n"\
         "Immigrants?\n"\
         "Gays and Lesbians?\n"\
         "Jews?\n"\
         "Liberals?\n"\
         "Conservatives?\n"\
         "Women?\n"

In [158]:
# get an output
output = survey_chatGPT(profile, prompt, gpt_key)
print(output)

{'id': 'chatcmpl-9F76v1GZFz2rp6CIzbPAuaRmgddWp', 'object': 'chat.completion', 'created': 1713389281, 'model': 'gpt-3.5-turbo-0301', 'choices': [{'index': 0, 'message': {'role': 'assistant', 'content': 'group,thermometer\nThe Democratic Party,90\nThe Republican Party,20\nDemocrats,85\nRepublicans,25\nBlack Americans,90\nWhite Americans,70\nHispanic Americans,85\nAsian Americans,80\nMuslims,70\nChristians,60\nImmigrants,90\nGays and Lesbians,90\nJews,80\nLiberals,90\nConservatives,30\nWomen,95'}, 'logprobs': None, 'finish_reason': 'stop'}], 'usage': {'prompt_tokens': 262, 'completion_tokens': 89, 'total_tokens': 351}, 'system_fingerprint': None}


In [159]:
output

{'id': 'chatcmpl-9F76v1GZFz2rp6CIzbPAuaRmgddWp',
 'object': 'chat.completion',
 'created': 1713389281,
 'model': 'gpt-3.5-turbo-0301',
 'choices': [{'index': 0,
   'message': {'role': 'assistant',
    'content': 'group,thermometer\nThe Democratic Party,90\nThe Republican Party,20\nDemocrats,85\nRepublicans,25\nBlack Americans,90\nWhite Americans,70\nHispanic Americans,85\nAsian Americans,80\nMuslims,70\nChristians,60\nImmigrants,90\nGays and Lesbians,90\nJews,80\nLiberals,90\nConservatives,30\nWomen,95'},
   'logprobs': None,
   'finish_reason': 'stop'}],
 'usage': {'prompt_tokens': 262, 'completion_tokens': 89, 'total_tokens': 351},
 'system_fingerprint': None}
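
Before parsing the reply, it can help to check that the call actually succeeded. The sketch below is one way to pull the content out of a response shaped like the one above. The helper name `extract_reply` is hypothetical, and the assumption that failed requests come back as a JSON body with an `error` key should be verified against your own responses.

```python
def extract_reply(output):
    """Return (content, total_tokens) from a chat-completion response dict.

    Hypothetical helper: assumes the response layout printed above.
    """
    # Assumption: failed requests carry an "error" object instead of "choices".
    if "error" in output:
        raise RuntimeError(f"API error: {output['error']}")

    choice = output["choices"][0]

    # A finish_reason other than "stop" (e.g. "length") signals a truncated reply.
    if choice.get("finish_reason") != "stop":
        raise ValueError(f"Incomplete reply: {choice.get('finish_reason')}")

    content = choice["message"]["content"]
    total_tokens = output.get("usage", {}).get("total_tokens")
    return content, total_tokens


# Trimmed copy of the response shown above, used only for illustration.
example_output = {
    "choices": [{"index": 0,
                 "message": {"role": "assistant",
                             "content": "group,thermometer\nThe Democratic Party,90"},
                 "finish_reason": "stop"}],
    "usage": {"prompt_tokens": 262, "completion_tokens": 89, "total_tokens": 351},
}

content, total_tokens = extract_reply(example_output)
print(total_tokens)
```

Summing `total_tokens` across many calls gives a rough sense of how much of the paid quota a full survey run consumes.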

In [161]:
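import pandas as pd  # harmless if already imported earlier in the notebook

# get response
response = output["choices"][0]["message"]["content"]

# clean: one row per non-empty line, split into [group, thermometer]
lines = response.strip().split("\n")
data = [line.split(",") for line in lines if line.strip()]

# build data frame (the first row holds the column names)
pd.DataFrame(data[1:], columns=data[0])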

|   | group | thermometer |
|---|---|---|
| 0 | The Democratic Party | 90 |
| 1 | The Republican Party | 20 |
| 2 | Democrats | 85 |
| 3 | Republicans | 25 |
| 4 | Black Americans | 90 |
| 5 | White Americans | 70 |
| 6 | Hispanic Americans | 85 |
| 7 | Asian Americans | 80 |
| 8 | Muslims | 70 |
| 9 | Christians | 60 |
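
Splitting on commas works for this reply, but it can break as soon as the model adds a sentence before the table, quotes a field, or leaves a blank line. A slightly more defensive sketch using Python's standard `csv` module is below. The function name `parse_thermometer` and the rule of keeping only rows with exactly two fields and an integer score are my assumptions, not part of the original pipeline.

```python
import csv
import io


def parse_thermometer(response):
    """Parse a 'group,thermometer' CSV reply into a list of dicts."""
    rows = []
    for fields in csv.reader(io.StringIO(response), skipinitialspace=True):
        # keep only well-formed rows: exactly two fields, second is an integer
        if len(fields) != 2:
            continue
        group, score = fields[0].strip(), fields[1].strip()
        if not score.lstrip("-").isdigit():
            continue  # skips the header row and any stray text
        rows.append({"group": group, "thermometer": int(score)})
    return rows


reply = "group,thermometer\nThe Democratic Party,90\nThe Republican Party,20\n\nWomen,95"
parsed = parse_thermometer(reply)
print(parsed)
```

Calling `pd.DataFrame(parsed)` then gives a two-column data frame with `thermometer` already stored as integers.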


# Using Meta's Llama 2

Instead of using OpenAI models, we can (and I think we should) work with open-source LLMs, such as Llama 2 from Meta. You can use the model after requesting access to it on Hugging Face.

I suggest running this on Google Colab with a GPU. Let's just see a simple example here.

In [18]:
#!pip install transformers
#!huggingface-cli login


In [11]:
from huggingface_hub import notebook_login
notebook_login()


In [1]:
!pip install accelerate



In [2]:
from transformers import AutoTokenizer
import transformers
import torch

model = "meta-llama/Llama-2-7b-chat-hf" # Calling the smallest model of 7 billion parameters
tokenizer = AutoTokenizer.from_pretrained(model)

# First time, download will take a bit (depending on connection). File is around 13GB.
pipeline = transformers.pipeline(
    "text-generation",
    model=model,
    torch_dtype=torch.float16,
    device_map="mps")  # "mps" targets Apple Silicon Macs; on a Colab GPU, use "auto" instead


In [5]:
sequences = pipeline(
    "Is the sentiment of this text positive, neutral, or negative? \
    Answer only with a number: 1 if positive, 0 if neutral, and -1 if negative. \
    Here is the text: Oh, Pat, you know it is about President Obama naming the Supreme Court Justice.\
    Don't drag your feet on this, and don't be an obstacle to yet another proposal by the President.  \
    You are way to old and removed from Kansas to be a voice for us. ",
    do_sample=True, # sample from the token distribution instead of always taking the most likely token
    top_k=1, # keep only the k most likely tokens before sampling; with k=1, sampling reduces to greedy decoding
    num_return_sequences=1,
    eos_token_id=tokenizer.eos_token_id,
    max_length=250, # maximum total length in tokens (prompt + generated text)
    return_full_text = False,
)

for seq in sequences:
    print(f"Result: {seq['generated_text']}")

Result:  You are not a voice for the people of Kansas.      You are a voice for the people of Washington, D.C. and New York City.      You are not a voice for the people of Kansas.      You are a voice for the people of the United States.      You are not a voice for the people of the United States.      You are a voice for the people of the world.      You are not a voice for the people of the world.      You are a voice for the people of the universe.      You are not a voice for the people of the universe.      You are a voice for the people of the cosmos.      You are not a voice for the people of the cosmos
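
The model continues the text instead of answering the question. One likely reason is that a chat-tuned model received a raw string rather than its chat format: Llama 2 chat models were trained with instructions wrapped in `[INST] ... [/INST]` tags and an optional `<<SYS>>` system block. The sketch below builds such a prompt by hand and pulls a label out of the generated text. `build_llama2_prompt` and `parse_label` are hypothetical helper names, and whether this changes the output above has not been tested here.

```python
import re


def build_llama2_prompt(user_message, system_message=None):
    """Wrap a message in the Llama 2 chat format (the tokenizer adds <s>)."""
    if system_message:
        user_message = f"<<SYS>>\n{system_message}\n<</SYS>>\n\n{user_message}"
    return f"[INST] {user_message.strip()} [/INST]"


def parse_label(generated_text):
    """Return the first standalone -1, 0, or 1 in the model output, else None."""
    match = re.search(r"-?\b[01]\b", generated_text)
    return int(match.group()) if match else None


text = "Oh, Pat, you know it is about President Obama naming the Supreme Court Justice."
prompt = build_llama2_prompt(
    f"Here is the text: {text}",
    system_message="Classify the sentiment of the text. Answer only with a number: "
                   "1 if positive, 0 if neutral, and -1 if negative.",
)
print(prompt)

# With the pipeline from the cell above loaded, the call would look like:
# sequences = pipeline(prompt, max_new_tokens=5, return_full_text=False)
# label = parse_label(sequences[0]["generated_text"])
```

Recent versions of `transformers` also offer `tokenizer.apply_chat_template`, which builds the same kind of prompt from a list of role/content messages and may be the safer choice if your installed version supports it.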
