ASL, v.130323

# Introduction

In this notebook, we use ChatGPT via the OpenAI API to classify the MedWeb tweets studied in the previous notebooks (`DAT255-NLP-2.0-MedTweets-fastai-ULMFiT.ipynb` and `DAT255-NLP-3.0-MedTweets-transformers.ipynb`).

# Setup

In [38]:
import numpy as np, pandas as pd
from pathlib import Path

In [39]:
import openai

Load the API key (you can get one at https://platform.openai.com/). The secret key is stored in an environment file. Note: make sure to add this file to `.gitignore` so that the key is never committed. We load it using `python-dotenv`: https://pypi.org/project/python-dotenv/

In [40]:
NB_DIR = Path.cwd()

In [41]:
dotenv_file = NB_DIR/'.env'

In [42]:
import dotenv
dotenv.load_dotenv(dotenv_file)

True
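
The `True` returned above does not by itself confirm that the specific variable we need is defined. A minimal sketch of a check that avoids printing the secret; the helper name `has_api_key` is hypothetical, and `OPENAI_API_KEY` is the variable name this notebook reads further down:

```python
import os

def has_api_key(name="OPENAI_API_KEY", env=None):
    """Return True if `name` is set to a non-empty value, without revealing it."""
    # `env` defaults to the process environment; a plain dict can be passed for testing.
    env = os.environ if env is None else env
    return bool(env.get(name, "").strip())

print("API key found:", has_api_key())
```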

# Load and prepare the data

In [43]:
df = pd.read_csv('https://github.com/HVL-ML/DAT255/raw/main/3-NLP/data/medwebdata.csv')
df.head()

Unnamed: 0,ID,Tweet,Influenza,Diarrhea,Hayfever,Cough,Headache,Fever,Runnynose,Cold,labels,is_test
0,1en,The cold makes my whole body weak.,0,0,0,0,0,0,0,1,Cold,False
1,2en,It's been a while since I've had allergy sympt...,0,0,1,0,0,0,1,0,Hayfever;Runnynose,False
2,3en,I'm so feverish and out of it because of my al...,0,0,1,0,0,1,1,0,Hayfever;Fever;Runnynose,False
3,4en,"I took some medicine for my runny nose, but it...",0,0,0,0,0,0,1,0,Runnynose,False
4,5en,I had a bad case of diarrhea when I traveled t...,0,0,0,0,0,0,0,0,sober,False


In [44]:
df.drop(['labels'], axis=1, inplace=True)

In [45]:
df['labels'] = df.apply(lambda x: [x[c] for c in df.columns[2:-1]], axis=1)

In [46]:
df.head()

Unnamed: 0,ID,Tweet,Influenza,Diarrhea,Hayfever,Cough,Headache,Fever,Runnynose,Cold,is_test,labels
0,1en,The cold makes my whole body weak.,0,0,0,0,0,0,0,1,False,"[0, 0, 0, 0, 0, 0, 0, 1]"
1,2en,It's been a while since I've had allergy sympt...,0,0,1,0,0,0,1,0,False,"[0, 0, 1, 0, 0, 0, 1, 0]"
2,3en,I'm so feverish and out of it because of my al...,0,0,1,0,0,1,1,0,False,"[0, 0, 1, 0, 0, 1, 1, 0]"
3,4en,"I took some medicine for my runny nose, but it...",0,0,0,0,0,0,1,0,False,"[0, 0, 0, 0, 0, 0, 1, 0]"
4,5en,I had a bad case of diarrhea when I traveled t...,0,0,0,0,0,0,0,0,False,"[0, 0, 0, 0, 0, 0, 0, 0]"
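
Each entry in `labels` is a vector of eight 0/1 flags in the column order Influenza to Cold. To make such vectors easier to read, they can be mapped back to label names. A small sketch, where `decode_labels` is a hypothetical helper that is not used elsewhere in this notebook:

```python
LABEL_NAMES = ['Influenza', 'Diarrhea', 'Hayfever', 'Cough',
               'Headache', 'Fever', 'Runnynose', 'Cold']

def decode_labels(vector, names=LABEL_NAMES):
    """Return the names whose flag is 1 in a 0/1 label vector."""
    if len(vector) != len(names):
        raise ValueError(f"expected {len(names)} flags, got {len(vector)}")
    return [name for name, flag in zip(names, vector) if flag == 1]

print(decode_labels([0, 0, 1, 0, 0, 0, 1, 0]))  # ['Hayfever', 'Runnynose']
```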


# Set up the ChatGPT prompt

We use the instructions given to the annotators who labeled the data as part of our prompt. See Wakamiya et al., _Tweet Classification Toward Twitter-Based Disease Surveillance: New Data, Methods, and Evaluations_, 2019: https://www.jmir.org/2019/2/e12783/

In [47]:
init_prompt = """
When I write a set of short text separated by semicolons ";", I want you to return whether or not 
each of the text deals with one or more of the terms, and if so, which ones. 
['Influenza', 'Diarrhea', 'Hayfever', 'Cough', 'Headache', 'Fever', 'Runnynose', 'Cold']. 

You should base your response on the following three criteria:

Factuality: The user who wrote the text (or someone close to the user) should be affected by a certain disease or 
have a symptom of the disease. A text that includes only the name of a disease or a symptom as a topic 
is removed by labeling it as 0 (negative).

Tense (time): Older information, which is meaningless from the viewpoint of surveillance, should be 
discarded. Such information should also be labeled as 0 (negative). Here, we regard 24 hours as the 
standard condition. When the precise date and time are ambiguous, the general guideline is that 
information within 24 hours (eg, information related to the present day or previous day) is labeled as 
1 (positive).

Location: The location of the disease should be specified as follows. If a the user who wrote the text 
is affected, the information is labeled as 1 (positive) because the location of the user is the place of 
onset of the symptom. In cases where the user is not personally affected, the information is labeled 
as 1 (positive) if it is within the same vicinity as that of the user, and as 0 (negative) otherwise.

Respond only with the terms that the texts I provide deal with, with each set of terms in a 
a comma-separated list inside brackets, with the presence or absence of the terms coded by 0 and 1. 
Separate the responses for each text with a semicolon ";".

For example, when I write, 
"I'm feeling really bad. My head hurts. My nose is runny. I've felt like this for days.";
"I'm so feverish and out of it because of my allergies. I'm so sleepy.";
"The cold makes my whole body weak.";
"I had a bad case of diarrhea when I traveled to Nepal.";
"I took some medicine for my runny nose, but it won't stop.";
"My phlegm has blood in it and it's really gross.";
"They say we will have less pollen next spring, but it doesn't really matter to me, since my allergy gets severe in the autumn."

 you should respond with 

[0,0,0,0,1,0,1,0];[0,0,1,0,0,1,1,0];[0,0,0,0,0,0,0,1];
[0,0,0,0,0,0,0,0];[0,0,0,0,0,0,1,0];[0,0,0,1,0,0,0,0];
[0,0,0,0,0,0,0,0]"

Here are my texts: 
"""

In [48]:
init_tweets = """
"It takes a millennial wimp to call in sick just because they're coughing. Its always important to go to work, no matter what.";
"I'm not going today, because my stuffy nose is killing me.";
"I never thought I would have allergies.";
"I have a fever but I don't think it's the kind of cold that will make it to my stomach."
"""

In [49]:
init_tweets_ai_response = """
[0,0,0,1,0,0,0,0];[0,0,0,0,0,0,1,0];[0,0,1,0,0,0,1,0];[0,0,0,0,0,1,0,1]
"""

# Create API request

## Test

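Let's try it out on two tweets. Note that both also appear among the worked examples in `init_prompt`, so this is a check of the output format rather than of generalization: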

In [50]:
new_tweets = """
"My phlegm has blood in it and it's really gross.";
"They say we will have less pollen next spring, but it doesn't really matter to me, since my allergy gets severe in the autumn."
"""

In [51]:
new_tweets_correct_responses = "[0,0,0,1,0,0,0,0];[0,0,0,0,0,0,0,0]"
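
The test tweets above are typed by hand as quoted strings separated by semicolons. When the tweets come from a list or a DataFrame column, the same layout can be generated. A minimal sketch, where `format_tweets` is a hypothetical helper name and embedded double quotes are left untouched as a simplifying assumption:

```python
def format_tweets(tweets):
    """Wrap each tweet in double quotes and separate them with ';' and a newline."""
    return ";\n".join(f'"{t}"' for t in tweets)

example = format_tweets([
    "My phlegm has blood in it and it's really gross.",
    "I never thought I would have allergies.",
])
print(example)
```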

In [53]:
messages=[
        {"role": "user", "content": init_prompt},
        {"role": "user", "content": init_tweets},
        {"role": "assistant", "content": init_tweets_ai_response},
        {"role": "user", "content": new_tweets}]


In [54]:
messages

[{'role': 'user',
  'content': '\nWhen I write a set of short text separated by semicolons ";", I want you to return whether or not \neach of the text deals with one or more of the terms, and if so, which ones. \n[\'Influenza\', \'Diarrhea\', \'Hayfever\', \'Cough\', \'Headache\', \'Fever\', \'Runnynose\', \'Cold\']. \n\nYou should base your response on the following three criteria:\n\nFactuality: The user who wrote the text (or someone close to the user) should be affected by a certain disease or \nhave a symptom of the disease. A text that includes only the name of a disease or a symptom as a topic \nis removed by labeling it as 0 (negative).\n\nTense (time): Older information, which is meaningless from the viewpoint of surveillance, should be \ndiscarded. Such information should also be labeled as 0 (negative). Here, we regard 24 hours as the \nstandard condition. When the precise date and time are ambiguous, the general guideline is that \ninformation within 24 hours (eg, informati

In [55]:
import os
openai.api_key = os.getenv("OPENAI_API_KEY")

model = "gpt-3.5-turbo"

In [56]:
temperature=0.3

In [57]:
response = openai.ChatCompletion.create(
  model=model,
  messages=messages,
  temperature=temperature,
  max_tokens=150,
  top_p=1,
  frequency_penalty=0.0,
  presence_penalty=0.6
)

In [58]:
response

<OpenAIObject chat.completion id=chatcmpl-6tbqNlYAHq6E8BQqTRTIQqMp29qYR at 0x7f792012e4d0> JSON: {
  "choices": [
    {
      "finish_reason": "stop",
      "index": 0,
      "message": {
        "content": "[0,0,0,1,0,0,0,0];[0,0,1,0,0,0,0,0]",
        "role": "assistant"
      }
    }
  ],
  "created": 1678711531,
  "id": "chatcmpl-6tbqNlYAHq6E8BQqTRTIQqMp29qYR",
  "model": "gpt-3.5-turbo-0301",
  "object": "chat.completion",
  "usage": {
    "completion_tokens": 36,
    "prompt_tokens": 870,
    "total_tokens": 906
  }
}
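
The fields of interest in the response are the message content, the finish reason and the token usage. The sketch below pulls them out of a plain dictionary that mirrors the JSON shown above; `summarize_response` is a hypothetical helper, and treating a finish reason of `"length"` as a sign that the answer was cut off by `max_tokens` is how we read the API documentation:

```python
# A plain dict shaped like the JSON printed above (fields not needed here are omitted).
response_dict = {
    "choices": [{
        "finish_reason": "stop",
        "index": 0,
        "message": {"content": "[0,0,0,1,0,0,0,0];[0,0,1,0,0,0,0,0]",
                    "role": "assistant"},
    }],
    "usage": {"completion_tokens": 36, "prompt_tokens": 870, "total_tokens": 906},
}

def summarize_response(resp):
    """Extract the answer text, truncation flag and token count from a response dict."""
    choice = resp["choices"][0]
    return {
        "content": choice["message"]["content"],
        "truncated": choice["finish_reason"] == "length",
        "total_tokens": resp["usage"]["total_tokens"],
    }

print(summarize_response(response_dict))
```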

In [59]:
new_tweets_correct_responses

'[0,0,0,1,0,0,0,0];[0,0,0,0,0,0,0,0]'
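
Comparing the two strings by eye shows that the first tweet is classified correctly, while the second is flagged as Hayfever (index 2) although the expected answer is all zeros. The same comparison can be done per tweet in code. A small sketch with a hypothetical helper `compare_responses`:

```python
import ast

def compare_responses(predicted, correct):
    """Return one boolean per tweet: does the predicted vector equal the correct one?"""
    pred = [ast.literal_eval(p.strip()) for p in predicted.split(";")]
    gold = [ast.literal_eval(c.strip()) for c in correct.split(";")]
    return [p == g for p, g in zip(pred, gold)]

predicted = "[0,0,0,1,0,0,0,0];[0,0,1,0,0,0,0,0]"  # model output shown above
correct = "[0,0,0,1,0,0,0,0];[0,0,0,0,0,0,0,0]"    # new_tweets_correct_responses
print(compare_responses(predicted, correct))  # [True, False]
```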

## Function to get predictions

In [60]:
import ast

In [61]:
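def get_prediction(tweets):
    """Classify a batch of tweets; return one prediction string per tweet,
    e.g. '[0,0,0,1,0,0,0,0]'."""

    default_prediction = '[0,0,0,0,0,0,0,0]'

    messages=[
        {"role": "user", "content": init_prompt},
        {"role": "user", "content": init_tweets},
        {"role": "assistant", "content": init_tweets_ai_response},
        {"role": "user", "content": tweets}]

    # Note: the temperature is fixed at 0.1 here, independent of the
    # `temperature` variable used in the test above.
    response = openai.ChatCompletion.create(
              model=model,
              messages=messages,
              temperature=0.1,
              max_tokens=150,
              top_p=1,
              frequency_penalty=0.0,
              presence_penalty=0.6,
            )

    predictions = response.get("choices")[0].get("message").get("content")

    # Split into one string per tweet and remove surrounding whitespace.
    predictions = [p.strip() for p in predictions.split(";")]

    return_predictions = []

    # As we cannot guarantee that ChatGPT sticks to our instructions,
    # we make a simple test of whether the predictions returned
    # can be evaluated as lists. For predictions where this is
    # not the case, we use the default prediction of all zeros.

    for pred in predictions:
        try:
            is_list = isinstance(ast.literal_eval(pred), list)
        except (ValueError, SyntaxError):
            is_list = False
        return_predictions.append(pred if is_list else default_prediction)

    # Return the validated predictions (still as strings), not the raw split.
    return return_predictions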
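
Splitting on `;` assumes the model returns nothing but the bracketed lists. If the answer contains line breaks, extra spaces or a sentence of explanation, a regular expression that picks out the bracketed parts may be more forgiving. The following is a sketch of that alternative, not what the function above does; `extract_label_vectors` and `N_LABELS` are hypothetical names:

```python
import ast
import re

N_LABELS = 8

def extract_label_vectors(text, n_labels=N_LABELS):
    """Find every [...] group in `text` and return it as a list of 0/1 flags.

    Groups that cannot be parsed, or that are not exactly `n_labels`
    flags long, are replaced by an all-zero vector.
    """
    vectors = []
    for match in re.findall(r"\[[^\[\]]*\]", text):
        try:
            parsed = ast.literal_eval(match)
        except (ValueError, SyntaxError):
            parsed = None
        ok = (isinstance(parsed, list)
              and len(parsed) == n_labels
              and all(v in (0, 1) for v in parsed))
        vectors.append(parsed if ok else [0] * n_labels)
    return vectors

print(extract_label_vectors("Sure!\n[0,0,0,1,0,0,0,0];\n[0, 0, 1]"))
```

Unlike `get_prediction`, this sketch returns parsed lists rather than strings, so the calling code would need to be adapted before swapping it in.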

In [62]:
test_preds = get_prediction(new_tweets)

In [63]:
test_preds

['[0,0,0,1,0,0,0,0]', '[0,0,1,0,0,0,0,0]']

# Get predictions for test data

In [64]:
test_df = df.loc[df.is_test]

In [65]:
test_df.head()

Unnamed: 0,ID,Tweet,Influenza,Diarrhea,Hayfever,Cough,Headache,Fever,Runnynose,Cold,is_test,labels
1920,1921en,I went on a trip and got the flu as a souvenir.,1,0,0,0,0,1,0,0,True,"[1, 0, 0, 0, 0, 1, 0, 0]"
1921,1922en,Difficult bosses are one kind of headache,0,0,0,0,0,0,0,0,True,"[0, 0, 0, 0, 0, 0, 0, 0]"
1922,1923en,I'm dying and need someone to translate for me...,0,0,0,0,0,0,0,0,True,"[0, 0, 0, 0, 0, 0, 0, 0]"
1923,1924en,Flu crisis.,1,0,0,0,0,1,0,0,True,"[1, 0, 0, 0, 0, 1, 0, 0]"
1924,1925en,"I have a horribly stuffy nose, there's no way ...",0,0,0,0,0,0,1,0,True,"[0, 0, 0, 0, 0, 0, 1, 0]"


In [66]:
len(test_df)

640

We will feed five tweets at a time to the model and collect the results. 

In [67]:
n=5

In [68]:
chunked_test_df = [test_df[i:i+n] for i in range(0,len(test_df),n)]

In [69]:
len(chunked_test_df)

128
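
With 640 test tweets and `n=5` the split is even, giving 128 chunks of five. In general the last chunk can be shorter, which matters for any code that assumes a fixed chunk size. A small sketch of the same slicing pattern on a plain list, with a hypothetical helper `chunk`:

```python
def chunk(items, size):
    """Split `items` into consecutive pieces of at most `size` elements."""
    if size < 1:
        raise ValueError("size must be at least 1")
    return [items[i:i + size] for i in range(0, len(items), size)]

chunks = chunk(list(range(12)), 5)
print([len(c) for c in chunks])  # [5, 5, 2]
```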

In [32]:
chunked_test_df_with_preds = []

for test_df_chunk in chunked_test_df:
    test_df_chunk_with_preds = test_df_chunk.copy()
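    # Wrap every tweet in double quotes (including the first and last)
    # and separate them with semicolons, as in the prompt examples.
    tweets = ';\n'.join(f'"{t}"' for t in test_df_chunk.Tweet)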
    preds = get_prediction(tweets)
    
    # Check that we got back one prediction per tweet.
    # If not, append zero predictions (or drop the
    # surplus predictions) to match the chunk size.
    expected = len(test_df_chunk)
    k = len(preds)
    if k < expected:
        preds.extend(['[0,0,0,0,0,0,0,0]'] * (expected - k))
    elif k > expected:
        preds = preds[:expected]
            
    preds_l = [ast.literal_eval(p) for p in preds]
    test_df_chunk_with_preds['preds'] = preds_l
    chunked_test_df_with_preds.append(test_df_chunk_with_preds)

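RateLimitError: The server had an error while processing your request. Sorry about that!

The loop was interrupted by this error after 20 of the 128 chunks, so the results below cover only the first 100 test tweets.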
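
One way to make a long loop survive transient errors like the one above is to retry the failing call with an increasing delay. The sketch below is a generic wrapper and not part of the original run; `call_with_retries` and its parameters are hypothetical, and the `sleep` argument exists only so that the waiting can be replaced in a demonstration:

```python
import time

def call_with_retries(func, max_attempts=5, base_delay=1.0,
                      retry_on=(Exception,), sleep=time.sleep):
    """Call `func()` and retry with exponential backoff on the given exceptions."""
    for attempt in range(1, max_attempts + 1):
        try:
            return func()
        except retry_on:
            if attempt == max_attempts:
                raise  # give up and surface the last error
            # Wait 1x, 2x, 4x, ... the base delay before the next attempt.
            sleep(base_delay * 2 ** (attempt - 1))

# Demonstration with a function that fails twice before succeeding.
# The delays are recorded instead of actually waiting.
attempts = {"n": 0}

def flaky():
    attempts["n"] += 1
    if attempts["n"] < 3:
        raise RuntimeError("temporary failure")
    return "ok"

recorded_delays = []
print(call_with_retries(flaky, sleep=recorded_delays.append), recorded_delays)
```

In the loop above, the call could then be written as `call_with_retries(lambda: get_prediction(tweets), retry_on=(openai.error.RateLimitError,))`, assuming the pre-1.0 `openai` package used in this notebook, where that exception class lives in `openai.error`.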

In [70]:
len(chunked_test_df_with_preds)

20

In [71]:
test_df_with_preds= pd.concat(chunked_test_df_with_preds)
test_df_with_preds.head()

Unnamed: 0,ID,Tweet,Influenza,Diarrhea,Hayfever,Cough,Headache,Fever,Runnynose,Cold,is_test,labels,preds
1920,1921en,I went on a trip and got the flu as a souvenir.,1,0,0,0,0,1,0,0,True,"[1, 0, 0, 0, 0, 1, 0, 0]","[0, 0, 0, 0, 1, 0, 0, 0]"
1921,1922en,Difficult bosses are one kind of headache,0,0,0,0,0,0,0,0,True,"[0, 0, 0, 0, 0, 0, 0, 0]","[0, 0, 0, 0, 0, 1, 0, 0]"
1922,1923en,I'm dying and need someone to translate for me...,0,0,0,0,0,0,0,0,True,"[0, 0, 0, 0, 0, 0, 0, 0]","[0, 0, 0, 0, 0, 0, 0, 1]"
1923,1924en,Flu crisis.,1,0,0,0,0,1,0,0,True,"[1, 0, 0, 0, 0, 1, 0, 0]","[0, 0, 0, 1, 0, 0, 1, 0]"
1924,1925en,"I have a horribly stuffy nose, there's no way ...",0,0,0,0,0,0,1,0,True,"[0, 0, 0, 0, 0, 0, 1, 0]","[0, 0, 0, 0, 0, 0, 1, 0]"


In [None]:
test_df_with_preds.to_csv('medweb_chatgpt.csv', index=False)

In [72]:
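def get_accuracy(df):
    # Exact-match accuracy: a tweet counts as correct only if
    # all eight labels agree with the prediction.
    return (df['labels'] == df['preds']).sum()/len(df)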

In [73]:
get_accuracy(test_df_with_preds)

0.36
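
Exact-match accuracy is a strict measure, since a single wrong flag makes the whole tweet count as an error. Looking at each label separately can show where the errors come from. The sketch below uses plain Python lists and the five rows displayed in the table above as input; the helper names are hypothetical, and five rows are far too few to draw conclusions from:

```python
LABEL_NAMES = ['Influenza', 'Diarrhea', 'Hayfever', 'Cough',
               'Headache', 'Fever', 'Runnynose', 'Cold']

def exact_match_accuracy(labels, preds):
    """Fraction of rows where the whole label vector is predicted correctly."""
    return sum(l == p for l, p in zip(labels, preds)) / len(labels)

def per_label_accuracy(labels, preds, names=LABEL_NAMES):
    """Fraction of rows with the correct flag, computed separately per label."""
    n = len(labels)
    return {name: sum(l[j] == p[j] for l, p in zip(labels, preds)) / n
            for j, name in enumerate(names)}

# The five rows shown in the table above (labels, then preds).
labels = [[1, 0, 0, 0, 0, 1, 0, 0], [0, 0, 0, 0, 0, 0, 0, 0],
          [0, 0, 0, 0, 0, 0, 0, 0], [1, 0, 0, 0, 0, 1, 0, 0],
          [0, 0, 0, 0, 0, 0, 1, 0]]
preds = [[0, 0, 0, 0, 1, 0, 0, 0], [0, 0, 0, 0, 0, 1, 0, 0],
         [0, 0, 0, 0, 0, 0, 0, 1], [0, 0, 0, 1, 0, 0, 1, 0],
         [0, 0, 0, 0, 0, 0, 1, 0]]

print("exact match:", exact_match_accuracy(labels, preds))
for name, acc in per_label_accuracy(labels, preds).items():
    print(f"{name:10s} {acc:.2f}")
```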