# Read in CSVs of tweets and get GPT analysis

# When ready to run:

1. Go to this [cell](https://colab.research.google.com/drive/1s5P26EXOGcoeSeASEP_o0WvXZLbOwfAJ#scrollTo=cGElL8WqOUpV&line=3&uniqifier=1) and update `num_samples` (the `n` passed to `df.sample`) to the number of samples needed (e.g. 1000). The cell is also labeled in the notebook text.
1. Run all cells using any one of:
   - `ctrl`+`shift`+`P` -> "run all cells in notebook"
   - Runtime > Run all
   - `ctrl`+`F9`
1. Once the run completes, the final Excel files are saved to the local runtime environment and download automatically. If they do not, download them manually from the `Files` tab in Colab (Files -> right click -> download).

# Read Scraped Tweets and Do Some Cleanup

## Download the filtered dataframe CSV from the `epidural` GitHub repository

In [None]:
import requests

url = 'https://media.githubusercontent.com/media/kswanjitsu/epidural/main/filtered_df.csv'
resp = requests.get(url, headers={'User-Agent': 'Mozilla/5.0'})
resp

<Response [200]>

### Save the data to a file, then read the CSV into a pandas DataFrame

In [None]:
filtered_df_csv_file = 'filtered_df.csv'
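# Write the raw bytes so the file keeps the server's encoding; `resp.text` relies on
# a guessed charset, which can garble non-ASCII characters such as curly apostrophes.
with open(filtered_df_csv_file, 'wb') as f:
  f.write(resp.content)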
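
A note on this step: `requests` exposes the body both as decoded text (`resp.text`) and as raw bytes (`resp.content`). The sketch below uses an invented sample string and a temporary file rather than the real download. It illustrates how decoding UTF-8 bytes with the wrong charset produces garbled characters, and how writing the bytes unchanged avoids that. The helper name `save_bytes` is hypothetical.

```python
import tempfile
from pathlib import Path


def save_bytes(content: bytes, path: Path) -> None:
    """Write the payload exactly as received, without decoding it."""
    path.write_bytes(content)


original = "I\u2019m glad it wasn\u2019t that bad"  # contains curly apostrophes
payload = original.encode("utf-8")  # stands in for resp.content

# Decoding UTF-8 bytes as Latin-1 mimics a wrong charset guess.
mojibake = payload.decode("latin-1")

with tempfile.TemporaryDirectory() as tmp:
    target = Path(tmp) / "sample.csv"
    save_bytes(payload, target)
    round_tripped = target.read_text(encoding="utf-8")

print(repr(mojibake))
print(round_tripped == original)
```

If the tweets in the DataFrame show sequences such as `â` where an apostrophe is expected, a charset mismatch of this kind is one plausible cause.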

In [None]:
import pandas as pd

df = pd.read_csv(filtered_df_csv_file)
df

Unnamed: 0.1,Unnamed: 0,User,Time,Likes,Source,Tweet,cleaned_tweet,sent_analysis,topic
0,11,sidney_ella,2009-12-28 21:37:33+00:00,0.0,Twitter Web Client,@heartsandhandss I think women who choose natural are like super heroes! Especially since giving birth. Epi though. Caved bc of back labor.,@user i think women who choose natural are like super heroes! especially since giving birth. epi though. caved bc of back labor.,"{'label': 'positive', 'score': 0.8187388181686401}",1
1,104,emilyrdickey,2009-12-05 01:52:13+00:00,0.0,Twitter Web Client,@Wonderkarin the Bradley method is husband coached birth- where he takes the lead in guiding you for a natural labor,@user the bradley method is husband coached birth- where he takes the lead in guiding you for a natural labor,"{'label': 'neutral', 'score': 0.9033392667770386}",1
2,123,Stile15,2009-12-01 18:43:35+00:00,0.0,Twitter Web Client,my sister wanted to do a natural birth but she has now been in labor for 7 hours and asked for an epidural...i know she couldnt handle it ha,my sister wanted to do a natural birth but she has now been in labor for 7 hours and asked for an epidural...i know she couldnt handle it ha,"{'label': 'negative', 'score': 0.7766241431236267}",1
3,166,BenryHailey,2009-11-13 22:24:00+00:00,0.0,Twitter Web Client,#haveababybyme u betta have health insurance cuz a natural birth and whatchu want...I was a 10lb baby bout killed my mom for 18hrs of labor,#haveababybyme u betta have health insurance cuz a natural birth and whatchu want...i was a 10lb baby bout killed my mom for 18hrs of labor,"{'label': 'negative', 'score': 0.6303679347038269}",1
4,194,bethany529,2009-11-09 00:56:08+00:00,0.0,Twitter Web Client,My co-workers wife just gave natural birth to a 9.5 pound baby this am after she was 16 days late AND went through 52 hours of labor. YIKES!,my co-workers wife just gave natural birth to a 9.5 pound baby this am after she was 16 days late and went through 52 hours of labor. yikes!,"{'label': 'negative', 'score': 0.8341954946517944}",1
...,...,...,...,...,...,...,...,...,...
431519,658393,amil,2018-01-01 04:47:59+00:00,1.0,Twitter for iPhone,@chantalbraganza Damn. I had a 24 hr labour but with the aid of a blessed epidural,@user damn. i had a 24 hr labour but with the aid of a blessed epidural,"{'label': 'neutral', 'score': 0.37650564312934875}",1
431520,658395,fiercekittenz,2018-01-01 03:07:46+00:00,1.0,Twitter Web Client,"@DallasNChains @christawatson @katemorris Speaking as someone who had her epidural installed incorrectly and fall out, YUP. CONFIRMED.","@user @user @user speaking as someone who had her epidural installed incorrectly and fall out, yup. confirmed.","{'label': 'negative', 'score': 0.49432462453842163}",1
431521,658397,WrightByrdYT,2018-01-01 01:28:34+00:00,0.0,Twitter for Android,@Fact Well yeah if you don't have an epidural,@user well yeah if you don't have an epidural,"{'label': 'neutral', 'score': 0.7822378873825073}",1
431522,658398,MorganShaw513,2018-01-01 01:09:31+00:00,1.0,Twitter for iPhone,@lynkrystal Yeah Iâm definitely terrified of having babies at the moment but Iâm glad it wasnât that bad for you! I have a high pain tolerance but I still think Iâd rather have an epidural if given the chance ð,@user yeah iâm definitely terrified of having babies at the moment but iâm glad it wasnât that bad for you! i have a high pain tolerance but i still think iâd rather have an epidural if given the chance ð,"{'label': 'negative', 'score': 0.6401897072792053}",1


## Remove Dupes and Retweets

#### Dupes

In [None]:
df_no_dupes = df.drop_duplicates(subset=['cleaned_tweet'])
df_no_dupes


Unnamed: 0.1,Unnamed: 0,User,Time,Likes,Source,Tweet,cleaned_tweet,sent_analysis,topic
0,11,sidney_ella,2009-12-28 21:37:33+00:00,0.0,Twitter Web Client,@heartsandhandss I think women who choose natural are like super heroes! Especially since giving birth. Epi though. Caved bc of back labor.,@user i think women who choose natural are like super heroes! especially since giving birth. epi though. caved bc of back labor.,"{'label': 'positive', 'score': 0.8187388181686401}",1
1,104,emilyrdickey,2009-12-05 01:52:13+00:00,0.0,Twitter Web Client,@Wonderkarin the Bradley method is husband coached birth- where he takes the lead in guiding you for a natural labor,@user the bradley method is husband coached birth- where he takes the lead in guiding you for a natural labor,"{'label': 'neutral', 'score': 0.9033392667770386}",1
2,123,Stile15,2009-12-01 18:43:35+00:00,0.0,Twitter Web Client,my sister wanted to do a natural birth but she has now been in labor for 7 hours and asked for an epidural...i know she couldnt handle it ha,my sister wanted to do a natural birth but she has now been in labor for 7 hours and asked for an epidural...i know she couldnt handle it ha,"{'label': 'negative', 'score': 0.7766241431236267}",1
3,166,BenryHailey,2009-11-13 22:24:00+00:00,0.0,Twitter Web Client,#haveababybyme u betta have health insurance cuz a natural birth and whatchu want...I was a 10lb baby bout killed my mom for 18hrs of labor,#haveababybyme u betta have health insurance cuz a natural birth and whatchu want...i was a 10lb baby bout killed my mom for 18hrs of labor,"{'label': 'negative', 'score': 0.6303679347038269}",1
4,194,bethany529,2009-11-09 00:56:08+00:00,0.0,Twitter Web Client,My co-workers wife just gave natural birth to a 9.5 pound baby this am after she was 16 days late AND went through 52 hours of labor. YIKES!,my co-workers wife just gave natural birth to a 9.5 pound baby this am after she was 16 days late and went through 52 hours of labor. yikes!,"{'label': 'negative', 'score': 0.8341954946517944}",1
...,...,...,...,...,...,...,...,...,...
431519,658393,amil,2018-01-01 04:47:59+00:00,1.0,Twitter for iPhone,@chantalbraganza Damn. I had a 24 hr labour but with the aid of a blessed epidural,@user damn. i had a 24 hr labour but with the aid of a blessed epidural,"{'label': 'neutral', 'score': 0.37650564312934875}",1
431520,658395,fiercekittenz,2018-01-01 03:07:46+00:00,1.0,Twitter Web Client,"@DallasNChains @christawatson @katemorris Speaking as someone who had her epidural installed incorrectly and fall out, YUP. CONFIRMED.","@user @user @user speaking as someone who had her epidural installed incorrectly and fall out, yup. confirmed.","{'label': 'negative', 'score': 0.49432462453842163}",1
431521,658397,WrightByrdYT,2018-01-01 01:28:34+00:00,0.0,Twitter for Android,@Fact Well yeah if you don't have an epidural,@user well yeah if you don't have an epidural,"{'label': 'neutral', 'score': 0.7822378873825073}",1
431522,658398,MorganShaw513,2018-01-01 01:09:31+00:00,1.0,Twitter for iPhone,@lynkrystal Yeah Iâm definitely terrified of having babies at the moment but Iâm glad it wasnât that bad for you! I have a high pain tolerance but I still think Iâd rather have an epidural if given the chance ð,@user yeah iâm definitely terrified of having babies at the moment but iâm glad it wasnât that bad for you! i have a high pain tolerance but i still think iâd rather have an epidural if given the chance ð,"{'label': 'negative', 'score': 0.6401897072792053}",1


#### Remove retweets (RTs)

In [None]:
df_no_dupes_or_rts = df_no_dupes[~df_no_dupes['cleaned_tweet'].str.startswith('rt @')]
df_no_dupes_or_rts

Unnamed: 0.1,Unnamed: 0,User,Time,Likes,Source,Tweet,cleaned_tweet,sent_analysis,topic
0,11,sidney_ella,2009-12-28 21:37:33+00:00,0.0,Twitter Web Client,@heartsandhandss I think women who choose natural are like super heroes! Especially since giving birth. Epi though. Caved bc of back labor.,@user i think women who choose natural are like super heroes! especially since giving birth. epi though. caved bc of back labor.,"{'label': 'positive', 'score': 0.8187388181686401}",1
1,104,emilyrdickey,2009-12-05 01:52:13+00:00,0.0,Twitter Web Client,@Wonderkarin the Bradley method is husband coached birth- where he takes the lead in guiding you for a natural labor,@user the bradley method is husband coached birth- where he takes the lead in guiding you for a natural labor,"{'label': 'neutral', 'score': 0.9033392667770386}",1
2,123,Stile15,2009-12-01 18:43:35+00:00,0.0,Twitter Web Client,my sister wanted to do a natural birth but she has now been in labor for 7 hours and asked for an epidural...i know she couldnt handle it ha,my sister wanted to do a natural birth but she has now been in labor for 7 hours and asked for an epidural...i know she couldnt handle it ha,"{'label': 'negative', 'score': 0.7766241431236267}",1
3,166,BenryHailey,2009-11-13 22:24:00+00:00,0.0,Twitter Web Client,#haveababybyme u betta have health insurance cuz a natural birth and whatchu want...I was a 10lb baby bout killed my mom for 18hrs of labor,#haveababybyme u betta have health insurance cuz a natural birth and whatchu want...i was a 10lb baby bout killed my mom for 18hrs of labor,"{'label': 'negative', 'score': 0.6303679347038269}",1
4,194,bethany529,2009-11-09 00:56:08+00:00,0.0,Twitter Web Client,My co-workers wife just gave natural birth to a 9.5 pound baby this am after she was 16 days late AND went through 52 hours of labor. YIKES!,my co-workers wife just gave natural birth to a 9.5 pound baby this am after she was 16 days late and went through 52 hours of labor. yikes!,"{'label': 'negative', 'score': 0.8341954946517944}",1
...,...,...,...,...,...,...,...,...,...
431519,658393,amil,2018-01-01 04:47:59+00:00,1.0,Twitter for iPhone,@chantalbraganza Damn. I had a 24 hr labour but with the aid of a blessed epidural,@user damn. i had a 24 hr labour but with the aid of a blessed epidural,"{'label': 'neutral', 'score': 0.37650564312934875}",1
431520,658395,fiercekittenz,2018-01-01 03:07:46+00:00,1.0,Twitter Web Client,"@DallasNChains @christawatson @katemorris Speaking as someone who had her epidural installed incorrectly and fall out, YUP. CONFIRMED.","@user @user @user speaking as someone who had her epidural installed incorrectly and fall out, yup. confirmed.","{'label': 'negative', 'score': 0.49432462453842163}",1
431521,658397,WrightByrdYT,2018-01-01 01:28:34+00:00,0.0,Twitter for Android,@Fact Well yeah if you don't have an epidural,@user well yeah if you don't have an epidural,"{'label': 'neutral', 'score': 0.7822378873825073}",1
431522,658398,MorganShaw513,2018-01-01 01:09:31+00:00,1.0,Twitter for iPhone,@lynkrystal Yeah Iâm definitely terrified of having babies at the moment but Iâm glad it wasnât that bad for you! I have a high pain tolerance but I still think Iâd rather have an epidural if given the chance ð,@user yeah iâm definitely terrified of having babies at the moment but iâm glad it wasnât that bad for you! i have a high pain tolerance but i still think iâd rather have an epidural if given the chance ð,"{'label': 'negative', 'score': 0.6401897072792053}",1
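
The two filters above can be illustrated on a toy frame (the rows are invented for the example). This sketch also passes `na=False` to `str.startswith`, an optional addition that treats missing tweets as non-retweets, and takes a `.copy()` so that later column assignments act on an independent frame.

```python
import pandas as pd

toy = pd.DataFrame(
    {
        "cleaned_tweet": [
            "@user epidural was great",
            "@user epidural was great",     # exact duplicate of the first row
            "rt @user epidural was great",  # retweet
            None,                           # missing text
            "natural birth, no regrets",
        ]
    }
)

# Keep the first occurrence of each distinct cleaned tweet.
deduped = toy.drop_duplicates(subset=["cleaned_tweet"])

# Flag retweets; na=False makes missing values count as "not a retweet".
is_retweet = deduped["cleaned_tweet"].str.startswith("rt @", na=False)

# .copy() detaches the result from the intermediate frames.
cleaned = deduped.loc[~is_retweet].copy()
print(cleaned)
```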


### Put it all back into the original dataframe

In [None]:
df = df_no_dupes_or_rts
df

Unnamed: 0.1,Unnamed: 0,User,Time,Likes,Source,Tweet,cleaned_tweet,sent_analysis,topic
0,11,sidney_ella,2009-12-28 21:37:33+00:00,0.0,Twitter Web Client,@heartsandhandss I think women who choose natural are like super heroes! Especially since giving birth. Epi though. Caved bc of back labor.,@user i think women who choose natural are like super heroes! especially since giving birth. epi though. caved bc of back labor.,"{'label': 'positive', 'score': 0.8187388181686401}",1
1,104,emilyrdickey,2009-12-05 01:52:13+00:00,0.0,Twitter Web Client,@Wonderkarin the Bradley method is husband coached birth- where he takes the lead in guiding you for a natural labor,@user the bradley method is husband coached birth- where he takes the lead in guiding you for a natural labor,"{'label': 'neutral', 'score': 0.9033392667770386}",1
2,123,Stile15,2009-12-01 18:43:35+00:00,0.0,Twitter Web Client,my sister wanted to do a natural birth but she has now been in labor for 7 hours and asked for an epidural...i know she couldnt handle it ha,my sister wanted to do a natural birth but she has now been in labor for 7 hours and asked for an epidural...i know she couldnt handle it ha,"{'label': 'negative', 'score': 0.7766241431236267}",1
3,166,BenryHailey,2009-11-13 22:24:00+00:00,0.0,Twitter Web Client,#haveababybyme u betta have health insurance cuz a natural birth and whatchu want...I was a 10lb baby bout killed my mom for 18hrs of labor,#haveababybyme u betta have health insurance cuz a natural birth and whatchu want...i was a 10lb baby bout killed my mom for 18hrs of labor,"{'label': 'negative', 'score': 0.6303679347038269}",1
4,194,bethany529,2009-11-09 00:56:08+00:00,0.0,Twitter Web Client,My co-workers wife just gave natural birth to a 9.5 pound baby this am after she was 16 days late AND went through 52 hours of labor. YIKES!,my co-workers wife just gave natural birth to a 9.5 pound baby this am after she was 16 days late and went through 52 hours of labor. yikes!,"{'label': 'negative', 'score': 0.8341954946517944}",1
...,...,...,...,...,...,...,...,...,...
431519,658393,amil,2018-01-01 04:47:59+00:00,1.0,Twitter for iPhone,@chantalbraganza Damn. I had a 24 hr labour but with the aid of a blessed epidural,@user damn. i had a 24 hr labour but with the aid of a blessed epidural,"{'label': 'neutral', 'score': 0.37650564312934875}",1
431520,658395,fiercekittenz,2018-01-01 03:07:46+00:00,1.0,Twitter Web Client,"@DallasNChains @christawatson @katemorris Speaking as someone who had her epidural installed incorrectly and fall out, YUP. CONFIRMED.","@user @user @user speaking as someone who had her epidural installed incorrectly and fall out, yup. confirmed.","{'label': 'negative', 'score': 0.49432462453842163}",1
431521,658397,WrightByrdYT,2018-01-01 01:28:34+00:00,0.0,Twitter for Android,@Fact Well yeah if you don't have an epidural,@user well yeah if you don't have an epidural,"{'label': 'neutral', 'score': 0.7822378873825073}",1
431522,658398,MorganShaw513,2018-01-01 01:09:31+00:00,1.0,Twitter for iPhone,@lynkrystal Yeah Iâm definitely terrified of having babies at the moment but Iâm glad it wasnât that bad for you! I have a high pain tolerance but I still think Iâd rather have an epidural if given the chance ð,@user yeah iâm definitely terrified of having babies at the moment but iâm glad it wasnât that bad for you! i have a high pain tolerance but i still think iâd rather have an epidural if given the chance ð,"{'label': 'negative', 'score': 0.6401897072792053}",1


## Download the tiktoken library to count tokens per tweet

In [None]:
!pip install --upgrade tiktoken



In [None]:
import tiktoken


# gpt_model = 'gpt-3.5-turbo'
gpt_model = 'gpt-4'

def num_tokens_from_string(string: str, encoding_name: str) -> int:
    """Returns the number of tokens in a text string."""
    encoding = tiktoken.get_encoding(encoding_name)
    num_tokens = len(encoding.encode(string))
    return num_tokens


# Look up the encoding name once instead of once per row, and work on an explicit
# copy so pandas does not raise SettingWithCopyWarning on the filtered frame.
encoding_name = tiktoken.encoding_for_model(gpt_model).name
df = df.copy()
df['tok_per_twt'] = df['cleaned_tweet'].apply(lambda x: num_tokens_from_string(x, encoding_name))
df


Unnamed: 0.1,Unnamed: 0,User,Time,Likes,Source,Tweet,cleaned_tweet,sent_analysis,topic,tok_per_twt
0,11,sidney_ella,2009-12-28 21:37:33+00:00,0.0,Twitter Web Client,@heartsandhandss I think women who choose natural are like super heroes! Especially since giving birth. Epi though. Caved bc of back labor.,@user i think women who choose natural are like super heroes! especially since giving birth. epi though. caved bc of back labor.,"{'label': 'positive', 'score': 0.8187388181686401}",1,29
1,104,emilyrdickey,2009-12-05 01:52:13+00:00,0.0,Twitter Web Client,@Wonderkarin the Bradley method is husband coached birth- where he takes the lead in guiding you for a natural labor,@user the bradley method is husband coached birth- where he takes the lead in guiding you for a natural labor,"{'label': 'neutral', 'score': 0.9033392667770386}",1,24
2,123,Stile15,2009-12-01 18:43:35+00:00,0.0,Twitter Web Client,my sister wanted to do a natural birth but she has now been in labor for 7 hours and asked for an epidural...i know she couldnt handle it ha,my sister wanted to do a natural birth but she has now been in labor for 7 hours and asked for an epidural...i know she couldnt handle it ha,"{'label': 'negative', 'score': 0.7766241431236267}",1,34
3,166,BenryHailey,2009-11-13 22:24:00+00:00,0.0,Twitter Web Client,#haveababybyme u betta have health insurance cuz a natural birth and whatchu want...I was a 10lb baby bout killed my mom for 18hrs of labor,#haveababybyme u betta have health insurance cuz a natural birth and whatchu want...i was a 10lb baby bout killed my mom for 18hrs of labor,"{'label': 'negative', 'score': 0.6303679347038269}",1,39
4,194,bethany529,2009-11-09 00:56:08+00:00,0.0,Twitter Web Client,My co-workers wife just gave natural birth to a 9.5 pound baby this am after she was 16 days late AND went through 52 hours of labor. YIKES!,my co-workers wife just gave natural birth to a 9.5 pound baby this am after she was 16 days late and went through 52 hours of labor. yikes!,"{'label': 'negative', 'score': 0.8341954946517944}",1,37
...,...,...,...,...,...,...,...,...,...,...
431519,658393,amil,2018-01-01 04:47:59+00:00,1.0,Twitter for iPhone,@chantalbraganza Damn. I had a 24 hr labour but with the aid of a blessed epidural,@user damn. i had a 24 hr labour but with the aid of a blessed epidural,"{'label': 'neutral', 'score': 0.37650564312934875}",1,20
431520,658395,fiercekittenz,2018-01-01 03:07:46+00:00,1.0,Twitter Web Client,"@DallasNChains @christawatson @katemorris Speaking as someone who had her epidural installed incorrectly and fall out, YUP. CONFIRMED.","@user @user @user speaking as someone who had her epidural installed incorrectly and fall out, yup. confirmed.","{'label': 'negative', 'score': 0.49432462453842163}",1,24
431521,658397,WrightByrdYT,2018-01-01 01:28:34+00:00,0.0,Twitter for Android,@Fact Well yeah if you don't have an epidural,@user well yeah if you don't have an epidural,"{'label': 'neutral', 'score': 0.7822378873825073}",1,12
431522,658398,MorganShaw513,2018-01-01 01:09:31+00:00,1.0,Twitter for iPhone,@lynkrystal Yeah Iâm definitely terrified of having babies at the moment but Iâm glad it wasnât that bad for you! I have a high pain tolerance but I still think Iâd rather have an epidural if given the chance ð,@user yeah iâm definitely terrified of having babies at the moment but iâm glad it wasnât that bad for you! i have a high pain tolerance but i still think iâd rather have an epidural if given the chance ð,"{'label': 'negative', 'score': 0.6401897072792053}",1,62
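
The counting step can be sketched with the encoder passed in as a parameter, so the encoder is resolved once and reused for every tweet. To keep the sketch runnable without `tiktoken`, it uses a whitespace-splitting stand-in (`whitespace_encode`, a hypothetical helper that does not tokenize the way GPT models do); in the notebook one would pass `tiktoken.encoding_for_model(gpt_model).encode` instead.

```python
from typing import Callable, Iterable, List


def count_tokens(texts: Iterable[str], encode: Callable[[str], list]) -> List[int]:
    """Return the token count of each text using one shared encoder."""
    return [len(encode(text)) for text in texts]


def whitespace_encode(text: str) -> list:
    """Stand-in encoder for illustration only: one 'token' per whitespace-separated word."""
    return text.split()


tweets = [
    "nunu finna get a epidural. omgomg",
    "@user well yeah if you don't have an epidural",
]
counts = count_tokens(tweets, whitespace_encode)
print(counts, sum(counts))
```

Real token counts will differ from these word counts, since GPT encodings split text into sub-word pieces.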


# GPT Analysis Section

### Install OpenAI and Langchain libraries

In [None]:
!pip install --upgrade openai langchain



## Set up the OpenAI key and instantiate a chat model to make the appropriate API calls

In [None]:
import openai
import os

# Replace the placeholder with your own key; never commit a real key to the notebook
openai.api_key = '<your_open_ai_key>'
os.environ['OPENAI_API_KEY'] = openai.api_key

In [None]:
from langchain.chat_models import ChatOpenAI

chat = ChatOpenAI(temperature=0.0, model_name=gpt_model)
chat

ChatOpenAI(cache=None, verbose=False, callbacks=None, callback_manager=None, tags=None, metadata=None, client=<class 'openai.api_resources.chat_completion.ChatCompletion'>, model_name='gpt-3.5-turbo', temperature=0.0, model_kwargs={}, openai_api_key='<redacted>', openai_api_base='', openai_organization='', openai_proxy='', request_timeout=None, max_retries=6, streaming=False, n=1, max_tokens=None, tiktoken_model_name=None)

### Prompt template

Give the model instructions to interpret the message and format a response.

**_text_** is a newline-separated string of indexed tweets, built in a subsequent cell.

**_format_instructions_** is the instruction string produced by a LangChain output parser, also built in a subsequent cell.

In [None]:
from langchain.prompts import ChatPromptTemplate

template_string = '''There is a list of posts demarcated by triple backticks (```).

For each item in the list, determine:
(1) if the author had an epidural or not
(2) if their opinion of epidurals is positive, negative, or neutral
(3) if their opinion on natural childbirth is positive, negative, or neutral

```{text}```

Respond with a list of python dictionaries corresponding to each message in the list.
{format_instructions}

'''
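
To show how the two placeholders are filled, here is a minimal sketch using plain `str.format` on a shortened stand-in template (not the full prompt above). The tweets and the instruction text are invented placeholders. LangChain's `ChatPromptTemplate` substitutes `{text}` and `{format_instructions}` in the same f-string style by default.

```python
# Shortened stand-in for the notebook's template_string.
short_template = (
    "Posts:\n"
    "{text}\n\n"
    "Respond with one dictionary per post.\n"
    "{format_instructions}\n"
)

# Invented sample: original DataFrame index -> cleaned tweet.
tweets_by_index = {
    673: "like okay you gave birth all natural and okay you got an epidural",
    55093: "nunu finna get a epidural. omgomg",
}

# Each tweet is prefixed with its index so the reply can be matched back to it.
text = "\n".join(f"{idx}) {tweet}" for idx, tweet in tweets_by_index.items())

filled = short_template.format(
    text=text,
    format_instructions="Use the keys index, had_epi, epi_pos, nat_pos.",
)
print(filled)
```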

#### Format instructions:
- We want the model to return a list of JSON objects.
- This code describes each field in those JSON objects, so the responses can be parsed and converted into usable data more easily.

In [None]:
from langchain.output_parsers import ResponseSchema
from langchain.output_parsers import StructuredOutputParser


index_schema = ResponseSchema(name="index",
                            description="The index number in front of the list entry")

about_epi_schema = ResponseSchema(name="about_epi",
                            description="Is the author mainly talking about epidurals or natural?\
                            1 for 'epidurals', 0 for 'natural', NaN if you could not determine.\
                            Response should ONLY be one of: [1, 0, NaN]")
had_epi_schema = ResponseSchema(name="had_epi",
                            description="Did the author have an epidural?\
                            1 for 'yes', 0 for 'no', NaN if you could not determine.\
                            Response should ONLY be one of: [1, 0, NaN]")
epi_pos_schema = ResponseSchema(name="epi_pos",
                            description="Does the message display a positive sentiment towards epidurals?\
                            1 for 'yes', 0 for 'neutral', -1 for 'negative sentiment', NaN if you could not determine.\
                            Response should ONLY be one of: [1, 0, -1, NaN]")

nat_pos_schema = ResponseSchema(name="nat_pos",
                            description="Does the message display a positive sentiment towards natural births?\
                            1 for 'yes', 0 for 'neutral', -1 for 'negative sentiment', NaN if you could not determine.\
                            Response should ONLY be one of: [1, 0, -1, NaN]")

response_schemas = [index_schema, about_epi_schema, had_epi_schema, epi_pos_schema, nat_pos_schema]

output_parser = StructuredOutputParser.from_response_schemas(response_schemas)
format_instructions = output_parser.get_format_instructions()
prompt = ChatPromptTemplate.from_template(template=template_string)
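
As an illustration of what these schemas ask for, the sketch below parses one hypothetical reply object and checks each field against the allowed values listed in the descriptions. The reply string and the helper `is_valid` are invented for the example. Note that Python's `json.loads` accepts the bare token `NaN` by default, even though strict JSON does not allow it.

```python
import json
import math

# Allowed values per field, taken from the schema descriptions above.
ALLOWED = {
    "about_epi": {1, 0},
    "had_epi": {1, 0},
    "epi_pos": {1, 0, -1},
    "nat_pos": {1, 0, -1},
}


def is_nan(value) -> bool:
    return isinstance(value, float) and math.isnan(value)


def is_valid(record: dict) -> bool:
    """True if the record has an index and every field is an allowed value or NaN."""
    if "index" not in record:
        return False
    for field, allowed in ALLOWED.items():
        value = record.get(field)
        if is_nan(value):
            continue
        if not isinstance(value, int) or value not in allowed:
            return False
    return True


# Hypothetical model reply for a single tweet.
sample = '{"index": "673", "about_epi": 1, "had_epi": NaN, "epi_pos": 0, "nat_pos": 1}'
record = json.loads(sample)
print(is_valid(record))

bad_record = dict(record, epi_pos=2)  # 2 is outside [1, 0, -1, NaN]
print(is_valid(bad_record))
```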


### Sample a subset of the dataframe with a more appropriate number of rows (e.g. n=1000)
- `num_samples` is set to 40000 in the cell below; lower it (e.g. to 4) for testing purposes

In [None]:
import numpy as np

######################################################################
######################################################################
############## Filter out n samples: #################################
######################################################################
######################################################################
num_samples = 40000
######################################################################
######################################################################
df2 = df.sample(n=num_samples)
df2
######################################################################
######################################################################
######################################################################
######################################################################


Unnamed: 0.1,Unnamed: 0,User,Time,Likes,Source,Tweet,cleaned_tweet,sent_analysis,topic,tok_per_twt,old_index
427242,650515,OH_LAWWD,2018-03-11 17:36:01+00:00,1.0,Twitter for iPhone,7) the fuckin epidural needle gotta go close Asl to yo spin and if you move bitch you ova with. I canât stay still with a fuckin needle going down my back.,7) the fuckin epidural needle gotta go close asl to yo spin and if you move bitch you ova with. i canât stay still with a fuckin needle going down my back.,"{'label': 'negative', 'score': 0.9163564443588257}",1,43,427242
673,12835,kiaaami,2022-10-30 05:22:00+00:00,11.0,Twitter for iPhone,"like okay you gave birth all natural and okay you got an epidural, just be proud of yourselves for doing it and still being alive from doing so bc thereâs so many things that can go wrong in laborâ¦ itâs actually scary","like okay you gave birth all natural and okay you got an epidural, just be proud of yourselves for doing it and still being alive from doing so bc thereâs so many things that can go wrong in laborâ¦ itâs actually scary","{'label': 'negative', 'score': 0.6601393818855286}",1,53,673
132860,163272,kimmkimmkimm,2022-07-20 02:04:58+00:00,0.0,Twitter for iPhone,"Pep talk of the day: if I can push an almost 9lb baby without epidural, I can handle this back pain!!","pep talk of the day: if i can push an almost 9lb baby without epidural, i can handle this back pain!!","{'label': 'positive', 'score': 0.7771046161651611}",1,28,132860
312966,450341,massoner,2014-04-22 21:31:56+00:00,0.0,Twitter for Android,"@PoliticalPort @sevenwithcheese im on heavy narcs, n get epidural injections every 3 months, tried it all, wont let them cut my spine tho","@user @user im on heavy narcs, n get epidural injections every 3 months, tried it all, wont let them cut my spine tho","{'label': 'neutral', 'score': 0.5705633759498596}",1,31,312966
55093,32874,beu_tuh_ful,2013-08-08 15:28:24+00:00,0.0,Twitter for Android,Nunu finna get a epidural. Omgomg,nunu finna get a epidural. omgomg,"{'label': 'neutral', 'score': 0.4669259190559387}",1,12,55093
...,...,...,...,...,...,...,...,...,...,...,...
134792,166542,forhpm,2022-06-27 00:57:34+00:00,1.0,Twitter for iPhone,"@RabidCedarTree @AJKayWriter 36 hrs of back labour then 12 with epidural b/c there were no rooms. Intern cut me too far, cut my anal muscles. Baby had too hard a latch, she tore off my nipples for the 6 months I nursed. Guess what? I had another. B/c i never viewed any of it as suffering.","@user @user 36 hrs of back labour then 12 with epidural b/c there were no rooms. intern cut me too far, cut my anal muscles. baby had too hard a latch, she tore off my nipples for the 6 months i nursed. guess what? i had another. b/c i never viewed any of it as suffering.","{'label': 'negative', 'score': 0.7962793707847595}",1,74,134792
43717,15949,EFCjojo,2013-11-29 08:36:30+00:00,0.0,Twitter for iPhone,@scousemouse1982 yea me dads on morphine and epidural . Got me results nothing bad she said but book dc appoint so go in on mon morn xx,@user yea me dads on morphine and epidural . got me results nothing bad she said but book dc appoint so go in on mon morn xx,"{'label': 'positive', 'score': 0.6627386808395386}",1,32,43717
80642,71137,Rainamarie_,2015-12-17 19:06:19+00:00,0.0,Twitter for iPhone,@RainaSaulny your welcome and i didn't do the water birth that pain is intense I reuped on the epidural felt NOTHING I warn you get it lol,@user your welcome and i didn't do the water birth that pain is intense i reuped on the epidural felt nothing i warn you get it lol,"{'label': 'negative', 'score': 0.6310524344444275}",1,32,80642
110153,124352,theladyfarmerla,2010-05-25 15:13:17+00:00,0.0,Twittelator,"@Teresa_Giudice of #rhwnj after getting an epidural ""is my makeup messed up"".... LOVE HER!!!! LOL!","@user of #rhwnj after getting an epidural ""is my makeup messed up"".... love her!!!! lol!","{'label': 'positive', 'score': 0.6530094742774963}",1,25,110153
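
`df.sample` draws a different subset on every run unless it is seeded. The following is a hedged sketch on an invented frame, assuming a reproducible sample is wanted: `random_state` fixes the draw, and `min(...)` guards against requesting more rows than exist. The helper name `sample_rows` and the seed value 42 are arbitrary choices for the example.

```python
import pandas as pd


def sample_rows(frame: pd.DataFrame, n: int, seed: int = 42) -> pd.DataFrame:
    """Draw up to n rows; the same seed gives the same rows every time."""
    n = min(n, len(frame))
    return frame.sample(n=n, random_state=seed)


toy = pd.DataFrame({"cleaned_tweet": [f"tweet {i}" for i in range(10)]})

first = sample_rows(toy, 4)
second = sample_rows(toy, 4)
capped = sample_rows(toy, 50)  # more than available, so all 10 rows come back

print(first.index.tolist() == second.index.tolist())
print(len(capped))
```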


## How many tokens are in just the tweets in this sample?

In [None]:
print(f'min tokens for tweets: {np.sum(df2["tok_per_twt"])}')

min tokens for tweets: 1330313


## Only keep the needed columns for this part

In [None]:
############## Only keep the cleaned_tweets for this section ############
# Take an explicit copy so adding a column does not trigger SettingWithCopyWarning
df2 = df2[['cleaned_tweet']].copy()
# Save off the old index so it will be easy to merge the results DF and the original
df2['old_index'] = df2.index

# Reset the index so it will be easier to group the tweets and send them as batches
df2.reset_index(inplace=True)
df2


Unnamed: 0,index,cleaned_tweet,old_index
0,427242,7) the fuckin epidural needle gotta go close asl to yo spin and if you move bitch you ova with. i canât stay still with a fuckin needle going down my back.,427242
1,673,"like okay you gave birth all natural and okay you got an epidural, just be proud of yourselves for doing it and still being alive from doing so bc thereâs so many things that can go wrong in laborâ¦ itâs actually scary",673
2,132860,"pep talk of the day: if i can push an almost 9lb baby without epidural, i can handle this back pain!!",132860
3,312966,"@user @user im on heavy narcs, n get epidural injections every 3 months, tried it all, wont let them cut my spine tho",312966
4,55093,nunu finna get a epidural. omgomg,55093
...,...,...,...
39995,134792,"@user @user 36 hrs of back labour then 12 with epidural b/c there were no rooms. intern cut me too far, cut my anal muscles. baby had too hard a latch, she tore off my nipples for the 6 months i nursed. guess what? i had another. b/c i never viewed any of it as suffering.",134792
39996,43717,@user yea me dads on morphine and epidural . got me results nothing bad she said but book dc appoint so go in on mon morn xx,43717
39997,80642,@user your welcome and i didn't do the water birth that pain is intense i reuped on the epidural felt nothing i warn you get it lol,80642
39998,110153,"@user of #rhwnj after getting an epidural ""is my makeup messed up"".... love her!!!! lol!",110153


## Send the tweets to the model

The groupby takes `group_size` rows from `df2` at a time, then builds a list of the tweets preceded by their original index. The format instructions are already embedded in the prompt; here we create the `text` component.
- For each batch, get and parse the responses. The response dictionaries are appended to a list of response dictionaries.
- Lots of debugging prints commented out. Feel free to uncomment to see what's going on, but it may add significant amounts of text.
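
The batching relies on integer division: after the index reset, row position `p` lands in batch `p // group_size`. A plain-Python sketch of that arithmetic, with invented rows and hypothetical helper names, is shown here. With 40000 rows and a group size of 20 the last batch id is 1999.

```python
def batch_id(position: int, group_size: int) -> int:
    """Rows 0..group_size-1 go to batch 0, the next group_size rows to batch 1, etc."""
    return position // group_size


def make_batches(rows, group_size):
    """Group rows by batch id, preserving their order."""
    batches = {}
    for position, row in enumerate(rows):
        batches.setdefault(batch_id(position, group_size), []).append(row)
    return batches


# Invented (old_index, tweet) pairs.
rows = [
    (673, "tweet a"),
    (55093, "tweet b"),
    (132860, "tweet c"),
    (312966, "tweet d"),
    (427242, "tweet e"),
]

batches = make_batches(rows, group_size=2)

# One prompt `text` per batch: each tweet prefixed by its original index.
text_per_batch = {
    b: "\n".join(f"{old_index}) {tweet}" for old_index, tweet in items)
    for b, items in batches.items()
}
print(text_per_batch[0])
```

The last batch may be smaller than `group_size` when the row count is not an exact multiple of it.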

In [None]:
import json
import re  # the standard-library module is sufficient for the pattern used below


all_responses_list_of_dicts = list()

send_message = False
all_messages = []


group_size = 20

for i, g in df2.groupby(df2.index // group_size):
  try:
    # print('-'*50)
    print(f'\rGroup #{i}', end='')
    msg_list = []
    # print(g.to_dict())
    for j, data in g.iterrows():
      old_index = data['old_index']
      tweet = data['cleaned_tweet']
      # print(f'Adding tweet {old_index}: {tweet}')
      # NOTE: .encode() inside an f-string puts the bytes repr (b'...') into the prompt
      msg_list.append(f'{old_index}) {tweet.encode("UTF-8")}')
    messages = prompt.format_messages(text='\n'.join(msg_list),
                              format_instructions=format_instructions)
    all_messages.append(messages[0].content)
    # print(messages[0].content)
    if send_message:
      response = chat(messages)
      m = re.findall(r'\{([^{]*?)\}', response.content)
      all_responses_list_of_dicts.extend([json.loads(f'{{{item}}}') for item in m])
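  except Exception as e:
    # Start a new line so the error is not overwritten by the '\r' progress print
    print(f'\nGroup #{i} failed: {e}')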

# print('='*50)
# print(all_responses_list_of_dicts)

Group #1999
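
To make the parsing step in the loop above more concrete, here is a minimal sketch of the same regex-then-`json.loads` approach applied to a made-up reply. The `sample_reply` text is hypothetical; only the field names are taken from the result tables in this notebook, and the string-valued `index` is an assumption.

```python
import json
import re

# Hypothetical model reply: prose wrapped around flat JSON objects.
sample_reply = '''Here are the results:
{"index": "384857", "about_epi": 1, "had_epi": 1, "epi_pos": -1}
{"index": "304786", "about_epi": 1, "had_epi": 1, "epi_pos": 1}
Done.'''

# Capture the text between each pair of braces (lazy match, no nested '{').
bodies = re.findall(r'\{([^{]*?)\}', sample_reply)

# Re-wrap each captured body in braces and decode it as JSON.
parsed = [json.loads(f'{{{body}}}') for body in bodies]

print(parsed)
```

This pattern only handles flat objects: a nested object, or a `}` inside a string value, would cut the match short and make `json.loads` fail.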

## How many tokens were/would be generated?
- What would those tokens cost with gpt-3.5-turbo and gpt-4?



In [None]:
import numpy as np
import tiktoken


# Despite its name, `model_name` holds the tiktoken encoding name for the model
model_name = tiktoken.encoding_for_model(gpt_model).name
# Token count of each batch prompt
tokens = [num_tokens_from_string(msg, model_name) for msg in all_messages]
total_tokens = np.sum(tokens)
print('='*100)
print(f'Current model: {gpt_model}')
print(f'{num_samples} tweets')
print(f'Total tokens: {total_tokens}')
# Cost per 1K tokens (USD) as hard-coded here; check current pricing before relying on it
gpt35turbo_cp1kt = 0.0015
gpt4_cp1kt = 0.003
print(f'GPT-3.5-Turbo cost: ${gpt35turbo_cp1kt*total_tokens/1000}')
print(f'GPT-4 cost: ${gpt4_cp1kt*total_tokens/1000}')
print('='*100)


Current model: gpt-3.5-turbo
40000 tweets
Total tokens: 2682208
GPT-3.5-Turbo cost: $4.023312
GPT-4 cost: $8.046624
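
The cost figures above follow from `price_per_1k * total_tokens / 1000`. The sketch below reproduces that arithmetic with a small helper; `estimate_cost` is a hypothetical name, and the rates are copied as-is from the notebook rather than checked against current pricing. It covers prompt tokens only, since the tokens the model generates in reply are not counted here.

```python
def estimate_cost(total_tokens: int, price_per_1k: float) -> float:
    """Return the USD cost of total_tokens at a given price per 1,000 tokens."""
    return price_per_1k * total_tokens / 1000


# Token total reported by the cell above.
total_tokens = 2_682_208

# Rates hard-coded in the notebook (USD per 1K tokens), taken as assumptions.
gpt35_cost = estimate_cost(total_tokens, 0.0015)
gpt4_cost = estimate_cost(total_tokens, 0.003)

print(f"GPT-3.5-Turbo: ${gpt35_cost:.2f}")
print(f"GPT-4: ${gpt4_cost:.2f}")
```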


In [None]:
np.max(tokens)

3182

In [None]:
# Sample of the responses
print(len(all_responses_list_of_dicts))
print('Sample...:')
all_responses_list_of_dicts[:2]

0
Sample...:


[]

### Put the list of response dictionaries into a DataFrame

In [None]:
pd.set_option('display.max_colwidth', None)


temp_df = pd.DataFrame(all_responses_list_of_dicts)
temp_df['index'] = pd.to_numeric(temp_df['index'])
temp_df

Unnamed: 0,index,about_epi,had_epi,epi_pos,nat_pos
0,384857,1,1,-1,
1,304786,1,1,1,
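
When no requests were sent, the response list is empty and a frame built from it has no `index` column to convert. One possible guard, sketched under the assumption that the five column names shown above are the full expected set (`responses_to_frame` is a hypothetical helper, not part of the notebook):

```python
import pandas as pd

# Column names taken from the result table above.
EXPECTED_COLUMNS = ["index", "about_epi", "had_epi", "epi_pos", "nat_pos"]


def responses_to_frame(responses):
    """Build a DataFrame that always has the expected columns, even when empty."""
    frame = pd.DataFrame(responses, columns=EXPECTED_COLUMNS)
    # Unparseable index values become NaN instead of raising.
    frame["index"] = pd.to_numeric(frame["index"], errors="coerce")
    return frame


empty_frame = responses_to_frame([])
small_frame = responses_to_frame([{"index": "384857", "about_epi": 1}])
print(list(empty_frame.columns))
print(small_frame)
```

Keys a response omits, such as `nat_pos` here, show up as missing values rather than causing an error.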


## Merge the results with the original `df2` and with the original full DataFrame `df`

In [None]:
merged_results = df2.merge(right=temp_df, how='inner', left_on='old_index', right_on='index')
merged_results

Unnamed: 0,index_x,cleaned_tweet,old_index,index_y,about_epi,had_epi,epi_pos,nat_pos
0,384857,@user i was 8cm by dinner time. that was on pitocin tho. and my epidural was wearing off. i was feeling the contractions full force. :(,384857,384857,1,1,-1,
1,304786,@user easy. had a epidural i didn't feel anything.,304786,304786,1,1,1,
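
Because the merge is an inner join, tweets for which no response was parsed drop out silently. A minimal sketch of how one might list them, using a left join with pandas' `indicator=True` option on toy data (`toy_df2`, `toy_responses`, and their values are made up for illustration):

```python
import pandas as pd

# Toy stand-ins (hypothetical data) for `df2` and `temp_df`.
toy_df2 = pd.DataFrame({
    "old_index": [384857, 304786, 55093],
    "cleaned_tweet": ["tweet a", "tweet b", "tweet c"],
})
toy_responses = pd.DataFrame({"index": [384857, 304786], "epi_pos": [-1, 1]})

# A left join keeps every tweet; the `_merge` column records where each row matched.
check = toy_df2.merge(
    toy_responses,
    how="left",
    left_on="old_index",
    right_on="index",
    indicator=True,
)

# Rows tagged 'left_only' are tweets with no parsed response.
missing = check.loc[check["_merge"] == "left_only", "old_index"].tolist()
print(missing)
```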


In [None]:
# Work on an explicit copy so the assignment does not trigger SettingWithCopyWarning
df = df.copy()
df['old_index'] = df.index
filtered_df_merged_results = df.merge(right=temp_df, how='inner', left_on='old_index', right_on='index')
filtered_df_merged_results

Unnamed: 0.1,Unnamed: 0,User,Time,Likes,Source,Tweet,cleaned_tweet,sent_analysis,topic,tok_per_twt,old_index,index,about_epi,had_epi,epi_pos,nat_pos
0,436223,Mizz_Dee300,2014-08-06 23:33:03+00:00,0.0,Twitter for Android,@DeMariusKeeper easy. Had a epidural I didn't feel anything.,@user easy. had a epidural i didn't feel anything.,"{'label': 'neutral', 'score': 0.6178291440010071}",1,14,304786,304786,1,1,1,
1,581521,Niki_twodees,2012-06-25 18:49:36+00:00,0.0,Echofon,@CRM1987 I was 8cm by dinner time. That was on pitocin tho. And my epidural was wearing off. I was feeling the contractions full force. :(,@user i was 8cm by dinner time. that was on pitocin tho. and my epidural was wearing off. i was feeling the contractions full force. :(,"{'label': 'negative', 'score': 0.8693695068359375}",1,37,384857,384857,1,1,-1,


## Save and Download results

In [None]:
merged_results_file = 'merged_results.xlsx'
filtered_df_merged_results_file = 'filtered_df_merged_results.xlsx'

merged_results.to_excel(merged_results_file)
filtered_df_merged_results.to_excel(filtered_df_merged_results_file)
df2.to_excel('df2.xlsx')



In [None]:
from google.colab import files

files.download(merged_results_file)
files.download(filtered_df_merged_results_file)
files.download('df2.xlsx')
