# Preprocess article headlines and abstracts with ChatGPT

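Because GPT is better at understanding context, it can better summarise a news article's impact on a particular company or stock. For example, an article may contain negative keywords that FinBERT construes as negative sentiment regardless of what the news means for the company. Each article is therefore passed through GPT as a preprocessing step before sentiment scoring.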

In [None]:
import os

import pandas as pd
from dotenv import load_dotenv
from tqdm import tqdm
from openai import OpenAI

In [2]:
load_dotenv(dotenv_path='../.env')
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY")
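
`os.getenv` returns `None` silently when the variable is missing, so a wrong `.env` path only surfaces later as an authentication error. A minimal sketch of an early check, assuming a hypothetical helper named `require_env` (not part of the original notebook):

```python
import os


def require_env(name):
    """Return the value of an environment variable, or fail with a clear message."""
    value = os.getenv(name)
    if not value:
        # Covers both a missing variable and an empty string
        raise RuntimeError(f"{name} is not set; check the path passed to load_dotenv")
    return value
```

It could be used in place of the plain lookup, for example `OPENAI_API_KEY = require_env("OPENAI_API_KEY")`.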

In [3]:
client = OpenAI(api_key=OPENAI_API_KEY)  # pass the key loaded above explicitly

In [4]:
def parse_article_gpt(article):
    """Ask GPT for a one-paragraph assessment of how the article affects Tesla."""

    instructions = "Evaluate, in one paragraph, how this piece of news would affect the company Tesla, before passing it into another sentiment scoring model."

    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[
            {"role": "system", "content": instructions},
            {"role": "user", "content": article},
        ]
    )

    # Return only the text of the first (and only) choice
    return completion.choices[0].message.content

In [5]:
parse_article_gpt("The USA has placed tariffs on electric vehicles.")

"The imposition of tariffs on electric vehicles in the USA could significantly impact Tesla by increasing production costs and consequently raising the prices of its vehicles for consumers. This could lead to a decrease in demand, particularly among price-sensitive customers, and may hinder Tesla’s competitive edge in the market, especially against traditional automakers who might better absorb such costs. Additionally, if competitors are able to navigate these tariffs more effectively or if consumers turn to alternative brands due to price sensitivity, Tesla could see a potential decline in market share. Overall, while the tariffs could create challenges, Tesla's established brand loyalty and innovative technology may help mitigate some adverse effects, but the situation demands close monitoring for strategic adjustments."
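
Each call to `parse_article_gpt` is a network request, so a transient failure part-way through a long batch would otherwise abort the whole run. The sketch below shows one possible way to retry with exponential backoff. The helper name `call_with_retry` and its parameters are hypothetical, and it catches every exception for brevity; in practice it would be sensible to narrow this to the client's transient error types.

```python
import time


def call_with_retry(fn, *args, max_attempts=3, base_delay=1.0, sleep=time.sleep, **kwargs):
    """Call fn(*args, **kwargs), retrying with exponential backoff on failure."""
    for attempt in range(1, max_attempts + 1):
        try:
            return fn(*args, **kwargs)
        except Exception:
            if attempt == max_attempts:
                raise  # out of attempts: surface the last error
            # Wait base_delay, 2 * base_delay, 4 * base_delay, ... between attempts
            sleep(base_delay * 2 ** (attempt - 1))
```

Usage would look like `call_with_retry(parse_article_gpt, "The USA has placed tariffs on electric vehicles.")`. The `sleep` parameter exists only so the delay can be replaced in tests.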

### Load Data

In [6]:
headlines_df = pd.read_csv('../data/tesla_headlines_with_sentiment.csv')

#### Call GPT API to get response

In [28]:
for i, row in tqdm(headlines_df.iterrows(), total=headlines_df.shape[0]):
    combined_string = f"{row['lead_paragraph']} \n {row['abstract']}"
    gpt_summary = parse_article_gpt(combined_string)
    headlines_df.loc[i, 'gpt_summary'] = gpt_summary


100%|██████████| 1365/1365 [1:01:08<00:00,  2.69s/it]
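
The run above takes about an hour for 1365 rows, so it may be worth making the loop resumable and tolerant of missing text. The following is only a sketch under stated assumptions: `build_prompt` and `summarise_missing` are hypothetical helpers, not part of the original notebook, and `summarise_fn` stands in for `parse_article_gpt`.

```python
import pandas as pd


def build_prompt(row, text_cols=("lead_paragraph", "abstract")):
    """Join the non-missing text columns of a row with newlines."""
    parts = [str(row[c]) for c in text_cols if pd.notna(row[c])]
    return "\n".join(parts)


def summarise_missing(df, summarise_fn, out_col="gpt_summary"):
    """Fill out_col only for rows that do not have a summary yet."""
    if out_col not in df.columns:
        df[out_col] = pd.NA
    for i, row in df.iterrows():
        if pd.notna(row[out_col]):
            continue  # already summarised in an earlier run
        text = build_prompt(row)
        if text:  # skip rows with no usable text at all
            df.loc[i, out_col] = summarise_fn(text)
    return df
```

Calling `summarise_missing(headlines_df, parse_article_gpt)` after an interrupted run would then only request summaries for the rows still empty, provided the partial results were kept in memory or reloaded from disk.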


In [29]:
headlines_df

Unnamed: 0,timestamp,article_url,lead_paragraph,abstract,adjusted_date,pos_sentiment,neg_sentiment,neutral_sentiment,pos_sentiment_w_preamb,neg_sentiment_w_preamb,neutral_sentiment_w_preamb,gpt_summary
0,2021-07-26 19:46:15+00:00,https://www.nytimes.com/2021/07/26/business/te...,Tesla on Monday reported a big increase in pro...,The electric car company made $1 billion in th...,27/7/21,0.940866,0.016005,0.043130,0.883625,0.016205,0.100170,The significant increase in Tesla's profit and...
1,2021-08-06 07:00:10+00:00,https://www.nytimes.com/2021/08/06/business/bi...,The Biden administration’s vow on Thursday to ...,A push to increase sales of electric vehicles ...,6/8/21,0.041768,0.847919,0.110313,0.054793,0.779841,0.165365,The Biden administration's commitment to incre...
2,2021-07-22 09:00:15+00:00,https://www.nytimes.com/2021/07/22/business/te...,"GRÜNHEIDE, Germany — The vast pale gray factor...",Elon Musk wanted to tap German engineering exp...,22/7/21,0.056098,0.030616,0.913287,0.031834,0.039803,0.928363,The news highlights significant operational ch...
3,2021-08-09 17:51:08+00:00,https://www.nytimes.com/2021/08/09/business/en...,SAN DIEGO — Robert Teglia bought a Tesla Model...,President Biden has made conversion to E.V.s a...,10/8/21,0.168428,0.011206,0.820366,0.153038,0.011520,0.835442,This news piece may have mixed implications fo...
4,2021-07-05 20:30:28+00:00,https://www.nytimes.com/2021/07/05/business/te...,Benjamin Maldonado and his teenage son were dr...,A California family that lost a 15-year-old bo...,6/7/21,0.017526,0.844307,0.138167,0.020553,0.828773,0.150674,The lawsuit against Tesla regarding the tragic...
...,...,...,...,...,...,...,...,...,...,...,...,...
1360,2023-11-20 13:27:11+00:00,https://www.nytimes.com/2023/11/20/business/de...,"Over just three days, the landscape for artifi...",Big Tech is reeling from the ouster of Sam Alt...,20/11/23,0.014174,0.876072,0.109754,0.011622,0.904145,0.084233,The rapid reshaping of the artificial intellig...
1361,2023-10-17 12:11:23+00:00,https://www.nytimes.com/2023/10/17/business/de...,As President Biden prepares to visit Israel on...,The Future Investment Initiative kicks off in ...,17/10/23,0.022449,0.646057,0.331493,0.032042,0.447444,0.520514,The news of President Biden's visit to Israel ...
1362,2023-07-07 11:19:35+00:00,https://www.nytimes.com/2023/07/07/business/de...,Economists and market participants are deeply ...,Economists have forecast that employers create...,7/7/23,0.010188,0.963813,0.025998,0.009738,0.954933,0.035329,The impending payrolls report could significan...
1363,2023-05-26 11:24:35+00:00,https://www.nytimes.com/2023/05/26/business/de...,Investors are holding their breath on Friday m...,Stock and bond trading suggest that investors ...,26/5/23,0.031368,0.943510,0.025122,0.024865,0.951599,0.023536,The potential resolution of the debt ceiling c...


In [30]:
headlines_df.to_csv('../data/tesla_gpt_summarised.csv')