wangtuguahhh/Sentiment-Analysis-for-Investment-Strategies-on-Tesla-Stock

image

News Analysis for Potential Investment Strategies on Tesla Stock

Choosing an investment strategy for an individual stock is a challenging problem of keen interest to a wide audience. The stock market, with its inherent complexity and unpredictability, is a fascinating area for machine learning exploration. It is hard to predict because many factors, including economic indicators, global events, investor behavior, and market speculation, all contribute to its volatility.

Large Language Models (LLMs) offer an innovative approach to this task because they can process and interpret large volumes of news quickly. My goal in this project is to run a sentiment analysis of Tesla-related news over a short time window and to look for correlations between news sentiment and fluctuations in Tesla's stock price. This is a significant challenge, but it may offer new insights into the relationship between media sentiment and stock market movements.

The final analysis dashboard, which covers Tesla-related news, Tesla stock price movements, and their correlations, is deployed on streamlit.sharing. Feel free to check it out and play with it through the link below!

Final Analysis Dashboard

1. Data

APIs are a convenient way to collect the data required for this task. The following APIs are used:

  • NewsAPI, for news related to Tesla and Elon Musk
  • MarketstackAPI, for Tesla stock prices

This task evaluates 30 days of data, from 2023-09-12 to 2023-10-11.

2. Method

After the news was collected, 60% of the articles were labeled manually. Some articles turned out to be not directly related to Tesla but rather to Elon Musk's personal life, his political opinions, or his other companies. A classification tool is therefore needed to tell whether a news article is directly related to Tesla.

Flan-T5, an LLM developed by Google for text-to-text generation tasks such as translation and summarization, was chosen for this task. The model has 248M parameters. To perform the required news classification, the model needs tuning. The following methods were implemented:

FinBERT, a BERT-based LLM developed for financial sentiment analysis, was used for sentiment analysis of the Tesla-related news.

3. Data Collection

1. NewsAPI

All news articles related to either 'Tesla' or 'Elon Musk' were extracted using NewsAPI. The number of articles by publisher is shown below.

image
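As a rough illustration of this step, the sketch below builds a request URL for NewsAPI's `everything` endpoint and counts articles per publisher. The query string, the `language` filter, and the helper names `build_news_query` and `count_by_publisher` are assumptions for illustration, not the code from the notebook.

```python
from collections import Counter
from urllib.parse import urlencode

NEWSAPI_URL = "https://newsapi.org/v2/everything"


def build_news_query(api_key, start="2023-09-12", end="2023-10-11", page=1):
    """Return the request URL for one page of Tesla / Elon Musk articles."""
    params = {
        "q": 'Tesla OR "Elon Musk"',  # assumed query; the notebook may differ
        "from": start,
        "to": end,
        "language": "en",             # assumption: English-language news only
        "pageSize": 100,
        "page": page,
        "apiKey": api_key,
    }
    return f"{NEWSAPI_URL}?{urlencode(params)}"


def count_by_publisher(articles):
    """Count articles per publisher from NewsAPI-style article dicts."""
    return Counter(article["source"]["name"] for article in articles)


# Hypothetical articles, only to show the expected shape of the input.
sample = [
    {"source": {"name": "BBC News"}, "title": "..."},
    {"source": {"name": "BBC News"}, "title": "..."},
    {"source": {"name": "Engadget"}, "title": "..."},
]
print(count_by_publisher(sample).most_common())
```

The URL can be fetched with any HTTP client; the `articles` list in the JSON response is what `count_by_publisher` expects.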

2. Tesla Stock Price

Open price, close price, and other fields for Tesla stock were extracted using MarketstackAPI. The open and close prices are plotted below.

image
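A minimal sketch of turning an end-of-day price response into sorted (date, open, close) rows follows. The field names (`data`, `date`, `open`, `close`) are assumed from Marketstack's end-of-day response format, and the prices in the sample payload are made up; `parse_eod` is a hypothetical helper, not the notebook's code.

```python
def parse_eod(payload):
    """Extract (date, open, close) tuples from an end-of-day payload, oldest first."""
    rows = [
        (record["date"][:10], record["open"], record["close"])
        for record in payload["data"]
    ]
    return sorted(rows)  # ISO dates sort chronologically as strings


# Made-up payload, only to show the assumed response shape.
payload = {
    "data": [
        {"symbol": "TSLA", "date": "2023-10-11T00:00:00+0000", "open": 2.0, "close": 2.5},
        {"symbol": "TSLA", "date": "2023-10-10T00:00:00+0000", "open": 1.0, "close": 1.5},
    ]
}
for date, open_price, close_price in parse_eod(payload):
    print(date, open_price, close_price)
```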

Data Collection Notebook

4. News Classification

In-context learning (ICL) was investigated as a way to use the Flan-T5 model to classify whether a news article is related to Tesla. A BERT model was also evaluated, but ICL did not work for it. ICL worked for LLMs with generative capabilities, such as Flan-T5. The following methods were tested:

  • Zero-shot learning
  • One-shot learning
  • Few-shot learning (2, 3, 4, 5 shots)
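The three settings above differ only in how many labeled examples are placed in the prompt before the article to classify. The sketch below shows one way to build such prompts. The instruction wording and the `build_prompt` helper are assumptions, not the prompts used in the notebook.

```python
INSTRUCTION = "Is the following news directly related to Tesla? Answer yes or no."


def build_prompt(headline, examples=()):
    """Build a zero-, one-, or few-shot prompt depending on len(examples).

    examples: sequence of (news_text, label) pairs, with label "yes" or "no".
    """
    parts = [
        f"{INSTRUCTION}\nNews: {text}\nAnswer: {label}"
        for text, label in examples
    ]
    # The final block is left open so the model generates the answer.
    parts.append(f"{INSTRUCTION}\nNews: {headline}\nAnswer:")
    return "\n\n".join(parts)


# Hypothetical headlines, for illustration only.
zero_shot = build_prompt("Tesla opens a new factory")
one_shot = build_prompt(
    "Tesla opens a new factory",
    examples=[("Elon Musk comments on an election", "no")],
)
print(one_shot)
```

The resulting string would then be tokenized and passed to a generative model such as Flan-T5, and the generated text compared against 'yes' or 'no'.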

Here is the classification report for one-shot learning.

image

Here is the classification report for few-shot learning (2 shots):

image

One-shot learning provided the best performance with an average accuracy of 0.85. Including more examples in the prompt (few-shot learning) didn't improve the overall accuracy.

More examples in the prompt improved precision for 'yes' and recall for 'no'. However, few-shot learning more often misclassified true 'yes' articles as 'no'.
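To see why misclassifying true 'yes' as 'no' can raise precision for 'yes' and recall for 'no' while lowering accuracy, here is a small sketch that computes these metrics by hand. The labels are made up for illustration and are not the project's data.

```python
def class_report(y_true, y_pred, labels=("yes", "no")):
    """Return per-class precision/recall and the overall accuracy."""
    pairs = list(zip(y_true, y_pred))
    report = {}
    for label in labels:
        tp = sum(t == label and p == label for t, p in pairs)
        fp = sum(t != label and p == label for t, p in pairs)
        fn = sum(t == label and p != label for t, p in pairs)
        report[label] = {
            "precision": tp / (tp + fp) if tp + fp else 0.0,
            "recall": tp / (tp + fn) if tp + fn else 0.0,
        }
    accuracy = sum(t == p for t, p in pairs) / len(pairs)
    return report, accuracy


# Made-up example: two true 'yes' articles are predicted as 'no'.
y_true = ["yes", "yes", "yes", "no", "no"]
y_pred = ["yes", "no", "no", "no", "no"]
report, accuracy = class_report(y_true, y_pred)
print(report, accuracy)
```

In this toy case, precision for 'yes' and recall for 'no' are both perfect, yet recall for 'yes' and overall accuracy suffer.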

Here is the distribution of Tesla-related news and non-Tesla-related news from one-shot learning.

image

News Classification Notebook

5. Sentiment Analysis

5.1. FinBERT Sentiment Analysis Evaluation

The FinBERT sentiment results were compared with the manual labels for the 60% of the news data that had been labeled. Here is the confusion matrix for news directly related to Tesla. Taking the manual labels as ground truth, FinBERT tended to label positive news as neutral.

image
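A confusion matrix like the one above can be tallied from paired labels. The sketch below treats the manual labels as rows and the model labels as columns. The label order and the sample labels are assumptions for illustration.

```python
from collections import Counter

LABELS = ("positive", "neutral", "negative")


def confusion_matrix(manual, predicted):
    """Rows are manual labels, columns are model labels, both in LABELS order."""
    counts = Counter(zip(manual, predicted))
    return [[counts[(m, p)] for p in LABELS] for m in LABELS]


# Made-up labels: one manually positive article is predicted as neutral.
manual = ["positive", "positive", "neutral", "negative"]
predicted = ["neutral", "positive", "neutral", "negative"]
for label, row in zip(LABELS, confusion_matrix(manual, predicted)):
    print(f"{label:>8}: {row}")
```

Off-diagonal counts in the first row are the positive articles that the model labeled as something else.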

5.2. Sentiment Distribution

Here is the sentiment distribution for all data, as labeled by the FinBERT model.

image
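As a hedged sketch of how such a distribution could be produced, the code below loads a FinBERT checkpoint through the Hugging Face `pipeline` API and tallies the predicted labels. The checkpoint name `ProsusAI/finbert` is an assumption, since the README does not say which FinBERT weights were used, and both helper names are hypothetical.

```python
from collections import Counter


def load_finbert(model_name="ProsusAI/finbert"):
    """Load a text-classification pipeline (checkpoint name is an assumption)."""
    # Imported lazily so the tallying helper below works without transformers.
    from transformers import pipeline

    return pipeline("text-classification", model=model_name)


def sentiment_distribution(results):
    """Turn pipeline outputs ({'label': ..., 'score': ...}) into label shares."""
    counts = Counter(result["label"].lower() for result in results)
    total = sum(counts.values())
    return {
        label: counts[label] / total
        for label in ("positive", "neutral", "negative")
    }


# Made-up pipeline outputs, only to show the expected shape.
results = [
    {"label": "neutral", "score": 0.9},
    {"label": "neutral", "score": 0.8},
    {"label": "negative", "score": 0.7},
    {"label": "positive", "score": 0.6},
]
print(sentiment_distribution(results))
```

With the model available, `load_finbert()(list_of_texts)` would return a list in the same shape as `results`.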

  • The majority of the news was neutral, which may be surprising given the common assumption that media outlets prefer to report negative news.
  • There were more negative articles than positive ones.
    • This could reflect the actual coverage.
    • Or it could be related to the FinBERT model's tendency to label positive Tesla-related news as neutral.

5.3. Sentiment Polarity of Publishers

The difference between the two rates (positive news rate minus negative news rate) was chosen to illustrate the sentiment polarity of each publisher. Here is the distribution.

image
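The polarity measure described above can be sketched as follows, assuming each rate is the share of a publisher's articles carrying that sentiment. The publishers and labels in the example are invented.

```python
from collections import Counter, defaultdict


def publisher_polarity(records):
    """Return positive rate minus negative rate for each publisher.

    records: iterable of (publisher, sentiment) pairs.
    """
    counts = defaultdict(Counter)
    for publisher, sentiment in records:
        counts[publisher][sentiment] += 1

    polarity = {}
    for publisher, by_sentiment in counts.items():
        total = sum(by_sentiment.values())  # neutral articles count toward the total
        polarity[publisher] = (
            by_sentiment["positive"] - by_sentiment["negative"]
        ) / total
    return polarity


# Invented records, for illustration only.
records = [
    ("Publisher A", "positive"),
    ("Publisher A", "positive"),
    ("Publisher A", "neutral"),
    ("Publisher A", "negative"),
    ("Publisher B", "negative"),
    ("Publisher B", "neutral"),
]
print(publisher_polarity(records))
```

Values near zero indicate a mostly neutral or balanced publisher; the score ranges from -1 (all negative) to 1 (all positive).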

Here is the plot with individual positive rates and negative rates.

image

  • The majority of publishers lean neutral.
  • Some publishers tended to report more negative Tesla news, such as NBC News, The Irish Times, Politico, The Washington Post, and Breitbart News.
    • Some of these publishers are big names, such as NBC News and The Washington Post, and could have a larger impact on public opinion about how Tesla is doing.
  • Some publishers tended to report more positive Tesla news, such as Next Big Future, Engadget, The Times of India, TechRadar, The Jerusalem Post, and BBC News.
    • This trend suggests that technology-focused publishers generally maintain a positive stance towards Tesla.
    • Major publishers from specific countries appeared to be more positive in their coverage of Tesla, which could indicate a favorable disposition towards providing Tesla with better opportunities.

Sentiment Analysis Notebook

6. Correlation with Stock Price

From Pearson correlation calculations, the correlation coefficient between the open price and the number of positive news articles is fairly large, around 0.68. Here is a plot of the normalized trends of those two values.

image
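For reference, the Pearson coefficient and a simple normalization can be computed as in the sketch below. Min-max scaling is an assumption here, since the README does not say which normalization was used, and the series are toy numbers.

```python
from math import sqrt


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length series."""
    n = len(x)
    mean_x, mean_y = sum(x) / n, sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    spread_x = sqrt(sum((a - mean_x) ** 2 for a in x))
    spread_y = sqrt(sum((b - mean_y) ** 2 for b in y))
    return cov / (spread_x * spread_y)


def min_max(values):
    """Rescale values to [0, 1] so two series can share one plot axis."""
    lo, hi = min(values), max(values)
    return [(v - lo) / (hi - lo) for v in values]


# Toy series, not the project's data.
positive_counts = [1, 2, 3]
open_prices = [10, 20, 30]
print(pearson(positive_counts, open_prices))
print(min_max(open_prices))
```

Pearson correlation is unchanged by min-max scaling, so normalizing affects only the plot, not the coefficient.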

There were some similarities between the two trend lines. The next step shifts the positive news count by one day to check its impact on the next day's open price.

image
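Shifting by one day amounts to pairing each day's news count with the following day's open price. The sketch below shows that alignment, assuming both series are already ordered by date and cover the same days (real price data skips weekends and holidays, which would need handling). `lagged_pairs` is a hypothetical helper.

```python
def lagged_pairs(news_counts, open_prices, lag=1):
    """Pair the news count on day t - lag with the open price on day t."""
    if lag == 0:
        return list(zip(news_counts, open_prices))
    return list(zip(news_counts[:-lag], open_prices[lag:]))


# Toy series, not the project's data.
news_counts = [1, 2, 3, 4]
open_prices = [10, 20, 30, 40]
print(lagged_pairs(news_counts, open_prices, lag=1))
```

The two columns of the result can then be fed to any correlation function to compare against the unshifted case.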

There is no clear correlation between today's open price and yesterday's number of positive news articles. Next, let's investigate the impact of both positive and negative news on Tesla's stock price change.

image image

  • The stock price movements correlated best with the previous day's news sentiment, though with quite a few mismatches.
  • News accumulated over more than one day was also evaluated. However, averaging the impact from previous days exaggerated the impact of negative sentiment.
  • Ignoring the neutral news and relying on the raw difference between positive and negative counts didn't correlate well with the stock price movements.
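One way to quantify how often previous-day sentiment and price movement point in the same direction is a simple hit rate, sketched below. The scoring formula, which keeps neutral news in the denominator, and both helper names are assumptions rather than the notebook's method, and the numbers are invented.

```python
def net_sentiment(positive, negative, neutral):
    """Assumed daily score: (positive - negative) over all articles that day."""
    total = positive + negative + neutral
    return (positive - negative) / total if total else 0.0


def direction_hit_rate(scores, price_changes):
    """Share of days where the previous day's score and the price change agree in sign."""
    pairs = list(zip(scores[:-1], price_changes[1:]))
    hits = sum((score > 0) == (change > 0) for score, change in pairs)
    return hits / len(pairs)


# Invented daily article counts as (positive, negative, neutral) and price changes.
daily_counts = [(3, 1, 6), (1, 2, 7), (4, 1, 5)]
scores = [net_sentiment(*counts) for counts in daily_counts]
price_changes = [0.0, -1.0, -2.0]
print(scores, direction_hit_rate(scores, price_changes))
```

A hit rate near 0.5 would be consistent with the mismatches noted above; note that this measure only compares signs and ignores the size of the moves.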

Stock Price Correlation Notebook

7. Future Improvements

🔭 Observations:

  • With the help of prompt engineering and fine-tuning, open-source LLMs can be adapted to a particular task of interest.
    • In this work, one-shot learning improved the Flan-T5 model's performance on classifying whether a news article is related to Tesla.
  • Most of the news in the 30-day period was neutral toward Tesla.
  • Ignoring the neutral news, there was more negative than positive news about Tesla in the 30-day period.
  • There is no clear, strong correlation between news sentiment and stock price movements.

🧰 Limitations:

  • The news classification is not perfect. The accuracy was around 0.85 for the Flan-T5 model with in-context learning applied.

  • The sentiment analysis from the FinBERT model is not well suited to this task.

    • Compared with human labels, the FinBERT model tended to label positive news as neutral.
    • The FinBERT model is based on BERT and was pretrained on a finance corpus for sentiment analysis.
    • The manual labeling paid more attention to the potential impact on Tesla's stock price, which is challenging for FinBERT.
      • For example, there were news articles about other EV makers adopting the battery charging system developed by Tesla. I labeled such news as positive, since adoption by other EV makers could bring further development and a larger market share for Tesla in the EV charging sector. However, FinBERT treated those articles as neutral.
    • The NLP task here is no longer simple sentiment analysis of a piece of text. It requires a tool that predicts the potential impact of news on Tesla's stock price with help from LLMs.
  • There are multiple and complex factors that will impact the stock price of a company.

    • Market-related factors such as GDP, interest rates, inflation, and the employment rate indicate how well the whole market is performing, and they have a significant impact on individual stocks.
    • Company reports showing how the company performed over the past period are important to its stock price. Earnings, dividends, debt, management quality, and so on are key drivers of stock price movement.
    • Other factors, such as the global economy, regulations and government policies, industry-specific changes, natural disasters, and pandemics, all play a role here.

🍀 Future Work:

  • Collect data over a longer period of time:
    • to try fine-tuning LLMs for classification and sentiment analysis
    • to see whether there is any long-term trend between news sentiment and stock price movement.
  • Tune LLMs with generative capabilities, such as Flan-T5, to predict the impact of a news article on Tesla's stock price.
    • try ICL first
    • try fine-tuning if more data are available
  • Incorporate the other factors mentioned under Limitations.
  • Create a Tesla stock investigation AI tool for users.
