bgord623/Practicum-2

Practicum 2

Predicting daily stock market movement with machine learning

Table of Contents

  1. Overview
  2. Scoring
  3. Data
  4. Exploratory data analysis and visualization
  5. Modeling of investment strategies
  6. Evaluation

Overview

The purpose of this project is to build a tool that predicts the daily direction of a stock. I work with Tesla (TSLA) stock market data, but ideally the techniques apply broadly. Classification-focused investment strategies, including machine learning algorithms, make buy (1) and sell (0) predictions before the trading day begins. A buy (1) prediction means the sign of the return (i.e., the daily percent change) is expected to be positive for the predicted day; a sell (0) prediction means it is expected to be negative. The presentation covers the final notebook, and detailed notebooks are located in the weekly folders.

Scoring

Strategies are scored on financial performance and accuracy over the test period (1/3/23 – 4/19/23).

  • Financial performance reflects the difference in percentage points between the TSLA stock performance and strategy performance
  • Accuracy score is the number of correct predictions divided by the number of total predictions
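
The two scores can be sketched in a few lines. This is only an illustration under stated assumptions, not the notebook code: the function names are hypothetical, a buy (1) is assumed to hold the stock for the day, a sell (0) is assumed to stay in cash, and the difference is taken as strategy minus stock.

```python
import numpy as np

def direction_accuracy(y_true, y_pred):
    """Number of correct predictions divided by the number of predictions."""
    y_true = np.asarray(y_true)
    y_pred = np.asarray(y_pred)
    return float(np.mean(y_true == y_pred))

def cumulative_return_pct(daily_returns):
    """Compound simple daily returns into a total return, in percent."""
    return float((np.prod(1.0 + np.asarray(daily_returns, dtype=float)) - 1.0) * 100.0)

def financial_performance(daily_returns, predictions):
    """Strategy return minus stock return, in percentage points.

    Assumption: 1 = hold the stock that day, 0 = stay in cash (return of 0).
    """
    daily_returns = np.asarray(daily_returns, dtype=float)
    positions = np.asarray(predictions, dtype=float)
    strategy = cumulative_return_pct(daily_returns * positions)
    stock = cumulative_return_pct(daily_returns)
    return strategy - stock

# Toy data, not TSLA results
returns = np.array([0.02, -0.01, 0.03, -0.02])
actual = (returns > 0).astype(int)   # realized direction: 1 = up, 0 = down
preds = np.array([1, 0, 0, 0])       # hypothetical predictions

print(direction_accuracy(actual, preds))
print(financial_performance(returns, preds))
```

A strategy that shorts the stock on sell days would replace the 0 position with -1; the source does not say which convention the notebooks use.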

Data

Stock data was obtained with the yfinance library, which uses the Yahoo Finance API and pandas to download stock data directly into a DataFrame:

image
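
A download along these lines might look like the sketch below. The start date is a placeholder rather than the project's actual range, and `download_prices` and `flatten_columns` are hypothetical helper names. Passing `auto_adjust=False` is meant to keep the ‘Adj Close’ column, and the flattening step is there because some yfinance versions return two-level column labels even for a single ticker.

```python
import pandas as pd

def flatten_columns(df: pd.DataFrame) -> pd.DataFrame:
    """Keep only the price-field level if the columns are a MultiIndex."""
    if isinstance(df.columns, pd.MultiIndex):
        df = df.copy()
        df.columns = df.columns.get_level_values(0)
    return df

def download_prices(ticker: str = "TSLA",
                    start: str = "2018-01-01",   # placeholder start date (assumption)
                    end: str = "2023-04-20") -> pd.DataFrame:
    """Download daily price data for one ticker into a DataFrame."""
    import yfinance as yf  # imported here so flatten_columns works without yfinance
    raw = yf.download(ticker, start=start, end=end,
                      auto_adjust=False, progress=False)
    return flatten_columns(raw)

# Example call (requires network access):
# tsla = download_prices()
# adj_close = tsla["Adj Close"]
```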

Data downloaded with yfinance contains no missing values, but calculations on the data, such as adding price lags and computing moving averages, frequently create them. These null values always occur at the head of the series, where they matter least for this project, so the affected rows are deleted.
I wanted to start with simple techniques and a simple dataset, so only the adjusted closing price (‘Adj Close’) was retained. Feature engineering began with computing a daily ‘return’ (% change) and ‘lags’ of those returns, in an attempt to find a pattern leading to the sign of the return:

image
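
The return and lag features described above can be sketched with pandas as follows. The function name, the column names, and the number of lags are illustrative assumptions; the toy prices are made up.

```python
import pandas as pd

def add_return_lags(prices: pd.Series, n_lags: int = 5) -> pd.DataFrame:
    """Build the daily return, its lags, and the up/down target."""
    df = pd.DataFrame({"price": prices})
    df["return"] = df["price"].pct_change()          # daily % change
    for k in range(1, n_lags + 1):
        df[f"lag_{k}"] = df["return"].shift(k)       # return from k days earlier
    df["direction"] = (df["return"] > 0).astype(int) # 1 = up day, 0 = otherwise
    # pct_change and shift leave NaNs at the head of the frame; drop those rows
    return df.dropna()

prices = pd.Series([100.0, 102.0, 101.0, 104.0, 103.0, 106.0, 108.0, 107.0])
features = add_return_lags(prices, n_lags=2)
print(features)
```

Each row then pairs the sign of that day's return with returns that were already known on earlier days, which is what lets the lags serve as inputs for a prediction made before the trading day begins.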

Exploratory data analysis and visualization

External market data, including US Treasury data and Financial Industry Regulatory Authority (FINRA) data, was explored, but comparing it with TSLA stock did not reveal useful insight.
Several technical indicators were also explored, including volatility, relative strength index (RSI), Bollinger Bands, moving average convergence divergence (MACD), and candlesticks. While many of these showed promise, MACD was the easiest to understand and to incorporate in later stages.

image
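
For reference, a MACD buy/sell signal can be computed roughly as below. The 12/26/9 spans are the conventional defaults and are an assumption here, since the parameters used in the notebooks are not stated; the function and column names are also illustrative.

```python
import numpy as np
import pandas as pd

def macd_signal(prices: pd.Series, fast: int = 12, slow: int = 26,
                signal: int = 9) -> pd.DataFrame:
    """MACD line, signal line, and a 1/0 buy/sell position."""
    ema_fast = prices.ewm(span=fast, adjust=False).mean()
    ema_slow = prices.ewm(span=slow, adjust=False).mean()
    macd = ema_fast - ema_slow
    signal_line = macd.ewm(span=signal, adjust=False).mean()
    out = pd.DataFrame({"macd": macd, "signal_line": signal_line})
    # 1 = buy while MACD is above its signal line, 0 = sell otherwise
    out["position"] = (out["macd"] > out["signal_line"]).astype(int)
    # Shift by one day so each prediction uses only data known before the open
    out["prediction"] = out["position"].shift(1)
    return out

# Synthetic random-walk prices, not TSLA data
rng = np.random.default_rng(0)
prices = pd.Series(100 + np.cumsum(rng.normal(0, 1, 200)))
macd = macd_signal(prices)
print(macd.tail())
```

The one-day shift matters for a fair backtest: without it, the signal for a given day would be computed from that same day's closing price.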

Modeling of investment strategies

Investment strategies modeled include:

  • Indicator Strategies
    • MACD
    • Momentum
  • Linear Regression (ordinary least squares)
  • Logistic Regression (Sklearn)
  • Deep Learning (Keras)
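
As a rough sketch of how one of these strategies could be set up, the block below fits a scikit-learn logistic regression on lagged returns and compares it with a simple momentum rule. The data is synthetic, and the number of lags, the 80/20 chronological split, and the momentum definition (buy if the previous day's return was positive) are assumptions rather than the notebooks' settings.

```python
import numpy as np
import pandas as pd
from sklearn.linear_model import LogisticRegression

# Synthetic daily returns standing in for TSLA returns
rng = np.random.default_rng(42)
returns = pd.Series(rng.normal(0.001, 0.03, 400))

n_lags = 5  # illustrative choice
X = pd.concat(
    {f"lag_{k}": returns.shift(k) for k in range(1, n_lags + 1)}, axis=1
).dropna()
y = (returns.loc[X.index] > 0).astype(int)   # 1 = up day, 0 = otherwise

# Chronological split: train on the earlier rows, test on the later ones
split = int(len(X) * 0.8)
X_train, X_test = X.iloc[:split], X.iloc[split:]
y_train, y_test = y.iloc[:split], y.iloc[split:]

model = LogisticRegression()
model.fit(X_train, y_train)
pred = model.predict(X_test)
test_accuracy = float((pred == y_test.to_numpy()).mean())

# Momentum baseline (assumed rule): buy if yesterday's return was positive
momentum_pred = (X_test["lag_1"] > 0).astype(int)
momentum_accuracy = float((momentum_pred == y_test).mean())

print(test_accuracy, momentum_accuracy)
```

Keeping the split chronological avoids training on days that come after the ones being predicted.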

After reviewing modeling results, I adjusted the approach to the problem by adding features, adding lags of those features, and choosing a new classification algorithm:

  • Features (10 lags each)
    • Rolling volume
    • Rolling daily minimum price
    • Rolling daily maximum price
    • MACD buy/sell signal
  • AdaBoost() classification algorithm
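
The revised approach could be sketched along these lines. Everything specific here is an assumption for illustration: the data is synthetic, the rolling window length is arbitrary, the rolling minimum and maximum are taken over the daily low and high, the MACD spans are the conventional 12/26/9, and `lagged` is a hypothetical helper.

```python
import numpy as np
import pandas as pd
from sklearn.ensemble import AdaBoostClassifier

def lagged(frame: pd.DataFrame, n_lags: int = 10) -> pd.DataFrame:
    """Return columns <name>_lag_k for k = 1..n_lags for every input column."""
    parts = {
        f"{col}_lag_{k}": frame[col].shift(k)
        for col in frame.columns
        for k in range(1, n_lags + 1)
    }
    return pd.DataFrame(parts, index=frame.index)

# Synthetic price/volume data standing in for the yfinance download
rng = np.random.default_rng(7)
n = 300
close = pd.Series(100 + np.cumsum(rng.normal(0, 1, n)))
raw = pd.DataFrame({
    "Adj Close": close,
    "High": close + rng.uniform(0, 2, n),
    "Low": close - rng.uniform(0, 2, n),
    "Volume": rng.integers(1_000, 5_000, n).astype(float),
})

# MACD buy/sell signal (conventional spans, assumed)
macd = close.ewm(span=12, adjust=False).mean() - close.ewm(span=26, adjust=False).mean()
macd_buy = (macd > macd.ewm(span=9, adjust=False).mean()).astype(int)

window = 5  # rolling window length is an assumption
base = pd.DataFrame({
    "rolling_volume": raw["Volume"].rolling(window).mean(),
    "rolling_min": raw["Low"].rolling(window).min(),
    "rolling_max": raw["High"].rolling(window).max(),
    "macd_signal": macd_buy,
})

X = lagged(base, n_lags=10).dropna()          # 4 features x 10 lags
y = (raw["Adj Close"].pct_change().loc[X.index] > 0).astype(int)

split = int(len(X) * 0.8)                     # chronological split
clf = AdaBoostClassifier(n_estimators=50, random_state=0)
clf.fit(X.iloc[:split], y.iloc[:split])
train_acc = clf.score(X.iloc[:split], y.iloc[:split])
test_acc = clf.score(X.iloc[split:], y.iloc[split:])
print(train_acc, test_acc)
```

A large gap between `train_acc` and `test_acc` is one simple symptom of overfitting, which is worth watching for with this many lagged features.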

Evaluation

The full results are shown in the table below. As the models increased in complexity (deep learning and the revised approach), overfitting became apparent, and their financial performance on the test data was the lowest among the ML techniques. The revised approach produced the highest accuracy scores, and the MACD strategy had the best overall financial performance.

image

Regarding deployment, I believe much more analysis and insight is needed before any of these models can be relied upon for consistent positive returns.
