# Neural Networks for Stock Price Prediction  
### A Comparative Study of LSTM vs Statistical Baselines Across a Multi Asset Universe  
**UROP Project Report Notebook**  
Nicholas Hong  
2025

---

This notebook supports the main written report by providing:

- Clean loading of all computed metrics  
- Display of correlation heatmaps  
- Visual comparison of models per symbol  
- Summary tables for RMSE, MAE, returns, and universe properties  
- Commentary accompanying each figure  

All heavy model training code is *intentionally omitted* to keep this notebook focused on results and interpretation.


## Abstract

This notebook presents empirical results from a comprehensive comparison of forecasting models applied to daily stock prices across a broad equity universe from 2010 to 2025. Models include:

- Naive lag-1 benchmark  
- Moving average models (5 day and 20 day)  
- Long Short Term Memory (LSTM) neural network  

Across all assets, the naive lag-1 model consistently achieves the lowest RMSE and MAE on out-of-sample test data. This reflects the strong persistence of daily log prices and is consistent with the academic literature on how hard it is to beat simple baselines at short forecasting horizons.

Figures and tables in this notebook illustrate:

- Universe normalized prices  
- Correlation structure  
- Model predictions vs actual values for key symbols  
- Per-symbol RMSE comparison  
- Summary statistics of daily returns  

These visuals are referenced extensively in the full written report.


In [None]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt
from IPython.display import Image, display

# Load the CSV outputs produced by the model training scripts (not included in this notebook)
metrics = pd.read_csv("metrics_all_symbols.csv")
best = pd.read_csv("best_model_per_symbol.csv")
pivot = pd.read_csv("rmse_pivot_table.csv")
returns = pd.read_csv("returns_summary_stats.csv")

metrics.head()


# 1. Introduction

Stock price prediction remains a central challenge in quantitative finance. Despite the growth of machine learning and deep learning architectures, financial time series remain highly noisy, nonlinear, and regime dependent. This creates uncertainty about whether neural networks truly outperform strong statistical baselines.

This project investigates:

> Can an LSTM neural network outperform simple statistical baselines such as naive lag-1 and moving averages for 1-day-ahead stock price prediction?

Using a large, diverse universe of equities and index ETFs, we run:

- A unified LSTM forecasting pipeline  
- Rolling out-of-sample evaluation  
- RMSE and MAE comparison across models  
- Visual and tabular diagnostics  

The results align with academic findings that **the naive lag-1 benchmark is extremely strong**, often beating deep learning models when forecasting horizons are short and only price history is used as input.
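
The exact split scheme is not spelled out in this notebook, so the following is only a minimal sketch of one common walk-forward layout. It assumes an expanding training window; the function name `walk_forward_splits` and the fold sizes are hypothetical and chosen for illustration.

```python
import numpy as np

def walk_forward_splits(n_obs, initial_train, test_size):
    """Yield (train_idx, test_idx) pairs with an expanding training window.

    Each test block starts right after the data used for training,
    so no future observation leaks into the fit.
    """
    start = initial_train
    while start + test_size <= n_obs:
        yield np.arange(0, start), np.arange(start, start + test_size)
        start += test_size

# Hypothetical sizes: 1000 observations, 500 for the first fit, 100 per test block
splits = list(walk_forward_splits(n_obs=1000, initial_train=500, test_size=100))
for train_idx, test_idx in splits:
    print(f"train [0, {train_idx[-1]}]  test [{test_idx[0]}, {test_idx[-1]}]")
```

Every model, including the naive benchmark, would be scored on the same test blocks so that the RMSE and MAE figures are comparable.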


# 2. Data and Universe Construction

We study a broad multi sector equity universe including:

- Mega cap technology (AAPL, MSFT, NVDA, AMZN, META, GOOGL, GOOG, TSLA)  
- Consumer names (COST, WMT, MCD, DIS, HD, PEP, KO)  
- Financials (JPM, GS, SCHW)  
- Healthcare / pharma (UNH, ABBV, JNJ, PFE)  
- Energy (XOM, CVX)  
- Industrials (CAT, BA, TM)  
- Luxury (MC.PA)  
- Index ETFs & indices (SPY, QQQ, DIA, ^GSPC, ^NDX, ^DJI)

Daily adjusted close data from 2010 to 2025 is used, aligned across trading days.

We compute:

- Daily arithmetic returns  
- Log price transformations  
- Universe level summary statistics  
- Pairwise correlation matrix  
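
As a rough illustration of these four computations, the sketch below runs them on synthetic prices. The tickers `AAA`, `BBB`, `CCC` and all numbers are made up, and the project's own preprocessing may differ in detail.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(1)
dates = pd.bdate_range("2020-01-01", periods=500)

# Synthetic adjusted-close prices: a shared market factor plus asset-specific noise
market = rng.normal(0.0003, 0.01, size=(500, 1))
idio = rng.normal(0.0, 0.01, size=(500, 3))
prices = pd.DataFrame(
    100 * np.exp(np.cumsum(market + idio, axis=0)),
    index=dates,
    columns=["AAA", "BBB", "CCC"],
)

daily_returns = prices.pct_change().dropna()   # daily arithmetic returns
log_prices = np.log(prices)                    # log price transformation
normalized = prices / prices.iloc[0]           # every series starts at 1.0
summary = daily_returns.agg(["mean", "std", "min", "max"]).T  # one row per symbol
corr = daily_returns.corr()                    # pairwise correlation matrix

print(summary.round(4))
print(corr.round(2))
```

The shared factor in the synthetic data plays the role of broad market exposure, which is why the off-diagonal correlations come out positive.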


In [None]:
display(Image(filename="universe_normalized_prices.png"))


### Interpretation

The massive divergence in growth across assets is immediately visible.

- High growth, high volatility names (NVDA, TSLA, etc) dominate the upper end.  
- Defensive sectors remain stable.  
- Index trackers show smooth compounded growth curves.  
- Drawdowns such as the 2020 crash and the 2022 sell-off appear clearly.

This hints at where forecasting is hardest:  
**the larger the swings, the harder the precise 1 day prediction problem becomes.**


In [None]:
display(Image(filename="corr_heatmap.png"))


### Interpretation

The heatmap shows strong block structure:

- Tech names cluster together  
- Energy stocks correlate with each other  
- Index ETFs exhibit near 1 correlations with their underlying benchmarks  
- Cross sector relationships remain positive due to broad market beta exposure  

This supports future work on:

- Multi asset LSTM  
- Factor extraction  
- Cross asset regularisation  


In [None]:
returns.head(10)


### Key Observations

- High volatility symbols (NVDA, NFLX, AMD) exhibit standard deviations exceeding 3 percent per day.  
- Index trackers like SPY and DIA have volatility near 1 percent.  
- All assets show positive sample mean returns.

Because forecast errors scale with the size of daily moves, higher volatility raises RMSE for every model on these assets.
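
The link between volatility and error is easiest to see for the naive benchmark: its error on log prices is the daily log return itself, so its RMSE is the root mean square of those returns. The sketch below checks this on synthetic random walks; the 1 percent and 3 percent volatilities are illustrative stand-ins for an index tracker and a high volatility name, not project data.

```python
import numpy as np

def naive_rmse(log_price):
    """RMSE of the lag-1 forecast: the error at t is log_price[t] - log_price[t-1]."""
    err = log_price[1:] - log_price[:-1]
    return float(np.sqrt(np.mean(err ** 2)))

rng = np.random.default_rng(2)
shocks = rng.normal(0.0, 1.0, size=2500)   # same shocks reused for both assets

naive_rmse_by_vol = {}
for daily_vol in (0.01, 0.03):
    log_price = np.cumsum(daily_vol * shocks)   # synthetic random-walk log price
    naive_rmse_by_vol[daily_vol] = naive_rmse(log_price)
    print(f"daily vol {daily_vol:.2f} -> naive RMSE {naive_rmse_by_vol[daily_vol]:.4f}")
```

Tripling the volatility triples the naive RMSE, so RMSE values are only comparable across symbols after accounting for each symbol's volatility.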


In [None]:
display(Image(filename="fig_AAPL_test_models.png"))
display(Image(filename="fig_QQQ_test_models.png"))
display(Image(filename="fig_SPY_test_models.png"))


### Interpretation

All models track the log price tightly.

However, visually tight tracking does *not* imply better RMSE:

- Naive lag follows price immediately  
- Moving averages lag behind during sharp reversals  
- LSTM occasionally over or undershoots turning points  

Subtle deviations accumulate into measurable error.
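
The behaviour of the three statistical baselines can be reproduced on synthetic data. This is a minimal sketch: it assumes each forecast for day t uses only values up to day t-1 and that the moving averages are simple equal-weight means, which may differ from the project's exact definitions. The helper names are hypothetical.

```python
import numpy as np
import pandas as pd

def baseline_forecasts(log_price: pd.Series) -> pd.DataFrame:
    """One-day-ahead forecasts built only from values observed up to t-1."""
    return pd.DataFrame({
        "actual": log_price,
        "naive": log_price.shift(1),                    # yesterday's value
        "ma5": log_price.rolling(5).mean().shift(1),    # mean of the previous 5 days
        "ma20": log_price.rolling(20).mean().shift(1),  # mean of the previous 20 days
    }).dropna()

def rmse(actual, pred):
    return float(np.sqrt(np.mean((actual - pred) ** 2)))

def mae(actual, pred):
    return float(np.mean(np.abs(actual - pred)))

# Synthetic random-walk log prices (illustrative only, not project data)
rng = np.random.default_rng(0)
log_price = pd.Series(np.cumsum(rng.normal(0.0, 0.01, size=2000)))

fc = baseline_forecasts(log_price)
scores = {
    m: {"rmse": rmse(fc["actual"], fc[m]), "mae": mae(fc["actual"], fc[m])}
    for m in ["naive", "ma5", "ma20"]
}
for model, s in scores.items():
    print(f"{model:>5}: RMSE={s['rmse']:.5f}  MAE={s['mae']:.5f}")
```

On a random walk the error grows with the averaging window, because a longer window puts more weight on stale prices.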


In [None]:
pivot.head(15)


### Interpretation

The RMSE pivot table reveals the central empirical finding:

> The naive lag-1 model beats LSTM, MA5, and MA20 for **every** symbol.

This matches the academic literature:  
At a 1 day horizon, the last observed price already contains nearly all of the predictable information. Under a random walk assumption, copying it forward is the minimum mean squared error forecast.

An LSTM trained on price history alone is unlikely to beat this unless:

- More features are added  
- The horizon is extended  
- A volatility or directional objective is used  
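
The random walk argument can be written out in a few lines. The notation here is introduced for illustration: $p_t$ is the log price, $\varepsilon_{t+1}$ the one-day shock, $\mathcal{F}_t$ the information available at day $t$, and $f_t$ any forecast built from that information.

```latex
\begin{align*}
p_{t+1} &= p_t + \varepsilon_{t+1},
  \qquad \mathbb{E}[\varepsilon_{t+1} \mid \mathcal{F}_t] = 0,
  \qquad \mathbb{E}[\varepsilon_{t+1}^2] = \sigma^2 \\
\mathbb{E}\big[(p_{t+1} - f_t)^2\big]
  &= \mathbb{E}\big[(p_t - f_t)^2\big]
   + 2\,\mathbb{E}\big[(p_t - f_t)\,\varepsilon_{t+1}\big]
   + \mathbb{E}\big[\varepsilon_{t+1}^2\big] \\
  &= \mathbb{E}\big[(p_t - f_t)^2\big] + \sigma^2 \;\ge\; \sigma^2
\end{align*}
```

The cross term vanishes because $p_t - f_t$ is known at day $t$ and the shock has zero conditional mean. The bound is attained at $f_t = p_t$, which is the naive lag-1 forecast, so under this assumption any model using the same information can match it at best.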


In [None]:
best.head(20)


### Summary

Across the entire universe:

- Best RMSE = Naive lag-1  
- MA5 sometimes comes close  
- MA20 consistently worse  
- LSTM almost never wins  

Notably, the LSTM's shortfall is largest on volatile names:

- NVDA LSTM RMSE ≈ 0.024  
- NVDA naive RMSE ≈ 0.0028  
- AVGO LSTM RMSE is more than 10× the naive RMSE  


In [None]:
# Reshape the long metrics table into one row per symbol and one RMSE column per model
rmse_by_model = metrics.pivot(index="Symbol", columns="Model", values="RMSE")

rmse_by_model.head()
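
A wide RMSE table like this makes the best model per symbol and the LSTM-to-naive ratio straightforward to derive. The sketch below uses the same Symbol/Model/RMSE layout as the cell above, but the tickers, the model labels (`Naive`, `MA5`, `LSTM`), and every number are made up for illustration; the labels in `metrics_all_symbols.csv` may differ.

```python
import pandas as pd

# Toy long-format metrics (hypothetical values, not project results)
toy_metrics = pd.DataFrame({
    "Symbol": ["AAA", "AAA", "AAA", "BBB", "BBB", "BBB"],
    "Model": ["Naive", "MA5", "LSTM"] * 2,
    "RMSE": [0.010, 0.015, 0.020, 0.030, 0.044, 0.120],
})

rmse_wide = toy_metrics.pivot(index="Symbol", columns="Model", values="RMSE")

# Model with the lowest RMSE in each row
best_model = rmse_wide.idxmin(axis=1).rename("BestModel")

# How many times larger the LSTM error is than the naive error
lstm_vs_naive = (rmse_wide["LSTM"] / rmse_wide["Naive"]).rename("LSTM_to_Naive")

report = pd.concat([best_model, lstm_vs_naive], axis=1)
print(report)
```

Sorting `report` by `LSTM_to_Naive` would surface the symbols where the LSTM falls furthest behind the benchmark.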


# 6. Discussion

The results reinforce several truths about financial forecasting:

1. **Short horizon prediction is dominated by persistence.**  
2. **LSTMs require richer features to outperform simple rules.**  
3. **Volatility amplifies error.**  
4. **Baseline selection is essential**; without naive lag-1, one might falsely conclude LSTM is working.  

These insights guide future quant research, preventing wasted cycles on architectures that cannot beat strong, simple priors.


# 7. Conclusion

This notebook demonstrates:

- Clean benchmarking of neural networks against classical time series baselines  
- Consistent superiority of naive lag-1 for 1 day forecasts  
- Importance of honest out-of-sample evaluation  
- A framework ready for extension into trading simulations  

The notebook’s figures and tables are referenced directly in the full written report.


# References

These sources should also appear in the main written report:

- Chevalier G (2018) LARNN: Linear Attention Recurrent Neural Network  
- Deepika & Bhat (2021) An Efficient Stock Market Prediction Method Based on Kalman Filter  
- Dhyani B (2020) Stock Market Forecasting using ARIMA  
- Zhang, Aggarwal & Qi (2017) Stock Movement Prediction with LSTM  
- Hochreiter & Schmidhuber (1997) Long Short-Term Memory  
- Fama E (1970) Efficient Capital Markets: A Review of Theory and Empirical Work  
