# LSTM Stock Prediction Notebook

## Table of Contents
1. [Introduction](#1-introduction)
2. [Libraries](#2-libraries)  
3. [Data Collection](#3-data-collection)  
4. [Data Exploration](#4-data-exploration)  
5. [Feature Engineering](#5-feature-engineering)  
6. [Model Architecture](#6-model-architecture)  
7. [Training & Evaluation](#7-training--evaluation)  
8. [Results & Diagnostics](#8-results--diagnostics)  
9. [Conclusion](#9-conclusion)

## 1. Introduction

## 2. Libraries

In [7]:
# Core Libraries 
import pandas as pd
import numpy as np
import os
from dotenv import load_dotenv
from datetime import datetime

# Machine Learning Libraries
import tensorflow as tf
from tensorflow import keras
from sklearn.preprocessing import StandardScaler
from sklearn.metrics import mean_squared_error, mean_absolute_error, r2_score

# Visualization Libraries
import matplotlib.pyplot as plt
import seaborn as sns
import mplfinance as mpf

# Alpaca Libraries
from alpaca.data.historical import StockHistoricalDataClient
from alpaca.data.requests import StockBarsRequest
from alpaca.data.timeframe import TimeFrame

## 3. Data Collection

In [10]:
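# Load environment variables
load_dotenv()

# Read credentials and fail early if either is missing
api_key = os.getenv('ALPACA_API_KEY')
secret_key = os.getenv('ALPACA_SECRET_KEY')
if not api_key or not secret_key:
    raise RuntimeError("Set ALPACA_API_KEY and ALPACA_SECRET_KEY in the environment or a .env file")

# Gather data from Alpaca
client = StockHistoricalDataClient(api_key, secret_key)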
request_params = StockBarsRequest(
    symbol_or_symbols=["AAPL"],
    timeframe=TimeFrame.Day,
    start=datetime(2016, 7, 1),
    end=datetime(2025, 7, 1)
)
bars = client.get_stock_bars(request_params)

# Convert to DataFrame
df = bars.df.reset_index()
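
The request above does not set a price adjustment, so the returned bars may be raw rather than split-adjusted. Whether that is the API default is an assumption here, not something the notebook states. A stock split in raw data shows up as a single-day move far larger than ordinary trading, which a model would otherwise learn as a real price drop. The following is a minimal sketch of one way to flag such moves. The helper `find_split_like_jumps` and its 40% threshold are hypothetical choices, and the prices are synthetic, not market data.

```python
import pandas as pd


def find_split_like_jumps(close: pd.Series, threshold: float = 0.4) -> pd.Series:
    """Return day-over-day returns whose magnitude exceeds `threshold`."""
    returns = close.pct_change().dropna()
    return returns[returns.abs() > threshold]


# Synthetic closes (not real market data): a 4-for-1 split takes effect on the fourth day.
dates = pd.date_range("2020-01-01", periods=6, freq="D")
close = pd.Series([400.0, 404.0, 408.0, 103.0, 104.0, 105.0], index=dates)

jumps = find_split_like_jumps(close)
print(jumps)
```

On the notebook's data this could be called as `find_split_like_jumps(df.set_index("timestamp")["close"])`. If it flags a date, the usual remedy is to request adjusted bars from the data provider or to rescale the earlier prices before building features.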

## 4. Data Exploration

In [11]:
print("Head of DataFrame:")
print(df.head())

Head of DataFrame:
  symbol                 timestamp   open    high    low  close      volume  \
0   AAPL 2016-07-01 04:00:00+00:00  95.49  96.465  95.33  95.89  27180926.0   
1   AAPL 2016-07-05 04:00:00+00:00  95.39  95.400  94.46  95.04  30590138.0   
2   AAPL 2016-07-06 04:00:00+00:00  94.60  95.660  94.37  95.53  32320508.0   
3   AAPL 2016-07-07 04:00:00+00:00  95.70  96.500  95.62  95.94  26759405.0   
4   AAPL 2016-07-08 04:00:00+00:00  96.49  96.890  96.05  96.68  30976552.0   

   trade_count       vwap  
0     154544.0  95.995066  
1     153278.0  94.848509  
2     187589.0  95.158127  
3     143923.0  96.051727  
4     168615.0  96.635640  
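
In the output above, every timestamp reads 04:00 UTC, which for these July rows is midnight US Eastern time. For daily modeling it is often more convenient to index by the plain trading date and drop the constant `symbol` column. The sketch below shows one way to do that on two rows copied from the output. The `America/New_York` zone and the names `sample`, `local`, and `prices` are assumptions made for illustration.

```python
import pandas as pd

# Two rows copied from the head() output above; the remaining columns are omitted.
sample = pd.DataFrame(
    {
        "symbol": ["AAPL", "AAPL"],
        "timestamp": pd.to_datetime(
            ["2016-07-01 04:00:00+00:00", "2016-07-05 04:00:00+00:00"], utc=True
        ),
        "close": [95.89, 95.04],
    }
)

# Convert UTC to exchange-local time, then strip the timezone and time of day.
local = sample["timestamp"].dt.tz_convert("America/New_York")
sample["date"] = local.dt.tz_localize(None).dt.normalize()

# With a single ticker the symbol column carries no information.
prices = sample.drop(columns=["symbol", "timestamp"]).set_index("date")
print(prices)
```

Converting to local time before stripping the timezone matters because a late-evening UTC timestamp would otherwise land on the wrong calendar date.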


In [12]:
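print("\n DataFrame Info:")
# df.info() prints its report directly and returns None,
# so the outer print() adds the trailing "None" in the output.
print(df.info())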


 DataFrame Info:
<class 'pandas.core.frame.DataFrame'>
RangeIndex: 2261 entries, 0 to 2260
Data columns (total 9 columns):
 #   Column       Non-Null Count  Dtype              
---  ------       --------------  -----              
 0   symbol       2261 non-null   object             
 1   timestamp    2261 non-null   datetime64[ns, UTC]
 2   open         2261 non-null   float64            
 3   high         2261 non-null   float64            
 4   low          2261 non-null   float64            
 5   close        2261 non-null   float64            
 6   volume       2261 non-null   float64            
 7   trade_count  2261 non-null   float64            
 8   vwap         2261 non-null   float64            
dtypes: datetime64[ns, UTC](1), float64(7), object(1)
memory usage: 159.1+ KB
None
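
The report above shows 2261 non-null values in all nine columns, with `volume` and `trade_count` stored as `float64`. Those observations can be turned into explicit checks so that a later data refresh fails loudly instead of silently feeding gaps into the model. This is a minimal sketch under that assumption. `check_bars` is a hypothetical helper, and the sample frame reuses two rows from the earlier `head()` output.

```python
import pandas as pd


def check_bars(frame: pd.DataFrame) -> pd.DataFrame:
    """Validate basic assumptions and return a copy with integer count columns."""
    missing = frame.isna().sum()
    if missing.any():
        raise ValueError(f"Missing values found:\n{missing[missing > 0]}")
    if frame["timestamp"].duplicated().any():
        raise ValueError("Duplicate timestamps found")
    if not frame["timestamp"].is_monotonic_increasing:
        raise ValueError("Timestamps are not sorted in ascending order")

    out = frame.copy()
    for col in ("volume", "trade_count"):
        # Only cast when every value is a whole number, so no information is lost.
        if (out[col] % 1 == 0).all():
            out[col] = out[col].astype("int64")
    return out


# Two rows taken from the head() output; other columns are omitted for brevity.
sample = pd.DataFrame(
    {
        "timestamp": pd.to_datetime(
            ["2016-07-01 04:00:00+00:00", "2016-07-05 04:00:00+00:00"], utc=True
        ),
        "close": [95.89, 95.04],
        "volume": [27180926.0, 30590138.0],
        "trade_count": [154544.0, 153278.0],
    }
)

checked = check_bars(sample)
print(checked.dtypes)
```

Applied to the notebook's frame, the call would be `df = check_bars(df)`.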


In [13]:
print("\n DataFrame Descriptive Statistics:")
print(df.describe())


 DataFrame Descriptive Statistics:
              open         high          low        close        volume  \
count  2261.000000  2261.000000  2261.000000  2261.000000  2.261000e+03   
mean    182.703186   184.678842   180.916845   182.912084  5.948293e+07   
std      58.044565    58.851109    57.386703    58.246307  3.868233e+07   
min      94.600000    95.400000    94.370000    95.040000  2.080515e+06   
25%     145.660000   147.230000   144.370000   145.910000  3.100255e+07   
50%     172.400000   174.010000   170.970000   172.570000  4.871406e+07   
75%     204.430000   207.160000   202.586900   204.610000  7.803178e+07   
max     514.790000   515.140000   500.330000   506.090000  3.570209e+08   

        trade_count         vwap  
count  2.261000e+03  2261.000000  
mean   4.735065e+05   182.881165  
std    3.062269e+05    58.176753  
min    3.000000e+00    94.848509  
25%    2.021250e+05   145.814255  
50%    4.770020e+05   172.638542  
75%    6.347160e+05   204.960355  
max    2