# **Core Stock Data EDA for META Ticker**
## In this notebook we examine only the Meta (META) stock for the period selected for this project (01-01-2019 through 06-30-2024) and see what we can derive from it through our plots.  We look at each of our core stock tickers separately so that each one gets a focused analysis.

#### Let's start by bringing in the libraries and logic necessary for reading in our file.

In [1]:
import sys
import os

project_root = os.path.abspath(os.path.join(os.getcwd(), '..'))
if project_root not in sys.path:
    sys.path.append(project_root)

import numpy as np
import pandas as pd
import plotly.express as px
import plotly.graph_objs as go
from scipy.stats import linregress
from scipy.stats import gaussian_kde

#### Now let's read in our data that we need for this notebook.

In [2]:
# Now let's access the main core_stock_data.csv file
csv_path = os.path.join(project_root, 'data', 'core_stock_data.csv')
core_stock_data = pd.read_csv(csv_path, parse_dates=['Date'], index_col= 'Date')
core_stock_data.head()

Date,Close,Volume,Open,High,Low,SMA_core,EMA_core,RSI_core,Ticker
2019-03-14,45.932499,94318000,45.974998,46.025002,45.639999,41.35925,42.219051,75.741602,AAPL
2019-03-15,46.529999,156171600,46.212502,46.8325,45.935001,41.50025,42.388107,76.98591,AAPL
2019-03-18,47.005001,104879200,46.450001,47.0975,46.447498,41.7294,42.569162,78.724282,AAPL
2019-03-19,46.6325,126585600,47.087502,47.247501,46.48,41.92075,42.728509,73.527018,AAPL
2019-03-20,47.040001,124140800,46.557499,47.372501,46.182499,42.1219,42.897587,80.396901,AAPL


In [3]:
# Now let's select only the rows for our subject stock, META.
meta_data = core_stock_data[core_stock_data['Ticker'] == 'META']
print(meta_data.head())

                 Close    Volume        Open        High         Low  \
Date                                                                   
2019-01-02  135.679993  28146200  128.990005  137.509995  128.559998   
2019-01-03  131.740005  22717900  134.690002  137.169998  131.119995   
2019-01-04  137.949997  29002100  134.009995  138.000000  133.750000   
2019-01-07  138.050003  20089300  137.559998  138.869995  135.910004   
2019-01-08  142.529999  26263800  139.889999  143.139999  139.539993   

             SMA_core    EMA_core   RSI_core Ticker  
Date                                                 
2019-01-02  104.62104  135.679993  52.828980   META  
2019-01-03  104.62104  135.525483   0.000000   META  
2019-01-04  104.62104  135.620562  61.182311   META  
2019-01-07  104.62104  135.715835  61.561043   META  
2019-01-08  104.62104  135.983057  73.251918   META  


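#### Note that the first table above (the head of the full file) starts on 2019-03-14.  This is because of the rolling window used when we calculated our SMA (Simple Moving Average).  With a 50-day window and data starting on 01-01-2019, we need a whole window of prices before the first average can be computed, so 03-14 is the first date with a complete window.  The META rows printed above start on 2019-01-02 and show the same SMA_core value on each of the first five rows, so those early values appear to be filled in rather than computed from a full window.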
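
#### As a minimal sketch of why a 50-day window leaves the earliest rows without a value, the example below applies a rolling mean to synthetic prices.  The data and the names `close` and `sma_50` are illustrative assumptions, not the code that produced core_stock_data.csv.

```python
import numpy as np
import pandas as pd

# Synthetic business-day closes (illustrative values, not the project's data).
# Note: bdate_range skips weekends only, not market holidays.
dates = pd.bdate_range("2019-01-02", periods=60)
close = pd.Series(np.linspace(100.0, 159.0, 60), index=dates, name="Close")

# A 50-day simple moving average needs 50 observations before it can emit a value.
sma_50 = close.rolling(window=50).mean()

first_valid = sma_50.first_valid_index()
print("Rows without an SMA value:", sma_50.isna().sum())
print("First date with a full window:", first_valid.date())
```

#### The first 49 rows are NaN, which is why a file that drops those rows starts roughly ten weeks after the first trading day.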

#### Let's begin our EDA with a simple look at Closing price over time.  We will be using Plotly a lot here, as the interactivity of a single plot lets us draw multiple insights from it.

In [4]:
x = np.arange(len(meta_data))
y = meta_data['Close'].values

slope, intercept, r_value, p_value, std_err = linregress(x, y)

regression_line = slope * x + intercept


fig = go.Figure()

fig.add_trace(go.Scatter(x = meta_data.index, y = meta_data['Close'], mode = 'lines', name = 'Close Price'))

fig.add_trace(go.Scatter(x = meta_data.index, y = regression_line, mode = 'lines', name = 'Linear Trend', line = dict(color = 'red', dash = 'dash')))

fig.update_layout(title = 'META Closing Price with Linear Trend Line', xaxis_title = 'Date', yaxis_title = 'Price', template = 'plotly_dark')

fig.show()

#### Key Takeaways:  AAPL experienced price fluctuations during our observation period.  Looking into historical events we can see:
#### (2019 - 2020) Between 2019 and 2020 AAPL had slowing iPhone sales, which had a direct impact on its stock performance.  Trade tensions between the US and China (which AAPL relies on heavily) also weighed on the stock.

#### (2020 - 2023) Coming out of the pandemic AAPL began to recover rapidly as everyone began to go back to work.  There was a surge of demand for their products and it was reflected in their corresponding stock prices.  Between these events, their 2020 Q3 stock split, and the anticipation surrounding the 5G iPhone launch AAPL had an excellent period of strong performance.

#### (2023 - end of data (06-30-2024)) This period is a bit of a mixed bag for AAPL, for several reasons.  On the positive side there is the continuing AI surge, as well as their continued new product launches.  However, looming fears of a recession, continued inflation, and high interest rates (amongst other things) have kept AAPL from doing more in this period.

#### Now let's look at Volume over the same period for META.

In [5]:
# Converting to monthly totals for a smoother plot (only Volume is summed, since summing prices or the Ticker label is not meaningful)
meta_monthly = meta_data[['Volume']].resample('ME').sum()

fig = go.Figure()

fig.add_trace(go.Bar(x = meta_monthly.index, y = meta_monthly['Volume'], name = 'Monthly Volume', marker_color = 'cyan'))

fig.update_layout(title = 'META Monthly Volume Over Time', xaxis_title = 'Date', yaxis_title = 'Volume', template = 'plotly_dark')

fig.show()

#### Looking at the above monthly Volume for AAPL we can see a couple of things.  The same 2020 - 2021 time period is again noticeable, this time as AAPL's most active trading period.  After 2020 the trading volume leveled off substantially to norms for AAPL, and after 2022 it has declined even further into 2024.

#### Now let's look at the SMA and EMA (Simple Moving Average and Exponential Moving Average, respectively) for META, to see what trends exist in this time period.

In [6]:
fig = go.Figure()

# We will again use Close Price here as a starting figure for our SMA and EMA
fig.add_trace(go.Scatter(x = meta_data.index, y = meta_data['Close'], mode = 'lines', name = 'Close Price'))

# Plot our SMA
if 'SMA_core' in meta_data.columns:
    fig.add_trace(go.Scatter(x = meta_data.index, y = meta_data['SMA_core'], mode = 'lines', name = 'SMA 50'))

# Plot our EMA
if 'EMA_core' in meta_data.columns:
    fig.add_trace(go.Scatter(x = meta_data.index, y = meta_data['EMA_core'], mode = 'lines', name = 'EMA 50'))

fig.update_layout(title = 'META Closing Price with SMA and EMA', xaxis_title = 'Date', yaxis_title = 'Price', template = 'plotly_dark')
fig.show()

#### Let's take a look at this one, as it introduces some new concepts.  Looking only at the SMA and EMA lines for a second (the red and the green lines), if both of them are sloping upwards it can indicate an uptrend in the price and confirm bullish momentum.  We can clearly see this in several positions on our plot, notably in the same 2020 window we have discussed as well as the first half of 2023.  There are smaller examples but these two are the most obvious.  Conversely, if these lines are moving down together it can indicate a downtrend or bearish momentum.

#### Now if the Closing Price is above our SMA and EMA lines, the stock is trading above its recent average, which suggests continued strength.  This can also be seen in the same periods noted above.

#### Looking at SMA vs EMA, if the SMA is *above* the EMA line this suggests that recent pricing is weaker and can indicate a price slowdown or momentum downshift.  Conversely, if the EMA line is above the SMA line it can indicate a positive price shift and momentum upturn, which we can see most clearly in Jul-Aug of 2022 and Feb-Mar of 2023.
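
#### To illustrate why the EMA sits above the SMA after a recent price rise, here is a small sketch on synthetic prices.  It assumes the EMA is the recursive form given by `ewm(span=50, adjust=False)`; the first EMA_core values printed earlier are consistent with that form, but the exact settings used to build the file are an assumption.

```python
import pandas as pd

# Synthetic closes: flat at 100 for 100 days, then a jump to 110 for 10 days.
close = pd.Series([100.0] * 100 + [110.0] * 10)

# Simple moving average: every day in the window has equal weight.
sma_50 = close.rolling(window=50).mean()

# Exponential moving average: recent days carry more weight (alpha = 2 / (span + 1)).
ema_50 = close.ewm(span=50, adjust=False).mean()

print("SMA 50 after the jump:", round(sma_50.iloc[-1], 2))
print("EMA 50 after the jump:", round(ema_50.iloc[-1], 2))
```

#### Because the EMA weights the newest prices more heavily, it moves toward the new level faster than the SMA, which is the behavior the comparison above relies on.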

#### Now let's look at our RSI (Relative Strength Index) for our META data.

In [7]:
# Converting to monthly data (last value of each month) for a smoother plot
meta_monthly = meta_data.resample('ME').agg({
    'Close' : 'last',
    'RSI_core' : 'last'
})

fig = go.Figure()

# Plotting our RSI line
fig.add_trace(go.Scatter(x = meta_monthly.index, y = meta_monthly['RSI_core'], mode = 'lines', name = 'RSI'))

# Now adding lines for Overbought and Oversold at 70 and 30 respectively
fig.add_trace(go.Scatter(x = meta_monthly.index, y = [70]*len(meta_monthly), mode = 'lines', name = 'Overbought (70)', line = dict(dash = 'dash', color = 'red')))
fig.add_trace(go.Scatter(x = meta_monthly.index, y = [30]*len(meta_monthly), mode = 'lines', name = 'Oversold (30)', line = dict(dash = 'dash', color = 'green')))

fig.update_layout(title = 'META Monthly RSI Over Time', xaxis_title = 'Date', yaxis_title = 'RSI', template = 'plotly_dark')

fig.show()

#### This plot is very interesting as it can hint at potential price shifts.  An RSI above the red Overbought line (70) indicates that, while the price has been strong, it could be due for a reversal or a decrease.  You can see this happen multiple times in this plot: a sudden spike over the red line followed by a quick dip back below.  The reversal can take a while to appear, and the signal is a tendency rather than a guarantee.

#### Let's also look at the Oversold line (30).  Similarly to the Overbought line, it flags when a positive shift in price may be coming.  When the RSI value dips below the green line, the signs point toward a possible increase in pricing.  This can also be seen in the plot numerous times, especially in mid-2023 and early 2024.
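
#### For reference, here is a minimal sketch of how an RSI value can be computed.  It uses the simple-average variant with a 14-day window; both choices, and the helper name `rsi`, are assumptions for illustration and may differ from how RSI_core was built.

```python
import numpy as np
import pandas as pd

def rsi(close: pd.Series, window: int = 14) -> pd.Series:
    """Simple-average RSI (hypothetical helper): 100 - 100 / (1 + avg_gain / avg_loss)."""
    delta = close.diff()
    gain = delta.clip(lower=0)       # positive day-over-day moves, zero otherwise
    loss = (-delta).clip(lower=0)    # size of negative moves, zero otherwise
    avg_gain = gain.rolling(window).mean()
    avg_loss = loss.rolling(window).mean()
    rs = avg_gain / avg_loss         # relative strength
    return 100 - 100 / (1 + rs)

# Two synthetic extremes: a series that only rises and one that only falls.
rising = pd.Series(np.arange(100.0, 130.0))
falling = pd.Series(np.arange(130.0, 100.0, -1.0))

print("RSI of a steadily rising series:", rsi(rising).iloc[-1])
print("RSI of a steadily falling series:", rsi(falling).iloc[-1])
```

#### The two extremes land at the ends of the 0 to 100 scale, which is why readings above 70 and below 30 are treated as stretched in either direction.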

#### Now let's make use of some of the other features in our dataset and make a Candlestick Chart.

In [8]:
# We will again be using monthly sampling for interpretability
meta_monthly_candles = meta_data.resample('ME').agg({
    'Open' : 'first',
    'High' : 'max',
    'Low' : 'min',
    'Close' : 'last',
    'Volume' : 'sum',
    'SMA_core' : 'last',
    'EMA_core' : 'last'
})

# Let's start by compiling the features we need for this one.
fig = go.Figure(data = [go.Candlestick(x = meta_monthly_candles.index,
                open = meta_monthly_candles['Open'],
                high = meta_monthly_candles['High'],
                low = meta_monthly_candles['Low'],
                close = meta_monthly_candles['Close'],
                name = 'Candlesticks')])

# Now adding in the SMA again
fig.add_trace(go.Scatter(x = meta_monthly_candles.index, y = meta_monthly_candles['SMA_core'], mode = 'lines', name = 'SMA 50'))

# Adding in the EMA as well
fig.add_trace(go.Scatter(x = meta_monthly_candles.index, y = meta_monthly_candles['EMA_core'], mode = 'lines', name = 'EMA 50'))

fig.update_layout(title = 'META Candlestick Chart with SMA and EMA', xaxis_title = 'Date', yaxis_title = 'Price', template = 'plotly_dark')

fig.show()

#### Candlestick charts are great at showing a lot of information.  The body of each candle spans the opening and closing prices for the given window (in our case a month), and the thin wicks extend to the high and low.  The color (green for a positive change, red for a negative one) reflects the sign of the closing price minus the opening price for the window.

#### You can then begin to see trends just by noticing the colors, although there are other parts of the candlestick too.  You can notice buyer/seller behavior by looking at successive green or red candlesticks.  If you see multiple long red candlesticks together it could mean that sellers are pushing prices lower.  This can be seen in AAPL from Apr - Jun 2022.  Conversely, successive green candlesticks can show buyers pushing prices higher.  This can be seen in several places, especially from May 2024 to the end of our data.
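
#### The sketch below shows how a monthly candle and its color follow from daily prices.  The data is synthetic and the column names `Body` and `Direction` are hypothetical additions for illustration; months are grouped with `to_period('M')` here rather than `resample`.

```python
import numpy as np
import pandas as pd

# Synthetic daily prices: rising through January 2024, falling through February 2024.
dates = pd.bdate_range("2024-01-01", "2024-02-29")
t = np.arange(len(dates))
in_january = dates.month == 1
close = np.where(in_january, 100.0 + t, 150.0 - t)
open_ = np.where(in_january, close - 1.0, close + 1.0)

daily = pd.DataFrame({
    "Open": open_,
    "High": np.maximum(open_, close) + 0.5,
    "Low": np.minimum(open_, close) - 0.5,
    "Close": close,
}, index=dates)

# One candle per month: first open, highest high, lowest low, last close.
monthly = daily.groupby(daily.index.to_period("M")).agg(
    {"Open": "first", "High": "max", "Low": "min", "Close": "last"}
)

# The candle body is close minus open; its sign decides the color.
monthly["Body"] = monthly["Close"] - monthly["Open"]
monthly["Direction"] = np.where(monthly["Body"] >= 0, "green", "red")
print(monthly)
```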

#### Now let's look at a correlation heatmap to see which indicators are most closely related to price movements for our META stock.

In [10]:
corr_matrix = meta_data[['Close', 'Volume', 'SMA_core', 'EMA_core', 'RSI_core']].corr()

fig = px.imshow(corr_matrix, text_auto = True, aspect = 'auto', color_continuous_scale= 'Viridis')

fig.update_layout(title = 'Correlation Matrix of META Features', template = 'plotly_dark')
fig.show()

#### For the above correlation plot we are looking at which features correlate most strongly with our Close price, as that is going to be the main target for this project.  On the colorbar a score of 1 is a perfect positive correlation, and this chart shows that SMA_core and EMA_core have very strong correlations with Close.  This is expected, since both are averages computed from Close itself, but they can still be useful inputs when predicting future values.  Volume and RSI_core, while they provide useful information in other areas, show little linear relationship with Close.
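
#### As a quick illustration of why a moving average correlates so strongly with the price it is built from, the sketch below uses a synthetic trending series.  The data and the random seed are assumptions for illustration only.

```python
import numpy as np
import pandas as pd

# Synthetic trending price: a random walk with positive drift.
rng = np.random.default_rng(0)
close = pd.Series(100 + np.cumsum(rng.normal(loc=0.5, scale=1.0, size=1000)))

# A 50-day moving average derived from that same price.
sma_50 = close.rolling(window=50).mean()

# Series.corr ignores the rows where the moving average is still NaN.
corr = close.corr(sma_50)
print(f"Correlation between Close and its own 50-day SMA: {corr:.3f}")
```

#### A high value here follows from the construction of the indicator, so on its own it says little about how well the indicator predicts future prices.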

#### Now let's look at a distribution of daily returns for META.

In [12]:
# Calculating the pct_change of the Close column
meta_data = meta_data.copy()
meta_data.loc[:, 'daily_return'] = meta_data['Close'].pct_change()

# Drop the first row, which has no previous close and therefore no return
daily_returns = meta_data['daily_return'].dropna()

# Calculating the KDE for the daily returns
kde = gaussian_kde(daily_returns)
x_vals = np.linspace(daily_returns.min(), daily_returns.max(), 1000)
kde_vals = kde(x_vals)

# Calculating the histogram first without plotting
hist_values, bin_edges = np.histogram(daily_returns, bins = 50)

# Scale factor that matches the KDE peak to the tallest histogram bar (a visual overlay, not a density-to-count conversion)
manual_scaling_factor = max(hist_values) / max(kde_vals)

# Plotting histogram of daily returns
fig = go.Figure(data = [go.Histogram(x = daily_returns, nbinsx=50, name = 'Histogram', marker_color = 'blue', opacity = 0.6)])

# Apply the scale factor so the KDE curve is visible on the count axis
kde_vals_scaled = kde_vals * manual_scaling_factor

# Plotting the KDE Curve
fig.add_trace(go.Scatter(x = x_vals, y = kde_vals_scaled, mode = 'lines', name = 'KDE', line = dict(color = 'red')))

fig.update_layout(title = 'Distribution of META Daily Returns with KDE', xaxis_title = 'Daily Return', yaxis_title = 'Frequency', template = 'plotly_dark')
fig.show()
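
#### The cell above scales the KDE so that its peak matches the tallest bar.  An alternative, sketched below on synthetic returns, is to multiply the density by the number of observations times the bin width, which converts it into expected counts per bin.  The data and variable names are illustrative assumptions.

```python
import numpy as np
from scipy.stats import gaussian_kde

# Synthetic daily returns (illustrative only).
rng = np.random.default_rng(0)
returns = rng.normal(loc=0.0, scale=0.02, size=1000)

counts, bin_edges = np.histogram(returns, bins=50)
bin_width = bin_edges[1] - bin_edges[0]

# A density integrates to 1; density * N * bin_width is in the same units as the counts.
kde = gaussian_kde(returns)
x_vals = np.linspace(returns.min(), returns.max(), 1000)
kde_as_counts = kde(x_vals) * len(returns) * bin_width

# Compare areas: total bar area versus the area under the scaled curve (trapezoid rule).
hist_area = counts.sum() * bin_width
kde_area = np.sum((kde_as_counts[1:] + kde_as_counts[:-1]) / 2 * np.diff(x_vals))
print(f"histogram area: {hist_area:.4f}, scaled KDE area: {kde_area:.4f}")
```

#### With this scaling the two areas nearly match.  Plotly treats `nbinsx` as a maximum number of bins, so its bins can differ from the ones `np.histogram` returns; passing explicit bin settings to both would be needed for an exact overlay.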

#### There are some key takeaways from our distribution chart.  The first is that the distribution is mostly centered around 0, with very little spread from the center.  This near symmetry suggests that positive and negative returns are about equally likely and that there is no strong bias in the direction of the returns.  The roughly bell-shaped (normal-like) distribution also suggests that our AAPL stock behaves in a fairly predictable manner, where extreme returns are rare.  The minimal spread (distance away from 0) indicates low volatility as well.
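
#### Symmetry and tail weight can also be checked numerically rather than by eye.  The sketch below computes skewness and excess kurtosis on synthetic heavy-tailed returns; the helper `describe_returns` and the data are hypothetical, for illustration only.

```python
import numpy as np
import pandas as pd
from scipy.stats import kurtosis, skew

def describe_returns(returns: pd.Series) -> dict:
    """Summary statistics for a return series (hypothetical helper)."""
    return {
        "mean": returns.mean(),
        "std": returns.std(),
        "skew": skew(returns),                 # about 0 for a symmetric distribution
        "excess_kurtosis": kurtosis(returns),  # about 0 for a normal distribution
    }

# Synthetic heavy-tailed returns drawn from a Student-t distribution.
rng = np.random.default_rng(0)
synthetic_returns = pd.Series(rng.standard_t(df=3, size=5000) * 0.01)

stats = describe_returns(synthetic_returns)
for name, value in stats.items():
    print(f"{name}: {value:.4f}")
```

#### A clearly positive excess kurtosis means extreme returns occur more often than a normal distribution would suggest, even when the histogram looks bell-shaped.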

#### For one more view, let's look at a Rolling Mean and Volatility plot.  We will use this to understand the stability of price movements over time, as it helps identify periods of high uncertainty and/or strong trends in our pricing.

In [13]:
# First let's create the rolling mean and rolling std needed for this plot.
# We will keep the same window size as our SMA and EMA windows for consistency and to also help us as we view the long-term analysis.
meta_data = meta_data.copy()

meta_data.loc[:, 'Rolling_Mean'] = meta_data['Close'].rolling(window = 50).mean()
meta_data.loc[:, 'Rolling_Std'] = meta_data['Close'].rolling(window = 50).std()

fig = go.Figure()

# Plot the Rolling Mean on primary y-axis
fig.add_trace(go.Scatter(x = meta_data.index, y = meta_data['Rolling_Mean'], mode = 'lines', name = 'Rolling Mean'))

# Plot the Rolling Std (Volatility) on secondary y-axis
fig.add_trace(go.Scatter(x = meta_data.index, y = meta_data['Rolling_Std'], mode = 'lines', name = 'Rolling Std (Volatility)', line = dict(dash = 'dash'), yaxis = 'y2'))

fig.update_layout(title = 'META Rolling Mean and Volatility',
                xaxis_title = 'Date',
                yaxis_title = 'Price',
                yaxis2 = dict(
                    title = 'Volatility (Rolling Std)',
                    overlaying = 'y',
                    side = 'right'
                ),
                template = 'plotly_dark'
)

fig.show()

#### In the plot above we can again see the consistent price increase over time in the blue line, the Rolling Mean.  Our previous plots have shown this as well, so this is further reinforcement of that fact.  The red line, the Rolling Std, is a bit more important as it displays Volatility.  Our window size of 50 days plays a part in this too, as it smooths out short-term volatility and provides a longer-term view of price stability.  As we can see, the price for AAPL (barring a couple of speedbumps) is quite stable, with little volatility over the timeframe.
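
#### The Rolling_Std above is measured in price units, so it tends to grow as the price level grows.  A common alternative, sketched below on synthetic data, is the rolling standard deviation of daily returns.  The 252 trading days used to annualize and all variable names are assumptions for illustration.

```python
import numpy as np
import pandas as pd

# Synthetic price path built from random daily log returns (illustrative only).
rng = np.random.default_rng(1)
dates = pd.bdate_range("2019-01-02", periods=300)
close = pd.Series(100 * np.exp(np.cumsum(rng.normal(0.0, 0.02, 300))), index=dates)

# Volatility of returns is unit-free, unlike the standard deviation of price.
daily_return = close.pct_change()
rolling_vol = daily_return.rolling(window=50).std()

# Annualize under the assumption of 252 trading days per year.
annualized_vol = rolling_vol * np.sqrt(252)
print(annualized_vol.dropna().tail())
```

#### The first 50 rows are NaN here: one for the missing first return plus 49 while the window fills.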

## Summary of Findings for AAPL
#### We have taken a good look at our AAPL stock covering the given time period.  Here are some of the key takeaways from our plots:

#### -AAPL Stock Price has shown to be an extremely consistent performer with minimal volatility.
#### -AAPL's Closing Price is strongly correlated with the SMA (Simple Moving Average) and EMA (Exponential Moving Average).
#### -AAPL's price behaves in a fairly predictable manner, as its daily returns show a normal distribution-like shape.