## Granger causality (GC) of NVDA and TSM on AAPL, 2018-2023

## 1. Stationarity

- Before applying the Granger causality (GC) test, test each series for stationarity.
- The Augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) tests both check for stationarity.
- Null hypothesis: the time series has at least one unit root (i.e. it is non-stationary).
- Alternative hypothesis: the time series has no unit root.

## 2. GC for non-stationary series (Toda & Yamamoto 1995)
- Check whether the two series are cointegrated.
- Apply the GC test in both directions.

In [2]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

In [3]:
## https://www.machinelearningplus.com/time-series/granger-causality-test-in-python/
def grangers_causation_matrix(data, variables,test='ssr_chi2test', verbose=False):    
    """Check Granger Causality of all possible combinations of the Time series.
    Rows are the response (Y) variables and columns are the predictors (X). Each cell
    holds the smallest p-value over lags 1..10 for the chosen test. A p-value below the
    significance level (0.05) rejects the null hypothesis that the coefficients on the
    past values of X are all zero, i.e. that X does not Granger-cause Y.

    data      : pandas dataframe containing the time series variables
    variables : list containing names of the time series variables.
    """
    df = pd.DataFrame(np.zeros((len(variables), len(variables))), columns=variables, index=variables)
    for c in df.columns:
        for r in df.index:
            test_result = grangercausalitytests(data[[r, c]], maxlag=10, verbose=False)
            p_values = [round(test_result[i+1][0][test][1],4) for i in range(10)]
            if verbose: print(f'Y = {r}, X = {c}, P Values = {p_values}')
            min_p_value = np.min(p_values)
            df.loc[r, c] = min_p_value
    df.columns = [var + '_x' for var in variables]
    df.index = [var + '_y' for var in variables]
    return df

In [4]:
import yfinance as yf
import pandas as pd
import numpy as np
import seaborn as sns
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import statsmodels

In [5]:
from pandas_datareader import data as pdr

import yfinance as yf
yf.pdr_override() 

## AAPL data from 2018 to 2023
data_aapl = pdr.get_data_yahoo("AAPL", start="2018-01-01", end="2023-12-31")

## TSM data from 2018 to 2023
data_tsm = pdr.get_data_yahoo("TSM", start="2018-01-01", end="2023-12-31")

## NVDA data from 2018 to 2023
data_nvda = pdr.get_data_yahoo("NVDA", start="2018-01-01", end="2023-12-31")

[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed
[*********************100%%**********************]  1 of 1 completed


## Compute Adj Close Diff and Daily Return for AAPL, TSM and NVDA

In [9]:
data_aapl['Adj Close Diff'] = data_aapl['Adj Close'].diff()
data_aapl['Daily Return'] = data_aapl['Adj Close'].pct_change()

data_tsm['Adj Close Diff'] = data_tsm['Adj Close'].diff()
data_tsm['Daily Return'] = data_tsm['Adj Close'].pct_change()

data_nvda['Adj Close Diff'] = data_nvda['Adj Close'].diff()
data_nvda['Daily Return'] = data_nvda['Adj Close'].pct_change()
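
As a toy illustration of the two transforms used above, `diff()` gives the change in price units while `pct_change()` gives the relative change. The three prices below are made up for the example and are not market data.

```python
import pandas as pd

# Made-up prices (assumption), chosen so the arithmetic is easy to follow.
prices = pd.Series([100.0, 102.0, 99.96])

price_diff = prices.diff()          # P_t - P_{t-1}
daily_return = prices.pct_change()  # P_t / P_{t-1} - 1

print(pd.DataFrame({"price": prices,
                    "diff": price_diff,
                    "return": daily_return}))
```

The first row of both columns is NaN because there is no previous price, which is why the notebook later calls `dropna()` before testing.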

In [10]:
data_aapl['TSM Adj Close Diff'] = data_tsm['Adj Close Diff'].copy()
data_aapl['TSM Daily Return'] = data_tsm['Daily Return'].copy()

data_aapl['NVDA Adj Close Diff'] = data_nvda['Adj Close Diff'].copy()
data_aapl['NVDA Daily Return'] = data_nvda['Daily Return'].copy()

## Apply GC test

The GC test assumes stationary time series. Applied to non-stationary series, it tends to reject the null too often and report spurious causality.

In [11]:
from statsmodels.tsa.stattools import acf, pacf, grangercausalitytests

## GC matrix

In [12]:
cols = ['Adj Close Diff','Daily Return','TSM Adj Close Diff','TSM Daily Return','NVDA Adj Close Diff','NVDA Daily Return']
df = data_aapl[cols].copy()

In [14]:
df = df.dropna()

In [15]:
print("Granger Causality from 2018 to 2023")

grangers_causation_matrix(df, variables=df.columns)

Granger Causality from 2018 to 2023


| | Adj Close Diff_x | Daily Return_x | TSM Adj Close Diff_x | TSM Daily Return_x | NVDA Adj Close Diff_x | NVDA Daily Return_x |
|---|---|---|---|---|---|---|
| Adj Close Diff_y | 1.0 | 0.0105 | 0.1023 | 0.0233 | 0.0093 | 0.0204 |
| Daily Return_y | 0.0078 | 1.0 | 0.3174 | 0.0382 | 0.0323 | 0.0211 |
| TSM Adj Close Diff_y | 0.0273 | 0.0143 | 1.0 | 0.1794 | 0.0281 | 0.1164 |
| TSM Daily Return_y | 0.0111 | 0.0 | 0.0248 | 1.0 | 0.0473 | 0.3441 |
| NVDA Adj Close Diff_y | 0.1598 | 0.063 | 0.0761 | 0.0566 | 1.0 | 0.2497 |
| NVDA Daily Return_y | 0.0374 | 0.0 | 0.0573 | 0.0022 | 0.0148 | 1.0 |


In [29]:
## Granger Causality test: does TSM Daily Return Granger-cause AAPL Adj Close Diff?
grangercausalitytests(df[['Adj Close Diff','TSM Daily Return']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=3.7413  , p=0.0533  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=3.7488  , p=0.0528  , df=1
likelihood ratio test: chi2=3.7441  , p=0.0530  , df=1
parameter F test:         F=3.7413  , p=0.0533  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=3.7478  , p=0.0238  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=7.5205  , p=0.0233  , df=2
likelihood ratio test: chi2=7.5018  , p=0.0235  , df=2
parameter F test:         F=3.7478  , p=0.0238  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.7984  , p=0.0389  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=8.4344  , p=0.0378  , df=3
likelihood ratio test: chi2=8.4109  , p=0.0382  , df=3
parameter F test:         F=2.7984  , p=0.0389  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=2.5977  , p=0.

{1: ({'ssr_ftest': (3.7413290345935546, 0.05326958670972632, 1504.0, 1),
   'ssr_chi2test': (3.748791791976388, 0.05284569706447627, 1),
   'lrtest': (3.744136789586264, 0.05299309281493102, 1),
   'params_ftest': (3.7413290345937344, 0.05326958670972071, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24ed8bc2ea0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24ed8ac7860>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (3.747777591620685, 0.023790927332119644, 1501.0, 2),
   'ssr_chi2test': (7.52052372149334, 0.023277644075482766, 2),
   'lrtest': (7.501808353265005, 0.02349649127015493, 2),
   'params_ftest': (3.7477775916208014, 0.023790927332118683, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edbd5bf20>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edbd5b020>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (2.798

GC at lag = 2

In [30]:
## Granger Causality test on NVDA Adj Close Diff
grangercausalitytests(df[['Adj Close Diff','NVDA Adj Close Diff']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=3.9421  , p=0.0473  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=3.9499  , p=0.0469  , df=1
likelihood ratio test: chi2=3.9448  , p=0.0470  , df=1
parameter F test:         F=3.9421  , p=0.0473  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=4.5070  , p=0.0112  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=9.0440  , p=0.0109  , df=2
likelihood ratio test: chi2=9.0169  , p=0.0110  , df=2
parameter F test:         F=4.5070  , p=0.0112  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=3.1217  , p=0.0251  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=9.4089  , p=0.0243  , df=3
likelihood ratio test: chi2=9.3796  , p=0.0246  , df=3
parameter F test:         F=3.1217  , p=0.0251  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=2.5059  , p=0.

{1: ({'ssr_ftest': (3.9420673447441144, 0.04727422155633112, 1504.0, 1),
   'ssr_chi2test': (3.9499305109902796, 0.04687330335828592, 1),
   'lrtest': (3.9447630451195437, 0.047017474980609535, 1),
   'params_ftest': (3.9420673447442156, 0.04727422155633112, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edbd3aff0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb6595b0>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (4.506964458783431, 0.011181601243849034, 1501.0, 2),
   'ssr_chi2test': (9.043955329684007, 0.0108675101095979, 2),
   'lrtest': (9.016907808014366, 0.011015478003219264, 2),
   'params_ftest': (4.506964458783358, 0.011181601243849034, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edbd58e60>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9b4200>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (3.1

GC at lag = 8

In [31]:
## Granger Causality test on NVDA Daily Return
grangercausalitytests(df[['Adj Close Diff','NVDA Daily Return']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=0.4443  , p=0.5052  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=0.4452  , p=0.5046  , df=1
likelihood ratio test: chi2=0.4451  , p=0.5047  , df=1
parameter F test:         F=0.4443  , p=0.5052  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=3.3105  , p=0.0368  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=6.6430  , p=0.0361  , df=2
likelihood ratio test: chi2=6.6284  , p=0.0364  , df=2
parameter F test:         F=3.3105  , p=0.0368  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.1806  , p=0.0885  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=6.5725  , p=0.0868  , df=3
likelihood ratio test: chi2=6.5581  , p=0.0874  , df=3
parameter F test:         F=2.1806  , p=0.0885  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=1.6461  , p=0.

{1: ({'ssr_ftest': (0.4442712368443283, 0.5051703133916579, 1504.0, 1),
   'ssr_chi2test': (0.4451574161731401, 0.5046436367554201, 1),
   'lrtest': (0.4450916809018963, 0.5046751004838429, 1),
   'params_ftest': (0.44427123684434777, 0.5051703133916579, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edbd5b2c0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edbd3a780>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (3.3104632643664975, 0.03676593946904521, 1501.0, 2),
   'ssr_chi2test': (6.642981580460953, 0.03609897561973727, 2),
   'lrtest': (6.628373392381036, 0.03636361121753626, 2),
   'params_ftest': (3.3104632643665464, 0.03676593946904521, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9b6330>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9b79e0>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (2.18062

GC at lag = 7

In [32]:
## Granger Causality test on TSM Daily Return
grangercausalitytests(df[['Daily Return','TSM Daily Return']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=1.2387  , p=0.2659  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=1.2412  , p=0.2652  , df=1
likelihood ratio test: chi2=1.2406  , p=0.2653  , df=1
parameter F test:         F=1.2387  , p=0.2659  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=2.4762  , p=0.0844  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=4.9690  , p=0.0834  , df=2
likelihood ratio test: chi2=4.9608  , p=0.0837  , df=2
parameter F test:         F=2.4762  , p=0.0844  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.1394  , p=0.0934  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=6.4482  , p=0.0917  , df=3
likelihood ratio test: chi2=6.4345  , p=0.0923  , df=3
parameter F test:         F=2.1394  , p=0.0934  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=2.2103  , p=0.

{1: ({'ssr_ftest': (1.2386847209461431, 0.2659025400804121, 1504.0, 1),
   'ssr_chi2test': (1.2411555016395197, 0.26524847664669093, 1),
   'lrtest': (1.2406446782533749, 0.26534684477409126, 1),
   'params_ftest': (1.2386847209455307, 0.265902540080534, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9b7230>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb281610>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (2.47624224136242, 0.08440183988966797, 1501.0, 2),
   'ssr_chi2test': (4.968981766144976, 0.08336798776320604, 2),
   'lrtest': (4.960802283056182, 0.0837096394435645, 2),
   'params_ftest': (2.4762422413623026, 0.0844018398896755, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9c73e0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9c65a0>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (2.139415135

GC at lag = 5

In [33]:
## Granger Causality test on NVDA Adj Close Diff
grangercausalitytests(df[['Daily Return','NVDA Adj Close Diff']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=4.5735  , p=0.0326  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=4.5826  , p=0.0323  , df=1
likelihood ratio test: chi2=4.5756  , p=0.0324  , df=1
parameter F test:         F=4.5735  , p=0.0326  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=3.2019  , p=0.0410  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=6.4251  , p=0.0403  , df=2
likelihood ratio test: chi2=6.4114  , p=0.0405  , df=2
parameter F test:         F=3.2019  , p=0.0410  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.1812  , p=0.0884  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=6.5742  , p=0.0868  , df=3
likelihood ratio test: chi2=6.5599  , p=0.0873  , df=3
parameter F test:         F=2.1812  , p=0.0884  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=1.6909  , p=0.

{1: ({'ssr_ftest': (4.573458389636074, 0.03263164882987908, 1504.0, 1),
   'ssr_chi2test': (4.582580979509018, 0.03229853086058829, 1),
   'lrtest': (4.575627571046425, 0.032429865119280976, 1),
   'params_ftest': (4.573458389635366, 0.03263164882989704, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9c4bf0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9b7c20>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (3.2018716267705405, 0.04096403100562806, 1501.0, 2),
   'ssr_chi2test': (6.42507484332636, 0.04025434133564879, 2),
   'lrtest': (6.4114079948631115, 0.040530358323681, 2),
   'params_ftest': (3.2018716267706493, 0.04096403100562153, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9ce600>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9ce300>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (2.1812053

GC at lag = 1

In [34]:
## Granger Causality test on NVDA Daily Return
grangercausalitytests(df[['Daily Return','NVDA Daily Return']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=2.6007  , p=0.1070  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=2.6059  , p=0.1065  , df=1
likelihood ratio test: chi2=2.6036  , p=0.1066  , df=1
parameter F test:         F=2.6007  , p=0.1070  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=3.8459  , p=0.0216  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=7.7174  , p=0.0211  , df=2
likelihood ratio test: chi2=7.6977  , p=0.0213  , df=2
parameter F test:         F=3.8459  , p=0.0216  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.6418  , p=0.0480  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=7.9626  , p=0.0468  , df=3
likelihood ratio test: chi2=7.9416  , p=0.0472  , df=3
parameter F test:         F=2.6418  , p=0.0480  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=2.0107  , p=0.

{1: ({'ssr_ftest': (2.600701659608648, 0.10702620941621012, 1504.0, 1),
   'ssr_chi2test': (2.6058892294083993, 0.10646742402038134, 1),
   'lrtest': (2.6036387846215803, 0.10661866760786043, 1),
   'params_ftest': (2.600701659608253, 0.10702620941624248, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9c7290>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9cfb00>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (3.84589805176766, 0.021578074798249143, 1501.0, 2),
   'ssr_chi2test': (7.7174183423878695, 0.02109521226342795, 2),
   'lrtest': (7.697711883254669, 0.02130409563052955, 2),
   'params_ftest': (3.8458980517673296, 0.021578074798257182, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9e0980>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x24edb9e1b20>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (2.641

GC at lag = 2

## VAR model

In [17]:
from statsmodels.tsa.api import VAR

In [18]:
df_Diff = df.copy()
df_Diff = df_Diff.drop(['Daily Return','TSM Adj Close Diff'], axis=1)
df_Diff

| Date | Adj Close Diff | TSM Daily Return | NVDA Adj Close Diff | NVDA Daily Return |
|---|---|---|---|---|
| 2018-01-03 | -0.007069 | 0.016821 | 3.245312 | 0.065814 |
| 2018-01-04 | 0.188625 | -0.005274 | 0.277031 | 0.005271 |
| 2018-01-05 | 0.464485 | 0.023379 | 0.447720 | 0.008474 |
| 2018-01-08 | -0.153252 | -0.000471 | 1.632553 | 0.030641 |
| 2018-01-09 | -0.004715 | -0.006126 | -0.014851 | -0.000270 |
| ... | ... | ... | ... | ... |
| 2023-12-22 | -1.077133 | 0.005851 | -1.599945 | -0.003266 |
| 2023-12-26 | -0.548553 | 0.012603 | 4.489838 | 0.009195 |
| 2023-12-27 | 0.099716 | 0.001915 | 1.379913 | 0.002800 |
| 2023-12-28 | 0.428879 | 0.000478 | 1.049957 | 0.002125 |


In [19]:
model_diff = VAR(df_Diff)



In [21]:
ps = range(1,10)
AIC = np.zeros(len(ps))
for i, p in enumerate(ps):
    result = model_diff.fit(p)
    AIC[i] = result.aic

In [22]:
AIC

array([-11.57196843, -11.56106429, -11.5475762 , -11.53799312,
       -11.5315108 , -11.52993829, -11.52754989, -11.5234151 ,
       -11.51573193])

In [26]:
print('Minimum AIC =', np.min(AIC), "at lag =", ps[np.argmin(AIC)])

Minimum AIC = -11.571968425978707 at lag = 1


In [None]:
df_imputed

In [None]:
df_imputed_transformed = df_imputed.dropna()

In [None]:
df_imputed_transformed=df_imputed_transformed.drop(['Open','High','Low','Close','Adj Close','Adj Close Prev','Volume'], axis=1)
df_imputed_transformed

In [None]:
model = VAR(df_imputed_transformed)

In [None]:
results = model.fit(2)

In [None]:
results.summary()

In [None]:
model.select_order(15)

## Remove five dates from data frame (including open, close, high, low, volume and adj close) 

In [None]:
df = data_aapl_2023.copy()

In [None]:
df

In [None]:
df.index = pd.to_datetime(df.index)

In [None]:
#missing_dates
missing_dates = sorted(missing_dates)

In [None]:
df.loc[missing_dates,['Open','High','Low','Close','Adj Close','Volume','Adj Close Diff','Adj Close Prev','Daily Return']] = np.nan

In [None]:
df.loc[missing_dates]

In [None]:
df

In [None]:
imputed_indices = df[df['Adj Close'].isnull()].index

In [None]:
imputed_indices

In [None]:
# Plot AAPL and ITW
# Plot the day-over-day Adj Close difference
# Plot the main line with markers
plt.figure(figsize=(12,8))
plt.plot(data['Adj Close'].diff(),'.-',color='red',label='AAPL')
plt.plot(data_ITW['Adj Close'].diff(),'.-',color='blue',label='ITW')

# Set labels
plt.xlabel('Date', fontsize=12)
plt.ylabel('Adj Close Diff', fontsize=12)
plt.title('Adj Close Difference between AAPL and ITW')

plt.legend(fontsize=12)
plt.show()

In [None]:
cols = ['Daily Return','TSM Daily Return','Adj Close Diff','TSM Adj Close Diff']
df_2023 = data_aapl_2023[cols].copy()
df_2022 = data_aapl_2022[cols].copy()
df_2021 = data_aapl_2021[cols].copy()