## Granger causality (GC) between AAPL, TSM, and NVDA, 2018-2023

## 1. Stationarity

- Before applying the Granger causality (GC) test, each series must be tested for stationarity.
- The Augmented Dickey-Fuller (ADF) and Phillips-Perron (PP) tests are standard tests for stationarity.
- Null hypothesis: the time series has at least one unit root (i.e. it is non-stationary).
- Alternative hypothesis: the time series has no unit root (i.e. it is stationary).

In [1]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

In [2]:
## https://www.machinelearningplus.com/time-series/granger-causality-test-in-python/
def grangers_causation_matrix(data, variables, test='ssr_chi2test', verbose=False, maxlag=10):
    """Check Granger causality for all pairwise combinations of the time series.

    Rows are the response variables (Y), columns are the predictors (X). Each cell
    holds the smallest p-value over lags 1..maxlag. A p-value below the significance
    level (0.05) rejects the null hypothesis that the coefficients on the past
    values of X are zero, i.e. that X does not Granger-cause Y.

    data      : pandas DataFrame containing the time series variables
    variables : list containing the names of the time series variables
    maxlag    : largest lag to test (default 10)
    """
    df = pd.DataFrame(np.zeros((len(variables), len(variables))), columns=variables, index=variables)
    for c in df.columns:
        for r in df.index:
            # grangercausalitytests checks whether the 2nd column (c) Granger-causes the 1st (r)
            test_result = grangercausalitytests(data[[r, c]], maxlag=maxlag, verbose=False)
            p_values = [round(test_result[lag][0][test][1], 4) for lag in range(1, maxlag + 1)]
            if verbose: print(f'Y = {r}, X = {c}, P Values = {p_values}')
            df.loc[r, c] = np.min(p_values)
    df.columns = [var + '_x' for var in variables]
    df.index = [var + '_y' for var in variables]
    return df

In [3]:
import yfinance as yf
import pandas as pd
import numpy as np
import seaborn as sns
from datetime import datetime, timedelta
import matplotlib.pyplot as plt
import statsmodels

In [4]:
from pandas_datareader import data as pdr

import yfinance as yf
yf.pdr_override() 

## AAPL data from 2018 to 2023
data_aapl = pdr.get_data_yahoo("AAPL", start="2018-01-01", end="2023-12-31")

## TSM data from 2018 to 2023
data_tsm = pdr.get_data_yahoo("TSM", start="2018-01-01", end="2023-12-31")

## NVDA data from 2018 to 2023
data_nvda = pdr.get_data_yahoo("NVDA", start="2018-01-01", end="2023-12-31")



## Compute Adj Close Diff and Daily Return for AAPL, TSM, and NVDA

In [5]:
data_aapl['Adj Close Diff'] = data_aapl['Adj Close'].diff()
data_aapl['Daily Return'] = data_aapl['Adj Close'].pct_change()

data_tsm['Adj Close Diff'] = data_tsm['Adj Close'].diff()
data_tsm['Daily Return'] = data_tsm['Adj Close'].pct_change()

data_nvda['Adj Close Diff'] = data_nvda['Adj Close'].diff()
data_nvda['Daily Return'] = data_nvda['Adj Close'].pct_change()
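
To make the two transforms concrete, here is a minimal sketch on made-up prices (the numbers are assumptions chosen for easy arithmetic, not market data). `diff()` gives the absolute change from the previous row and `pct_change()` gives the relative change.

```python
import pandas as pd

prices = pd.Series([100.0, 110.0, 99.0], name="Adj Close")  # made-up prices

diff = prices.diff()        # P_t - P_{t-1}
ret = prices.pct_change()   # (P_t - P_{t-1}) / P_{t-1}

print(pd.DataFrame({"price": prices, "diff": diff, "return": ret}))
```

The first row of both columns is NaN because there is no previous price, which is why the notebook later calls `dropna()` before testing.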

In [6]:
data_aapl['TSM Adj Close Diff'] = data_tsm['Adj Close Diff'].copy()
data_aapl['TSM Daily Return'] = data_tsm['Daily Return'].copy()

data_aapl['NVDA Adj Close Diff'] = data_nvda['Adj Close Diff'].copy()
data_aapl['NVDA Daily Return'] = data_nvda['Daily Return'].copy()

## Apply GC test

The GC test is only valid for stationary time series. Applied to non-stationary series, it tends to reject the null too often and report spurious causality.

In [7]:
from statsmodels.tsa.stattools import acf, pacf, grangercausalitytests

## GC matrix

In [8]:
cols = ['Adj Close Diff','Daily Return','TSM Adj Close Diff','TSM Daily Return','NVDA Adj Close Diff','NVDA Daily Return']
df = data_aapl[cols].copy()

In [9]:
df = df.dropna()

In [10]:
print("Granger Causality from 2018 to 2023")

grangers_causation_matrix(df, variables=df.columns)

Granger Causality from 2018 to 2023


| | Adj Close Diff_x | Daily Return_x | TSM Adj Close Diff_x | TSM Daily Return_x | NVDA Adj Close Diff_x | NVDA Daily Return_x |
|---|---|---|---|---|---|---|
| Adj Close Diff_y | 1.0 | 0.0105 | 0.1023 | 0.0233 | 0.0093 | 0.0204 |
| Daily Return_y | 0.0078 | 1.0 | 0.3174 | 0.0382 | 0.0323 | 0.0211 |
| TSM Adj Close Diff_y | 0.0273 | 0.0143 | 1.0 | 0.1794 | 0.0281 | 0.1164 |
| TSM Daily Return_y | 0.0111 | 0.0 | 0.0248 | 1.0 | 0.0473 | 0.3441 |
| NVDA Adj Close Diff_y | 0.1598 | 0.063 | 0.076 | 0.0566 | 1.0 | 0.2497 |
| NVDA Daily Return_y | 0.0374 | 0.0 | 0.0573 | 0.0022 | 0.0148 | 1.0 |
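
One way to read the matrix is to filter each row (response) for predictors whose minimum p-value falls below the significance level. The sketch below re-enters the p-values reported above by hand; the variable names `pmat`, `alpha`, and `predictors_of` are illustrative, not part of the notebook.

```python
import pandas as pd

names = ["Adj Close Diff", "Daily Return", "TSM Adj Close Diff",
         "TSM Daily Return", "NVDA Adj Close Diff", "NVDA Daily Return"]
# p-values copied from the GC matrix output (rows = response Y, columns = predictor X)
values = [
    [1.0,    0.0105, 0.1023, 0.0233, 0.0093, 0.0204],
    [0.0078, 1.0,    0.3174, 0.0382, 0.0323, 0.0211],
    [0.0273, 0.0143, 1.0,    0.1794, 0.0281, 0.1164],
    [0.0111, 0.0,    0.0248, 1.0,    0.0473, 0.3441],
    [0.1598, 0.063,  0.076,  0.0566, 1.0,    0.2497],
    [0.0374, 0.0,    0.0573, 0.0022, 0.0148, 1.0],
]
pmat = pd.DataFrame(values,
                    index=[n + "_y" for n in names],
                    columns=[n + "_x" for n in names])

alpha = 0.05


def predictors_of(response):
    """List the predictors whose minimum p-value is below alpha for one response row."""
    row = pmat.loc[response]
    return row[row < alpha].index.tolist()


aapl_predictors = predictors_of("Adj Close Diff_y")
nvda_predictors = predictors_of("NVDA Adj Close Diff_y")
print(aapl_predictors)
print(nvda_predictors)
```

Because each cell is the minimum over ten lags, the chance of at least one false positive per cell is higher than 5%, so entries just under 0.05 are best treated as suggestive rather than conclusive.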


In [11]:
## Granger Causality test: does TSM Daily Return Granger-cause AAPL Adj Close Diff?
grangercausalitytests(df[['Adj Close Diff','TSM Daily Return']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=3.7413  , p=0.0533  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=3.7488  , p=0.0528  , df=1
likelihood ratio test: chi2=3.7442  , p=0.0530  , df=1
parameter F test:         F=3.7413  , p=0.0533  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=3.7478  , p=0.0238  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=7.5206  , p=0.0233  , df=2
likelihood ratio test: chi2=7.5019  , p=0.0235  , df=2
parameter F test:         F=3.7478  , p=0.0238  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.7984  , p=0.0389  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=8.4345  , p=0.0378  , df=3
likelihood ratio test: chi2=8.4109  , p=0.0382  , df=3
parameter F test:         F=2.7984  , p=0.0389  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=2.5977  , p=0.

{1: ({'ssr_ftest': (3.7413490131857223, 0.05326895147082439, 1504.0, 1),
   'ssr_chi2test': (3.74881181041947, 0.052845064139614505, 1),
   'lrtest': (3.7441567583564392, 0.052992459596116995, 1),
   'params_ftest': (3.7413490131860967, 0.05326895147080265, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a1136ea0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a11402c0>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (3.7478103566012053, 0.02379015170880764, 1501.0, 2),
   'ssr_chi2test': (7.520589469742059, 0.02327687885589466, 2),
   'lrtest': (7.5018737748159765, 0.02349572269427472, 2),
   'params_ftest': (3.7478103566011507, 0.023790151708811443, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a1142bd0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a1143020>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (2.

Strongest evidence of GC at lag = 2 (ssr chi2 p = 0.0233, the minimum across lags in the GC matrix).

In [12]:
## Granger Causality test on NVDA Adj Close Diff
grangercausalitytests(df[['Adj Close Diff','NVDA Adj Close Diff']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=3.9421  , p=0.0473  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=3.9499  , p=0.0469  , df=1
likelihood ratio test: chi2=3.9448  , p=0.0470  , df=1
parameter F test:         F=3.9421  , p=0.0473  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=4.5070  , p=0.0112  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=9.0439  , p=0.0109  , df=2
likelihood ratio test: chi2=9.0169  , p=0.0110  , df=2
parameter F test:         F=4.5070  , p=0.0112  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=3.1217  , p=0.0251  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=9.4089  , p=0.0243  , df=3
likelihood ratio test: chi2=9.3796  , p=0.0246  , df=3
parameter F test:         F=3.1217  , p=0.0251  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=2.5059  , p=0.

{1: ({'ssr_ftest': (3.9420823152888276, 0.04727380203521262, 1504.0, 1),
   'ssr_chi2test': (3.949945511396452, 0.04687288552810832, 1),
   'lrtest': (3.944778006311026, 0.04701705689103408, 1),
   'params_ftest': (3.942082315288882, 0.04727380203521262, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a1142b70>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a114a690>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (4.506950069682529, 0.011181761177741348, 1501.0, 2),
   'ssr_chi2test': (9.043926455618772, 0.010867667005328374, 2),
   'lrtest': (9.016879106311535, 0.011015636085841671, 2),
   'params_ftest': (4.506950069682607, 0.011181761177740076, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a0f2c2c0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a0f2d910>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (3.121

Strongest evidence of GC at lag = 8 (minimum p-value for this pair in the GC matrix: 0.0093).

In [13]:
## Granger Causality test on NVDA Daily Return
grangercausalitytests(df[['Adj Close Diff','NVDA Daily Return']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=0.4443  , p=0.5052  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=0.4452  , p=0.5046  , df=1
likelihood ratio test: chi2=0.4451  , p=0.5047  , df=1
parameter F test:         F=0.4443  , p=0.5052  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=3.3105  , p=0.0368  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=6.6430  , p=0.0361  , df=2
likelihood ratio test: chi2=6.6284  , p=0.0364  , df=2
parameter F test:         F=3.3105  , p=0.0368  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.1806  , p=0.0885  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=6.5724  , p=0.0868  , df=3
likelihood ratio test: chi2=6.5581  , p=0.0874  , df=3
parameter F test:         F=2.1806  , p=0.0885  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=1.6461  , p=0.

{1: ({'ssr_ftest': (0.44428834136254763, 0.5051621174959084, 1504.0, 1),
   'ssr_chi2test': (0.4451745548094144, 0.5046354340186356, 1),
   'lrtest': (0.44510881447695283, 0.5046668992943135, 1),
   'params_ftest': (0.44428834136273765, 0.5051621174958283, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a1013e30>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a1134c80>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (3.310454904541982, 0.036766245477320963, 1501.0, 2),
   'ssr_chi2test': (6.642964805116889, 0.036099278407375306, 2),
   'lrtest': (6.628356690709552, 0.036363914885348445, 2),
   'params_ftest': (3.3104549045421963, 0.036766245477312914, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a1135010>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a1135b20>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (2.

Strongest evidence of GC at lag = 7 (minimum p-value for this pair in the GC matrix: 0.0204).

In [14]:
## Granger Causality test on TSM Daily Return
grangercausalitytests(df[['Daily Return','TSM Daily Return']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=1.2387  , p=0.2659  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=1.2412  , p=0.2652  , df=1
likelihood ratio test: chi2=1.2407  , p=0.2653  , df=1
parameter F test:         F=1.2387  , p=0.2659  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=2.4763  , p=0.0844  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=4.9690  , p=0.0834  , df=2
likelihood ratio test: chi2=4.9609  , p=0.0837  , df=2
parameter F test:         F=2.4763  , p=0.0844  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.1394  , p=0.0934  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=6.4482  , p=0.0917  , df=3
likelihood ratio test: chi2=6.4344  , p=0.0923  , df=3
parameter F test:         F=2.1394  , p=0.0934  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=2.2103  , p=0.

{1: ({'ssr_ftest': (1.238701924000825, 0.2658992217880347, 1504.0, 1),
   'ssr_chi2test': (1.2411727390088054, 0.2652451580759396, 1),
   'lrtest': (1.2406619014382159, 0.26534352740439043, 1),
   'params_ftest': (1.238701924000792, 0.2658992217880774, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a1135130>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a1134410>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (2.476273772683386, 0.0843991873818783, 1501.0, 2),
   'ssr_chi2test': (4.969045038855669, 0.08336535034564002, 2),
   'lrtest': (4.960865347688014, 0.08370699992638367, 2),
   'params_ftest': (2.476273772683528, 0.08439918738186758, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a110b440>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a110b6e0>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (2.1394073560

Strongest evidence of GC at lag = 5 (minimum p-value for this pair in the GC matrix: 0.0382).

In [15]:
## Granger Causality test on NVDA Adj Close Diff
grangercausalitytests(df[['Daily Return','NVDA Adj Close Diff']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=4.5735  , p=0.0326  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=4.5826  , p=0.0323  , df=1
likelihood ratio test: chi2=4.5756  , p=0.0324  , df=1
parameter F test:         F=4.5735  , p=0.0326  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=3.2019  , p=0.0410  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=6.4251  , p=0.0403  , df=2
likelihood ratio test: chi2=6.4114  , p=0.0405  , df=2
parameter F test:         F=3.2019  , p=0.0410  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.1812  , p=0.0884  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=6.5742  , p=0.0868  , df=3
likelihood ratio test: chi2=6.5599  , p=0.0873  , df=3
parameter F test:         F=2.1812  , p=0.0884  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=1.6909  , p=0.

{1: ({'ssr_ftest': (4.573469575948937, 0.03263143643955871, 1504.0, 1),
   'ssr_chi2test': (4.582592188135005, 0.03229831960360649, 1),
   'lrtest': (4.575638745693141, 0.03242965360867337, 1),
   'params_ftest': (4.573469575948873, 0.03263143643955968, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a11431a0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a114a8d0>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (3.2018697236295597, 0.040964108634834434, 1501.0, 2),
   'ssr_chi2test': (6.425071024365246, 0.04025441820060432, 2),
   'lrtest': (6.411404192125701, 0.040530435386909196, 2),
   'params_ftest': (3.201869723630017, 0.04096410863481388, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a114b0b0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a114a390>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (2.18120

Strongest evidence of GC at lag = 1 (ssr chi2 p = 0.0323, the minimum across lags in the GC matrix).

In [16]:
## Granger Causality test on NVDA Daily Return
grangercausalitytests(df[['Daily Return','NVDA Daily Return']], maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=2.6007  , p=0.1070  , df_denom=1504, df_num=1
ssr based chi2 test:   chi2=2.6059  , p=0.1065  , df=1
likelihood ratio test: chi2=2.6037  , p=0.1066  , df=1
parameter F test:         F=2.6007  , p=0.1070  , df_denom=1504, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=3.8459  , p=0.0216  , df_denom=1501, df_num=2
ssr based chi2 test:   chi2=7.7174  , p=0.0211  , df=2
likelihood ratio test: chi2=7.6977  , p=0.0213  , df=2
parameter F test:         F=3.8459  , p=0.0216  , df_denom=1501, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.6418  , p=0.0480  , df_denom=1498, df_num=3
ssr based chi2 test:   chi2=7.9626  , p=0.0468  , df=3
likelihood ratio test: chi2=7.9416  , p=0.0472  , df=3
parameter F test:         F=2.6418  , p=0.0480  , df_denom=1498, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=2.0107  , p=0.

{1: ({'ssr_ftest': (2.6007476834259196, 0.10702310738843411, 1504.0, 1),
   'ssr_chi2test': (2.605935345028498, 0.10646432723178965, 1),
   'lrtest': (2.603684820636772, 0.10661557134710706, 1),
   'params_ftest': (2.600747683426096, 0.10702310738840028, 1504.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a110acf0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a11884a0>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (3.8459044395274065, 0.021577937665856604, 1501.0, 2),
   'ssr_chi2test': (7.717431160464056, 0.021095077063842234, 2),
   'lrtest': (7.697724635982013, 0.021303959788301206, 2),
   'params_ftest': (3.845904439527323, 0.021577937665861913, 1501.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a11880e0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x1a9a11881d0>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),
 3: ({'ssr_ftest': (2.64

Strongest evidence of GC at lag = 2 (ssr chi2 p = 0.0211, the minimum across lags in the GC matrix).

## VAR model

In [38]:
from statsmodels.tsa.api import VAR

In [39]:
df_Diff = df.copy()
df_Diff = df_Diff.drop(['Daily Return','TSM Adj Close Diff'], axis=1)
df_Diff

| Date | Adj Close Diff | TSM Daily Return | NVDA Adj Close Diff | NVDA Daily Return |
|---|---|---|---|---|
| 2018-01-03 | -0.007053 | 0.016821 | 3.245316 | 0.065814 |
| 2018-01-04 | 0.188610 | -0.005275 | 0.277039 | 0.005271 |
| 2018-01-05 | 0.464493 | 0.023379 | 0.447720 | 0.008474 |
| 2018-01-08 | -0.153267 | -0.000471 | 1.632542 | 0.030640 |
| 2018-01-09 | -0.004707 | -0.006126 | -0.014835 | -0.000270 |
| ... | ... | ... | ... | ... |
| 2023-12-22 | -1.077133 | 0.005851 | -1.599945 | -0.003266 |
| 2023-12-26 | -0.548553 | 0.012603 | 4.489838 | 0.009195 |
| 2023-12-27 | 0.099716 | 0.001915 | 1.379913 | 0.002800 |
| 2023-12-28 | 0.428879 | 0.000478 | 1.049957 | 0.002125 |


In [51]:
model_diff = VAR(df_Diff)
model_fitted = model_diff.fit(maxlags=15)
model_fitted.k_ar



15

In [76]:
#model_fitted.summary()

In [67]:
df_Diff_2023 = df.copy()
df_Diff_2023 = df_Diff_2023.drop(['Daily Return','TSM Adj Close Diff','TSM Daily Return','NVDA Adj Close Diff'], axis=1)

df_Diff_2023 = df_Diff_2023.drop(df_Diff_2023.loc[:"2022-12-31"].index)

In [68]:
model_diff_2023 = VAR(df_Diff_2023)
model_fitted_2023 = model_diff_2023.fit(maxlags=15)
model_fitted_2023.k_ar




15

In [77]:
#model_fitted_2023.summary()

In [79]:
lag_order = model_fitted_2023.k_ar

steps_to_forecast = 1
f = model_fitted_2023.forecast(df_Diff_2023.iloc[-lag_order:].values, steps_to_forecast)

In [80]:
f

array([[-0.14329739, -0.00556939]])

In [81]:
f[0,0]

-0.1432973874877444

In [48]:
df_Diff2 = df.copy()
#df_Diff2 = df_Diff2.drop(['Daily Return','TSM Adj Close Diff','TSM Daily Return'], axis=1)
#df_Diff2 = df_Diff2.drop(['Daily Return','TSM Adj Close Diff','TSM Daily Return','NVDA Daily Return'], axis=1)
df_Diff2 = df_Diff2.drop(['Daily Return','TSM Adj Close Diff','TSM Daily Return','NVDA Adj Close Diff'], axis=1)
df_Diff2

Unnamed: 0_level_0,Adj Close Diff,NVDA Daily Return
Date,Unnamed: 1_level_1,Unnamed: 2_level_1
2018-01-03,-0.007053,0.065814
2018-01-04,0.188610,0.005271
2018-01-05,0.464493,0.008474
2018-01-08,-0.153267,0.030640
2018-01-09,-0.004707,-0.000270
...,...,...
2023-12-22,-1.077133,-0.003266
2023-12-26,-0.548553,0.009195
2023-12-27,0.099716,0.002800
2023-12-28,0.428879,0.002125


In [49]:
model_diff_2 = VAR(df_Diff2)
model_fitted_2 = model_diff_2.fit(maxlags=15)



In [50]:
model_fitted_2.summary()

  Summary of Regression Results   
Model:                         VAR
Method:                        OLS
Date:           Thu, 30, May, 2024
Time:                     09:53:27
--------------------------------------------------------------------
No. of Equations:         2.00000    BIC:                   -5.49714
Nobs:                     1493.00    HQIC:                  -5.63544
Log likelihood:           93.2260    FPE:                 0.00328768
AIC:                     -5.71758    Det(Omega_mle):      0.00315529
--------------------------------------------------------------------
Results for equation Adj Close Diff
                           coefficient       std. error           t-stat            prob
----------------------------------------------------------------------------------------
const                         0.091295         0.058181            1.569           0.117
L1.Adj Close Diff            -0.053998         0.032586           -1.657           0.097
L1.NVDA Daily Retur

In [21]:
ps = range(1,10)
AIC = np.zeros(len(ps))
for i, p in enumerate(ps):
    result = model_diff.fit(p)
    AIC[i] = result.aic

In [22]:
AIC

array([-22.87887574, -22.8501804 , -22.82292996, -22.80761717,
       -22.78060374, -22.76715955, -22.76212873, -22.75999455,
       -22.75575027])

In [23]:
print('Minimum AIC =', np.min(AIC), "at lag =", ps[np.argmin(AIC)])

Minimum AIC = -22.878875738033333 at lag = 1


In [24]:
df_imputed

NameError: name 'df_imputed' is not defined

In [None]:
df_imputed_transformed = df_imputed.dropna()

In [None]:
df_imputed_transformed=df_imputed_transformed.drop(['Open','High','Low','Close','Adj Close','Adj Close Prev','Volume'], axis=1)
df_imputed_transformed

In [None]:
model = VAR(df_imputed_transformed)

In [None]:
results = model.fit(2)

In [None]:
results.summary()

In [None]:
model.select_order(15)

## Remove five dates from data frame (including open, close, high, low, volume and adj close) 

In [None]:
df = data_aapl_2023.copy()

In [None]:
df

In [None]:
df.index = pd.to_datetime(df.index)

In [None]:
#missing_dates
missing_dates = sorted(missing_dates)

In [None]:
df.loc[missing_dates,['Open','High','Low','Close','Adj Close','Volume','Adj Close Diff','Adj Close Prev','Daily Return']] = np.nan

In [None]:
df.loc[missing_dates]

In [None]:
df

In [None]:
imputed_indices = df[df['Adj Close'].isnull()].index

In [None]:
imputed_indices

In [None]:
# Plot AAPL and ITW
# Plot the first difference of Adj Close, with markers
# (assumes `data` holds AAPL prices and `data_ITW` holds ITW prices, both loaded elsewhere)
plt.figure(figsize=(12,8))
plt.plot(data['Adj Close'].diff(),'.-',color='red',label='AAPL')
plt.plot(data_ITW['Adj Close'].diff(),'.-',color='blue',label='ITW')

# Set labels
plt.xlabel('Date', fontsize=12)
plt.ylabel('Adj Close Diff', fontsize=12)
plt.title('Adj Close Diff of AAPL and ITW')

plt.legend(fontsize=12)
plt.show()

In [None]:
cols = ['Daily Return','TSM Daily Return','Adj Close Diff','TSM Adj Close Diff']
df_2023 = data_aapl_2023[cols].copy()
df_2022 = data_aapl_2022[cols].copy()
df_2021 = data_aapl_2021[cols].copy()