# Vector Autoregression

- Tesla causes Meta, Ford, Palladium
- Apple causes Microsoft, NVIDIA, Amazon, Meta, Gold, Palladium
- Google causes Microsoft, Amazon, Meta, Oil, Ford, Palladium
- Microsoft causes Apple, Google, Amazon, Meta, Gold, Ford, Palladium
- NVIDIA causes Google, Microsoft, Amazon, Meta, Gold
- Amazon causes Tesla, TGold, Ford, Palladium
- Meta causes Amazon, TSMC, Ford, Palladium
- TSMC causes Apple, Gold, Ford, Palladium
- Gold causes Oil, Palladium
- Oil causes Google, Microsoft, Gold, Palladium
- Ford causes Tesla, NVIDIA, TSMC, Gold, Oil, Palladium
- Palladium causes Tesla

In [2]:
import pandas as pd
import numpy as np
import matplotlib.pyplot as plt

from datetime import datetime
from sklearn.metrics import mean_squared_error

from statsmodels.tsa.api import VAR
from statsmodels.tsa.stattools import acf, pacf, grangercausalitytests

## Import TimeSeriesSplit
from sklearn.model_selection import TimeSeriesSplit

import yfinance as yf

In [3]:
import warnings
warnings.simplefilter(action='ignore', category=FutureWarning)

In [4]:
tesla = yf.Ticker("TSLA")
palladium = yf.Ticker("PA=F")

tesla_data = tesla.history(start='2019-01-01', end='2024-10-24')
palladium_data = palladium.history(start='2019-01-01', end='2024-10-24')

In [7]:
tesla_data['closeDiff'] = tesla_data.Close.diff()
palladium_data['closeDiff'] = palladium_data.Close.diff()

In [8]:
tesla_data

Date,Open,High,Low,Close,Volume,Dividends,Stock Splits,closeDiff
2019-01-02 00:00:00-05:00,20.406668,21.008667,19.920000,20.674667,174879000,0.0,0.0,
2019-01-03 00:00:00-05:00,20.466667,20.626667,19.825333,20.024000,104478000,0.0,0.0,-0.650667
2019-01-04 00:00:00-05:00,20.400000,21.200001,20.181999,21.179333,110911500,0.0,0.0,1.155333
2019-01-07 00:00:00-05:00,21.448000,22.449333,21.183332,22.330667,113268000,0.0,0.0,1.151335
2019-01-08 00:00:00-05:00,22.797333,22.934000,21.801332,22.356667,105127500,0.0,0.0,0.025999
...,...,...,...,...,...,...,...,...
2024-10-17 00:00:00-04:00,221.589996,222.080002,217.899994,220.889999,50791800,0.0,0.0,-0.440002
2024-10-18 00:00:00-04:00,220.710007,222.279999,219.229996,220.699997,49611900,0.0,0.0,-0.190002
2024-10-21 00:00:00-04:00,218.899994,220.479996,215.729996,218.850006,47329000,0.0,0.0,-1.849991
2024-10-22 00:00:00-04:00,217.309998,218.220001,215.259995,217.970001,43268700,0.0,0.0,-0.880005


In [9]:
palladium_data

Date,Open,High,Low,Close,Volume,Dividends,Stock Splits,closeDiff
2019-01-02 00:00:00-05:00,1255.500000,1255.500000,1255.500000,1255.500000,0,0.0,0.0,
2019-01-03 00:00:00-05:00,1257.000000,1257.000000,1257.000000,1257.000000,0,0.0,0.0,1.500000
2019-01-04 00:00:00-05:00,1291.099976,1291.099976,1291.099976,1291.099976,0,0.0,0.0,34.099976
2019-01-07 00:00:00-05:00,1296.900024,1296.900024,1296.900024,1296.900024,0,0.0,0.0,5.800049
2019-01-08 00:00:00-05:00,1318.099976,1318.099976,1318.099976,1318.099976,0,0.0,0.0,21.199951
...,...,...,...,...,...,...,...,...
2024-10-17 00:00:00-04:00,1038.300049,1038.300049,1038.300049,1038.300049,12,0.0,0.0,19.300049
2024-10-18 00:00:00-04:00,1077.699951,1077.699951,1077.699951,1077.699951,12,0.0,0.0,39.399902
2024-10-21 00:00:00-04:00,1048.400024,1048.400024,1048.400024,1048.400024,12,0.0,0.0,-29.299927
2024-10-22 00:00:00-04:00,1074.199951,1074.199951,1074.199951,1074.199951,12,0.0,0.0,25.799927


In [10]:
df = pd.DataFrame()
df['Tesla Diff'] = tesla_data['closeDiff'].copy()
df['Palladium Diff'] = palladium_data['closeDiff'].copy()

## Palladium Diff causes Tesla Diff

In [11]:
grangercausalitytests(df[['Tesla Diff', 'Palladium Diff']].dropna(), maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=2.9263  , p=0.0874  , df_denom=1458, df_num=1
ssr based chi2 test:   chi2=2.9323  , p=0.0868  , df=1
likelihood ratio test: chi2=2.9294  , p=0.0870  , df=1
parameter F test:         F=2.9263  , p=0.0874  , df_denom=1458, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=1.7533  , p=0.1736  , df_denom=1455, df_num=2
ssr based chi2 test:   chi2=3.5187  , p=0.1722  , df=2
likelihood ratio test: chi2=3.5144  , p=0.1725  , df=2
parameter F test:         F=1.7533  , p=0.1736  , df_denom=1455, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.7378  , p=0.0422  , df_denom=1452, df_num=3
ssr based chi2 test:   chi2=8.2530  , p=0.0411  , df=3
likelihood ratio test: chi2=8.2297  , p=0.0415  , df=3
parameter F test:         F=2.7378  , p=0.0422  , df_denom=1452, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=2.7869  , p=0.

{np.int64(1): ({'ssr_ftest': (np.float64(2.9262980672965533),
    np.float64(0.08735983751569401),
    np.float64(1458.0),
    np.int64(1)),
   'ssr_chi2test': (np.float64(2.932319256735435),
    np.float64(0.0868226835002527),
    np.int64(1)),
   'lrtest': (np.float64(2.9293805132801936),
    np.float64(0.08698086258798517),
    np.int64(1)),
   'params_ftest': (np.float64(2.926298067294419),
    np.float64(0.08735983751579852),
    np.float64(1458.0),
    1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x2a314cad280>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x2a3170d6ff0>,
   array([[0., 1., 0.]])]),
 np.int64(2): ({'ssr_ftest': (np.float64(1.7533023920776694),
    np.float64(0.17356675160334076),
    np.float64(1455.0),
    np.int64(2)),
   'ssr_chi2test': (np.float64(3.518654972417041),
    np.float64(0.17216060528122595),
    np.int64(2)),
   'lrtest': (np.float64(3.514421727093577),
    np.float64(0.17252539023968733),
    n

The Granger causality test finds the strongest evidence that Palladium Diff Granger-causes Tesla Diff at lag 4 (F = 2.7869, p = 0.0253). We therefore take lag 4 as the optimal lag for this direction.
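
For reference, the "ssr based F test" reported above can be written out as follows. This is a sketch of the standard formulation, with notation introduced here for illustration: $y_t$ is Tesla Diff, $x_t$ is Palladium Diff, $p$ is the lag, and $T$ is the number of observations left after dropping the first $p$ rows. The test compares a restricted regression of $y_t$ on its own lags with an unrestricted one that also includes lags of $x_t$:

$$
\begin{aligned}
\text{restricted:}\quad & y_t = c + \sum_{i=1}^{p} a_i\, y_{t-i} + e_t \\
\text{unrestricted:}\quad & y_t = c + \sum_{i=1}^{p} a_i\, y_{t-i} + \sum_{i=1}^{p} b_i\, x_{t-i} + u_t \\
H_0:\quad & b_1 = \dots = b_p = 0 \\
F \;=\; & \frac{(\mathrm{SSR}_r - \mathrm{SSR}_u)/p}{\mathrm{SSR}_u/(T - 2p - 1)}
\end{aligned}
$$

The numerator and denominator degrees of freedom are $p$ and $T - 2p - 1$, which is consistent with the `df_num=1, df_denom=1458` shown at lag 1 and the drop of 3 in `df_denom` for each additional lag. A small p-value rejects $H_0$, meaning that past values of $x_t$ improve the prediction of $y_t$ beyond what its own history provides.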

In [14]:
df = df.dropna()

In [15]:
df

Date,Tesla Diff,Palladium Diff
2019-01-03 00:00:00-05:00,-0.650667,1.500000
2019-01-04 00:00:00-05:00,1.155333,34.099976
2019-01-07 00:00:00-05:00,1.151335,5.800049
2019-01-08 00:00:00-05:00,0.025999,21.199951
2019-01-09 00:00:00-05:00,0.212000,8.000000
...,...,...
2024-10-17 00:00:00-04:00,-0.440002,19.300049
2024-10-18 00:00:00-04:00,-0.190002,39.399902
2024-10-21 00:00:00-04:00,-1.849991,-29.299927
2024-10-22 00:00:00-04:00,-0.880005,25.799927


In [17]:
plt.figure(figsize=(15, 5))
plt.plot(df['Tesla Diff'], label='Tesla')          # plot() takes `label`, not `legend`
plt.plot(df['Palladium Diff'], label='Palladium')
plt.legend()                                       # draw the legend from the labels above
plt.show()

In [None]:
# 5 expanding-window folds, each with a 14-observation test set
kfold = TimeSeriesSplit(n_splits=5, test_size=14)
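
To make the fold layout concrete, here is a minimal pure-Python sketch of the index arithmetic behind an expanding-window split with `n_splits=5` and `test_size=14`. The function name `expanding_window_splits` and the sample length of 100 rows are assumptions made for illustration; the function is a hand-rolled stand-in intended to mirror the layout `TimeSeriesSplit` produces with no gap and no cap on the training size, not the library's own code.

```python
def expanding_window_splits(n_samples, n_splits, test_size):
    """Yield (train_idx, test_idx) pairs for an expanding-window split.

    Hypothetical helper: the last n_splits * test_size rows are cut into
    consecutive test blocks, and each training set is every row before
    its test block.
    """
    first_test_start = n_samples - n_splits * test_size
    if first_test_start <= 0:
        raise ValueError("not enough samples for the requested folds")
    for test_start in range(first_test_start, n_samples, test_size):
        train_idx = list(range(0, test_start))
        test_idx = list(range(test_start, test_start + test_size))
        yield train_idx, test_idx


# Assumed sample length of 100 rows, purely for illustration.
folds = list(expanding_window_splits(n_samples=100, n_splits=5, test_size=14))
for i, (train_idx, test_idx) in enumerate(folds):
    print(f"fold {i}: train rows 0..{train_idx[-1]}, "
          f"test rows {test_idx[0]}..{test_idx[-1]}")
```

Each later fold trains on everything the earlier folds saw plus their test blocks, so no test observation ever precedes its training data.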


## Tesla Diff causes Palladium Diff

In [12]:
grangercausalitytests(df[['Palladium Diff', 'Tesla Diff']].dropna(), maxlag=10)


Granger Causality
number of lags (no zero) 1
ssr based F test:         F=3.8801  , p=0.0491  , df_denom=1458, df_num=1
ssr based chi2 test:   chi2=3.8881  , p=0.0486  , df=1
likelihood ratio test: chi2=3.8829  , p=0.0488  , df=1
parameter F test:         F=3.8801  , p=0.0491  , df_denom=1458, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=1.8915  , p=0.1512  , df_denom=1455, df_num=2
ssr based chi2 test:   chi2=3.7960  , p=0.1499  , df=2
likelihood ratio test: chi2=3.7910  , p=0.1502  , df=2
parameter F test:         F=1.8915  , p=0.1512  , df_denom=1455, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=2.1351  , p=0.0940  , df_denom=1452, df_num=3
ssr based chi2 test:   chi2=6.4360  , p=0.0922  , df=3
likelihood ratio test: chi2=6.4219  , p=0.0928  , df=3
parameter F test:         F=2.1351  , p=0.0940  , df_denom=1452, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=1.5622  , p=0.

{np.int64(1): ({'ssr_ftest': (np.float64(3.8800786362759188),
    np.float64(0.04905130347776365),
    np.float64(1458.0),
    np.int64(1)),
   'ssr_chi2test': (np.float64(3.888062337173606),
    np.float64(0.04863049934903059),
    np.int64(1)),
   'lrtest': (np.float64(3.8828979763948155),
    np.float64(0.048780289480944324),
    np.int64(1)),
   'params_ftest': (np.float64(3.8800786362749804),
    np.float64(0.04905130347779121),
    np.float64(1458.0),
    1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x2a3170d6ba0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x2a3170d7b60>,
   array([[0., 1., 0.]])]),
 np.int64(2): ({'ssr_ftest': (np.float64(1.8914792864875554),
    np.float64(0.15121922950032013),
    np.float64(1455.0),
    np.int64(2)),
   'ssr_chi2test': (np.float64(3.7959584306142005),
    np.float64(0.14987117079173057),
    np.int64(2)),
   'lrtest': (np.float64(3.7910322754196386),
    np.float64(0.15024077010402348),


The Granger causality test finds the strongest evidence that Tesla Diff Granger-causes Palladium Diff at lag 1 (F = 3.8801, p = 0.0491). We therefore take lag 1 as the optimal lag for this direction.
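
Rather than reading the p-values off the printed output, the lag with the smallest p-value can be pulled from the dictionary that `grangercausalitytests` returns. The sketch below assumes the structure visible in the output above: each lag maps to a tuple whose first element is a dict of test results, with `'ssr_ftest'` holding `(F, p, df_denom, df_num)`. The helper name `p_values_by_lag` is hypothetical. `sample_results` is hand-built from the three lags fully shown in the printed output (rounded to four decimals), with the fitted-model lists left empty.

```python
def p_values_by_lag(results, test="ssr_ftest"):
    """Map each lag to the p-value of the chosen test (hypothetical helper)."""
    return {
        int(lag): float(tests[test][1])
        for lag, (tests, _models) in results.items()
    }


# Hand-built stand-in for the returned dict, using only lags 1-3 from the
# printed Tesla Diff -> Palladium Diff output above.
sample_results = {
    1: ({"ssr_ftest": (3.8801, 0.0491, 1458.0, 1)}, []),
    2: ({"ssr_ftest": (1.8915, 0.1512, 1455.0, 2)}, []),
    3: ({"ssr_ftest": (2.1351, 0.0940, 1452.0, 3)}, []),
}

p_values = p_values_by_lag(sample_results)
best_lag = min(p_values, key=p_values.get)  # lag with the smallest p-value
print(p_values)
print(best_lag)
```

On a real run, the call would be assigned to a variable, for example `results = grangercausalitytests(...)`, and passed to the helper in place of `sample_results`.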