### Granger causality testing for 1 day
In this short notebook we shall perform the [Granger causality test](https://en.wikipedia.org/wiki/Granger_causality) for each of the 14 cryptocurrencies on the very last day of the dataset associated with the Kaggle [G-Research Crypto Forecasting competition](https://www.kaggle.com/c/g-research-crypto-forecasting). The objective of the Granger causality test is to assess whether the past values of one time series help to forecast another time series. To do this we shall make use of the statsmodels [grangercausalitytests](https://www.statsmodels.org/stable/generated/statsmodels.tsa.stattools.grangercausalitytests.html) routine.
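
To make the idea concrete, here is a minimal sketch of the regression comparison behind the test, written with numpy only. It fits one model that predicts `y` from its own past and a second model that also uses the past of `x`, then turns the drop in the sum of squared residuals into an F statistic. The helpers `lagged` and `granger_f` and the synthetic series are assumptions made up for illustration; they are not part of statsmodels or of the competition data. As far as I understand it, the `ssr_ftest` value reported by statsmodels is a statistic of this form.

```python
import numpy as np


def lagged(series, lag):
    """Columns are the series shifted back by 1..lag steps, aligned with series[lag:]."""
    n = len(series)
    return np.column_stack([series[lag - k:n - k] for k in range(1, lag + 1)])


def granger_f(y, x, lag):
    """F statistic for 'x Granger-causes y' (hypothetical helper, not a statsmodels API)."""
    y = np.asarray(y, dtype=float)
    x = np.asarray(x, dtype=float)
    target = y[lag:]
    const = np.ones((len(target), 1))
    restricted = np.hstack([const, lagged(y, lag)])         # past of y only
    unrestricted = np.hstack([restricted, lagged(x, lag)])  # past of y and of x

    def ssr(design):
        # sum of squared residuals of an ordinary least squares fit
        coef, *_ = np.linalg.lstsq(design, target, rcond=None)
        resid = target - design @ coef
        return float(resid @ resid)

    ssr_r, ssr_u = ssr(restricted), ssr(unrestricted)
    df_resid = len(target) - unrestricted.shape[1]
    return ((ssr_r - ssr_u) / lag) / (ssr_u / df_resid)


# synthetic data (made up for illustration): y follows x with a one-step delay
rng = np.random.default_rng(0)
x = rng.normal(size=500)
y = np.zeros(500)
y[1:] = 0.8 * x[:-1]
y += rng.normal(size=500)

f_x_causes_y = granger_f(y, x, lag=1)
f_y_causes_x = granger_f(x, y, lag=1)
print(f"x -> y: F = {f_x_causes_y:.1f}")
print(f"y -> x: F = {f_y_causes_x:.1f}")
```

The first statistic should come out much larger than the second, because only the past of `x` carries information about `y` in this toy setup.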

In [None]:
import numpy as np 
import pandas as pd
import matplotlib.pyplot as plt
#plt.style.use('fivethirtyeight')
plt.rcParams.update({'font.size': 12})
plt.rcParams["figure.figsize"] = (7, 7)
import seaborn as sns
import plotly.express as px
from datetime import date

# read in the data
train = pd.read_csv("../input/g-research-crypto-forecasting/train.csv")
asset_details = pd.read_csv("../input/g-research-crypto-forecasting/asset_details.csv")
train['timestamp'] = pd.to_datetime(train['timestamp'], unit='s')
train = train.set_index('timestamp')
mapping = dict(asset_details[['Asset_ID', 'Asset_Name']].values)
train["Asset_name"] = train["Asset_ID"].map(mapping)
cryptocurrencies = asset_details['Asset_Name'].unique()

Here we select the date range to be tested (the interested reader can easily change it):

In [None]:
# extract the last day of data
last_day = train.loc['2021-09-20 00:00:00':'2021-09-21 00:00:00']
# create columns of the Close values
close_columns = pd.pivot_table(last_day,index='timestamp',columns='Asset_ID',values='Close')

Let us perform the Granger causality test for each pair of cryptocurrencies for lags of 1 up to 30 minutes, testing whether the time series in the second column (`j`) causes the time series in the first column (`i`). Note that the Granger test requires that the data be [stationary](https://en.wikipedia.org/wiki/Stationary_process), so we actually look at the [percentage changes](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pct_change.html) in each of these two time series, rather than the absolute values:

In [None]:
import statsmodels.api as sm
from statsmodels.tsa.stattools import grangercausalitytests

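# granger_matrix[i, j] holds the F statistic for "asset j Granger-causes asset i"
granger_matrix = np.zeros((14, 14), float)
for lag in range(1,31):  # look at lags from 1 to 30 minutes
    for i in range(14):
        for j in range(14):
            # the test asks whether the second column (j) helps to predict the first (i)
            results = grangercausalitytests(close_columns[[i, j]].pct_change().dropna(), [lag], verbose=False)
            # keep the F statistic of the SSR-based F test
            granger_matrix[i,j] = results[lag][0]['ssr_ftest'][0]
    # the matrix is overwritten at each lag, so this is the maximum for the current lag
    print("For a lag of",lag,"minutes the global maximum value is",np.amax(granger_matrix))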

We can see that the strongest value occurs at a lag of 1 minute. Let us go back and take a closer look at the values for this particular lag:

In [None]:
lag = 1
for i in range(14):
    for j in range(14):
        results = grangercausalitytests(close_columns[[i, j]].pct_change().dropna(), [lag], verbose=False)
        granger_matrix[i,j] = results[lag][0]['ssr_ftest'][0]

plt.rcParams["figure.figsize"] = (12, 12)
sns.heatmap(granger_matrix, cmap='YlOrBr', annot=True, fmt='.1f');
plt.yticks(asset_details.Asset_ID.values +0.5, asset_details.Asset_Name.values, rotation='horizontal');
plt.xticks(asset_details.Asset_ID.values +0.5, asset_details.Asset_Name.values, rotation='vertical');

We can see that the strongest signal corresponds to Ethereum (Asset_ID=6) seemingly causing changes in IOTA (Asset_ID=8):

In [None]:
results = grangercausalitytests(close_columns[[8, 6]].pct_change().dropna(), [1])

Let us make an interactive plot of the [percentage change](https://pandas.pydata.org/pandas-docs/stable/reference/api/pandas.DataFrame.pct_change.html) in these two time series together (with no lag applied):

In [None]:
IOTA_pct_change = pd.Series(close_columns[8]).pct_change().dropna()
Ethereum_pct_change = pd.Series(close_columns[6]).pct_change().dropna()

# https://matplotlib.org/stable/gallery/color/named_colors.html
color_1 = 'blue'
color_2 = 'olive'
color_3 = 'orange'
color_4 = 'crimson'
color_5 = 'limegreen'
color_6 = 'red'
color_7 = 'teal'
color_8 = 'yellowgreen'

fig  = px.line(y=IOTA_pct_change)
fig.update_traces(line_color=color_2)

fig_2  = px.line(y=Ethereum_pct_change)
fig_2.update_traces(line_color=color_4)
fig.add_trace(fig_2.data[0])

fig.update_layout(
    title="IOTA in olive green and Ethereum in crimson",
    xaxis_title="Time",
    yaxis_title="Fractional change",)  # pct_change returns fractions, not percentages
fig.show();

### Conclusion 
It looks as though *all* of the other 13 cryptocurrencies Granger-cause changes in the IOTA cryptocurrency (Asset_ID=8) one minute later, at least on this one day.

I am not at all sure whether to believe/trust this conclusion at the moment.

### See also

* ["*Granger causality Part II: The Movie*"](https://www.kaggle.com/carlmcbrideellis/granger-causality-part-ii-the-movie) - where one can find a heatmap of the 1-minute lag, animated day by day from the 1st of May 2021 up to the 20th of September 2021.

### Related reading

* [C. W. J. Granger "*Investigating Causal Relations by Econometric Models and Cross-spectral Methods*",  Econometrica **Vol. 37** pp. 424-438 (1969)](https://doi.org/10.2307/1912791)

### Appendix
The asset details, mapping each `Asset_ID` to its `Asset_Name`:

In [None]:
asset_details