# Granger Causality Test: EURUSD vs EURGBP

In [35]:
import pandas as pd
import numpy as np
from statsmodels.tsa.api import VAR


In [6]:

df = pd.read_csv(
    "/Users/nathr/Downloads/HISTDATA_COM_ASCII_EURUSD_M1202511/eurusdnov.csv",
    sep=";",
    header=None,
    names=["datetime", "open", "high", "low", "close", "volume"]
)

# Split datetime
df["datetime"] = pd.to_datetime(
    df["datetime"],
    format="%Y%m%d %H%M%S"
)

df = df.set_index("datetime")

print(df.head())


                        open     high      low    close  volume
datetime                                                       
2025-11-02 17:00:00  1.15294  1.15324  1.15294  1.15324       0
2025-11-02 17:01:00  1.15324  1.15329  1.15289  1.15328       0
2025-11-02 17:02:00  1.15328  1.15328  1.15328  1.15328       0
2025-11-02 17:03:00  1.15295  1.15328  1.15295  1.15328       0
2025-11-02 17:04:00  1.15295  1.15295  1.15295  1.15295       0


In [8]:
df2 = pd.read_csv(
    "/Users/nathr/Downloads/HISTDATA_COM_ASCII_EURGBP_M1202511/eurgbpnov.csv",
    sep=";",
    header=None,
    names=["datetime", "open", "high", "low", "close", "volume"]
)

# Split datetime
df2["datetime"] = pd.to_datetime(
    df2["datetime"],
    format="%Y%m%d %H%M%S"
)

df2 = df2.set_index("datetime")

print(df2.head())


                        open     high      low    close  volume
datetime                                                       
2025-11-02 17:04:00  0.87737  0.87737  0.87737  0.87737       0
2025-11-02 17:05:00  0.87738  0.87772  0.87711  0.87772       0
2025-11-02 17:06:00  0.87773  0.87773  0.87740  0.87773       0
2025-11-02 17:07:00  0.87772  0.87773  0.87741  0.87741       0
2025-11-02 17:08:00  0.87740  0.87771  0.87740  0.87771       0


In [10]:
combined = df.merge(df2, how='inner', left_index=True, right_index=True, suffixes=('_eurusd','_eurgbp'))
combined.head()

Unnamed: 0_level_0,open_eurusd,high_eurusd,low_eurusd,close_eurusd,volume_eurusd,open_eurgbp,high_eurgbp,low_eurgbp,close_eurgbp,volume_eurgbp
datetime,Unnamed: 1_level_1,Unnamed: 2_level_1,Unnamed: 3_level_1,Unnamed: 4_level_1,Unnamed: 5_level_1,Unnamed: 6_level_1,Unnamed: 7_level_1,Unnamed: 8_level_1,Unnamed: 9_level_1,Unnamed: 10_level_1
2025-11-02 17:04:00,1.15295,1.15295,1.15295,1.15295,0,0.87737,0.87737,0.87737,0.87737,0
2025-11-02 17:05:00,1.15306,1.15329,1.15295,1.15319,0,0.87738,0.87772,0.87711,0.87772,0
2025-11-02 17:06:00,1.153,1.15331,1.153,1.15302,0,0.87773,0.87773,0.8774,0.87773,0
2025-11-02 17:07:00,1.15303,1.15309,1.15303,1.15305,0,0.87772,0.87773,0.87741,0.87741,0
2025-11-02 17:08:00,1.15307,1.15309,1.15306,1.15309,0,0.8774,0.87771,0.8774,0.87771,0


In [16]:
### 1. Assumptions
VOL_WINDOW = 30  # rolling volatility window in minutes
MAX_LAG = 60      # minutes for lagged correlation
Significance_level = 0.05

In [22]:
def compute_log_returns(df, price_col):
    return np.log(df[price_col]).diff()

r_eurusd = compute_log_returns(combined, "close_eurusd")
r_eurgbp = compute_log_returns(combined, "close_eurgbp")

In [30]:
def rolling_vol(r, vol_window):
    return r.rolling(vol_window).std()

vol_eurusd = rolling_vol(r_eurusd, VOL_WINDOW)
vol_eurgbp = rolling_vol(r_eurgbp, VOL_WINDOW)


## ADF Stationarity Test

The **Augmented Dickey-Fuller (ADF) test** checks whether a time series is **stationary** — i.e., its mean, variance, and autocorrelation structure do not change over time.

- **Null hypothesis (H0):** The series has a unit root (non-stationary).
- **Alternative hypothesis (H1):** The series is stationary.

**Interpretation:**
- p-value < 0.05 → reject H0 → series is stationary (good for Granger causality).
- p-value > 0.05 → fail to reject H0 → series may be non-stationary (differences may be needed).

**Why important:**
- Granger causality assumes **stationary input series**.
- Using non-stationary series can produce **spurious results**.


In [32]:
from statsmodels.tsa.stattools import adfuller

def check_stationarity(series):
    result = adfuller(series.dropna())
    print(f"ADF Statistic: {result[0]:.4f}, p-value: {result[1]:.4f}")
    
check_stationarity(vol_eurusd)
check_stationarity(vol_eurgbp)


ADF Statistic: -10.8384, p-value: 0.0000
ADF Statistic: -13.3498, p-value: 0.0000


## VAR Lag Order Selection

Before running Granger causality, we must choose the **number of lags** (how many past periods to include).

- Tested using **Information Criteria**:
  - **AIC:** Akaike Information Criterion
  - **BIC:** Bayesian Information Criterion (stronger penalty for extra lags)
  - **FPE:** Final Prediction Error
  - **HQIC:** Hannan-Quinn Information Criterion

**Interpretation:**
- Each criterion suggests the **optimal number of lags** (row with minimum value).
- Lags = 60 (for minute-level data) is a common choice in intraday research:
  - 60 minutes ≈ 1 hour
  - Captures short-term dynamics without overfitting.
- You can also test robustness by checking ± a few lags around the suggested value.

**Example Output Explanation:**
- *BIC highlights lag 32 as minimum* → VAR model with **32 lags** is preferred by BIC.
- AIC, HQIC, FPE may suggest slightly different lags; often BIC is more conservative.


In [37]:
# df_gc = DataFrame with stationary series
model = VAR(df_gc)
results = model.select_order(maxlags=60)  # test lags from 1 to 60
print(results.summary())


  self._init_dates(dates, freq)


 VAR Order Selection (* highlights the minimums)  
       AIC         BIC         FPE         HQIC   
--------------------------------------------------
0       -40.61      -40.61   2.315e-18      -40.61
1       -48.48      -48.48   8.814e-22      -48.48
2       -48.51      -48.51   8.562e-22      -48.51
3       -48.52      -48.51   8.514e-22      -48.51
4       -48.52      -48.51   8.478e-22      -48.52
5       -48.52      -48.51   8.472e-22      -48.52
6       -48.52      -48.52   8.448e-22      -48.52
7       -48.52      -48.52   8.440e-22      -48.52
8       -48.52      -48.52   8.431e-22      -48.52
9       -48.53      -48.51   8.425e-22      -48.52
10      -48.53      -48.51   8.422e-22      -48.52
11      -48.53      -48.51   8.421e-22      -48.52
12      -48.53      -48.51   8.419e-22      -48.52
13      -48.53      -48.51   8.418e-22      -48.52
14      -48.53      -48.51   8.416e-22      -48.52
15      -48.53      -48.51   8.413e-22      -48.52
16      -48.53      -48.51   8.

## Granger Causality Test: EURUSD → EURGBP

The **Granger causality test** examines whether past values of one time series help predict another.  

- **Null hypothesis (H0):** EURUSD does **not** Granger-cause EURGBP  
- **Alternative hypothesis (H1):** EURUSD **does** Granger-cause EURGBP  

**Test statistics (using 32 lags, chosen from VAR lag selection):**

**Interpretation:**
- p-value < 0.05 → reject H0 → past EURUSD returns contain information useful for predicting EURGBP returns.
- This suggests a **lead-lag relationship**: EURUSD leads EURGBP at intraday timescales.

In [49]:
from statsmodels.tsa.stattools import grangercausalitytests

# Combine into a DataFrame
df_gc = pd.concat([vol_eurusd, vol_eurgbp], axis=1).dropna()
df_gc.columns = ["EURUSD", "EURGBP"]

# Test if EURUSD predicts EURGBP
grangercausalitytests(df_gc, maxlag=MAX_LAG)



Granger Causality
number of lags (no zero) 1
ssr based F test:         F=12.1796 , p=0.0005  , df_denom=28891, df_num=1
ssr based chi2 test:   chi2=12.1809 , p=0.0005  , df=1
likelihood ratio test: chi2=12.1783 , p=0.0005  , df=1
parameter F test:         F=12.1796 , p=0.0005  , df_denom=28891, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=15.2541 , p=0.0000  , df_denom=28888, df_num=2
ssr based chi2 test:   chi2=30.5136 , p=0.0000  , df=2
likelihood ratio test: chi2=30.4975 , p=0.0000  , df=2
parameter F test:         F=15.2541 , p=0.0000  , df_denom=28888, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=10.1153 , p=0.0000  , df_denom=28885, df_num=3
ssr based chi2 test:   chi2=30.3533 , p=0.0000  , df=3
likelihood ratio test: chi2=30.3374 , p=0.0000  , df=3
parameter F test:         F=10.1153 , p=0.0000  , df_denom=28885, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=9.1841  

MemoryError: Unable to allocate 13.4 MiB for an array with shape (28835, 61) and data type float64

### Reverse Test: EURGBP → EURUSD

- Repeat the same procedure, swapping the two series: test whether EURGBP **Granger-causes** EURUSD.
- This will tell you whether the causality is **bidirectional** or only one-way.
- Use the same **lag length** (from VAR selection) for consistency.


In [43]:
from statsmodels.tsa.stattools import grangercausalitytests

# Combine into a DataFrame
df_gc = pd.concat([vol_eurgbp, vol_eurusd], axis=1).dropna()
df_gc.columns = ["EURGBP", "EURUSD"]

# Test if EURUSD predicts EURGBP
grangercausalitytests(df_gc, maxlag=MAX_LAG)



Granger Causality
number of lags (no zero) 1
ssr based F test:         F=30.4344 , p=0.0000  , df_denom=28891, df_num=1
ssr based chi2 test:   chi2=30.4376 , p=0.0000  , df=1
likelihood ratio test: chi2=30.4216 , p=0.0000  , df=1
parameter F test:         F=30.4344 , p=0.0000  , df_denom=28891, df_num=1

Granger Causality
number of lags (no zero) 2
ssr based F test:         F=27.1806 , p=0.0000  , df_denom=28888, df_num=2
ssr based chi2 test:   chi2=54.3707 , p=0.0000  , df=2
likelihood ratio test: chi2=54.3196 , p=0.0000  , df=2
parameter F test:         F=27.1806 , p=0.0000  , df_denom=28888, df_num=2

Granger Causality
number of lags (no zero) 3
ssr based F test:         F=18.9305 , p=0.0000  , df_denom=28885, df_num=3
ssr based chi2 test:   chi2=56.8054 , p=0.0000  , df=3
likelihood ratio test: chi2=56.7496 , p=0.0000  , df=3
parameter F test:         F=18.9305 , p=0.0000  , df_denom=28885, df_num=3

Granger Causality
number of lags (no zero) 4
ssr based F test:         F=15.2439 

{1: ({'ssr_ftest': (30.434445961783076, 3.4829921666316595e-08, 28891.0, 1),
   'ssr_chi2test': (30.437606230997897, 3.447790480847797e-08, 1),
   'lrtest': (30.42158564273268, 3.4763821080324424e-08, 1),
   'params_ftest': (30.434445961781346, 3.4829921666316595e-08, 28891.0, 1.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x20410d4c800>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x2040f48a720>,
   array([[0., 1., 0.]])]),
 2: ({'ssr_ftest': (27.180636681573237, 1.609504840788259e-12, 28888.0, 2),
   'ssr_chi2test': (54.370682334581524, 1.5615516098282101e-12, 2),
   'lrtest': (54.31958920264151, 1.601957813425997e-12, 2),
   'params_ftest': (27.180636681508417, 1.609504840892563e-12, 28888.0, 2.0)},
  [<statsmodels.regression.linear_model.RegressionResultsWrapper at 0x20402899ee0>,
   <statsmodels.regression.linear_model.RegressionResultsWrapper at 0x2040289b830>,
   array([[0., 0., 1., 0., 0.],
          [0., 0., 0., 1., 0.]])]),


### EURGBP → EURUSD

- Same tests applied in reverse.
- **Interpretation:** past EURGBP returns also contain information useful for predicting EURUSD returns.


---
### Conclusion

- There is **bidirectional Granger causality** between EURUSD and EURGBP at intraday timescales.
- Both series **lead each other slightly**, consistent with overlapping liquidity and news reactions.
- However, the **magnitude of the effect is small**; FX markets are highly efficient and adjust quickly, so the potential alpha from exploiting this is likely minimal in practice.



### Drawbacks

- Granger causality does not guarantee **tradable alpha** — it only shows **predictive relationships** in-sample.
- FX markets are very fast; even if statistically significant, execution costs and slippage may erase any gains.
- Results depend on **lag selection** and **volatility window**, so robustness checks are essential.
- Intraday results can be sensitive to **data quality**, **missing ticks**, and **microstructure effects**.
