# Chapter 59: Causal Inference

## Learning Objectives

By the end of this chapter, you will be able to:

- Distinguish between correlation and causation and understand why causation matters for decision‑making in finance
- Represent causal assumptions using Directed Acyclic Graphs (DAGs) and understand structural causal models
- Apply do‑calculus to reason about interventions and estimate causal effects from observational data
- Implement instrumental variables regression to estimate causal effects when unobserved confounders exist
- Use propensity score matching to estimate treatment effects in observational studies
- Apply difference‑in‑differences and synthetic control methods to evaluate policy or event impacts
- Conduct causal discovery from time‑series data using Granger causality and more advanced methods (PCMCI)
- Build a complete causal inference pipeline for a NEPSE‑related question (e.g., does news sentiment cause stock returns?)
- Recognise the limitations and assumptions behind causal inference methods and avoid common pitfalls

---

## Introduction

Throughout this handbook, we have built models that predict stock prices, detect anomalies, and forecast multiple series. These models capture correlations and patterns, but they do not answer **why** questions. Why did a stock price rise? Was it because of positive news, or because the overall market went up? If we intervene—say, by publishing a positive article—will the price increase? Answering such questions requires moving beyond correlation to **causation**.

In finance, causal inference is crucial for:

- **Policy evaluation**: What is the effect of a regulatory change (e.g., a new circuit breaker rule) on market volatility?
- **Investment decisions**: Does a specific factor (e.g., ESG score) actually cause higher returns, or is it just correlated?
- **Risk management**: Understanding the causal drivers of extreme events helps in designing stress tests.
- **Attribution**: Decomposing a portfolio’s performance into contributions from various factors.

Causal inference aims to estimate the effect of an intervention (treatment) on an outcome, using observational data. Because we can rarely run controlled experiments in financial markets (we cannot randomly assign stocks to receive positive news), we must rely on statistical methods that adjust for confounding and selection bias.

In this chapter, we will introduce the core concepts of causal inference and walk through several methods, using the NEPSE system as a running example. We will use Python libraries such as `causalnex`, `statsmodels`, `causalml`, `causalimpact`, and `tigramite` to implement these techniques.

---

## 59.1 Causality vs. Correlation

The famous adage “correlation does not imply causation” warns us that two variables moving together does not mean one causes the other. A correlation can arise due to:

- **Common cause (confounder)**: A third variable affects both X and Y, creating a spurious association. For example, overall market sentiment affects both news sentiment and stock returns, so they are correlated even if news has no direct effect.
- **Reverse causation**: Y might cause X. For instance, high stock returns might lead to more positive news coverage, not the other way around.
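- **Selection bias**: The sample is not representative of the population. For example, analysing only stocks that are still listed today ignores delisted firms and can overstate average returns (survivorship bias).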

Causal inference provides tools to untangle these possibilities, under certain assumptions.

**Example in NEPSE:** Suppose we observe that days with high trading volume are followed by higher returns. Could volume cause returns? Or is there a common cause (e.g., news) that drives both volume and returns? Causal inference helps answer such questions.
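
The common-cause case can be made concrete with a small simulation. The sketch below uses made-up coefficients and a hypothetical `market` variable that drives both `news` and `returns`, while news has no effect on returns at all. It is an illustration of the idea, not NEPSE data.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 5_000

# Hypothetical data-generating process: market mood drives both variables,
# and news has NO effect on returns.
market = rng.normal(size=n)
news = 0.8 * market + rng.normal(size=n)
returns = 0.5 * market + rng.normal(size=n)

# Raw association between news and returns
raw_corr = np.corrcoef(news, returns)[0, 1]

# Regress returns on news AND the confounder; the news coefficient should vanish
X = np.column_stack([np.ones(n), news, market])
coef, *_ = np.linalg.lstsq(X, returns, rcond=None)

print(f"raw correlation:          {raw_corr:.3f}")
print(f"news coefficient (adj.):  {coef[1]:.3f}")
```

The raw correlation is clearly positive even though the simulated effect is zero; once the confounder enters the regression, the news coefficient is close to zero.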

---

## 59.2 Causal Graphs and Structural Causal Models

A **causal graph** (or Directed Acyclic Graph, DAG) encodes our assumptions about the causal relationships between variables. Nodes represent variables, and directed edges represent direct causal effects. A DAG is acyclic: no variable can cause itself through a cycle.

**Example DAG for stock returns:**

```
News -------> Volume
   \             |
    \            v
     +------> Returns
```

This graph says: News affects both Volume and Returns, and Volume also affects Returns. The edge from News to Returns indicates a direct effect, while the path News → Volume → Returns is an indirect effect.

### 59.2.1 Structural Causal Models (SCM)

An SCM extends the graph by specifying the functional form of each causal relationship. For example:

`Volume = f_Volume(News, U_Volume)`
`Returns = f_Returns(News, Volume, U_Returns)`

where `U` terms represent exogenous noise (unobserved factors). The SCM allows us to simulate interventions (e.g., setting News to a fixed value) and compute the resulting distribution of Returns.
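
To illustrate what "simulating an intervention" means, here is a minimal sketch of a linear SCM for this graph. The linear functional forms and the coefficients (0.7, 0.3, 0.2) are assumptions chosen for the example; the function name `simulate` is hypothetical.

```python
import numpy as np

def simulate(n, rng, do_news=None):
    """Draw n samples from a toy linear SCM; do_news overrides News if given."""
    u_news, u_vol, u_ret = rng.normal(size=(3, n))
    news = u_news if do_news is None else np.full(n, float(do_news))
    volume = 0.7 * news + u_vol                   # f_Volume(News, U_Volume)
    returns = 0.3 * news + 0.2 * volume + u_ret   # f_Returns(News, Volume, U_Returns)
    return news, volume, returns

rng = np.random.default_rng(1)
_, _, r0 = simulate(100_000, rng, do_news=0.0)
_, _, r1 = simulate(100_000, rng, do_news=1.0)

# Total effect of News on Returns = direct (0.3) + indirect (0.7 * 0.2)
total_effect = r1.mean() - r0.mean()
print(f"E[Returns | do(News=1)] - E[Returns | do(News=0)] = {total_effect:.3f}")
```

Under these assumed coefficients the difference should be close to 0.3 + 0.7 × 0.2 = 0.44, the sum of the direct and indirect paths.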

In Python, we can define a DAG using the `causalnex` library.

```python
from causalnex.structure import StructureModel

sm = StructureModel()
sm.add_edges_from([
    ('News', 'Volume'),
    ('News', 'Returns'),
    ('Volume', 'Returns')
])
```

We can also learn the graph structure from data (causal discovery), which we will cover later.

---

## 59.3 Do‑Calculus and Interventions

The key operation in causal inference is the **intervention**: setting a variable to a specific value, regardless of its usual causes. This is denoted by the **do‑operator**: `do(X = x)`. The interventional distribution `P(Y | do(X = x))` answers “what would Y be if we forced X to be x?” This is different from the conditional distribution `P(Y | X = x)`, which is observational.

**Do‑calculus** is a set of rules that allow us to express an interventional distribution in terms of observational quantities, provided the graph is known and certain conditions hold. For example, if there are no confounders, `P(Y | do(X)) = P(Y | X)`. In general, we need to adjust for confounders.

### 59.3.1 Adjusting for Confounders

If a set of variables `Z` satisfies the **back‑door criterion** (no variable in `Z` is a descendant of X, and `Z` blocks every path between X and Y that starts with an arrow into X), the causal effect can be estimated by:

`P(Y | do(X = x)) = Σ_z P(Y | X = x, Z = z) P(Z = z)`

This is called the **adjustment formula** (or back‑door adjustment).

In our DAG, the total effect of News on Returns is not confounded, since News has no parents in the graph (it is exogenous). But the effect of Volume on Returns is confounded by News (News is a common cause of Volume and Returns). To estimate `P(Returns | do(Volume = v))`, we need to adjust for News:

`P(Returns | do(Volume = v)) = Σ_news P(Returns | Volume = v, News = news) P(News = news)`
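
The adjustment formula can be checked numerically. The sketch below assumes binary News and Volume variables and illustrative coefficients (the true effect of Volume on Returns is set to 1.0). It compares the naive difference in means with the back‑door adjusted difference; `adjusted_mean` is a hypothetical helper.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 200_000

# Hypothetical binary SCM: News -> Volume, News -> Returns, Volume -> Returns
news = rng.binomial(1, 0.4, size=n)
volume = rng.binomial(1, 0.2 + 0.6 * news)                 # high volume is likelier on news days
returns = 1.0 * volume + 2.0 * news + rng.normal(size=n)   # true effect of Volume is 1.0

def adjusted_mean(v):
    """Sum over news of E[Returns | Volume=v, News=news] * P(News=news)."""
    total = 0.0
    for z in (0, 1):
        stratum = (volume == v) & (news == z)
        total += returns[stratum].mean() * (news == z).mean()
    return total

naive = returns[volume == 1].mean() - returns[volume == 0].mean()
adjusted = adjusted_mean(1) - adjusted_mean(0)
print(f"naive difference:    {naive:.3f}")
print(f"back-door adjusted:  {adjusted:.3f}")
```

The naive difference is inflated because high‑volume days are disproportionately news days; weighting each stratum by `P(News)` instead of `P(News | Volume)` removes that imbalance.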

---

## 59.4 Instrumental Variables

An **instrumental variable (IV)** is a variable that affects the treatment (X) but has no direct effect on the outcome (Y) and is not correlated with confounders. IV methods can estimate causal effects even when there are unobserved confounders.

**Conditions for a valid instrument Z:**
1. Relevance: Z affects X.
2. Exclusion: Z affects Y only through X.
3. Exogeneity: Z is independent of all confounders.

In finance, natural experiments can provide instruments. For example, a change in tax law that affects some stocks but not others could be an instrument for studying the effect of taxes on returns.

**Example: Effect of news sentiment on returns using an instrument**

Suppose we believe that news sentiment (X) affects returns (Y), but sentiment might be confounded by market sentiment (unobserved). We could use an instrument Z such as the occurrence of a natural disaster in a country unrelated to the stock’s business, which affects news coverage but not returns directly. This is unrealistic but illustrative.

**Implementation with `statsmodels` IV2SLS**

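```python
import statsmodels.api as sm
from statsmodels.sandbox.regression.gmm import IV2SLS

# Assume df has columns: returns (Y), sentiment (X), instrument (Z)
data = df[['returns', 'sentiment', 'instrument']].dropna().copy()

# First stage: X on Z (plus any exogenous controls)
first_stage = sm.OLS(data['sentiment'], sm.add_constant(data['instrument'])).fit()
print(first_stage.fvalue)  # weak-instrument check: a small F-statistic is a warning sign
data['sentiment_hat'] = first_stage.fittedvalues

# Second stage: Y on predicted X.
# The point estimate is the 2SLS estimate, but these standard errors are wrong
# because they ignore the estimation error in sentiment_hat.
second_stage = sm.OLS(data['returns'], sm.add_constant(data['sentiment_hat'])).fit()
print(second_stage.params)

# IV2SLS gives the same point estimate with appropriate standard errors.
# Both exog and instrument must include the constant.
exog = sm.add_constant(data[['sentiment']])
instruments = sm.add_constant(data[['instrument']])
iv_model = IV2SLS(data['returns'], exog, instruments).fit()
print(iv_model.summary())
```

**Explanation:**  
The IV estimator uses the instrument to isolate the exogenous variation in the treatment. The coefficient on `sentiment_hat` (equivalently, on `sentiment` in the `IV2SLS` output) estimates the causal effect of sentiment on returns, under the assumption that the instrument is valid. Use the `IV2SLS` standard errors for inference; those from the manual second stage ignore the first‑stage estimation error.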
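
To see why the instrument helps, consider a simulated example with a confounder `u` that we pretend not to observe. All coefficients are assumptions for illustration (the true effect is 0.5). With one instrument and one treatment, the 2SLS estimate reduces to the ratio `cov(Z, Y) / cov(Z, X)`, which is what the sketch computes.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000

# Hypothetical setup: u is an unobserved confounder, z is a valid instrument
u = rng.normal(size=n)
z = rng.normal(size=n)
sentiment = 0.8 * z + u + rng.normal(size=n)
returns = 0.5 * sentiment + u + rng.normal(size=n)   # true causal effect = 0.5

def slope(x, y):
    """OLS slope of y on x (with intercept)."""
    return np.cov(x, y)[0, 1] / np.var(x, ddof=1)

ols_est = slope(sentiment, returns)                              # biased by u
iv_est = np.cov(z, returns)[0, 1] / np.cov(z, sentiment)[0, 1]   # ratio form of 2SLS

print(f"OLS estimate: {ols_est:.3f}")
print(f"IV estimate:  {iv_est:.3f}")
```

OLS absorbs the confounder's influence and overstates the effect, while the IV ratio uses only the variation in sentiment that comes from `z`.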

---

## 59.5 Propensity Score Matching

When the treatment is binary (e.g., a stock is included in an index vs. not), and we have many confounders, **propensity score matching** can estimate the average treatment effect on the treated (ATT). The propensity score is the probability of receiving treatment given covariates. By matching treated units with untreated units that have similar propensity scores, we mimic a randomised experiment.

**Example: Effect of being added to the NEPSE index on stock returns**

Suppose we want to know whether inclusion in a major index (treatment) causes higher returns. Stocks added to the index may differ from those not added (e.g., larger market cap). We can match each added stock with a non‑added stock that had similar characteristics (size, volatility, sector) before the inclusion.

**Implementation with `pymatch` or `causalml`**

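```python
# Propensity score matching with scikit-learn and `causalml`
from sklearn.linear_model import LogisticRegression
from causalml.match import NearestNeighborMatch

# data: treatment (1/0), outcome (returns after inclusion), numeric covariates
# (sector is assumed to be one-hot encoded into columns named 'sector_*')
covariates = ['size', 'volatility'] + [c for c in data.columns if c.startswith('sector_')]

# Step 1: estimate the propensity score P(treatment = 1 | covariates)
ps_model = LogisticRegression(max_iter=1000).fit(data[covariates], data['treatment'])
data['pscore'] = ps_model.predict_proba(data[covariates])[:, 1]

# Step 2: nearest-neighbour matching on the estimated score
psm = NearestNeighborMatch(replace=True, caliper=0.2)
matched = psm.match(data=data, treatment_col='treatment', score_cols=['pscore'])
```

**Explanation:**  
`NearestNeighborMatch` does not estimate propensity scores itself; it matches treated and control units on whatever columns are passed in `score_cols`. To match on the propensity score, estimate it first (e.g., using logistic regression) and pass that column. The average difference in outcomes between treated units and their matched controls estimates the ATT.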
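
The mechanics of matching can also be sketched without any causal library. The example below uses simulated data with a single covariate (`size`) and an assumed true ATT of 1.0. The helper `propensity_scores` is a hypothetical, bare‑bones logistic regression fitted by Newton's method, standing in for whatever propensity model you prefer.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 4_000

# Hypothetical data: larger firms are more likely to be added to the index
size = rng.normal(size=n)
treated = rng.binomial(1, 1 / (1 + np.exp(-size)))
outcome = 1.0 * treated + 2.0 * size + rng.normal(size=n)   # true ATT = 1.0

def propensity_scores(x, t, n_iter=25):
    """Logistic regression of t on x, fitted by Newton's method."""
    X = np.column_stack([np.ones(len(x)), x])
    beta = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ beta))
        grad = X.T @ (t - p)
        hess = (X * (p * (1 - p))[:, None]).T @ X
        beta += np.linalg.solve(hess, grad)
    return 1 / (1 + np.exp(-X @ beta))

ps = propensity_scores(size, treated)
ps_t, ps_c = ps[treated == 1], ps[treated == 0]
y_t, y_c = outcome[treated == 1], outcome[treated == 0]

# 1-nearest-neighbour matching with replacement on the propensity score
nearest = np.abs(ps_t[:, None] - ps_c[None, :]).argmin(axis=1)
att = (y_t - y_c[nearest]).mean()
naive = y_t.mean() - y_c.mean()

print(f"naive difference: {naive:.3f}")
print(f"matched ATT:      {att:.3f}")
```

The naive comparison mixes the treatment effect with the size difference between the groups; the matched estimate compares each treated firm with a control of similar estimated propensity. In practice you would also check covariate balance in the matched sample before trusting the estimate.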

---

## 59.6 Difference‑in‑Differences

**Difference‑in‑Differences (DiD)** is used when we have panel data (multiple units observed over time) and a treatment that affects some units at a specific time. It compares the change in outcome for treated units before and after treatment to the change for untreated units over the same period. This removes time‑invariant unobserved confounders.

**Example: Effect of a circuit breaker rule change on volatility**

Suppose NEPSE introduced a new circuit breaker rule on January 1, 2024. We want to estimate its effect on daily volatility. We compare the change in volatility for stocks affected by the rule (all stocks) to a control group of unaffected stocks (e.g., from another exchange) over the same period. If no control exists, we might use a synthetic control (next section).

**Implementation with `statsmodels`**

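```python
from statsmodels.formula.api import ols

# data: panel with columns stock, date, volatility,
#   treated: 1 if the stock belongs to the treated group (in every period), else 0
#   post:    1 if the date is on or after the treatment date, else 0
# 'treated * post' expands to treated + post + treated:post
data = data.dropna(subset=['volatility', 'treated', 'post'])
model = ols('volatility ~ treated * post', data=data).fit(
    cov_type='cluster', cov_kwds={'groups': data['stock'].astype('category').cat.codes}
)  # cluster by stock: repeated observations of one stock are not independent
print(model.summary())
```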

**Explanation:**  
The coefficient on the interaction term `treated:post` is the DiD estimate of the treatment effect. It assumes parallel trends in the absence of treatment (the untreated group’s trend is a valid counterfactual for the treated group).
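
In the simplest two‑group, two‑period case, the DiD estimate is just a difference of four cell means. The sketch below uses simulated data with an assumed treatment effect of -0.4 and confirms that the cell‑mean calculation and the regression interaction coefficient agree.

```python
import numpy as np

rng = np.random.default_rng(5)
n = 20_000

# Hypothetical panel flattened to rows: group indicator and period indicator
treated = rng.binomial(1, 0.5, size=n)
post = rng.binomial(1, 0.5, size=n)
true_effect = -0.4   # the rule is assumed to lower volatility
volatility = (2.0 + 0.5 * treated + 0.3 * post
              + true_effect * treated * post + rng.normal(scale=0.5, size=n))

def cell_mean(g, p):
    """Mean volatility for group g (0/1) in period p (0/1)."""
    return volatility[(treated == g) & (post == p)].mean()

# (treated: after - before) minus (control: after - before)
did = (cell_mean(1, 1) - cell_mean(1, 0)) - (cell_mean(0, 1) - cell_mean(0, 0))

# The same number is the interaction coefficient in an OLS regression
X = np.column_stack([np.ones(n), treated, post, treated * post])
coef, *_ = np.linalg.lstsq(X, volatility, rcond=None)

print(f"DiD from cell means:     {did:.3f}")
print(f"interaction coefficient: {coef[3]:.3f}")
```

The group difference (0.5) and the common time shift (0.3) both cancel out, which is exactly what the parallel‑trends assumption buys.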

---

## 59.7 Synthetic Control

**Synthetic control** is used when there is a single treated unit and multiple untreated units. It constructs a weighted average of untreated units (the synthetic control) that closely matches the treated unit’s pre‑treatment trajectory. The post‑treatment difference between the treated unit and its synthetic control estimates the treatment effect.

**Example: Effect of a new IPO on the sector index**

Suppose a large company (e.g., a hydropower firm) IPOs and is added to the NEPSE index. We want to know its effect on the hydropower sector index. The sector index after the IPO is influenced by the new stock. We can create a synthetic control for the sector index using other sector indices (banking, insurance) that were not affected, and compare.

**Implementation with `synth` or `causalimpact`**

```python
from causalimpact import CausalImpact

# data: DataFrame indexed by date. The first column is the treated series
# (sector index); the remaining columns are the control series (other sector indices)
pre_period = ['2023-01-01', '2023-12-31']
post_period = ['2024-01-01', '2024-06-30']

impact = CausalImpact(data, pre_period, post_period)
impact.plot()
print(impact.summary())
```

**Explanation:**  
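`CausalImpact` fits a Bayesian structural time‑series model on the pre‑period, using the control series as regressors, and forecasts the counterfactual for the post‑period. This is closely related to, but not the same as, the classical synthetic control estimator, which uses a non‑negative weighted average of untreated units. Because the model is Bayesian, the summary reports posterior credible intervals and a posterior tail‑area probability rather than frequentist confidence intervals and p‑values.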
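
For comparison, the weighted‑average construction described at the start of this section can be sketched directly. The example below is a simplified version that matches only the pre‑treatment outcome path (no additional covariates), using simulated donor series and an assumed effect of 2.0. It relies on `scipy.optimize.minimize` to find non‑negative weights that sum to one.

```python
import numpy as np
from scipy.optimize import minimize

rng = np.random.default_rng(6)
T0, T1 = 100, 30                      # pre- and post-treatment lengths

# Hypothetical donor pool: three untreated sector indices (random walks)
donors = 10 + np.cumsum(rng.normal(scale=0.5, size=(T0 + T1, 3)), axis=0)
true_w = np.array([0.6, 0.4, 0.0])
treated = donors @ true_w + rng.normal(scale=0.05, size=T0 + T1)
treated[T0:] += 2.0                   # assumed treatment effect after the event

def pre_period_loss(w):
    """Mean squared gap between the treated unit and the weighted donors."""
    return np.mean((treated[:T0] - donors[:T0] @ w) ** 2)

# Weights are non-negative and sum to one, fitted on the pre-period only
res = minimize(
    pre_period_loss,
    x0=np.full(3, 1 / 3),
    bounds=[(0, 1)] * 3,
    constraints={"type": "eq", "fun": lambda w: w.sum() - 1},
    method="SLSQP",
    options={"ftol": 1e-10, "maxiter": 1000},
)
weights = res.x

effect = np.mean(treated[T0:] - donors[T0:] @ weights)
print("weights:", np.round(weights, 3))
print(f"estimated effect: {effect:.3f}")
```

The weights are estimated only from pre‑treatment data, so the post‑treatment gap between the treated series and its synthetic counterpart is not fitted by construction.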

---

## 59.8 Causal Discovery from Time‑Series

Causal discovery aims to learn the causal graph from observational data. For time‑series, we must account for temporal order: causes precede effects. Common methods include:

- **Granger causality**: Tests whether past values of X help predict Y, controlling for past Y. This is a statistical notion of predictive causality.
- **PCMCI**: A more sophisticated method that combines conditional independence tests with the momentary conditional independence (MCI) approach, handling autocorrelation and high dimensionality.

### 59.8.1 Granger Causality

Granger causality tests whether X “Granger‑causes” Y. It compares two regressions for Y: a restricted one with only lags of Y, and an unrestricted one with lags of both X and Y. If the unrestricted model predicts significantly better (for example, by an F‑test), X Granger‑causes Y.

**Implementation with `statsmodels`**

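```python
from statsmodels.tsa.stattools import grangercausalitytests

# grangercausalitytests checks whether the SECOND column Granger-causes the FIRST,
# so the outcome (Y) goes first and the candidate cause (X) second
data = df[['stock_return', 'news_sentiment']].dropna()
gc_res = grangercausalitytests(data, maxlag=5)

# p-value of the F-test at each lag
for lag, (tests, _) in gc_res.items():
    print(lag, round(tests['ssr_ftest'][1], 4))
```

**Explanation:**  
For each lag, the test reports an F‑statistic and p‑value. A low p‑value suggests that news sentiment (X) Granger‑causes returns (Y). Note the column order: the function tests whether the second column helps predict the first. Remember: Granger causality is about predictive ability, not structural causality.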
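
The F‑test behind Granger causality is simple enough to write out by hand. The sketch below is a minimal NumPy version (the function name `granger_f_stat` is hypothetical) applied to simulated series in which `x` leads `y` by one step but not the other way round.

```python
import numpy as np

def granger_f_stat(y, x, p):
    """F-statistic for H0: lags 1..p of x add nothing to an AR(p) model of y."""
    T = len(y)
    target = y[p:]
    lags_y = np.column_stack([y[p - k:T - k] for k in range(1, p + 1)])
    lags_x = np.column_stack([x[p - k:T - k] for k in range(1, p + 1)])
    const = np.ones((T - p, 1))

    def ssr(design):
        beta, *_ = np.linalg.lstsq(design, target, rcond=None)
        return np.sum((target - design @ beta) ** 2)

    restricted = np.hstack([const, lags_y])
    unrestricted = np.hstack([const, lags_y, lags_x])
    ssr_r, ssr_u = ssr(restricted), ssr(unrestricted)
    df_resid = len(target) - unrestricted.shape[1]
    return ((ssr_r - ssr_u) / p) / (ssr_u / df_resid)

# Simulated example in which x leads y by one step, but not the reverse
rng = np.random.default_rng(7)
T = 500
x, y = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x[t] = 0.5 * x[t - 1] + rng.normal()
    y[t] = 0.3 * y[t - 1] + 0.5 * x[t - 1] + rng.normal()

f_x_to_y = granger_f_stat(y, x, p=2)
f_y_to_x = granger_f_stat(x, y, p=2)
print(f"F (x -> y): {f_x_to_y:.1f}")
print(f"F (y -> x): {f_y_to_x:.1f}")
```

Each statistic would be compared against an F distribution with `p` and `df_resid` degrees of freedom. The test is asymmetric, so both directions should be checked before drawing conclusions about lead‑lag structure.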

### 59.8.2 PCMCI

PCMCI (Peter and Clark with Momentary Conditional Independence) is a more robust algorithm for time‑series causal discovery. It first estimates a graph using conditional independence tests, then refines it using momentary conditional independence to avoid false positives due to autocorrelation.

**Implementation with `tigramite`**

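```python
import numpy as np
from tigramite import data_processing as pp
from tigramite.pcmci import PCMCI
# tigramite >= 5; in older releases: from tigramite.independence_tests import ParCorr
from tigramite.independence_tests.parcorr import ParCorr

# data: numpy array of shape (time, variables)
dataframe = pp.DataFrame(data, datatime=np.arange(len(data)))
pcmci = PCMCI(dataframe=dataframe, cond_ind_test=ParCorr())
results = pcmci.run_pcmci(tau_max=5, pc_alpha=0.05)
# print_significant_links needs the p-values and test statistics from the run
pcmci.print_significant_links(
    p_matrix=results['p_matrix'], val_matrix=results['val_matrix'], alpha_level=0.05
)
```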

**Explanation:**  
`tigramite` outputs a graph with directed links at certain lags. For example, a link from X at lag 2 to Y means X at t‑2 influences Y at t. This provides a richer temporal causal structure.

---

## 59.9 Time‑Series Causal Inference with VAR

Vector Autoregression (VAR) can be used for causal inference by examining the coefficients and performing **impulse response analysis**. An impulse response function (IRF) shows the effect of a one‑time shock to one variable on the entire system over time. This can be interpreted causally if we assume the shocks are exogenous and the VAR correctly captures the dynamics.

**Example: Effect of a news shock on stock returns**

We can estimate a VAR with news sentiment and returns, then compute the IRF of a news shock on returns.

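```python
import matplotlib.pyplot as plt
from statsmodels.tsa.api import VAR

# Column order matters for the orthogonalised IRF: news is ordered first,
# so a news shock may move returns on the same day, but not vice versa
model = VAR(data[['news', 'returns']].dropna())
results = model.fit(maxlags=5, ic='aic')
irf = results.irf(10)
irf.plot(impulse='news', response='returns', orth=True)
plt.show()
```

**Explanation:**  
The orthogonalised IRF shows how returns react over time to a one‑standard‑deviation shock in news. If the response is significantly different from zero, that is consistent with a causal effect. However, this interpretation relies on the ordering used to orthogonalise the shocks (here, news before returns), on there being no omitted common drivers, and on the VAR being correctly specified.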
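
The arithmetic behind an orthogonalised IRF is short for a VAR(1). In the sketch below the coefficient matrix `A` and shock covariance `sigma` are made‑up numbers, and `impulse_responses` is a hypothetical helper: the response at horizon `h` is `A^h` times the Cholesky factor of `sigma`.

```python
import numpy as np

# Hypothetical VAR(1): [news_t, returns_t]' = A @ [news_{t-1}, returns_{t-1}]' + shock_t
A = np.array([[0.5, 0.0],
              [0.3, 0.1]])
sigma = np.array([[1.0, 0.2],     # assumed covariance matrix of the shocks
                  [0.2, 0.5]])

def impulse_responses(A, sigma, horizon):
    """Orthogonalised IRFs: irf[h, i, j] = response of variable i at h to shock j."""
    P = np.linalg.cholesky(sigma)           # ordering: news first, returns second
    irf = np.empty((horizon + 1, *A.shape))
    Ah = np.eye(A.shape[0])
    for h in range(horizon + 1):
        irf[h] = Ah @ P
        Ah = A @ Ah
    return irf

irf = impulse_responses(A, sigma, horizon=10)
news_to_returns = irf[:, 1, 0]    # response of returns to a one-s.d. news shock
print(np.round(news_to_returns[:4], 4))
```

Swapping the variable order changes the Cholesky factor and therefore the contemporaneous responses, which is why the ordering is itself a causal assumption.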

---

## 59.10 Application in NEPSE: Does News Sentiment Cause Returns?

Let's design a concrete causal inference study for NEPSE. We have daily news sentiment scores for each stock (constructed from headlines) and daily returns. We want to answer: does positive news cause higher returns, or is the relationship driven by confounding (e.g., overall market sentiment)?

**Hypothesis:** News sentiment causes returns, with a possible lag.

**Confounders:** Overall market return (NEPSE index), sector return, day‑of‑week effects.

**Method:** We could use a combination of:

- Granger causality tests to see if sentiment predicts returns.
- A panel regression with fixed effects and lagged sentiment, controlling for market and sector.
- Instrumental variables if a valid instrument exists (e.g., unexpected news events like natural disasters affecting unrelated companies).

**Simplified analysis using panel regression with fixed effects:**

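```python
from statsmodels.formula.api import ols

# Data: panel of stocks, with columns: stock, date, return, sentiment, market_return, sector_return
# Sort first so that shift(1) picks the previous observation of the same stock
data = data.sort_values(['stock', 'date'])
data['sentiment_lag1'] = data.groupby('stock')['sentiment'].shift(1)
data['dow'] = data['date'].dt.dayofweek

# `return` is a Python keyword, so it must be quoted with Q() inside the formula.
# C(stock) adds stock fixed effects (dummies); C(dow) adds day-of-week effects.
formula = "Q('return') ~ sentiment_lag1 + market_return + sector_return + C(stock) + C(dow)"
model = ols(formula, data=data).fit()
print(model.summary())
```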

**Explanation:**  
The coefficient on `sentiment_lag1` estimates the effect of yesterday’s sentiment on today’s return, controlling for market and sector movements and day‑of‑week effects. Stock fixed effects absorb time‑invariant stock characteristics. This is still not fully causal (there could be time‑varying confounders), but it’s a step towards causal interpretation.

---

## 59.11 Limitations and Pitfalls

Causal inference from observational data is challenging and relies on strong assumptions:

- **No unobserved confounders**: We must believe we have measured all confounders. In finance, this is rarely true (e.g., investor sentiment is hard to measure).
- **Correct model specification**: The functional form must be correct (e.g., linearity).
- **No measurement error**: Errors in variables can bias estimates.
- **Stable unit treatment value assumption (SUTVA)**: No interference between units (e.g., one stock’s treatment does not affect another’s outcome). In markets, this is often violated (spillover effects).
- **Ignorability of treatment assignment**: In matching, we assume treatment is independent of potential outcomes given covariates.

Sensitivity analysis should be performed to assess how strong an unmeasured confounder would have to be to overturn the conclusions.
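
For a linear model, one simple form of sensitivity analysis follows from the omitted‑variable bias formula: the naive slope equals the true effect plus `gamma * delta`, where `gamma` is the effect of the unmeasured confounder U on Y and `delta` is the slope from regressing U on X. The sketch below checks this on simulated data; the function names and all coefficients are illustrative assumptions.

```python
import numpy as np

def confounded_estimate(true_effect, gamma, delta):
    """OLS slope of Y on X when an omitted U has effect gamma on Y and
    delta is the slope from regressing U on X (linear models assumed)."""
    return true_effect + gamma * delta

def gamma_to_explain_away(estimate, delta):
    """Effect of U on Y that would make a zero true effect produce `estimate`."""
    return estimate / delta

# Check the formula on simulated data (all coefficients are illustrative)
rng = np.random.default_rng(8)
n = 100_000
u = rng.normal(size=n)
x = 0.5 * u + rng.normal(size=n)
y = 0.2 * x + 0.4 * u + rng.normal(size=n)

naive = np.cov(x, y)[0, 1] / np.var(x, ddof=1)
delta = np.cov(x, u)[0, 1] / np.var(x, ddof=1)

print(f"naive estimate:    {naive:.3f}")
print(f"formula predicts:  {confounded_estimate(0.2, 0.4, delta):.3f}")
print(f"gamma needed to explain it away: {gamma_to_explain_away(naive, delta):.3f}")
```

In a real study `delta` and `gamma` are unknown, so the exercise is to ask whether values large enough to erase the estimate are plausible given what is known about the market.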

---

## Chapter Summary

In this chapter, we introduced causal inference and its importance for understanding cause‑effect relationships in financial time‑series. We covered:

- The fundamental difference between correlation and causation.
- Causal graphs and structural causal models for encoding assumptions.
- Do‑calculus and adjustment for confounders.
- Instrumental variables for handling unobserved confounders.
- Propensity score matching for treatment effects with binary treatments.
- Difference‑in‑differences and synthetic control for policy evaluation.
- Causal discovery from time‑series using Granger causality and PCMCI.
- Using VAR and impulse responses for causal interpretation.
- A concrete application to NEPSE: does news sentiment cause returns?
- The limitations and assumptions behind causal methods.

Causal inference is a deep and rapidly evolving field. For the NEPSE prediction system, incorporating causal reasoning can lead to more robust models and better decision‑making. In the next chapter, we will discuss **Advanced Optimisation Techniques**, exploring methods to optimise models and hyperparameters at scale.

---

**End of Chapter 59**