In [1]:
import numpy as np
from scipy import stats

In [2]:
SAMPLE_SIZE = 100
np.random.seed(45)

u_0 = np.random.randn(SAMPLE_SIZE)
u_1 = np.random.randn(SAMPLE_SIZE)
a = u_0
b = 5 * a + u_1
r, p = stats.pearsonr(a, b)
print(f'Mean of B before any intervention: {b.mean():.3f}')
print(f'Variance of B before any intervention: {b.var():.3f}')
print(f'Correlation between A and B:\nr = {r:.3f}; p = {p:.3f}\n')

Mean of B before any intervention: -0.620
Variance of B before any intervention: 22.667
Correlation between A and B:
r = 0.978; p = 0.000



As we can see, the correlation between A and B is very high (r = .978; p < .001). This is not surprising, given that B is a simple linear function of A plus a noise term. The mean of B is slightly below zero, and the variance is around 22.7.
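To put these sample statistics in context, the following is a minimal sketch that derives the population variance of B and the population correlation from the structural equations (A = U0, B = 5A + U1, with independent standard normal noise) and compares them against a much larger sample. The names `COEF`, `VAR_U0`, `VAR_U1`, `rng`, and `n` are introduced here for illustration only, and a separate random generator is used so the notebook's global seed is left untouched.

```python
import numpy as np

# Structural model from the cell above: A = U0, B = 5 * A + U1,
# where U0 and U1 are independent standard normal variables.
COEF = 5.0    # slope of B on A (taken from the code above)
VAR_U0 = 1.0  # variance of U0 (standard normal)
VAR_U1 = 1.0  # variance of U1 (standard normal)

# Var(B) = COEF^2 * Var(A) + Var(U1)
var_b_theory = COEF**2 * VAR_U0 + VAR_U1

# Corr(A, B) = Cov(A, B) / (sd(A) * sd(B)), with Cov(A, B) = COEF * Var(A)
r_theory = COEF * VAR_U0 / np.sqrt(VAR_U0 * var_b_theory)

# Empirical check on a large sample (illustrative size and seed).
rng = np.random.default_rng(0)
n = 200_000
a_big = rng.standard_normal(n)
b_big = COEF * a_big + rng.standard_normal(n)

var_b_empirical = b_big.var()
r_empirical = np.corrcoef(a_big, b_big)[0, 1]

print(f'Theoretical variance of B: {var_b_theory:.3f}')
print(f'Large-sample variance of B: {var_b_empirical:.3f}')
print(f'Theoretical correlation: {r_theory:.3f}')
print(f'Large-sample correlation: {r_empirical:.3f}')
```

Under these assumptions the population variance of B is 26, so a value near 22.7 from only 100 observations is ordinary sampling variability rather than a sign that the model is misspecified.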

### Intervening on A by fixing its value at 1.5

In [4]:
# Intervention on A: replace its structural equation with the constant 1.5
a = np.full(SAMPLE_SIZE, 1.5)
b = 5 * a + u_1
print(f'Mean of B after intervention: {b.mean():.3f}')
print(f'Variance of B after intervention: {b.var():.3f}')

Mean of B after intervention: 7.575
Variance of B after intervention: 1.003


Both the mean and the variance have changed. The new mean of B is much greater than the previous one. This is because the value we set A to (1.5) is far from the center of A’s original distribution (0), and B multiplies A by 5. At the same time, the variance has shrunk. This is because A became constant, so the only remaining variability in B comes from its stochastic parent, U1.
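The same reasoning can be checked for other intervention values. The sketch below wraps the intervention in a hypothetical helper, `simulate_b_under_do_a` (the function name, its parameters, and the sample size are assumptions made for this illustration, not part of the original notebook). If the explanation above holds, the mean of B should track 5 times the fixed value of A, while the variance stays close to 1.

```python
import numpy as np

def simulate_b_under_do_a(a_value, n=100_000, coef=5.0, seed=0):
    """Sample B = coef * A + U1 after fixing A at a_value for every unit.

    Hypothetical helper: A's own noise term is ignored because the
    intervention overrides A's structural equation.
    """
    rng = np.random.default_rng(seed)
    u_1 = rng.standard_normal(n)  # B's noise term is left untouched
    a = np.full(n, a_value)       # A is held constant by the intervention
    return coef * a + u_1

for a_value in (-1.0, 0.0, 1.5):
    b_do = simulate_b_under_do_a(a_value)
    print(f'A fixed at {a_value:+.1f}: '
          f'mean of B = {b_do.mean():.3f}, variance of B = {b_do.var():.3f}')
```

A larger sample than `SAMPLE_SIZE` is used here only to make the pattern easier to see; with 100 observations the same pattern appears with more noise.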

### Intervening on B instead

In [5]:
a = u_0  # A keeps its original structural equation
b = np.random.randn(SAMPLE_SIZE)  # intervention on B: draw it independently of A
r, p = stats.pearsonr(a, b)
print(f'Mean of B after the intervention on B: {b.mean():.3f}')
print(f'Variance of B after the intervention on B: {b.var():.3f}')
print(f'Correlation between A and B after intervening on B:\nr = {r:.3f}; p = {p:.3f}\n')

Mean of B after the intervention on B: 0.186
Variance of B after the intervention on B: 0.995
Correlation between A and B after intervening on B:
r = -0.023; p = 0.821



1. Note that the correlation between A and B dropped to almost zero (r = −.023), and the corresponding p-value indicates a lack of significance (p = .821). This indicates that after the intervention, A and B became (linearly) independent.
2. This result suggests that there is no causal link from B to A. At the same time, the previous results demonstrated that intervening on A changes B, indicating that there is a causal link from A to B.
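The asymmetry between the two interventions can also be checked directly by comparing each variable before and after the other one is manipulated. The following is a minimal sketch under the same structural model; the helper `sample`, its `do_a` and `do_b` arguments, the constant 10.0 used for B, and the sample size are all assumptions made for illustration. Unlike the cell above, B is fixed at a constant here rather than redrawn at random.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 100_000
u_0 = rng.standard_normal(n)
u_1 = rng.standard_normal(n)

def sample(do_a=None, do_b=None):
    """Return (a, b) from A = U0, B = 5 * A + U1, optionally under interventions.

    Hypothetical helper: passing do_a or do_b replaces that variable's
    structural equation with the given constant.
    """
    a = u_0 if do_a is None else np.full(n, do_a)
    b = 5 * a + u_1 if do_b is None else np.full(n, do_b)
    return a, b

a_obs, b_obs = sample()           # observational regime
a_do_b, _ = sample(do_b=10.0)     # intervene on B, then look at A
_, b_do_a = sample(do_a=1.5)      # intervene on A, then look at B

shift_in_a = a_do_b.mean() - a_obs.mean()
shift_in_b = b_do_a.mean() - b_obs.mean()

print(f'Change in mean of A after intervening on B: {shift_in_a:.3f}')
print(f'Change in mean of B after intervening on A: {shift_in_b:.3f}')
```

Because A's equation never refers to B, forcing B to any value leaves A exactly as it was, whereas forcing A shifts B. This is the pattern expected when the only causal link runs from A to B.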