### Example where greedily selecting the most uncertain variable fails

One possible policy for selecting interventions is to pick the variable whose causal effect we are most uncertain about, as measured by its confidence interval. This notebook gives an example where that policy can fail.

The SEM is chosen deliberately so that the effect of $X_0$ on the target $Y$ ($X_3$ in the code) cancels out along the two paths through $X_1$ and $X_2$. If the first intervention is on $X_1$, the most uncertain variable becomes $X_0$; however, intervening on $X_0$ provides no new information.
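
The cancellation can be checked by substituting the structural equations from the observational cell into the equation for the target. Here $\varepsilon_i$ stands for the noise terms `eps0` to `eps4`; this is only a worked restatement of the code, not an additional assumption:

$$
\begin{aligned}
X_3 &= X_1 + X_2 + \varepsilon_3 \\
    &= (-X_0 + \varepsilon_1) + (X_0 + \varepsilon_2) + \varepsilon_3 \\
    &= \varepsilon_1 + \varepsilon_2 + \varepsilon_3 .
\end{aligned}
$$

$X_0$ drops out entirely, so shifting the mean of $X_0$ leaves the distribution of $X_3$ unchanged as long as the equations for $X_1$ and $X_2$ are intact.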

Worse still, a purely greedy strategy that picks with replacement would be stuck forever, selecting $X_0$ again and again.

In [1]:
library(InvariantCausalPrediction)

Loading required package: glmnet
Loading required package: Matrix
Loading required package: foreach
Loaded glmnet 2.0-18

Loading required package: mboost
Loading required package: parallel
Loading required package: stabs
This is mboost 2.9-1. See ‘package?mboost’ and ‘news(package  = "mboost")’
for a complete list of changes.



In [2]:
n <- 100000

In [3]:
# SEM (observational)
eps0 <- rnorm(n,0,1)
eps1 <- rnorm(n,0,1)
eps2 <- rnorm(n,0,1)
eps3 <- rnorm(n,0,1)
eps4 <- rnorm(n,0,1)

X0 <- eps0
X1 <- -X0 + eps1
X2 <- X0 + eps2
X3 <- X1 + X2 + eps3
X4 <- X3 + eps4

obs_data = cbind(X0, X1, X2, X4)
obs_target = X3

In [4]:
# SEM (intervened on X0: noise mean shifted to 5, so X0 ~ N(5, 1))
eps0 <- rnorm(n,5,1)
eps1 <- rnorm(n,0,1)
eps2 <- rnorm(n,0,1)
eps3 <- rnorm(n,0,1)
eps4 <- rnorm(n,0,1)

X0 <- eps0
X1 <- -X0 + eps1
X2 <- X0 + eps2
X3 <- X1 + X2 + eps3
X4 <- X3 + eps4

e0_data = cbind(X0, X1, X2, X4)
e0_target = X3

In [5]:
# SEM (intervened on X1: X1 ~ N(5, 1), no longer depends on X0)
eps0 <- rnorm(n,0,1)
eps1 <- rnorm(n,5,1)
eps2 <- rnorm(n,0,1)
eps3 <- rnorm(n,0,1)
eps4 <- rnorm(n,0,1)

X0 <- eps0
X1 <- eps1
X2 <- X0 + eps2
X3 <- X1 + X2 + eps3
X4 <- X3 + eps4

e1_data = cbind(X0, X1, X2, X4)
e1_target = X3

In [6]:
# SEM (intervened on X2: X2 ~ N(5, 1), no longer depends on X0)
eps0 <- rnorm(n,0,1)
eps1 <- rnorm(n,0,1)
eps2 <- rnorm(n,5,1)
eps3 <- rnorm(n,0,1)
eps4 <- rnorm(n,0,1)

X0 <- eps0
X1 <- -X0 + eps1
X2 <- eps2
X3 <- X1 + X2 + eps3
X4 <- X3 + eps4

e2_data = cbind(X0, X1, X2, X4)
e2_target = X3

In [7]:
# SEM (intervened on X4: X4 ~ N(5, 1), no longer depends on X3)
eps0 <- rnorm(n,0,1)
eps1 <- rnorm(n,0,1)
eps2 <- rnorm(n,0,1)
eps3 <- rnorm(n,0,1)
eps4 <- rnorm(n,5,1)

X0 <- eps0
X1 <- -X0 + eps1
X2 <- X0 + eps2
X3 <- X1 + X2 + eps3
X4 <- eps4

e4_data = cbind(X0, X1, X2, X4)
e4_target = X3
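
The five cells above repeat the same SEM with one equation changed each time. As a rough sketch, they could be folded into a single helper. The name `simulate_sem` and its arguments `intervene_on` and `int_mean` are hypothetical and not part of the original notebook; the structural equations and the N(5, 1) intervention distribution are copied from the cells above.

```r
# Hypothetical helper (not in the original notebook): simulate the SEM,
# optionally intervening on one of X0, X1, X2, X4.
simulate_sem <- function(n, intervene_on = c("none", "X0", "X1", "X2", "X4"),
                         int_mean = 5) {
  intervene_on <- match.arg(intervene_on)
  hit <- function(name) intervene_on == name

  # Noise terms: the mean is shifted only for the intervened variable
  eps0 <- rnorm(n, if (hit("X0")) int_mean else 0, 1)
  eps1 <- rnorm(n, if (hit("X1")) int_mean else 0, 1)
  eps2 <- rnorm(n, if (hit("X2")) int_mean else 0, 1)
  eps3 <- rnorm(n, 0, 1)  # the target is never intervened on
  eps4 <- rnorm(n, if (hit("X4")) int_mean else 0, 1)

  # Structural equations; an intervened variable loses its parents
  X0 <- eps0
  X1 <- if (hit("X1")) eps1 else -X0 + eps1
  X2 <- if (hit("X2")) eps2 else X0 + eps2
  X3 <- X1 + X2 + eps3
  X4 <- if (hit("X4")) eps4 else X3 + eps4

  list(data = cbind(X0, X1, X2, X4), target = X3)
}
```

With this sketch, `simulate_sem(n, "X1")` would play the role of `e1_data` and `e1_target` as the `data` and `target` elements of the returned list, and `simulate_sem(n)` the role of the observational pair.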

**Run ICP**

In [16]:
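# Pool the observational data with the X1 and X2 interventions
# (the X4 intervention is left out of this run)
data = rbind(obs_data, e1_data, e2_data)#, e4_data)
targets = c(obs_target, e1_target, e2_target)#, e4_target)
# Environment labels: 0 = observational, 1 = intervention on X1, 2 = intervention on X2
idx = rep(0:2, each=n)

ICP(data, targets, idx, alpha=0.01, test = "normal", selection = "all")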


 *** 12% complete: tested 2 of 16 sets of variables 
 accepted set of variables 2,3
 accepted set of variables 1,2,3
 accepted set of variables 2,3,4
 accepted set of variables 1,2,3,4


 Invariant Linear Causal Regression at level 0.01 (including multiplicity correction for the number of variables)
 Variables: X1, X2 show a significant causal effect
 
     LOWER BOUND  UPPER BOUND  MAXIMIN EFFECT  P-VALUE    
X0        -0.01         0.00            0.00        1    
X1         0.50         1.00            0.50   <1e-09 ***
X2         0.50         1.00            0.50   <1e-09 ***
X4         0.00         0.50            0.00        1    
---
Signif. codes:  0 ‘***’ 0.001 ‘**’ 0.01 ‘*’ 0.05 ‘.’ 0.1 ‘ ’ 1

