## Preliminaries for all demonstrations
For all demonstrations, we adopt the conventions commonly used in the causal inference literature. An unbolded italic capital letter represents a random variable, referred to simply as a variable. A bolded roman capital letter represents a set of variables. For example, $X$ is a variable, and $\mathbf{X} = \{ X_i \}_{i=1}^n$ is a set of variables. The domain of a variable $X$ is denoted by $D(X)$. A lowercase italic letter $x \in D(X)$ represents a value assigned to $X$, and a bolded lowercase letter $\mathbf{x} = \{ x_i \}_{i=1}^n$ represents a set of values assigned to $\mathbf{X}$. The distribution of a variable or a set of variables is denoted by $P(X)$ or $P(\mathbf{X})$, respectively.

A causal graph is denoted by $G = \langle \mathbf{V}, \mathbf{E} \rangle$. Here, $\mathbf{V}$ is a set of vertices, each of which represents a variable, and $\mathbf{E}$ is a set of edges, each of which represents a direct causal relationship between two variables. We also use family notations to represent the set of variables that are parents ($pa(\cdot)$), children ($ch(\cdot)$), ancestors ($an(\cdot)$), and descendants ($de(\cdot)$) of a given variable, respectively. 
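The family notation can be made concrete with a few lines of Python. This is a minimal sketch on a hypothetical four-node toy graph, not the `causal_graph` package used later; the helper names `pa`, `ch`, `an`, `de` simply mirror the notation, and the sketch assumes the convention that a node is not its own ancestor or descendant.

```python
def pa(node, edges):
    """Parents: sources of edges pointing into `node`."""
    return {u for u, v in edges if v == node}

def ch(node, edges):
    """Children: targets of edges leaving `node`."""
    return {v for u, v in edges if u == node}

def _closure(node, edges, step):
    """Repeatedly apply `step` (pa or ch) to collect all reachable nodes."""
    seen, stack = set(), [node]
    while stack:
        for nxt in step(stack.pop(), edges):
            if nxt not in seen:
                seen.add(nxt)
                stack.append(nxt)
    return seen

def an(node, edges):
    """Ancestors: every node with a directed path into `node`."""
    return _closure(node, edges, pa)

def de(node, edges):
    """Descendants: every node reachable from `node` along directed edges."""
    return _closure(node, edges, ch)

# Hypothetical toy graph (not the 15-node example): A -> B -> Y, A -> Y, C -> B.
toy_edges = [('A', 'B'), ('B', 'Y'), ('A', 'Y'), ('C', 'B')]
print(pa('B', toy_edges), ch('A', toy_edges))
print(an('Y', toy_edges), de('C', toy_edges))
```

Some authors include the node itself in $an(\cdot)$ and $de(\cdot)$; if that convention is preferred, add `node` to the returned set.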

A structural causal model (SCM) is adopted to represent the causal mechanisms behind the causal graph. An SCM $M$ is a tuple $\langle \mathbf{U}, \mathbf{V}, \mathbf{F}, P(\mathbf{U}) \rangle$, where $\mathbf{U}$ is a set of independent exogenous background variables whose distribution is given by $P(\mathbf{U})$, $\mathbf{V}$ is a set of observed endogenous variables, and $\mathbf{F}=\{f_i\}_{i=1}^{|\mathbf{V}|}$ is a set of functions that represent the relationships among variables, i.e., $v_i = f_i(pa(v_i), \mathbf{U}_i)$, with $v_i \in \mathbf{V}$ and $\mathbf{U}_i \subseteq \mathbf{U}$. The causal graph $G$ encodes the causal relationships among the variables in $\mathbf{V}$. Within $\mathbf{V}$, there are three types of variables: non-manipulative variables $\mathbf{C}$, which cannot be intervened on; manipulative variables $\mathbf{X}$, which can be intervened on; and target variables $\mathbf{Y}$, which are the variables of interest. In all demonstrations, we consider a single target variable $Y$. The probability of $Y=y$ under the intervention $do(\mathbf{x})$ is denoted by $P(y|do(\mathbf{x}))$, where the intervention on $\mathbf{X}$ is described by the causal graph $G_{\overline{\mathbf{X}}}$, obtained from $G$ by removing all edges pointing into $\mathbf{X}$.
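To illustrate how an SCM produces $P(y|do(\mathbf{x}))$, the following sketch simulates a hypothetical three-variable model with one non-manipulative variable $C$, one manipulative variable $X$, and the target $Y$. The mechanisms, the noise probabilities, and the names `sample_scm` and `estimate_p_y1` are all assumptions made for illustration; they do not come from the demonstrations below.

```python
import random

def sample_scm(rng, do_x=None):
    """Draw one sample (C, X, Y) from a toy SCM; do_x forces the value of X."""
    # Independent exogenous variables U, each with its own distribution.
    u_c = rng.random() < 0.5
    u_x = rng.random() < 0.1
    u_y = rng.random() < 0.2
    # Endogenous variables: v_i = f_i(pa(v_i), U_i).
    c = int(u_c)
    # do(X = x) replaces f_X by a constant, i.e. cuts the edge C -> X.
    x = (c ^ int(u_x)) if do_x is None else do_x
    y = int(x and (c or u_y))
    return c, x, y

def estimate_p_y1(do_x=None, n=20000, seed=0):
    """Monte Carlo estimate of P(Y = 1), optionally under do(X = do_x)."""
    rng = random.Random(seed)
    return sum(sample_scm(rng, do_x)[2] for _ in range(n)) / n

p_do1 = estimate_p_y1(do_x=1)  # estimate of P(Y=1 | do(X=1))
p_do0 = estimate_p_y1(do_x=0)  # estimate of P(Y=1 | do(X=0))
print(p_do1, p_do0)
```

In this toy model $Y = X \wedge (C \vee U_Y)$, so $P(Y=1|do(X=1)) = 1 - 0.5 \cdot 0.8 = 0.6$, and the estimate under $do(X=0)$ is exactly zero because $Y$ cannot fire without $X$.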

## Demonstration 1: Finding the minimal intervention sets

**Question 1:** In the paper, the authors considered very small and simple graphs, which might not be the case in practice. Can you give an example of a causal graph with 15 nodes at each time step -- 7 non-manipulable, 7 manipulable, and 1 target variable? How would you get the exploration set (a key input to the algorithm)? Would you write a program for this purpose? Is the causal diagram alone enough to get the exploration set? What additional specifications do you need?

Below is a simple example of a causal graph with 15 nodes. 

![An example causal graph with 15 nodes.](graphs_15_nodes.svg)

**Fig. 1.** An example causal graph with 15 nodes.

**Definition 1: Minimal Intervention Set (MIS).** A set of variables $X_s$ is said to be a minimal intervention set for a target variable $Y$ in a causal graph $G$ if there is no 

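**Proposition 1: (Minimality).** *A set of variables $\mathbf{X}_s$ is an MIS for a target variable $Y$ in a causal graph $G$ if and only if $\mathbf{X}_s \subseteq an(Y)_{G_{\overline{\mathbf{X}_s}}}$, i.e., every variable in $\mathbf{X}_s$ remains an ancestor of $Y$ after all edges pointing into $\mathbf{X}_s$ are removed.*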

**Proof:**

```python
import sys
sys.path.append('..')  # make the local causal_graph package importable
from causal_graph.base import CausalGraph

# 15 nodes: A-G are manipulable, H-N are non-manipulable, Y is the target.
vertices = ['A', 'B', 'C', 'D', 'E', 'F', 'G', 'H', 'I', 'J', 'K', 'L', 'M', 'N', 'Y']
# Directed edges (cause, effect) of the graph in Fig. 1.
edges = [('A', 'Y'),
         ('B', 'M'), ('B', 'L'),
         ('C', 'A'), ('C', 'B'), ('C', 'I'),
         ('D', 'H'), ('D', 'Y'),
         ('E', 'D'), ('E', 'Y'),
         ('F', 'E'),
         ('G', 'F'), ('G', 'L'),
         ('I', 'A'), ('I', 'J'),
         ('J', 'D'), ('J', 'H'),
         ('K', 'Y'), ('K', 'B'), ('K', 'C'),
         ('L', 'Y'), ('L', 'F'),
         ('M', 'N'), ('M', 'G'), ('M', 'L'),
         ('N', 'G')]
treat_vars = ['A', 'B', 'C', 'D', 'E', 'F', 'G']
cgobj = CausalGraph(vertices, edges, treat_vars, 'Y')

# Compute the minimal intervention sets once and reuse the result.
mis = cgobj.minimal_intervene_set()
print(mis)
print("There are {} minimal intervene sets.".format(len(mis)))
```
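As an independent cross-check, the criterion of Proposition 1 can be applied by brute force: a candidate subset $\mathbf{X}_s$ of the manipulable variables is kept if every member is still an ancestor of $Y$ once all edges pointing into $\mathbf{X}_s$ are removed. The sketch below is self-contained and does not use `CausalGraph`; the helper names `ancestors`, `is_mis`, and `enumerate_mis` are hypothetical, and whether the empty set is counted (it is here) is an assumption that may differ from `minimal_intervene_set()`.

```python
from itertools import combinations

# Same edges and manipulable variables as the 15-node example.
edges = [('A', 'Y'), ('B', 'M'), ('B', 'L'), ('C', 'A'), ('C', 'B'), ('C', 'I'),
         ('D', 'H'), ('D', 'Y'), ('E', 'D'), ('E', 'Y'), ('F', 'E'),
         ('G', 'F'), ('G', 'L'), ('I', 'A'), ('I', 'J'), ('J', 'D'), ('J', 'H'),
         ('K', 'Y'), ('K', 'B'), ('K', 'C'), ('L', 'Y'), ('L', 'F'),
         ('M', 'N'), ('M', 'G'), ('M', 'L'), ('N', 'G')]
treat_vars = ['A', 'B', 'C', 'D', 'E', 'F', 'G']

def ancestors(node, edge_list):
    """Return every node with a directed path into `node` (node excluded)."""
    parents = {}
    for u, v in edge_list:
        parents.setdefault(v, set()).add(u)
    seen, stack = set(), [node]
    while stack:
        for p in parents.get(stack.pop(), ()):
            if p not in seen:
                seen.add(p)
                stack.append(p)
    return seen

def is_mis(subset, edge_list, target):
    """Proposition 1: subset must lie within an(target) in the mutilated graph."""
    # Mutilated graph: drop all edges pointing into the intervened variables.
    mutilated = [(u, v) for u, v in edge_list if v not in subset]
    return set(subset) <= ancestors(target, mutilated)

def enumerate_mis(edge_list, candidates, target):
    """Test all 2^n subsets of `candidates`; feasible for n = 7 (128 subsets)."""
    return [frozenset(s)
            for r in range(len(candidates) + 1)
            for s in combinations(candidates, r)
            if is_mis(s, edge_list, target)]

mis_bruteforce = enumerate_mis(edges, treat_vars, 'Y')
print(len(mis_bruteforce))
```

For instance, $\{E, F\}$ is rejected: the only edge leaving $F$ is $F \to E$, which is removed when intervening on $E$, so $F$ no longer influences $Y$. The exhaustive search grows exponentially with the number of manipulable variables, so it is only suitable as a sanity check on graphs of this size.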
