# Algorithm 1: Generalized ID Algorithm for ioSCMs

**Source:** Forré & Mooij (2019) - "Causal Calculus in the Presence of Cycles, Latent Confounders and Selection Bias"

---

## Algorithm Pseudocode

### Function: ID (Main Function)

$$
\begin{align*}
&\textbf{1: function } \mathbf{ID}(G, \mathbf{Y}, \mathbf{W}, P(\mathbf{V} | do(\mathbf{J}))) \\
&\textbf{2: require: } \mathbf{Y} \subseteq \mathbf{V}, \mathbf{W} \subseteq \mathbf{V}, \mathbf{Y} \cap \mathbf{W} = \emptyset \\
&\textbf{3: } H \leftarrow \text{Anc}^{G_{\mathbf{V} \setminus \mathbf{W}}}(\mathbf{Y}) \\
&\textbf{4: } \textbf{for } C \in \text{CD}(H) \textbf{ do} \\
&\textbf{5: } \quad Q[C] \leftarrow \text{IDCD}(G, C, \text{Cd}^G(C), Q[\text{Cd}^G(C)]) \\
&\textbf{6: } \quad \textbf{if } Q[C] = \text{FAIL} \textbf{ then} \\
&\textbf{7: } \quad\quad \textbf{return } \text{FAIL} \\
&\textbf{8: } \quad \textbf{end if} \\
&\textbf{9: } \textbf{end for} \\
&\textbf{10: } Q[H] \leftarrow \left[ \bigotimes_{C \in \text{CD}(H)} \right] Q[C] \\
&\textbf{11: } \textbf{return } P(\mathbf{Y} | do(\mathbf{J}, \mathbf{W})) = \int Q[H] \, d\mathbf{x}_{H \setminus \mathbf{Y}} \\
&\textbf{12: } \textbf{end function}
\end{align*}
$$

---

### Function: IDCD (Recursive Helper Function)

$$
\begin{align*}
&\textbf{13: function } \mathbf{IDCD}(G, \mathbf{C}, \mathbf{D}, Q[\mathbf{D}]) \\
&\textbf{14: require: } \mathbf{C} \subseteq \mathbf{D} \subseteq \mathbf{V}, \text{CD}(G_{\mathbf{D}}) = \{\mathbf{D}\} \\
&\textbf{15: } A \leftarrow \text{Anc}^{G[\mathbf{D}]}(\mathbf{C}) \cap \mathbf{D} \\
&\textbf{16: } Q[A] \leftarrow \int Q[\mathbf{D}] \, d(\mathbf{x}_{\mathbf{D} \setminus A}) \\
&\textbf{17: } \textbf{if } A = \mathbf{C} \textbf{ then} \\
&\textbf{18: } \quad \textbf{return } Q[A] \\
&\textbf{19: } \textbf{else if } A = \mathbf{D} \textbf{ then} \\
&\textbf{20: } \quad \textbf{return } \text{FAIL} \\
&\textbf{21: } \textbf{else if } \mathbf{C} \subset A \subset \mathbf{D} \textbf{ then} \\
&\textbf{22: } \quad \textbf{for } S \in \mathcal{S}(G[A]) \text{ s.t. } S \subseteq \text{Cd}^{G[A]}(\mathbf{C}) \textbf{ do} \\
&\textbf{23: } \quad\quad R_A[S] \leftarrow P(S | \text{Pred}^G_<(S) \cap A, do(\mathbf{J} \cup \mathbf{V} \setminus A)) \\
&\textbf{24: } \quad \textbf{end for} \\
&\textbf{25: } \quad Q[\text{Cd}^{G[A]}(\mathbf{C})] \leftarrow \bigotimes_{\substack{S \in \mathcal{S}(G[A]) \\ S \subseteq \text{Cd}^{G[A]}(\mathbf{C})}} R_A[S] \\
&\textbf{26: } \quad \textbf{return } \text{IDCD}(G, \mathbf{C}, \text{Cd}^{G[A]}(\mathbf{C}), Q[\text{Cd}^{G[A]}(\mathbf{C})]) \\
&\textbf{27: } \textbf{end if} \\
&\textbf{28: } \textbf{end function}
\end{align*}
$$

## Line-by-Line Explanation of Algorithm

### Line 1: Function Declaration

$$
\textbf{1: function } \mathbf{ID}(G, \mathbf{Y}, \mathbf{W}, P(\mathbf{V} | do(\mathbf{J})))
$$

---
**Symbol Definitions:**

| Symbol | Definition | Type |
|--------|------------|------|
| $\mathbf{ID}$ | Identification algorithm | Function name |
| $G$ | Causal graph (directed mixed graph with possible cycles) | Graph structure |
| $\mathbf{Y}$ | Target variables (outcome we want to identify) | Set of variables |
| $\mathbf{W}$ | Intervention variables (treatment we're testing) | Set of variables |
| $P(\mathbf{V} \mid do(\mathbf{J}))$ | Observational distribution under background interventions | Probability distribution |
| $\mathbf{V}$ | All observed variables in the system | Set of variables |
| $\mathbf{J}$ | Background intervention variables (experimental conditions) | Set of variables |
| $do(\cdot)$ | Intervention operator (forcing variables to specific values) | Operator |


**Plain English Explanation:**

This line declares the main function, ID, which determines whether a causal effect can be identified from the available data and, if it can, returns an expression for it.

**Four Inputs of the Algorithm:**

1. **$G$ (Causal Graph):**
   - Your causal network structure showing which variables affect others
   - Can contain cycles (feedback loops) and latent confounders (bidirected edges)
   - Encodes any known mechanistic relationships between variables

2. **$\mathbf{Y}$ (Target Variables):**
   - The variables you want to predict or understand (your outcome)
   - Must be measurable/observable variables
   - Can be one or multiple variables
   - Example: $\mathbf{Y} = \{\text{lysine\_concentration}\}$

3. **$\mathbf{W}$ (Intervention Variables):**
   - The variables you're manipulating (your treatment)
   - Must be variables you can observe and control
   - Can be one or multiple variables
   - Example: $\mathbf{W} = \{\text{lysA\_knockout}\}$

4. **$P(\mathbf{V} | do(\mathbf{J}))$ (Observational Data):**
   - The probability distribution of all observed variables
   - Collected under specific experimental conditions $\mathbf{J}$
   - This is your actual experimental dataset (e.g., RNA-seq, metabolomics)
   - The "$do(\mathbf{J})$" indicates these are already controlled/fixed conditions
   - Example: $P(\text{all genes and metabolites} | \text{minimal media}, 37^\circ\text{C}, \text{glucose}=10\text{mM})$

**What This Function Does:**

The ID function answers: **"Can I predict what happens to $\mathbf{Y}$ when I intervene on $\mathbf{W}$, using only my observational data?"**

Put more formally: **"Is $P(\mathbf{Y} | do(\mathbf{J}, \mathbf{W}))$ identifiable from $P(\mathbf{V} | do(\mathbf{J}))$?"**
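
As a minimal sketch of how the graph-side inputs might be held in code, the block below uses two hypothetical containers, `MixedGraph` and `IDQuery`. Neither name comes from the paper. The sketch assumes that the graph's nodes are $\mathbf{J} \cup \mathbf{V}$, so $\mathbf{V}$ is recovered as the nodes minus $\mathbf{J}$. The toy edges are illustrative only, and the distribution $P(\mathbf{V} | do(\mathbf{J}))$ is left out.

```python
from dataclasses import dataclass

@dataclass(frozen=True)
class MixedGraph:
    """Hypothetical container for a directed mixed graph (cycles allowed)."""
    nodes: frozenset
    directed: frozenset      # pairs (u, v) meaning u -> v
    bidirected: frozenset    # pairs (u, v) meaning u <-> v (latent confounder)

@dataclass(frozen=True)
class IDQuery:
    """Hypothetical bundle of the graph-side inputs to ID."""
    graph: MixedGraph
    Y: frozenset   # target variables
    W: frozenset   # intervention variables
    J: frozenset   # background interventions already fixed in the data

    @property
    def V(self):
        # Assumption: observed variables are all nodes that are not in J
        return self.graph.nodes - self.J

# Toy example; the edges are illustrative, not taken from any real pathway.
graph = MixedGraph(
    nodes=frozenset({"glucose", "lysA_knockout", "lysine_concentration"}),
    directed=frozenset({
        ("glucose", "lysine_concentration"),
        ("lysA_knockout", "lysine_concentration"),
    }),
    bidirected=frozenset(),
)
query = IDQuery(
    graph=graph,
    Y=frozenset({"lysine_concentration"}),
    W=frozenset({"lysA_knockout"}),
    J=frozenset({"glucose"}),
)
```

Frozen dataclasses and frozensets are used so that a query cannot be mutated by accident while the algorithm runs; any other immutable representation would serve equally well.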

### Line 2: Precondition Check

$$
\textbf{2: require: } \mathbf{Y} \subseteq \mathbf{V}, \mathbf{W} \subseteq \mathbf{V}, \mathbf{Y} \cap \mathbf{W} = \emptyset
$$

**Symbol Definitions:**

| Symbol | Definition | How to Read |
|--------|------------|-------------|
| $\textbf{require}$ | Precondition check (must be true to proceed) | "require" |
| $\mathbf{Y}$ | Target variables (outcome) | "Y" |
| $\subseteq$ | Subset relation (contained in) | "is a subset of" or "is contained in" |
| $\mathbf{V}$ | All observed variables | "V" |
| $\mathbf{W}$ | Intervention variables (treatment) | "W" |
| $\cap$ | Intersection (elements in both sets) | "intersect" or "cap" |
| $\emptyset$ | Empty set (no elements) | "empty set" or "null set" |

---

**English Explanation:**

This line checks three conditions that must all hold before the algorithm can proceed. They are **preconditions**: if any one of them fails, the algorithm stops.

**The Three Conditions:**

**Condition 1: $\mathbf{Y} \subseteq \mathbf{V}$**

**Mathematical:** "Y is a subset of V"

**Plain English:** "Every variable in Y must be an observed variable"

**Why this matters:**
- You cannot identify (predict) something you don't measure
- If Y contains hidden/latent variables, we have no data on them
- The algorithm requires that everything in Y is actually observable

---


**Condition 2: $\mathbf{W} \subseteq \mathbf{V}$**

**Mathematical:** "W is a subset of V"

**Plain English:** "Every intervention variable must be an observed variable"

**Why this matters:**
- You cannot intervene on variables you cannot observe or control
- If W contains latent variables, you can't physically manipulate them in an experiment
- You need to measure W to verify that the intervention actually took place

---


**Condition 3: $\mathbf{Y} \cap \mathbf{W} = \emptyset$**

**Mathematical:** "The intersection of Y and W is empty"

**Plain English:** "Y and W have no variables in common" or "Y and W don't overlap"

**Why this matters:**
- It makes no logical sense to ask "What's the effect of X on X?"
- If you're intervening on (fixing) X, you already know its value
- You can't simultaneously fix X and ask "what happens to X?"


**Note:** This line in particular prevents circular questions by ensuring we never ask "What is the effect of X on X?" (Condition 3: Y ∩ W = ∅).

**What Happens If Any Condition Fails:**

The algorithm **immediately stops** and returns an error explaining which condition was violated.

**Pseudocode:**
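```python
def line_2_check(Y, W, V):
    """Check the three preconditions of Line 2 on Python sets Y, W, V."""
    # Condition 1: every target variable must be observed
    if not Y.issubset(V):
        raise ValueError("Y must be a subset of the observed variables V")

    # Condition 2: every intervention variable must be observed
    if not W.issubset(V):
        raise ValueError("W must be a subset of the observed variables V")

    # Condition 3: targets and interventions must not overlap
    if Y & W:
        raise ValueError("Y and W must not overlap")

    # If we reach here, all checks passed
    return "Preconditions satisfied - proceeding."
```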

**Summarizing Note:** This ensures that we can measure what we want to predict (Y ⊆ V), control what we want to intervene on (W ⊆ V), and ask a non-circular question (Y ∩ W = ∅).

---

### Line 3: Ancestral Closure

$$
\textbf{3: } H \leftarrow \text{Anc}^{G_{\mathbf{V} \setminus \mathbf{W}}}(\mathbf{Y})
$$

---

**Symbol Definitions:**

| Symbol | Definition | How to Read |
|--------|------------|-------------|
| $H$ | Set of relevant variables (result) | "H" |
| $\leftarrow$ | Assignment operator | "gets" or "is assigned" |
| $\text{Anc}^G(\mathbf{Y})$ | Ancestors of $\mathbf{Y}$ in graph $G$ | "Anc of Y in G" or "ancestors of Y" |
| $G_{\mathbf{V} \setminus \mathbf{W}}$ | Modified graph after removing nodes $\mathbf{W}$ | "G sub V minus W" |
| $\mathbf{V} \setminus \mathbf{W}$ | Set difference: all variables in $\mathbf{V}$ except $\mathbf{W}$ | "V minus W" or "V set-minus W" |
| $\mathbf{Y}$ | Target variables | "Y" |
| $\mathbf{W}$ | Intervention variables | "W" |

**English Explanation:**
H gets the ancestors of Y in the graph G-sub-V-minus-W.

Or, less formally:

H is assigned the set of ancestors of Y in the modified graph where W has been removed.

---
This line performs **two operations in sequence**, then stores the result, to find the "relevant variables" for identifying $\mathbf{Y}$:

**Step 1: Create Modified Graph $G_{\mathbf{V} \setminus \mathbf{W}}$**

**What it does:** Remove all intervention nodes $\mathbf{W}$ from the graph

**How:**
1. Take the original graph $G$
2. Delete all nodes in $\mathbf{W}$
3. Delete all edges connected to those nodes

**Why remove W:**
- We're intervening on $\mathbf{W}$ (forcing it to specific values)
- Intervention $do(\mathbf{W})$ means "cut off what causes $\mathbf{W}$"
- $\mathbf{W}$ becomes exogenous (determined by experimenter, not the system)
- What naturally causes $\mathbf{W}$ is now irrelevant
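
The removal step can be sketched as follows. This is a minimal illustration, assuming the graph is stored as a node set plus two sets of edge pairs; the helper name `remove_nodes` and the four-node toy graph are hypothetical.

```python
def remove_nodes(nodes, directed, bidirected, removed):
    """Return the subgraph induced on nodes - removed.

    directed:   set of pairs (u, v) meaning u -> v
    bidirected: set of pairs (u, v) meaning u <-> v
    """
    kept = set(nodes) - set(removed)
    # Keep an edge only if both of its endpoints survive
    kept_directed = {(u, v) for (u, v) in directed if u in kept and v in kept}
    kept_bidirected = {(u, v) for (u, v) in bidirected if u in kept and v in kept}
    return kept, kept_directed, kept_bidirected

# Toy graph: Z -> W -> Y, X -> Y, and W <-> Y
nodes = {"Z", "W", "X", "Y"}
directed = {("Z", "W"), ("W", "Y"), ("X", "Y")}
bidirected = {("W", "Y")}

sub_nodes, sub_directed, sub_bidirected = remove_nodes(
    nodes, directed, bidirected, removed={"W"}
)
```

After the call, only the edge `("X", "Y")` remains: every directed or bidirected edge that touched `"W"` is gone, and `"Z"` is left without any path to `"Y"`.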

---

**Step 2: Find Ancestors $\text{Anc}^{G_{\mathbf{V} \setminus \mathbf{W}}}(\mathbf{Y})$**

**What it does:** In the modified graph (without $\mathbf{W}$), find all variables that have a directed path to $\mathbf{Y}$

**Why find ancestors:**
- Only ancestors can causally affect $\mathbf{Y}$
- Variables with no path to $\mathbf{Y}$ are irrelevant
- This can shrink the problem considerably when many variables are not ancestors of $\mathbf{Y}$

---

**Step 3: Store the result in $H$**

**What it is:** $H$ is the set of "relevant variables" for the rest of the algorithm

**Properties of H:**
- $\mathbf{Y} \subseteq H$ (always includes targets)
- $\mathbf{W} \cap H = \emptyset$ (interventions removed)
- $H \subseteq \mathbf{V}$ (subset of all observed variables)
- Often much smaller than $\mathbf{V}$ (reduces the identification problem)

---

**Summarizing Thoughts:**

Line 3 reduces the identification problem by:
1. Removing intervention variables $\mathbf{W}$ (captures $do(\mathbf{W})$ semantics)
2. Finding ancestors of $\mathbf{Y}$ (identifies causally relevant variables)
3. Storing in $H$ (simplified problem for subsequent steps)

**Note:** The Ancestral Closure definition is needed at this point: it explains what "ancestors" means in a causal graph and why they are the only relevant variables for identifying causal effects.

---

### Ancestral Closure: Key Definition for Line 3

**Notation:** $\text{Anc}^G(\mathbf{Y})$

**Definition:** The set of all variables in graph $G$ that have a directed path (possibly of length zero) to at least one variable in $\mathbf{Y}$.

**Mathematical notation:**
$$
\text{Anc}^G(\mathbf{Y}) = \{v \in V : \exists y \in \mathbf{Y}, \exists \text{ directed path from } v \text{ to } y \text{ in } G\}
$$

**In words:** "The set of all variables $v$ such that there exists a directed path from $v$ to some variable in $\mathbf{Y}$"
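
A minimal sketch of this definition, assuming the directed edges are given as a set of `(u, v)` pairs meaning `u -> v`. The function name `ancestors` and the toy edges are illustrative. It walks the edges backwards from the targets, so it terminates on cyclic graphs and counts each target as its own ancestor.

```python
from collections import deque

def ancestors(directed, targets):
    """Nodes with a directed path (possibly of length zero) to some target."""
    # Index the edges by child so we can walk them backwards
    parents = {}
    for u, v in directed:
        parents.setdefault(v, set()).add(u)

    found = set(targets)          # every target is its own ancestor
    queue = deque(targets)
    while queue:
        node = queue.popleft()
        for p in parents.get(node, ()):
            if p not in found:    # the visited check makes cycles safe
                found.add(p)
                queue.append(p)
    return found

# Toy graph with a feedback loop: A -> B -> A, B -> C, C -> D
edges = {("A", "B"), ("B", "A"), ("B", "C"), ("C", "D")}
print(sorted(ancestors(edges, {"C"})))  # ['A', 'B', 'C']
```

Here `D` is excluded because it is a descendant of `C`, not an ancestor, while `A` and `B` are both included even though they sit on a cycle.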

--- 

### Key Properties

**1. Follows directed edges only**
- A path from $v$ to $y$ means following arrows in their direction
- Example: $v \rightarrow w \rightarrow y$ (valid path)
- Cannot go backwards: $v \leftarrow w \leftarrow y$ (not a valid ancestor path)

**2. Includes Y itself**
- Every variable is its own ancestor by definition
- If $\mathbf{Y} = \{\text{lysine}\}$, then $\text{lysine} \in \text{Anc}(\mathbf{Y})$

**3. Works with cycles**
- In cyclic graphs (feedback loops), nodes can be mutual ancestors
- Example: If $A \rightarrow B \rightarrow A$ (cycle), then both $A$ and $B$ are ancestors of each other

**4. Transitive property**
- If $A$ is an ancestor of $B$, and $B$ is an ancestor of $C$, then $A$ is an ancestor of $C$

---

### Why do we need this definition here? 

**The Causal Principle:** 
Only ancestors can causally affect a variable. If there's no directed path from $v$ to $y$, then $v$ cannot cause changes in $y$.

**How Line 3 Uses This:**

Line 3 computes $H \leftarrow \text{Anc}^{G_{\mathbf{V} \setminus \mathbf{W}}}(\mathbf{Y})$ in two steps:

1. **Remove $\mathbf{W}$ from graph** → Captures intervention semantics
   - Variables that only affect $\mathbf{Y}$ through $\mathbf{W}$ are excluded
   - Represents "cutting off" what causes $\mathbf{W}$

2. **Find ancestors of $\mathbf{Y}$** → Identifies relevant variables
   - Only ancestors can causally affect $\mathbf{Y}$
   - Non-ancestors are safely ignored

**Impact:**
- **Computational:** Reduces problem from all variables $\mathbf{V}$ to just $H$ 
- **Conceptual:** Focuses analysis on causally relevant pathways
- **Correctness:** Ensures we don't miss any variables that could affect $\mathbf{Y}$
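
As a small worked example, consider a hypothetical five-node graph, chosen only for illustration, with $\mathbf{V} = \{a, t, x, y, u\}$, $\mathbf{W} = \{t\}$ and $\mathbf{Y} = \{y\}$:

$$
\begin{align*}
G &: \quad a \rightarrow t, \quad t \rightarrow y, \quad x \rightarrow y, \quad y \rightarrow u \\
G_{\mathbf{V} \setminus \mathbf{W}} &: \quad x \rightarrow y, \quad y \rightarrow u \quad \text{(nodes } a, x, y, u \text{)} \\
H = \text{Anc}^{G_{\mathbf{V} \setminus \mathbf{W}}}(\mathbf{Y}) &= \{x, y\}
\end{align*}
$$

Here $a$ drops out because its only directed path to $y$ runs through the removed node $t$, and $u$ drops out because it is a descendant of $y$ rather than an ancestor.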


### Line 4: Loop over Consolidated Districts

$$
\textbf{4: } \textbf{for } C \in \text{CD}(H) \textbf{ do}
$$

**Symbol Definitions:**

| Symbol | Definition | How to Read |
|--------|------------|-------------|
| $\textbf{for}$ | Loop control structure | "for each" |
| $C$ | A consolidated district (set of variables) | "C" |
| $\in$ | Element of / membership | "in" or "is an element of" |
| $\text{CD}(H)$ | Set of all consolidated districts of the subgraph of $G$ induced by $H$ | "CD of H" or "consolidated districts of H" |
| $H$ | Relevant variables (from Line 3) | "H" |
| $\textbf{do}$ | Begin loop body | "do" |

---

**Reading this line:**
"For each C in the consolidated districts of H, do..."

or more naturally:

"For each consolidated district C in H, execute the following..."

---
**Plain English Explanation:**

This line begins a loop that processes each **consolidated district** in $H$ separately.

**What is a consolidated district?**

A group of variables that are "stuck together" and must be identified as a unit because they're coupled, directly or through a chain of such links, by:
1. **Latent confounders** (bidirected edges ↔), OR
2. **Feedback loops** (being in the same strongly connected component)

**Why loop over districts?**
- Variables within a district cannot be separated
- Must identify distributions over entire districts
- Different districts can be processed independently
- This is a divide-and-conquer strategy

**What happens in the loop?**
Lines 5-8 will attempt to identify each district $C$ by calling the recursive function IDCD.

---

**The Key Idea:**

In acyclic graphs without latent confounders:
- Each variable can be identified separately
- Districts have just 1 variable each

In graphs with cycles or latent confounders:
- Variables are coupled (through cycles or confounding)
- Districts can have multiple variables
- Must identify the whole district together
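
The contrast above can be sketched in code. This is a minimal illustration, not the paper's implementation: it assumes two nodes belong to the same consolidated district whenever they are linked by a chain of bidirected edges and shared strongly connected components. The names `reachable` and `consolidated_districts` are hypothetical, and the quadratic reachability check is only suitable for small graphs.

```python
def reachable(directed, start):
    """All nodes reachable from start along directed edges (including start)."""
    children = {}
    for u, v in directed:
        children.setdefault(u, set()).add(v)
    seen, stack = {start}, [start]
    while stack:
        node = stack.pop()
        for c in children.get(node, ()):
            if c not in seen:
                seen.add(c)
                stack.append(c)
    return seen

def consolidated_districts(nodes, directed, bidirected):
    """Group nodes linked by bidirected edges or by a shared cycle."""
    reach = {v: reachable(directed, v) for v in nodes}
    parent = {v: v for v in nodes}    # union-find forest

    def find(v):
        while parent[v] != v:
            parent[v] = parent[parent[v]]   # path halving
            v = parent[v]
        return v

    def union(a, b):
        parent[find(a)] = find(b)

    # Merge nodes in the same strongly connected component
    for u in nodes:
        for v in nodes:
            if v in reach[u] and u in reach[v]:
                union(u, v)
    # Merge nodes joined by a bidirected edge (latent confounder)
    for u, v in bidirected:
        union(u, v)

    groups = {}
    for v in nodes:
        groups.setdefault(find(v), set()).add(v)
    return {frozenset(g) for g in groups.values()}

# Acyclic, no confounders: every district is a single variable
simple = consolidated_districts({"A", "B", "C"}, {("A", "B"), ("B", "C")}, set())

# Cycle A <-> B (directed both ways) plus a confounder between B and C
coupled = consolidated_districts(
    {"A", "B", "C", "D"},
    {("A", "B"), ("B", "A"), ("C", "D")},
    {("B", "C")},
)
```

In the second call, `A` and `B` merge through the feedback loop, `C` joins them through the bidirected edge with `B`, and `D` stays alone because a plain directed edge `C -> D` does not couple two variables into one district.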

**Next:** Line 5 attempts to identify each district by calling the recursive IDCD function.
