
# Seeing vs Doing
**Hands‑on Notebook**


**What you'll practice**
1. Simulate observational vs interventional worlds (`P(Y|X)` vs `P(Y|do(X))`).
2. See how **confounding** inflates correlations (firing squad toy model).

In [1]:

import numpy as np
import pandas as pd
import matplotlib.pyplot as plt

rng = np.random.default_rng(7)

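def summarize_binary(y, x=None, do=None, df=None, name=""):
    """Print P(y=1 | x=1) and/or P(y=1 | do(...)) from a DataFrame of 0/1 columns.

    `do` is only a label for the printout: `df` must already have been
    simulated under that intervention, so the plain mean of `y` is reported.
    """
    if df is not None and x is not None:
        # Seeing: restrict to the rows where x == 1
        p = df.loc[df[x]==1, y].mean()
        n = (df[x]==1).sum()
        print(f"P({y}=1 | {x}=1) = {p:.3f}  [n={n}]  {name}")
    if df is not None and do is not None:
        # Doing: every row was generated under the intervention
        p = df[y].mean()
        n = len(df)
        print(f"P({y}=1 | do({do})) = {p:.3f}  [n={n}]  {name}")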



## Seeing vs Doing with the **Firing Squad** toy model

![Firing squad DAG](../images/firing_squad.png)

- Variables:
  - `C` (captain's order)
  - `SA` (squad A fires)
  - `SB` (squad B fires)
  - `D` (death)


We will compare:
- **Observation**: `P(D|SA=0)` is low, because when A does not fire, B usually does not fire either (same cause `C`).
- **Intervention**: `P(D|do(SA=0))` sets A to not fire regardless of the command `C`, which isolates A's own causal contribution. B still follows the captain, so it fires in some trials and not in others, and death remains possible.
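The two quantities above can also be computed exactly, without sampling, by enumerating the structural model. The sketch below is only an illustration: the function name `exact_seeing_vs_doing` and its parameters (`p_c`, `obey`, `p_kill_a`, `p_kill_b`) are hypothetical, and it assumes that a squad fires only when ordered and that death requires at least one firing squad to hit.

```python
from itertools import product


def exact_seeing_vs_doing(p_c=0.5, obey=1.0, p_kill_a=1.0, p_kill_b=1.0):
    """Return (P(D=1 | SA=0), P(D=1 | do(SA=0))) by exact enumeration.

    Hypothetical helper: assumes P(SA=0) > 0 so the conditional is defined.
    """

    def p_order(c):
        # Probability of the captain's order taking value c
        return p_c if c == 1 else 1.0 - p_c

    def p_fire(s, c):
        # A squad fires only when ordered, and then with probability `obey`
        p1 = obey if c == 1 else 0.0
        return p1 if s == 1 else 1.0 - p1

    def p_death(sa, sb):
        # Death occurs unless every squad that fired misses
        return 1.0 - (1.0 - p_kill_a * sa) * (1.0 - p_kill_b * sb)

    # Seeing: keep only the part of the joint distribution where SA = 0
    num = den = 0.0
    for c, sb in product([0, 1], repeat=2):
        w = p_order(c) * p_fire(0, c) * p_fire(sb, c)
        num += w * p_death(0, sb)
        den += w
    seeing = num / den

    # Doing: SA is forced to 0, so C keeps its own distribution
    doing = sum(
        p_order(c) * p_fire(sb, c) * p_death(0, sb)
        for c, sb in product([0, 1], repeat=2)
    )
    return seeing, doing


for obey in (1.0, 0.95):
    seeing, doing = exact_seeing_vs_doing(obey=obey)
    print(f"obey={obey}: P(D=1|SA=0)={seeing:.3f}  P(D=1|do(SA=0))={doing:.3f}")
```

With perfectly obedient squads (`obey=1.0`) seeing `SA=0` implies that no order was given, so the observational probability is exactly 0 while the interventional one is 0.5. The second value, `obey=0.95`, matches the obedience rate used in the simulation that follows.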


To make the scenario a bit more interesting, the code below makes the squads imperfectly obedient: each squad fires on only 95% of the captain's orders.

In [2]:
N = 100000
rng = np.random.default_rng(6)

# Structural equations (binary)
C = rng.binomial(1, 0.5, size=N)  # Captain order
SA = (C & (rng.random(N) < 0.95)).astype(int)  # Squad A obeys if ordered
SB = (C & (rng.random(N) < 0.95)).astype(int)  # Squad B obeys if ordered

# Lethality: a shot kills with probability p_kill (1.0 means every shot is lethal)
p_kill_A, p_kill_B = 1.0, 1.0
# A squad can only hit if it actually fires
hit_A = (SA & (rng.random(N) < p_kill_A)).astype(int)
hit_B = (SB & (rng.random(N) < p_kill_B)).astype(int)
D = np.maximum(hit_A, hit_B)  # death if at least one squad hits

# Combine all simulated variables into a single DataFrame for easier analysis and plotting
obs_df = pd.DataFrame(dict(C=C, SA=SA, SB=SB, D=D))

print("Observational world:")
# Compute P(D=1 | SA=0)
p_obs = obs_df.loc[obs_df["SA"] == 0, "D"].mean()
n_obs = (obs_df["SA"] == 0).sum()
print(f"P(D=1 | SA=0) = {p_obs:.3f}  [n={n_obs}]  (seeing)")

# --- Interventional: do(SA=0) ---
C2 = rng.binomial(1, 0.5, size=N)                   # captain as before
SA2 = np.zeros(N, dtype=int)                        # force A not to fire
SB2 = (C2 & (rng.random(N) < 0.95)).astype(int)     # B still reacts to captain

hit_A2 = (SA2 & (rng.random(N) < p_kill_A)).astype(int)
hit_B2 = (SB2 & (rng.random(N) < p_kill_B)).astype(int)
D2 = np.maximum(hit_A2, hit_B2)

do_df = pd.DataFrame(dict(C=C2, SA=SA2, SB=SB2, D=D2))
p_do = do_df["D"].mean()
n_do = len(do_df)

print("\nInterventional world:")
print(f"P(D=1 | do(SA=0)) = {p_do:.3f}  [n={n_do}]  (doing)")


Observational world:
P(D=1 | SA=0) ≈ 0.045  [n=52415]  (seeing)

Interventional world:
P(D=1 | do(SA=0)) ≈ 0.475  [n=100000]  (doing)


The `n` in brackets is the number of simulated trials with `SA=0`. In the observational world that is about half of the trials (52,415 of 100,000); under the intervention `SA` is held at 0 in every trial. The printed probabilities are shown as approximations of the exact values for this model: 0.02375 / 0.525 ≈ 0.045 for seeing (A disobeys an order while B obeys, divided by all the ways A can stay silent) and 0.5 × 0.95 = 0.475 for doing (an order is given and B obeys).

#### We see that P(D=1 | SA=0) ≠ P(D=1 | do(SA=0))!
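This difference arises because **C (the captain’s order)** is a **confounder**: it influences both the squad’s action (`SA`) and, through `SB`, the outcome (`D`). Observing `SA=0` therefore also tells us something about `C`, whereas `do(SA=0)` cuts the arrow from `C` to `SA` and leaves `C` at its usual distribution.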
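Because `C` is the only common cause here, stratifying on it should recover the interventional quantity from purely observational data: `P(D=1 | do(SA=0)) = Σ_c P(D=1 | SA=0, C=c) · P(C=c)`. The sketch below illustrates this standard backdoor adjustment on a fresh simulation. The variable names (`rng_adj`, `df_adj`, `naive`, `adjusted`) are illustrative only, and it assumes lethality 1 and the 95% obedience used in this notebook.

```python
import numpy as np
import pandas as pd

rng_adj = np.random.default_rng(0)   # separate generator, illustrative seed
n_adj = 200_000

# Observational data only: nobody intervenes on SA
c = rng_adj.binomial(1, 0.5, size=n_adj)
sa = (c & (rng_adj.random(n_adj) < 0.95)).astype(int)
sb = (c & (rng_adj.random(n_adj) < 0.95)).astype(int)
d = np.maximum(sa, sb)               # lethality 1: death if either squad fires
df_adj = pd.DataFrame(dict(C=c, SA=sa, SB=sb, D=d))

# Naive "seeing" estimate: condition on SA = 0 and ignore C
naive = df_adj.loc[df_adj["SA"] == 0, "D"].mean()

# Backdoor adjustment: average the stratum-specific rates over P(C = c)
adjusted = 0.0
for cv in (0, 1):
    stratum = df_adj[(df_adj["SA"] == 0) & (df_adj["C"] == cv)]
    adjusted += stratum["D"].mean() * (df_adj["C"] == cv).mean()

print(f"naive    P(D=1 | SA=0)                 = {naive:.3f}")
print(f"adjusted sum_c P(D=1 | SA=0, C=c) P(c) = {adjusted:.3f}")
```

The adjustment needs rows with `SA=0` in every stratum of `C`. With perfectly obedient squads the stratum `C=1, SA=0` would be empty and the adjusted estimate undefined, which is one practical reason for the imperfect obedience in the simulation.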


## Exercise:

**Parameter flip:** In the firing squad model, change `p_kill_A` to 0.6 and `p_kill_B` to 0.99.  
   - Re-run and record `P(D|SA=0)` vs `P(D|do(SA=0))`.  
   - Explain in one sentence which way confounding moves the observational estimate.
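One way to run this exercise without copying the whole cell is to wrap the simulation in a function that takes the lethalities as arguments. The helper below is a hypothetical sketch, not part of the original notebook: the name `simulate_firing_squad` and the parameters `do_sa`, `obey`, `n` and `seed` are assumptions, and it treats a death as requiring a squad that both fires and hits.

```python
import numpy as np


def simulate_firing_squad(p_kill_a, p_kill_b, do_sa=None, obey=0.95,
                          n=100_000, seed=6):
    """Simulate the firing squad model and return a dict of 0/1 arrays.

    If `do_sa` is None, squad A follows the captain (observational world);
    otherwise SA is forced to `do_sa` in every trial (interventional world).
    """
    rng_local = np.random.default_rng(seed)
    c = rng_local.binomial(1, 0.5, size=n)                    # captain's order
    if do_sa is None:
        sa = (c & (rng_local.random(n) < obey)).astype(int)   # A obeys if ordered
    else:
        sa = np.full(n, int(do_sa))                           # A is set by hand
    sb = (c & (rng_local.random(n) < obey)).astype(int)       # B obeys if ordered
    hit_a = (sa & (rng_local.random(n) < p_kill_a)).astype(int)
    hit_b = (sb & (rng_local.random(n) < p_kill_b)).astype(int)
    d = np.maximum(hit_a, hit_b)                              # death if any hit
    return dict(C=c, SA=sa, SB=sb, D=d)


obs = simulate_firing_squad(0.6, 0.99)
do0 = simulate_firing_squad(0.6, 0.99, do_sa=0, seed=7)

p_seeing = obs["D"][obs["SA"] == 0].mean()
p_doing = do0["D"].mean()
print(f"P(D=1 | SA=0)     = {p_seeing:.3f}  (seeing)")
print(f"P(D=1 | do(SA=0)) = {p_doing:.3f}  (doing)")
```

Under these assumptions the estimates should land near the analytic values 0.02375 × 0.99 / 0.525 for seeing and 0.5 × 0.95 × 0.99 for doing; changing `p_kill_a` alone should leave both essentially unchanged, since A never fires in either comparison.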


In [3]:
N = 100000
rng = np.random.default_rng(6)

# Structural equations (binary)
C = rng.binomial(1, 0.5, size=N)  # Captain order
SA = (C & (rng.random(N) < 0.95)).astype(int)  # Squad A obeys if ordered
SB = (C & (rng.random(N) < 0.95)).astype(int)  # Squad B obeys if ordered

# Lethality: a shot kills with probability p_kill
p_kill_A, p_kill_B = 0.6, 0.99
# A squad can only hit if it actually fires
hit_A = (SA & (rng.random(N) < p_kill_A)).astype(int)
hit_B = (SB & (rng.random(N) < p_kill_B)).astype(int)
D = np.maximum(hit_A, hit_B)  # death if at least one squad hits

# Combine all simulated variables into a single DataFrame for easier analysis and plotting
obs_df = pd.DataFrame(dict(C=C, SA=SA, SB=SB, D=D))

print("Observational world:")
# Compute P(D=1 | SA=0)
p_obs = obs_df.loc[obs_df["SA"] == 0, "D"].mean()
n_obs = (obs_df["SA"] == 0).sum()
print(f"P(D=1 | SA=0) = {p_obs:.3f}  [n={n_obs}]  (seeing)")

# --- Interventional: do(SA=0) ---
C2 = rng.binomial(1, 0.5, size=N)                   # captain as before
SA2 = np.zeros(N, dtype=int)                        # force A not to fire
SB2 = (C2 & (rng.random(N) < 0.95)).astype(int)     # B still reacts to captain

hit_A2 = (SA2 & (rng.random(N) < p_kill_A)).astype(int)
hit_B2 = (SB2 & (rng.random(N) < p_kill_B)).astype(int)
D2 = np.maximum(hit_A2, hit_B2)

do_df = pd.DataFrame(dict(C=C2, SA=SA2, SB=SB2, D=D2))
p_do = do_df["D"].mean()
n_do = len(do_df)

print("\nInterventional world:")
print(f"P(D=1 | do(SA=0)) = {p_do:.3f}  [n={n_do}]  (doing)")

Observational world:
P(D=1 | SA=0) ≈ 0.045  [n=52415]  (seeing)

Interventional world:
P(D=1 | do(SA=0)) ≈ 0.470  [n=100000]  (doing)


In [None]:
# With Squad A not firing, death can only come from Squad B, so both
# probabilities scale with p_kill_B: P(D=1 | do(SA=0)) = 0.5 * 0.95 * 0.99 ≈ 0.470.
# Confounding pulls the observational estimate down (≈ 0.045), because SA=0
# mostly occurs when no order was given, in which case B did not fire either.