# CI Tests Validation (Gaussian & Binary)

## Purpose

This notebook validates the correctness of the implemented conditional independence (CI) tests used in the MVPC framework.

It performs sanity checks on:

- Gaussian CI tests: TD (test-wise deletion), PermC (permutation-based correction), DRW (density ratio weighting)
- Binary CI tests: TD, PermC, DRW

under controlled synthetic settings with known ground-truth dependencies.

## Procedure

1. Construct simple chain structures:
   - Gaussian: X → Y → Z
   - Binary: A → B → C
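2. Introduce missingness in the middle variable (Y or B) and declare the first variable (X or A) as the parent of its missingness indicator.
3. Run each CI test on:
   - A marginal query, (X, Y) or (A, B), where dependence is expected
   - A conditional query, (X, Z | Y) or (A, C | B), where independence is expected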
4. Compare outputs across different CI implementations.
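
As an independent reference point for the Gaussian steps above, the sketch below runs a plain Fisher-z (partial) correlation test after test-wise deletion on a freshly simulated chain. It is not the `mvpc` implementation: the helper names `pearson`, `fisher_z_pvalue` and `ci_test_td` are hypothetical, it supports at most one conditioning variable, and it uses only the standard library so that it runs without the project installed.

```python
import math
import random
from statistics import NormalDist


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a, mean_b = sum(a) / n, sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def fisher_z_pvalue(r, n, k):
    """Two-sided p-value for a (partial) correlation r, given n samples
    and k conditioning variables."""
    r = max(min(r, 0.999999), -0.999999)  # keep the log finite
    z = 0.5 * math.log((1 + r) / (1 - r))
    stat = math.sqrt(n - k - 3) * abs(z)
    return 2 * (1 - NormalDist().cdf(stat))


def ci_test_td(rows, i, j, cond=None):
    """Test-wise deletion: drop rows with NaN in any involved column,
    then apply Fisher-z to the remaining rows."""
    cols = [i, j] + ([cond] if cond is not None else [])
    complete = [r for r in rows if not any(math.isnan(r[c]) for c in cols)]
    x = [r[i] for r in complete]
    y = [r[j] for r in complete]
    if cond is None:
        return fisher_z_pvalue(pearson(x, y), len(complete), 0)
    z = [r[cond] for r in complete]
    r_xy, r_xz, r_yz = pearson(x, y), pearson(x, z), pearson(y, z)
    partial = (r_xy - r_xz * r_yz) / math.sqrt((1 - r_xz**2) * (1 - r_yz**2))
    return fisher_z_pvalue(partial, len(complete), 1)


# Simulate X -> Y -> Z and delete about 30% of Y completely at random
rng = random.Random(0)
rows = []
for _ in range(2000):
    x = rng.gauss(0, 1)
    y = 0.8 * x + rng.gauss(0, 1)
    z = 0.8 * y + rng.gauss(0, 1)
    if rng.random() < 0.3:
        y = float("nan")
    rows.append((x, y, z))

p_marginal = ci_test_td(rows, 0, 1)             # X vs Y: dependence expected
p_conditional = ci_test_td(rows, 0, 2, cond=1)  # X vs Z given Y: independence expected
print("p(X,Y)     =", p_marginal)
print("p(X,Z | Y) =", p_conditional)
```

Under the null hypothesis the conditional p-value is roughly uniform on [0, 1], so a single run can still fall below a chosen threshold by chance. Only the marginal result is expected to be essentially zero.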

## Goal

Ensure that:

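- Dependent variables are detected as dependent (p-value near zero).
- Conditional independencies implied by the chain are not rejected.
- The missingness-aware CI tests (PermC, DRW) reach decisions consistent with the ground truth when the middle variable is partially missing.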

This notebook serves as a regression and sanity check for future modifications to the MVPC CI layer.

In [None]:
import sys
import os

project_root = os.path.abspath("..")

if project_root not in sys.path:
    sys.path.append(project_root)

print("Project root added:", project_root)



Project root added: c:\Users\sofia\OneDrive\Υπολογιστής\Thesis_New


In [None]:
%load_ext autoreload
%autoreload 2

The autoreload extension is already loaded. To reload it, use:
  %reload_ext autoreload


In [None]:
import numpy as np

# Gaussian CI tests
from mvpc.ci_tests.gauss_permc import gauss_ci_td, gauss_ci_permc
from mvpc.ci_tests.gauss_drw import gauss_ci_drw

# Binary CI tests
from mvpc.ci_tests.bin_td import bin_ci_td
from mvpc.ci_tests.bin_permc import bin_ci_permc
from mvpc.ci_tests.bin_drw import bin_ci_drw

# Missingness utilities
from mvpc.utils.mvpc_utils import test_wise_deletion, cond_PermC, get_prt_m_xys


In [23]:
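def make_suffstat(data, prt_m=None, skel=None):
    """Bundle the data, missingness parents, and skeleton into one dict."""
    n_vars = data.shape[1]
    return {
        "data": data,
        # No missingness parents are declared unless the caller supplies them
        "prt_m": prt_m if prt_m is not None else {"m": [], "prt": {}},
        # Default skeleton: all-ones adjacency matrix
        "skel": skel if skel is not None else np.ones((n_vars, n_vars)),
    }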


In [None]:
np.random.seed(0)

# Simple 3-variable Gaussian chain X -> Y -> Z
n = 300
X = np.random.normal(size=n)
Y = 0.8 * X + np.random.normal(size=n)
Z = 0.8 * Y + np.random.normal(size=n)

data_gauss = np.column_stack([X, Y, Z])


In [25]:
data_gauss_miss = data_gauss.copy()
mask = np.random.rand(n) < 0.3
data_gauss_miss[mask, 1] = np.nan  # introduce missingness in Y


In [None]:
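prt_m = {
    "m": [1],        # column index of the variable with missing values (Y)
    "prt": {1: [0]}  # declared parent of Y's missingness indicator: X
}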
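
The declared structure `"prt": {1: [0]}` states that missingness in Y depends on X, whereas the mask above is drawn independently of the data. The sketch below shows one way a mask that actually depends on X could be generated. The logistic link and its offset of 1.0 are illustrative assumptions, not part of `mvpc`, and the code uses only the standard library.

```python
import math
import random

rng = random.Random(0)
n = 5000
rows = []
for _ in range(n):
    x = rng.gauss(0, 1)
    y = 0.8 * x + rng.gauss(0, 1)
    # Assumed mechanism: the probability that Y is missing grows with X
    p_miss = 1.0 / (1.0 + math.exp(-(x - 1.0)))
    y_obs = float("nan") if rng.random() < p_miss else y
    rows.append((x, y_obs))

# If missingness depends on X, the mean of X differs between the two groups
x_when_missing = [x for x, y in rows if math.isnan(y)]
x_when_observed = [x for x, y in rows if not math.isnan(y)]
frac_missing = len(x_when_missing) / n
mean_missing = sum(x_when_missing) / len(x_when_missing)
mean_observed = sum(x_when_observed) / len(x_when_observed)

print("fraction of Y missing:", round(frac_missing, 3))
print("mean X where Y is missing: ", round(mean_missing, 3))
print("mean X where Y is observed:", round(mean_observed, 3))
```

With a mask of this kind, rows that survive test-wise deletion are no longer a random subsample with respect to X, which is the situation the correction methods are meant to handle.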


In [27]:
suff = make_suffstat(data_gauss_miss, prt_m=prt_m)

print("Gaussian CI tests on (X,Y):")
print("TD   =", gauss_ci_td(0, 1, [], suff))
print("PermC=", gauss_ci_permc(0, 1, [], suff))
print("DRW  =", gauss_ci_drw(0, 1, [], suff))

print("\nGaussian CI tests on (X,Z | Y):")
print("TD   =", gauss_ci_td(0, 2, [1], suff))
print("PermC=", gauss_ci_permc(0, 2, [1], suff))
print("DRW  =", gauss_ci_drw(0, 2, [1], suff))


Gaussian CI tests on (X,Y):
TD   = 0.0
PermC= 0.0
DRW  = 0.0

Gaussian CI tests on (X,Z | Y):
TD   = 0.0
PermC= 0.1020125335767832
DRW  = 0.08109215194113117


---


In [None]:
np.random.seed(1)

# Simple binary chain A -> B -> C
n = 500
A = np.random.binomial(1, 0.5, size=n)
B = np.random.binomial(1, 0.8*A + 0.1)
C = np.random.binomial(1, 0.8*B + 0.1)

data_bin = np.column_stack([A, B, C])


In [29]:
data_bin_miss = data_bin.astype(float)
mask = np.random.rand(n) < 0.25
data_bin_miss[mask, 1] = np.nan  # missingness in B
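
For the binary chain, a classical G² likelihood-ratio test after test-wise deletion can serve as an independent reference. The sketch below is not the `mvpc` implementation: the helper names are hypothetical, the data are simulated afresh, and the closed-form chi-square tail probabilities hold only for the degrees of freedom that arise with binary variables (1 for a 2×2 table, 2 for two strata of a binary conditioning variable).

```python
import math
import random
from collections import Counter


def g2_statistic(pairs):
    """G^2 = 2 * sum(O * ln(O / E)) over the observed cells of a 2x2 table."""
    n = len(pairs)
    joint = Counter(pairs)
    row = Counter(a for a, _ in pairs)
    col = Counter(c for _, c in pairs)
    g2 = 0.0
    for (a, c), obs in joint.items():
        expected = row[a] * col[c] / n
        g2 += 2.0 * obs * math.log(obs / expected)
    return max(g2, 0.0)  # guard against tiny negative rounding errors


def g2_marginal_pvalue(pairs):
    """Chi-square tail with 1 degree of freedom: erfc(sqrt(x / 2))."""
    return math.erfc(math.sqrt(g2_statistic(pairs) / 2.0))


def g2_conditional_pvalue(triples):
    """Test A independent of C given B by summing G^2 over the strata of B.
    Chi-square tail with 2 degrees of freedom: exp(-x / 2)."""
    total = 0.0
    for b_value in (0, 1):
        stratum = [(a, c) for a, b, c in triples if b == b_value]
        total += g2_statistic(stratum)
    return math.exp(-total / 2.0)


# Simulate A -> B -> C and delete about 25% of B completely at random
rng = random.Random(1)
rows = []
for _ in range(2000):
    a = int(rng.random() < 0.5)
    b = int(rng.random() < 0.8 * a + 0.1)
    c = int(rng.random() < 0.8 * b + 0.1)
    b_obs = None if rng.random() < 0.25 else b
    rows.append((a, b_obs, c))

# Test-wise deletion: both queries involve B, so keep rows where B is observed
complete = [r for r in rows if r[1] is not None]
p_ab = g2_marginal_pvalue([(a, b) for a, b, _ in complete])
p_ac_given_b = g2_conditional_pvalue(complete)
print("p(A,B)     =", p_ab)
print("p(A,C | B) =", p_ac_given_b)
```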


In [None]:
prt_m_bin = {
    "m": [1],        # column index of the variable with missing values (B)
    "prt": {1: [0]}  # declared parent of B's missingness indicator: A
}


In [32]:
suff_bin = make_suffstat(data_bin_miss, prt_m=prt_m_bin)

print("Binary CI tests on (A,B):")
print("TD   =", bin_ci_td(0, 1, [], suff_bin))
print("PermC=", bin_ci_permc(0, 1, [], suff_bin))
print("DRW  =", bin_ci_drw(0, 1, [], suff_bin))

print("\nBinary CI tests on (A,C | B):")
print("TD   =", bin_ci_td(0, 2, [1], suff_bin))
print("PermC=", bin_ci_permc(0, 2, [1], suff_bin))
print("DRW  =", bin_ci_drw(0, 2, [1], suff_bin))


Binary CI tests on (A,B):
TD   = 0.0
PermC= 0.0
DRW  = 0.0

Binary CI tests on (A,C | B):
TD   = 0.6366802505409666
PermC= 0.06427201629708867
DRW  = 0.6370929402087598
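
To use the printed values as a regression check, each one can be compared against a significance level and the resulting decision against the known chain structure. The sketch below assumes the printed numbers are p-values and that 0.05 is an acceptable level. `disagreements` is a hypothetical helper, not part of `mvpc`, and the numbers are copied from the outputs above.

```python
ALPHA = 0.05  # assumed significance level


def disagreements(p_values, expect_independent, alpha=ALPHA):
    """Return the names of tests whose decision contradicts the ground truth."""
    return [
        name
        for name, p in p_values.items()
        if (p > alpha) != expect_independent
    ]


# (p-values by test, whether the chain implies independence for this query)
results = {
    "Gaussian (X,Y)": ({"TD": 0.0, "PermC": 0.0, "DRW": 0.0}, False),
    "Gaussian (X,Z|Y)": (
        {"TD": 0.0, "PermC": 0.1020125335767832, "DRW": 0.08109215194113117},
        True,
    ),
    "Binary (A,B)": ({"TD": 0.0, "PermC": 0.0, "DRW": 0.0}, False),
    "Binary (A,C|B)": (
        {"TD": 0.6366802505409666, "PermC": 0.06427201629708867, "DRW": 0.6370929402087598},
        True,
    ),
}

flagged = {}
for query, (p_values, expect_independent) in results.items():
    bad = disagreements(p_values, expect_independent)
    if bad:
        flagged[query] = bad

print(flagged)  # {'Gaussian (X,Z|Y)': ['TD']}
```

At this level the only flagged entry is the Gaussian TD result for (X, Z | Y), where a value of 0.0 rejects an independence that the chain X → Y → Z implies.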
