Description
This issue is based on the code in pull request #55.
There is a strange performance gap between KCI_UInd and KCI_CInd. Intuitively, KCI_CInd should perform worse because it handles the more general case. However, when I ran the code, the result was not what I expected.
I tested the code on a random collider dataset, which means test statistics, mean and var for convenient debugging. The result shows a similar p-value of
The following is my test code:
from icecream import ic
from causallearn.utils.cit import CIT
from tqdm import trange
import numpy as np


def generate_single_sample(type, dim):
    if type == 'chain':
        # X -> Y -> Z
        X = np.random.random(dim)
        Y = np.random.random(dim) + X
        Z = np.random.random(dim) + Y
    elif type == 'collider':
        # X -> Y <- Z
        X = np.random.random(dim)
        Z = np.random.random(dim)
        Y = np.random.random(dim) + X + Z
        # Y = np.zeros(dim) + np.average(Y)
    # 31 columns: X: 0..9; Y: 10..19; Z: 20..29; constant 1: 30
    return list(X) + list(Y) + list(Z) + [1]


def generate_dataset(dim, size):
    dataset = []
    for i in range(size):
        datapoint = generate_single_sample('collider', dim)
        dataset.append(datapoint)
    dataset = np.array(dataset)
    return dataset


if __name__ == '__main__':
    dataset = generate_dataset(10, 1000)
    cit_tester = CIT(dataset, method='kci')
    # ic(cit_tester.kci(0, 20, []))
    # The original version cannot pass this because feature 30 has the same value in every sample
    # ic(cit_tester.kci(0, 20, [30]))

    # The following comes from one of my recent requirements: using CIT to test high-dimensional variables.
    # Testing high-dimensional variables is not supported by the current CIT class, contrary to the documentation,
    # so I also implemented this in the last commit.
    # I will raise a separate issue about CIT on high-dimensional variables later.
    ic(cit_tester.kci(range(10), range(20, 30), range(10, 20)))
    ic(cit_tester.kci(range(10), range(20, 30), []))
    ic(cit_tester.kci(range(10), range(10, 20), []))
    ic(cit_tester.kci(range(10), range(20, 30), [30]))
    ic(cit_tester.kci(range(10), range(10, 20), [30]))
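For reference, here is a minimal standard-library sketch of what the collider data above should look like before any KCI test is involved. It is not part of the pull request code: `pearson` and `collider_columns` are hypothetical helpers written only for this check, and it uses a one-dimensional analogue of each variable. Under these assumptions, X and Z should be roughly uncorrelated, X and Y should be clearly correlated, and the constant column has zero variance, so conditioning on it should in principle carry no information.

```python
import math
import random


def pearson(a, b):
    """Sample Pearson correlation of two equal-length sequences."""
    n = len(a)
    mean_a = sum(a) / n
    mean_b = sum(b) / n
    cov = sum((x - mean_a) * (y - mean_b) for x, y in zip(a, b))
    var_a = sum((x - mean_a) ** 2 for x in a)
    var_b = sum((y - mean_b) ** 2 for y in b)
    return cov / math.sqrt(var_a * var_b)


def collider_columns(size, seed=0):
    """One-dimensional analogue of the collider branch: X -> Y <- Z, plus a constant."""
    rng = random.Random(seed)  # seeded so the check is reproducible
    xs = [rng.random() for _ in range(size)]
    zs = [rng.random() for _ in range(size)]
    ys = [rng.random() + x + z for x, z in zip(xs, zs)]
    const = [1.0] * size  # plays the role of column 30
    return xs, ys, zs, const


xs, ys, zs, const = collider_columns(1000)
corr_xz = pearson(xs, zs)  # X and Z are generated independently
corr_xy = pearson(xs, ys)  # Y depends on X, so this should be clearly positive
const_var = sum((c - 1.0) ** 2 for c in const) / len(const)

print(f"corr(X, Z) = {corr_xz:.3f}")
print(f"corr(X, Y) = {corr_xy:.3f}")
print(f"var(const) = {const_var}")
```

Linear correlation is only a rough proxy for the kernel-based test, so this only confirms that the generated data has the intended structure. It says nothing about which of KCI_UInd or KCI_CInd is at fault.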
