KCI-tests: Remove TestKCI and add assertions to TestCIT_KCI #51
Conversation
Thanks for the great work!! :) Some general comments that need to be fixed:
For this PR, your test plan should be: the test you run and the test results (your tests should all pass). A test pass screenshot would be great here.
thanks for the great work!!!
Thanks @aoqiz for your great work! Some of my comments:
Thanks @MarkDana for your comments. Regarding "making sure the code logic is unchanged after every update", I am a little worried about the current code's correctness. This only makes sense if we can ensure the current version gives correct output with no randomness. I am thinking of another way to make sure the test is doing as expected from a statistical point of view: under the null, the p-value should follow a uniform distribution. We can run the current KCI code multiple times when the null is true to see whether the p-value follows a uniform distribution.
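The calibration check suggested here can be sketched in a few lines. This is a minimal harness, using Pearson correlation from scipy as a fast stand-in for KCI; the same loop applies once causallearn's KCI p-value call is swapped in.

```python
# Sketch of the calibration check: under the null (X and Y independent),
# a valid test's p-values should follow Uniform(0, 1).
# Pearson correlation is used as a fast stand-in for KCI here; swapping in
# causallearn's KCI p-value call gives the actual check.
import numpy as np
from scipy import stats

rng = np.random.default_rng(0)
pvals = []
for _ in range(200):
    x = rng.standard_normal(100)
    y = rng.standard_normal(100)  # independent of x, so the null holds
    pvals.append(stats.pearsonr(x, y)[1])

# One-sample KS test against Uniform(0, 1): a large p-value means we cannot
# reject uniformity, i.e. the test is well calibrated under the null.
ks_stat, ks_p = stats.kstest(pvals, 'uniform')
print(f"KS statistic = {ks_stat:.3f}, p-value = {ks_p:.3f}")
```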
Hi @aoqiz, Could you address @MarkDana's previous comments and push the PR to be accepted? Currently there are still some issues in this PR. Some important ones:
Let me know if you have more questions --- we can meet to discuss! :) I know it's usually very hard to get to know how to write and design tests in the beginning (and to write industry-level code). :)
thanks for the great work! Overall the structure looks quite good, but I have some statistics-related questions.
Also, please update your test plan to include the screenshot after you run the command.
@aoqiz Do you mind sharing some updates here? What's the current status of comparing the Matlab version and the Python version? Maybe the first step would be to make sure the Python version can return identical results to the Matlab version.
causallearn/utils/KCI/KCI.py
Outdated
elif self.est_width == 'empirical':
    kernelX.set_width_empirical_kci(data_x)
    kernelX.set_width_empirical_kci(data_z)
Oh I'm not sure: you mean kernelX's and kernelY's widths should both be set empirically from data_z, instead of from data_x and data_y themselves?
But this doesn't matter for empirical width, right? data_z and data_x, data_y have the same sample size?
Yes, according to the Matlab version, the empirical kernel width of kernelX and kernelY should depend on the shape of data_z, not data_x or data_y.
Yes, they have the same sample size but a different column size. Here, data_x = np.concatenate((data_x, 0.5 * data_z), axis=1), so data_x.shape[1] is 2, while data_z.shape[1] is 1.
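For clarity, the shapes under discussion can be reproduced with a few lines of numpy (the construction mirrors the test setup quoted above; variable names follow that setup):

```python
import numpy as np

rng = np.random.default_rng(0)
n = 30                                # shared sample size
data_x = rng.standard_normal((n, 1))
data_z = rng.standard_normal((n, 1))

# The test concatenates 0.5 * data_z onto data_x, so data_x gains a column:
data_x = np.concatenate((data_x, 0.5 * data_z), axis=1)

print(data_x.shape)  # (30, 2)
print(data_z.shape)  # (30, 1)
# Same number of rows (samples), but different numbers of columns -- so an
# empirical width computed from data_x's shape vs. data_z's shape can differ.
```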
I see! Wow, thanks for being so careful!!
Then,
- Just for clarity, could you please add a comment and reference to the original code (with file and line number)? (so that readers won't feel like it's a typo).
- Is there any empirical performance gain / theoretical guarantee for this change? (so that we can ensure that it's not a typo in the matlab version).
- After this change, will KCI in python and in matlab return the same value over the same dataset (try to fix randomness)?
- And does this have anything to do with "The performance gap between KCI_UInd and KCI_CInd under a similar setting" #56?
@aoqiz Could you address Haoyue's question? I am also curious....
Sure, I'm working on that.
@aoqiz Could you answer @MarkDana's question? I am also curious haha
- "Is there any empirical performance gain / theoretical guarantee for this change? (so that we can ensure that it's not a typo in the matlab version)." I am also worried about this, and I don't know KCI much. Just to make sure: in the paper, we should use data_z's shape here, right? And is it theoretically sound?
- Based on your updated description, if I understand correctly, we can now ensure almost identical results between Matlab and Python, right?
- Could you take a look at the issue @MarkDana mentioned? I am also curious whether it's related.
@tofuwen @MarkDana I'm sorry, I think I cannot answer these questions now, and I'm still investigating. I hope I can give you an answer before this weekend. Thanks for your patience!
Is there any empirical performance gain / theoretical guarantee for this change? (so that we can ensure that it's not a typo in the Matlab version).
Actually, I didn't find anything about this in the paper, and I'm discussing it with Mingming now.
After this change, will KCI in python and Matlab return the same value over the same dataset (try to fix randomness)?
After fixing that bug, they can return almost the same p-value over the same dataset. For three datasets, the program was run 100 times on each; the means of the p-values returned by Python and Matlab, respectively, are: 0.2494 vs. 0.2463 (unconditional KCI) and 0.36024 vs. 0.3585 (conditional KCI) on the first dataset; 0.34238 vs. 0.3381 (unconditional) and 0.4971 vs. 0.4910 (conditional) on the second; 0.62922 vs. 0.6281 (unconditional) and 0.3795 vs. 0.3793 (conditional) on the third.
But I think the mean of the p-value should be the same for Python and Matlab if it's only due to the randomness. So what I'm doing now is checking whether there is any difference in the deterministic part.
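Since each run's p-value is random, comparing only the means of the two implementations is a weak check. A two-sample Kolmogorov-Smirnov test on the two sets of 100 p-values would test whether they plausibly come from the same distribution. The arrays below are synthetic placeholders, not the actual recorded runs:

```python
# Hedged sketch: a two-sample KS test checks whether the Python and Matlab
# p-value samples are plausibly draws from the same distribution, which is
# stronger evidence than matching means. The arrays here are synthetic
# placeholders standing in for the recorded runs.
import numpy as np
from scipy import stats

rng = np.random.default_rng(1)
pvals_python = rng.uniform(0.2, 0.3, size=100)  # placeholder for Python runs
pvals_matlab = rng.uniform(0.2, 0.3, size=100)  # placeholder for Matlab runs

stat, p = stats.ks_2samp(pvals_python, pvals_matlab)
# A large p here means the two distributions are statistically
# indistinguishable.
print(f"KS-2samp statistic = {stat:.3f}, p = {p:.3f}")
```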
Could you take a look at the issue @MarkDana mentioned? I am also curious whether it's related
Once I fix the difference in the deterministic part, I'll look into this issue.
Thanks for your patience again!
Thanks for the great work, I think we are almost done!
And thanks for finishing debugging this very tricky KCI bug! Fixing the bug makes causal-learn much better --- KCI is so fundamental that we must ensure our implementation is correct.
tests/TestCIT_KCI.py
Outdated
pvalue03_truth = [0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
                  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0,
                  0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.0, 0.138, 0.138, 0.143, 0.113,
                  0.138, 0.138, 0.141, 0.15, 0.051, 0.051, 0.066, 0.057, 0.051, 0.051, 0.056, 0.048, 0.004,
here the result is a little weird.
Is this expected?
tests/TestCIT_KCI.py
Outdated
0.142, 0.132, 0.13, 0.133, 0.133, 0.131, 0.12, 0.142, 0.142, 0.141, 0.135, 0.12, 0.12,
0.114, 0.116, 0.116, 0.116, 0.109, 0.108, 0.12, 0.12, 0.105, 0.105, 0.116, 0.116, 0.105,
0.101, 0.12, 0.12, 0.108, 0.108, 0.116, 0.116, 0.108, 0.105, 0.504, 0.504, 0.494, 0.518,
0.005, 0.005, 0.005, 0.005, 0.477, 0.477, 0.473, 0.479, 0.076, 0.076, 0.076, 0.077, 0.549,
hmmm, 0.005 seems weird. so small
This and the above are two cases where the Matlab version itself gives weird results. The details are listed in Weird_test_cases.pdf.
I've asked Mingming and Kun for help and am waiting for their answer. I hope I can give you the answer before this weekend.
Thanks, that's super clear. I think we can leave it as it is for now, and after Kun and Mingming respond, we can change it if necessary.
Super nice!! Thanks!!
Hi, @tofuwen @MarkDana, here is a summary of the comparison of the current Python version and the Matlab version on the KCI test. Since the Matlab version cannot support very flexible arguments, we only test the basic setting: applying a Gaussian kernel to all variables X, Y and Z and setting the kernel width using empirical rules. The results for Python are as follows:

```python
import numpy as np
import causallearn.utils.cit as cit

# np.random.seed(50)
# X = np.random.randn(30, 1)
# X_prime = np.random.randn(30, 1)
# Y = X + 0.5 * np.random.randn(30, 1)
# Z = Y + 0.5 * np.random.randn(30, 1)
# data = np.hstack((X, X_prime, Y, Z))
# np.savetxt("GaussianData.csv", data, delimiter=",")
data = np.loadtxt("GaussianData.csv", delimiter=",")

approx = True
for use_gp in [True, False]:
    # The argument 'use_gp' only relates to the conditional KCI, so only 'pXZY' will be affected.
    cit_CIT = cit.CIT(data, 'kci', kernelX='Gaussian', kernelY='Gaussian', kernelZ='Gaussian',
                      est_width='empirical', approx=approx, use_gp=use_gp)
    pXX = cit_CIT(0, 1)
    pXZ = cit_CIT(0, 3)
    pXZY = cit_CIT(0, 3, {2})
    print('\napprox =', approx, ', use_gp =', use_gp, ':')
    print('pXX:', round(pXX, 4))
    print('pXZ:', format(pXZ, '.4e'))
    print('pXZY', round(pXZY, 4))
```

Output:
```python
approx = False
for use_gp in [True, False]:
    np.random.seed(50)
    pXX = []
    pXZ = []
    pXZY = []
    for i in range(500):
        cit_CIT = cit.CIT(data, 'kci', kernelX='Gaussian', kernelY='Gaussian', kernelZ='Gaussian',
                          est_width='empirical', approx=approx, use_gp=use_gp)
        p01 = cit_CIT(0, 1)
        pXX.append(p01)
        p03 = cit_CIT(0, 3)
        pXZ.append(p03)
        p032 = cit_CIT(0, 3, {2})
        pXZY.append(p032)
    print('\napprox =', approx, ', use_gp =', use_gp, ':')
    print('pXX: mean = {}, var = {}'.format(round(np.mean(pXX), 4), format(np.var(pXX), '.4e')))
    print('pXZ: mean = {}, var = {}'.format(round(np.mean(pXZ), 4), format(np.var(pXZ), '.4e')))
    print('pXZY: mean = {}, var = {}'.format(round(np.mean(pXZY), 4), format(np.var(pXZY), '.4e')))
```

Output:
Using the same dataset GaussianData.csv generated by Python, the Matlab version's results are provided in Matlab_results.pdf. We can see that they give the same p-value in deterministic cases and almost the same p-value in non-deterministic cases. This conclusion also applies to the other datasets I tried.
Thanks, this is awesome @aoqiz! I think the only thing left is that we need to debug the Matlab KCI version when it outputs weird values, right? What's the progress there? But the Matlab KCI issue is not very relevant to our code here, so I think we can merge this code first. Do you mind adding a TODO in your code to remind us to debug the Matlab KCI version, and updating the "issue tracking doc" accordingly? :) Thanks so much for spending a lot of time debugging this very tricky bug!! :)
super nice!
I think we are really close to merge this PR.
But since our codebase changed quite a lot, I noticed that you may need to rebase your code.
You can ask @MarkDana if you need help, I believe he should be familiar with it (?). I know for people who don't have much coding experience, rebasing is usually painful. So don't hesitate to ask for help!
Oh, btw, after your fix, how about the KCI issue we mentioned previously?
Thank you, @tofuwen. I haven't submitted the current test file yet in case of new changes. But I'll do this now. Once finished, I'll remind you. Thanks!
What we have left now is:
@tofuwen After checking with the expert, the weird results in the Matlab version are expected; more care should be taken in how the kernel width is set manually. So I think there's no more issue remaining in this PR, and it's ready to be merged. @MarkDana I'm not familiar with resolving conflicts and rebasing the code, so could you please help me with this? Many thanks! I will look into issue #56 after #62 is merged, since I need the code for multi-dimensional variables in PR #62 --- #62 is ready to merge, right?
@aoqiz Great to know! Then in the tests in your code, could you also set the kernel width correctly so that we don't have weird p-value results in your tests? Also, how should we set the KCI width? Are there any instructions? I guess users may not be able to set the kernel width correctly either.
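On instructions for choosing a width: one common rule of thumb for a Gaussian kernel is the median heuristic (set the bandwidth to the median pairwise distance between samples). This is a general-purpose sketch, not necessarily the exact 'empirical' rule that causallearn's KCI implements:

```python
# General-purpose median heuristic for a Gaussian kernel bandwidth.
# Note: this is a common rule of thumb, not necessarily the exact
# 'empirical' width rule used inside causallearn's KCI.
import numpy as np

def median_heuristic_width(data):
    """Median of the pairwise Euclidean distances between rows of data."""
    n = data.shape[0]
    # Pairwise squared distances via broadcasting: an (n, n) matrix.
    sq = np.sum((data[:, None, :] - data[None, :, :]) ** 2, axis=-1)
    # The strictly upper triangle excludes the zero self-distances.
    dists = np.sqrt(sq[np.triu_indices(n, k=1)])
    return np.median(dists)

rng = np.random.default_rng(0)
X = rng.standard_normal((100, 1))
w = median_heuristic_width(X)
print(f"median-heuristic width: {w:.3f}")
```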
@tofuwen The weird p-values only happen when users set the kernel width manually, in the two cases listed in Weird_test_cases.pdf. But, usually, they may set the parameter. I don't think we need to set the kernel width 'correctly' when we're testing manual kernel-width setting, since it's just an exhaustive test over all possible argument values we could have. The weird p-values are expected in those two cases, while in the other cases, manually setting the kernel width works quite well.
@aoqiz I see. I still prefer to set a reasonable kernel width, and we shouldn't allow weird p-values to exist in our tests. If users check our code and see these extremely weird p-values, they will lose trust. And the tests should cover real use cases --- if our test returns a super weird p-value, it should never be a real use case.
@tofuwen Ok, let me think about it.
@tofuwen Fixed. There are no more weird p-values now. Please double-check, thanks!
Thanks, looks super nice now. :) We can merge it after resolving conflicts.
Aoqi kci rebase
Updates:
- Fix bugs (or typos) in causallearn/utils/KCI/KCI.py to conform to the original Matlab implementation in http://people.tuebingen.mpg.de/kzhang/KCI-test.zip:
  - KCI_CInd.kernel_matrix: KernelX's and KernelY's empirical widths depend on data_z's shape (Line 349, 373).
  - KCI_CInd.kernel_matrix: center the kernel matrix kzx (Line 404, 457, 462) before feeding it into KCI_V_statistic. After double-checking, the optimization @MarkDana did previously (https://github.com/cmu-phil/causal-learn/pull/49) still applies.
  - KCI_CInd.kernel_matrix: change the data normalization by adjusting the degrees-of-freedom correction in calculating the standard deviation.
- Add test cases where the datasets are generated by non-Gaussian distributions, e.g., the exponential distribution, the uniform distribution, and mixtures of these distributions.
- Remove print commands and add assertions on the exact p-values. The ground-truth p-values are computed from the current version 26d8e36.
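The degrees-of-freedom fix above can be illustrated in a few lines: NumPy's np.std defaults to ddof=0 (divide by n), while Matlab's std defaults to the n-1 (sample) normalization, so standardizing with one convention versus the other rescales the data by a constant factor:

```python
# Illustration of the ddof mismatch: standardizing with NumPy's default
# (ddof=0, divide by n) vs. Matlab's default (ddof=1, divide by n-1)
# rescales the standardized data by a constant factor.
import numpy as np

rng = np.random.default_rng(0)
x = rng.standard_normal((30, 1))

z_biased   = (x - x.mean(0)) / x.std(0)          # ddof=0: NumPy's default
z_unbiased = (x - x.mean(0)) / x.std(0, ddof=1)  # ddof=1: Matlab's default

# The two standardizations differ by the constant factor sqrt(n / (n - 1)).
n = x.shape[0]
ratio = z_biased / z_unbiased
print(np.allclose(ratio, np.sqrt(n / (n - 1))))  # prints True
```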
Resolving the concerns:
When the kernel width is set manually, there are two cases that may return a weird p-value even in the Matlab version. The details are listed in Weird_test_cases.pdf. After checking with the expert, these results are expected; more care should be taken in how the kernel width is set manually.
Comparison of the current Python version with the Matlab version on KCI-test results:
Since the Matlab version cannot support too flexible arguments, we only test the basic setting: applying Gaussian kernel to all variables X, Y and Z and setting the kernel width using empirical rules. The results for Python are as follows:
- approx=True: the resulting p-value is deterministic. Output:
- approx=False: the resulting p-value is not deterministic, due to the simulation of the null distribution in the function null_sample_spectral(xxx) in the classes KCI_UInd and KCI_CInd. (I've checked that the other deterministic parts give the same output as the Matlab version on the same input.) Therefore, we run the program 500 times and calculate the mean of the p-values. Output:
Using the same dataset GaussianData.csv generated by Python, the Matlab version results are provided in Matlab_results.pdf.
We can see that they give the same p-value in deterministic cases and almost the same p-value in non-deterministic cases. This conclusion also applies to the other datasets I tried.
Test plan:
TODO: