-
Notifications
You must be signed in to change notification settings - Fork 194
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix NaN problem while normailze the data #55
Changes from all commits
2ee93c6
7f40937
67f6ca7
ffe75f9
3603458
c58c816
709999d
04e2ba5
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
@@ -1,4 +1,5 @@ | ||
from math import log, sqrt | ||
from collections.abc import Iterable | ||
|
||
import numpy as np | ||
from scipy.stats import chi2, norm | ||
|
@@ -59,9 +60,18 @@ def _unique(column): | |
} | ||
|
||
def kci(self, X, Y, condition_set): | ||
if type(X) == int: | ||
X = [X] | ||
Comment on lines
+63
to
+64
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Better with an And then |
||
elif type(X) != list: | ||
Y = list(X) | ||
if type(Y) == int: | ||
Y = [Y] | ||
elif type(Y) != list: | ||
Y = list(Y) | ||
|
||
if len(condition_set) == 0: | ||
return self.kci_ui.compute_pvalue(self.data[:, [X]], self.data[:, [Y]])[0] | ||
return self.kci_ci.compute_pvalue(self.data[:, [X]], self.data[:, [Y]], self.data[:, list(condition_set)])[0] | ||
return self.kci_ui.compute_pvalue(self.data[:, X], self.data[:, Y])[0] | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. great, thanks! QQ:
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. @tofuwen There is no special design in the old code for supporting only one variable. I just aligned with other tests (used in constraint-based methods), and forgot that KCI can take in multivariate unconditional variables. I was just about to fix it. So thanks so much for your work @cogito233 ! There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. we'll need to handle the case for iterable There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Also for the cache key in https://github.com/cmu-phil/causal-learn/blob/ffe75f95c4003fa7e9d7d5f3bbec4ace90ed3a41/causallearn/utils/cit.py#L339, we'll need to handle There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I have no ideas about how to change the assert in the case of Maybe There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Wow cool. You're so productive! Thanks for all these! I see your point. This line was intended for correctness checks in constraint-based methods. As for the citest itself, is How about we force There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. Done. |
||
return self.kci_ci.compute_pvalue(self.data[:, X], self.data[:, Y], self.data[:, list(condition_set)])[0] | ||
|
||
def fisherz(self, X, Y, condition_set): | ||
""" | ||
|
@@ -326,7 +336,23 @@ def __call__(self, X, Y, condition_set=None, *args): | |
else: | ||
assert len(args) == 2, "Arguments other than skel and prt_m are provided for mc_fisherz." | ||
if condition_set is None: condition_set = tuple() | ||
assert X not in condition_set and Y not in condition_set, "X, Y cannot be in condition_set." | ||
|
||
if type(X) == int and type(Y) == int: | ||
assert X not in condition_set and Y not in condition_set, "X, Y cannot be in condition_set." | ||
else: | ||
if isinstance(X, Iterable): | ||
assert len(set(condition_set).intersection(X)) == 0, "X cannot be in condition_set." | ||
elif isinstance(X, int): | ||
assert X not in condition_set, "X cannot be in condition_set." | ||
else: | ||
raise Exception("Undefined type of X, X should be int or Iterable") | ||
if isinstance(Y, Iterable): | ||
assert len(set(condition_set).intersection(Y)) == 0, "Y cannot be in condition_set." | ||
elif isinstance(Y, int): | ||
assert Y not in condition_set, "Y cannot be in condition_set." | ||
else: | ||
raise Exception("Undefined type of Y, Y should be int or Iterable") | ||
|
||
i, j = (X, Y) if (X < Y) else (Y, X) | ||
cache_key = (i, j, frozenset(condition_set)) | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@MarkDana
Currently X and Y here can be int / Iterable? This doesn't sound like a good design --- if possible, we better make it every variable type-checked.
Why not enforce X and Y here to be a list of Int? This is the most general one right? So we don't need those lines just to do type-checking --- it's usually a good idea to make code concise.
But I think this can be changed in your later PR, @MarkDana
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes let me handle this in the later pr, or in the new KCI subclass.