Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Line # 96 in merf.merf.py might be better when modifying "len(indices_i)" to "sum(indices_i)" #65

Open
CCChen-Menggg opened this issue Jun 2, 2022 · 0 comments

Comments

@CCChen-Menggg
Copy link

Hello,
Thanks for your great work on merf!
When I debug merf, I found that there is one line that does not work in any case:

63   def predict(self, X: np.ndarray, Z: np.ndarray, clusters: pd.Series):
           ...
          for cluster_id in self.cluster_counts.index:
                indices_i = clusters == cluster_id

               # If cluster doesn't exist in test data that's ok. Just move on.
96           if len(indices_i) == 0:  < ------------------ might revise to: if sum(indices_i) == 0
                    continue

               # If cluster does exist, apply the correction.
                ...

I marked Line # 96 and suggested changing "len()" to "sum()", otw, this if will never run in any case because indices_i is a pd.Series has the same shape with the input cluster.

And if possible, I would like to ask another question about the random effect matrix Z.

In the given example notebook, I noticed that when considering one variance as a random effect, the Z is not simply composed of one column but also has another column of ones. Why are the ones necessary?

Thanks in advance!
meng

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant