Join GitHub today
GitHub is home to over 40 million developers working together to host and review code, manage projects, and build software together.Sign up
Fix t-tests when variance is zero #621
Looks great! I wasn't aware of this high-dimensional version of a t-test in Scipy, which seems to be as efficient as the current implementation. I only investigated thoroughly for Wilcoxon rank and found that Scipy doesn't have a scalable version to offer.
But yes, this will get merged after 1.4.1.
I'm not completely sure this doesn't break anything, but the regression tests pass. The internal code is very similar, so I'm not too worried about these changes.
It does look like it's (very) slightly slower. Running this a thousand times for pbmc68k dataset took ~2.3% longer (about 1.4 ms per run) than the previous version. That said, we're very inefficient about mean and variance calculation, so I think that's a better place to optimize.
Edit: I've force pushed to fix some minor formatting issues (trailing white space, blank line, typo) that I didn't think deserved it's own commit.