Default value of equal_var parameter should be False (scipy.stats.ttest_ind) #10889
Comments
It looks to me like changing the default value of that kwarg would break backward compatibility. Not saying it can't be done if |
Hey @tylerjereddy, thanks for the comment. I said backward compatibility won't be affected because I believe the two settings (equal_var = True or False) lead to the same, or a very similar, t-score when the variances are actually equal. Student's t-test (equal_var = True) differs noticeably from Welch's t-test (equal_var = False) only when the variances differ, and in that case Student's t-test shouldn't be used. Users who are aware of this will already have set equal_var = False (where the variances are unequal or unknown); users who weren't aware are relying on a default that is wrong for their data. So the proposed change is either backward compatible or it changes the result to the correct one. Please let me know if my understanding is incorrect. My 2 cents is that changing the default to equal_var = False would probably give the right results in most future cases. It would also be a step in the right direction and could be accomplished with a clear notice in the documentation. This is why it feels like it's worth the disruption the change might cause. I would let the stats people decide what is best and the right thing to do... Thanks, (Edited a few times to make some snippets more explicit; apologies for the multiple edits) |
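To make the comparison in the comment above concrete, here is a minimal pure-Python sketch of the two statistics using the textbook formulas. The helper names `student_t` and `welch_t` and the sample data are made up for illustration; this is not SciPy's implementation.

```python
import math
from statistics import mean, variance


def student_t(a, b):
    """Pooled-variance (Student) t statistic and degrees of freedom."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a), variance(b)  # sample variances (n - 1 denominator)
    df = n1 + n2 - 2
    pooled = ((n1 - 1) * v1 + (n2 - 1) * v2) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / n1 + 1 / n2))
    return t, df


def welch_t(a, b):
    """Welch t statistic and Welch-Satterthwaite degrees of freedom."""
    n1, n2 = len(a), len(b)
    v1, v2 = variance(a), variance(b)
    se2 = v1 / n1 + v2 / n2  # squared standard error, no pooling
    df = se2 ** 2 / ((v1 / n1) ** 2 / (n1 - 1) + (v2 / n2) ** 2 / (n2 - 1))
    t = (mean(a) - mean(b)) / math.sqrt(se2)
    return t, df


# Equal sizes and equal sample variances: both tests agree.
same_a, same_b = [1, 2, 3, 4, 5], [3, 4, 5, 6, 7]
t_s_same, df_s_same = student_t(same_a, same_b)
t_w_same, df_w_same = welch_t(same_a, same_b)

# Unequal sizes and very different variances: the tests diverge.
diff_a, diff_b = [1, 2, 3, 4, 5], [0, 10, 20]
t_s_diff, df_s_diff = student_t(diff_a, diff_b)
t_w_diff, df_w_diff = welch_t(diff_a, diff_b)

print("equal variances:  ", (t_s_same, df_s_same), (t_w_same, df_w_same))
print("unequal variances:", (t_s_diff, df_s_diff), (t_w_diff, df_w_diff))
```

In the second case the smaller group has the larger variance, so pooling understates the standard error and the Student statistic comes out larger in magnitude than the Welch one, with more degrees of freedom.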
sounds good |
@josef-pkt I see you have worked with |
To maintain backwards compatibility, we can't really change this unless we were to create a new function, and I don't think that's warranted here. Thanks for the suggestion @anandna123. Do let us know if there's something we can do to make the documentation even clearer; we wouldn't want the existing default to cause users to make mistakes. |
Is your feature request related to a problem? Please describe.
I was looking into the ttest_ind functionality and saw that the default value of the equal_var parameter is True, which is appropriate only when the population variances are equal.
Describe the solution you'd like
Considering that in most cases the variances will differ, doesn't it make sense to change the default value of the equal_var parameter to False, perhaps in the next version? Users without a statistical or mathematical background are often unaware that the value should be False when the variances differ.
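For reference, a short sketch of what the request is about: calling `scipy.stats.ttest_ind` with the current default versus passing `equal_var=False` explicitly. The sample data are made up for illustration, and the example assumes SciPy is installed.

```python
from scipy import stats

# Hypothetical samples: unequal sizes and very different spreads.
group_a = [1, 2, 3, 4, 5]
group_b = [0, 10, 20]

# Default call: equal_var=True, i.e. Student's pooled-variance t-test.
student = stats.ttest_ind(group_a, group_b)

# Explicit opt-in to Welch's t-test, which does not assume equal variances.
welch = stats.ttest_ind(group_a, group_b, equal_var=False)

print("Student:", student.statistic, student.pvalue)
print("Welch:  ", welch.statistic, welch.pvalue)
```

On data like these the two calls give noticeably different statistics and p-values, so a user who never touches `equal_var` gets the pooled-variance result.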
Describe alternatives you've considered
I believe backward compatibility shouldn't be an issue for newer versions. Please let me know your thoughts...
Additional context (e.g. screenshots)