When working with a large dataframe in which only some columns are meant to be analyzed with dabest, I found that columns unrelated to the comparison (i.e. columns not passed as the x/y parameters) interfere with the results.
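The dataframe from the original report is not reproduced here. As a minimal sketch, a frame of the kind described might look like the following. The column name `unrelated`, the random values, the group sizes, and the choice to fill the extra column entirely with NaN are all assumptions for illustration, so the numbers reported in this issue will not be reproduced by this data.

```python
import numpy as np
import pandas as pd

rng = np.random.default_rng(0)  # fixed seed, chosen arbitrarily

n = 50  # observations per group (assumed)
df = pd.DataFrame({
    "groups": ["Group 1"] * n + ["Group 2"] * n,
    "value": rng.normal(loc=0.0, scale=1.0, size=2 * n),
    # Hypothetical column that plays no part in the comparison
    # but holds only missing values.
    "unrelated": np.full(2 * n, np.nan),
})
```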
Compare Group 1 vs Group 2:
import dabest
test = dabest.load(data=df, x='groups', y='value', idx=['Group 1', 'Group 2'])
test.mean_diff
This generates a bunch of warnings:
.../numpy/core/fromnumeric.py:3118: RuntimeWarning: Mean of empty slice.
out=out, **kwargs)
.../numpy/core/_methods.py:85: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
.../dabest/_stats_tools/confint_2group_diff.py:157: RuntimeWarning: invalid value encountered in less
prop_less_than_es = sum(B < effsize) / len(B)
.../dabest/_classes.py:545: UserWarning: The lower limit of the BCa interval cannot be computed. It is set to the effect size itself. All bootstrap values were likely all the same.
stacklevel=0)
.../dabest/_classes.py:550: UserWarning: The upper limit of the BCa interval cannot be computed. It is set to the effect size itself. All bootstrap values were likely all the same.
stacklevel=0)
.../scipy/stats/stats.py:5001: RuntimeWarning: divide by zero encountered in double_scalars
z = (bigu - meanrank) / sd
.../numpy/core/fromnumeric.py:3367: RuntimeWarning: Degrees of freedom <= 0 for slice
**kwargs)
.../numpy/core/_methods.py:110: RuntimeWarning: invalid value encountered in true_divide
arrmean, rcount, out=arrmean, casting='unsafe', subok=False)
.../numpy/core/_methods.py:132: RuntimeWarning: invalid value encountered in double_scalars
ret = ret.dtype.type(ret / rcount)
The reported result is then incorrect:
(...)
The unpaired mean difference between Group 1 and Group 2 is nan [95%CI nan, nan].
The two-sided p-value of the Mann-Whitney test is 0.0.
(...)
Running the same analysis while keeping only the relevant columns produces the correct result:
test = dabest.load(data=df[['groups','value']], x='groups', y='value', idx=['Group 1', 'Group 2'])
test.mean_diff
(...)
The unpaired mean difference between Group 1 and Group 2 is -0.0708 [95%CI -0.202, 0.0631].
The two-sided p-value of the Mann-Whitney test is 0.268.
(...)
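The workaround above can be wrapped in a small helper so that only the columns involved in the comparison are handed to dabest. This is a sketch, not part of dabest: the function name `keep_analysis_columns` and the sample frame are hypothetical.

```python
import numpy as np
import pandas as pd


def keep_analysis_columns(data, x, y):
    """Return a copy of `data` restricted to the x and y columns.

    Hypothetical helper (not a dabest function): it only subsets columns,
    mirroring the df[['groups', 'value']] workaround.
    """
    missing = [col for col in (x, y) if col not in data.columns]
    if missing:
        raise KeyError(f"columns not found in dataframe: {missing}")
    return data[[x, y]].copy()


# Small made-up frame with an unrelated, NaN-filled column.
df = pd.DataFrame({
    "groups": ["Group 1", "Group 1", "Group 2", "Group 2"],
    "value": [0.1, 0.4, 0.3, 0.2],
    "unrelated": [np.nan, np.nan, np.nan, np.nan],
})

subset = keep_analysis_columns(df, x="groups", y="value")
print(list(subset.columns))

# The subset can then be passed on as in the call above, e.g.
# dabest.load(data=subset, x="groups", y="value", idx=["Group 1", "Group 2"])
```

Returning a copy keeps the original dataframe untouched, so the other columns remain available for later analyses.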
Alternatively, if the unrelated column(s) do not contain NaNs, everything works as expected:
(...)
The unpaired mean difference between Group 1 and Group 2 is -0.0708 [95%CI -0.202, 0.0631].
The two-sided p-value of the Mann-Whitney test is 0.268.
(...)
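Since the problem only shows up when columns outside x/y contain NaNs, it can help to check for such columns before running the analysis. The following is a minimal sketch using pandas only; `columns_with_nans` is a hypothetical helper and the sample frame is made up.

```python
import numpy as np
import pandas as pd


def columns_with_nans(data, exclude=()):
    """Return {column: NaN count} for columns with missing values, skipping `exclude`."""
    counts = data.drop(columns=list(exclude)).isna().sum()
    return {col: int(n) for col, n in counts.items() if n > 0}


df = pd.DataFrame({
    "groups": ["Group 1", "Group 1", "Group 2", "Group 2"],
    "value": [0.1, 0.4, 0.3, 0.2],
    "unrelated": [1.0, np.nan, 3.0, np.nan],
    "notes": ["a", "b", "c", "d"],
})

# Columns other than x/y that contain at least one NaN.
suspects = columns_with_nans(df, exclude=("groups", "value"))
print(suspects)
```

Any column listed here is a candidate for being dropped before the dataframe is passed to dabest.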
Thanks for the excellent diagnosis of the problem, @DizietAsahi! A colleague brought this to my attention very recently as well. Expect a bugfix shortly.