I'm really excited to use the new scikit-posthocs package, and over the last few days I've been trying to run the Dunn test (posthoc_dunn) on results from a survey of about 1,100 respondents. It works fine on results that represent the difference between two feeling thermometers (a variable ranging from -100 to 100). But every time I run it on a Likert-type scale item that takes values of 1 through 3 or 1 through 5, it returns a table full of null results (NaN in every cell except the diagonal), along with a series of warnings:
_posthocs.py:191: RuntimeWarning: invalid value encountered in sqrt
z_value = diff / np.sqrt((A - x_ties) * B)
multitest.py:176: RuntimeWarning: invalid value encountered in greater
notreject = pvals > alphaf / np.arange(ntests, 0, -1)
multitest.py:251: RuntimeWarning: invalid value encountered in greater
pvals_corrected[pvals_corrected>1] = 1
I am not a programming expert, but my impression is that the compare_dunn function (lines 187-193 in _posthocs.py) is not returning valid p-values. My guess is that (A - x_ties) is negative for some reason, so np.sqrt isn't computing a value for z_value.
I played around with some groups of small arrays involving combinations of values ranging from 1 to 3 and 1 to 5, on the same scale as my data. Sometimes these returned valid results, and other times they yielded the same NaNs I get with my full dataset. I wonder if the issue has something to do with the total number or overall proportion of ties in the data; with Likert-type scale items there are obviously a lot of ties. I'd love your thoughts on whether this can be fixed to make analysis of this type of data possible. Thanks!!
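To make the suspicion about ties concrete, here is a rough pure-Python sketch of the textbook Dunn z-statistic on heavily tied Likert-style data (the group names and values below are made up for illustration, not taken from my survey). With the standard tie correction, sum over tied groups of (t³ − t) divided by 12(N − 1), the term under the square root stays nonnegative no matter how many ties there are, so a negative sqrt argument would point to the correction being computed differently in the package:

```python
from collections import Counter

# Made-up, heavily tied Likert-style groups (values 1-5).
groups = {
    'ctrl': [1, 2, 2, 3, 3, 3, 4, 5],
    'low':  [2, 2, 3, 3, 4, 4, 5, 5],
    'high': [1, 1, 2, 3, 3, 4, 4, 4],
}

pooled = sorted(v for g in groups.values() for v in g)
n = len(pooled)

# Midranks: tied observations share the average of their rank positions.
rank_of = {}
start = 0
while start < n:
    end = start
    while end < n and pooled[end] == pooled[start]:
        end += 1
    rank_of[pooled[start]] = (start + 1 + end) / 2  # 1-based midrank
    start = end

# Variance term and the standard tie correction, sum(t^3 - t) / (12(n - 1)).
A = n * (n + 1) / 12
counts = Counter(pooled)
x_ties = sum(t**3 - t for t in counts.values()) / (12 * (n - 1))

# With this correction, A - x_ties is nonnegative (it reaches zero only
# when every pooled observation is identical), so sqrt((A - x_ties) * B)
# never receives a negative argument.
assert A - x_ties >= 0

# z-statistic for one pair of groups, e.g. 'ctrl' vs 'low'.
ri = sum(rank_of[v] for v in groups['ctrl']) / len(groups['ctrl'])
rj = sum(rank_of[v] for v in groups['low']) / len(groups['low'])
B = 1 / len(groups['ctrl']) + 1 / len(groups['low'])
z = (ri - rj) / ((A - x_ties) * B) ** 0.5
print(z)
```

Even in the extreme case where every observation is identical, the correction only equals the variance term rather than exceeding it, so the NaNs presumably come from how the package computes x_ties rather than from the ties themselves.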
Hi Maksim, thanks for agreeing to take a look, and sorry for the delay in getting back to you. I've attached an Excel file (GitHub wouldn't let me upload a CSV) that you can hopefully load into pandas as a dataframe to replicate the results I am getting. There are four columns: I've tried to use the first three ('Question A', 'Question B', 'Question C') as the 'val_col' in the posthoc_dunn function, while using the 'Treatment' column as the 'group_col'. Please let me know if that doesn't work for you, and thanks in advance.
Andrea, thank you for submitting your data. I have already fixed this bug and released a new version of the package. Please update and try it out. Tell me if it works for you and I'll close the issue here.