p_direction: return as 0.5 - 1 float or keep as 50 - 100 percentage? #168
Comments
|
I remember we already had this discussion about pd at the very beginning |
|
and also it's indeed probably too late :) However, we could emphasize that although it is called the probability of direction, it is usually reported as a percentage. The two are equivalent in the end, and since the print method shows a percentage sign, I think it's okay to keep things as they are.
I think the print-method should do all the work of "transformation", while the raw data is a value between 0 and 1. The current behaviour will most likely cause confusion if users further process the return-results of …
|
but |
|
I think I would prefer renaming the pd to "percentage of direction" rather than changing the return
|
Yes, maybe the confusion is due to the wording. First, "probability" suggests a value between 0 and 1; second, I can't recall many functions that really take a "percentage" value as input, so I typically expect, and am used to, values between 0 and 1 (ratios) rather than "percentages". Even arguments that are labelled as "percentage values" take values between 0 and 1 (see …). I think these points are responsible for my initial confusion here.
This was also why ci's level originally took a 0-100 value. I understand the issue, but since the probability/percentage of direction is a new thing anyway (to my knowledge it has not yet been formalized as such), people who use it will know its output, and it's very unlikely that they would have strong priors about what it returns. So I think it's safe... In general, I am not sure the print methods should do any transformation, as a discrepancy between what is displayed and what is returned is definitely a way to confuse people (I was really annoyed, for instance, by the fact that BayesFactor displays regular BFs but returns log(BFs), but that's maybe just me :)
|
Ok, I think we can close this issue then, maybe we just need some small refining of the docs, but... (see below, last paragraph)
I think this is one of the main benefits of having print-methods, as long as the conversion is correct (i.e. a proportion of .5 in the data is printed as 50%). I think there's a difference between what you process in further analyses / scripts / whatever (the .5) and the requirements of a "human-readable" output (50%). Mathematically, all computations would proceed with .5, I think. :-) To nitpick, the returned data values (in the data frame or as a value) are 80 or 100, and not "80%" or "100%", so formally, the return results from …
|
The more I think about it, the more I lean toward revising the code and using "ratios" as values, not integers. We're currently still at a point where we can turn back, but I guess this chance will be over soon. I'm open to being convinced otherwise, though, especially if there are good arguments and/or a majority vote to preserve the current state. :-)
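The split discussed here, a proportion as the stored value and a percentage only in the printed output, can be sketched roughly as follows. This is an illustrative Python sketch, not bayestestR code; the class name `DirectionProbability` and its field are hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class DirectionProbability:
    """Hypothetical container: stores a proportion, displays a percentage."""

    value: float  # stored on the 0.5-1 scale

    def __str__(self) -> str:
        # Formatting only; the stored value is left untouched.
        return f"pd = {self.value:.2%}"


pd_result = DirectionProbability(0.8)
print(pd_result)            # human-readable display with a percent sign
print(pd_result.value * 2)  # further computation uses the raw proportion
```

With this layout, the displayed and stored values differ only by formatting, so the conversion cannot drift out of sync with the data.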
|
It's not integers, it's percentages haha, you're too nitpicky! Anyway, I am traveling today so I'll be out for 24h; please refrain from changing anything until then. I'll think about this while being high (literally high) :)
|
I won't revise any code; this is a change with consequences, and it is the maintainer's responsibility.
|
Moreover, there is no risk of confusing the output of the percentage since the pd output is never < 1 (as it is between 50 and 100). However, I am now wondering about - for consistency and readability - outputting the ROPE Percentage as a percentage... |
|
Today I feel a bit less categorical about this (did you cast some dark magic rituals while I was asleep?)
|
@pdwaggoner @mattansb @humanfactors @IndrajeetPatil |
|
Proportion makes more sense to me? But I don't feel strongly one way or the other |
|
I feel relatively strongly (pd 99%) that the underlying stored value should be a decimal ratio (e.g., …). One really, really convincing reason to do this is that an individual may want to multiply the … This is just incorrect, and we don't want to create functions which create 'invisible' errors like this.
So I think float makes most sense. |
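To make the 'invisible error' concern concrete, here is a minimal Python sketch. It is not bayestestR code, and the function name, variable names and numbers are made up for illustration. It shows how a value stored on the 50-100 scale silently produces an out-of-range result in ordinary probability arithmetic, while the same quantity on the 0.5-1 scale behaves as expected.

```python
# Illustrative sketch only: names and numbers are hypothetical.

def joint_probability(p_a: float, p_b: float) -> float:
    """Probability that two independent events both occur."""
    return p_a * p_b


pd_as_proportion = 0.80  # 0.5-1 scale
pd_as_percentage = 80.0  # 50-100 scale, same quantity
other_probability = 0.5

ok = joint_probability(pd_as_proportion, other_probability)
wrong = joint_probability(pd_as_percentage, other_probability)

print(ok)     # stays within [0, 1]
print(wrong)  # falls outside [0, 1], yet no error or warning is raised
```

Nothing in the second call signals a problem, which is the kind of silent mistake the float representation avoids.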
This is a strong argument indeed. We can change it for v3. @strengejacke you can sleep easy |
|
Not sure, the plot-method in see might be affected, but we can update the packages shortly after each other, so there won't be any problems for too long.
I hope you meant 0.3.0 :-P But... you re-opened this pandora's box. :-) |
|
Agreed - proportions are typically how I think about this stuff, but like @mattansb , I don't have a strong opinion here. |

Not sure, probably it's too late, but I just realized that pd() returns proportions multiplied by 100 instead of, say, .65 for 65%.
Accordingly, convert_pd_to_p() requires a number between 1 and 100, where I would expect a number between 0 and 1 (the convention for probabilities; I think we had this discussion with ci_level as well...).
Any thoughts? Should we keep the current behaviour, or try breaking changes to the functions?
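For reference, the scale question matters directly for a pd-to-p conversion. The sketch below is illustrative Python, not the package's convert_pd_to_p(); the function name `pd_to_p_sketch` and its `scale` argument are hypothetical. It assumes the commonly used two-sided relationship p = 2 * (1 - pd), with pd on the 0-1 scale, and rejects inputs that are on the wrong scale instead of silently returning nonsense.

```python
def pd_to_p_sketch(pd: float, scale: str = "proportion") -> float:
    """Two-sided p-value equivalent for a probability of direction.

    `scale` is a hypothetical argument: "proportion" expects 0.5-1,
    "percentage" expects 50-100.
    """
    if scale == "percentage":
        pd = pd / 100.0
    elif scale != "proportion":
        raise ValueError("scale must be 'proportion' or 'percentage'")
    if not 0.5 <= pd <= 1.0:
        raise ValueError("pd must lie in [0.5, 1] on the proportion scale")
    return 2.0 * (1.0 - pd)


print(pd_to_p_sketch(0.975))                      # proportion input
print(pd_to_p_sketch(97.5, scale="percentage"))   # same quantity as a percentage
```

Passing 97.5 without declaring the percentage scale raises a ValueError here, which is one way to surface the mismatch described above.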