StatsBase.mad computes the wrong median absolute deviation #905

Open
astrobc1 opened this issue Dec 15, 2023 · 1 comment

Comments

@astrobc1

By default, StatsBase.mad computes a "normalized" median absolute deviation: the raw MAD is divided by quantile(Normal(), 3/4) ≈ 0.6745 (equivalently, multiplied by about 1.4826). The default should be the plain MAD, not an estimator of the standard deviation. The current default is unexpected, and the scaling only yields a consistent estimate of the standard deviation for normally distributed data, so it does not apply to most distributions anyway.

@aplavin
Contributor

aplavin commented Feb 15, 2024

Also got caught by this!
Median absolute deviation is a well-defined statistic that doesn't include the normalization factor. Other software, such as SciPy, MATLAB, and Mathematica, follows the definition (of course) and computes median(abs.(x .- median(x))). It would be totally sensible for Julia to follow the definition too.
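
To make the difference concrete, here is a minimal sketch that uses only the Statistics standard library. The names raw_mad, scaled_mad, and NORMAL_SCALE are hypothetical helpers for illustration, not StatsBase API, and 1.4826 is a rounded approximation of 1 / quantile(Normal(), 3/4). Going by this thread, passing normalize=false to StatsBase.mad should give the first value and the current default should give something close to the second.

```julia
using Statistics

# Raw median absolute deviation, following the textbook definition.
raw_mad(x) = median(abs.(x .- median(x)))

# Rounded approximation of 1 / quantile(Normal(), 3/4); hard-coded here
# (an assumption for this sketch) to avoid depending on Distributions.
const NORMAL_SCALE = 1.4826

# "Normalized" variant: rescales the raw MAD so that it estimates the
# standard deviation when the data are normally distributed.
scaled_mad(x) = NORMAL_SCALE * raw_mad(x)

x = [1.0, 2.0, 3.0, 4.0, 100.0]

println(raw_mad(x))     # median of [2, 1, 0, 1, 97] is 1.0
println(scaled_mad(x))  # 1.4826
```

The two results differ by a constant factor, so anyone expecting the textbook statistic gets a value about 48% too large unless the normalization is switched off explicitly.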

Such behavior can only be changed in a breaking version, because it's documented. StatsBase does have breaking versions from time to time; can this be added to a list for the next potential breaking version? This amounts to changing the default value from normalize=true to normalize=false.

I think the reason for this behavior in StatsBase is mostly historical. See earlier issues on this topic:
#97
#347
