ENH: Added iqr function to compute IQR metric in scipy/stats/stats.py #5808

Closed
wants to merge 2 commits into from

Conversation

madphysicist
Contributor

Transfer of numpy PR numpy/numpy#7137.

This function is different from the IQR in statsmodels. It appears to have different inputs (one array versus two), which actually makes this version more general.

As per @rgommers's request, I have added nan_policy, which basically just selects between np.percentile and np.nanpercentile. Since there were some bugs in the shape of arrays returned by np.nanpercentile, I have added a workaround that should eventually be removed.
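A minimal sketch of the selection described above, assuming the usual `scipy.stats` policy names ('propagate', 'omit', 'raise'). The function name `simple_iqr` is hypothetical and this is not the PR's code; in particular it omits the `np.nanpercentile` shape workaround.

```python
import numpy as np


def simple_iqr(x, nan_policy="propagate"):
    """Illustrative IQR only; not the function added in this PR."""
    x = np.asarray(x, dtype=float)
    if nan_policy == "propagate":
        percentile_func = np.percentile      # a NaN in the input yields NaN
    elif nan_policy == "omit":
        percentile_func = np.nanpercentile   # NaNs are ignored
    elif nan_policy == "raise":
        if np.isnan(x).any():
            raise ValueError("The input contains nan values")
        percentile_func = np.percentile
    else:
        raise ValueError("nan_policy must be 'propagate', 'omit' or 'raise'")
    # Difference between the 75th and 25th percentiles.
    q25, q75 = percentile_func(x, [25, 75])
    return q75 - q25
```

With `[1, 2, 3, 4, 5, nan]` the 'propagate' policy returns NaN, while 'omit' computes the IQR of the five finite values.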

As per @josef-pkt's request, I have added a 'scale' argument that accepts 'raw' (equivalent to 1.0), 'normal' (equivalent to sp.special.erfinv(0.5) * 2 * sp.sqrt(2) ~= 1.349), or any other number. The result is divided by the scale. The default is 'raw'.
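A sketch of how the scaling could work, under stated assumptions: the names `NORMAL_IQR_SCALE`, `resolve_scale` and `scaled_iqr` are hypothetical, and the constant is computed with the standard library's `NormalDist().inv_cdf(0.75)`, which is mathematically the same quantity as `erfinv(0.5) * 2 * sqrt(2)` (the IQR of a standard normal distribution).

```python
from statistics import NormalDist

import numpy as np

# IQR of a standard normal: 2 * Phi^{-1}(0.75) = 2 * sqrt(2) * erfinv(0.5)
NORMAL_IQR_SCALE = 2 * NormalDist().inv_cdf(0.75)

_NAMED_SCALES = {"raw": 1.0, "normal": NORMAL_IQR_SCALE}


def resolve_scale(scale="raw"):
    """Turn a named or numeric scale into a float divisor (illustrative)."""
    if isinstance(scale, str):
        if scale not in _NAMED_SCALES:
            raise ValueError("{!r} is not a valid scale".format(scale))
        return _NAMED_SCALES[scale]
    return float(scale)


def scaled_iqr(x, scale="raw"):
    """IQR divided by the requested scale (illustrative)."""
    q25, q75 = np.percentile(x, [25, 75])
    return (q75 - q25) / resolve_scale(scale)
```

Dividing by the 'normal' scale turns the IQR into an estimate of the standard deviation for normally distributed data.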

The tests have been transferred almost as-is from numpy. However, if the numpy version being used is < 1.11.0, tests involving interpolation='midpoint' are turned off because bug-fix numpy/numpy#7129 and its backport numpy/numpy#7166 would not be in effect.
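One way such a version gate might look, sketched with `unittest` rather than the test framework the PR actually uses. `version_tuple`, `NUMPY_HAS_MIDPOINT_FIX` and the test class are hypothetical names, and the test body is only a placeholder for the real midpoint checks.

```python
import re
import unittest

import numpy as np


def version_tuple(version):
    """Parse the leading 'major.minor[.patch]' of a version string."""
    match = re.match(r"(\d+)\.(\d+)(?:\.(\d+))?", version)
    if match is None:
        raise ValueError("unrecognized version: {!r}".format(version))
    return tuple(int(part or 0) for part in match.groups())


# Per the description above, the midpoint fix is assumed present from 1.11.0.
NUMPY_HAS_MIDPOINT_FIX = version_tuple(np.__version__) >= (1, 11, 0)


class MidpointTests(unittest.TestCase):
    @unittest.skipUnless(NUMPY_HAS_MIDPOINT_FIX,
                         "interpolation='midpoint' is unreliable before 1.11.0")
    def test_midpoint_placeholder(self):
        # The real tests would exercise interpolation='midpoint' here.
        self.assertTrue(NUMPY_HAS_MIDPOINT_FIX)
```

Comparing tuples of integers avoids the string-comparison trap where '1.9' sorts after '1.11'.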

@rgommers added the scipy.stats and enhancement labels Feb 4, 2016
`scipy.stats.iqr` function
~~~~~~~~~~~~~~~~~~~~~~~~~~

Computes the interquatrile region of a distribution.
Member

typo "interquartile"

@madphysicist
Contributor Author

@rgommers I have addressed all of your comments. Additionally, I wrapped all the docs to 72 characters except a couple of lines where that did not seem appropriate, and removed the reversed keyword to sorted along with the pointless comment above it.

@madphysicist force-pushed the iqr-function branch 4 times, most recently from 8b206da to 1dbb236, February 5, 2016 15:30
@madphysicist
Contributor Author

I am trying to think of how to handle older versions of numpy. Would a warning that "interpolation will be ignored as it is not supported for numpy versions < 1.9" be acceptable/sufficient?

@madphysicist force-pushed the iqr-function branch 6 times, most recently from e21fc2b to 20357f5, February 8, 2016 13:27
@argriffing
Contributor

This PR is aesthetically scary but I hope it can be merged. I'm not quite brave enough to merge this one unilaterally.

try:
    result = np.percentile(x, q, axis=axis, keepdims=keepdims,
                           interpolation=interpolation)
except TypeError:
Member

Can you comment on the circumstances under which this will happen? It will help the next dev understand what can give rise to this and what needs to be modified.
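One plausible circumstance, going by the numpy version discussion earlier in this thread, is an older `np.percentile` that does not accept the newer keywords and so raises `TypeError` for an unexpected keyword argument. The sketch below shows that fallback pattern with a hypothetical stand-in, `legacy_percentile`, since a modern numpy cannot reproduce the old signature; `percentile_compat` is likewise a made-up name and not the PR's code.

```python
import warnings

import numpy as np


def legacy_percentile(a, q, axis=None):
    """Hypothetical stand-in for an old percentile lacking newer keywords."""
    return np.percentile(a, q, axis=axis)


def percentile_compat(func, a, q, axis=None, **newer_kwargs):
    """Call func with newer keywords, retrying without them on TypeError."""
    try:
        return func(a, q, axis=axis, **newer_kwargs)
    except TypeError:
        # Raised when func does not know one of the newer keyword arguments.
        warnings.warn("newer percentile keywords are not supported and "
                      "will be ignored", RuntimeWarning)
        return func(a, q, axis=axis)
```

A bare `except TypeError` also swallows unrelated type errors (for example, non-numeric input), which is one reason to document exactly which numpy versions trigger the fallback.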

@larsoner
Member

Other than my nitpicks LGTM. @argriffing I see what you mean about the aesthetics, the numpy bug/version stuff makes it tough. But it is well documented at least, so I can't think of a cleaner solution.

@argriffing
Contributor

Thanks for looking at this, @Eric89GXL. I'm OK with this being merged once your comments have been addressed.

@argriffing
Contributor

@madphysicist ping

@madphysicist
Contributor Author

Sorry for the delay. I'm just getting back to work after having kid #2. I will get to addressing the issues pretty soon. The PR is not dead and I have not forgotten about it.

@rgommers
Member

Congrats @madphysicist!

Added tests. The tests run through most of the aspects tested by
`numpy.percentile`, from which they are originally derived. Restrictions
apply to the functionality as the version of numpy decreases below
1.11.0. The tests and documentation make this explicit.
@madphysicist
Contributor Author

I have addressed all the comments via appropriate code changes. While I personally have a preference for using try...except KeyError for dictionary lookups despite the lengthy error tracebacks, I have bowed to public opinion and used if...not in... instead (this has been a point of contention in a couple of my numpy PRs as well).
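The two lookup styles side by side, as a small sketch. The dictionary `SCALES` and both function names are hypothetical, and 1.349 is the approximate 'normal' value quoted earlier in this thread.

```python
SCALES = {"raw": 1.0, "normal": 1.349}  # illustrative values only


def lookup_try_except(key):
    """try...except KeyError: one dictionary access on the happy path."""
    try:
        return SCALES[key]
    except KeyError:
        # On Python 3 the KeyError is chained to the new exception, so the
        # traceback shows both ("During handling of the above exception...").
        raise ValueError("{!r} is not a valid scale".format(key))


def lookup_not_in(key):
    """if...not in...: membership test first, no chained exception."""
    if key not in SCALES:
        raise ValueError("{!r} is not a valid scale".format(key))
    return SCALES[key]
```

Both raise the same `ValueError` for an unknown key; the difference is that the first carries the original `KeyError` as `__context__`, which is what lengthens the traceback.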

The aesthetics of this function are a good indication of the fact that it belongs in numpy rather than here. Perhaps if I ever get around to implementing my weighted percentile function, that will be enough motivation to allow the move.

@ev-br
Member

ev-br commented Jun 12, 2016

Needs a rebase

ev-br added a commit that referenced this pull request Jun 18, 2016
@ev-br
Member

ev-br commented Jun 18, 2016

All right, it's been okayed by Alex and Eric, so I rebased and merged it in 8346eea.
Thanks @madphysicist, all

@ev-br closed this Jun 18, 2016
@ev-br added this to the 0.18.0 milestone Jun 18, 2016
@madphysicist
Contributor Author

Thanks @ev-br. Sorry I did not respond earlier. Been offline for a couple of months since kid #2 was born.

@ev-br
Member

ev-br commented Jun 20, 2016

Congratulations @madphysicist !

@madphysicist deleted the iqr-function branch January 6, 2022 23:55
Labels
enhancement A new feature or improvement scipy.stats
6 participants