Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

improve quantile: reduce allocations, use partial sort #14413

Merged
merged 1 commit into from
Dec 16, 2015

Conversation

simonbyrne
Copy link
Contributor

This improves the performance of quantile by reducing allocations and using a partial sort. I see a 3x improvement for large cases.

It should be mostly non-breaking, but it does slightly change the behaviour of values outside the interval [0,1] (the previous version was a bit more lax), but I don't think this should prevent backporting.

@tkelman
Copy link
Contributor

tkelman commented Dec 15, 2015

Potential behavior changes like this are why I generally don't backport anything until I get a chance to run PackageEvaluator with the change included.

@tkelman
Copy link
Contributor

tkelman commented Dec 15, 2015

There should also be tests of any corner cases, if there aren't yet.

@StefanKarpinski
Copy link
Sponsor Member

👍

@simonbyrne
Copy link
Contributor Author

Good point, I'll add some more tests.

By the way, I forgot to mention that this add float(::Type{T}) methods, returning the type of the float(::T) (or Any, if T is not a subtype of number). I can get rid of this if it's too awkward (though perhaps think of it as similar to widen, which has similar behaviour).

@ViralBShah
Copy link
Member

This performance improvement may be worth capturing in NEWS.md.

"""
quantile!([q, ] v, p; sorted=false)

Compute the quantile(s) of a vector `v` at the probabilities `p`, with optional output into array `q` (if not provided, a new output array is created). The keyword argument `sorted` indicates whether `v` can be assumed to be sorted; if `false` (the default), then the elements of `v` may be partially sorted.
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

line breaks

and should be sure the rst sig is consistent, and rerun genstdlib

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, right. I haven't edited the docs since the a-doc-alypse.

@simonbyrne simonbyrne force-pushed the sb/quantile branch 2 times, most recently from ff97cdd to 7bd187f Compare December 16, 2015 12:26
@simonbyrne
Copy link
Contributor Author

Okay, rebased. Hopefully this should all be good to go.

function quantile!(v::AbstractVector, q::AbstractVector)
isempty(v) && throw(ArgumentError("empty data array"))
isempty(q) && throw(ArgumentError("empty quantile array"))
doc"""
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

#14378 moved almost completely away from doc""". Latex is now denoted by double backticks.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, good to know. Fixed.

jakebolewski added a commit that referenced this pull request Dec 16, 2015
improve quantile: reduce allocations, use partial sort
@jakebolewski jakebolewski merged commit 2778a52 into master Dec 16, 2015
@tkelman tkelman deleted the sb/quantile branch December 16, 2015 23:18

function bound_quantiles(qs::AbstractVector)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

hopefully no one was using this? only public code on github I can find that isn't a julia fork is StatsBase's own copy, so should be okay

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It wasn't exported, so I think it is safe to remove.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

on master yes, but not necessarily for a backport

simonbyrne added a commit that referenced this pull request Jan 9, 2016
@ivarne ivarne mentioned this pull request Apr 11, 2016
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants