sort!(::Vector{Union{T, Missing}}) could be more efficient #27781
Comments
Not really a regression, right? Since this was previously definitely not efficient. |
The difference is that
and then uses Maybe we could treat What I mean is: when calling
and then call |
As people have spotted on the Discourse thread, a first issue is that merge sort is used instead of quick sort. This explains the allocations, but using quick sort "only" reduces the time by one third. Probably still worth doing though, see #27789.
julia> @btime sort!(y, alg=QuickSort) setup=(y = rand(100));
460.335 ns (0 allocations: 0 bytes)
julia> @btime sort!(y, alg=MergeSort) setup=(y = rand(100));
665.809 ns (3 allocations: 608 bytes)
julia> @btime sort!(y2, alg=QuickSort) setup=(y = rand(100); y2 = ifelse.(rand(length(y)) .< 0.9, y, missing));
1.861 μs (0 allocations: 0 bytes)
julia> @btime sort!(y2, alg=MergeSort) setup=(y = rand(100); y2 = ifelse.(rand(length(y)) .< 0.9, y, missing));
1.918 μs (2 allocations: 624 bytes) |
Good catch @haampie! I guess we could try extending |
But is there a safe way to do so? Reinterpreting the vector will not work:
julia> reinterpret(Float64, Vector{Union{Missing,Float64}}(rand(100)))
ERROR: ArgumentError: cannot reinterpret `Union{Missing, Float64}` `Float64`, type `Union{Missing, Float64}` is not a bits type |
No, we'd need to access the data part of the array, which is a hidden implementation detail. But I hope we can avoid that. |
I've just tested a quick hack like this:
@@ -1021,7 +1021,7 @@ right(o::Perm) = Perm(right(o.order), o.data)
 lt(::Left, x::T, y::T) where {T<:Floats} = slt_int(y, x)
 lt(::Right, x::T, y::T) where {T<:Floats} = slt_int(x, y)
-isnan(o::DirectOrdering, x::Floats) = (x!=x)
+isnan(o::DirectOrdering, x::Union{Floats,Missing}) = (x === missing || x!=x)
@@ -1082,7 +1082,7 @@ end
 fpsort!(v::AbstractVector, a::Sort.PartialQuickSort, o::Ordering) =
     sort!(v, first(axes(v,1)), last(axes(v,1)), a, o)
-sort!(v::AbstractVector{<:Floats}, a::Algorithm, o::DirectOrdering) = fpsort!(v,a,o)
+sort!(v::AbstractVector{<:Union{Floats, Missing}}, a::Algorithm, o::DirectOrdering) = fpsort!(v,a,o)
AFAICT it gives correct results for normal values, but mixes
julia> @btime sort!(y2, alg=QuickSort) setup=(y = rand(100); y2 = ifelse.(rand(length(y)) .< 0.9, y, missing));
796.571 ns (0 allocations: 0 bytes) |
The correct behavior based on |
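For reference, here is a small sketch of how `isless` (the comparison `sort` uses by default) orders `NaN` and `missing` relative to ordinary floats. The variable names are only illustrative.

```julia
# isless puts NaN after every other Float64, and missing after everything.
v = Union{Float64,Missing}[missing, NaN, 2.0, 1.0]

nan_before_missing = isless(NaN, missing)    # true
missing_before_nan = isless(missing, NaN)    # false

sorted = sort(v)                             # [1.0, 2.0, NaN, missing]
```

So any fast path for `Union{Float64, Missing}` has to keep `NaN` values grouped ahead of the `missing` values to agree with the generic `isless`-based sort.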
I've prepared a PR at #27817. It's about twice as fast as the current |
What about larger vectors? I tried it with ~1000 elements, and that seemed to give the largest performance gap. |
You mean compared with R? Actually I could have mentioned that "50% slower" referred to a comparison for a vector with 10M entries (as in the original Discourse post), to limit the variability of measurements. |
I opened issue #27831 with a question about the reinterpret trick, because I feel that's the way to tackle this problem in general. |
So, the consensus of #27831 seems to be that something nice & performant is not yet around, but something quick & dirty will work:
# Move all missing values to the end of v; return the number of non-missing values.
@inline function partition_missing!(v::Vector{Union{T,Missing}}) where {T}
    isempty(v) && return 0  # avoid the unchecked read of v[1] below on an empty vector
    i, j = 1, length(v)
    @inbounds while true
        while i < j && v[i] !== missing; i += 1; end
        while i < j && v[j] === missing; j -= 1; end
        i >= j && break
        v[i], v[j] = v[j], v[i]
        i += 1; j -= 1
    end
    @inbounds return v[i] === missing ? i - 1 : i
end
function my_sort!(v::Vector{Union{T,Missing}}) where {T}
    m = partition_missing!(v)
    # Relies on the data part of an isbits-Union array being stored contiguously as T
    # (an implementation detail); keep v rooted while the unsafe wrapper is in use.
    GC.@preserve v begin
        w = unsafe_wrap(Array, Ptr{T}(pointer(v)), m)
        sort!(w)
    end
    v
end
### test
using Test
function test_things()
    @test issorted(my_sort!([missing, 3, 2, 10, missing, 4]))
    @test issorted(my_sort!(Vector{Union{Int,Missing}}([missing, missing])))
    @test issorted(my_sort!(Vector{Union{Int,Missing}}([4,2,6,2,9])))
    @test isempty(my_sort!(Union{Int,Missing}[]))
end
### bench
using BenchmarkTools
using Random
function bench(T::Type = Int, n = 1_000)
    y = rand(T, n)
    # Replace roughly 10% of the entries with missing, independently of T
    vec = ifelse.(rand(n) .< 0.9, y, missing)
    bench_new = @benchmark my_sort!(z) setup = (z = copy($vec))
    bench_curr = @benchmark sort!(z) setup = (z = copy($vec))
    bench_notmissing = @benchmark sort!(z) setup = (z = copy($y))
    bench_new, bench_curr, bench_notmissing
end
gives
So, sorting a vector with missing data is actually faster than sorting a vector without missing values :). Probably has to do with the fact that we don't have to sort 10% of the data when moving the |
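As a point of comparison for the `unsafe_wrap` approach above, here is a minimal sketch of the same "separate the missings, then sort the rest as plain `T`" idea that stays within safe APIs by copying the non-missing values into a concretely typed buffer. The name `sort_missing_last!` is hypothetical (it is not part of Base), and because of the extra allocation it is not expected to match the speed of the in-place version.

```julia
# Hypothetical helper, not part of Base: sorts v in place with missings last,
# using a temporary Vector{T} so the actual sort runs on a concrete element type.
function sort_missing_last!(v::Vector{Union{T,Missing}}) where {T}
    buf = collect(skipmissing(v))   # Vector{T} holding only the non-missing values
    sort!(buf)                      # fast path for plain T
    n = length(buf)
    copyto!(v, 1, buf, 1, n)        # sorted values go to the front
    v[n+1:end] .= missing           # missings fill the tail
    return v
end

result = sort_missing_last!(Union{Int,Missing}[missing, 3, 2, 10, missing, 4])
```

This trades one allocation of size `count(!ismissing, v)` for not depending on the memory layout of isbits-Union arrays.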
Fascinating! ;-) That approach sounds promising to me, maybe it can handle missing values in general and avoid the need for #27817. One question is to find what types |
I don’t think we really need a completely general solution to this right away, it would be fine to hack it in for common important bits types like ints and floats. |
That's not the right predicate, but the simplest is maybe |
Sure. What I was wondering is whether we need special code in |
My guess is that the above might not be noticeably slower than #27817, maybe even faster? |
Yes, it should be similar, though moving |
Yes, I'm sorry for not pursuing this right away. I'll give it a shot this weekend! |
Gently bumping this. Would be nice to get something along these lines into 1.7. |
I've rebased #27817. Given that we haven't made any progress on the more general solution for more than two years, it's probably a good idea to merge it to make common cases fast at least. EDIT: and anyway #27817 works for any |
This is still an issue in 1.9.0-DEV right now for non-floating point unions with missing. |
using BenchmarkTools
@btime sort!(x) setup=(x=vcat([(rand(),) for x in 1:10^5]));
# 8.858 ms (2 allocations: 390.67 KiB)
@btime sort!(x) setup=(x=vcat([(rand(),) for x in 1:10^5], [missing]));
# 15.823 ms (2 allocations: 439.55 KiB) |
sort!(::Vector{Union{Float64, Missing}}) allocates unexpectedly in v0.7-beta
sort!(::Vector{Union{T, Missing}}) could be more efficient
Opening an issue as suggested here: https://discourse.julialang.org/t/with-missings-julia-is-slower-than-r/11838/17?u=rdeits
Sorting a Float64 array in-place is quite fast and non-allocating:
but sorting an array of Union{Float64, Missing} in-place allocates a new copy and is about 2x slower, regardless of the input size: