Member

nalimilan commented Feb 13, 2021

This used to make `getindex` faster, but that is no longer the case on Julia 1.6. Since it triggers one allocation per level, it also makes `getindex` with slices slower when pools are large.

master, Julia 1.6:

using CategoricalArrays, BenchmarkTools
function sumequals(A::AbstractArray, v::CategoricalValue)
    n = 0
    @inbounds for x in A
        n += isequal(x, v)
    end
    n
end

ca = CategoricalArray(repeat(string.('A':'J'), outer=1000));
mca = CategoricalArray(repeat([missing; string.('A':'J')], outer=1000));


julia> @btime sumequals(ca, ca[1]);
  30.187 μs (1 allocation: 16 bytes)

julia> @btime sumequals(mca, $(first(skipmissing(mca))));
  42.165 μs (1 allocation: 16 bytes)

PR, Julia 1.6:

julia> @btime sumequals(ca, ca[1]);
  30.175 μs (1 allocation: 16 bytes)

julia> @btime sumequals(mca, $(first(skipmissing(mca))));
  28.608 μs (1 allocation: 16 bytes)

PR, Julia 1.5.3:

julia> @btime sumequals(ca, ca[1]);
  20.152 μs (2 allocations: 48 bytes)

julia> @btime sumequals(mca, $(first(skipmissing(mca))));
  82.890 μs (10001 allocations: 312.52 KiB)

Member

bkamins commented Feb 14, 2021

If the compiler is able to handle this, that is great. I also understand that it should speed things up when creating or slicing a CategoricalArray, as it reduces its memory footprint. Right?

@nalimilan
Member Author

Yes -- but mainly with large pools.

nalimilan merged commit 7e840a3 into master Mar 31, 2021
nalimilan deleted the nl/valindex branch March 31, 2021 20:36