
WIP: try ordered Dict representation #10116

Closed · wants to merge 1 commit

Conversation

JeffBezanson
Sponsor Member

This implements the representation mentioned in #10092, storing keys and values in dense ordered arrays, with a sparse(r) index vector. There are a couple TODOs in the code. This is expected to be a bit slower for deletion-heavy workloads, but I haven't benchmarked that yet.

My benchmark code is at https://gist.github.com/JeffBezanson/ea224aa45e2242ca3ee9 .
I took some benchmarks from past Dict performance issues, and added a couple more. The performance characteristics are interesting. Highlights:

  • Simply iterating over a Dict is 10x faster, since I decided to make the start function squeeze out deleted items first.
  • rehash! doesn't need to reallocate the keys and values arrays at all in theory, but we might need to in order to fix the problem of finalizers that remove keys during rehashing.
  • Seems to be faster for String keys, but unfortunately much slower for Int keys.

Profiling indicates that a huge amount of insertion time is spent on the line

        elseif si > 0 && isequal(key, keys[si])

My best guess is that this is because keys[si] is random access, where before we iterated through the slots and keys arrays in order. But if pointer indirection is involved anyway (e.g. String keys) we get a significant net speedup.
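The layout described above can be sketched like this (a minimal illustration with assumed struct and field names; this is not the code in the PR, and resizing, deletion, and rehashing are omitted for brevity):

```julia
# Hypothetical sketch of the ordered representation: keys and values live in
# dense, insertion-ordered arrays; `slots` is a sparser index mapping hash
# buckets to 1-based positions in those arrays (0 = empty bucket).
mutable struct SketchOrderedDict{K,V}
    slots::Vector{Int}
    keys::Vector{K}
    vals::Vector{V}
end

SketchOrderedDict{K,V}() where {K,V} =
    SketchOrderedDict{K,V}(zeros(Int, 64), K[], V[])

# Map a key's hash to a starting bucket in `slots`.
bucket(h::SketchOrderedDict, key) = Int(hash(key) % length(h.slots)) + 1

function Base.setindex!(h::SketchOrderedDict, v, key)
    index = bucket(h, key)
    while true
        si = h.slots[index]
        if si == 0                          # empty bucket: append a new entry
            push!(h.keys, key); push!(h.vals, v)
            h.slots[index] = length(h.keys)
            return h
        elseif isequal(key, h.keys[si])     # the random access profiled above
            h.vals[si] = v
            return h
        end
        index = index == length(h.slots) ? 1 : index + 1   # linear probing
    end
end

function Base.getindex(h::SketchOrderedDict, key)
    index = bucket(h, key)
    while true
        si = h.slots[index]
        si == 0 && throw(KeyError(key))
        isequal(key, h.keys[si]) && return h.vals[si]
        index = index == length(h.slots) ? 1 : index + 1
    end
end

Base.length(h::SketchOrderedDict) = length(h.keys)

# Iteration just walks the two dense arrays in insertion order, which is why
# iterating is so much faster than probing a slots table.
Base.iterate(h::SketchOrderedDict, i::Int = 1) =
    i > length(h.keys) ? nothing : (h.keys[i] => h.vals[i], i + 1)
```

The `isequal(key, h.keys[si])` check is exactly the random-access pattern blamed above for the Int-key slowdown: the probe walks `slots` sequentially, but each comparison jumps to an arbitrary position in `keys`.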

@StefanKarpinski
Sponsor Member

Sorry, didn't see the PR before writing that comment. This is pretty fascinating. If it's a win whenever there's pointer indirection, it's clearly a good tradeoff for Python, where that's almost always the case.

@JeffBezanson
Sponsor Member Author

Really big gains are possible if you need the ordering feature. For example with this change, setdiff can be rewritten as follows:

function setdiff(a, b)
    args_type = promote_type(eltype(a), eltype(b))
    bset = Set(b)
    seen = Set{args_type}()
    for a_elem in a
        !in(a_elem, bset) && push!(seen, a_elem)
    end
    collect(seen)
end

And I get:

setdiff:
elapsed time: 0.074992577 seconds (42 MB allocated, 2.35% gc time in 2 pauses with 0 full sweep)

which is 6x faster than master, and almost 10x faster than 0.3.

@StefanKarpinski
Sponsor Member

This seems like a fairly big win overall – if we can figure out a clever way to make the Int case (immediate keys in general) faster, then I would say that switching to this would be well worth it. Another benefit is that it's one less data structure to have – no need for a separate ordered dict implementation.

@StefanKarpinski
Sponsor Member

So, am I correct that the thing that's slow here for dicts of Ints is calling copy on them? That makes it seem like we really should be copying Dict objects more efficiently – why not just copy their internal arrays directly? Is the idea behind the current implementation of dict copying that rebuilding them is an equivalent amount of work to copying the internals and then rehashing?

That aside, I guess the real point of the benchmark is that inserting into an Int dict is 30% slower and iterating over an Int dict is also slower, making copy, which does both, a total of 70% slower.
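If keys and values are stored densely as discussed in this PR, copying could indeed reduce to duplicating the internal arrays instead of re-inserting every pair. A minimal sketch (the struct and field names here are illustrative assumptions, not Base's actual Dict):

```julia
# Hypothetical: with dense `keys`/`vals` arrays and a sparse `slots` index,
# copying a dict is just three array copies -- no per-entry rehashing.
struct MiniDict{K,V}
    slots::Vector{Int}
    keys::Vector{K}
    vals::Vector{V}
end

# The slot indices stay valid after copying because they point at positions
# in the dense arrays, and `copy` preserves those positions verbatim.
Base.copy(d::MiniDict) = MiniDict(copy(d.slots), copy(d.keys), copy(d.vals))
```

The same trick applies to any hash table whose index refers to positions rather than to the key objects themselves: the hash-dependent state is carried along unchanged.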

@JeffBezanson
Sponsor Member Author

Iterating over any dict, including Int dicts, is 10x faster since it becomes equivalent to iterating over two arrays. Mostly assigning is slower.

Yes, we could definitely copy more efficiently.

@timholy
Sponsor Member

timholy commented Feb 8, 2015

I rarely iterate over Dicts, so I'm curious to know how other workloads fare. For example, a way I often use Dicts is as "flexible sparse matrices," i.e.,

cachekey(i::Int, j::Int) = i <= j ? Pair(i, j) : Pair(j, i)
cache[cachekey(i, j)] = dot(vectors[i], vectors[j])

and sometime later if vectors[j] gets eaten, delete!(cache, key) for all keys associated with j.

I guess I should just check this branch out and benchmark it, but I won't have time for that for the next several days.

@JeffBezanson
Sponsor Member Author

If you send me a test case I will happily benchmark it!

@tkelman
Contributor

tkelman commented Feb 8, 2015

Some of the CI jobs appear to be running out of memory here

@quinnj
Member

quinnj commented Feb 8, 2015

julia> r = Dict(1=>"hey",2=>"ho")
Dict{Int64,ASCIIString} with 2 entries:
  1 => "hey"
  2 => "ho"

julia> r[1]
"hey"

julia> r[2]
"ho"

julia> delete!(r,1)
Dict{Int64,ASCIIString} with 1 entry:Error showing value of type Dict{Int64,ASCIIString}:
ERROR: UndefRefError: access to undefined reference
 in showdict at dict.jl:81
 in writemime at replutil.jl:34
 in display at REPL.jl:105
 in display at REPL.jl:108
 in display at multimedia.jl:149
 in print_response at REPL.jl:127
 in print_response at REPL.jl:112
 in anonymous at REPL.jl:588

julia> r
Dict{Int64,ASCIIString} with 1 entry:Error showing value of type Dict{Int64,ASCIIString}:
ERROR: UndefRefError: access to undefined reference
 in showdict at dict.jl:81
 in writemime at replutil.jl:34
 in display at REPL.jl:105
 in display at REPL.jl:108
 in display at multimedia.jl:149
 in print_response at REPL.jl:127
 in print_response at REPL.jl:112
 in anonymous at REPL.jl:588

@quinnj
Member

quinnj commented Feb 8, 2015

Excited about this!

@bfredl
Contributor

bfredl commented Feb 8, 2015

Interesting work! However, in one of my use cases, dict elements are deleted when going "down" a stack and then added back in reverse order when going back up (a simple back-tracking algorithm). That causes a (worst-case) 3x slow-down here, though I suppose this is an unavoidable consequence of the strict ordering semantics (if an element is removed and then immediately re-added, it must be pushed to the back of the order, right?).
code+data here (sorry for the messy code; it was an assignment where only speed mattered, not readability. Once it was faster than the C++/STL implementation I stopped working on it :) )
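The access pattern being described can be reconstructed roughly as follows (a hypothetical reconstruction, since the linked code is not included here):

```julia
# Hypothetical backtracking workload: entries are deleted while descending
# the stack, then re-inserted in reverse order while unwinding. In an
# insertion-ordered representation each re-inserted key is appended at the
# back of the dense arrays, so this pattern keeps churning deleted slots.
function backtrack_churn!(d::Dict{Int,Int}, n::Int)
    for i in n:-1:1          # descending: delete
        delete!(d, i)
    end
    for i in 1:n             # unwinding: re-insert in reverse order
        d[i] = i
    end
    return d
end

d = Dict(i => i for i in 1:10_000)
backtrack_churn!(d, 10_000)
```

Each delete/re-add round trip in an ordered representation moves the entry to the end of the dense arrays rather than reusing its old slot, which is consistent with the worst-case slowdown reported above.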

@JeffBezanson
Sponsor Member Author

Thanks for the test case, that's very helpful!

@JeffBezanson
Sponsor Member Author

Ok, I was able to get back most of the performance by removing the sizehint, which was causing excess memory allocation and initialization overhead. Seems not to be worth it.

@bfredl
Contributor

bfredl commented Feb 9, 2015

Nice! Still somewhat slower, but I guess this is an extreme case of (structural) mutation relative to access....

@timholy
Sponsor Member

timholy commented Feb 11, 2015

Sorry for being slow, but perhaps this will be useful:

function initialize(N, nnbrs)
    nbrs_list = [Set{Int}() for i = 1:N]
    dp = Dict{Pair{Int,Int},Float64}()
    for i = 1:N
        nnbr = rand(nnbrs)
        nbrs = nbrs_list[i]
        for j = 1:nnbr
            nbr = rand(1:N)
            push!(nbrs, nbr)
            dp[cachekey(i,nbr)] = rand()  # key on the neighbor id so eliminate! can delete it
        end
    end
    nbrs_list, dp
end

function eliminate!(nbrs_list, dp)
    N = length(nbrs_list)
    order = randperm(N)
    for i in order
        nbrs = nbrs_list[i]
        for j in nbrs
            delete!(dp, cachekey(i,j))
        end
        empty!(nbrs)
    end
end

cachekey(i, j) = i <= j ? Pair(i, j) : Pair(j, i)

nnbrs = 1:20
nbrs_list, dp = initialize(5, nnbrs)
eliminate!(nbrs_list, dp)
@time 1

N = 10^5
@time nbrs_list, dp = initialize(N, nnbrs)
@time eliminate!(nbrs_list, dp)

@hayd
Member

hayd commented May 22, 2015

Is it possible this could make 0.4? As this is awesome.

Does the Int case have to be special-cased for performance (e.g. using the old code)? Note: IntSet is special-cased (see #10065).

@kmsquire
Member

If this doesn't make 0.4, it might be worth replacing OrderedDict in
DataStructures.jl with this version.


@JeffBezanson
Sponsor Member Author

I'm starting to think we should go with this. Int keys are arguably not the most important case, and they're easy to special-case if necessary. Being faster for strings, faster iteration, and not needing a separate OrderedDict seem to be worth it.

@timholy
Sponsor Member

timholy commented May 26, 2015

If you try that benchmark I posted, I'd be curious to know how it fares.

@hayd hayd mentioned this pull request May 27, 2015
8 tasks
@JeffBezanson
Sponsor Member Author

The timings are a bit variable, but I got this on master:

   2.582 microseconds (155 allocations: 10845 bytes)
 522.997 milliseconds (1099 k allocations: 184 MB, 19.66% gc time)
 182.668 milliseconds (6 allocations: 781 KB)

and on this branch:

   2.451 microseconds (155 allocations: 10845 bytes)
 502.958 milliseconds (1170 k allocations: 152 MB, 22.28% gc time)
 188.579 milliseconds (6 allocations: 781 KB)

@JeffBezanson
Sponsor Member Author

Best time I've seen on master:

   2.916 microseconds (155 allocations: 10845 bytes)
 457.631 milliseconds (1099 k allocations: 184 MB, 22.73% gc time)
 172.746 milliseconds (6 allocations: 781 KB)

best time here:

   2.634 microseconds (155 allocations: 10845 bytes)
 466.899 milliseconds (1170 k allocations: 152 MB, 23.64% gc time)
 182.558 milliseconds (6 allocations: 781 KB)

I think it's at least safe to conclude there's no major regression.

@timholy
Sponsor Member

timholy commented May 30, 2015

LGTM!

@ScottPJones
Contributor

LGTM too... (and I do plan on using ordered Dicts heavily, so getting this in will be very nice)

@StefanKarpinski
Sponsor Member

I'm all for making this change. Ordered Dicts are very useful. One issue this raises is that Dicts would now be distinguished not just by their contents but also by their ordering. Should two Dicts be considered equal if they have different orders? While making Dicts ordered is non-breaking, changing their equality semantics is not.
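To make the equality question concrete: today `==` on Dicts compares contents only, so insertion order is invisible (a minimal illustration of the current behavior):

```julia
# Two dicts with the same pairs inserted in different orders. Under the
# existing AbstractDict definition, == compares contents and ignores any
# internal ordering -- the question is whether ordered Dicts should change that.
a = Dict{String,Int}(); a["x"] = 1; a["y"] = 2
b = Dict{String,Int}(); b["y"] = 2; b["x"] = 1

a == b   # true under content-based equality
```

Keeping content-based equality would make the ordering purely an iteration property; making equality order-sensitive would be a breaking change, as noted above.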

@ggggggggg
Contributor

Maybe worth merging soon? Seems like the discussion has died down and most people are pro.

@kmsquire
Member

FWIW, this implementation is the one used for OrderedDicts in DataStructures.jl.

@samoconnor
Contributor

Bump.
Can we have this, please?

@StefanKarpinski
Sponsor Member

I agree – we should probably do this and let various Dict APIs take advantage of it.

@samoconnor
Contributor

FWIW I agree with Stefan's agreement to the last "bump".

oxinabox pushed a commit to JuliaCollections/Heaps.jl that referenced this pull request Jan 5, 2018
* JuliaLang/julia#10116
* This is largely copy and paste + rename, fixups
* Offers good speed improvements in OrderedDict iteration,
  small improvements elsewhere
rofinn pushed a commit to JuliaCollections/Tries.jl that referenced this pull request Feb 27, 2018
* JuliaLang/julia#10116
* This is largely copy and paste + rename, fixups
* Offers good speed improvements in OrderedDict iteration,
  small improvements elsewhere
@Moelf
Sponsor Contributor

Moelf commented Oct 20, 2021

Looks like this has bit-rotted. What are people's thoughts after 4 years, given that we have it in DataStructures?

@oscardssmith
Member

IMO, this is a good change but we probably need to remake the branch.

@kmsquire
Member

Note that this implementation was ported to DataStructures.jl (and currently resides in OrderedCollections.jl, which DataStructures.jl re-exports).

@StefanKarpinski
Sponsor Member

I think this is probably a good idea.

@ViralBShah ViralBShah closed this Jan 18, 2022
@ViralBShah ViralBShah deleted the jb/ordereddict branch January 18, 2022 10:03
@Moelf
Sponsor Contributor

Moelf commented Jan 18, 2022

Is this closed because it's implemented somewhere? AFAICT this is still not done, and could/should be done.

@oscardssmith
Member

This is closed due to bit-rot. It still should be done, but it needs a new PR.

@ViralBShah
Member

Can't we just use it from DataStructures?

@oscardssmith
Member

We can copy that implementation, but we still need a new PR to copy it to Base so that all users of Dict get the speedup.

@timholy
Sponsor Member

timholy commented Jan 19, 2022

Tread cautiously about assuming the one in DataStructures is up to snuff, as it was split from Base years ago and I doubt it has seen a lot of development since. JuliaCollections/DataStructures.jl#234 in particular requires investigation.

Labels
performance Must go faster