-
-
Notifications
You must be signed in to change notification settings - Fork 5.7k
Description
I noticed this a couple of years ago and posted it to Discourse. My observations repeated here:
I don’t know much about hash table implementations, so maybe this is a known weakness of the algorithm chosen. I did notice ndel is only ever increased (in _delete!), then used as a condition for rehashing (if more than 3/4 slots are deleted), and then reset to 0 on rehash. Does that mean deleted slots are never re-used? I thought commonly deleted slots are treated as occupied on probe, and empty for reuse on insert.
DataStructures.RobinDict doesn’t exhibit the same behavior.
Code for validation:
using BenchmarkTools
using DataStructures
function foo(d, reps)
for i in 1:reps
d[i] = 1
delete!(d, i)
end
nothing
end
@btime foo($(Dict{Int, Int}()), 1000)
# 24.200 μs (249 allocations: 42.80 KiB)
@btime foo($(RobinDict{Int, Int}()), 1000)
# 33.101 μs (0 allocations: 0 bytes)I've confirmed this behavior on 1.7, 1.8, and 1.9. It's forced us to use RobinDict for no other reason than this allocation issue, and it's not quite as well supported and tested (we hit a nasty infinite loop bug with it today).