Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Performance improvements for IdDict #51091

Open
wants to merge 16 commits into
base: master
Choose a base branch
from

Conversation

Zentrik
Copy link
Contributor

@Zentrik Zentrik commented Aug 28, 2023

The main improvement is from get! which no longer calculates the hash twice when it has to setindex and so is twice as fast. This is my first time working with the Julia C code so there might be some mistakes there but it seems to work fine for me.
This is useful for a global value numbering pass I've written.

@oscardssmith oscardssmith added the performance Must go faster label Aug 29, 2023
base/iddict.jl Outdated Show resolved Hide resolved
@JeffBezanson
Copy link
Sponsor Member

Thanks, any speedup numbers to whet the appetite?

I'm not really comfortable duplicating all the lookup code, and the try-catch around the conversion is not a good pattern to use. This needs a different design. Maybe we can copy how get! works for Dict.

@Zentrik
Copy link
Contributor Author

Zentrik commented Aug 30, 2023

Here's some performance numbers,

function f!(d)
    for x in 1:10^4
        d[x] = 1
    end
end

function g!(d)
   for x in 1:10^4
       get!(d, x, 1)
   end
end

On master

julia> versioninfo()
Julia Version 1.11.0-DEV.356
Commit 4768d7ab8fe (2023-08-28 12:54 UTC)
Build Info:
  Official https://julialang.org/ release
Platform Info:
  OS: Linux (x86_64-linux-gnu)
  CPU: 16 × AMD Ryzen 7 1700 Eight-Core Processor
  WORD_SIZE: 64
  LLVM: libLLVM-15.0.7 (ORCJIT, znver1)
  Threads: 1 on 16 virtual cores

julia> @btime f!(d) setup=(d=IdDict{Int, Int}(); sizehint!(d, 10^4))
  409.015 μs (9489 allocations: 148.27 KiB)

julia> @btime g!(d) setup=(d=IdDict{Int, Int}(); sizehint!(d, 10^4))
  639.763 μs (18978 allocations: 296.53 KiB)

On this pr

@btime f!(d) setup=(d=IdDict{Int, Int}(); sizehint!(d, 10^4))
  324.515 μs (9489 allocations: 148.27 KiB)

@btime g!(d) setup=(d=IdDict{Int, Int}(); sizehint!(d, 10^4))
  330.156 μs (9489 allocations: 148.27 KiB)

base/iddict.jl Outdated
if !(val isa V) # avoid a dynamic call
val = convert(V, val)::V
end
keyindex = ccall(:jl_eqtable_keyindex, Cssize_t, (Any, Any), d, key)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

why does this need to be implemented in C?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not sure how to call jl_table_assign_bp directly from Julia as it can change d.ht.

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There used to be ways, but not anymore with the generational GC. The usual way now to implement this is to switch the return object around as such:

Suggested change
keyindex = ccall(:jl_eqtable_keyindex, Cssize_t, (Any, Any), d, key)
keyindex = Ref(UInt(0))
d.ht = ccall(:jl_table_assign_bp, Any, (Any, Any, Ref{UInt}), d.ht, key, keyindex)
keyindex = keyindex[]

Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can you change this around still also? It will avoid us from needing to reflect the iddict struct into C

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just to note, I believe this would remove the ~20% speedup (see comments above) we get from not having to heap allocate for setindex!. Happy to make this change if that's fine.

test/dict.jl Show resolved Hide resolved
@Zentrik Zentrik force-pushed the perf-iddict branch 2 times, most recently from 75060fe to 74e0ca6 Compare October 29, 2023 00:26
base/iddict.jl Outdated Show resolved Hide resolved
base/iddict.jl Outdated Show resolved Hide resolved
Remove old comment

Switch to atomic pointers

Fixes errors in ASAN as previously we did atomic stores to non atomic pointers

Split empty! change into seperate pr

Remove duplication in C code

Add tests

15% speed up for setindex!

Remove try catch

Signifcant performance improvement for path where error used to be caught.

Fix bug

Actually fix bug

Restore performance by not boxing key twice

Fix comment

Add comment about IdDict being mirrored in julia.h

Fix comment

Fix another comment

Allow writing NULL into iddict

Reverts the change of behaviour just in case.

Remove clutter
Added inbounds as dict.jl seems to use them as well
Unfortunately, cost is ~106 now so still too high, JuliaLang#51912 is tracking issue about high cost.

Lowers inlining cost of setproperty as we know rhs isa Memory{Any}
About 15% faster on microbenchmark, `f!(d)` example in pr comments
This reverts commit 7eb2ed6.
Added inbounds as dict.jl seems to use them as well
@Zentrik
Copy link
Contributor Author

Zentrik commented Feb 3, 2024

CI looks good, ASAN just timed out I think.

base/iddict.jl Outdated Show resolved Hide resolved
base/iddict.jl Outdated Show resolved Hide resolved
function setindex!(d::IdDict{K,V}, @nospecialize(val), @nospecialize(key)) where {K, V}
keyindex, inserted = ht_keyindex!(d, key)
if !(val isa V) # avoid a dynamic call
val = convert(V, val)::V
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

There seems to be a discrepancy here in the state of the IdDict during this convert call or if this convert failed. The ht_keyindex! call has fully inserted the key, but not the value. For Dict, neither has occurred yet, and all insertion and accounting happens at once in _setindex!. But in IdDict here, it looks like half of the insertion occurs in ht_keyindex! (effectively inserting it in the deleted state) but then the final accounting update for it doesn't happen until after the _setindex! call later in this method. Do we update ndel over this call so the value is consistent?

There also may need to be an age counter added to the IdDict, so it can detect concurrent modifications that happen in this convert call (or move this convert call back to the top of the function)

Conform to dict API, by moving `@inbounds` and changing `ht_keyindex!`.
Deal with failure case when `convert` calls fails.

TODO: think about concurrent modifications, should only be a problem in `get!`. Add tests for `convert` failing.
base/iddict.jl Outdated

if keyindex < 0
# If convert call fails we need the key to be deleted
@inbounds d[-keyindex] = nothing
Copy link
Sponsor Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
@inbounds d[-keyindex] = nothing

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Does the deleted state not require they key to be nothing? Though this prevents a segfault so I'm probably missing something.

This comment was marked as outdated.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Nevermind, the segfault was because I meant to write d.ht[-keyindex] = nothing. My concern is if the convert fails we have a valid key inserted and the value is NULL. This will prevent us from ever updating the value as we will hit this assert

julia/src/iddict.c

Lines 63 to 69 in e22db88

if (jl_atomic_load_relaxed(&tab[index + 1]) != NULL) {
jl_atomic_store_release(&tab[index + 1], val);
jl_gc_wb(a, val);
return 0;
}
// `nothing` is our sentinel value for deletion, so need to keep searching if it's also our search key
assert(key == jl_nothing);

I can move the key updating code out of ht_keyindex2! if that helps.

Caused segfaults when ht_keyindex2! caused a rehash
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
performance Must go faster
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants