-
-
Notifications
You must be signed in to change notification settings - Fork 1.6k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Change Hash implementation #5256
Conversation
use array of entries + open addressing index of entries.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some name changes, and a explanitory comment on how it works would be invaluable to reviewing this.
src/hash.cr
Outdated
@buckets_size : Int32 | ||
@first : Entry(K, V)? | ||
@last : Entry(K, V)? | ||
@sz : UInt8 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is badly named.
src/hash.cr
Outdated
end | ||
|
||
# Similar to `#first_key?`, but returns its value. | ||
def first_value? | ||
@first.try &.value | ||
unless empty? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
return nil if empty?
/return yield if empty?
is more readable and ditto down the page.
src/hash.cr
Outdated
protected def find_entry(key) | ||
return nil if empty? | ||
@[AlwaysInline] | ||
private def iter_entries |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
each_entry
?
src/hash.cr
Outdated
@sz = newsz | ||
end | ||
|
||
private def need_shrink(size : Int32, sz : UInt8) : Bool |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
needs_shrink?
?
src/hash.cr
Outdated
|
||
private def nindex(sz) | ||
mask = SIZES[sz].indexmask | ||
mask + (mask != 0 ? 1 : 0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
mask == 0 ? 0 : 1
src/hash.cr
Outdated
(h >> 1) + 1 | ||
end | ||
|
||
private def nindex(sz) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This isn't a very crystally name.
Worse problem: doesn't look it is faster :-( And 32bit compiler test didn't pass in Travis-CI. |
Looks like, reallocation costs too much. |
I think that Hash may have two different implementations for small (<7) and other sizes. I'm sure that small hashes have a big value. |
I benchmarked this PR and it works twice as fast as the current Hash. Here's my benchmark: a = {} of Int32 => String
strings = Array.new(10_000_000, &.to_s)
time = Time.now
strings.each_with_index do |s, i|
a[i] = s
end
puts Time.now - time Remember to run this with It would be interesting to see your benchmarks. I think this Hash should be more efficient because in the current Hash every time you add a key-value pair there's an allocation for the node. In your implementation that doesn't seem to be the case. |
It is expected for single huge hash to be faster. But http helloworld ( from crystal-lang.org main page) prouces many small hashes. |
Testing compiler compile performance (with I don't like the idea of having a rigid cutoff point with 2 different hash implementations. There must be a way to make it fast at any size. Another optimization I would love to make is having only 1 malloc call for am empty hash, as at least in the rust compiler I know they found that hashtables being created, never used, but discarded was fairly common. |
I have an idea to use chunked allocaion, ie allocate by chunks of 8 entries. |
@funny-falcon Assuming that a larger chunk size would increase performance, you could change chunk size based on the size of the hash to keep the memory overhead bounded when small and performance overhead miniscule when large. |
What about implementation of pessimistic downsize politics? |
@akzhan , what do you mean? |
Do relocations on increase, but rarely on decrease.
пт, 10 нояб. 2017 г. в 20:12, Sokolov Yura <notifications@github.com>:
… @akzhan <https://github.com/akzhan> , what do you mean?
—
You are receiving this because you were mentioned.
Reply to this email directly, view it on GitHub
<#5256 (comment)>,
or mute the thread
<https://github.com/notifications/unsubscribe-auth/AAG3hFfU_zDu06-XLl6-ikZ99Y3YL3Unks5s1IPrgaJpZM4QVeWz>
.
|
That was already in this patch: it resized down only if |
Looks like chunked version is not slower than current implementation even on http hello world, so that I reopen pull request. |
@funny-falcon Please give us some documentation on how this works to work with! |
@rebuild_num = 0_u16 | ||
@first = 0_u32 | ||
@last = 0_u32 | ||
@index = Pointer(UInt32).new(0) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd suggest using Pointer.empty
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You mean Pointer.null
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@RX14 right, my bad.
src/hash.cr
Outdated
@block = block | ||
unless initial_capacity.nil? |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if initial_capacity
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it may be zero, isn't it?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
only nil
, Pointer.null
and false
are falsey in crystal. 0
is truthy, which means the only practical type where you need to use Object#nil?
is Bool?
, which is fairly rare and a situation often better described with an enum.
src/hash.cr
Outdated
chunks = chunks_ptr | ||
@first.upto(@last - 1) do |i| | ||
chunk = chunks[i / CHUNK] | ||
entry = chunk + i % CHUNK |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'm not certain there is a noticeable difference or if llvm is smart enough to notice this by itself, but perhaps
chunk_index, offset = i.divmod(CHUNK)
to save a division.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
llvm should optimize division and module by 8 to shifts and mask (ie i >> 3
and i & 7
)
Is it GTG? |
It plays bad with conservative GC on 32 bit platform:
So, while small scripts will still work, larger applications become less reliable on 32bit platform with this patch. So, if support of 32 bit platform is important, this patch should be improved. I don't know, how it should be done. I'm still thinking. |
Doubtfuly Crystal will gain precise GC soon. |
Have you seen #5271? |
@akzhan, not totally clear if anyone can or wants to push that forward ATM though. |
I wouldn't focus on the performance of 32-bit too much, but it should work and be stable. 32bit is almost obsolete on x86, but on arm it'll still be a while until 32bit SoCs stop being sold. |
32bit version works. It just doesn't play well with conservative GC. I may make 32bit version more "workable" by not storing hash-sum, and changing it to closed addressing/chained version. I think, a lot of code will be common between 32bit and 64bit versions, so they will co-exists in one file with several |
Perhaps seperating the "implementation-specific" and "helper-methods" parts of |
This Pull request is alone that not yet finished anf merged with inspiration of #4675, what is it's state now? |
I was a bit busy with work and family. I plan to spend some time on weekends. |
To work around 32bit platform's issues, use chaining instead of open addressing.
aa84284
to
b7641b0
Compare
damn :-( now 64bit version fails with "Repeated allocation of very large block"... I can't understand, why ;-( |
And I can't reproduce on my notebook :-( My version of libgc doesn't complain against. As well as circleci. |
Can you ask @ivmai about strange "Repeated allocation of very large block"? |
I thought the repeated allocation was just a warning. I get that in my implementation too. EDIT: Or well, it is not really 'just' a warning as stuff like this shouldn't warn, but not an error. |
No, the error is that the compiler specs ran out of memory. Check the usage before and after locally. |
Problem is I cann't reproduce. And circleci cann't as well. So, looks like it depends on hosts libgc. I have one suggestion: probably it is because old code doesn't reallock huge bins array on |
I turned off deallocation on clear, but looks like it doesn't help. |
@funny-falcon it's unlikely to be a libgc bug. You probably can't reproduce it simply because you have more ram than the travis VM. Look at your ram usage when running |
@RX14, I don't claim it is libgc bug. I've said, libgc behaves differently between travis-ci and my computer. Unfortunately, on my pc I have no this warnings. So it complicates investigation a lot. |
On ~5minute point (~610 samples) master consumed 490MB and patched 390MB, therefore it is not direct memory consumption, unfortunately. |
@funny-falcon that's very strange. |
@funny-falcon Do you think it would make a sense to try with latest Crystal release? It's been a while, maybe the issue disappeared? :) |
Yes, I will try |
Yeah i'm quite sad this PR stalled, |
With #8017 merged in should this one be closed? |
Yes, closing. |
use array of entries + open addressing index of entries.
Closes #4557