Hash tables with open addressing #1264

Closed
wants to merge 10 commits into
base: trunk
from
274 ChangeLog
@@ -1,3 +1,277 @@
Sat Mar 16 03:01:32 2016 Vladimir Makarov <vmakarov@redhat.com>
* hash.c (obj_any_hash): Move up.
(jauquet_prime_mod): New.
(any_hash, rb_any_hash): Make a copy and rename to rb_any_hash and
rb_any_hash_strong. Use non-crypto hash functions in the original
ones.
(rb_num_hash_start): Modify.
(objhash): Add rb_any_strong.
* include/ruby/st.h (st_hash_type): Add field strong_hash.
(struct st_table): Add new field curr_hash.
(st_hash_index, st_hash_double): New.
* st.c (do_hash): Use curr_hash.
(make_tab_empty): Set up curr_hash.
(inside_table_rebuild_p): New.
(rebuild_table): Set up inside_table_rebuild_p. Extend an assert.
(reset_entry_hashes): New.
(HIT_THRESHOULD_FOR_STRONG_HASH): New.
(find_table_bin, find_table_bin_ptr): Add optional code for double
probing.
(find_table_bin_ptr_and_reserve): Ditto. Add code for switching
to stronger hashes. Change arg hash_value type. Adjust the
calls.
(Uint128Low64, Uint128High64, Hash128to64, UNALIGNED_LOAD64): New.
(UNALIGNED_LOAD32, uint32_in_expected_order): New.
(uint64_in_expected_order, bswap_32, bswap_64, LIKELY, Fetch64):
New.
(Fetch32, k0, k1, k2, k3, Rotate, RotateByAtLeast1, ShiftMix):
New.
(HashLen16, HashLen0to16, HashLen17to32, pair64): New.
(WeakHashLen32WithSeeds0, WeakHashLen32WithSeeds, HashLen33to64):
New.
(CityHash64, CityHash64WithSeeds, CityHash64WithSeed): New.
(UNALIGNED_WORD_ACCESS,MurmurMagic_1, MurmurMagic_2, MurmurMagic):
Remove.
(murmur, murmur_finish, murmur_step): Remove.
(UNALIGNED_ADD_4, UNALIGNED_ADD_8, UNALIGNED_ADD_16): Remove.
(UNALIGNED_ADD_ALL, UNALIGNED_ADD, st_hash_uint32): Remove.
(strhash, st_hash, st_hash_uint, st_hash_end, st_hash_start):
Redefine.
(st_hash_index, st_hash_double): New.
* benchmark/bm_hash_small2.rb: New.
* benchmark/bm_hash_small4.rb: New.
* benchmark/bm_hash_small8.rb: New.
Sat Mar 12 07:59:37 2016 Vladimir Makarov <vmakarov@redhat.com>
* compile.c (ibf_table_index, ibf_dump_id_list_i): Use num_entries
instead of num_elements.
* internal.h (RHASH_SIZE): Ditto.
* encoding.c (rb_enc_name_list_i): Ditto.
* marshal.c (w_symbol, SINGLETON_DUMP_UNABLE_P, w_object): Ditto.
(r_entry, r_prepare, r_symreal): Ditto.
* regparse.c (onig_number_of_names): Ditto.
* symbol.c (symbols_i): Ditto.
* transcode.c (rb_econv_asciicompat_encoding): Ditto.
* variable.c (iv_index_tbl_newsize, iv_index_tbl_extend): Ditto.
(rb_ivar_count, autoload_delete, rb_const_list, cvar_list): Ditto.
* ext/-test-/st/numhash/numhash.c (numhash_size): Ditto.
* gc.c (rb_objspace_call_finalizer): Ditto.
(mark_entry, allrefs_dump_i, wmap_size): Ditto.
* hash.c (rb_hash_rehash, rb_hash_reject_bang): Ditto.
(rb_hash_select_bang, rb_hash_clear, rb_hash_initialize_copy): Ditto.
(rb_num_hash_start): Change hash calculation.
* ext/-test-/st/foreach/foreach.c (force_unpack_check, unp_fec):
Add tests on packed tables.
(unp_fe): Ditto.
* include/ruby/st.h (st_table_element): Rename to st_table_entry.
(st_entry_t): Rename to st_bin_t.
(struct st_table): Rename num_elements, deleted_entries,
allocated_elements, entries, elements_start, elements_bound,
elements correspondingly to num_entries, deleted_bins,
allocated_entries, bins, entries_start, entries_bound, entries.
* st.c: Change entry, element to bin, entry everywhere. Fix typos
in the comments.
Fri Mar 11 10:08:12 2016 Vladimir Makarov <vmakarov@redhat.com>
* st.c (st_assert): Exchange definitions for ST_DEBUG.
Fri Mar 11 02:17:55 2016 Vladimir Makarov <vmakarov@redhat.com>
* st.c (st_assert): Make debug version.
(ST_INIT_VAL, ST_INIT_VAL_BYTE): New.
(do_hash): Make static.
(MAX_POWER2_FOR_TABLES_WITHOUT_ENTRIES): Move up.
(hash_entry, elements_mask): Ditto.
(st_check): New.
(st_init_table_with_size): Initialize for debugging. Add st_check
call.
(st_clear): Add st_check call.
(st_memsize): Process null entries.
(rebuild_entries): Rewrite loops on elements. Add st_check call.
(rebuild_table): Ditto.
(find_element): Rewrite loop on elements.
(find_table_entry_ptr_and_reserve): Change check on fullness of
the array elements.
(st_insert, st_copy, st_general_delete, st_update): Add st_check
call.
(st_insert2): Ditto. Fix an assertion.
(st_shift, st_general_foreach): Rewrite loops on elements. Add
st_check call.
(st_general_keys, st_general_values): Rewrite loops on elements.
Wed Mar 10 05:58:17 2016 Vladimir Makarov <vmakarov@redhat.com>
* include/ruby/st.h (struct st_table): Add comments about the
circular array elements.
* st.c: Modify the top comment.
(MARK_ENTRY_DELETED): Add code to process a null pointer.
(make_tab_empty): Add code to process null entries.
(MAX_POWER2_FOR_TABLES_WITHOUT_ENTRIES): New.
(st_init_table_with_size): Use it to decide whether to allocate entries.
(st_free_table): Add code to process null entries.
(find_table_entry_ptr, find_table_entry_ptr_and_reserve): Change
the return type and a parameter.
(hash_entry): Move up.
(elements_mask): New.
(rebuild_entries): Check null entries. Modify code to deal with
circular array elements.
(rebuild_table): Ditto. Create a new table based on the elements
number.
(secondary_hash): Change perturb before. Use 11 for shift.
(find_element): New.
(find_table_entry): Use it if there are no array entries.
(find_table_entry_ptr): Ditto. Change the return type and add a
parameter.
(find_table_entry_ptr_and_reserve): Ditto. Change rebuild table
condition.
(st_insert, st_insert2): Adjust call of
find_table_entry_ptr_and_reserve. Check null entry_ptr. Consider
elements circular when changing the bound.
(st_copy): Check null entries. Copy the whole elements and entries
arrays.
(update_range_for_deleted): New.
(st_general_delete, st_update): Use it. Adjust call of
find_table_entry_ptr.
(st_shift): Make it work for circular elements array.
(st_general_foreach): Ditto. Adjust call of find_table_entry_ptr.
Use update_range_for_deleted. Use find_table_entry to find the
current element after rebuilding table. Notify the function if
the current element was deleted.
(st_general_keys, st_general_values): Make it work for circular
elements array.
* benchmark/bm_hash_aref_dsym.rb: Increase the number of
iterations.
* benchmark/bm_hash_aref_fix.rb: Ditto.
* benchmark/bm_hash_aref_flo.rb: Ditto.
* benchmark/bm_hash_aref_miss.rb: Ditto.
* benchmark/bm_hash_aref_str.rb: Ditto.
* benchmark/bm_hash_aref_sym.rb: Ditto.
* benchmark/bm_hash_aref_sym_long.rb: Ditto.
* benchmark/bm_hash_flatten.rb: Ditto.
* benchmark/bm_hash_ident_flo.rb: Ditto.
* benchmark/bm_hash_ident_num.rb: Ditto.
* benchmark/bm_hash_ident_obj.rb: Ditto.
* benchmark/bm_hash_ident_str.rb: Ditto.
* benchmark/bm_hash_ident_sym.rb: Ditto.
* benchmark/bm_hash_keys.rb: Ditto.
* benchmark/bm_hash_shift.rb: Ditto.
* benchmark/bm_hash_shift_u16.rb: Ditto.
* benchmark/bm_hash_shift_u24.rb: Ditto.
* benchmark/bm_hash_shift_32.rb: Ditto.
* benchmark/bm_hash_to_proc.rb: Ditto.
* benchmark/bm_hash_values.rb: Ditto.
Tue Mar 08 11:10:21 2016 Vladimir Makarov <vmakarov@redhat.com>
* include/ruby/st.h (struct st_table): Remove allocated_entries.
Make rebuilds_num of type st_index.
* st.c (MINIMAL_POWER2): New.
(get_power2): Use it.
(get_entries_num): New.
(initialize_entries): Use get_entries_num instead of
allocated_entries.
(st_init_table_with_size): Ditto.
(st_memsize, hash_entry, find_table_entry_ptr_and_reserve): Ditto.
(st_copy): Ditto.
(rebuild_entries): Reset num_elements.
(REBUILD_THRESHOLD): New.
(rebuild_table): Add table compaction. Remove deleted elements
and move elements to the array start.
(st_general_foreach): Assume rebuilt elements starting with zero
index.
Sun Mar 05 09:01:31 2016 Vladimir Makarov <vmakarov@redhat.com>
* include/ruby/st.h (struct st_table): Remove hash_mask.
* st.c (st_init_table_with_size): Remove hash_mask assignment.
(rebuild_table): Ditto.
(hash_entry): Calculate the hash mask using number of allocated
entries.
Wed Feb 24 02:24:29 2016 Vladimir Makarov <vmakarov@redhat.com>
* include/ruby/st.h (MAX_ST_INDEX_VAL, st_table_element): New.
(st_entry_t): New.
(struct st_table): Remove num_bins, entries_packed, num_entries,
and as. Add num_elements, deleted_entries, allocated_entries,
allocated_elements, rebuilds_num, hash_mask, entries,
elements_start, elements_bound, and elements.
(st_reverse_foreach): Remove.
* st.c: Don't include ccan/list/list.h. Include <assert.h> and
<stdlib.h>.
(ATTRIBUTE_UNUSED): New.
(st_table_entry, st_packed_entry): Remove.
(st_table_element): New.
(STATIC_ASSERT, ST_DEFAULT_MAX_DENSITY): Remove.
(ST_DEFAULT_INIT_TABLE_SIZE, ST_DEFAULT_PACKED_TABLE_SIZE): Ditto.
(PACKED_UNIT, MAX_PACKED_HASH): Ditto.
(rehash): Remove.
(EQUAL): Rewrite.
(do_hash): Make it inline function.
(hash_pos): Remove.
(PTR_EQUAL): New.
(st_alloc_entry, st_free_entry, st_alloc_table, st_dealloc_table):
Remove.
(st_alloc_bins, st_free_bins, st_realloc_bins): Remove.
(MAX_POWER2): New.
(bins, real_entries, PACKED_BINS, PACKED_ENT, PKEY, PVAL): Remove.
(PHASH, PKEY_SET, PVAL_SET, PHASH_SET): Ditto.
(remove_packed_entry, remove_safe_packed_entry): Ditto.
(next_pow2, new_size): Ditto.
(get_power2): New.
(EMPTY_ENTRY, DELETED_ENTRY, MARK_ENTRY_EMPTY): New.
(MARK_ENTRY_DELETED): New.
(EMPTY_ENTRY_P, DELETED_ENTRY_P, EMPTY_OR_DELETED_ENTRY_P): New.
(EMPTY_ENTRY_PTR_P, DELETED_ENTRY_PTR_P): New.
(EMPTY_OR_DELETED_ENTRY_PTR_P, MARK_ELEMENT_DELETED): New.
(DELETED_ELEMENT_P, initialize_entries, make_tab_empty): New.
(st_head, FIND_ENTRY): Remove.
(st_init_table_with_size, st_clear, st_free_table, st_memsize):
Rewrite.
(rebuild_entries): New.
(find_entry, find_packed_index_from, find_packed_index): Remove.
(rebuild_table, hash_entry): New.
(collision_check): Define it as 1 and move it before definition of
COLLISION.
(secondary_hash, find_table_entry, find_table_entry_ptr): New.
(find_table_entry_ptr_and_reserve): New.
(new_entry, add_direct, unpack_entries, add_packed_direct):
Remove.
(st_lookup, st_insert, st_insert2, st_get_key, st_add_direct):
Rewrite.
(st_copy, st_delete, st_delete_safe, st_shift, st_cleanup_safe): Rewrite.
(st_update): Rewrite.
(st_general_delete): New.
(remove_entry, get_keys, get_values): Remove.
(st_general_foreach, st_general_keys, st_general_values): New.
(st_foreach_check, st_foreach, st_keys_check): Rewrite.
(st_values, st_values_check): Rewrite.
(st_reverse_foreach_check, st_reverse_foreach): Remove.
* compile.c (ibf_table_index, ibf_dump_id_list): Use
num_elements instead of num_entries.
* encoding.c (rb_enc_name_list): Ditto.
* gc.c (rb_objspace_call_finalizer, mark_tbl): Ditto.
(allrefs_dump, wmap_size): Ditto.
* hash.c (rb_hash_rehash, rb_hash_reject_bang): Ditto.
(rb_hash_select_bang, rb_hash_clear): Ditto.
(rb_hash_initialize_copy): Ditto.
* internal.h (RHASH_SIZE): Ditto.
* marshal.c (w_symbol, hash_each, w_object, r_entry): Ditto.
(r_prepare, r_symreal): Ditto.
* regparse.c (onig_number_of_names): Ditto.
* symbol.c (symbols_i): Ditto.
* transcode.c (rb_econv_asciicompat_encoding): Ditto.
* variable.c (iv_index_tbl_newsize): Ditto.
(iv_index_tbl_extend, rb_ivar_count, autoload_delete): Ditto.
(rb_const_list, cvar_list): Ditto.
* ext/-test-/st/numhash/numhash.c (numhash_size): Ditto.
* ext/-test-/st/foreach/foreach.c (force_unpack_check): Remove
check on packed and unpacked entries.
(unp_fec, unp_fe): Ditto.
Tue Feb 23 21:52:24 2016 Martin Duerst <duerst@it.aoyama.ac.jp>
* enc/unicode/case-folding.rb, casefold.h: Outputting actual titlecase
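The ChangeLog entries above mention bins that are marked empty or deleted, a secondary hash that shifts a perturbation value by 11, and table rebuilds that compact the entries array. The Ruby sketch below illustrates that general open-addressing scheme; it is not the st.c code from this pull request. The class name `ProbeTable`, the `5 * ind + perturb + 1` recurrence, and the "rebuild when more than half full" rule are assumptions chosen for illustration; only the shift by 11 and the empty/deleted markers are taken from the ChangeLog.

```ruby
# Illustrative open-addressing table (hypothetical; not the st.c implementation).
class ProbeTable
  EMPTY   = -1  # bin has never been used
  DELETED = -2  # bin was used; lookups must keep probing past it
  Entry = Struct.new(:hash_value, :key, :value)

  def initialize(power2 = 3)
    @entries = []                          # insertion order; nil marks a deleted entry
    @live = 0
    @bins = Array.new(1 << power2, EMPTY)  # bin count is always a power of two
  end

  def size
    @live
  end

  def keys
    @entries.compact.map(&:key)
  end

  def [](key)
    bin = find_bin(key)
    bin ? @entries[@bins[bin]].value : nil
  end

  def []=(key, value)
    bin = find_bin(key)
    if bin
      @entries[@bins[bin]].value = value
      return value
    end
    rebuild if (@entries.size + 1) * 2 > @bins.size  # assumed load rule: at most half full
    @entries << Entry.new(key.hash, key, value)
    reserve(key.hash, @entries.size - 1)
    @live += 1
    value
  end

  def delete(key)
    bin = find_bin(key)
    return nil unless bin
    entry = @entries[@bins[bin]]
    @entries[@bins[bin]] = nil
    @bins[bin] = DELETED
    @live -= 1
    entry.value
  end

  private

  # Yields successive bin indexes for a hash value. The perturbation is
  # shifted right by 11 on every step; once it reaches zero the recurrence
  # 5 * ind + 1 (mod table size) visits every bin.
  def each_probe(hash_value)
    mask = @bins.size - 1
    perturb = hash_value & 0xFFFF_FFFF_FFFF_FFFF  # treat as unsigned 64-bit
    ind = perturb & mask
    loop do
      yield ind
      perturb >>= 11
      ind = (ind * 5 + perturb + 1) & mask
    end
  end

  def find_bin(key)
    hash_value = key.hash
    each_probe(hash_value) do |ind|
      slot = @bins[ind]
      return nil if slot == EMPTY
      next if slot == DELETED
      entry = @entries[slot]
      return ind if entry.hash_value == hash_value && entry.key.eql?(key)
    end
  end

  def reserve(hash_value, entry_index)
    each_probe(hash_value) do |ind|
      next unless @bins[ind] == EMPTY || @bins[ind] == DELETED
      @bins[ind] = entry_index
      return
    end
  end

  # Compact the entries array (dropping deleted entries) and refill the bins.
  def rebuild
    @entries.compact!
    size = @bins.size
    size *= 2 while (@entries.size + 1) * 2 > size
    @bins = Array.new(size, EMPTY)
    @entries.each_with_index { |e, i| reserve(e.hash_value, i) }
  end
end

table = ProbeTable.new
('a'..'z').each { |s| table[s.to_sym] = 1 }
table.delete(:c)
p table[:a], table[:c], table.size
```

Keeping the bins as indexes into a separate insertion-ordered array is what lets iteration order survive deletions and rebuilds in this sketch; whether the pull request makes the same trade-offs in detail should be checked against st.c itself.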
@@ -1,4 +1,4 @@
h = {}
syms = ('a'..'z').map { |s| s.to_sym }
syms.each { |s| h[s] = 1 }
-200_000.times { syms.each { |s| h[s] } }
+400_000.times { syms.each { |s| h[s] } }
@@ -1,4 +1,4 @@
h = {}
nums = (1..26).to_a
nums.each { |i| h[i] = i }
-200_000.times { nums.each { |s| h[s] } }
+800_000.times { nums.each { |s| h[s] } }
@@ -1,4 +1,4 @@
h = {}
strs = [*1..10000].map! {|i| i.fdiv(10)}
strs.each { |s| h[s] = s }
-50.times { strs.each { |s| h[s] } }
+500.times { strs.each { |s| h[s] } }
@@ -2,4 +2,4 @@
strs = ('a'..'z').to_a.map!(&:freeze)
strs.each { |s| h[s] = s }
strs = ('A'..'Z').to_a
-200_000.times { strs.each { |s| h[s] } }
+500_000.times { strs.each { |s| h[s] } }
@@ -1,4 +1,4 @@
h = {}
strs = ('a'..'z').to_a.map!(&:freeze)
strs.each { |s| h[s] = s }
-200_000.times { strs.each { |s| h[s] } }
+500_000.times { strs.each { |s| h[s] } }
@@ -6,4 +6,4 @@
syms.map!(&:to_sym)
end
syms.each { |s| h[s] = s }
-200_000.times { syms.each { |s| h[s] } }
+500_000.times { syms.each { |s| h[s] } }
@@ -10,4 +10,4 @@
syms.map!(&:to_sym)
end
syms.each { |s| h[s] = s }
-200_000.times { syms.each { |s| h[s] } }
+500_000.times { syms.each { |s| h[s] } }
@@ -4,6 +4,6 @@
h[i] = nil
end
-1000.times do
+2000.times do
h.flatten
end
@@ -1,4 +1,4 @@
h = {}.compare_by_identity
strs = (1..10000).to_a.map!(&:to_f)
strs.each { |s| h[s] = s }
-50.times { strs.each { |s| h[s] } }
+500.times { strs.each { |s| h[s] } }
@@ -1,4 +1,4 @@
h = {}.compare_by_identity
nums = (1..26).to_a
nums.each { |n| h[n] = n }
-200_000.times { nums.each { |n| h[n] } }
+500_000.times { nums.each { |n| h[n] } }
@@ -1,4 +1,4 @@
h = {}.compare_by_identity
objs = 26.times.map { Object.new }
objs.each { |o| h[o] = o }
-200_000.times { objs.each { |o| h[o] } }
+500_000.times { objs.each { |o| h[o] } }
@@ -1,4 +1,4 @@
h = {}.compare_by_identity
strs = ('a'..'z').to_a
strs.each { |s| h[s] = s }
-200_000.times { strs.each { |s| h[s] } }
+500_000.times { strs.each { |s| h[s] } }
@@ -1,4 +1,4 @@
h = {}.compare_by_identity
syms = ('a'..'z').to_a.map(&:to_sym)
syms.each { |s| h[s] = s }
-200_000.times { syms.each { |s| h[s] } }
+500_000.times { syms.each { |s| h[s] } }
@@ -4,6 +4,6 @@
h[i] = nil
end
-5000.times do
+10000.times do
h.keys
end
@@ -4,7 +4,7 @@
h[i] = nil
end
-50000.times do
+1000000.times do
k, v = h.shift
h[k] = v
end
@@ -4,7 +4,7 @@
h[i] = nil
end
-300000.times do
+1000000.times do
k, v = h.shift
h[k] = v
end
@@ -4,7 +4,7 @@
h[i] = nil
end
-300000.times do
+1000000.times do
k, v = h.shift
h[k] = v
end
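The benchmark hunks above only raise iteration counts, which the ChangeLog records as "Increase the number of iterations". To see how run time scales with the iteration count for one of these lookup loops, something like the following could be used. It is a minimal sketch: the helper name `time_lookups` and the iteration counts are hypothetical, and it uses the monotonic clock directly rather than the project's own benchmark driver.

```ruby
# Hypothetical helper: time repeated hash lookups with the monotonic clock.
def time_lookups(iterations, keys, table)
  start = Process.clock_gettime(Process::CLOCK_MONOTONIC)
  iterations.times { keys.each { |k| table[k] } }
  Process.clock_gettime(Process::CLOCK_MONOTONIC) - start
end

# Same setup as the symbol-key benchmark in the diff above.
h = {}
syms = ('a'..'z').map { |s| s.to_sym }
syms.each { |s| h[s] = 1 }

# Small illustrative counts so the sketch finishes quickly.
[1_000, 2_000, 4_000].each do |n|
  printf("%d iterations: %.4fs\n", n, time_lookups(n, syms, h))
end
```

Larger counts make the measured time less sensitive to interpreter start-up and timer noise, which is one plausible reason for the increases in the diff; the pull request text shown here does not state the motivation.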