Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

refactor newhash (revision 58463 another try) #1600

Closed
wants to merge 3 commits into from

Conversation

shyouhei
Copy link
Member

  • st.c (rb_hash_bulk_insert): new API to bulk insert entries
    into a hash. Given arguments are first inserted into the
    table at once, then reindexed. This is faster than inserting
    things using rb_hash_aset() one by one.

    This arrangement (rb_ prefixed function placed in st.c) is
    unavoidable because it both touches table internal and write
    barrier at once.

  • internal.h: delcare the new function.

  • hash.c (rb_hash_s_create): use the new function.

  • vm.c (core_hash_merge): ditto.

  • insns.def (newhash): ditto.

-----------------------------------------------------------
benchmark results:
minimum results in each 7 measurements.
Execution time (sec)
name    before  after
loop_whileloop2  0.136  0.137
vm2_bighash*     1.249  0.623

Speedup ratio: compare with the result of `before' (greater is better)
name    after
loop_whileloop2 0.996
vm2_bighash*    2.004

	* st.c (rb_hash_bulk_insert): new API to bulk insert entries
	  into a hash. Given arguments are first inserted into the
	  table at once, then reindexed. This is faster than inserting
	  things using rb_hash_aset() one by one.

	  This arrangement (rb_ prefixed function placed in st.c) is
	  unavoidable because it both touches table internal and write
	  barrier at once.

	* internal.h: delcare the new function.

	* hash.c (rb_hash_s_create): use the new function.

	* vm.c (core_hash_merge): ditto.

	* insns.def (newhash): ditto.

-----------------------------------------------------------
benchmark results:
minimum results in each 7 measurements.
Execution time (sec)
name    before  after
loop_whileloop2  0.136  0.137
vm2_bighash*     1.249  0.623

Speedup ratio: compare with the result of `before' (greater is better)
name    after
loop_whileloop2 0.996
vm2_bighash*    2.004
@shyouhei
Copy link
Member Author

@vnmakarov I would like to merge this hash bulk insert feature. Can you take a look at st.c diff?

@vnmakarov
Copy link

Shyouhei, thank you for asking my opinion. About half year ago I thought about this too to improve vm2_bighash and overall hash tables benchmark score. To be honest I did not expect such big improvement of vm2_bighash which you reported here.

That time my approach would be simply using st_insert2 with providing its inlining. Basically it would be analog of your rb_hash_bulk_insert with st_insert2 (more accurately a static inline function which could be used in st_insert2 too) inside rb_hash_bulk loop. Your implementation is probably faster as it has more specialized code although I don't know how much faster it is.

I have no time and hardly will have time to work on hash tables more. So I am really glad that you implemented it.

I've read your code. It looks OK for me. If you need my support to put your changes into st.c. I gladly give it to you.

@shyouhei
Copy link
Member Author

Thank you for reviewing. Will merge this.

FYI, although reverted due to random bugs, I tried using st_insert before (cf: 4ee09d9). It was a bit slower than what is proposed in this pull request (1.8 times faster then versus 2.0 times faster here).

@hsbt hsbt closed this in 29ca20d Apr 27, 2017
@shyouhei shyouhei deleted the bulk-newhash branch April 27, 2017 04:41
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
2 participants