optimize gc sweep #1049
This PR optimizes the
Consider the case of sweeping a string.
On trunk, the code path is: switch -> call obj_free -> switch -> switch -> call rb_str_free (note that obj_free does not get inlined even though it is marked inline).
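To make the shape of that dispatch chain concrete, here is a minimal sketch in C. It is not the code from gc.c or from this PR: the types and functions (`struct obj`, `generic_free`, `sweep_layered`, `sweep_flat`) are hypothetical stand-ins, and it assumes the general idea is to let the sweep loop reach the type-specific free routine with fewer intermediate switches and calls.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical object tags; these are NOT Ruby's real T_* constants. */
enum obj_type { OBJ_NONE, OBJ_STRING, OBJ_ARRAY };

struct obj {
    enum obj_type type;
    int freed;              /* set to 1 once the payload has been released */
};

/* Stand-ins for type-specific free routines such as rb_str_free. */
static void str_free(struct obj *o) { o->freed = 1; }
static void ary_free(struct obj *o) { o->freed = 1; }

/* Layered path: a generic free function that dispatches on the type again. */
static void generic_free(struct obj *o)
{
    switch (o->type) {
      case OBJ_STRING: str_free(o); break;
      case OBJ_ARRAY:  ary_free(o); break;
      default: break;
    }
}

/* The sweep loop switches once, then calls generic_free, which switches again. */
static size_t sweep_layered(struct obj *heap, size_t n)
{
    size_t freed = 0;
    assert(heap != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (heap[i].type) {
          case OBJ_NONE:
            break;                      /* empty slot, nothing to free */
          default:
            generic_free(&heap[i]);     /* second dispatch happens in here */
            freed++;
            break;
        }
    }
    return freed;
}

/* Flattened path: one switch in the sweep loop reaches the free routine directly. */
static size_t sweep_flat(struct obj *heap, size_t n)
{
    size_t freed = 0;
    assert(heap != NULL || n == 0);
    for (size_t i = 0; i < n; i++) {
        switch (heap[i].type) {
          case OBJ_STRING: str_free(&heap[i]); freed++; break;
          case OBJ_ARRAY:  ary_free(&heap[i]); freed++; break;
          case OBJ_NONE:   break;       /* empty slot, nothing to free */
        }
    }
    return freed;
}
```

Both functions free the same slots; the flattened variant just executes fewer branches and calls per object, which is the kind of saving that only shows up when the loop runs over a very large heap.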
On my MacBook Pro using GCC 5.1.0, the following synthetic benchmark went from 7.57 sec (trunk) to 7.21 sec (this PR), i.e. about 5% faster.
The numbers for the GC benchmarks changed as follows:
Please take the changes in user time with a grain of salt: there was some variation between repeated runs, and I am not sure why the numbers changed. If Ruby's incremental GC consults some timer, the reduced number of sweep runs (since each sweep is faster) would lead to better CPU cache usage, which would explain the gain. If not, the gain could be due to better use of the branch prediction unit.