Improve performance of fiber creation by using pool allocation strategy. #2224
@ko1 what do you think?
I want to test this in real-world code too, but I am on holiday and don't have my desktop computer, so I can't really test it at the extreme limits.
Also, I don't expose |
@ko1 I get a strange error related to refinements on my desktop but not on Travis:
|
On the 32-bit Travis test:
|
@ko1 I managed to catch
|
@ko1 this is now blocked only on one test failure, which I don't know how to fix:
|
Replace previous stack cache with fiber pool cache. The fiber pool allocates many stacks in a single memory region. Stack allocation becomes O(log N) and fiber creation is amortized O(1). Around 10x performance improvement was measured in micro-benchmarks.
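To make the idea of "many stacks in a single memory region" concrete, here is a minimal sketch in C. It is not the code from this PR: the names `stack_pool`, `stack_slot`, `stack_pool_init`, `stack_pool_acquire` and `stack_pool_release` are hypothetical, and the simple free list shown here does not try to reproduce the complexity bounds quoted above. It only shows one `mmap` call being carved into fixed-size stacks that are handed out and returned without further system calls.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict modes */
#include <assert.h>
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical types for illustration only. */
struct stack_slot {
    struct stack_slot *next; /* free-list link, stored inside the unused stack */
};

struct stack_pool {
    void *base;                   /* start of the single mapped region */
    size_t stack_size;            /* bytes per stack (a multiple of the page size) */
    size_t count;                 /* number of stacks carved from the region */
    struct stack_slot *free_list; /* stacks currently available */
};

/* Map one region and thread every stack onto the free list.
 * Returns 0 on success, -1 if the mapping fails. */
static int stack_pool_init(struct stack_pool *pool, size_t stack_size, size_t count)
{
    void *base = mmap(NULL, stack_size * count, PROT_READ | PROT_WRITE,
                      MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
    if (base == MAP_FAILED) return -1;

    pool->base = base;
    pool->stack_size = stack_size;
    pool->count = count;
    pool->free_list = NULL;

    /* Push from the last stack to the first so that the lowest
     * address is handed out first. */
    for (size_t i = count; i > 0; i--) {
        struct stack_slot *slot =
            (struct stack_slot *)((char *)base + (i - 1) * stack_size);
        slot->next = pool->free_list;
        pool->free_list = slot;
    }
    return 0;
}

/* Pop a stack from the free list; returns NULL when the pool is exhausted. */
static void *stack_pool_acquire(struct stack_pool *pool)
{
    struct stack_slot *slot = pool->free_list;
    if (slot == NULL) return NULL;
    pool->free_list = slot->next;
    return slot;
}

/* Return a stack to the pool so that the next fiber can reuse it. */
static void stack_pool_release(struct stack_pool *pool, void *stack)
{
    assert(stack != NULL);
    struct stack_slot *slot = stack;
    slot->next = pool->free_list;
    pool->free_list = slot;
}
```

A caller would initialise the pool once, then pair each fiber creation with `stack_pool_acquire` and each fiber exit with `stack_pool_release`. Note that this sketch writes a link into every stack during initialisation, which touches one page per stack; a real implementation would likely avoid that.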
On 32-bit platforms, expanding the fiber pool by a large amount may fail, even if a smaller amount may succeed. We limit the maximum size of a single allocation to maximise the number of fibers that can be allocated. Additionally, we implement the book-keeping required to free allocations when their usage falls to zero.
`madvise(free)` and similar operations help because they avoid swap usage by clearing the dirty bit on memory pages that are mapped but no longer needed. However, they carry some performance penalty when there is no memory pressure. Therefore, we do it by default, but it can be disabled.
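As a rough illustration of the trade-off described above, the following C sketch advises the kernel that an idle stack's pages are no longer needed. The helper name `stack_discard` and its `enabled` flag are hypothetical and stand in for whatever switch the real implementation uses. The sketch assumes `stack` is page-aligned and prefers `MADV_FREE` where the platform defines it, falling back to `MADV_DONTNEED`.

```c
#define _DEFAULT_SOURCE /* for madvise and MADV_* on glibc in strict modes */
#include <stddef.h>
#include <sys/mman.h>

/* Hypothetical helper: tell the kernel that the pages of an idle stack
 * may be reclaimed instead of being written to swap.
 * `enabled` models an opt-out for workloads without memory pressure.
 * Returns 0 on success (or when disabled), -1 on failure. */
static int stack_discard(void *stack, size_t size, int enabled)
{
    if (!enabled) return 0; /* skip the system call entirely */

#if defined(MADV_FREE)
    /* Lazy reclaim: pages are only dropped if memory becomes tight. */
    if (madvise(stack, size, MADV_FREE) == 0) return 0;
    /* Older kernels may reject MADV_FREE; fall through to the fallback. */
#endif
#if defined(MADV_DONTNEED)
    return madvise(stack, size, MADV_DONTNEED);
#else
    return 0; /* no suitable advice available on this platform */
#endif
}
```

After a successful call the old contents of the stack must be treated as gone, so this is only safe once the fiber that owned the stack has finished.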
…pace. We use COROUTINE_LIMITED_ADDRESS_SPACE to select platforms where the address space is 32 bits or less. There, the fiber pool implementation enables more book-keeping and reduces upper limits in order to minimise address space utilisation.
If `mmap` fails to allocate memory, retry with half the size, and so on. Limit FIBER_POOL_ALLOCATION_MAXIMUM_SIZE to 1024 stacks. In typical configurations this limits the memory-mapped region to ~128MB per allocation.
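The halving strategy can be sketched as follows. This is an illustration rather than the merged code: `pool_map_stacks` and `POOL_ALLOCATION_MAXIMUM` are hypothetical names, with the cap of 1024 taken from the commit message above. The ~128MB figure is consistent with a stack size of 128 KiB (1024 × 128 KiB = 128 MiB), which is an inference from the numbers given, not something stated in the PR.

```c
#define _DEFAULT_SOURCE /* for MAP_ANONYMOUS on glibc in strict modes */
#include <stddef.h>
#include <stdint.h>
#include <sys/mman.h>

/* Hypothetical cap, mirroring the 1024-stack limit described above. */
#define POOL_ALLOCATION_MAXIMUM 1024

/* Try to map room for up to *count stacks, halving the request each
 * time mmap fails. On success, returns the region and stores the number
 * of stacks actually mapped in *count. On failure, returns NULL and
 * stores 0 in *count. */
static void *pool_map_stacks(size_t stack_size, size_t *count)
{
    size_t n = *count;

    if (stack_size == 0) {
        *count = 0;
        return NULL;
    }
    if (n > POOL_ALLOCATION_MAXIMUM) n = POOL_ALLOCATION_MAXIMUM;

    while (n > 0) {
        /* Skip sizes whose byte count would overflow size_t. */
        if (n <= SIZE_MAX / stack_size) {
            void *base = mmap(NULL, n * stack_size, PROT_READ | PROT_WRITE,
                              MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
            if (base != MAP_FAILED) {
                *count = n;
                return base;
            }
        }
        n /= 2; /* a smaller request may still fit in the address space */
    }

    *count = 0;
    return NULL;
}
```

Capping the request first and only then halving means a 32-bit process asks for at most one bounded region at a time, which leaves more of the address space available for later, smaller allocations.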
Okay, it was merged.
https://bugs.ruby-lang.org/issues/15997
It would be good to get some feedback on this PR.