Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime: replace span freelist with free bitmap #8894

dvyukov opened this Issue Oct 7, 2014 · 2 comments


None yet
3 participants
Copy link

dvyukov commented Oct 7, 2014

The slowest parts of mallocgc and sweeping are freelist manipulation.
The idea is to replace the freelist with bitmap of free blocks. For spans with objects
>= 128 bytes, we allocate an additional word in MSpan for this; for spans < 128
bytes, we reserve the necessary amount of words at the end of the span.

Here is a prototype CL:

This has several advantages:
1. We always allocate in memory order.
2. We don't touch MSpan nor memory blocks themselves in mallocgc (freemap is cached in
3. We don't touch blocks at all in MSpan_Sweep (only mark bits and free bitmap).
4. We don't work with linked lists in MSpan_Sweep.
5. We don't walk freelist in MSpan_Sweep to mark already free objects (bitmap set is

This comment has been minimized.

Copy link

randall77 commented Oct 7, 2014

Comment 1:

Now that the garbage collector is fully precise, we can drop free lists before GC
instead of walking and marking them during sweep.  That will eliminate #5 and get us #1
I'm not sure #2 is actually an advantage.  We want to touch the objects we're
allocating, the caller is going to want them in the cache.  #3 could actually be
helpful, though.
I'm not convinced linked list operations are really that slow.  Popping something off
the head of the list is just two loads and a store.

This comment has been minimized.

Copy link
Member Author

dvyukov commented Oct 8, 2014

Comment 2:

To answer this question we simply need to benchmark. But I don't have time for it right
But I can say that the freelist operation is the slowest part of mallocgc.
Regarding caching, yes, caller wants the block in cache. But there is significant
difference between async store that goes into write buffer causing prefetch into cache
and a load of uncached data with a subsequent control dependency on the value. See eg:

@rsc rsc added this to the Unplanned milestone Apr 10, 2015

@rsc rsc removed release-none labels Apr 10, 2015

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.
You signed in with another tab or window. Reload to refresh your session. You signed out in another tab or window. Reload to refresh your session.