Concurrent sweep #1681
Both the concurrent sweep thread and the nursery collector need access to the block array. Until we've made that lock-free, we're simply using a lock.
The nursery collector requires that sweeping has finished, and instead of waiting it will cooperate with the sweep thread to finish more quickly. The sweep thread will traverse the block array from high indexes to low ones while the nursery collector will go from low to high. They will contend only very briefly when they meet somewhere in the middle.
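The two-ended traversal described above can be sketched roughly as follows. This is an illustration only, not the code in this PR: `BlockArray`, `claim`, `sweep_from`, and `cooperative_sweep` are hypothetical names, and the sketch takes a lock on every claim for simplicity, whereas the description above expects the two sides to contend only where they meet.

```python
import threading

class BlockArray:
    """Hypothetical stand-in for the major collector's block array."""

    def __init__(self, num_blocks):
        self.swept = [False] * num_blocks
        self.lock = threading.Lock()
        self.low = 0                 # next index for the nursery collector
        self.high = num_blocks - 1   # next index for the sweep thread

    def claim(self, from_low):
        """Claim the next unswept index from one end, or None once the ends have met."""
        with self.lock:
            if self.low > self.high:
                return None
            if from_low:
                index = self.low
                self.low += 1
            else:
                index = self.high
                self.high -= 1
            return index

def sweep_from(blocks, from_low, claimed):
    """Sweep blocks from one end until nothing is left to claim."""
    while True:
        index = blocks.claim(from_low)
        if index is None:
            return
        assert not blocks.swept[index]   # each block is swept exactly once
        blocks.swept[index] = True       # a real collector would sweep the block here
        claimed.append(index)

def cooperative_sweep(num_blocks):
    """Run a sweep thread (high to low) while the caller helps (low to high)."""
    blocks = BlockArray(num_blocks)
    by_sweeper, by_nursery = [], []
    sweeper = threading.Thread(target=sweep_from, args=(blocks, False, by_sweeper))
    sweeper.start()
    sweep_from(blocks, True, by_nursery)   # the nursery collector cooperates instead of waiting
    sweeper.join()
    return blocks, by_sweeper, by_nursery
```

How the blocks are split between the two sides depends on scheduling, but every index is claimed by exactly one of them.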
New blocks need to be swept if they're allocated during a non-concurrent major collection or while a concurrent major collection is running.
It's updated from nursery collections and from the sweep thread concurrently.
Don't use the difference from the last collection; just calculate the maximum heap size and trigger a collection when it's reached.
And since we now always wait for the sweep, we can iterate over the blocks without taking the lock.
We cannot start threads during stop-the-world pauses (some thread APIs, including pthreads, can deadlock), and we want to do marking and sweeping with the same threads. This new thread pool facility will allow that.
So we only have one worker thread that's used both for marking and, later, for sweeping, too.
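As a rough sketch of the idea (not the actual facility added here), a single-worker pool can start its thread up front, so that only job enqueueing happens later, for example during a stop-the-world pause. `SingleWorkerPool`, `run_collection_jobs`, and the method names are hypothetical.

```python
import queue
import threading

class SingleWorkerPool:
    """Hypothetical single-thread pool whose worker is created at init time."""

    def __init__(self):
        self._jobs = queue.Queue()
        self._worker = threading.Thread(target=self._run, daemon=True)
        self._worker.start()   # started up front, never during a pause

    def _run(self):
        while True:
            job = self._jobs.get()
            if job is None:            # sentinel: shut the worker down
                self._jobs.task_done()
                return
            try:
                job()
            finally:
                self._jobs.task_done()

    def enqueue(self, job):
        """Queue a job; this only touches the queue and creates no thread."""
        self._jobs.put(job)

    def wait_idle(self):
        """Block until every queued job has finished."""
        self._jobs.join()

    def shutdown(self):
        self._jobs.put(None)
        self._worker.join()

def run_collection_jobs():
    """Use the same worker first for a marking job, then for a sweeping job."""
    pool = SingleWorkerPool()
    log = []
    pool.enqueue(lambda: log.append("mark"))
    pool.enqueue(lambda: log.append("sweep"))
    pool.wait_idle()
    pool.shutdown()
    return log
```

With a single worker and a FIFO queue, jobs run in the order they were enqueued, so the sweeping job only starts after the marking job has finished.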
@schani please note that there's a crash during the System.dll testsuite on Jenkins:
@akoeplinger Fixed.
I didn’t observe an appreciable difference in performance between master and this branch on binarytree, on either Linux or OS X. Here is a cursory run of
@schani looks like it still fails with the same assert during System.GC.GetTotalMemory (at least on i386, the amd64 build failed due to an already started xvfb...).
@evincarofautumn a more useful metric for this would be total pause time and not wallclock time.
@akoeplinger It seems that wasn't the latest commit. It works now. @kumpera On pause times see my charts above.
@schani I pulled and built this PR to make sure it's not a Jenkins error and I see crashes locally during the System testsuite too (Ubuntu 14.04/amd64).
@akoeplinger Could you post a log of that? It works for me on Ubuntu 14.10/amd64.
@schani did a few test runs: http://pastebin.com/Dtyh790k
@akoeplinger I can't reproduce this on either Linux or OSX. Can you confirm that this does not occur for you with master? Are you using any specific options? Could you post a full log?
@schani you're right, I just saw it happening on master on Jenkins as well and on my copy. Looks like I somehow didn't properly clean up my working dir when I tested on master, sorry!
The purpose of this is to reduce pause times of major collections by making sweep completely concurrent.
The first phase of sweeping is iterating through the block list, freeing blocks without live objects, and designating the others for lazy sweeping. Until now this phase happened while the world was stopped; this PR makes it concurrent. The changes include the introduction of a very simple thread pool abstraction (currently supporting only a single thread) that unifies concurrent marking, jobs when scanning roots, and concurrent sweeping.
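A minimal sketch of that first phase, under assumed names: `Block`, its `live_objects` count, and the `"need_sweep"` state are hypothetical and do not mirror SGen's actual data structures.

```python
from dataclasses import dataclass

@dataclass
class Block:
    """Hypothetical block record; a real collector tracks far more state."""
    live_objects: int
    state: str = "marked"

def first_sweep_phase(blocks):
    """Drop blocks with no live objects and flag the rest for lazy sweeping.

    Returns the surviving blocks and the number of blocks freed.
    """
    kept = []
    freed = 0
    for block in blocks:
        if block.live_objects == 0:
            freed += 1                   # a real collector would release the block's memory
        else:
            block.state = "need_sweep"   # swept later, lazily
            kept.append(block)
    return kept, freed
```

For example, `first_sweep_phase([Block(0), Block(3)])` keeps only the second block and reports one block freed.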
These are benchmarking results on Linux/AMD64:
"default-sgen" is master, "sgen-concurrent-sweep" is this branch with concurrent sweep enabled, and "sgen-no-concurrent-sweep" is this branch with concurrent sweep disabled.
I don't know why binarytree is slower with this branch, concurrent sweep disabled. I ran it on my OSX machine and got these results:

Here are some pause time graphs. This branch on the left (with concurrent sweep enabled), master on the right.
[Pause time graphs: graph4, health, binarytree]