runtime: improve barrier implementation layering #21640

josharian opened this issue Aug 26, 2017 · 1 comment

@josharian josharian commented Aug 26, 2017

This is a migration of some discussion in CL 37628.

bulkBarrierPreWrite calls writebarrierptr_prewrite1 repeatedly. Every call to writebarrierptr_prewrite1 does some sanity checks and hops on and off the system stack. This seems silly.

@aclements wrote:

What do you think about instead lifting some of the setup from writebarrierptr_prewrite1 (like switching to the system stack) into bulkBarrierPreWrite and just calling gcmarkwb_m directly from bulkBarrierPreWrite? Then, as a possible second step, unrolling the bulkBarrierPreWrite loop that calls gcmarkwb_m?
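
To make the proposed change concrete, here is a minimal user-space sketch of the two layerings. It is not runtime code: `onSystemStack` and `shade` are hypothetical stand-ins for the system-stack switch and the per-pointer marking work, and the `switches` counter exists only to make the difference visible.

```go
package main

// switches counts simulated transitions onto the "system stack".
var switches int

// onSystemStack is a hypothetical stand-in for a system-stack switch:
// it only records the transition and then runs fn.
func onSystemStack(fn func()) {
	switches++
	fn()
}

// shade is a hypothetical stand-in for the per-pointer marking work.
// Nil (zero) slots are skipped.
func shade(marked map[uintptr]bool, p uintptr) {
	if p != 0 {
		marked[p] = true
	}
}

// barrierPerPointer models the layering described in the issue:
// every slot pays for its own stack switch.
func barrierPerPointer(slots []uintptr) map[uintptr]bool {
	marked := make(map[uintptr]bool)
	for _, p := range slots {
		p := p // capture the current slot value for the closure
		onSystemStack(func() { shade(marked, p) })
	}
	return marked
}

// barrierHoisted models the suggestion: switch once, then loop over
// all slots while already on the "system stack".
func barrierHoisted(slots []uintptr) map[uintptr]bool {
	marked := make(map[uintptr]bool)
	onSystemStack(func() {
		for _, p := range slots {
			shade(marked, p)
		}
	})
	return marked
}
```

In this model both functions mark the same set of pointers; the hoisted version performs one switch regardless of how many slots there are, while the per-pointer version performs one switch per slot.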

I wrote:

That was my first instinct, but I wasn't sure whether a very large bulk pre-write done entirely on the system stack might exceed the latency budget in a way that hopping on and off the system stack for each pointer would not.

@aclements wrote:

bulkBarrierPreWrite is already non-preemptible, so this wouldn't be making it any worse. In fact, it would be kind of nice for it to be obviously non-preemptible, rather than subtly non-preemptible like it is now. :)

Being non-preemptible for the whole operation isn't great, obviously. But if we wanted to fix this (which we might have to), I think we would need to do it at the level of typedmemmove and friends, by breaking the work up into smaller bulkBarrierPreWrite and memmove segments with a preemption point after each segment.
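
The segmenting idea can be sketched outside the runtime as follows. Everything here is an assumption made for illustration: `chunkedMove`, the `barrier` callback (standing in for bulkBarrierPreWrite), and the `yield` callback (standing in for a preemption point) are hypothetical, and the built-in `copy` stands in for memmove.

```go
package main

// chunkedMove copies src into dst in segments of at most chunk elements.
// For each segment it runs barrier over the segment, copies it, and then
// calls yield. It returns the number of segments processed.
func chunkedMove(dst, src []uintptr, chunk int,
	barrier func(dst, src []uintptr), yield func()) int {
	if chunk <= 0 {
		chunk = 1 // guard against a non-terminating loop
	}
	n := len(src)
	if len(dst) < n {
		n = len(dst) // never write past the destination
	}
	segments := 0
	for off := 0; off < n; off += chunk {
		end := off + chunk
		if end > n {
			end = n
		}
		barrier(dst[off:end], src[off:end]) // pre-write barrier for this segment only
		copy(dst[off:end], src[off:end])    // the segment's share of the move
		yield()                             // hypothetical preemption point
		segments++
	}
	return segments
}
```

The sketch only bounds the work done between two `yield` calls by `chunk`. It does not address whether a partially copied object is safe to observe at such a point, which would have to be settled in the runtime itself.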

There are several TODOs here:

  1. Improve the layering.
  2. Figure out whether typedmemmove and friends need to break their work up into chunks to avoid long pauses.
  3. Check whether loop unrolling in bulkBarrierPreWrite improves performance.
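
For the third item, one way to frame the experiment is to write a plain loop and a manually unrolled loop that perform identical work, and then compare them under a benchmark. The sketch below is illustrative only: `shadeAll`, `shadeAllUnrolled4`, and the `shade` callback are hypothetical names, and the unroll factor of 4 is an arbitrary choice.

```go
package main

// shadeAll visits every slot with a simple loop.
func shadeAll(slots []uintptr, shade func(uintptr)) {
	for _, p := range slots {
		shade(p)
	}
}

// shadeAllUnrolled4 visits the same slots in the same order, four per
// iteration, followed by a remainder loop for the last 0 to 3 slots.
func shadeAllUnrolled4(slots []uintptr, shade func(uintptr)) {
	i := 0
	for ; i+4 <= len(slots); i += 4 {
		shade(slots[i])
		shade(slots[i+1])
		shade(slots[i+2])
		shade(slots[i+3])
	}
	for ; i < len(slots); i++ {
		shade(slots[i])
	}
}
```

The two functions are meant to be behaviorally identical, so any measured difference would come from the loop structure alone. Whether unrolling helps in practice depends on the compiler and on how expensive the per-slot work is, which is exactly what the TODO asks to measure.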

@gopherbot gopherbot commented Feb 7, 2018

Change mentions this issue: runtime: remove legacy eager write barrier

gopherbot pushed a commit that referenced this issue Feb 13, 2018
Now that the buffered write barrier is implemented for all
architectures, we can remove the old eager write barrier
implementation. This CL removes the implementation from the runtime,
support in the compiler for calling it, and updates some compiler
tests that relied on the old eager barrier support. It also makes sure
that all of the useful comments from the old write barrier
implementation still have a place to live.

Fixes #22460.

Updates #21640 since this fixes the layering concerns of the write
barrier (but not the other things in that issue).

Change-Id: I580f93c152e89607e0a72fe43370237ba97bae74
Run-TryBot: Austin Clements
Reviewed-by: Rick Hudson
TryBot-Result: Gobot Gobot
@ianlancetaylor ianlancetaylor added this to the Unplanned milestone Mar 30, 2018
3 participants