Generalize the substituting write barriers #1038
Comments
There are two kinds of write barriers. The first kind is a call-back when the object graph is mutated. This kind of barrier only needs to be applied when updating a reference field. For example,
The second kind is a helper for the mutator to locate the field to update. This kind of barrier needs to be used when updating any field, including reference fields and non-reference value fields. For example,
In today's meeting, @steveblackburn mentioned that this distinction is similar to the two kinds of read barriers. The first kind (the "loaded value barrier") is applied to the reference loaded out of the slot, and the second kind (the "use barrier") is applied when loading any field out of an object. The barrier that ZGC uses to clear tag bits after loading belongs to the first category, while the Brooks' indirection barrier belongs to the second category.
When redesigning the barrier API, it will be helpful to acknowledge the difference between those two kinds of write barriers. We should also be careful not to make the API too invasive. There are far fewer write operations than read operations (about 1:8 in some benchmarks), and there may be many more write operations to non-reference value fields than to reference fields. We should make sure that VMs that don't need GC algorithms that require the second kind of barrier don't have to slow down their write operations for non-reference fields.
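To make the distinction between the two kinds of write barriers concrete, here is a minimal sketch on a toy heap. Everything in it (ToyHeap, write_ref_with_callback, resolve_for_write, write_any_field, the forwarding table) is hypothetical and made up for illustration; none of it is MMTk's API.

```rust
use std::collections::{HashMap, HashSet};

/// Hypothetical object handle: an index into the toy heap.
type ObjId = usize;

/// Toy heap where each object is a vector of word-sized fields.
#[derive(Default)]
struct ToyHeap {
    objects: Vec<Vec<usize>>,
    /// Objects remembered by the first kind of barrier.
    remembered: HashSet<ObjId>,
    /// Forwarding table consulted by the second kind of barrier.
    forwarding: HashMap<ObjId, ObjId>,
}

impl ToyHeap {
    /// First kind: a call-back when the object graph is mutated.
    /// Only writes to reference fields need to go through this.
    fn write_ref_with_callback(&mut self, src: ObjId, field: usize, target: ObjId) {
        self.objects[src][field] = target;
        // The call-back: tell the GC that `src` was mutated.
        self.remembered.insert(src);
    }

    /// Second kind: helps the mutator locate the copy to update.
    fn resolve_for_write(&self, obj: ObjId) -> ObjId {
        self.forwarding.get(&obj).copied().unwrap_or(obj)
    }

    /// Every field write, reference or not, has to resolve the object first.
    fn write_any_field(&mut self, obj: ObjId, field: usize, value: usize) {
        let actual = self.resolve_for_write(obj);
        self.objects[actual][field] = value;
    }
}
```

The point of the sketch is the cost model: write_ref_with_callback is only on the path of reference writes, while resolve_for_write sits on the path of every field write, which is why a VM that never needs the second kind should not have to pay for it.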
Some VMs have limits on the forms of write barriers.
Unknown slot address
For example, in CRuby, the
Instead, CRuby uses an alternative form of write barrier:
fn object_reference_written(
    mutator: Mutator,
    object: ObjectReference,
    old_value: ObjectReference,
    new_value: ObjectReference,
);
CRuby calls
Only remembering object
In other places, CRuby uses
fn object_modified(
    mutator: Mutator,
    object: ObjectReference,
);
During mutator time, it remembers
What API should MMTk core provide?
The remembering-only barrier
The form with old value instead of slot address
Our current
We have seen similar cases with Julia as well. There is no slot in Julia's write barrier function, and there is one kind of barrier that has no target object (just remembering the source object). This is not an issue for our object-remembering barriers, which just need the source object. As long as we know (and assert) which kind of barrier is in use, we can omit the arguments that the barrier does not actually use.
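A rough sketch of what "omit some arguments, but assert the barrier kind" could look like is below. The names (ToyBarrier, BarrierKind, SlotAddress, post_write) are hypothetical stand-ins chosen for this example, not MMTk's actual types or methods.

```rust
use std::collections::HashSet;

/// Hypothetical stand-ins for MMTk's reference and slot types.
#[derive(Clone, Copy, PartialEq, Eq, Hash, Debug)]
struct ObjectReference(usize);
#[derive(Clone, Copy, Debug)]
struct SlotAddress(usize);

/// Which barrier the selected plan uses (hypothetical enum).
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
enum BarrierKind {
    ObjectRemembering,
    FieldLogging,
}

struct ToyBarrier {
    kind: BarrierKind,
    remembered_objects: HashSet<ObjectReference>,
    logged_slots: Vec<usize>,
}

impl ToyBarrier {
    fn new(kind: BarrierKind) -> Self {
        ToyBarrier { kind, remembered_objects: HashSet::new(), logged_slots: Vec::new() }
    }

    /// Post-write barrier where the VM may not know the slot or the target.
    fn post_write(
        &mut self,
        src: ObjectReference,
        slot: Option<SlotAddress>,
        _target: Option<ObjectReference>,
    ) {
        match self.kind {
            // Only the source object is needed, so `slot` may be None.
            BarrierKind::ObjectRemembering => {
                self.remembered_objects.insert(src);
            }
            // This kind cannot work without the slot, so assert it.
            BarrierKind::FieldLogging => {
                let slot = slot.expect("a field-logging barrier needs the slot address");
                self.logged_slots.push(slot.0);
            }
        }
    }
}
```

With this shape, a VM like CRuby or Julia that cannot supply the slot would call post_write(src, None, None) and would be limited to plans whose barrier is of the object-remembering kind.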
That's what I am doing for Ruby's write barriers at this moment. I just set the
Maybe it is like pinning. Not all plans can support pinning, so if a VM has to pin objects, it can only use certain plans. If it has no such restriction, or if its developers are willing to refactor to remove the restriction, it can use all the plans and our API works fine with that. The same applies to write barriers. Unless the VM fixes the restriction it introduces on the VM side, it can only use certain write barriers. I am not saying that our write barrier API is fully general. But those two cases ("unknown slot address" and "only remembering object") sound like issues in the language implementation. We may want to document this more clearly, rather than changing our API for this reason.
Currently, the Barrier trait provides the substituting barrier, object_reference_write, which includes both the write operation to the slot and the write barrier semantics (such as checking the unlogged bits for the object-remembering barrier). However, this is not enough.
Memory orders
The read and write operations may have acquire and/or release semantics. For example, in Java,
seq_cst in C++14.
release in C++14.
Implementation-wise, those memory orders translate to different machine-level instructions.
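As a small illustration of why the memory order has to reach the barrier, the sketch below lets the caller pick the ordering of the store. ToyBarrier and reference_store are hypothetical names for this example; only AtomicUsize and Ordering come from the Rust standard library.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Hypothetical barrier that remembers the source object of each write.
struct ToyBarrier {
    remembered: Vec<usize>,
}

impl ToyBarrier {
    /// Substituting store where the VM chooses the memory order.
    /// Valid orders for a store are Relaxed, Release and SeqCst;
    /// `AtomicUsize::store` panics on Acquire or AcqRel.
    fn reference_store(&mut self, src: usize, slot: &AtomicUsize, new_ref: usize, order: Ordering) {
        // Barrier semantics (here: unconditionally remember the source).
        self.remembered.push(src);
        // The actual write, with the order requested by the VM.
        slot.store(new_ref, order);
    }
}
```

A barrier that hard-codes one ordering would either be too weak for a sequentially consistent field or needlessly strong for a plain one, so the order is passed through instead.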
Atomic read-modify-write operations
Atomic read-modify-write operations do both a read and (optionally) a write operation. For example, in Java,
f which in theory can do anything. It may be implemented with CAS or LL/SC underneath.
Note that for compareAndExchange and getAndUpdate, the write operation may or may not be executed, depending on whether the actual value in the slot matches the expected value. The write barrier should only be executed if the write operation actually takes place.
Updating non-reference fields
The Sapphire algorithm needs write barriers for non-reference fields, too. Sapphire keeps two copies of objects during GC. Read operations read from either copy, but write operations need to write to both copies. This means that every language-level field update, not only an update to a reference field, needs to write to two different addresses.
In order to support Sapphire, JikesRVM was refactored (mmtk/jikesrvm@721ca5a) to add write barriers for all field types in Java, but the default implementation (when a barrier is not required for a type) is a simple memory write operation. The Sapphire plan overrides the barrier to handle the forwarding (see: https://github.com/perlfu/sapphire/blob/1424b489dc667f6080fc601df5437a3b7f87e828/MMTk/src/org/mmtk/plan/otfsapphire/OTFSapphireMutator.java#L1104).
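The double-write idea can be sketched as follows. This is a deliberately simplified toy, not Sapphire's or JikesRVM's implementation: Replicated, write_field and read_field are hypothetical names, fields are plain u64 values, and it ignores that a real collector would translate reference fields to point at the new copies.

```rust
/// Toy object that may have a second copy while a copying GC is in progress.
struct Replicated {
    from_space: Vec<u64>,
    /// Present only while the object is being replicated.
    to_space: Option<Vec<u64>>,
}

impl Replicated {
    /// Write barrier for any field, reference or not: update every copy.
    fn write_field(&mut self, index: usize, value: u64) {
        self.from_space[index] = value;
        if let Some(copy) = self.to_space.as_mut() {
            copy[index] = value;
        }
    }

    /// Reads may use either copy; this sketch always reads the original.
    fn read_field(&self, index: usize) -> u64 {
        self.from_space[index]
    }
}
```

When to_space is None, write_field degenerates to a single store, which mirrors the default "simple memory write" behaviour described above.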
JikesRVM's approach requires one function for each field type, but Java's field types are limited. There are only byte, short, int, long, char, boolean, float, double and Object. This is feasible for a Java-specific GC framework. But the Rust MMTk is designed to be language-neutral.
Related topic: tagged pointers
Main issue: #1034
In some VMs, such as Ruby, a slot may sometimes hold a reference, and sometimes hold a value (such as a small integer). In some VMs, such as V8, a slot may hold a reference together with some tag bits to indicate it is holding a reference (not a value).
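As a rough sketch of such a slot, the code below decodes a raw word into an optional untagged reference. The tag scheme (lowest bit set means "reference") and the names TAG_MASK, REF_TAG, decode_slot and barrier_view are assumptions made up for this example, not the encoding of any particular VM.

```rust
/// Assumed tag scheme for illustration: the lowest bit marks a reference.
const TAG_MASK: usize = 0b1;
const REF_TAG: usize = 0b1;

/// Returns the untagged reference if the raw word holds one, else None.
fn decode_slot(raw: usize) -> Option<usize> {
    if raw & TAG_MASK == REF_TAG {
        Some(raw & !TAG_MASK)
    } else {
        None // the word holds a value, e.g. a small integer
    }
}

/// What a barrier would be handed for a write that replaces `old_raw`
/// with `new_raw`: references without tag bits, or None for values.
fn barrier_view(old_raw: usize, new_raw: usize) -> (Option<usize>, Option<usize>) {
    (decode_slot(old_raw), decode_slot(new_raw))
}
```

The VM does the decoding because only it knows the tag scheme; the GC side just sees "no reference" or a clean reference.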
If a slot can hold values, an atomic exchange or compare-exchange operation may exchange a reference with a value, or exchange a value with a reference. From the GC's point of view, it is like exchanging a valid reference with a null, and vice versa. But one obvious thing to notice is that the data to store into the slot may not be the reference MMTk cares about. For example, if V8 does a CAS operation to exchange a small integer (SMI) with a reference, and it is successful, then MMTk should observe that 'That slot did not hold a reference, and now it holds a reference', and the MMTk barrier implementation should not see the tag bit in the new reference. The same is true if the VM exchanges a reference with an SMI. If MMTk needs to remember the old value of the field, it should also remember the old reference without the tag bits.
Related topic: slot layout
In some VMs, a slot holds more than a pointer. Lua, for example, uses a two-word struct for each slot. One word holds a pointer (or a value, depending on the tag), while the other holds a tag. In such cases, we can no longer assume a slot holds exactly an ObjectReference. Currently, the Edge trait provides an abstraction over it, but it needs another method for overwriting the field, as opposed to updating the field when a copying GC forwards the object.
See also: #1034
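The difference between "updating for forwarding" and "overwriting" on a two-word slot might look like the sketch below. SlotLike, TaggedSlot, TAG_INT, TAG_REF and the method names are hypothetical and are not MMTk's actual Edge trait; the layout is only loosely modelled on the Lua-style slot described above.

```rust
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ObjectReference(usize);

/// Hypothetical two-word slot: a payload word plus a tag word.
#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct TaggedSlot {
    payload: usize,
    tag: u8,
}

const TAG_INT: u8 = 0;
const TAG_REF: u8 = 1;

/// Hypothetical slot abstraction with two distinct ways of writing.
trait SlotLike {
    fn load(&self) -> Option<ObjectReference>;
    /// GC-side update after the referent moved: the tag stays the same.
    fn update_for_forwarding(&mut self, new_ref: ObjectReference);
    /// Mutator-side write: payload and tag may both change.
    fn overwrite(&mut self, new_value: TaggedSlot);
}

impl SlotLike for TaggedSlot {
    fn load(&self) -> Option<ObjectReference> {
        if self.tag == TAG_REF { Some(ObjectReference(self.payload)) } else { None }
    }

    fn update_for_forwarding(&mut self, new_ref: ObjectReference) {
        debug_assert_eq!(self.tag, TAG_REF, "only reference slots are forwarded");
        self.payload = new_ref.0;
    }

    fn overwrite(&mut self, new_value: TaggedSlot) {
        *self = new_value;
    }
}
```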
Conclusion
In conclusion, one single Barrier::object_reference_write method is insufficient to support the rich semantics the VMs support, and it may need refactoring. There are mainly two things to take care of:
So a substituting write barrier method Barrier::object_reference_write should take both of these as arguments. The former may be ObjectReference arguments (or data structures to obtain the old and new ObjectReference in the slot), and the latter may be a call-back (closure) for the VM to do the actual storing or atomic RMW operation.
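One possible shape for such a method is sketched below, under the assumption that the closure reports whether a write actually happened. ToyBarrier and object_reference_write_with are hypothetical names for this example and not a proposal for the exact MMTk signature.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

#[derive(Clone, Copy, PartialEq, Eq, Debug)]
struct ObjectReference(usize);

/// Hypothetical barrier that remembers source objects of completed writes.
#[derive(Default)]
struct ToyBarrier {
    remembered: Vec<ObjectReference>,
}

impl ToyBarrier {
    /// The VM passes the references the GC cares about, plus a closure
    /// that performs the actual store or atomic RMW operation.
    /// The closure returns true if the slot was actually written.
    fn object_reference_write_with<F>(
        &mut self,
        src: ObjectReference,
        _old: Option<ObjectReference>,
        _new: Option<ObjectReference>,
        do_write: F,
    ) -> bool
    where
        F: FnOnce() -> bool,
    {
        let written = do_write();
        if written {
            // Barrier semantics run only when the write took place.
            self.remembered.push(src);
        }
        written
    }
}
```

A VM could then wrap a compare-exchange as `barrier.object_reference_write_with(src, old, new, || slot.compare_exchange(expected, desired, Ordering::SeqCst, Ordering::SeqCst).is_ok())`, where `slot` is an `AtomicUsize`; memory order, tag handling and slot layout all stay inside the VM's closure.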