In the runtime there's an implicit invariant being maintained: write barriers must not be invoked while mheap.lock is held, or the runtime could deadlock. This invariant is not documented anywhere, nor is it enforced via nowritebarrier or nowritebarrierrec annotations.
Consider the following situation:
1. Something calls into runtime.XXX.
2. runtime.XXX grabs the heap lock.
3. runtime.XXX assigns a heap pointer value into some location.
4. The write barrier is on and is invoked.
5. The write barrier attempts to enqueue the pointer into a marking work buffer.
6. There are no empty work buffers available to enqueue the pointer for marking.
7. The write barrier calls into mheap.allocManual to allocate space for more work buffers.
8. mheap.allocManual attempts to grab the heap lock, which is still held: deadlock.
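The scenario above boils down to a self-deadlock on a non-reentrant lock. The following is a minimal analogy, not actual runtime code: heapLock, allocManual, and writeBarrier are hypothetical stand-ins for mheap.lock, mheap.allocManual, and the barrier's slow path.

```go
package main

import (
	"fmt"
	"sync"
)

// heapLock stands in for mheap.lock. sync.Mutex is not reentrant, so
// locking it again from the same goroutine blocks forever.
var heapLock sync.Mutex

// allocManual stands in for mheap.allocManual: it needs the heap lock.
func allocManual() {
	heapLock.Lock() // deadlocks if the caller already holds heapLock
	defer heapLock.Unlock()
}

// writeBarrier stands in for the barrier path that, when no empty work
// buffers are available, calls back into the allocator.
func writeBarrier(buffersEmpty bool) {
	if buffersEmpty {
		allocManual()
	}
}

func main() {
	heapLock.Lock()
	// A heap pointer write here would invoke writeBarrier(true),
	// re-enter heapLock, and deadlock. Only the safe path is shown:
	writeBarrier(false)
	heapLock.Unlock()
	fmt.Println("ok: barrier did not need to allocate")
}
```

If writeBarrier(true) were called while heapLock is held, allocManual would block on the lock its own caller holds, which is exactly the re-entry the invariant rules out.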
After some manual inspection, I'm fairly convinced this isn't actually a problem today: everywhere the heap lock is held, only notinheap-annotated structures are manipulated, so the compiler never generates write barriers in those critical sections.
traceBufs, for example, are managed via sysAlloc and sysFree directly, whereas GC mark work buffers live in heap-allocated spans. If we moved GC mark work buffers to the same model, the invariant would effectively go away. Making this change will require some work, so I'm kicking this issue to 1.13.
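To illustrate the difference between the two models, here is a rough, Linux-only sketch of sysAlloc-style allocation: memory obtained straight from the OS that never lives in a heap-allocated span, so the GC never scans it and no write barriers apply to stores into it. The names sysAllocBuf and sysFreeBuf are hypothetical; the real runtime functions are internal and lower-level than this.

```go
package main

import (
	"fmt"
	"syscall"
)

// sysAllocBuf is a crude analogy for the runtime's sysAlloc: it maps
// anonymous memory directly from the kernel, bypassing the Go heap.
func sysAllocBuf(n int) ([]byte, error) {
	return syscall.Mmap(-1, 0, n,
		syscall.PROT_READ|syscall.PROT_WRITE,
		syscall.MAP_ANON|syscall.MAP_PRIVATE)
}

// sysFreeBuf returns the mapping to the OS, analogous to sysFree.
func sysFreeBuf(b []byte) error {
	return syscall.Munmap(b)
}

func main() {
	buf, err := sysAllocBuf(4096)
	if err != nil {
		panic(err)
	}
	buf[0] = 42 // plain store into off-heap memory: no write barrier
	fmt.Println(buf[0])
	if err := sysFreeBuf(buf); err != nil {
		panic(err)
	}
}
```

The trade-off is that off-heap memory must be freed manually and must not hold heap pointers the GC needs to see, which is why moving mark work buffers to this model requires some care.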