
Updating the Go memory model #47141

rsc announced in Discussions
Jul 12, 2021 · 15 comments · 48 replies

I posted a blog post Updating the Go memory model about some changes I am planning to propose regarding Go's memory model.

This discussion is for collecting feedback about those changes before filing an official proposal.

Please make good use of the limited threading below: post a top-level comment for something new and use the reply feature to reply to someone else's comment. We hope this will help keep discussion manageable.

Thank you!

Replies


A hearty +1 to atomic.Bool and friends.

5 replies
@cristaloleg

Some implementations already exist, for example https://github.com/cristalhq/atomix or the more popular https://github.com/uber-go/atomic (ah, and many more inside the stdlib as unexported types)

@cespare

cespare Jul 12, 2021
Collaborator

In general the implementation of those types is trivial; the API design is the only interesting bit.

(Edit: To be clear, I mean it's trivial to build things like Bool on top of existing atomic operations. Getting the underlying atomic ops right is obviously tricky and requires compiler/hardware awareness, but that work has already been done.)
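To illustrate cespare's point, here is a minimal sketch of a `Bool` built on the existing `sync/atomic` operations; the type and method names are hypothetical, chosen to mirror the proposal's direction, not an actual API.

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Bool is a hypothetical atomic boolean built on existing sync/atomic
// primitives. The hard part (correct underlying atomic ops) is already
// done; the wrapper itself is trivial, as noted above.
type Bool struct{ v uint32 }

func (b *Bool) Load() bool     { return atomic.LoadUint32(&b.v) != 0 }
func (b *Bool) Store(val bool) { atomic.StoreUint32(&b.v, b32(val)) }
func (b *Bool) CAS(old, new bool) bool {
	return atomic.CompareAndSwapUint32(&b.v, b32(old), b32(new))
}

// b32 converts a bool to the 0/1 encoding used internally.
func b32(b bool) uint32 {
	if b {
		return 1
	}
	return 0
}

func main() {
	var flag Bool
	flag.Store(true)
	fmt.Println(flag.Load()) // true
}
```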

@rsc

rsc Jul 12, 2021
Maintainer Author

I'd be interested to see representative examples of use of atomic.Bool. I wonder if there is a higher-level API we should be targeting for package sync as well. I mean, something like sync.Once is roughly an atomic.Bool, but a bit more, and the bit more makes it much better.
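A sketch of the "bit more" rsc mentions, under the assumption that the difference he means is blocking: a bare atomic flag can decide who runs initialization, but losers of the race can proceed before it finishes, while `sync.Once` makes them wait. All names below are illustrative.

```go
package main

import (
	"fmt"
	"sync"
	"sync/atomic"
)

var (
	flag  uint32 // 0 = init not claimed, 1 = claimed
	once  sync.Once
	ready int
)

// initWithFlag guarantees setup runs at most once, but callers that
// lose the CAS race may return before setup has completed.
func initWithFlag() {
	if atomic.CompareAndSwapUint32(&flag, 0, 1) {
		ready = 42 // racing callers do NOT wait for this write
	}
}

// initWithOnce additionally blocks racing callers until the
// function passed to Do has returned.
func initWithOnce() {
	once.Do(func() { ready = 42 })
}

func main() {
	initWithOnce()
	fmt.Println(ready) // safe: once.Do returning happens after the write
}
```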


There's a minor typo in a code sample towards the end:

i := *p
if i < 0 || i >= len(fns) {
	panic("invalid function index")
}
... complex code ...
// compiler must NOT reload i = *p here
funcs[i]()

The call to len should be len(funcs).

That aside, the proposal looks great. I too would prefer not to add unsynchronized atomics.

3 replies
@cespare

Other typos: search for "sequental" (there are two instances).

@rsc

rsc Jul 12, 2021
Maintainer Author

Thanks, fixed both, and a "straighforward".

@kortschak

"This section list the adjustments..." s/list/lists/


@rsc, I don't quite follow this:

To answer an obvious suggestion, I don't see a clean way to use generics to provide just a single atomic.Atomic[T] that would let us avoid introducing Bool, Int, and so on as separate types, at least not without special cases in the compiler.

I see a specific issue with Bool in that it doesn't have an Add operation. What's the problem with defining a single parameterized type for the numeric ones? (Maybe you still want special compiler support for performance, but it sounds like you're talking about a type system issue.)

6 replies
@rsc

rsc Jul 12, 2021
Maintainer Author

I don't see how to write the function bodies. One option is that they have no bodies and are provided by the compiler. Another option is a giant type switch plus logic in the compiler to make sure those get small enough to inline down to a single line. That would be a fair amount of compiler special casing too. Generics+privileged special cases is pretty different from generics.

@szabba

With a future extension to generics like switch type T:

  • the implementations could be kept type safe,
  • which branch is taken in a given generic type instantiation would be known at compile time.

Then an optimization that removed the other branches would not have to be a special case. Or am I confused and is that different from what you're talking about?

@ianlancetaylor

Note that we already have special cases in the compiler for all of the individual sync/atomic functions. The actual code in sync/atomic is not used for normal builds. So if we introduced atomic.Atomic[T atomicTypes] we would indeed require a special case in the compiler, but it is a special case that is essentially already there. The fallback method implementations could use a type switch as you suggest.
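A sketch of the type-switch fallback Ian describes, assuming a constraint listing exact supported types (the constraint, type, and method names here are hypothetical; for normal builds the compiler would replace these bodies with intrinsics):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// atomicTypes is a hypothetical constraint of exactly the types with
// dedicated sync/atomic operations.
type atomicTypes interface {
	int32 | int64 | uint32 | uint64
}

// Atomic is a hypothetical generic atomic cell. The type switch on
// any(&a.v) resolves at instantiation time, so each branch is only
// ever taken for the matching T.
type Atomic[T atomicTypes] struct{ v T }

func (a *Atomic[T]) Load() T {
	switch p := any(&a.v).(type) {
	case *int32:
		return any(atomic.LoadInt32(p)).(T)
	case *int64:
		return any(atomic.LoadInt64(p)).(T)
	case *uint32:
		return any(atomic.LoadUint32(p)).(T)
	case *uint64:
		return any(atomic.LoadUint64(p)).(T)
	}
	panic("unreachable")
}

func (a *Atomic[T]) Store(v T) {
	switch p := any(&a.v).(type) {
	case *int32:
		atomic.StoreInt32(p, any(v).(int32))
	case *int64:
		atomic.StoreInt64(p, any(v).(int64))
	case *uint32:
		atomic.StoreUint32(p, any(v).(uint32))
	case *uint64:
		atomic.StoreUint64(p, any(v).(uint64))
	}
}

func main() {
	var n Atomic[int64]
	n.Store(7)
	fmt.Println(n.Load()) // 7
}
```

Note this constraint deliberately lists exact types rather than `~int32` etc., since defined types would not match the `*int32` case in the switch.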


I like the idea of being clear about what kind of invalid programs are allowed in the case when you have a data-race and how they might behave. I also like the call-out that concurrent multi-word reads/writes allow you to observe inconsistent (and potentially corrupt) structures.

One thing that I think could be more explicit is the behaviour when you have concurrent reads and writes to a memory location X that is smaller than (or equal in size to) a machine-word.

  var x int8 = 0

  go func() {
    x = 1
  }()
  go func() {
    fmt.Println(x) // could this print anything except 0 or 1?
  }()

I think either of the following options makes sense, and experimentally the first seems true (and is definitely the most expected!):

  • For a memory location X where an earlier write set X=A: if a write that sets X=B and a read of X happen concurrently, then the read may observe X having a value of either A or B.
  • For a memory location X where an earlier write set X=A: if a write that sets X=B and a read of X happen concurrently, then the read may observe X having any value.
6 replies
@ConradIrwin

(Similarly if you have two concurrent writes, for X=B and X=C. Is it the case that a read of X that happens after both writes must observe either X having the value B or C?)

@rsc

rsc Jul 12, 2021
Maintainer Author

The existing text at https://golang.org/ref/mem in "Happens Before" covers this. The post is concerned with changes, but this is one thing the existing text already gets right (I think). For up-to-machine-word-sized values, a read must observe some value that was actually written there. Unlike C++, the read is not allowed to observe "any value".

@ConradIrwin

I don't think the current rules actually say this (though it is quite possible that I'm missing an implication in the text – as you say, it is very subtle). The two cases I am worried about:

  1. (from your blog post) "[a compiler] must not allow a single write to write multiple values.".

    // initially *p = 2, i = 2

    *p = i + *p/2
    // is "optimized" to
    *p /= 2
    *p += i


    There are three possibilities for a concurrent read in the "optimized" case: it sees either 1, 2, or 3. As in your blog post we'd like to exclude reading a 1, but (at least on my reading) the current Go memory model's rules do not:

    • In this case there is no synchronization to establish "happens before" relationships, so the read is not "guaranteed to observe" any particular write.
    • The read is "allowed to observe" any of the writes.
    • There seems to be a missing exclusion of "The read is not allowed to observe anything else"
    • I'm not sure the best way to explain the guarantee, maybe: "if any write w to a memory location l happens before a read r of l, r is guaranteed to observe either w or a write w' to l that happens before or concurrently to r and which does not happen before w."). It may need additional context for reading from uninitialized memory before any writes to that location (though I think that's only possible with unsafe/cgo?), "If no writes to l happen before a read, that read may see an arbitrary value (though this is unlikely in practice, as the write during initialization happens before any read to a variable)".
  2. A (maybe hypothetical?) worry about the behavior of writes to machine-word-sized values or smaller. As someone who's not very familiar with hardware architecture, it's not clear to me what guarantees I get. The current text calls out "Reads and writes of values larger than a single machine word behave as multiple machine-word-sized operations in an unspecified order." This doesn't say anything about reads and writes of values smaller than (or equal in size to) machine words (though it does imply they probably are defined in some way). I think the suggestion for 1. is sufficient to handle this; but it could be made clearer for future readers by saying: "Even in the presence of data races, reads of a machine-word-sized location will not see partially written values."

Either way this may just be a scope-creep request on top of your change to make the implications more explicit, not push-back on the idea. I'd love to see the page significantly expanded and clarified as you propose.


In the list of ordering of operations, the word "happens" seems imprecise:

  • A receive from an unbuffered channel happens before the send on that channel completes.
  • The k'th receive on a channel with capacity C happens before the k+C'th send from that channel completes.

If you read "happens" to mean "begins and ends," then it would seem to be impossible for a read to end before the corresponding send completes. Maybe "happens" should be "begins," if that's the correct semantics.

3 replies
@rsc

rsc Jul 13, 2021
Maintainer Author

The sentences are correct, if subtle.

If you think of the send as a function call send(channel, value), then "the send completes" means "the call returns". The actual communication happens during the call, before the return, so it does make sense for the receive to happen before the send completes (unblocks and lets the code doing the send keep running).

@bshanks

(i made a similar comment below): As others noted, a potential problem is that we have both "send < receive" (from the existing text in https://golang.org/ref/mem section channel communication) and also "receive < send" (where < means happens-before). Perhaps channel send/receive could be modeled as consisting of separate begin and end events: send_begin < receive_begin < send_end < receive_end

@kylelemons

The way that I think about it, it does make sense, because synchronous channel operations do establish a mutual happens-before relationship.

ch = make(chan int)
{ x = A ;    <-ch ; return y }
{ y = B ; ch <- 1 ; return x }

It is safe to write to x/y and to read y/x because you can trace the happens-before in both directions (for sync channels):

  • x = A happens before <-ch happens before ch <- 1 happens before return x
  • y = B happens before ch <- 1 happens before <-ch happens before return y
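The two-line sketch above can be made runnable; this is my own illustrative arrangement (the result channel is just a way to collect the reads), showing that with an unbuffered channel each goroutine can safely read the other's plain write:

```go
package main

import "fmt"

func main() {
	var x, y int
	ch := make(chan int)
	res := make(chan int, 2)
	go func() { x = 1; <-ch; res <- y }()   // x = A ;    <-ch ; return y
	go func() { y = 2; ch <- 1; res <- x }() // y = B ; ch <- 1 ; return x
	// Both reads are race-free: x=1 happens before <-ch happens before
	// ch<-1 completing happens before the read of x, and symmetrically
	// for y. So the sum is always 3.
	fmt.Println(<-res + <-res) // 3
}
```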

Some minor comments from the first pass.

s/Like in Java/As in Java/

Two bullet points in "Go's memory model today" are mutually incompatible:

  • A send on a channel happens before the corresponding receive from that channel completes.
  • A receive from an unbuffered channel happens before the send on that channel completes.

Did you miss a "buffered" in the first bullet?

s/This section list/&s/

In the summary in the next section, s/useful/practical/.

The definition of data race doesn't feel quite right to me. Maybe concurrently is the issue. Try this: A data race is defined as a write to a memory location happening while another read or write to that same location is also happening.

I like the point about encouraging external (please don't say third-party; I don't even know who the second party is) packages documenting their semantics.

Do you need to say more about the operations around atomic operations? I think there is a happens-before claim you can make about code on either side of an atomic update, but I'm not sure. What you have there now talks only about the atomics, not the rest. Look at:

a = b
<atomic operation on c>

What can you say about a here?

In "C++ provides both, of course", drop "of course". No need to editorialize.

Outside the parens, say "barriers" not "fences"

I would argue strongly against adding unsynchronized atomics, to keep with the philosophy of enough is enough, and the subtlety of adding more operations with different semantics.

Where would you document the disallowed compiler optimizations? Also, can you find a categorical definition instead of a catalog of bad examples?

3 replies
@bshanks

I think something like "A send on a channel happens before the corresponding receive from that channel completes" is needed even in unbuffered channels to prevent a paradox like printing "0" in:

// Thread 1           // Thread 2
x = 1;                while(done == 0) { /* loop */ }
done = 1;             print(x);

(a similar Go example is in https://golang.org/ref/mem section Channel communication)

Perhaps a solution is to specify channel send/receive as consisting of separate begin and end events: send_begin < receive_begin < send_end < receive_end (where < means happens-before)
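For reference, the race-free Go analogue of the loop above (in the spirit of the example in https://golang.org/ref/mem, here using close on a channel instead of the racy done flag):

```go
package main

import "fmt"

func main() {
	var x int
	done := make(chan struct{})
	go func() {
		x = 1
		close(done) // x = 1 happens before the close
	}()
	<-done // the close happens before this receive completes
	fmt.Println(x) // guaranteed to print 1
}
```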

@bshanks

"Concurrently" is defined in the existing text at https://golang.org/ref/mem in section Happens Before. In the definition of data race, the new text could refer to the existing definition of concurrently by adding something like "(see below for a precise definition of concurrently)".

@bshanks

I think the happens-before between "a=b" and "<atomic operation on c>" is already covered by the existing text at https://golang.org/ref/mem : "Within a single goroutine, the happens-before order is the order expressed by the program."


Possibly related: cache alignment and false sharing.

It would be great if there were a way to force cache-aligned allocation, so as to avoid false sharing when concurrent access occurs.

There are some workarounds, such as adding padding to ensure one thing is not on the same cache line as another, but having more precision, and especially having a way to align the start of a struct, would be very helpful.

1 reply
@rsc

rsc Jul 13, 2021
Maintainer Author

There have been proposals about alignment that we have not gotten to agreement on yet. Happily that's only an optimization problem and not a "different semantics" problem, so it's separable from the memory model more generally.

If you do a fresh allocation of a size that is a multiple of some power of two, you can generally assume it will be that-power-of-two aligned, which is enough for programmers being very careful about this sort of thing.


While I agree with the statement about the loss of consistency of internal (pointer, length) and (pointer, type) pairs, I would like to see a rationale for it, even though it may be obvious. Not being an expert, I guess consistency would require synchronization, which would lead to higher memory requirements and performance penalties, and would break programs relying on the current internal representation of the affected data structures; and the benefit would be small, because parallel access to multiple fields of a structure or to multiple variables would still be inconsistent.

1 reply
@thepudds

Hi @ulikunitz, If you haven’t already read this, you might be interested in:

https://research.swtch.com/gorace

…especially the section titled “The Fix”.

The conclusion includes:

The races exist because the data representations were chosen for performance: the race-free versions introduce an extra pointer, which carries with it the cost of extra indirection and extra allocation.

…and there is also some discussion there of additional instructions required.


Hi @rsc, I don't quite understand this sentence: "If the effect of an atomic operation A is observed by atomic operation B, then A happens before B." In my opinion this is obvious, so I don't know if I'm missing some point. Could you elaborate? Thanks.

3 replies
@Merovius

I think that is because you consider "happens before" colloquially, instead of formally. It is obvious that a write must occur before a read observing it, in real time. It is not obvious that a read observing a write creates an edge in the happens-before partial ordering.

Note that it's also obvious, for example, that "The go statement that starts a new goroutine happens before the goroutine's execution begins". But this isn't about making statements about the "real time", it's about what specific partial ordering edges exist in any given execution of a program.

(Personally, that's why I dislike the term "happens before" - it is not clearly enough distinguished from its colloquial use, for my taste. But it's a standard term in the field, so we're stuck with it)

@bshanks

To extend on what Merovius said, it may be helpful to think of "happens-before" as something with a weird name; for example, pretend it is called "ax343" instead of "happens before". We are building up a mathematical definition of this thing called "ax343". The purpose of doing this is to allow us to make precise mathematical proofs (or even to use an automated theorem prover) regarding the properties of the mathematical relation ax343.

@erifan

Oh, I misunderstood the "happens before" relationship, thanks for your answers.


Thanks for all of your work on this. I also really loved the other blog posts in that series. I have a question about the goals for Go's memory model in the case of races, and how these goals differ from Javascript's (for my future reference: https://tc39.es/ecma262/#sec-memory-model ). At first glance it sounds like Go wants the same as what Javascript wants; both Javascript and Go eschew "DRF or catch fire", and, like Javascript, you also say that with Go "programs with data races have defined semantics with a limited number of outcomes". But you also imply that Go wants to sit in-between Javascript and C in some fashion.

Is it accurate to say that you want less specification complexity than Javascript, and in exchange you are willing to pay the price of more formal ambiguity in the following form: sometimes Go will permit a variable to take on a 'paradoxical' value, when this would have been ruled out by Javascript's more extensive formal semantics?

1 reply
@rsc

rsc Jul 23, 2021
Maintainer Author

Go and JavaScript match well until you get to multiword values like slices or interface values. Those can lead to corruption in Go; there's no equivalent in JavaScript. But for single-word values, yes, they match.


The proposal describes the happens-before relationships that sync/atomic creates. It also, less formally, says that memory locations that are accessed with sync/atomic must never be accessed without sync/atomic. (The next section says "Whether a particular value should be accessed with atomics is therefore a property of the value and not of a particular access.")

If we can arrange for happens-before relationships that eliminate racy access to a memory location that was previously accessed with sync/atomic, can a correct program later do a non-atomic read of that location? Can a correct program do non-atomic initialization of locations that are later accessed with sync/atomic?

I think the answer is "yes, that's allowed and will work according to the usual happens-before rules". But the way that sync/atomic is discussed—both in the current memory model and with the proposed changes—makes it seem like those operations somehow taint the memory location and make the normal set of happens-before rules not completely describe the situation.

https://play.golang.org/p/Cb73lX6ZPuR

package main

import (
	"fmt"
	"sync/atomic"
)

var a int32 = 1

func init() {
	a += 2
}

func main() {
	a += 4 // hb line 18 via line 17, and hb line 21
	c := make(chan struct{})
	go func() {
		atomic.AddInt32(&a, 8) // hb line 23 via 22 and 19
		close(c)
	}()
	atomic.AddInt32(&a, 16) // hb line 23
	<-c
	a += 32
	fmt.Println(a) // could this print anything but 63?
}
1 reply
@rsc

rsc Jul 23, 2021
Maintainer Author

I think the answer is "yes, that's allowed and will work according to the usual happens-before rules".

Yes, that's the answer, and the text is carefully chosen to make that the answer.
(The only other possible answer is "or else catch fire", which we're avoiding.)

Atomic types would make that question impossible to ask, which I think is also fine,
but we still need to answer the question for the current atomic operations.


If we adopt the atomic.Int32 etc (which I'm in favor of), would that make it possible to verify the alignment requirements of atomic variables? Could the compiler (or go vet) check that atomic fields are aligned correctly and reject a program if they are not?

As discussed below, it would probably be preferable if the compiler would just automatically align the dedicated atomic types to fit the alignment guarantees.

6 replies
@mknyszek

Speaking personally, I'm in favor of new atomic types just having an alignment that is guaranteed to work on the platform.

A go vet check only ever goes so far, because if the compiler chooses to heap allocate an atomic.Uint64 for some reason, the alignment of the object containing it could vary. The compiler (as you also suggest) would have to identify that the value does actually escape to the heap, and then report an error because it can't guarantee that the type will be correctly aligned. Note that although this specific situation is somewhat contrived, it extends to heap-allocated types containing these values as well. To me, that seems like an unnecessary restriction. For larger types it's not an issue today in practice, because the minimum alignment of most types in the allocator is quite high. However, even today this is an issue for small types (see #37262). To really drive home the degree to which the compiler might need to be conservative here, consider a slice type of structures containing such a tiny structure. If its backing store is of length 1, it might have a different guaranteed alignment than length 10.

My apologies for my thoroughness here. I don't intend to shut down your suggestion by any means. :) I think it's a completely valid and useful one.

Back to my favored path forward: forcing the right alignment for a new type also doesn't have the same backwards compatibility issues that changing the default alignment of uint64 on i386 does (see #36606). That's a big plus, IMO. Unfortunately, forcing this new type to have a special alignment does result in additional special cases in the compiler. I don't think it's to the same extent of special casing as a generic atomic type, but that additional complexity is still a factor.

@Merovius

Speaking personally, I'm in favor of new atomic types just having an alignment that is guaranteed to work on the platform.

If we can, sure, that would likely be preferable. I wasn't sure the compiler is allowed to pad, but I guess it is.

FWIW I did think about the issues you bring up, but I don't think they are a problem. But there's no point arguing that - clearly, if we can just align them correctly, that's better.

@zephyrtronium

Unfortunately, forcing this new type to have a special alignment does result in additional special cases in the compiler. I don't think it's to the same extent of special casing as a generic atomic type, but that additional complexity is still a factor.

Notably, those special cases are effectively already implemented in https://go-review.googlesource.com/c/go/+/308971/, pending acceptance of #19057.


I have a module which makes significant use of atomic Booleans. Having atomic.Bool would simplify some parts of its implementation, but it is insufficient to fully replace how I use atomics in it. In particular, there are situations in which I need to inspect and update multiple flags simultaneously.

This code had me on the verge of writing a proposal for an atomic.Flags type, providing bitwise rather than arithmetic operations on either uint8 or uint32. I think that such a type would be valuable even if atomic.Bool comes to exist, but there is obviously overlap between a type that describes one Boolean and a type that describes a set of related Booleans. Maybe it's time to write that proposal.
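A sketch of what such a hypothetical atomic.Flags type might look like, built from CAS loops over the existing sync/atomic primitives (the type and method names are illustrative, not a real package):

```go
package main

import (
	"fmt"
	"sync/atomic"
)

// Flags is a hypothetical set of atomic boolean flags packed in a
// uint32, with bitwise operations implemented as CAS loops.
type Flags struct{ v uint32 }

// Set atomically ORs mask into the flags, returning the old value.
func (f *Flags) Set(mask uint32) (old uint32) {
	for {
		old = atomic.LoadUint32(&f.v)
		if atomic.CompareAndSwapUint32(&f.v, old, old|mask) {
			return old
		}
	}
}

// Clear atomically AND-NOTs mask out of the flags, returning the old value.
func (f *Flags) Clear(mask uint32) (old uint32) {
	for {
		old = atomic.LoadUint32(&f.v)
		if atomic.CompareAndSwapUint32(&f.v, old, old&^mask) {
			return old
		}
	}
}

// Has reports whether all bits in mask are set.
func (f *Flags) Has(mask uint32) bool {
	return atomic.LoadUint32(&f.v)&mask == mask
}

func main() {
	const ready, closed = 1 << 0, 1 << 1
	var f Flags
	f.Set(ready | closed)
	f.Clear(closed)
	fmt.Println(f.Has(ready), f.Has(closed)) // true false
}
```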

3 replies
@rsc

rsc Jul 23, 2021
Maintainer Author

I'm not sure we need atomic.Flags any more than we need non-atomic Flags.
You can always build one yourself out of a typed uint32, and building one yourself lets you give the actual flag bits a unique type.

@zephyrtronium

Non-atomic flags can be implemented on any integer type using f |= m, f &= m, &c. There are no equivalent atomic operations in sync/atomic; the only solution is CAS loops, even on e.g. amd64 where LOCK ORL would be an option for the generated code. So, I disagree with that comparison. The building blocks are different.

I should mention, I only ever considered writing a proposal because I saw these operations were available (on some platforms?) in runtime/internal/atomic. The suggestion would have been essentially to export that API, with refinements for usability.

@rsc

rsc Jul 23, 2021
Maintainer Author

Fair enough, although that would probably be better on Int etc than on a separate Flags. Thanks.


I think if I were reading the Go Memory Model page without having read much about memory models and data races before I would find this section to be a little misleading:

Programs with data races are invalid in the sense that an implementation may report the race and terminate the program. But otherwise, programs with data races have defined semantics with a limited number of outcomes, making errant programs more reliable and easier to debug.

Programs with data races may also be "invalid" in another less formal sense, which is that they can no longer rely on the language performing to spec. To give the example from elsewhere in the doc, a racing write to a map could corrupt arbitrary memory, which sure doesn't sound like "defined semantics with a limited number of outcomes"! If I understand correctly, that phrase refers to each individual memory read/write, but that may not be clear to someone who is not familiar with the underlying issues/history.

0 replies

I found the proposal missing a discussion of runtime.SetFinalizer, which was previously reported in #9442 (probably the only API that falls outside the channel and sync packages?). As @dvyukov pointed out:

package main

import (
	"runtime"
	"time"
)

type X struct {
	y *int
}

func main() {
	x := new(X)
	runtime.SetFinalizer(x, func(x *X) {
		if x.y != nil && *x.y != 42 {
			panic("race")
		}
	})
	i := 42
	x.y = &i
	runtime.GC()
	time.Sleep(time.Second)
}

If the compiler reorders "i := 42" and "x.y = &i", the result can be unexpected; this falls into the category of compiler optimization rules.

@ianlancetaylor also suggested the description "Given a pointer p, there is no possible ordering such that the finalizer happens-before any read or write of *p." because the inverse statement would be too strong and would prevent possible compiler optimizations.

Maybe this is more understandable (?): Given a pointer p, any read or write of *p either happens-before or happens concurrently with its finalizer.

6 replies
@dvyukov

FWIW Java does not guarantee any ordering between a finalizer and any accesses to the object. But the Go race detector currently assumes they are ordered.

@Merovius

Maybe this is more understandable (?): Given a pointer p, any read or write of *p either happens-before or happens concurrently with its finalizer.

"Given a pointer p, no read or write of *p happens before its finalizer" is equivalent and snappier (though maybe more obscure in its consequences).

@rsc

rsc Jul 23, 2021
Maintainer Author

I think the rule is "A call to SetFinalizer(x, f) happens before the finalizer invocation f(x)."
