Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Proposal: deterministic execution #33702

Closed
mfateev opened this issue Aug 17, 2019 · 5 comments
Closed

Proposal: deterministic execution #33702

mfateev opened this issue Aug 17, 2019 · 5 comments

Comments

@mfateev
Copy link

mfateev commented Aug 17, 2019

The Cadence Workflow Go client side library uses event sourcing to reconstruct the state of the user provided program. The basic idea that a piece of code given the same set of external inputs should reach exactly the same state given that a code is deterministic.

The main sources of nondeterminism in Go programs are:

  1. range over map
  2. select
  3. goroutines
  4. rand
  5. time
  6. external API calls

So Cadence Workflow programs have to use ugly workarounds to become deterministic:

  1. range over map is prohibited
  2. Custom deterministic Selector must be used instead of native select
  3. Custom coroutines package with cooperative multithreading on top of Go goroutines is used. This requires a lot of boilerplate code like custom channel and wait group wrappers
  4. Use of rand package is prohibited
  5. Framework provides alternative API for Now and Sleep.

I believe that there are a lot of other use cases that would benefit from the deterministic execution. For example recovering database state from a transactional log and other event sourcing applications. Here is another example I found in the golang-nuts.

My strawman proposal is to provide a way to execute a piece of a Go code deterministically. The simplistic example:

	events := make(chan int, 10)
	currentTime := time.Time{}
	randomSeed := readRandomSeadFromStorage()
	dispatcher := dispatcher.ExecuteDeterministically(randomSeed,
		func() time.Time { return currentTime },
		func() {
			val := 0
			go func() {
				startFirst := time.Now()
				for i := 0; i < 10; i++ {
					val = val + rand.Intn(100) + <-events
					fmt.Printf("first: %v\n", val)
				}
				fmt.Printf("first latency: %v\n", time.Now().Sub(startFirst))
			}()
			go func() {
				startSecond := time.Now()
				for i := 0; i < 10; i++ {
					val = val + rand.Intn(100) + <-events
					fmt.Printf("second: %v\n", val)
				}
				fmt.Printf("second latency: %v\n", time.Now().Sub(startSecond))
			}()
		})

	for {
		next, nextEventTimestamp := readNextEventFromStorage()
		if next == 0 || dispatcher.Done() {
			break
		}
		currentTime = nextEventTimestamp
		events <- next
		dispatcher.executeUntilAllGoroutinesAreBlocked()
	}

Is expected to print exactly the same sequence given the same stored event log and the random seed.

For brevity I didn't include the use of range over map and select which should be also deterministic in any goroutine that was spawned from the root function passed to ExecuteDeterministically. They still need to rely on randomization to keep the current semantic. But the random would support specifying a seed value to support deterministic replay of the code.

I'm not expert on Go implementation but having that it already does implement its own cooperative scheduler I believe such feature wouldn't require too many changes to the runtime.

@gopherbot gopherbot added this to the Proposal milestone Aug 17, 2019
@mvdan
Copy link
Member

mvdan commented Aug 17, 2019

Are you suggesting to make this opt-in? If so, how would this be configurable by the user?

@mfateev
Copy link
Author

mfateev commented Aug 17, 2019

It would require to use a special API to start and control goroutine execution event loop. It is the
dispatcher := dispatcher.ExecuteDeterministically(...)
in the above example.

@ianlancetaylor
Copy link
Contributor

Your example is not valid Go code. That aside, it sounds like you are suggesting that it be possible to execute a single function literal deterministically, while not imposing and restrictions on the rest of the program. In the general case, that seems impossible, since that function literal might run channel operations with other goroutines. In fact, your example does exactly that.

These don't seem to me to be minor details. I don't know whether this is a good idea or not, but I'm quite sure that adding it to Go will require careful specification of exactly how it should work.

@mfateev
Copy link
Author

mfateev commented Aug 17, 2019

Sorry, fixed the syntax.

I agree that from the puristic point of view the only way to guarantee the determinism is to make this code fully hermetic. But for practical purposes it would be very useful as proposed.

I'm fine to work on the careful specification if there is any chance to get it landed.

@rsc
Copy link
Contributor

rsc commented Aug 20, 2019

Closing because this is not an actionable proposal: it is a feature request without a corresponding concrete implementation to evaluate. (And we don't know how to even build a hypothetical implementation.)

That said, a few comments about the general idea.

Providing this kind of deterministic execution is not a goal of Go currently. Implementing "be able to execute deterministically" is likely to be a lot of work for a code path that very few people would use, so it would be difficult to maintain and test and often buggy. We've seen this with other little-used things, like -buildmode=shared and binary-only packages. As I understand it, deterministic execution is an ongoing research topic in general, across many languages. For example, https://dedis.cs.yale.edu/2010/det/.

Taking maps as an example, any map of pointer keys can't be obviously determinized except by sorting by pointer value, but now you have to worry about non-determinism in the selection of allocated object addresses. That's difficult too and will require changes in the runtime to completely determinize (and slow down) both the allocator and the garbage collector. And more code paths that will be barely tested and often buggy.

It seems like the right path forward is to build something outside the Go distribution (by making a copy of the Go distribution and editing to suit) and learn how invasive the changes would be and how well they may or may not work. Then, once you have a system you are happy with that seems to have minimally invasive changes to Go itself, that would be a great topic for a concrete proposal.

It seems like another approach would be to declare that user code with non-deterministic behavior is buggy and focus on finding those bugs through aggressive testing instead of completely disallowing them. We already introduce new non-determinism in -race mode (for example, randomizing the scheduler a bit more than normal) and we could add more.

The Go compiler is an example of the "aggressive testing" approach. It is a program that must be deterministic and yet it has goroutines and channels and map iteration and the like. We have made it completely deterministic by having tests that check that when you run it multiple times on the same inputs you get bit-for-bit identical output and treating any such failure as a major problem to be found. (This check happens during every make.bash as part of compiler bootstrap.)

@rsc rsc closed this as completed Aug 20, 2019
@golang golang locked and limited conversation to collaborators Aug 19, 2020
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Projects
None yet
Development

No branches or pull requests

5 participants