Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Race condition in Get() #3

Closed
ssmccoy opened this issue Feb 15, 2017 · 0 comments
Closed

Race condition in Get() #3

ssmccoy opened this issue Feb 15, 2017 · 0 comments

Comments

@ssmccoy
Copy link
Contributor

ssmccoy commented Feb 15, 2017

Should Get() be called by a goroutine which yields execution between checking promise.state and calling promise.waitGroup.Wait(), it will hang as the waitgroup will never be woken up.

This issue arises approximately 25% of the time in the following example.

	waiterChan := make(chan Thenable)

	waiterPromises := make([]Thenable, 0, WAITERS)

	// Create 100 waiters...
	for i := 0; i < WAITERS; i++ {
		go (func() {
			waiterChan <- promise.Then(func(value interface{}) interface{} {
				atomic.AddUint64(&counter, 1)

				return nil
			})
		})()
	}

	go (func() {
		promise.Complete(true)
	})()

	// Wait for all of the promises created to be gathered over the channel.
	// It's important to only mutate the slice in a single goroutine, so that
	// happens here.
	for i := 0; i < WAITERS; i++ {
		waiterPromises = append(waiterPromises, <-waiterChan)
	}

	// The promises will now be in varying states of completion, observation
	// has shown ¼th of the time the final promise will be being completed
	// while this Get() method is running, causing a race condition whereby
	// Get() never returned.
	_, err := All(waiterPromises...).Get()
ssmccoy added a commit to ssmccoy/promise that referenced this issue Feb 15, 2017
Occasionally, while one process executing on one CPU was calling
`Complete()` and another was calling `Get()`, the one calling `Get()`
would observe the value of the `state` as `PENDING`.

During this time, another process on another CPU would invoke
`Complete()` which would change the state and update the waitgroup as
`Done()`. After this occurred, the earlier goroutine would continue
and invoke `Wait()` on the waitgroup — whose state would thereafter
never change.

This has been corrected, by a two phase approach. First, all access to
the state has been moved to the `atomic` package so it is delegated to
the `CMPXCHG` and `CMOVE` instructions — ensuring that the status of
the state was stable during the examination. Second, the waitgroup has
been replaced with a condition variable which uses the mutex central
to the promise. Callers of the `Get()` method now lock the promise
before entering the wait state (which unlocks the promise), ensuring
the promise MUST be in a stable state in order to wait on it.

Resolves: quantcast#3
ssmccoy added a commit to ssmccoy/promise that referenced this issue Feb 15, 2017
Occasionally, while one process executing on one CPU was calling
`Complete()` and another was calling `Get()`, the one calling `Get()`
would observe the value of the `state` as `PENDING`.

During this time, another process on another CPU would invoke
`Complete()` which would change the state and update the waitgroup as
`Done()`. After this occurred, the earlier goroutine would continue
and invoke `Wait()` on the waitgroup — whose state would thereafter
never change.

This has been corrected, by a two phase approach. First, all access to
the state has been moved to the `atomic` package so it is delegated to
the `CMPXCHG` and `CMOVE` instructions — ensuring that the status of
the state was stable during the examination. Second, the waitgroup has
been replaced with a condition variable which uses the mutex central
to the promise. Callers of the `Get()` method now lock the promise
before entering the wait state (which unlocks the promise), ensuring
the promise MUST be in a stable state in order to wait on it.

Resolves: quantcast#3
@ssmccoy ssmccoy mentioned this issue Feb 15, 2017
ssmccoy added a commit to ssmccoy/promise that referenced this issue Feb 15, 2017
Occasionally, while one process executing on one CPU was calling
`Complete()` and another was calling `Get()`, the one calling `Get()`
would observe the value of the `state` as `PENDING`.

During this time, another process on another CPU would invoke
`Complete()` which would change the state and update the waitgroup as
`Done()`. After this occurred, the earlier goroutine would continue
and invoke `Wait()` on the waitgroup — whose state would thereafter
never change.

This has been corrected, by a two phase approach. First, all access to
the state has been moved to the `atomic` package so it is delegated to
the `CMPXCHG` and `CMOVE` instructions — ensuring that the status of
the state was stable during the examination. Second, the waitgroup has
been replaced with a condition variable which uses the mutex central
to the promise. Callers of the `Get()` method now lock the promise
before entering the wait state (which unlocks the promise), ensuring
the promise MUST be in a stable state in order to wait on it.

Resolves: quantcast#3
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

1 participant