
runtime,sync/atomic: deadlocks in sync/atomic_test.TestValueSwapConcurrent #45975

Closed
bcmills opened this issue May 5, 2021 · 11 comments

@bcmills bcmills commented May 5, 2021

2021-05-05T15:54:39-69368ce/linux-386-longtest
2021-05-04T22:12:31-6a6aa32/linux-386-longtest

Also seen in a TryBot:
https://storage.googleapis.com/go-build-log/9041f75f/linux-amd64-longtest_50b82ee1.log

I'm not going to paste the logs here because they're huge.

The deadlocked test was added in CL 241678, but given the gcMarkDone stacks I'm guessing that this is triggering an existing bug in the runtime rather than a bug in sync/atomic or the test itself. Still, it may be worth considering whether the test can be written in a way that is less allocation-intensive. 😅

CC @arnottcr @randall77 @prattmic @mknyszek @aclements


@bcmills bcmills commented May 5, 2021

(It is not clear to me whether this issue is related to #45884 and/or #45885 and/or #45916. They might all be symptoms of the same bug.)



@arnottcr arnottcr commented May 5, 2021

I don't have enough profiling and optimisation experience to just eyeball the allocations. I assume this is about the two TestXxxConcurrent methods? Would introducing pointers, or storing all the numbers on the stack before iterating, help?



@arnottcr arnottcr commented May 5, 2021

I forgot about -gcflags=-m; can you confirm the efficacy of:

  • O(1): define the anonymous func outside the for loop, so that we have one closure that accepts i uint64 as a parameter
  • O(m): new must escape to the heap because another goroutine may pick it up
  • O(1): ignoring heap escape outside the for loops
  • also, I will clean up the duplicate imports of both "sync/atomic" and . "sync/atomic"



@randall77 randall77 commented May 10, 2021

I think preallocating all the interface{} that hold numbers might reduce the allocation profile a lot.

I don't think defining the func outside the loop would help much - we need to allocate a goroutine per iteration in any case.



@prattmic prattmic commented May 10, 2021

@dr2chase got a similar-looking failure in a benchmark, for which we have a core file. So far, we've found:

  • gcMarkDone is calling forEachP, which is stuck waiting for all Ps to run the function.
  • One P hasn't run the function (i.e., runSafePointFn == 1).
  • This P is in _Pidle, so it should have been handled by forEachP (if already idle), or on transition to idle.
  • This P is in sched.pidle by the time we got the core.

I'm still investigating further.



@gopherbot gopherbot commented May 10, 2021

Change https://golang.org/cl/318529 mentions this issue: DO NOT SUBMIT: trace forEachP, P transitions



@gopherbot gopherbot commented May 10, 2021

Change https://golang.org/cl/318569 mentions this issue: runtime: hold sched.lock across atomic pidleget/pidleput



@arnottcr arnottcr commented May 11, 2021

> I think preallocating all the interface{} that hold numbers might reduce the allocation profile a lot.

You thinking of something like this?

diff --git a/src/sync/atomic/value_test.go b/src/sync/atomic/value_test.go
index a5e717d6e0..f474f3f8b0 100644
--- a/src/sync/atomic/value_test.go
+++ b/src/sync/atomic/value_test.go
@@ -188,9 +188,11 @@ func TestValueSwapConcurrent(t *testing.T) {
                i := i
                g.Add(1)
                go func() {
+                       var box interface{}
                        var c uint64
                        for new := i; new < i+n; new++ {
-                               if old := v.Swap(new); old != nil {
+                               box = new
+                               if old := v.Swap(box); old != nil {
                                        c += old.(uint64)
                                }
                        }

Does not really seem to trace any better:

// Package main attempts to recreate atomic_test.TestValueSwapConcurrent in go1.16.
package main

import (
	"log"
	"sync"
	. "sync/atomic"
)

func main() {
	var x sync.Mutex
	var v Value
	var count uint64
	var g sync.WaitGroup
	var m, n uint64 = 10000, 10000
	m = 1000
	n = 1000
	for i := uint64(0); i < m*n; i += n {
		i := i
		g.Add(1)
		var box interface{}
		go func() {
			var c uint64
			for new := i; new < i+n; new++ {
				box = new
				x.Lock()
				old := v.Load()
				v.Store(box)
				x.Unlock()
				if old != nil {
					c += old.(uint64)
				}
			}
			AddUint64(&count, c)
			g.Done()
		}()
	}
	g.Wait()
	want, got := (m*n-1)*(m*n)/2, count+v.Load().(uint64)
	log.Printf("sum from 0 to %d was %d, want %v\n", m*n-1, got, want)
}
./main.go:38:44: inlining call to atomic.(*Value).Load
./main.go:25:11: inlining call to sync.(*Mutex).Lock
./main.go:26:18: inlining call to atomic.(*Value).Load
./main.go:28:13: inlining call to sync.(*Mutex).Unlock
./main.go:34:10: inlining call to sync.(*WaitGroup).Done
./main.go:10:6: moved to heap: x
./main.go:11:6: moved to heap: v
./main.go:12:6: moved to heap: count
./main.go:13:6: moved to heap: g
./main.go:14:9: moved to heap: n
./main.go:20:7: moved to heap: box
./main.go:21:6: func literal escapes to heap
./main.go:39:12: ... argument does not escape
./main.go:39:54: m * n - 1 escapes to heap
./main.go:39:54: got escapes to heap
./main.go:39:54: want escapes to heap
./main.go:24:9: new escapes to heap



@randall77 randall77 commented May 11, 2021

I was thinking for the CompareAndSwapConcurrent test, we probably convert j and j+1 to interface{} lots of times before the CompareAndSwap succeeds. We can do those conversions only once.
(The performance difference between making one interface{} for every j at the start of the test (a [m*n]interface{} slice) and making j and j+1 only once per successful CompareAndSwap might matter, since one will succeed with pointer equality while the other needs a call into the runtime to do the comparison, but I think either way should be good enough.)

I agree for just Swap it probably doesn't matter.



@arnottcr arnottcr commented May 11, 2021

So, every time through the inner for loop we make two more heap values, even if we are just spinning? In that case, this should allocate only what we absolutely need:

diff --git a/src/sync/atomic/value_test.go b/src/sync/atomic/value_test.go
index a5e717d6e0..5f1eea7d1c 100644
--- a/src/sync/atomic/value_test.go
+++ b/src/sync/atomic/value_test.go
@@ -255,12 +257,16 @@ func TestValueCompareAndSwapConcurrent(t *testing.T) {
                m = 100
                n = 100
        }
+       vec := make([]interface{}, m*n+1)
+       for i := range vec {
+               vec[i] = i
+       }
        for i := 0; i < m; i++ {
                i := i
                w.Add(1)
                go func() {
                        for j := i; j < m*n; runtime.Gosched() {
-                               if v.CompareAndSwap(j, j+1) {
+                               if v.CompareAndSwap(vec[j], vec[j+1]) {
                                        j += m
                                }
                        }

Otherwise I think we'd need some deeper loop-in-loop nesting to do the "making j and j+1 only once per successful CompareAndSwap" variant; but yeah, I see how this can absolutely suck resources if you are constructing new interface values on the heap on every iteration.


@gopherbot gopherbot closed this in 2520e72 May 11, 2021

@randall77 randall77 commented May 11, 2021

@arnottcr Yes, that's exactly the change I was contemplating.

