runtime: TestRuntimeLockMetricsAndProfile failures #64253
Found new dashboard test flakes for:
2023-11-17 23:04 linux-ppc64-sid-power10 go@0b31a46f runtime.TestRuntimeLockMetricsAndProfile (log)
2023-11-17 23:04 windows-386-2016 go@0b31a46f runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-11-17 23:16 windows-amd64-longtest go@f664031b runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-11-17 23:16 linux-ppc64-sid-power10 go@f664031b runtime.TestRuntimeLockMetricsAndProfile (log)
2023-11-17 23:34 linux-ppc64-sid-power10 go@631a6c2a runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-11-17 23:15 windows-arm64-11 go@f67b2d8f runtime.TestRuntimeLockMetricsAndProfile (log)
2023-11-17 23:16 windows-arm64-11 go@f664031b runtime.TestRuntimeLockMetricsAndProfile (log)
2023-11-17 23:34 windows-arm64-11 go@631a6c2a runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-11-19 02:15 windows-amd64-2016 go@aa9dd500 runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-11-19 07:31 linux-ppc64-sid-power10 go@d67ac938 runtime.TestRuntimeLockMetricsAndProfile (log)
2023-11-19 07:31 windows-arm64-11 go@d67ac938 runtime.TestRuntimeLockMetricsAndProfile (log)
2023-11-19 15:24 linux-ppc64-sid-power10 go@1c15291f runtime.TestRuntimeLockMetricsAndProfile (log)
2023-11-19 15:24 windows-arm64-11 go@1c15291f runtime.TestRuntimeLockMetricsAndProfile (log)
2023-11-19 17:06 linux-ppc64-sid-power10 go@2551fffd runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-11-19 21:11 windows-arm64-11 go@06145fe0 runtime.TestRuntimeLockMetricsAndProfile (log)
2023-11-19 22:05 windows-arm64-11 go@63828938 runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-11-21 21:30 linux-ppc64le-buildlet go@41f58b22 runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-11-21 21:27 linux-386-longtest go@48a6362d runtime.TestRuntimeLockMetricsAndProfile (log)
|
Change https://go.dev/cl/544375 mentions this issue: |
Most of the failures are from before https://go.dev/cl/544195. Following that, I've seen four failures (via fetchlogs/greplogs). https://build.golang.org/log/c733e91b4693774eb42501bbe93a8ee071ef312e addressed by https://go.dev/cl/544375 |
Found new dashboard test flakes for:
2023-11-21 21:29 linux-ppc64-sid-buildlet go@90ba4452 runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-11-21 21:29 aix-ppc64 go@71052169 runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-11-22 18:12 linux-arm64-longtest go@67600298 runtime.TestRuntimeLockMetricsAndProfile (log)
|
That's two more ppc64 failures from a clock mismatch (not sure what's up with ppc64's clocks), and one more of the type that https://go.dev/cl/544375 addresses. |
Found new dashboard test flakes for:
2023-11-27 17:23 solaris-amd64-oraclerel go@e158cb21 runtime.TestRuntimeLockMetricsAndProfile (log)
|
Most contention on the runtime locks inside semaphores is observed in runtime.semrelease1, but it can also appear in runtime.semacquire1. When examining contention profiles in TestRuntimeLockMetricsAndProfile, allow call stacks that include either.

For #64253

Change-Id: Id4f16af5e9a28615ab5032a3197e8df90f7e382f
Reviewed-on: https://go-review.googlesource.com/c/go/+/544375
Reviewed-by: Dmitri Shuralyov <dmitshur@google.com>
Auto-Submit: Rhys Hiltner <rhys@justin.tv>
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Michael Knyszek <mknyszek@google.com>
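For readers following along, here is a minimal sketch of the idea in that commit message: accept a contention call stack if it contains either of two functions. This is not the code from the CL. The helper name `stackContainsAny` and the sample stacks are made up for illustration, and the real test compares full call stacks from the mutex profile rather than bare strings.

```go
package main

import "fmt"

// stackContainsAny reports whether any frame in stack matches one of the
// accepted function names. Hypothetical helper, for illustration only.
func stackContainsAny(stack []string, accepted ...string) bool {
	for _, frame := range stack {
		for _, name := range accepted {
			if frame == name {
				return true
			}
		}
	}
	return false
}

func main() {
	// The two functions named in the commit message.
	accepted := []string{"runtime.semrelease1", "runtime.semacquire1"}

	// Illustrative stacks; the frame lists are assumptions, not real
	// profile output.
	stacks := [][]string{
		{"runtime.unlock", "runtime.semrelease1", "main.worker"},
		{"runtime.unlock", "runtime.semacquire1", "main.worker"},
		{"runtime.unlock", "main.worker"},
	}
	for _, s := range stacks {
		fmt.Printf("%v accepted=%t\n", s, stackContainsAny(s, accepted...))
	}
}
```

With a check shaped like this, a profile sample that lands in runtime.semacquire1 no longer counts as an unexpected stack.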
Note that there are some additional timeouts at #55308 (comment). I'm not sure how to update the watchflakes rule to match these timeouts. |
From that issue comment,

Previously, the test involved three threads and required one of them to hold the semaphore's lock, one to contend for it, and the third to notice that contention and mark the test as complete. My interpretation of the

In https://go.dev/cl/544195, I relaxed that requirement: now the test can finish early if one of the threads notices the other two have created the necessary contention, otherwise it'll give up after 10,000,000 iterations. (My darwin/arm64 system typically needed 500,000–4,000,000 iterations during the

I continue to struggle with the balance between wanting thorough tests for this part of the runtime (involving concurrency and clocks), and the need for 0% flakes. Maybe the performance dashboard sets an example there, including an opportunity for more output than |
Found new dashboard test flakes for:
2023-11-29 19:13 freebsd-arm64-dmgk go@636c6e35 runtime.TestRuntimeLockMetricsAndProfile (log)
|
This failure appeared on a first-class port on https://go.dev/cl/546635. @bcmills was that CL based on a commit before https://go.dev/cl/544375 landed by any chance? EDIT: Yes, it was. Here's the failure: https://ci.chromium.org/ui/p/golang/builders/try-workers/gotip-linux-arm64-test_only/b8762881407384401921/overview |
From https://ci.chromium.org/ui/p/golang/builders/try-workers/gotip-linux-arm64-test_only/b8762881407384401921/overview, I get https://logs.chromium.org/logs/golang/buildbucket/cr-buildbucket/8762881407384401921/+/u/step/11/log/2, which shows two "metrics_test.go:1103: want stack" lines, indicating that it includes https://go.dev/cl/544375.

CL 546635 at PS 4 is https://go.googlesource.com/go/+/30e6fc629529abf8da4528f4fdbb5a78363624fb, with parent https://go.googlesource.com/go/+/2e6387cbec924dbd01007421d7442125037c66b2 .

I don't see a line in the failing build log matching https://go.googlesource.com/go/+/2e6387cbec924dbd01007421d7442125037c66b2/src/runtime/metrics_test.go#1271 , which means the test ran for 10,000,000 iterations without the test code noticing contention on the semaphore lock. The mutex profile result shows that the runtime itself didn't encounter contention either. (In #55160, the contention was present but the test had been unable to notice it and so would run forever.)

The "TestRuntimeLockMetricsAndProfile/runtime.lock" test verifies that runtime-internal lock contention can be reported with the correct count, magnitude, and call stack.

The role of the "TestRuntimeLockMetricsAndProfile/runtime.semrelease" test is to check that the call stack ends at a particular depth, mostly so we can notice when that changes (so we can update the skip parameter, for example). It's proven trickier to test, since the lock itself isn't under the test's control.

Do you have advice on how to test semacquire/semrelease, or is the best option to
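As background on the "metrics" half of the test name, here is a small sketch of reading the cumulative mutex wait time through the public runtime/metrics package before and after some deliberately contended work. It is only an illustration and is much simpler than the test itself: the helper `readMutexWait` and the workload are assumptions, and it uses a sync.Mutex that the program controls rather than the runtime-internal semaphore lock discussed above.

```go
package main

import (
	"fmt"
	"runtime/metrics"
	"sync"
)

// Name of the cumulative mutex wait metric exposed by runtime/metrics.
const mutexWaitMetric = "/sync/mutex/wait/total:seconds"

// readMutexWait returns the current value of the mutex wait metric, or
// false if this Go version does not support it. Hypothetical helper.
func readMutexWait() (float64, bool) {
	s := []metrics.Sample{{Name: mutexWaitMetric}}
	metrics.Read(s)
	if s[0].Value.Kind() != metrics.KindFloat64 {
		return 0, false
	}
	return s[0].Value.Float64(), true
}

func main() {
	before, ok := readMutexWait()
	if !ok {
		fmt.Println("metric not supported by this Go version")
		return
	}

	// Several goroutines fight over one mutex to create contention.
	var mu sync.Mutex
	var wg sync.WaitGroup
	counter := 0
	for g := 0; g < 4; g++ {
		wg.Add(1)
		go func() {
			defer wg.Done()
			for i := 0; i < 100000; i++ {
				mu.Lock()
				counter++
				mu.Unlock()
			}
		}()
	}
	wg.Wait()

	after, _ := readMutexWait()
	fmt.Printf("counter=%d, mutex wait grew by %.6fs\n", counter, after-before)
}
```

How much the metric grows depends on scheduling and on the machine, which is the same source of variability that makes the test in this issue hard to keep at 0% flakes.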
Found new dashboard test flakes for:
2023-12-01 21:50 linux-ppc64-sid-buildlet go@3220bbe1 runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-12-06 17:29 linux-386-buster go@c80bd631 runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-12-08 20:35 android-386-emu go@9869a0ce runtime.TestRuntimeLockMetricsAndProfile (log)
|
Found new dashboard test flakes for:
2023-12-08 03:28 aix-ppc64 go@78b42a53 runtime.TestRuntimeLockMetricsAndProfile (log)
2023-12-08 18:34 aix-ppc64 go@6cdf2cca runtime.TestRuntimeLockMetricsAndProfile (log)
|
Issue created automatically to collect these failures.
Example (log):
— watchflakes