
runtime:cpu2: TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1 failures #68781

Open
gopherbot opened this issue Aug 8, 2024 · 7 comments
Labels: compiler/runtime, NeedsInvestigation
Milestone: Backlog

@gopherbot (Contributor)

#!watchflakes
default <- pkg == "runtime:cpu2" && test == "TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1"

Issue created automatically to collect these failures.

Example (log):

=== RUN   TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1
    metrics_test.go:1065: lock contention growth in runtime/pprof's view  (0.064667s)
    metrics_test.go:1066: lock contention growth in runtime/metrics' view (0.064682s)
    metrics_test.go:1104: stack [runtime.unlock runtime_test.TestRuntimeLockMetricsAndProfile.func5.1 runtime_test.(*contentionWorker).run] has samples totaling n=199 value=57946392
    metrics_test.go:1192: mutex profile reported contention count different from the known true count (199 != 200)
--- FAIL: TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1 (0.11s)


gopherbot added the NeedsInvestigation label Aug 8, 2024
@gopherbot (Contributor, Author)

Found new dashboard test flakes for:

#!watchflakes
default <- pkg == "runtime:cpu2" && test == "TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1"
2024-08-07 16:08 gotip-linux-arm go@aba16d17 runtime:cpu2.TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1 (log)
=== RUN   TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1
    metrics_test.go:1065: lock contention growth in runtime/pprof's view  (0.064667s)
    metrics_test.go:1066: lock contention growth in runtime/metrics' view (0.064682s)
    metrics_test.go:1104: stack [runtime.unlock runtime_test.TestRuntimeLockMetricsAndProfile.func5.1 runtime_test.(*contentionWorker).run] has samples totaling n=199 value=57946392
    metrics_test.go:1192: mutex profile reported contention count different from the known true count (199 != 200)
--- FAIL: TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1 (0.11s)


@mauri870 (Member) commented Aug 8, 2024

Tentatively closing this as a duplicate of #68453.

mauri870 closed this as not planned (duplicate) Aug 8, 2024
@rhysh (Contributor) commented Aug 8, 2024

@mauri870 it looks like gopherbot is splitting the reports based on "cpu2" or "cpu4". An update to the watchflakes line on the other issue would probably keep the bot from reopening this on the next "cpu2" failure. (Though, no harm in waiting until then.)

But I also intend to close both of the issues when I merge the change I described in #68453 (comment). Maybe later today.
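
For illustration, a single rule covering both the cpu2 and cpu4 variants could use the watchflakes rule language's regexp match operator. This is a sketch only; the actual rule applied on #68453 is not shown here, and the `~` operator with a backquoted pattern is assumed from the watchflakes rule syntax:

#!watchflakes
default <- pkg ~ `runtime:cpu[24]` && test == "TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1"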

@gopherbot (Contributor, Author)

Change https://go.dev/cl/604355 mentions this issue: runtime: record all sampled mutex profile events

gopherbot pushed a commit that referenced this issue Aug 14, 2024
The block and mutex profiles have slightly different behaviors when a
sampled event has a negative (or zero) duration. The block profile
enforces a minimum duration for each event of "1" in the cputicks unit.
It does so by clamping the duration to 1 if it was originally reported
as being smaller. The mutex profile for app-level contention enforces a
minimum duration of 0 in a similar way: by reporting any negative values
as 0 instead.

The mutex profile for runtime-internal contention had a different
behavior: to enforce a minimum event duration of "1" by dropping any
non-conforming samples.

Stop dropping samples, and use the same minimum (0) that's in place for
the other mutex profile events.

Fixes #64253
Fixes #68453
Fixes #68781

Change-Id: I4c5d23a2675501226eef5b9bc1ada2efc1a55b9e
Reviewed-on: https://go-review.googlesource.com/c/go/+/604355
LUCI-TryBot-Result: Go LUCI <golang-scoped@luci-project-accounts.iam.gserviceaccount.com>
Reviewed-by: Mauri de Souza Meneguzzo <mauri870@gmail.com>
Reviewed-by: Michael Pratt <mpratt@google.com>
Reviewed-by: Carlos Amedee <carlos@golang.org>
Auto-Submit: Rhys Hiltner <rhys.hiltner@gmail.com>
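
The drop-versus-clamp distinction the commit message describes can be illustrated with a small standalone sketch. The function names below are hypothetical, not the runtime's actual code; the point is how dropping sub-minimum samples produces mismatches like the "199 != 200" in the logs above, while clamping preserves the count:

package main

import "fmt"

// recordDropping models the pre-fix runtime-internal mutex profile:
// any sampled event with a duration below 1 (in cputicks) was dropped
// outright, so the profile's event count could fall short of the true
// contention count.
func recordDropping(durations []int64) (count int, total int64) {
	for _, d := range durations {
		if d < 1 {
			continue // sample dropped entirely
		}
		count++
		total += d
	}
	return
}

// recordClamping models the post-fix behavior (matching the app-level
// mutex profile): a negative duration is reported as 0, but every
// sampled event is still counted.
func recordClamping(durations []int64) (count int, total int64) {
	for _, d := range durations {
		if d < 0 {
			d = 0 // clamp instead of drop
		}
		count++
		total += d
	}
	return
}

func main() {
	// 200 contention events, one of which was measured with a
	// negative duration (cputicks can disagree across CPUs).
	durations := make([]int64, 200)
	for i := range durations {
		durations[i] = 100
	}
	durations[42] = -3

	oldCount, _ := recordDropping(durations)
	newCount, _ := recordClamping(durations)
	fmt.Printf("old=%d new=%d true=200\n", oldCount, newCount) // old=199 new=200
}

By contrast, the block profile clamps to a minimum of 1 cputick rather than 0, as the commit message notes.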
gopherbot reopened this Aug 30, 2024
@gopherbot (Contributor, Author)

Found new dashboard test flakes for:

#!watchflakes
default <- pkg == "runtime:cpu2" && test == "TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1"
2024-08-30 17:51 gotip-linux-ppc64_power10 go@4f327f27 runtime:cpu2.TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1 (log)
=== RUN   TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1
    metrics_test.go:1065: lock contention growth in runtime/pprof's view  (0.048452s)
    metrics_test.go:1066: lock contention growth in runtime/metrics' view (0.048448s)
    metrics_test.go:1104: stack [runtime.unlock runtime_test.TestRuntimeLockMetricsAndProfile.func5.1 runtime_test.(*contentionWorker).run] has samples totaling n=199 value=48078057
    metrics_test.go:1192: mutex profile reported contention count different from the known true count (199 != 200)
--- FAIL: TestRuntimeLockMetricsAndProfile/runtime.lock/sample-1 (0.05s)


@cherrymui (Member)

@rhysh could you take a look at the new failure? Is this happening again? Thanks.

cherrymui added this to the Backlog milestone Sep 4, 2024
@rhysh (Contributor) commented Sep 20, 2024

Yes, will do.
