net/http/pprof: TestDeltaProfile failures missing mutexHog2 on ARM architectures #50218

Open
bcmills opened this issue Dec 16, 2021 · 9 comments
Labels: arch-arm arch-arm64 NeedsDecision release-blocker

Member

@bcmills bcmills commented Dec 16, 2021

--- FAIL: TestDeltaProfile (53.30s)
    pprof_test.go:206: want mutexHog2 but no mutexHog1 in the profile, and non-zero p.DurationNanos, got PeriodType: contentions count
        Period: 1
        Time: 2021-12-16 19:51:59.42219359 +1100 AEDT
        Duration: 32.041535781s
        Samples:
        contentions/count delay/nanoseconds
        Locations
        Mappings
        1: 0x0/0x0/0x0   [FN]
    pprof_test.go:214: want both mutexHog1 and mutexHog2 in the profile, got PeriodType: contentions count
        Period: 1
        Time: 2021-12-16 19:51:59.43107526 +1100 AEDT
        Samples:
        contentions/count delay/nanoseconds
                  9   17621016: 1 2 3 
                  1     149081: 1 4 3 
        Locations
             1: 0x851d3 M=1 sync.(*Mutex).Unlock /home/gopher/build/go/src/sync/mutex.go:214 s=0
             2: 0x2d249f M=1 net/http/pprof.mutexHog1 /home/gopher/build/go/src/net/http/pprof/pprof_test.go:107 s=0
             3: 0x2d2937 M=1 net/http/pprof.mutexHog.func1 /home/gopher/build/go/src/net/http/pprof/pprof_test.go:148 s=0
             4: 0x2d24ab M=1 net/http/pprof.mutexHog1 /home/gopher/build/go/src/net/http/pprof/pprof_test.go:108 s=0
        Mappings
        1: 0x0/0x0/0x0   [FN]
FAIL
FAIL	net/http/pprof	57.491s

greplogs --dashboard -md -l -e 'FAIL: TestDeltaProfile ' --since=2021-01-01

2021-12-16T00:34:10-7f23145/openbsd-arm-jsing
2021-11-25T00:02:52-b2a5a37/openbsd-arm-jsing
2021-06-04T17:33:24-831f937/openbsd-arm-jsing

(Forked from #38544, which was mitigated in CL 229498.)

CC @4a6f656c @hyangah

@bcmills bcmills added arch-arm NeedsInvestigation OS-OpenBSD labels Dec 16, 2021
@bcmills bcmills added this to the Backlog milestone Dec 16, 2021

@gopherbot gopherbot commented Dec 16, 2021

Change https://golang.org/cl/372794 mentions this issue: net/http/pprof: skip TestDeltaProfile on openbsd/arm

@bcmills bcmills changed the title from 'net/http/pprof: TestDeltaProfile failures with "got PeriodType: contentions count" on openbsd-arm-jsing' to 'net/http/pprof: TestDeltaProfile failures missing mutexHog2 on openbsd/arm' Dec 16, 2021
gopherbot pushed a commit that referenced this issue Dec 16, 2021
It is observed to be flaky on the only openbsd/arm builder.
Skipping on that platform until someone can investigate.

For #50218

Change-Id: Id3a6dc12b93b3cec67870d8d81bd608c4589c952
Reviewed-on: https://go-review.googlesource.com/c/go/+/372794
Trust: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
Reviewed-by: Hyang-Ah Hana Kim <hyangah@gmail.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Member Author

@bcmills bcmills commented Feb 7, 2022

greplogs --dashboard -md -l -e 'FAIL: TestDeltaProfile ' --since=2021-12-17

2022-02-04T22:34:05-f9763a6/android-arm64-corellium

@bcmills bcmills changed the title from 'net/http/pprof: TestDeltaProfile failures missing mutexHog2 on openbsd/arm' to 'net/http/pprof: TestDeltaProfile failures missing mutexHog2 on ARM architectures' Feb 7, 2022
Member Author

@bcmills bcmills commented Feb 7, 2022

Given the android/arm64 failure, this does not seem specific to OpenBSD.
(CC @cherrymui)

Member Author

@bcmills bcmills commented Feb 8, 2022

Another android/arm64 today:

greplogs --dashboard -md -l -e 'FAIL: TestDeltaProfile ' --since=2022-02-05

2022-02-07T21:00:02-7db75b3/android-arm64-corellium

@gopherbot gopherbot commented Feb 8, 2022

Change https://go.dev/cl/383997 mentions this issue: net/http/pprof: skip TestDeltaProfile on all arm and arm64 architectures

Member Author

@bcmills bcmills commented Feb 10, 2022

I suspect it may be possible to use something along the lines of the awaitBlockedGoroutine function from CL 384534 to eliminate the timing-dependence of this test.

(I do not plan to follow up on that myself.)

@bcmills bcmills removed this from the Backlog milestone Mar 2, 2022
@bcmills bcmills added this to the Go1.19 milestone Mar 2, 2022
Member Author

@bcmills bcmills commented Mar 2, 2022

Marking as release-blocker for Go 1.19 due to the high failure rate on the builders (CC @golang/release).

If we don't care about this test, we can merge CL 383997 (mailed ~3 weeks ago) to encode that decision as a test-skip. Otherwise, this test failure needs to be addressed.

greplogs --dashboard -md -l -e 'FAIL: TestDeltaProfile ' --since=2022-02-08

2022-03-01T20:52:30-f4722d8/android-arm-corellium
2022-02-28T21:56:43-f04d5c1/android-arm-corellium

gopherbot pushed a commit that referenced this issue Mar 8, 2022
Given that we have seen failures with the same failure mode on both
openbsd/arm and android/arm64, it seems likely that the underlying bug
affects at least all ARM-based architectures.

It appears that either these architectures are not able to sample at
the frequency expected by the test, or the samples are for some reason
being dropped.

For #50218

Change-Id: I42a6c8ecda57448f8068e8facb42a4a2cecbbb37
Reviewed-on: https://go-review.googlesource.com/c/go/+/383997
Trust: Bryan Mills <bcmills@google.com>
Run-TryBot: Bryan Mills <bcmills@google.com>
TryBot-Result: Gopher Robot <gobot@golang.org>
Reviewed-by: Cherry Mui <cherryyz@google.com>
Member

@aclements aclements commented Mar 16, 2022

@golang/runtime needs to decide whether to fix and unskip this test, or to drop the test.

@prattmic prattmic assigned prattmic and unassigned jeremyfaller Mar 22, 2022
@dmitshur dmitshur added NeedsDecision and removed NeedsInvestigation labels May 18, 2022
Contributor

@mknyszek mknyszek commented May 18, 2022

We should decide what to do here. I'll take a look at this.

@mknyszek mknyszek self-assigned this May 18, 2022