runtime/pprof: TestCPUProfileLabel is flaky on aix-ppc64 #38316

Open
bcmills opened this issue Apr 8, 2020 · 2 comments
Labels: Builders, NeedsInvestigation, OS-AIX
Milestone: Backlog


bcmills (Member) commented Apr 8, 2020

2020-04-08T18:37:38-d5e1b7c/aix-ppc64
2020-03-17T01:24:51-7ec4adb/aix-ppc64
2019-12-20T20:12:18-b234fdb/aix-ppc64
2019-11-19T19:21:57-e0306c1/aix-ppc64
2019-09-26T17:37:02-a37f2b4/aix-ppc64

Not clear to me whether this is a builder issue, an AIX platform issue, or a timing-sensitive test.

CC @hyangah for pprof, @trex58 @Helflym for aix, @andybons @toothrot @cagedmantis @dmitshur for builders.

bcmills added the Builders, NeedsInvestigation, and OS-AIX labels Apr 8, 2020
bcmills added this to the Backlog milestone Apr 8, 2020

Helflym (Contributor) commented Apr 9, 2020

I think this is the same as #36084. The test seems to be stuck in a usleep called by sysmon, as shown by the samples taken.

184: 0x10002f304 (runtime.usleep:509) 0x10003fe33 (runtime.sysmon:4333) 0x1000373e3 (runtime.mstart1:1238) 0x1000372cb (runtime.mstart:1203)

It's not just the builder, as I was able to reproduce the same failure on my local machine.
Is it possible that a deadlock occurs between sysmon and the pprof functions?

dmitshur (Contributor) commented Nov 4, 2021

This flaky TestCPUProfileLabel failure has also happened at least once on linux/amd64:

--- FAIL: TestCPUProfileLabel (5.00s)
    pprof_test.go:514: total 4 CPU profile samples collected:
        3: 0x466db7 (runtime.madvise:539) 0x416f74 (runtime.sysUnused:111) 0x4211d3 (runtime.(*pageAlloc).scavengeRangeLocked:734) 0x420cf6 (runtime.(*pageAlloc).scavengeOne:628) 0x4205cd (runtime.(*pageAlloc).scavenge:419) 0x420504 (runtime.bgscavenge.func2:310) 0x462e88 (runtime.systemstack:477) 0x4202af (runtime.bgscavenge:298)
        
        1: 0x442f0f (runtime.runqget:5975) 0x43b430 (runtime.findrunnable:2728) 0x43cad8 (runtime.schedule:3361) 0x43d02c (runtime.park_m:3510) 0x462e02 (runtime.mcall:433)
        
    pprof_test.go:530: too few samples; got 4, want at least 125, ideally 500
    pprof_test.go:549: runtime/pprof.cpuHogger;key=value: 0
    pprof_test.go:552: no samples in expected functions
    pprof_test.go:562: runtime/pprof.cpuHogger;key=value has 0 samples out of 0, want at least 1, ideally 0

(Build log.)

From a slowbot run on CL 361294. It didn't happen on the first retry.
