Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

runtime/pprof: TestMorestack failures aix-ppc64 builder #38567

Open
bcmills opened this issue Apr 21, 2020 · 4 comments
Open

runtime/pprof: TestMorestack failures aix-ppc64 builder #38567

bcmills opened this issue Apr 21, 2020 · 4 comments

Comments

@bcmills
Copy link
Member

@bcmills bcmills commented Apr 21, 2020

2020-04-21T11:41:40-664d270/aix-ppc64
2020-04-21T08:08:49-17fbc81/aix-ppc64
2020-04-20T22:42:49-2edd351/aix-ppc64

This failure mode started very recently, so it could perhaps be due to CL 228886 (CC @josharian @hyangah). Tentatively marking as release-blocker because this appears to be a regression in 1.15.

#38316 may be related (CC @trex58, @Helflym).

--- FAIL: TestMorestack (6.42s)
    pprof_test.go:394: total 6 CPU profile samples collected:
        1: 0x100032320 (runtime.syscall1:221) 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x1000416b3 (runtime.goexit0:2942) 0x10006b3a7 (runtime.mcall:202)
        
        3: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
        1: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x1000416b3 (runtime.goexit0:2942) 0x10006b3a7 (runtime.mcall:202)
        
        1: 0x10006e450 (asmsyscall6:52) 0x10006d9a7 (runtime.asmcgocall:633)
        
    pprof_test.go:429: runtime.newstack,runtime/pprof.growstack: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime.newstack,runtime/pprof.growstack has 0 samples out of 0, want at least 1, ideally 0
    pprof_test.go:323: retrying with 200ms duration
    pprof_test.go:394: total 11 CPU profile samples collected:
        11: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
    pprof_test.go:429: runtime.newstack,runtime/pprof.growstack: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime.newstack,runtime/pprof.growstack has 0 samples out of 0, want at least 1, ideally 0
    pprof_test.go:323: retrying with 400ms duration
    pprof_test.go:394: total 23 CPU profile samples collected:
        23: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
    pprof_test.go:429: runtime.newstack,runtime/pprof.growstack: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime.newstack,runtime/pprof.growstack has 0 samples out of 0, want at least 1, ideally 0
    pprof_test.go:323: retrying with 800ms duration
    pprof_test.go:394: total 46 CPU profile samples collected:
        46: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
    pprof_test.go:429: runtime.newstack,runtime/pprof.growstack: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime.newstack,runtime/pprof.growstack has 0 samples out of 0, want at least 1, ideally 0
    pprof_test.go:323: retrying with 1.6s duration
    pprof_test.go:394: total 92 CPU profile samples collected:
        91: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
        1: 0x100032ca3 (runtime.usleep:535) 0x10004557f (runtime.sysmon:4561) 0x10003c173 (runtime.mstart1:1150) 0x10003c05b (runtime.mstart:1115)
        
    pprof_test.go:429: runtime.newstack,runtime/pprof.growstack: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime.newstack,runtime/pprof.growstack has 0 samples out of 0, want at least 1, ideally 0
    pprof_test.go:323: retrying with 3.2s duration
    pprof_test.go:394: total 184 CPU profile samples collected:
        184: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
    pprof_test.go:429: runtime.newstack,runtime/pprof.growstack: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime.newstack,runtime/pprof.growstack has 0 samples out of 0, want at least 1, ideally 0
--- FAIL: TestCPUProfileLabel (6.42s)
    pprof_test.go:394: total 5 CPU profile samples collected:
        5: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
    pprof_test.go:429: runtime/pprof.cpuHogger;key=value: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime/pprof.cpuHogger;key=value has 0 samples out of 0, want at least 1, ideally 0
    pprof_test.go:323: retrying with 200ms duration
    pprof_test.go:394: total 11 CPU profile samples collected:
        11: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
    pprof_test.go:429: runtime/pprof.cpuHogger;key=value: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime/pprof.cpuHogger;key=value has 0 samples out of 0, want at least 1, ideally 0
    pprof_test.go:323: retrying with 400ms duration
    pprof_test.go:394: total 22 CPU profile samples collected:
        22: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
    pprof_test.go:429: runtime/pprof.cpuHogger;key=value: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime/pprof.cpuHogger;key=value has 0 samples out of 0, want at least 1, ideally 0
    pprof_test.go:323: retrying with 800ms duration
    pprof_test.go:394: total 46 CPU profile samples collected:
        46: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
    pprof_test.go:429: runtime/pprof.cpuHogger;key=value: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime/pprof.cpuHogger;key=value has 0 samples out of 0, want at least 1, ideally 0
    pprof_test.go:323: retrying with 1.6s duration
    pprof_test.go:394: total 90 CPU profile samples collected:
        90: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
    pprof_test.go:429: runtime/pprof.cpuHogger;key=value: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime/pprof.cpuHogger;key=value has 0 samples out of 0, want at least 1, ideally 0
    pprof_test.go:323: retrying with 3.2s duration
    pprof_test.go:394: total 185 CPU profile samples collected:
        185: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
        
    pprof_test.go:429: runtime/pprof.cpuHogger;key=value: 0
    pprof_test.go:432: no samples in expected functions
    pprof_test.go:442: runtime/pprof.cpuHogger;key=value has 0 samples out of 0, want at least 1, ideally 0
FAIL
FAIL	runtime/pprof	21.230s
@josharian
Copy link
Contributor

@josharian josharian commented Apr 21, 2020

Are atomic memory operations implemented using syscalls on aix/ppc64?

Almost every single sample collected has the same stack trace. It’s sitting in a semaphore syscall deep in the scheduler.

I don’t really see how the pprof CL could have caused that, but I’ll revert it to see experimentally whether it fixes it.

pprof_test.go:394: total 184 CPU profile samples collected:
        184: 0x10003343b (runtime.sem_wait:574 runtime.semasleep:75) 0x10000ad0b (runtime.notesleep:181) 0x10003da5f (runtime.stopm:1881) 0x10003f493 (runtime.findrunnable:2413) 0x100040513 (runtime.schedule:2613) 0x100040b57 (runtime.park_m:2783) 0x10006b3a7 (runtime.mcall:202)
@josharian
Copy link
Contributor

@josharian josharian commented Apr 21, 2020

I wonder whether we have somehow ended up with the profiling signals synchronizing with something else in the runtime, so that instead of getting a good random sample we are instead always sampling the scheduler.

@josharian
Copy link
Contributor

@josharian josharian commented Apr 21, 2020

https://golang.org/cl/228886 has been reverted. Let’s see whether that fixes the builders...

@Helflym
Copy link
Contributor

@Helflym Helflym commented Apr 22, 2020

Are atomic memory operations implemented using syscalls on aix/ppc64?

If you're speaking of sysMap and similar functions, yes there are using syscalls. However, these syscalls are using asmcgocall directly without a lock, so I don't think there is any semaphores involved in.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked pull requests

Successfully merging a pull request may close this issue.

None yet
4 participants
You can’t perform that action at this time.