testing: writeProfiles is not called after panic #65129
Comments
Change https://go.dev/cl/556255 mentions this issue: |
cc @prattmic |
I suspect that this could be fixed by calling But then the question is: how is |
@bcmills I implemented your suggestion (not at exactly the same line, but that was a mistake made in haste) and it seems to work well. PTAL and let me know if you think the
I just noticed that the fix for #65319 has fixed the problem reported in this issue for execution traces. In fact, it even fixed the problem for the case where an arbitrary goroutine panics. My approach in CL 556255 would have only worked for panics in the goroutine calling the This raises the question of whether CL 556255 should proceed as is. I see three options:
I don't think option 1 is workable. At the moment of a panic, the runtime only knows that the CPU profile could be flushed. It has no idea that the testing package would also like to take snapshots of the alloc/heap, mutex, block and goroutine profiles. Option 2 is worth considering. Execution traces are by far the most valuable data source to have when a Go program crashes, and arguably they contain almost everything one would hope to find in a CPU, block, mutex or goroutine profile. However, they carry no information about memory beyond heap size and heap target, so IMO there is still value in getting the alloc/heap profile written out when a test crashes. Based on the above, I will go with option 3. However, I would not be heartbroken if we decide to go with option 2 in order to keep the testing package simpler and because my CL doesn't work for panics in arbitrary goroutines.
In the event of an uncontrolled crash, I would expect that the most interesting case for a heap profile would be an OOM, but in case of an OOM failure I think the runtime throws instead of panicking. Similarly, a mutex/blocking profile would be most useful if the test times out, but that doesn't result in a panic in the test either. So I think in the interest of avoiding unnecessary complexity in the |
Linux uses the KILL signal to implement OOM killing. It gives the process no chance to dump any kind of diagnostics data. It's not even possible to get a core dump. It's a sad state of affairs 😞.
Funny you mention this. The original problem that caused me to file this issue was a test that was timing out. This test didn't properly clean up after itself ... causing another test in the package to panic further down the road. This was all happening in CI and not locally reproducible. Hence the need for execution traces and other diagnostics data from the runtime.
My original issue was solved using execution tracing. We modified CI to only run the test that was timing out, which allowed us to capture a trace. This workaround is no longer needed, thanks to #65319. The remaining problems here are hypothetical for me, so I don't feel strongly enough about them to lobby you into accepting more complexity in the That being said, if we close this, let's make it clear that this is only a partial duplicate of #65319. Writing out cpu/heap/mutex/goroutine profiles on test panics is still unique to this issue and doesn't overlap with #65319. |
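As a rough illustration of the remaining gap (a heap profile written when the code under test panics), here is a minimal sketch using runtime/pprof. The helpers snapshotHeapOnPanic and heapSnapshotSize are hypothetical names made up for this example; they are not part of the testing package or of CL 556255, and the sketch only covers a panic in the goroutine that calls the helper.

```go
package main

import (
	"bytes"
	"fmt"
	"io"
	"runtime/pprof"
)

// snapshotHeapOnPanic is a hypothetical helper (not part of the testing
// package). It runs fn and, if fn panics, writes a heap profile to w and
// hands the recovered panic value back to the caller.
func snapshotHeapOnPanic(w io.Writer, fn func()) (panicked any, err error) {
	defer func() {
		if r := recover(); r != nil {
			panicked = r
			// WriteHeapProfile emits the heap profile in the
			// gzip-compressed pprof format.
			err = pprof.WriteHeapProfile(w)
		}
	}()
	fn()
	return nil, nil
}

// heapSnapshotSize reports how many profile bytes were written for fn.
// It is only here to make the sketch easy to exercise.
func heapSnapshotSize(fn func()) (int, error) {
	var buf bytes.Buffer
	_, err := snapshotHeapOnPanic(&buf, fn)
	return buf.Len(), err
}

func main() {
	n, err := heapSnapshotSize(func() { panic("boom") })
	fmt.Println("profile written:", n > 0, "error:", err)
}
```

A real caller would probably re-panic after writing the profile so that the failure is still reported; the sketch returns the value instead to stay runnable.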
Go version
go1.21.4
Output of go env in your module/workspace:
What did you do?
Debug a test that hit a panic using go test -trace. Below is a greatly simplified example that reproduces the problem. Run it with go test -trace go.trace.
What did you see happen?
What did you expect to see?
A valid trace file that I can open.
Additional Thoughts
This reproduces with tip.
The problem seems to be that the after() func that is supposed to call writeProfiles doesn't get called when a test panics because tRunner() runs in a different goroutine.
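The goroutine point can be sketched in isolation: a panic unwinds only the deferred calls of the goroutine that panicked, so a cleanup hook has to be deferred inside that goroutine to run. The following is a simplified model, not the testing package's code; runWithCleanup and its parameters are hypothetical names, and the panic value is passed back over a channel instead of crashing so the example can run to completion.

```go
package main

import "fmt"

// runWithCleanup runs fn in its own goroutine, loosely mirroring how a test
// function runs separately from the code that starts it. The cleanup hook is
// deferred inside that goroutine, so it runs even when fn panics.
// All names here are hypothetical.
func runWithCleanup(fn func(), cleanup func()) (panicked any) {
	done := make(chan any, 1)
	go func() {
		defer func() {
			r := recover() // nil if fn returned normally
			cleanup()      // runs on both the normal and the panic path
			done <- r
		}()
		fn()
	}()
	return <-done
}

func main() {
	flushed := false
	r := runWithCleanup(
		func() { panic("boom") },
		func() { flushed = true },
	)
	fmt.Println(flushed, r) // prints "true boom"
}
```

Had cleanup been deferred only in the goroutine that calls runWithCleanup, an unrecovered panic in fn would have terminated the process without running it.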