
Add return stack traces #2903

Merged: merged 1 commit into moby:master from deitch:with-return-traces on Jun 13, 2022

Conversation

@deitch (Contributor) commented on Jun 10, 2022

Add an option to send the Solve() daemon-side stack traces to the client along with an error, enabled via --send-stacktraces or a config file option.

This should not be enabled in most cases. However, when running tests it can be very useful: if the error occurs in the daemon, the output logs can now show where it occurred.

Discussed with @tonistiigi here

I probably should make this a draft PR, but it gets across what I was trying to do, and it is functional. Happy to change it into some other format if that is better.

We probably should also enable it for all usage inside CI, if this approach is good.
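For illustration, a minimal sketch of what the daemon-side change described above could look like. This is hypothetical, not the PR's actual diff: the withStacktrace helper and sendStacktraces flag are placeholders, and it leans on the stack.Formatter helper from buildkit's util/stack package that comes up later in this thread.

// withStacktrace is a hypothetical helper (not the PR's actual diff): when the
// opt-in option is enabled, it folds the daemon-side stack trace into the error
// string returned to the client, at the cost of flattening the typed error into
// plain text (the trade-off discussed below).
func withStacktrace(err error, sendStacktraces bool) error {
	if err == nil || !sendStacktraces {
		return err
	}
	// stack.Formatter (github.com/moby/buildkit/util/stack) renders the full
	// trace when printed with %+v.
	return fmt.Errorf("%v\nserver stack trace:\n%+v", err, stack.Formatter(err))
}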

@tonistiigi (Member) commented

The errors are already typed with stack traces. This removes that capability and squashes everything into a string, where it loses all of its typed info.

To get all the traces in tests without using the stack package, we need to find a way to inject the stack printing into testify. Currently, the problem is that testify doesn't understand more than a local trace.

Alternatively, we could also just print the error with the full stack trace (minus the client side, of course, in this case) to the buildkitd logs when --debug or some opt-in env is set.

@deitch (Contributor, Author) commented on Jun 11, 2022

> Currently, the problem is that testify doesn't understand more than a local trace.

Yeah, exactly the problem I ran into when trying to debug a daemon-side issue.

> This removes that capability and squashes everything into a string, where it loses all of its typed info.

Agreed, it was horribly ugly, but better than nothing. It helped me solve several issues.

> Alternatively, we could also just print the error with the full stack trace (minus the client side, of course, in this case) to the buildkitd logs when --debug or some opt-in env is set.

Haha! I was walking outside a few hours ago when this much simpler, one-line solution hit me. I will change this PR to do that. I like the idea of "if debug is set and an error is returned to a client, dump a stack trace."

Ready in five mins.

@deitch (Contributor, Author) commented on Jun 11, 2022

Try this. It is materially simpler and fits with the CI flow, which already runs buildkitd with --debug.

@deitch (Contributor, Author) commented on Jun 11, 2022

I don't understand the CI failure.

@tonistiigi (Member) left a comment

stack.Traces(err) does not print with the correct formatting; it produces "[frames:<Name:\"..." and so on. stack.Formatter with %+v is needed for that.

The error itself is printed twice now. Only one error with formatted stacktrace should be printed.

Example of current output:

time="2022-06-12T20:29:04Z" level=debug msg="solver.Solve returned error process \"/bin/sh -c stop\" did not complete successfully: exit code: 127"
time="2022-06-12T20:29:04Z" level=debug msg="[frames:<Name:\"github.com/moby/buildkit/executor/runcexecutor.exitError\" File:\"/src/executor/runcexecutor/executor.go\" Line:357 > frames:<Name:\"github.com/moby/buildkit/executor/runcexecutor.(*runcExecutor).Run\" File:\"/src/executor/runcexecutor/executor.go\" Line:331 > frames:<Name:\"github.com/moby/buildkit/solver/llbsolver/ops.(*execOp).Exec\" File:\"/src/solver/llbsolver/ops/exec.go\" Line:353 > frames:<Name:\"github.com/moby/buildkit/solver.(*sharedOp).Exec.func2\" File:\"/src/solver/jobs.go\" Line:844 > frames:<Name:\"github.com/moby/buildkit/util/flightcontrol.(*call).run\" File:\"/src/util/flightcontrol/flightcontrol.go\" Line:121 > frames:<Name:\"sync.(*Once).doSlow\" File:\"/usr/local/go/src/sync/once.go\" Line:68 > frames:<Name:\"sync.(*Once).Do\" File:\"/usr/local/go/src/sync/once.go\" Line:59 > frames:<Name:\"runtime.goexit\" File:\"/usr/local/go/src/runtime/asm_arm64.s\" Line:1263 > cmdline:\"buildkitd\" cmdline:\"--debug\" pid:1 version:\"v0.10.0-161-g98c407cfd\" revision:\"98c407cfdaa1c090a9f3b226c965d1cc3047d0f3\"  frames:<Name:\"github.com/moby/buildkit/solver/llbsolver/ops.(*execOp).Exec\" File:\"/src/solver/llbsolver/ops/exec.go\" Line:373 > frames:<Name:\"github.com/moby/buildkit/solver.(*sharedOp).Exec.func2\" File:\"/src/solver/jobs.go\" Line:844 > frames:<Name:\"github.com/moby/buildkit/util/flightcontrol.(*call).run\" File:\"/src/util/flightcontrol/flightcontrol.go\" Line:121 > frames:<Name:\"sync.(*Once).doSlow\" File:\"/usr/local/go/src/sync/once.go\" Line:68 > frames:<Name:\"sync.(*Once).Do\" File:\"/usr/local/go/src/sync/once.go\" Line:59 > frames:<Name:\"runtime.goexit\" File:\"/usr/local/go/src/runtime/asm_arm64.s\" Line:1263 > cmdline:\"buildkitd\" cmdline:\"--debug\" pid:1 version:\"v0.10.0-161-g98c407cfd\" revision:\"98c407cfdaa1c090a9f3b226c965d1cc3047d0f3\"  frames:<Name:\"github.com/moby/buildkit/solver.(*edge).execOp\" File:\"/src/solver/edge.go\" Line:904 > frames:<Name:\"github.com/moby/buildkit/solver/internal/pipe.NewWithFunction.func2\" File:\"/src/solver/internal/pipe/pipe.go\" Line:82 > frames:<Name:\"runtime.goexit\" File:\"/usr/local/go/src/runtime/asm_arm64.s\" Line:1263 > cmdline:\"buildkitd\" cmdline:\"--debug\" pid:1 version:\"v0.10.0-161-g98c407cfd\" revision:\"98c407cfdaa1c090a9f3b226c965d1cc3047d0f3\" ]"
time="2022-06-12T20:29:04Z" level=error msg="/moby.buildkit.v1.Control/Solve returned error: rpc error: code = Unknown desc = process \"/bin/sh -c stop\" did not complete successfully: exit code: 127\n"

@deitch (Contributor, Author) commented on Jun 12, 2022

Like this, @tonistiigi? Take a look at the update.

@tonistiigi (Member) commented

The printout in main.go already uses stack.Formatter, but it looks like it doesn't work because, inside that interceptor, the error has already been converted to the gRPC error that gets sent to the client side.

Also, printing stacks with logrus looks really ugly because all the lines get squashed together, etc. I would suggest this:

diff --cc cmd/buildkitd/main.go
index 56f4998ce,56f4998ce..0ddab23a6
--- a/cmd/buildkitd/main.go
+++ b/cmd/buildkitd/main.go
@@@ -565,7 -565,7 +565,10 @@@ func unaryInterceptor(globalCtx context
  
  		resp, err = withTrace(ctx, req, info, handler)
  		if err != nil {
--			logrus.Errorf("%s returned error: %+v", info.FullMethod, stack.Formatter(err))
++			logrus.Errorf("%s returned error: %v", info.FullMethod, err)
++			if logrus.GetLevel() >= logrus.DebugLevel {
++				fmt.Fprintf(os.Stderr, "%+v", stack.Formatter(grpcerrors.FromGRPC(err)))
++			}
  		}
  		return
  	}
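Presumably, grpcerrors.FromGRPC(err) here converts the outgoing gRPC status error back into a typed error that still carries the stack traces, and writing the %+v output of stack.Formatter straight to stderr keeps the trace on multiple readable lines rather than squashing it into a single logrus field.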

@deitch (Contributor, Author) commented on Jun 13, 2022

> I would suggest this:

Does that do what we are looking for? I see the logic of placing it in control.go, as that is in Solve(), which tells us what the stack was when the error occurred. The formatting may be off, and we should fix that, but it does give the desired result.

I don't know what the unaryInterceptor() you referenced is.

@tonistiigi (Member) commented

> I don't know what the unaryInterceptor() you referenced is.

It's middleware that captures all gRPC requests, and it is where the error logging lives at the moment.
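For context, a minimal self-contained sketch of a gRPC unary server interceptor (illustrative only, not buildkitd's actual cmd/buildkitd/main.go code): the interceptor wraps every unary RPC handler on the server, which is why it is a natural single place to log the errors returned by any method.

package main

import (
	"context"
	"net"

	"github.com/sirupsen/logrus"
	"google.golang.org/grpc"
)

// loggingInterceptor runs around every unary RPC: it invokes the real handler,
// then logs any error that handler returned, tagged with the full method name.
func loggingInterceptor(ctx context.Context, req interface{}, info *grpc.UnaryServerInfo, handler grpc.UnaryHandler) (interface{}, error) {
	resp, err := handler(ctx, req)
	if err != nil {
		logrus.Errorf("%s returned error: %v", info.FullMethod, err)
	}
	return resp, err
}

func main() {
	// Registered once, the interceptor then sees every unary request and response.
	srv := grpc.NewServer(grpc.UnaryInterceptor(loggingInterceptor))
	l, err := net.Listen("tcp", "127.0.0.1:0")
	if err != nil {
		logrus.Fatal(err)
	}
	// Register service implementations here, then serve as usual.
	_ = srv.Serve(l)
}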

@deitch (Contributor, Author) commented on Jun 13, 2022

Well, if you say it will work there, I will be happy.

@tonistiigi (Member) commented

Check if it prints the expected result for you and solves the issue. It should.

Signed-off-by: Avi Deitcher <avi@deitcher.net>
@deitch (Contributor, Author) commented on Jun 13, 2022

Updated per your suggestion

@tonistiigi merged commit b521369 into moby:master on Jun 13, 2022
@deitch deleted the with-return-traces branch on June 14, 2022, 06:49