Disable full call stacks from reconcile errors in logs #529
Comments
Example call stack that doesn't provide any additional value:
I am having difficulty reproducing this issue. AFAIU, for controllers on the stack side, the fail function that is implemented does not return the error to controller-runtime; only the resulting error (if any) of Status().Update() is returned. I will try to disable/break connectivity to the api-server now.
@turkenh the failure to reproduce this call stack issue with a cloud provider stack is likely related to #849. The logging framework used in Crossplane will emit these call stacks only when in debug logging mode I believe. Main crossplane's helm chart by default passes the Ideally:
Thanks @jbw976. Now I can reproduce it by putting a breakpoint in the sync function of the resource and modifying the object via kubectl (i.e. adding a label) while it is waiting at the breakpoint, which causes I am now trying to figure out a way to disable them.
I think an easy way to reproduce this is to simply return an error from a This is the code within the controller-runtime that sets up the logger with "development" settings, which means printing the call stack: Looking at the v0.2 controller-runtime branch, it looks like this code has changed a bit since the revision we are using (at least in It looks like with that latest version, you can now specify the log level at which stack traces are emitted: https://github.com/kubernetes-sigs/controller-runtime/blob/release-0.2/pkg/log/zap/zap.go#L104 Hopefully we can just use that now instead of having to fork and make other changes ourselves. This is the context around these new logging changes, which may be exactly what we were looking for.
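To illustrate the idea of a stack trace level, here is a minimal standard-library sketch of a logger that attaches a stack trace only to entries at or above a configurable threshold. The `Level`, `Logger`, and `Format` names are hypothetical and this is not the controller-runtime or zap API; zap exposes a comparable knob through its `AddStacktrace` option, which the linked code appears to rely on.

```go
package main

import (
	"fmt"
	"runtime/debug"
)

// Level is a hypothetical log severity, ordered from least to most severe.
type Level int

const (
	Debug Level = iota
	Info
	Warn
	Error
	Panic
)

// Logger is a hypothetical minimal logger. It appends a stack trace only to
// entries whose level is at or above StacktraceLevel.
type Logger struct {
	StacktraceLevel Level
}

// Format renders an entry and reports whether a stack trace was attached.
func (l Logger) Format(lvl Level, msg string) (string, bool) {
	if lvl < l.StacktraceLevel {
		return msg, false
	}
	return fmt.Sprintf("%s\n%s", msg, debug.Stack()), true
}

func main() {
	// With the threshold at Error, a reconcile error carries a full call stack.
	noisy := Logger{StacktraceLevel: Error}
	_, traced := noisy.Format(Error, "Reconciler error")
	fmt.Println("threshold Error, trace attached:", traced)

	// Raising the threshold to Panic keeps the error message but drops the stack.
	quiet := Logger{StacktraceLevel: Panic}
	out, traced := quiet.Format(Error, "Reconciler error")
	fmt.Println("threshold Panic, trace attached:", traced)
	fmt.Println(out)
}
```

The point of the design is that the error message is always logged; only the decision to attach the call stack depends on the configured threshold.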
This is so true, for some reason I was thinking I needed to reproduce this without changing the code 🤦♂️ Thanks for the links. I think I already found a workaround without changing the current controller-runtime revision, but it makes more sense to update it and use the builtin options as you pointed out.
Having worked on adding logging and events to our controllers for the past week, I really feel this is working as intended. Or, put otherwise, I feel like this feature request is overindexing on making our debug logs (which are intended for just that - debugging Crossplane) a super friendly, non-threatening user experience. Quoting myself from a related PR, I think events are the place for that:
It's worth noting that crossplane/crossplane-runtime#108 increases the number of places we log errors, and thus increases the number of places we log stack traces. In the following example I think the stack trace is actually quite useful, in that it would allow me to find the file and line from which the error originated:
2020-01-26T21:56:55.362-0800 DEBUG stack-gcp Cannot connect to provider {"controller": "managed/cloudsqlinstance", "request": "/default-app-postgresql-sz22q", "uid": "f287809b-3295-4cd8-ade6-ec6a6331033d", "version": "17603", "external-name": "", "error": "provider could not be retrieved: Provider.gcp.crossplane.io \"example\" not found", "errorVerbose": "Provider.gcp.crossplane.io \"example\" not found\nprovider could not be retrieved\ngithub.com/crossplaneio/stack-gcp/pkg/controller/database.(*cloudsqlConnector).Connect\n\t/Users/negz/control/crossplaneio/stack-gcp/pkg/controller/database/cloudsql.go:90\ngithub.com/crossplaneio/crossplane-runtime/pkg/reconciler/managed.(*Reconciler).Reconcile\n\t/Users/negz/control/go/pkg/mod/github.com/crossplaneio/crossplane-runtime@v0.4.1-0.20200127052939-8661974d35bd/pkg/reconciler/managed/reconciler.go:496\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/Users/negz/control/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:256\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/Users/negz/control/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:232\nsigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/Users/negz/control/go/pkg/mod/sigs.k8s.io/controller-runtime@v0.4.0/pkg/internal/controller/controller.go:211\nk8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/Users/negz/control/go/pkg/mod/k8s.io/apimachinery@v0.17.0/pkg/util/wait/wait.go:152\nk8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/Users/negz/control/go/pkg/mod/k8s.io/apimachinery@v0.17.0/pkg/util/wait/wait.go:153\nk8s.io/apimachinery/pkg/util/wait.Until\n\t/Users/negz/control/go/pkg/mod/k8s.io/apimachinery@v0.17.0/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/Cellar/go/1.13.6/libexec/src/runtime/asm_amd64.s:1357", "requeue-after": "2020-01-26T21:57:25.362-0800"}
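As a rough sketch of how the originating file and line can be read out of such an entry, the snippet below scans a JSON-decoded errorVerbose value for its first stack frame. It assumes the layout seen in the log above (message lines first, then a function name line followed by a tab-indented file:line line); `firstFrame` is a hypothetical helper, not part of any logging library.

```go
package main

import (
	"fmt"
	"strings"
)

// firstFrame returns the function name and file:line of the first stack frame
// in an errorVerbose-style string. It assumes each frame is a function name
// line followed by a tab-indented "file:line" line, as in the log entry above.
func firstFrame(verbose string) (fn, location string, ok bool) {
	lines := strings.Split(verbose, "\n")
	for i := 1; i < len(lines); i++ {
		if strings.HasPrefix(lines[i], "\t") {
			return lines[i-1], strings.TrimPrefix(lines[i], "\t"), true
		}
	}
	return "", "", false
}

func main() {
	// The first and last frames of the errorVerbose field above, after JSON
	// decoding (so "\n" and "\t" are real newline and tab characters).
	verbose := "Provider.gcp.crossplane.io \"example\" not found\n" +
		"provider could not be retrieved\n" +
		"github.com/crossplaneio/stack-gcp/pkg/controller/database.(*cloudsqlConnector).Connect\n" +
		"\t/Users/negz/control/crossplaneio/stack-gcp/pkg/controller/database/cloudsql.go:90\n" +
		"runtime.goexit\n" +
		"\t/usr/local/Cellar/go/1.13.6/libexec/src/runtime/asm_amd64.s:1357"

	if fn, loc, ok := firstFrame(verbose); ok {
		fmt.Println("function:", fn)
		fmt.Println("location:", loc) // the cloudsql.go:90 frame
	}
}
```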
Thank you very much @negz for the recent work on standardizing our logging approaches and adding events to the controller-runtime reconciler patterns. I think the work you've done goes a long way towards surfacing troubleshooting information in a standard way, and I agree with your sentiments on the stack traces in the approach you have been working on. Great work @negz! 💪
Is this a bug report or feature request?
What should the feature do: Currently, a big scary call stack is logged every time a Reconcile loop returns an error. The error is interesting, but the huge call stack adds no value and makes the logs harder to read.
Discussion from Slack:
What is use case behind this feature: As part of #331 to make the troubleshooting story better, removing these scary but unhelpful call stacks will also make the experience better.