Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Disable stacktrace from reconcile errors #49

Closed

Conversation

turkenh
Copy link
Contributor

@turkenh turkenh commented Oct 10, 2019

Signed-off-by: Hasan Turken turkenh@gmail.com

Description of your changes

This PR disables stacktraces from reconcile errors (below Fatal) by using new functionality added to controller runtime with this PR.

This requires updating controller runtime dependency as v0.2.2.

Related to crossplane/crossplane#529

Checklist

I have:

  • Run make reviewable to ensure this PR is ready for review.
  • Ensured this PR contains a neat, self documenting set of commits.
  • Updated any relevant documentation, examples, or release notes.
  • Updated the dependencies in app.yaml to include any new role permissions.

Signed-off-by: Hasan Turken <turkenh@gmail.com>
Signed-off-by: Hasan Turken <turkenh@gmail.com>
@upbound-bot
Copy link
Collaborator

67% (0.0%) vs master 67%

@turkenh
Copy link
Contributor Author

turkenh commented Oct 10, 2019

This PR changes the error logs, from:

2019-10-10T13:22:52.831+0300    ERROR   controller-runtime.controller   Reconciler error        {"controller": "gke.compute.gcp.crossplane.io", "request": "gcp/kubernetescluster-5c843147-069e-4a94-81d3-188c9e0fbd9c", "error": "failed to update cluster object: kubernetescluster-5c843147-069e-4a94-81d3-188c9e0fbd9c: Operation cannot be fulfilled on gkeclusters.compute.gcp.crossplane.io \"kubernetescluster-5c843147-069e-4a94-81d3-188c9e0fbd9c\": the object has been modified; please apply your changes to the latest version and try again", "errorVerbose": "Operation cannot be fulfilled on gkeclusters.compute.gcp.crossplane.io \"kubernetescluster-5c843147-069e-4a94-81d3-188c9e0fbd9c\": the object has been modified; please apply your changes to the latest version and try again\nfailed to update cluster object: kubernetescluster-5c843147-069e-4a94-81d3-188c9e0fbd9c\ngithub.com/crossplaneio/stack-gcp/pkg/controller/compute.(*Reconciler)._sync\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/pkg/controller/compute/gke.go:226\ngithub.com/crossplaneio/stack-gcp/pkg/controller/compute.(*Reconciler).Reconcile\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/pkg/controller/compute/gke.go:273\ngithub.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216\ngithub.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192\ngithub.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171\ngithub.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\ngithub.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\ngithub.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337"}
github.com/crossplaneio/stack-gcp/vendor/github.com/go-logr/zapr.(*zapLogger).Error
        /Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/github.com/go-logr/zapr/zapr.go:128
github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler
        /Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:218
github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem
        /Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192
github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker
        /Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171
github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1
        /Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152
github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil
        /Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153
github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait.Until
        /Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88

to:

2019-10-10T13:21:36.521+0300    ERROR   controller-runtime.controller   Reconciler error        {"controller": "gke.compute.gcp.crossplane.io", "request": "gcp/kubernetescluster-5c843147-069e-4a94-81d3-188c9e0fbd9c", "error": "failed to update cluster object: kubernetescluster-5c843147-069e-4a94-81d3-188c9e0fbd9c: Operation cannot be fulfilled on gkeclusters.compute.gcp.crossplane.io \"kubernetescluster-5c843147-069e-4a94-81d3-188c9e0fbd9c\": the object has been modified; please apply your changes to the latest version and try again", "errorVerbose": "Operation cannot be fulfilled on gkeclusters.compute.gcp.crossplane.io \"kubernetescluster-5c843147-069e-4a94-81d3-188c9e0fbd9c\": the object has been modified; please apply your changes to the latest version and try again\nfailed to update cluster object: kubernetescluster-5c843147-069e-4a94-81d3-188c9e0fbd9c\ngithub.com/crossplaneio/stack-gcp/pkg/controller/compute.(*Reconciler)._sync\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/pkg/controller/compute/gke.go:226\ngithub.com/crossplaneio/stack-gcp/pkg/controller/compute.(*Reconciler).Reconcile\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/pkg/controller/compute/gke.go:273\ngithub.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).reconcileHandler\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:216\ngithub.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).processNextWorkItem\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:192\ngithub.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller.(*Controller).worker\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/sigs.k8s.io/controller-runtime/pkg/internal/controller/controller.go:171\ngithub.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil.func1\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:152\ngithub.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait.JitterUntil\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:153\ngithub.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait.Until\n\t/Users/hasanturken/Go/src/github.com/crossplaneio/stack-gcp/vendor/k8s.io/apimachinery/pkg/util/wait/wait.go:88\nruntime.goexit\n\t/usr/local/go/src/runtime/asm_amd64.s:1337"}

@turkenh turkenh marked this pull request as ready for review October 10, 2019 18:03
@turkenh
Copy link
Contributor Author

turkenh commented Oct 10, 2019

Another approach could be to provide the functionality from crossplane-runtime like here (which still uses old controller runtime, but can be updated to new method), but I don't see much benefit in doing so since we still need to consume it from other repos.

@jbw976, @negz, what is your take on this?

@turkenh turkenh requested a review from negz October 10, 2019 18:06
@negz
Copy link
Member

negz commented Oct 10, 2019

I don't personally have any issue with seeing stack traces when I enable debug level logging, so personally I'm not convinced this really needs fixing. ¯\_(ツ)_/¯ I think I'm probably in the minority here though.

If we make these changes to our Crossplane logger, do they also apply to the controller-runtime logger?

@turkenh
Copy link
Contributor Author

turkenh commented Oct 10, 2019

If we make these changes to our Crossplane logger, do they also apply to the controller-runtime logger?

Change on Crossplane logger introduces a Logger implementation based on controller-runtime's ZapLagger. Then we would need to use our logger like, which means, the answer is yes:

---	zl := runtimelog.ZapLogger(*debug)
+++	zl := logging.ZapLogger(*debug, true)

	logging.SetLogger(zl)
	if *debug {
		// The controller-runtime runs with a no-op logger by default. It is
		// *very* verbose even at info level, so we only provide it a real
		// logger when we're running in debug mode.
		runtimelog.SetLogger(zl)
	}

I don't personally have any issue with seeing stack traces when I enable debug level logging, so personally I'm not convinced this really needs fixing. ¯_(ツ)_/¯ I think I'm probably in the minority here though.

While working on the issue, I see that the frequency of errors with stack traces are quite low, so honestly I am also not sure that this is really something that needs fixing. May be we should wait for @jbw976 opinion in this regard.

@jbw976
Copy link
Member

jbw976 commented Oct 14, 2019

I definitely still vote for fixing this issue as I have found it to be confusing and misleading for potential users. In the current state of logging, a full call stack gets printed out for simple transient errors that will easily be resolved on the next reconcile. When users look at the logs they are immediately drawn to the big scary call stacks which are actually a red herring and distract from other true/real problems.

Beyond that, the call stacks provide no real value either. If you look at the stack, it's 90% machinery for handling reconcile callbacks and the only real information learned is that it was an error from reconciling. We don't need an entire scary and distracting call stack to tell us that 🤓

lvl := zap.NewAtomicLevelAt(zap.FatalLevel)
o.StacktraceLevel = &lvl
}
zl := runtimezap.New(o)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

i sort of like the idea of having these common options, that will get reused across all the controller based stacks we're building right now, in a common and reusable place like crossplane-runtime. Then these consumers, such as stack-gcp/stack-aws/stack-azure would just call the single common function from crossplane-runtime to get an initialized log instance with a single line.

I see you started this common effort with crossplane/crossplane-runtime#47. Perhaps we can continue pursuing that?

@negz
Copy link
Member

negz commented Oct 14, 2019

@jbw976 Are you sure debug level logging is the place we should be focusing on simplifying and making (arguably) less scary at the expense of becoming (also arguably) less informative? Perhaps I spent too much time in the Java world, but when I am debugging a thing I prefer more context, and definitely feel that stack traces are helpful. They let me easily find the code and context of the call I may be concerned about.

I agree with the sentiment of making it easier for folks to understand what is going on under the covers in Crossplane, but I don't think debug level logging is the place for that. I like the way @suskin framed this problem in https://github.com/crossplaneio/crossplane/pull/858/files#r330818106. If we're optimising for case 1 (users need to figure out why stuff isn't working as expected) I suggest we invest our time in adding support for emitting informative Kubernetes events. If we're optimising for scenarios 3 or 4 (infrastructure operators or developers deep debugging Crossplane) then I suggest more context and information (i.e. traces) is better.

@turkenh
Copy link
Contributor Author

turkenh commented Oct 14, 2019

One information that could help to reach an agreement here is, we still have stacktraces available under errorVerbose field wrapped into a single line with \ns as can be seen #49 (comment).
Even some people are not happy with that, this means not losing any information for us when we disable stacktraces.

@negz
Copy link
Member

negz commented Feb 6, 2020

I'm going to close this issue in accordance with our discussion on crossplane/crossplane#529.

@negz negz closed this Feb 6, 2020
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants