Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Accept context when logging. #116

Closed
bogdandrutu opened this issue Oct 28, 2021 · 25 comments
Closed

Accept context when logging. #116

bogdandrutu opened this issue Oct 28, 2021 · 25 comments
Assignees

Comments

@bogdandrutu
Copy link

Is it possible to extend the API to accept context when logging? The reason is that we want to extract information from the context like opentelemetry trace ids (trace_id/span_id) or other information that we propagate in the app using the context.

@wojas
Copy link
Contributor

wojas commented Oct 29, 2021

I guess adding a WithContext(ctx) method would make sense, in combination with an optional ContextLogSink interface.

@pohly
Copy link
Contributor

pohly commented Oct 29, 2021

WithContext(ctx) is doable, but it will have higher overhead than an explicit parameter and also a potential for misuse:

func doSomething(ctx context) {
   logger = logger.WithContext(ctx)
   ctx = context.WithValue(ctx, ...)
   logger.Info("I'm using the wrong context")
}

The alternative is to add InfoC and ErrorC which take an additional parameter and then call methods in a new, optional ContextLogSink interface.

Another idea: Logger construction could take a func(ctx context.Context) []interface{} parameter, either during init or via WithContextValues, and then InfoC and ErrorC can call that function to extract key/value pairs that it adds to the log message via the LogSink's WithValues call - that would work with all logging implementations.

But the problem in all of these cases is that all log calls must be updated. Do you control the entire code base? What about third-party code that accepts a Logger but then emits log messages with the existing Logger.Info?

The portable solution would be to do a bit more work when adding these values to the context:

  • add them to the context
  • add them to the logger
  • add the logger to the context

@wojas
Copy link
Contributor

wojas commented Oct 29, 2021

I would assume that the WithContext would only be called once per request and the resulting object passed around to all code that performs logging.

Passing a context for every single log call does not really make sense and makes the API uglier. It would also require updating every single call site to pass a context explicitly, or the trace_id and span_id would not be available.

It is possible to construct a Logger with the request context for every request, but then every part of the code that handles a new request needs to mess with creating loggers, instead of just defining this once in the main package.

@wojas
Copy link
Contributor

wojas commented Oct 29, 2021

I just realized that in order to pass the proper span_id, you would have to call WithContext at the span level and not just once for the request. But the overhead would probably still be small, with a span covering the execution of a non-trivial component that is worth tracing.

@pohly
Copy link
Contributor

pohly commented Oct 29, 2021

I would assume that the WithContext would only be called once per request and the resulting object passed around to all code that performs logging.

Is that sufficiently different from the already supported solution via WithValue that we need a new API, in particular when considering that the new API will only work for certain logging backends?

@bogdandrutu
Copy link
Author

I would prefer to add "InfoC and ErrorC", and here is the reason:

  • For WithValue/WithContext solution to work (with small overhead) I have to propagate the Logger with the request as well. I know that there is a way to put Logger into context, but that is a bit hacky since context is to propagate request specific things. In general I see a Logger instance to have a static scope not a request scope, so for example a component creates a Logger object and adds things like component name to the logger fields. So for me to be able to add "request specific things" and "component/instance specific things" I cannot do it without paying the overhead of one "WithValue" call.

@pohly
Copy link
Contributor

pohly commented Oct 29, 2021

I know that there is a way to put Logger into context, but that is a bit hacky since context is to propagate request specific things.

WithValues makes the logger returns a logger that is specific to a certain request. IMHO passing that then through the context is fine.

@bogdandrutu
Copy link
Author

It is not ok for me, see an example code, written for Zap but applies for this API as well:

type ComponentA struct {
  logger *zap.Logger
  compB *ComponentB
}

func NewComponentA(logger *zap.Logger, compB *ComponentB) *ComponentA {
  return &ComponentA {
    logger: logger.With(zap.String("component", "A")),
    compB: compB,
  }
}

func (a *ComponentA) DoWork(ctx context.Context, req interface{}) error {
  // do some work.
  a.logger.DebugC(ctx, "Start compB.DoWork")
  err := a.compB.DoWork(ctx, req)
  a.logger.DebugC(ctx, "Finish compB.DoWork", zap.Error(err))
  return err
}

type ComponentB struct {
  logger *zap.Logger
}

func NewComponentB(logger *zap.Logger) *ComponentB {
  return &ComponentB {
    logger: logger.With(zap.String("component", "B")),
  }
}

func (a *ComponentA) DoWork(ctx context.Context, req interface{}) error {
  // do some work.
  a.logger.InfoC(ctx, "I am zen")
  return nil
}

How would you refactor this App with "WithValue"

@pohly
Copy link
Contributor

pohly commented Oct 30, 2021

Let's also consider how such code would be used and how it would interact with other code that does logging.

Presumably these component get called in a for loop where requests are generated and spans are added, something like this:

var spanKey struct{}

func run(ctx context.Context, a ComponentA, queue <-chan interface{}) {
	for {
		select {
		case <-ctx.Done():
			return
		case req := <-queue:
			ctx := context.WithValue(ctx, spanKey, "span-id")
			a.DoWork(ctx, req)
		}
	}
}

This leads to my first question: how is logging of the context values expected to be handled? Is the expectation that the program uses a logger implementation that knows about tracing spans? That adds additional dependencies to logging implementations that not everyone needs or limits the choice of a logging implementation when this feature is needed.

Or is the tracing library also providing the logger? A full implementation or a wrapper for some other logger?

@pohly
Copy link
Contributor

pohly commented Oct 30, 2021

Let's also assume that ComponentB.DoWork calls a function that currently logs via a global logger (a problem that we have in Kubernetes):

func (b *ComponentB) DoWork(ctx context.Context, req interface{}) error {
  // do some work.
  b.logger.InfoC(ctx, "I am zen")
  DoSomething("enlightenment")
  return nil
}

func DoSomething(arg string) {
	klog.InfoF("my argument is %q", arg)
}

With the proposed logging of additional context values via InfoC, DoSomething must get a context and a logger as parameter:

func (b *ComponentB) DoWork(ctx context.Context, req interface{}) error {
  // do some work.
  b.logger.InfoC(ctx, "I am zen")
  DoSomething(ctx, b.logger, "enlightenment")
  return nil
}

func DoSomething(ctx context.Context, logger logr.Logger, arg string) {
	logger.InfoC(ctx, "Doing something", "arg", arg)
}

Agreed or would you solve this differently?

@pohly
Copy link
Contributor

pohly commented Oct 30, 2021

So here's how I would address this problem:

package main

import (
	"context"

	"github.com/go-logr/logr"
)

type ComponentA struct {
	compB *ComponentB
}

func NewComponentA(compB *ComponentB) *ComponentA {
	return &ComponentA{
		compB: compB,
	}
}

func (a *ComponentA) DoWork(ctx context.Context, req interface{}) error {
	logger := logr.FromContextOrDiscard(ctx).WithName("A")
	logger.Info("Start compB.DoWork")
	ctx = logr.NewContext(ctx, logger) // optional, see comment below
	err := a.compB.DoWork(ctx, req)
	logger.Info("Finish compB.DoWork", "err", err)
	return err
}

type ComponentB struct {
}

func NewComponentB() *ComponentB {
	return &ComponentB{}
}

func (b *ComponentB) DoWork(ctx context.Context, req interface{}) error {
	logger := logr.FromContextOrDiscard(ctx).WithName("B")
	logger.Info("I am zen")
	DoSomething(logger, "enlightenment")
	return nil
}

var spanKey struct{}

func run(ctx context.Context, a *ComponentA, queue <-chan interface{}) {
	logger := logr.FromContextOrDiscard(ctx)
	for {
		select {
		case <-ctx.Done():
			return
		case req := <-queue:
			spanID := "span-id"
			logger := logger.WithValues("span", spanID)
			ctx := context.WithValue(ctx, spanKey, spanID)
			ctx = logr.NewContext(ctx, logger)
			a.DoWork(ctx, req)
		}
	}
}

func DoSomething(logger logr.Logger, arg string) {
	logger.Info("Doing something", "arg", arg)
}

func main() {
	logger := logr.Discard() // TODO: use some real logger
	ctx := logr.NewContext(context.Background(), logger)
	b := NewComponentB()
	a := NewComponentA(b)
	var queue chan interface{}
	run(ctx, a, queue)
}

You chose to use zap.Logger.With to identify the component. Here I am using names, which can be combined. In this example, the "I am zen" and "Doing something" log messages will be logged with "A/B" (the combination of two names) to identify the call path. This is different from merely identifying just one component that is responsible for emitting the log messages. Whether it's "better" depends on the expectations - I personally find it more informative.

Regarding the handling of tracing IDs, I see the following advantages of this approach:

  • Works with all logger implementations and log calls without modifications.
  • Passing just one parameter, the logger, to DoSomething feels more natural. Besides that, we cannot just remove Logger.Info (API break in logr v1.0.0), so there won't be any obvious indication to the implementer of DoSomething that the function should accept a context and use the newer Logger.InfoC.
  • Passing a logger from the caller to ComponentA.DoSomething enables additional use-cases. The "call path" via names is one example (the caller might pass in a logger which already has some names), another is adding arbitrary additional key/value to each log message (the approach with additional context values only works for keys that the logger knows about).

The disadvantage is the overhead caused by more context.WithValue and context.Value calls.

@wojas
Copy link
Contributor

wojas commented Nov 2, 2021

After giving it some more thought, I agree that explicitly passing any trace and span IDs as values would be the best use of the interface. Binding a new context after every new span would be error prone and non-obvious.

I cannot think of any real use for methods like InfoC. For the span IDs you want to bind the information in the logger instance, or suddenly you need to update all your logging code to pass this information. This would also mean that external modules will be split between requiring a context and meticulously passing it on every log call, and modules that do not pass the span information. This would be really bad for the logr ecosystem.

@bogdandrutu
Copy link
Author

I will try to summarize here:

This leads to my first question: how is logging of the context values expected to be handled? Is the expectation that the program uses a logger implementation that knows about tracing spans? That adds additional dependencies to logging implementations that not everyone needs or limits the choice of a logging implementation when this feature is needed.

I can think of a logger implementation that allows users to config an "enhancer/hook" like func (context.Context) []interface{} that will be called to extract any information from the Context.
For the k8s example, no need to pass a logger, just the Context is enough. Users will configure the "enhancer/hook" the same way for local instance vs global instance:

func (b *ComponentB) DoWork(ctx context.Context, req interface{}) error {
  // do some work.
  b.logger.InfoC(ctx, "I am zen")
  DoSomething(ctx, b.logger, "enlightenment")
  return nil
}

func DoSomething(ctx context.Context, arg string) {
	klog.InfoC(ctx, "Doing something", "arg", arg)
}

More about why passing the "logger" via Context is an anti-pattern. See https://pkg.go.dev/context:

Use context Values only for request-scoped data that transits processes and APIs, not for passing optional parameters to functions.

As the official documentation says we should not pass arguments via Context, as well as we should only pass "request-scoped" data, the "logger" is not a request-scoped data (As a joke: this trick of passing the logger via Context started from the idea of "using global instances is bad" let's add it to the Context :)), but what do you add to the Context is still a global). The main idea is that "logger" should not be passed in the Context, and instead other "request scoped" data should be which sometimes you want to log them.

@pohly
Copy link
Contributor

pohly commented Nov 11, 2021

the "logger" is not a request-scoped data

Here I disagree. "Use this logger for this request" very much is request-scoped information and enables the caller to choose additional names and values for the request.

@bogdandrutu
Copy link
Author

bogdandrutu commented Nov 11, 2021

What about then "use this logger for this component" part? Which one is more common? you cannot have both. Traditionally logger was "per component" to allow finer control at component level.

@bogdandrutu
Copy link
Author

bogdandrutu commented Nov 11, 2021

My point being you cannot have both in an application, so I think there is a need to pick one winner and facilitate the other via API.

Imagine that I want different logging level for ComponentA than ComponentB, or to have different instances of loggers etc. then this is limited because I have to use that logger for the entire request.

@pohly
Copy link
Contributor

pohly commented Nov 11, 2021

My point being you cannot have both in an application, so I think there is a need to pick one winner

That's true.

I can see some value in having per-component loggers with different names because it's a better explanation of where a message was generated than the source file name.

But for the kube-scheduler case that I have been looking at it's more important to have per-request values attached to each log message. Using the name of the scheduler framework plugin also seems more useful to me than "framework" or "client-go" (= components called by the plugins) because it provides a much better context for the individual log message ("while processing pod xyz with plugin foo, this happened").

For Kubernetes, I intend to propose a KEP in the next development cycle about "contextual logging" that will have more use cases and details. No changes will be needed in go-logr for that.

Imagine that I want different logging level for ComponentA than ComponentB, or to have different instances of loggers

How would a user configure that? Which program allows fine-grained control over logger configuration for individual program modules?

In klog, there is -vmodule (= configure log level per source file), but it has a performance overhead and it wasn't clear how much it really is used besides manual debugging.

If different log levels are desired, are they for modules ("I am debugging client-go") or for context ("I am focusing on a specific request with certain values", or "I am interested in a certain scheduler plugin and anything it invokes, including client-go")? IMHO both are valid approaches and thus cannot be used to pick one logger setup over the other.

@bogdandrutu
Copy link
Author

bogdandrutu commented Nov 11, 2021

But for the kube-scheduler case that I have been looking at it's more important to have per-request values attached to each log message. Using the name of the scheduler framework plugin also seems more useful to me than "framework" or "client-go" (= components called by the plugins) because it provides a much better context for the individual log message ("while processing pod xyz with plugin foo, this happened").

You can have extra "request specific data" like what opentelemetry/w3c calls baggage (see https://www.w3.org/TR/baggage and https://github.com/open-telemetry/opentelemetry-go/blob/main/baggage/baggage.go), essentially [key, values] pairs associated with the Context and one of them can be "plugin" and with same "hook/enhancer" mechanism you can extract tracing data like span_id as well as other infos like "plugin" from other context entry. Also that "plugin" entry from the Context baggage may be used by other telemetry signals like a label/dimension in metrics recorded by one of the downstream component. In my opinion that is the "request specific data" the fact that the caller plugin is Foo.

Another use-case that you need to keep in mind is that by passing the context you can have applications that use a "global" logger like in the k8s example you gave to me, and if they pass the Context they can also get context data to log entries.

In klog, there is -vmodule (= configure log level per source file), but it has a performance overhead and it wasn't clear how much it really is used besides manual debugging.

I am not sure why that is slow, but I think a pretty optimal implementation can be done by having a Logger instance that supports a sampler and every time you do Logger.With("component", "foo") it can say if Logger has attribute "component = foo" sample 1/1000. Or any other attribute the application owner decides.

** EDIT: **
FYI: @pohly I really enjoy this conversation, and I am trying to brainstorm and search for the right solution. I am happy to review the KEP as well if you need help there.

@pohly
Copy link
Contributor

pohly commented Nov 11, 2021

I am not sure why that [= -vmodule] is slow

It has to do source-code lookups for each call site before even deciding about whether the log message should be logged. There's a cache, so perhaps it's okay. I don't have hard data.

every time you do Logger.With("component", "foo") it can say if Logger has attribute "component = foo" sample 1/1000

The same can be done when names get added along the call path (my scheduler example). That and configuration by values seem useful.

I really enjoy this conversation, and I am trying to brainstorm and search for the right solution. I am happy to review the KEP as well if you need help there.

Glad to hear that and thanks for the offer. I'll link it here when it's ready.

@bogdandrutu
Copy link
Author

Yes, so to summarize your proposed solution (will work):

  • It has a bit more "allocations" see (and works fine for just "logs" if same data needs to be shared between logs/metrics/tracing you do even more allocations):
	logger := logger.WithValues("span", spanID)
	ctx := context.WithValue(ctx, spanKey, spanID)
	ctx = logr.NewContext(ctx, logger)
  • It forces to avoid "global" logger (which is a good pattern in general, but also a limitation) vs the InfoC allows global logger. Passing the "Context" is the same in both solutions.
  • For any other "context/request specific data" user has to manually do some work to add it specifically to the logger in the Context, vs installing a new hook/enhancer to the logger instance. Here is more important if the data in the context is added by a third-party library see for example "grpc.Peer".

@pohly
Copy link
Contributor

pohly commented Nov 12, 2021

It forces to avoid "global" logger (which is a good pattern in general, but also a limitation) vs the InfoC allows global logger. Passing the "Context" is the same in both solutions.

I intend to have a global logger, but only as fallback. A library that supports retrieving a logger from context then still works fine inside applications which only initialize a global logger.

For any other "context/request specific data" user has to manually do some work to add it specifically to the logger in the Context, vs installing a new hook/enhancer to the logger instance. Here is more important if the data in the context is added by a third-party library see for example "grpc.Peer".

The gRPC case probably can be handled once for all calls by an interceptor which extracts the third-party data and updates the logger. Then neither gRPC nor the logger implementation need to know about each other. It's the application developer who integrates them.

@bogdandrutu
Copy link
Author

The gRPC case probably can be handled once for all calls by an interceptor which extracts the third-party data and updates the logger. Then neither gRPC nor the logger implementation need to know about each other. It's the application developer who integrates them.

I can give you other examples where libraries add things to the Context and do not offer a "interceptor" mechanism.

@thockin
Copy link
Contributor

thockin commented Nov 24, 2021

I do not want to add InfoC / ErrorC. It's too much cost for too little return, I think.

I'd probably reach for bespoke wrappers here. Either write your own:

func (a componentA) LogInfo(ctx context.Context, msg string, args ...interface{})

or make your own Logger type which embeds logr.Logger and add InfoC().

I could MAYBE be talked into something like:

log := logr.Discard() // use a real one, duh

extractFoo := func(ctx context.Context) (string, interface{}) {
    foo := foopkg.FromContext(ctx)
    return "fookey", foo
}

extractBar := func(ctx context.Context) (string, interface{}) {
    bar := barpkg.FromContext(ctx)
    return "barkey", bar
}

log = log.WithContextValues(extractFoo, extractBar)

// later

ctxlog = log.WithContext(ctx)

ctxlog.Info("message") // will call each func on the context

The point being simplicity - register any number of datum-extractors (after
writing it, maybe returning []interface{} is OK), and run those on the
context any time we want to emit a log. If there's no context set, skip the
funcs.

But I'd want to see why wrapping doesn't work?

@thockin thockin self-assigned this Nov 24, 2021
@pohly
Copy link
Contributor

pohly commented Dec 7, 2021

@bogdandrutu: I have published a PR with the KEP and intend to discuss it at this week's SIG Instrumentation meeting: kubernetes/enhancements#3078

The first step will be to decide whether the SIG wants to tackle this problem.

Under "Alternatives" I am summarizing the "per component logger" approach and refer to this issue here for a more thorough discussion.

@thockin
Copy link
Contributor

thockin commented Jan 19, 2022

Closingfor now - if there's new info or something, I'd love to hear it.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants