
Support canonical log lines & log level per request #583

Closed · mitar opened this issue Aug 20, 2023 · 16 comments · Fixed by #602

Comments

mitar commented Aug 20, 2023

Canonical log lines

There are generally two ways to organize log lines which are all related to a particular request:

  • Generate some type of request ID and then include that request ID in all log lines related with the request.
  • Issue only one log line for the whole request, accumulating all logging into it. This can be called canonical log lines.

In a way, this is the standard question of normalized vs. denormalized data.

There are some existing issues I could find which seem to talk about support for canonical log lines in zerolog: #362 and #408, but I think it is worth investigating whether zerolog should and could support canonical log lines out of the box.

The support for request IDs is easy with sub-loggers (a minimal sketch of that follows at the end of this section). And I think it could be the same for canonical log lines. In fact, it could also be organized with sub-loggers, maybe only with some additional .Nested() or similar call which would turn on the canonical log lines feature. Then, when a sub-logger is asked to send the event, instead of writing it out to the writer, the log line would accumulate inside the parent logger's array field associated with the level. Only when the parent logger then issues its event would the log lines be printed out. So the idea here is:

  • At the beginning of the request handling you obtain a canonical log lines logger (e.g., call .Nested() or something).
  • You pass this around inside the context; everything logs using it.
  • After handling finishes, you call .Send() or something, maybe even adding more metadata (like the amount of data transferred during request handling, etc.).

The structure could be something like:

{
  "warn": [...],
  "error": [...]
}

And so on for different levels.
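
For comparison, the request-ID approach from the first bullet above already works with zerolog's existing sub-logger API; a minimal sketch (the request_id field name and the example ID are arbitrary):

package main

import (
	"os"

	"github.com/rs/zerolog"
)

func main() {
	logger := zerolog.New(os.Stdout).With().Timestamp().Logger()

	// One sub-logger per request: every event it emits carries the request ID,
	// so the lines can be grouped back together later.
	reqLogger := logger.With().Str("request_id", "abc-123").Logger()
	reqLogger.Debug().Msg("cache miss")
	reqLogger.Error().Msg("upstream timed out")
}

The canonical variant proposed above would instead accumulate those two events inside the parent logger and emit them as one nested line.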

Log level per request

An interesting aspect of canonical log lines is how level filtering works. I propose that the canonical log line gets the level of the most severe event logged into it. For example, if both error and warn levels were logged for the same request, then the canonical log line gets the error level.

With canonical log lines, the sub-logger could log at a very detailed level (e.g., debug), but the whole line would be issued only if an error was logged as well, by setting the minimal log level to error. This would allow developers to have debugging information together with all errors, but only with errors.

This type of logging is often called "log level per request". To my understanding, zerolog does not currently support that either. It could be supported for both canonical log lines and regular log lines. This is where we can see that this is not just a question of de/normalization of log lines, but about processing log events in batches.

Batching log events

Both canonical log lines and log level per request show a need for some "batching of log events" feature in zerolog, where log events would be batched together until some later event decides whether to write them out (potentially modifying them). In the case of canonical log lines, batching would collect all log events for a sub-logger and then write them out as one log event. In the case of log level per request, after the batch finishes, some logic would decide whether to write it out.

(I am not sure if "batching" is the right term here. Maybe "batch processing" or something?)

Alternatives

Alternatives to both are possible by post-processing logs:

  • You can store all logs into a searchable index and then group log entries by request ID.
  • You could always log at debug level and then later remove debugging log entries from the index if they are not associated with errors. This is not very performant, though, and logging indices generally behave well only when they are append-only.
  • You could imagine a zerolog writer which does such combining of log entries by request ID into canonical log lines. This would require parsing the JSON back, combining it, and writing new JSON out; again, not very performant.
  • Similarly, a zerolog writer could filter log entries based on the presence of errors per request. But again, that requires JSON parsing inside the writer.

rs commented Aug 20, 2023

I wonder if that could be implemented at the writer level, where you create one logger per request with a pre-allocated (pooled) buffer writer. At the end of the request, if the request was an error, you flush the buffer (which writes to the destination writer); if not, you reset it and put it back into its pool.

We could provide such an implementation, or just reference another package that implements this logic.
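
A rough sketch of that per-request pooled-buffer idea; RequestBuffer and its methods are hypothetical, not existing zerolog API:

package logbuf

import (
	"bytes"
	"io"
	"sync"
)

// bufPool reuses per-request buffers so a new buffer is not allocated for
// every request.
var bufPool = sync.Pool{
	New: func() interface{} { return new(bytes.Buffer) },
}

// RequestBuffer collects a request's log lines until the request finishes.
type RequestBuffer struct {
	buf *bytes.Buffer
}

func NewRequestBuffer() *RequestBuffer {
	return &RequestBuffer{buf: bufPool.Get().(*bytes.Buffer)}
}

// Write buffers a log line; the per-request logger writes here.
func (r *RequestBuffer) Write(p []byte) (int, error) {
	return r.buf.Write(p)
}

// Close flushes the buffered lines to dst if the request failed, otherwise it
// discards them, and returns the buffer to the pool either way.
func (r *RequestBuffer) Close(dst io.Writer, failed bool) error {
	var err error
	if failed {
		_, err = dst.Write(r.buf.Bytes())
	}
	r.buf.Reset()
	bufPool.Put(r.buf)
	return err
}

The per-request logger would be created with zerolog.New on such a buffer, and the middleware would call Close once the handler returns.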


mitar commented Aug 20, 2023

I think the per-request logger/writer approach can work, but the tricky part is the logic for "at the end of the request, if the request was an error": if you do not have to inspect what has been logged during the request to make this determination, then this works. But if you do have to inspect what has been logged (e.g., you want to flush the buffer if there was any log message at error level or above), then you have to parse what has been written.


rs commented Aug 21, 2023

Make it implement LevelWriter and flush the buffer on first error+ level, then set a flag to make it a passthrough after that.


mitar commented Aug 21, 2023

You are right. Also, I realized that I can easily convert the log lines written to the writer into a JSON array using RawMessage (I just have to remove the trailing newline from each line). And given that LevelWriter also gets level information, I can put them into arrays per level. The important thing, though, is to make this writer thread-safe.
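
A hypothetical helper illustrating that conversion (groupByLevel is not an existing API; it only shows the json.RawMessage idea, assuming the lines were already buffered per level):

package logutil

import (
	"bytes"
	"encoding/json"

	"github.com/rs/zerolog"
)

// groupByLevel turns buffered log lines into a single JSON object with one
// array per level, e.g. {"warn":[...],"error":[...]}.
func groupByLevel(lines map[zerolog.Level][][]byte) ([]byte, error) {
	grouped := map[string][]json.RawMessage{}
	for level, levelLines := range lines {
		for _, line := range levelLines {
			// Each buffered line is already valid JSON once the trailing
			// newline is removed.
			raw := json.RawMessage(bytes.TrimRight(line, "\n"))
			grouped[level.String()] = append(grouped[level.String()], raw)
		}
	}
	return json.Marshal(grouped)
}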

Thanks!

I can implement this. Would you consider having a writer (or two) supporting this added to this package? I can make a PR. But given the number of open issues and PRs, maybe it would be too much work for you to review it?

(BTW, I could help do one pass over issues and PRs if you want. I do not have time to help long-term, but I went through many issues recently trying to better understand this package and found some which could be closed or commented upon. Similarly, I could leave some comments on a few PRs. I would not merge anything; I might suggest that you do so, though.)


rs commented Aug 21, 2023

Why is it necessary to put those messages in an array? Wouldn't a request ID be sufficient to group them?


mitar commented Aug 22, 2023

I mean, the whole idea of canonical log lines is that you have one log line per HTTP request handled. So any debug logging during that handling should be nested inside that line. The idea is that you pre-group everything into one log message. It is kind of an optimization for the common case.

My concrete proposal to implement for the zerolog package itself would be to implement the following writer:

type BufferedLine struct {
  Level Level
  Line []byte
}

type BufferedLevelWriter struct {
  lines []BufferedLine
  levelCounts map[Level]int
}

// FlushTo writes all buffered lines to writer and clears the buffer.
func (w *BufferedLevelWriter) FlushTo(writer LevelWriter) (int, error)

// LevelCount returns the count of lines at level in the buffer.
func (w *BufferedLevelWriter) LevelCount(level Level) int

// FlushLines returns all buffered lines and clears the buffer.
func (w *BufferedLevelWriter) FlushLines() []BufferedLine

I think with such a writer one can then easily implement everything described above. One can easily check whether any error logging has happened using LevelCount (I am not completely happy with the function description; maybe it should return the number of lines at the given level or above in the buffer, so that the caller does not have to iterate over multiple levels themselves). Then it is easy to flush them to another writer using FlushTo. Or, if they want to embed them inside some other logging message, they can get the lines using FlushLines.


rs commented Aug 22, 2023

The concept of a canonical log, if I understand it, is that you get all the information about a request in a single line, so it has all the info needed for that request.

The argument is that it's easier to query and aggregate. Merging multiple log lines into arrays and logging those arrays of structured log lines is, in my opinion, going in the opposite direction, as arrays are much harder to query and aggregate in most logging systems I know.

A canonical log would be much easier to implement by the caller. TBH, without knowing about the concept of canonical log lines, that is how I have always logged requests: putting all the info in a single line. I don't think this should be a logging library feature, but rather how you use it.

The log level per request idea, though, is interesting. Emitting debug logs during the request that are only written to the logger target if the canonical log line (or any log line, really) emitted for the request is an error is an interesting concept.

Here is an example of what I have in mind:

type TriggerLevelWriter struct {
	// Destination writer
	io.Writer

	// ConditionalLevel is the level at which lines are buffered until a trigger
	// level line is emitted. Usually this is set to DebugLevel.
	ConditionalLevel Level

	// TriggerLevel is the level that triggers the sending of the conditional
	// level lines. Usually this is set to ErrorLevel.
	TriggerLevel Level

	// TODO: handle thread safety
	buf       bytes.Buffer
	triggered bool
}

func (w *TriggerLevelWriter) WriteLevel(l Level, p []byte) (n int, err error) {
	// Unless triggered, we buffer everything below ConditionalLevel.
	if l <= w.ConditionalLevel && !w.triggered {
		w.buf.Write(p)
		return len(p), nil
	}

	// At first trigger level log line, we flush the buffer and change the
	// trigger state.
	if l == w.TriggerLevel && !w.triggered {
		_, err = w.Write(w.buf.Bytes())
		w.buf.Reset()
		w.triggered = true
		if err != nil {
			return 0, err
		}
	}

	// Anything above ConditionalLevel is always passed through.
	// Once triggered, everything is passed through.
	return w.Write(p)
}


mitar commented Aug 22, 2023

I reread everything I could find about canonical log lines and I think you are right. They do not mention that sub-logger messages should be nested inside the canonical log line. So I think this means the pattern here is to add to the request context:

  • An open logging event to which various parts of the request-handling code can add more data for the canonical log line. The logging event is sent after the request finishes processing. This can be done in middleware. The canonical log line also introduces the request ID.
  • A sub-logger which has the request ID preset, so that all logging messages made during request handling can be tied back to the canonical log line (and the request itself). Also, any other services which are called during request handling should use that request ID and store logs into the same centralized log storage.

Now the question is only which messages from that sub-logger we really store. The idea behind "log level per request" is that this depends on how the request was handled: were there any errors in handling it, or any errors logged? I think this can be done in the same middleware spot after request handling that writes out the canonical log line; it would also decide whether to write out the buffered log lines or discard them. A rough sketch of the overall pattern follows.
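
This middleware sketch is purely illustrative; CanonicalLogMiddleware, the github.com/google/uuid dependency, and the summary fields are assumptions, while With, WithContext, and the event methods are existing zerolog API:

package middleware

import (
	"net/http"
	"time"

	"github.com/google/uuid"
	"github.com/rs/zerolog"
)

// CanonicalLogMiddleware puts a request-ID sub-logger into the request
// context and emits one summary ("canonical") log line per request.
func CanonicalLogMiddleware(logger zerolog.Logger, next http.Handler) http.Handler {
	return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
		requestID := uuid.NewString() // any request ID scheme works here

		// Sub-logger: every regular log line carries the request ID.
		reqLogger := logger.With().Str("request_id", requestID).Logger()
		ctx := reqLogger.WithContext(r.Context())

		start := time.Now()
		next.ServeHTTP(w, r.WithContext(ctx))

		// Canonical log line: one summary event, sent after handling finishes.
		logger.Info().
			Str("request_id", requestID).
			Str("method", r.Method).
			Str("path", r.URL.Path).
			Dur("duration", time.Since(start)).
			Msg("request handled")
	})
}

Handlers further down the chain would retrieve the request-scoped sub-logger with zerolog.Ctx(r.Context()).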

Your TriggerLevelWriter looks good, but I have a few comments:

  • Should calls to the Write method also be buffered until the writer is triggered, given that you embedded the destination writer?
  • Should the destination writer be an io.Writer or a zerolog.LevelWriter? I am not sure what the plans are for chaining writers together. I could imagine that one would like to use MultiLevelWriter to always write debug logs to the console but skip them for non-errors when storing them elsewhere. If it is a zerolog.LevelWriter, then more than a simple buffer has to be used, because the levels have to be stored as well.
  • I suggest adding a Flush (or Trigger?) method. This would allow one to write out the buffered lines even if no error was logged. For example, the request handling itself could fail (even panic), and in the middleware handling that, we could then call this method to get all the debugging logs.


rs commented Aug 22, 2023

Should calls to the Write method also be buffered until the writer is triggered, given that you embedded the destination writer?

We could for good measure, but Write should never be called by the logger when WriteLevel is available.

Should the destination writer be an io.Writer or a zerolog.LevelWriter?

It should stay an io.Writer, but this writer should probably propagate levels if the provided writer implements LevelWriter (a small sketch of that is at the end of this comment). This is probably something that should be true for all writer wrappers provided by the lib. Good callout.

I suggest adding a Flush (or Trigger?) method

Agreed
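
A small sketch of the level-propagation point above; writeWithLevel is a hypothetical helper, not existing zerolog API:

package writerutil

import (
	"io"

	"github.com/rs/zerolog"
)

// writeWithLevel keeps the destination typed as io.Writer, but prefers
// WriteLevel when the underlying writer also implements zerolog.LevelWriter,
// so level information is not lost when writers are wrapped.
func writeWithLevel(w io.Writer, l zerolog.Level, p []byte) (int, error) {
	if lw, ok := w.(zerolog.LevelWriter); ok {
		return lw.WriteLevel(l, p)
	}
	return w.Write(p)
}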


mitar commented Aug 22, 2023

OK. I can take a stab at this and make a PR.


rs commented Aug 22, 2023

Note there is an error in my code: when flushing the buffer, we need to call one Write per log line instead of writing all the lines at once (it's part of zerolog's contract).


mitar commented Aug 22, 2023

Yeah, there are more errors. :-) Like returning 0 for the written bytes when there is an error in the trigger case.

In general, I think we will have to store lines and levels, line by line, so that it is easy to loop over them.


rs commented Aug 22, 2023

I think it's better to store them in a (pooled) buffer so we can keep the zero-allocation promise and minimal code complexity. In the case where the writer is triggered (which should be the exception), we can easily read the lines back from the buffer.

The level can be stored as the first byte of each line and trimmed on write.
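
An illustrative sketch of that encoding; bufferLine and flushLines are hypothetical helpers, not zerolog API:

package levelbuf

import (
	"bytes"

	"github.com/rs/zerolog"
)

// bufferLine prefixes the line with its level byte and appends it to the
// buffer. Each zerolog line already ends with '\n', which delimits lines.
func bufferLine(buf *bytes.Buffer, l zerolog.Level, p []byte) {
	buf.WriteByte(byte(l))
	buf.Write(p)
}

// flushLines replays the buffered lines one at a time, trimming the level
// byte from each line and passing the level back to the destination writer.
func flushLines(buf *bytes.Buffer, dst zerolog.LevelWriter) error {
	for {
		line, err := buf.ReadBytes('\n')
		if len(line) > 1 {
			if _, werr := dst.WriteLevel(zerolog.Level(line[0]), line[1:]); werr != nil {
				return werr
			}
		}
		if err != nil {
			// io.EOF: the buffer is exhausted.
			return nil
		}
	}
}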


mitar commented Aug 22, 2023

I am not so familiar with preventing allocations in Go, so please correct me, but my understanding is that the main allocation here is the line itself. So when WriteLevel(l Level, p []byte) is called, l and p are already allocated. I think having a pool of []struct{L Level; p []byte} slices would be much better than trying to modify p to prefix the level, or even copying p into some other pooled buffer. And memory-wise, isn't having a struct very similar to having a linear buffer, just that you do not have to prepend/split the level yourself but can access it inside the structure?

So my proposal would be to simply have a sync.Pool of []struct{L Level; p []byte} slices where I literally store l and p into that struct. Before putting the slice back, I would reset its length to 0 but keep its already-allocated capacity.
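
A sketch of that pooled slice-of-structs idea, with one caveat added: p has to be copied, because the io.Writer contract does not allow retaining the slice passed to Write or WriteLevel. All names here are illustrative:

package linepool

import (
	"sync"

	"github.com/rs/zerolog"
)

type bufferedLine struct {
	level zerolog.Level
	line  []byte
}

// A pointer to the slice is pooled so that putting it back does not force an
// extra allocation for the interface value.
var linePool = sync.Pool{
	New: func() interface{} {
		s := make([]bufferedLine, 0, 16)
		return &s
	},
}

// appendLine records one log line; p is copied because the logger may reuse
// the slice after WriteLevel returns.
func appendLine(lines *[]bufferedLine, l zerolog.Level, p []byte) {
	*lines = append(*lines, bufferedLine{level: l, line: append([]byte(nil), p...)})
}

// releaseLines resets the slice to length 0 (keeping its capacity) and puts
// it back into the pool.
func releaseLines(lines *[]bufferedLine) {
	*lines = (*lines)[:0]
	linePool.Put(lines)
}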


mitar commented Oct 15, 2023

Should calls to the Write method also be buffered until the writer is triggered, given that you embedded the destination writer?

We could for good measure, but Write should never be called by the logger when WriteLevel is available.

I realized this makes no sense. NoLevel is higher than almost all other levels, so Write would always trigger the writer. I think it is better if Write simply bypasses the whole logic: it always writes through and never triggers by itself.
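
For reference, zerolog's numeric level ordering is what makes that interpretation problematic; a quick check (assuming Write calls are treated as NoLevel):

package main

import (
	"fmt"

	"github.com/rs/zerolog"
)

func main() {
	// ErrorLevel is 3 and NoLevel is 6 in zerolog's ordering, so a level-less
	// Write treated as NoLevel would always reach a typical TriggerLevel.
	fmt.Println(int(zerolog.ErrorLevel), int(zerolog.NoLevel)) // 3 6
}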


mitar commented Oct 15, 2023

I just made a PR with what we discussed above: #602

rs closed this as completed in #602 on Nov 28, 2023