Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

GetContextTags question/improvement #51

Closed
georgysavva opened this issue Sep 18, 2020 · 13 comments
Closed

GetContextTags question/improvement #51

georgysavva opened this issue Sep 18, 2020 · 13 comments
Labels
question Further information is requested

Comments

@georgysavva
Copy link

Hi. I have a few questions about the usage of GetContextTags function.

For example, I have a package that has an important set of tags during operation processing and in case of an error, I want to attach those tags to the returned error so the caller could reveal them. It's simple, I do it like that:

return errors.WithContextTags(err, ctx)

And later up to the stack the caller decides to log the error in a structured way, it needs to extract the list of all tags from the error:

tags := errors.GetContextTags(err)

But it returns a list of Buffers (Buffer for each error with tags in the chain) and each buffer contains a list of tags.
So it's [][]Tag, instead of []Tag. Wouldn't it be better to add a function that will return merged buffers with a single list of unique tags, isn't it what the caller interested the most?

Another thing that I wanted to ask about is how context tags work with the CombineError function. As far as I can see if I combine two errors err1 and err2 (both of them have context tags):

return errors.CombineError(err1, err2)

The resulting error will contain tags only from the err1 when I call GetContextTags on it. But for example, if the error gets printed, all stack traces will be displayed for the second error two. Shouldn't GetContextTags visit all child errors (including secondaries) and follow the stack trace printing logic?

@knz
Copy link
Contributor

knz commented Sep 19, 2020

Hi Georgy,

first thank you for your interest and your many questions.

Hi. I have a few questions about the usage of GetContextTags function. [...]

tags := errors.GetContextTags(err)

But it returns a list of Buffers (Buffer for each error with tags in the chain) and each buffer contains a list of tags.
So it's [][]Tag, instead of []Tag. Wouldn't it be better to add a function that will return merged buffers with a single list of unique tags, isn't it what the caller interested the most?

Well it depends, if there were multiple contexts of interest across multiple servers (error flowing through microservices), the caller may be interested to see the grouping of tags to determine how many servers were involved.

That said it is easy and quick to get the merged list of tags: a Buffer has a Merge() method to group all the tags from another buffer. you can do that to get a combined view.

Another thing that I wanted to ask about is how context tags work with the CombineError function.

CombineError is based off WithSecondaryError().

Here what you need to know is this:

  • the Go general rule about error objects: they are singly linked lists with a single cause (via Unwrap()). All the algorithms that recurse on an error structure are linked-list algorithms, by design. The Go designers did not want to create undue complexity by allowing arbitrary error data structures.

  • the cockroachdb/errors library implements WithSecondaryError in a way compatible with the general rule: the secondary errors are mere annotations which have an informative purpose (for human eyes), but no structural purpose (i.e. they do not, intendedly, participate in recursive computations).

But for example, if the error gets printed, all stack traces will be displayed for the second error two. Shouldn't GetContextTags visit all child errors (including secondaries) and follow the stack trace printing logic?

They get printed for your human eyes -- there is no intent to make them structurally visible to algorithms.

The general problem here is that all Go code other than this library expects errors to be structured as linked lists and will never know how to look "sideways". Moreover, it's not possible to "insert" errors in this list due to how error types are implemented.
If we were to build some logic in cockroachdb/errors to do something different, the library would become incompatible with other Go code that assumes the standard Go library semantics.

Hope this helps!

Let me know if you have further questions or comments.

@knz knz added the question Further information is requested label Sep 19, 2020
@georgysavva
Copy link
Author

Thanks a lot for all the details. The CombineErrors logic is clear for me now.

But I still have some uncertainty about the GetContextTags function. I couldn't find any usage of it in the main cockroachdb repository. Is the use case that I described viable?
When low-level layers have important context tags and don't know how to handle an error and just attach those tags to the error. And later up to the stack the high-level layer decides to handle the error by logging, so it extracts the context tags via GetContextTags and uses them for structured logging?

Thanks in advance!

@knz
Copy link
Contributor

knz commented Sep 21, 2020

Your use case seems reasonable to me. But in that case, I would log the various groups of tags separately (i.e. I would make the logging format aware that errors are structured in layers, and report the groups of tags in the layers where they belong).

For example, CockroachDB uses semi-strructured logging and crash reporting: errors are reported verbosely, and the %+v formatting logic shows the context tags for different error layers on different lines.

Is there something like this that could apply to your use case?

@georgysavva
Copy link
Author

Let's consider this code:

package main

import (
	"context"
	"fmt"
	"log"

	"github.com/cockroachdb/errors"
	"github.com/cockroachdb/logtags"
)

func layer1(ctx context.Context) error {
	ctx = logtags.AddTag(ctx, "layer1_tag", "1")
	err := errors.New("root error")
	return errors.WithContextTags(err, ctx)
}

func layer2(ctx context.Context) error {
	ctx = logtags.AddTag(ctx, "layer2_tag", "2")
	err := layer1(ctx)
	err = errors.Wrap(err, "layer1 failed")
	err = errors.WithContextTags(err, ctx)
	return err
}

func main() {
	ctx := context.Background()
	err := layer2(ctx)
	tags := errors.GetContextTags(err)
	var logStmt string
	for _, t := range tags {
		logStmt += fmt.Sprintf("tags: %s\n", t)
	}
	logStmt += fmt.Sprintf("error: %s\n", err)
	log.Println(logStmt)
}

So you mean that I separate/group tags in my log statement by layer and log everything at once with the actual error message. The result is something like that:

2020/09/21 19:46:31 tags: layer2_tag=2
tags: layer2_tag=2,layer1_tag=1
error: layer1 failed: root error

But how can I achieve that with structured logging for example with library like logrus. Log event is just a flat list of key-value pairs with one special key "msg". In order to separate tags I would need to add a prefix like group1. or layer1. to each key and it would complicate the search of those logs.

@knz
Copy link
Contributor

knz commented Sep 21, 2020

yes something like this indeed:

tags := errors.GetContextTags(err)
logStmt := "tags: "
semi := ""
for _, b := range tagBuffers {
	logStmt += semi + b.String()
    semi = "; "
}

But how can I achieve that with structured logging for example with library like logrus. [...] In order to separate tags I would need to add a prefix like group1. or layer1. to each key

Yes, I was thinking about something like that too.

and it would complicate the search of those logs.

Can you explain why on this point specifically?

@georgysavva
Copy link
Author

Can you explain why on this point specifically?

Sure. For example, layer1 has a tag user_id and it logs some information during the processing with this tag. I can easily find those logs via searching by user_id tag. Also when an error occurs and gets logged in the high-level layer the tag receives a prefix: layer1.user_id and it could even be layer2.user_id depending on the particular error chain.
So I won't be able to see this error in logs if I filter by user_id tag as I would see the information logs that are logged from the layer1.

@knz
Copy link
Contributor

knz commented Sep 21, 2020

ok I have two sets of questions

  • about the behavior with separate tags:

    • does the search-by-tag feature restrict your search to exact match?
    • also, if it does prefix match you can do user_id.1, user_id.2 etc ,would this help?
  • about the behavior if you're restricted above:

    • can tag names have duplicates? Is the ordering respected? For example I could imagine pseudo-tags like this: user_id=foo; layer_1_ends_here; user_id=bar
    • if you really have just a tag pool and no ordering, then I start to see the complexity of your predicament. But then the Buffer.Merge() method would help. Yet, please enlighten me: if there are conflicting tags at different layers, what is the behavior you'd like to see in the end?

In this last point, let me illustrate with a case we have from CockroachDB: queries get processed on multiple nodes. So we have n=1, n=2, n=3 on different contexts and an error can contain all three (for different layers). In that case, what would be a useful "combined values"?

In our case, the Merge operator is not so useful because it replaces conflicting tags, so in this case (for example) we'd get just n=3 in the output.

I could perhaps imagine a custom combine logic that says that "for the node ID tag, we need to concatenate values with a separator, not just replace them" so as to get e.g. n=1+2+3 in the final result.

But then -- do you foresee a way to "do the right thing" for every tag value, in a generic way that can be included in the errors library?

@georgysavva
Copy link
Author

does the search-by-tag feature restrict your search to exact match?

No. You should get the list of all logs satisfying the filter

also, if it does prefix match you can do user_id.1, user_id.2 etc ,would this help?

Actually, I am surprised that ELK supports that type of filter: when you specify the key prefix, not the value prefix:

"user_id*" = "my-user-id"

I don't think ordering is respected. It's really like a tag pool/map. And I don't foresee tags with the same name but different values i.e conflicting tags. Actually, I am starting to understand your case with n=1, n=2, n=3. Because cockroachDB is a statufll system it has that thing as AmbientContext that has its own tags i.e context for layer instances/objects. And an error can accumulate conflicting tags from those AmbientContext. But in my case, It's a set of stateless services and I don't expect anything like AmibentContext and hence, conflicting tags. Or I missing something?

I could perhaps imagine a custom combine logic that says that "for the node ID tag, we need to concatenate values with a separator, not just replace them" so as to get e.g. n=1+2+3 in the final result.

I thought about that, when you preserve the tag name and just concatente the values, but in this case you can't understand how tags are grouped per context. So going with suffixes n.1=1, n.2=2, n.3=3might be better.

But then -- do you foresee a way to "do the right thing" for every tag value, in a generic way that can be included in the errors library?

Now I understand that its harder to do. The only thing that is quite general is to introduce a helper function on top of GetContextTags that will return merged tags something like GetContextTagsMerged and explain in the comment that it will replace conflicting tags. Because I believe that will satisfy some percent of the users.

@georgysavva
Copy link
Author

Hi. Any update on this?

@knz
Copy link
Contributor

knz commented Oct 13, 2020

I'm not sure - what update did you have in mind?
I did not see a clear "next action" in the conversation so far.

@georgysavva
Copy link
Author

Sorry I might misuse the word. There is no next action. It was more like a ping because I thought that my last comment #51 (comment) didn't reach you. It contains some hypothetical questions/ideas. If it did reach you and you don't have any further thoughts to share I am totally okay to close the issue because you already covered all my initial concerns.
Thanks a lot!

@knz
Copy link
Contributor

knz commented Oct 14, 2020

I agree with everything you wrote, so I didn't feel there was anything to answer (except maybe to say I agree).

If you have further questions I'll be happy to help further.

@georgysavva
Copy link
Author

Understood.
No further questions.
Thanks a lot for all the answers and your time. Closing the issue!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
question Further information is requested
Projects
None yet
Development

No branches or pull requests

2 participants