Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

populating request uuid from context in log util function #429

Closed
wants to merge 2 commits into from

Conversation

cl1337
Copy link
Contributor

@cl1337 cl1337 commented Aug 10, 2018

we need some container during the life-cycle of a request (tchannel server / tchannel client / http server / http client) within zanzibar gateway, to hold the background context data we care, like requestUUID, since the log might happen anytime, and we probably want to propagate such context data across the boundary of different modules and services, there're a couple of options here:

1: sub-logger spawn on each request, which we proved with benchmark that it only works better if we have very intense logging: #425, with our current requirement it's not worth the memory signature

2: bind context to a request object, so that we can access it every time we need to log, binding context rationale:

  • there's only one context per request
  • context mutation is done in request initialization, it's mostly read only during the life cycle of the request
  • it seems we're already referencing context and pass it around as param multiple places in request

questions: what's the con of binding context to request?

3: use request header as container:
pros:

  • it's already there, it will be a uniq place we find all context
  • header can cross RPC boundary (pass to another service)

cons:

  • current endpoint / client req / res have its own header schema and we mutate headers heavily in endpoint / workflow for business use cases, header as the container to carry log context may lead to nondeterministic logging behavior and logic couple

open to discussions ^

@cl1337 cl1337 changed the title WIP populating request context in log populating request uuid from context in log util function Aug 10, 2018
@ChuntaoLu
Copy link
Contributor

2: bind context to a request object, so that we can access it every time we need to log

I think adding context to the request struct would be okay if we can avoid context ambiguity and make sure the call graph is well represented by the contexts. Adding the context to the request struct should imply that methods defined on the struct use that context, but what if that method is called by a function that uses a different context? Which context should be used in this scenario? We don't want that to happen, but that might not be at our control.

We will nevertheless keep the relevant function/method signatures as is, i.e. context is passed in as the first parameter. One reason is to keep interfaces consistent across packages and enable static analysis tools to check context propagation. The other reason is that there are scopes where the abstraction of a request becomes less relevant, and there might not be a way to access the request. But from operation point of view, we still want to preserve the meta information relevant to the request. This is what makes it more viable to pass them around as arguments rather than to access them off a method call receiver.

The documentation on context explicitly points out:

Do not store Contexts inside a struct type; instead, pass a Context explicitly to each function that needs it.

It is somewhat overly restrictive, but probably for a good reason.

I think you have focused on context as a container, and from that perspective attaching it to the request does make sense. But the side effect it introduces could be detrimental to the more important aspect of the contexts, which is call graph management.

@cl1337
Copy link
Contributor Author

cl1337 commented Aug 11, 2018

The documentation on context explicitly points out:

I understand document said no, but I'm curious why? :)

Other than consistent or interface convention, the only possible go-wrong I can think of is: when context is used in blocking call, its control function like cancel or timeout is only meaningful in a blocking call. Binding it inside a struct may have some wired behavior that I'm not aware of.

Which is why I think headers may serve a better purpose as a container, it's generic way of doing this in our services with different language, prevents unnecessary context toss around just because we need to access request meta, and as we discussed before, it cross service boundary

the best choice if performance allows would be request logger, which is more of a request attribute other than a passed around object.

I just don't have confidence that we can restrict and manage request headers for headers like routing key or identify id

@ChuntaoLu
Copy link
Contributor

I understand document said no, but I'm curious why? :)

Well, the point of my last comment is to answer why.

I think headers may serve a better purpose as a container, it's generic way of doing this in our services with different language

Headers are only accessible when you have a reference to the request, in scopes where request is not accessible, the meta info still needs to be passed in.

prevents unnecessary context toss

context is already passed when scopes change, I don't see where extra contexts need to added for the sake of request meta, do you have examples?

it cross service boundary

Yes, for http, header is a good use to convey meta info across service boundaries, but it may not apply to other protocols.

I just don't have confidence that we can restrict and manage request headers for headers like routing key or identify id

What do you mean by "restrict and manage"?

@cl1337
Copy link
Contributor Author

cl1337 commented Aug 13, 2018

I was trying to solve a problem without having to generalize in zanzibar for now: a http request hit zanzibar generated service, attach user defined meta info along with this http request, including user uuid, user token, generated request uuid, mobile meta ... and such. Propagate such meta in every log entry.

Due to timeline I'll have to address this outside zanzibar, let's iterate on this in another thread. The discussion here still remains valid: logger should be collecting meta from either context or headers rather than spawn sub-logger per request

Headers are only accessible when you have a reference to the request, in scopes where request is not accessible, the meta info still needs to be passed in.

context is already passed when scopes change, I don't see where extra contexts need to added for the sake of request meta, do you have examples?

here is an example where header presents for logging while context is not

Yes, for http, header is a good use to convey meta info across service boundaries, but it may not apply to other protocols.

I will not address other protocols for current problem

What do you mean by "restrict and manage"?

header value could be overridden, user could mutate requestUUID since it's in headers, and define own identify key like user uuid or token uuid. If I want to prevent this (not every request have user uuid, but all request should have request uuid), I may want to restrict user from define certain headers ... etc

@cl1337 cl1337 closed this Aug 13, 2018
@cl1337 cl1337 deleted the cli/add-req-uuid branch August 13, 2018 21:12
@ChuntaoLu
Copy link
Contributor

here is an example where header presents for logging while context is not

context and headers are not mutually exclusive, in your example the request is accessible so you already have all the meta info needed for logging, and the function is a simple helper, there is no need to have context here. It is not like we have to have either header or context.

@cl1337
Copy link
Contributor Author

cl1337 commented Aug 13, 2018

context and headers are not mutually exclusive

I agree with this in principle, for the problem this PR was to address: if we choose one as container, theoretically we can collect meta from the other, but not necessary(split meta info in 2 types of containers)

My originally point was having every function may need to log inside takes param context seems a bit unfeasible but it's a pretty trivial from the limited scope of this problem

let's defer this to later use cases / problem for justification

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants