Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

What to do with model.Span.ParentSpanID #824

Closed
yurishkuro opened this issue May 16, 2018 · 10 comments
Closed

What to do with model.Span.ParentSpanID #824

yurishkuro opened this issue May 16, 2018 · 10 comments
Assignees

Comments

@yurishkuro
Copy link
Member

yurishkuro commented May 16, 2018

As part of #773 I am trying to replace the domain model with gogoprotobuf-generated classes. It's been going well so far, the last stumbling block is the ParentSpanID field that currently does not exist in protobuf model. The reasoning for not including that field in the new model was to simplify the handling of span references, as we have seen bugs and confusion stemming from the duality of representation where parentSpanID is equivalent to a child-of span reference, while the reference itself may or may not be present in the references array (and in some cases duplicate references are present due to the same confusion/bugs).

There is also one use case that the current model does not capture. When a span is created with a list of predecessors (span references), all Jaeger clients pick one of those predecessors as the "primary parent" from which the trace ID and flags are inherited. In practice we don't have much instrumentation today that would even pass multiple references.

Option 1 - remove ParentSpanID, use References field only

This is the current implementation in the gogoproto-model branch.

Pros:

  • unified handling of predecessors

Cons:

  • requires memory allocation for the slice in 99% of spans
  • does not capture the primary parent from which trace ID is inherited

Option 2 - replace ParentSpanID with ParentSpanRef

Pros:

  • no extra memory allocations for 99% of spans
  • captures the primary parent

Cons:

  • bifurcates processing of primary parent from other predecessors

Comments:

  • this option is similar to how OpenCensus separates "parent span id" from "links". In Census links are are directional and can be added to both sides of the edge.
  • the reason to store complete SpanRef instead of just parent span ID is to capture type of reference

@jaegertracing/jaeger-maintainers

@objectiser
Copy link
Contributor

I'm currently leaning towards option 1 - primarily because the idea of a parent implies a hierarchical structure, which although true in most current cases, may not deal with other interesting situations (e.g. joins).

One approach to determining the trace id could be:

  • if all references belong to the same trace id, then the span should also be associated with that trace id
  • otherwise it is a join across multiple trace instances, so should result in a new trace id

@yurishkuro
Copy link
Member Author

yurishkuro commented May 16, 2018

hierarchical structure, which although true in most current cases, may not deal with other interesting situations

is it worth optimizing for theoretically interesting cases, which are unlikely to emerge any time soon? In fact, even those cases are much more likely to involve joins across different traces, like a buffered write, rather than joins within a single trace. The only join case within the same trace I can think of is when we want to explicitly parallelize a computation. However, that does not require an actual join, I would instead model it all with chlid-of references like this:

image

@isaachier
Copy link
Contributor

Seeing as the ParentSpanRef can be null, isn't it an allocated pointer regardless?

@yurishkuro
Copy link
Member Author

We would use zero value instead of making it nullable, same as the existing ParentSpanID.

@isaachier
Copy link
Contributor

isaachier commented May 16, 2018

OK cool. If you cared about allocation, then don't use Go ;). In all seriousness, the allocation isn't a huge concern to me nor IMO to most Go users. It is a slight optimization but the object is small enough to be negligible. Furthermore, the Go compiler might be able to optimize it out, and if not, you can always use a sync.Pool to store them.

@isaachier
Copy link
Contributor

If I had to choose one, it would be option 1 to move logic out of the clients. The parent span selection is convoluted and it is likely for mistakes to creep into a single language's client without us catching it. I'd rather this be done on the backend.

@objectiser
Copy link
Contributor

The other reason for preferring option 1 is that the OT spec doesn't include the concept of a primary reference - and therefore applications written in a provider agnostic way could supply references in any order.

@isaachier
Copy link
Contributor

I was playing around with the new Protobuf model and realized that porting it to the clients would not make clients simpler. Although they would not need to maintain an explicit parent reference in the span context, the clients still must determine the parent for flags and baggage. Am I correct here or missing something?

@yurishkuro
Copy link
Member Author

yes, clients still need to determine the parent given multiple references, but they don't need to figure out the rules for encoding the parent ID vs. parent reference in the reported span. We've had bugs due to this logic that were causing the same parent reference to be repeated twice.

@isaachier
Copy link
Contributor

OK I understand that. Thanks for the explanation.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

3 participants