Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add typed spans #502

Closed
wants to merge 6 commits into from
Closed

Conversation

thisthat
Copy link
Member

This draft proposes two API variants for typed spans addressing the issue open-telemetry/opentelemetry-java#778. It uses the HTTP and HTTP Server semantic conventions as an exemplar for how the APIs look like.
The first variant is in the flat package, whereas the second is in the hierarchy package.
Both variants provide facility methods to set relevant attributes of a given Semantic Convention. Typed spans delegate to the consumer of the API the knowledge of the source values for these attributes.
Furthermore, warnings are logged if important attributes of the Semantic Convention were not provided before terminating the span.

@iNikem
Copy link
Contributor

iNikem commented Jun 12, 2020

Thank you @thisthat for this! :) I will take a closer look and play with this during weekend.

@thisthat
Copy link
Member Author

I ran some benchmark for both variants using the SpanPipelineBenchmark as a template. On my laptop, they all perform the same.
benchmark

@iNikem
Copy link
Contributor

iNikem commented Jun 13, 2020

@thisthat How will these spans work with Context? Specifically, is it possible to get a concrete subclass of the span from the current Context in type-safe manner?

@thisthat
Copy link
Member Author

@iNikem thank you for having a look :) Typed Spans are not supposed to directly interact with the context. The concrete Span can be retrieved using the getSpan() method. I just realized that I forgot to add such a method in the "hierarchy" variant!

@iNikem
Copy link
Contributor

iNikem commented Jun 13, 2020

Will that mean that all "type" information will be lost as soon as typed span is put into context? And we cannot "convert" regular span into typed span. Will play more with this tomorrow...

@iNikem
Copy link
Contributor

iNikem commented Jun 15, 2020

Some notes and assorted thoughts.

I don't understand the hierarchy variant :) Why those BasicXXX classes are needed?

My biggest question/concern atm is why these spans does not form actual hierarchy: HttpServerSpan extends HttpSpan extends Span? The same for builders. Currently in order to set span's parent or an arbitrary attribute one has to call getSpanBuilder which completely breaks fluent interface.

Is it expected that not all attributes have setters? E.g. flat.HttpServerSpan does not have setNetPeerXXX methods.

I like the idea of checking for mandatory attributes :) But have you thought about some way to make it easier for clients to understand, what attributes are mandatory? Before they try to create a span and see warnings in logs.

Have you thought how these new classes will play with io.opentelemetry.trace.attributes.SemanticAttributes?

Other nitty-gritty comments/suggestions can be addressed later, I think.

@thisthat
Copy link
Member Author

thisthat commented Jun 16, 2020

@iNikem thank you for the valuable feedback!

I don't understand the hierarchy variant :) Why those BasicXXX classes are needed?

BasicXXX classes are used to generify over Span and Builder types, whereas the XXXSpan use concrete types for these. I could not find another way to implement this. If you have a better idea, I am open to suggestions :)

My biggest question/concern atm is why these spans does not form actual hierarchy: HttpServerSpan extends HttpSpan extends Span? The same for builders. Currently in order to set span's parent or an arbitrary attribute one has to call getSpanBuilder which completely breaks fluent interface.

Typed Spans do not form a hierarchy but encapsulate the logic of a Span to facilitate a given Semantic Convention. Implementing the Span interface would not produce any benefit than having yet another RecordEventsReadableSpan class with some extra methods. W.r.t. the parent, you are right! I did not think about this! There are instances in which the automatic one from the context is not the right one to use. To solve this, we can add another method setParent() that makes it fluent.

Is it expected that not all attributes have setters? E.g. flat.HttpServerSpan does not have setNetPeerXXX methods.

Yes! Since the HTTP Server Semantic Convention requests only net.host.[name|port], I only added these setters. Extra fields that do not strictly belong to the semantic convention can be setted using the (non fluent) getSpan()/getSpanBuilder() methods.

I like the idea of checking for mandatory attributes :) But have you thought about some way to make it easier for clients to understand, what attributes are mandatory? Before they try to create a span and see warnings in logs.

Yes. A class-level Javadoc should suffice for this purpose. However, in a utopian world, all attributes should be set ;)

Have you thought how these new classes will play with io.opentelemetry.trace.attributes.SemanticAttributes?

Yes. Typed Spans and SemanticAttributes are next to each other. Using typed spans, we can achieve the same level of type safety for attributes. I think using SemanticAttributes inside Typed Spans is not really useful since it simply encapsulates the span.setAttribute( [attribute-name], [type]) call.

@iNikem
Copy link
Contributor

iNikem commented Jun 16, 2020

BasicXXX classes are used to generify over Span and Builder types, whereas the XXXSpan use concrete types for these. I could not find another way to implement this. If you have a better idea, I am open to suggestions :)

Sorry, still don't understand. If HttpServerSpan will ever be the only subclass of BasicHttpServerSpan then why do we need BasicHttpServerSpan?

@iNikem
Copy link
Contributor

iNikem commented Jun 16, 2020

Yes! Since the HTTP Server Semantic Convention requests only net.host.[name|port], I only added these setters. Extra fields that do not strictly belong to the semantic convention can be setted using the (non fluent) getSpan()/getSpanBuilder() methods.

But they are required by https://github.com/open-telemetry/opentelemetry-specification/blob/master/specification/trace/semantic_conventions/span-general.md, aren't they?

@iNikem
Copy link
Contributor

iNikem commented Jun 16, 2020

Typed Spans do not form a hierarchy but encapsulate the logic of a Span to facilitate a given Semantic Convention. Implementing the Span interface would not produce any benefit than having yet another RecordEventsReadableSpan class with some extra methods. W.r.t. the parent, you are right! I did not think about this! There are instances in which the automatic one from the context is not the right one to use. To solve this, we can add another method setParent() that makes it fluent.

Isn't the whole idea of semantic spans: to have "some extra methods"? :)

Implementing Span and Span.Builder interfaces will allow e.g. calling setAttribute and setParent directly, without breaking the fluency. It will also allow storing HttpServerSpan into context directly. And will allow arbitrary code to use SemanticAttributes with semantic spans.

@thisthat
Copy link
Member Author

Isn't the whole idea of semantic spans: to have "some extra methods"? :)

I would argue for the contrary. Typed Spans are used to reduce the API surface to only those methods that contribute to a specific semantic convention.

Implementing Span and Span.Builder interfaces will allow e.g. calling setAttribute and setParent directly, without breaking the fluency. It will also allow storing HttpServerSpan into context directly. And will allow arbitrary code to use SemanticAttributes with semantic spans.

Currently, the setAttribute() methods must return void which will break the fluency of the API. I would argue that since the current API is not fluent, there is no much difference between typedSpan.setAttribute() and typedSpan.getSpan().setAttribute(). The same holds for SemanticAttributes.

I don't see the benefit of storing directly a typed span into the context but, I am absolutely not an expert on this topic. Could you please provide an example where this is useful to help me understand better?

@thisthat
Copy link
Member Author

BasicXXX classes are used to generify over Span and Builder types, whereas the XXXSpan use concrete types for these. I could not find another way to implement this. If you have a better idea, I am open to suggestions :)

Sorry, still don't understand. If HttpServerSpan will ever be the only subclass of BasicHttpServerSpan then why do we need BasicHttpServerSpan?

We want to generalize the setters' return type for both, Span and Builder. Since we have the possibility of "extends", we need a hierarchy of classes that has no concrete type to allow at any point in the type chain to return the general type.
HttpServerSpan and HttpSpan, instead, are used to concretize the type so the type chain as a concrete type for the setters' return type, if it makes sense.

I was not able to do this in a single class. But maybe I cannot see the easier solution!

@thisthat
Copy link
Member Author

Yes! Since the HTTP Server Semantic Convention requests only net.host.[name|port], I only added these setters. Extra fields that do not strictly belong to the semantic convention can be setted using the (non fluent) getSpan()/getSpanBuilder() methods.

But they are required by open-telemetry/opentelemetry-specification:specification/trace/semantic_conventions/span-general.md@master, aren't they?

There is no hard requirement in this document. It says only it 'may be used for any network related operation'. Or am I reading wrongly the specification?

@iNikem
Copy link
Contributor

iNikem commented Jun 16, 2020

I would argue for the contrary. Typed Spans are used to reduce the API surface to only those methods that contribute to a specific semantic convention.

So here is the main difference in our opinions :) For me, HttpServerSpan is-a Span, it just provides higher-level, more convenient API for the specific kind of a span. It makes certain things simpler, but does not make already existing things harder. Requiring one extra call to getSpan() to set arbitrary attributes is "making already existing things harder" IMO.

By the same reasoning, one can expect that HttpServerSpan represents a "network related operation" and thus has higher level API for setting net.* attributes. Because in most cases users WILL do set them for server spans. We already know that, why don't we make their life simpler here? Besides, do you plan to provide separate class for net.* related conventions? How will I be able to set both on the same span then? And if not, why provide classes for some semantic conventions but not for others?

I am probably going too esoteric here, but your proposal essentially means that HttpServerSpan is a projection of a span. For me, it should be a specification.

I don't see the benefit of storing directly a typed span into the context but, I am absolutely not an expert on this topic. Could you please provide an example where this is useful to help me understand better?

Main benefit for me is the ability to take a span from the context and be able to downcast it to a specific semantic type. E.g. take a look at this class. I wait for semantic spans to return HttpServerSpan from its startSpan method. Then it will be totally sensible to change end method as well to receive HttpServerSpan. With current approach this is not possible, because the parameter of end method may come from the context, see #507 .

@thisthat
Copy link
Member Author

@iNikem your last comment was really informative. I now see your point of view. I have added a discussion point into the Java Instrumentation agenda so we can have a discussion this Thursday :)

@iNikem
Copy link
Contributor

iNikem commented Jun 23, 2020

That looks nice! :) Please allow me a couple of days to play with this and to migrate some existing instrumentations to it. Then I can provide meaningful feedback.

@thisthat
Copy link
Member Author

Sure @iNikem! thanks for doing that :)

@iNikem
Copy link
Contributor

iNikem commented Jun 24, 2020

I like this much more than the first version :) Semantic convention interfaces are very descriptive. Much more than original spans with all their implementations 👍

Some small remarks:

  • I think this code should life in auto-bootstrap module, because all instrumented classes should access them, no?
  • HttpServerSpan will implement NetPeerSemanticConventions when they will be ready, right?
  • Have you tried to add methods like setUrl(http.scheme, http.host, http.target)?
  • HttpServerSpanBuilder does not extends Span.Builder :( And does not have setParent method.

I will definitely need more time with them, but I think they are almost ready to be merged and be battle-tested :)

@anuraaga
Copy link
Contributor

anuraaga commented Jun 25, 2020

As a high level feedback, is my understanding correct that the main result of this change is to translate the relationships defined semantic conventions from the spec repo into code? That seems like a nice idea :) I'm skimming through open-telemetry/opentelemetry-java#778 and there doesn't seem to have been a conclusion, but +several to moving this into opentelemetry-java when ready, this is about defining requirements for using the SDK / API and not really an issue of instrumentation.

I also noticed that the linked issue mentions typed tracers - is that similar to Brave's parser approach?

https://github.com/openzipkin/brave/blob/master/instrumentation/http/src/main/java/brave/http/HttpClientParser.java

I think these parsers are similar to the decorators in this repo, and I find it to be a nice pattern since wrapping library objects with an adapter into an instrumentation object generally works pretty well. I added userAgent here

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/master/agent-bootstrap/src/main/java/io/opentelemetry/auto/bootstrap/instrumentation/decorator/HttpClientDecorator.java#L42

and I'm guessing long term, we would expect HttpClientDecorator to have the same methods defined as HttpClientSpan, though as getters instead of setters?

@iNikem
Copy link
Contributor

iNikem commented Jun 25, 2020

Yeah, the main goal is to codify semantic conventions. Currently chosen path is to develop and stabilise them in instrumentation repo and then, after they are stable, move them (or part of) into main java repo.

We already have 2 examples of "typed/semantic" tracers: HttpServerTracer and DatabaseClientTracer. Their idea is to provide higher level API to instrumentation writers. They certainly will NOT have the same API as corresponding spans.

@anuraaga
Copy link
Contributor

anuraaga commented Jun 25, 2020

The typed tracers seem similar to the decorators at first glance, haven't dug enough to find the difference. But it does seem to have a similar concept of exposing getters for semantic conventions

https://github.com/open-telemetry/opentelemetry-java-instrumentation/blob/master/agent-bootstrap/src/main/java/io/opentelemetry/auto/bootstrap/instrumentation/decorator/HttpServerTracer.java#L216

Isn't the goal for this to be the complete set of semantic conventions, so would have a very similar API to the typed spans, though as getters instead of setters?

@iNikem
Copy link
Contributor

iNikem commented Jun 25, 2020

You referenced method is protected, not public. It is meant to be implemented by subclasses, which know how to get peer port from the specific library.

Decorator and tracers are indeed very similar. Main difference, et least for me, is design goal. Tracers are meant to have as small public API as possible. Ideally just 2 methods: startSpan and end. They take care of fulfilling semantic conventions, leaving their clients (actual instrumentations) to care only about "lifecycle".

@anuraaga
Copy link
Contributor

I think we're talking about the same thing :) By API, I just meant that the instrumentation will have to implement the getters, but indeed they should only be calling those two methods. Brave works similarly, may be worth a look if there are any pointers.

@thisthat
Copy link
Member Author

  • I think this code should life in auto-bootstrap module, because all instrumented classes should access them, no?

I am not familiar with this repo, I trust your opinion for the package :)

  • HttpServerSpan will implement NetPeerSemanticConventions when they will be ready, right?

Yes, exactly.

  • Have you tried to add methods like setUrl(http.scheme, http.host, http.target)?

That's an option that I did not consider yet. The design decision was to provide a straightforward API to set semantic attributes. However, handling special attributes with more fine graned methods sounds a valid approach!

  • HttpServerSpanBuilder does not extends Span.Builder :( And does not have setParent method.

Thanks for noticing it :) I will add the method!

@trask
Copy link
Member

trask commented Jul 1, 2020

  • I think this code should life in auto-bootstrap module, because all instrumented classes should access them, no?

I am not familiar with this repo, I trust your opinion for the package :)

Yeah, auto-bootstrap makes sense for this initial PR. The next step will be to pull them out into their own module, and include/shade them in the auto-bootstrap module (we can help with that). And then the next step after that will be promoting the module to otel-java repo 👍.

@iNikem
Copy link
Contributor

iNikem commented Jul 6, 2020

@thisthat have you had a chance to address latest comments?

@thisthat
Copy link
Member Author

thisthat commented Jul 6, 2020

@iNikem sorry to let you wait. Unfortunately, I am quite busy rn. I will have some free cycles next week so expect something for that timeframe 😉

@iNikem
Copy link
Contributor

iNikem commented Jul 13, 2020

@thisthat does your time estimate still hold? :) Can we help you somehow?

@thisthat
Copy link
Member Author

@iNikem I started addressing your feedback and adding all remaining semantic conventions ;) Expect something for tomorrow/Wednesday!

public DbCassandraSemanticConvention setDbUser(String dbUser);

/**
* Sets a value for db.mssql.instance_name
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If this is Cassandra convention, it cannot have db.mssql* or db.jdbc.driver_classname

@iNikem
Copy link
Contributor

iNikem commented Jul 15, 2020

Thank you, @thisthat :) This is going to rock. Semantic convention codified in Spans will make both auto and manual instrumentation much pleasant. And their generation will make syncing with convention changes much easier. I am sure, when we start actually using them a lot more suggestions will come, but this is very solid starting point.

How far are you from making generator itself public? And how do you scrape conventions now?

Even if your generator is not yet ready, I suggest bringing this PR up to date with the current Java API and merging it. We can start using it and discovering what modifications we need to introduce. We can fix them manually and feed this info back to generator.

@thisthat @trask @anuraaga what do you think?
@tylerbenson @jkwatson Had you have a chance to look at this? Do you any opinion?

@thisthat
Copy link
Member Author

How far are you from making generator itself public? And how do you scrape conventions now?

I am pretty close to release it. I am currently polishing the few remaining details. Expect something in one of the following SIG Spec meeting 😉
At the moment, I have manually created a model for that. You can see an outdated version here open-telemetry/opentelemetry-specification#571. I am using the following Jinja template for generating these classes from the yaml definitions https://gist.github.com/thisthat/7e34742f4a7f1b5df57118f859a19c3b

I suggest bringing this PR up to date with the current Java API

To what API are you referring to?

@iNikem
Copy link
Contributor

iNikem commented Jul 16, 2020

otel-java API. This PR currently does not compile against 0.6.0 version.

@iNikem
Copy link
Contributor

iNikem commented Jul 20, 2020

@thisthat I still don't see that builders extend Span.Builder, e.g. DbCassandraSpanBuilder. Or am I missing something?

@thisthat
Copy link
Member Author

@thisthat I still don't see that builders extend Span.Builder, e.g. DbCassandraSpanBuilder. Or am I missing something?

If we make the xSpanBuilder class implement Span.Builder, we'll have the same issue described here: #502 (comment)

Since typed spans exports all methods for a given semantic convention, I don't think using xAttributeSetter.set is really needed, besides the cases where additional attributes are nice to have but they are not part of a semantic convention. I think having a getter for the Span.Builder is better for such a case. WDYT?

@iNikem
Copy link
Contributor

iNikem commented Jul 21, 2020

Ok, makes sense.

@thisthat
Copy link
Member Author

If there are no more comments, I will mark it as ready for review! :)

@tylerbenson
Copy link
Member

@thisthat is there any way to split this up into smaller PRs? Given the current size, this PR is very difficult to review without getting lost.

@thisthat
Copy link
Member Author

Sure, I can split this draft into smaller PRs. One for each semantic convention group!

@thisthat
Copy link
Member Author

Mind that the current code is automatically generated from a template, see https://gist.github.com/thisthat/7e34742f4a7f1b5df57118f859a19c3b. So the changes are (mainly) only about class, method, and parameter names.

@trask
Copy link
Member

trask commented Jul 21, 2020

hey @thisthat, do we need to check in the generated code?

@thisthat
Copy link
Member Author

@trask currently, the idea is to have this code in the repo. This way, people can start using it and see problems/benefits and adapt the code if we find something that does not work or could be improved :)
Once it has been battle-tested, we could generate it at compile time.

@iNikem
Copy link
Contributor

iNikem commented Jul 22, 2020

@thisthat If you agreed on submitting smaller PRs for independent pieces, then I propose to close PR

@thisthat
Copy link
Member Author

Should I wait for open-telemetry/opentelemetry-specification#653 to be resolved before proposing typed spans in smaller PRs?

@iNikem
Copy link
Contributor

iNikem commented Jul 24, 2020

I don't think it makes sense. Spec issues may take a lot of time. And as you generate code, it will be easy to remove required checks later. So please proceed with PRs :)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants