Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

YAML Model for Semantic Conventions #571

Merged

Conversation

thisthat
Copy link
Member

@thisthat thisthat commented Apr 21, 2020

Ready for merge

YAML model for Semantic Convention

This is a proposal to define semantic conventions in a YAML format that resolves #547.

The different language implementations can use git submodules to reference these definitions, so the following can be automatically generated:

  • Constants for semantic attributes

  • Typed Spans (type-safe span builders with named methods/functions rather than having users type or copy-paste strings from the spec doc for attribute keys)

For the opentelemetry-specification repo, markdown tables will be generated from the YAML files to serve as written documentation.

Markdown Tables

Semantic conventions use Markdown tables to describe the list of attributes that belong to them.

With a common format, all tables for the semantic conventions will have the same appearance and requirements. It is expected that special HTML tags are used in skeleton/template Markdown files to tag where and which table must be generated, as is with markdown-toc that is currently used to generate the TOCs.

Constants for Semantic Attributes

Currently, for each instrumentation, attribute names must be copied manually from the specification, which is an error-prone activity. Using a common YAML format, languages can automatically generate constant keys for Semantic Attributes, e.g. JS, and Java.

Typed Spans

Semantic Attributes are an important aspect of OpenTelemetry because they allow to precisely define the information that provides insights into a specific part of a system. Defining the data format using semantic conventions is the only reliable way to allow back ends to reason about the telemetry data they receive and to analyze them further. To support OpenTelemetry users in properly supplying this data, we can provide additional APIs that aid building spans for specific operations by correctly setting the semantic attributes for a given semantic convention.

For more information, please refer to the prior art in OpenCensus, OTEP25, original specifications, and the Java proposal.

Note that, each implementation will define its own API structure for the Typed Spans to use the common patterns offered by the language.

Description of the model

The fields and their expected values are presented in allowed_syntax.md.

For a basic attribute description, please refer to the General Attributes example.

Here, each attribute is specified in a list providing information for the key, which is computed by joining the semantic convention prefix and the attribute name. Furthermore, the type expected for each attribute is defined. In some cases, attributes are relevant to perform sampling decisions. These attributes can be marked as sampling_relevant, for example in the gRPC and FaaS semantic conventions the attributes service and trigger, respectively. Marking an attribute as sampling_relevant will help the Typed Spans to request such information before the Span is started so it is available via attributes to the Sampler. Furthermore, generated Markdown tables will also have a column to indicate if an attribute is sampling relevant or not.

In the case that an enumeration is expected, the pre-defined enum values are listed and we can additionally set the flag allow_custom_values to true to allow other, custom values to be provided. This will help the Typed Span APIs to properly restrict the allowed attribute values while building the span.

In some semantic conventions, a common structure is shared by specialized semantic conventions for specific cases, e.g. HTTP and gRPC, where a common description is given and then additional attributes are described depending on whether it is a server or client operation. To express this, the extends field is added and set to the id of the semantic convention to extend, e.g gRPC Client.

Another common case is to reference attributes of another semantic convention. For example, the gRPC semantic convention references many Network Attributes. To cross-reference attributes between different semantic conventions the ref field is used. ref also allows overriding the description of the attributes to properly fit it to the current use case. When using ref, the constraints that specify if the referenced attribute is required are not taken over, this always has to be explicitly defined for each convention in which it is used.

An example is the gRPC Client. Using ref will allow the Markdown table generation to create links between the different semantic conventions.

Semantic Conventions might have special constraints, for example in gRPC at least one of net.peer.ip or net.peer.port must be set. This can be expressed in the YAML format using the any_of constraint. An example is the RPC Semantic Convention.

Another constraint is that a semantic convention requires that another semantic convention is, in turn, followed as well. For example, in the FaaS Semantic Convention when a function is used to respond an inbound HTTP request, the HTTP Server Semantic Convention must also be followed. This can be expressed in the YAML format using the include constraint. An example is available here.

Versioning

Since OpenTelemetry has reached the beta status, it is important to introduce the notion of versioning. Whenever we introduce a semantic change, the version should be bumped by one. This way, backends can correctly ingest OpenTelemetry data.
What we consider a semantic change, is out of the scope of this PR and should be discussed in a separate issue.

@bogdandrutu
Copy link
Member

I think this is a very nice PR and well documented, but would like to rise some concerns:

  • In this PR we are adding some new properties to the attributes such as sampling_relevant and allow_custom_values. I would prefer these to be added later as enhancements to the current model.
  • We are talking about Typed Spans which has not been defined yet.

So I would focus for the moment on not adding new functionality.

yaml/allowed_syntax.md Outdated Show resolved Hide resolved
yaml/allowed_syntax.md Outdated Show resolved Hide resolved
yaml/allowed_syntax.md Outdated Show resolved Hide resolved
yaml/allowed_syntax.md Outdated Show resolved Hide resolved
yaml/span_attribute/faas.yaml Outdated Show resolved Hide resolved
@thisthat
Copy link
Member Author

thisthat commented Apr 22, 2020

I think this is a very nice PR and well documented, but would like to rise some concerns:

  • In this PR we are adding some new properties to the attributes such as sampling_relevant and allow_custom_values. I would prefer these to be added later as enhancements to the current model.
  • We are talking about Typed Spans which has not been defined yet.

So I would focus for the moment on not adding new functionality.

I do agree with you @bogdandrutu that sampling_relevant is a new concept that we should not introduce here. However, allow_cutom_values is not necessarily new. I would argue that currently, it is an implicitly concept (see db.type, faas.trigger, http.flavor, messaging.system, etc) that now is explicit and properly documented.

Regarding your second point, this PR proposes the vision of having Typed Spans. I would argue that this is a (side-effect) benefit we can have for free using the proposed YAML model and not a concept that we are introducing with this PR.

@bogdandrutu
Copy link
Member

Regarding your second point, this PR proposes the vision of having Typed Spans. I would argue that this is a (side-effect) benefit we can have for free using the proposed YAML model and not a concept that we are introducing with this PR.

Based on the description it seems like the main reason to do this.

@carlosalberto carlosalberto added the area:semantic-conventions Related to semantic conventions label Jun 12, 2020
@reyang reyang added the release:required-for-ga Must be resolved before GA release, or nice to have before GA label Jul 6, 2020
@andrewhsu andrewhsu added this to Required for GA, needs action. Add label:release:required-for-ga to issues and PRs when moving them to this column. in GA Spec Burndown Jul 7, 2020
@thisthat thisthat requested review from a team as code owners July 27, 2020 10:55
@thisthat
Copy link
Member Author

@open-telemetry/specs-approvers @open-telemetry/specs-trace-approvers please take another look.
I add a small example of how we could model with YAML files network and HTTP semantic conventions. I am using these two semantic conventions to showcase how markdown tables can be automatically generated from such definitions. Similarly, I am proposing typed spans in the java-instrumentation repository that are automatically generated from these definitions (open-telemetry/opentelemetry-java-instrumentation#502).

In this PR, there is also the source code used to generate markdown tables and code. It is not perfect and it requires some more polishing, but we are using it for quite a while internally at Dynatrace. It can be used as a first draft to start avoiding to manually replace tables and code after every semantic convention change (e.g. https://github.com/heyams/opentelemetry-java/blob/master/api/src/main/java/io/opentelemetry/trace/attributes/SemanticAttributes.java).
The idea is to donate this code to the OpenTelemetry project so everyone can use and improve it.

Copy link
Member

@bogdandrutu bogdandrutu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

As discussed during the SIG meeting I created a special repo for build tools:
https://github.com/open-telemetry/build-tools

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 12, 2020
@arminru
Copy link
Member

arminru commented Aug 12, 2020

The semantic convention generator which makes use of the YAML model proposed here is available for review in the new build-tools repo: open-telemetry/build-tools#5

GA Spec Burndown automation moved this from Required for GA, needs action. Add label:release:required-for-ga to issues and PRs when moving them to this column. to Required for GA, in progress Aug 17, 2020
Makefile Outdated

.PHONY: table-generation
table-generation:
docker run --rm -v $(PWD)/yaml:/source -v $(PWD)/specification:/spec thisthatdc/semconvgen -f /source markdown -md /spec
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I will leave this under my personal account for the moment. This way, people can experiment and try the tool. As soon as the tool is published with the opentelemety account, I will update the make file.

Copy link
Member

@SergeyKanzhelev SergeyKanzhelev left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I didn't find any semantic change in this PR. It introduces automation with the promise to iterate over it's design and output. So lgtm as a solid starting point.

GA Spec Burndown automation moved this from Required for GA, in progress to Required for GA, approved Aug 19, 2020
Copy link
Member

@Oberon00 Oberon00 left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, but before merging this, the build needs to become green:

  • Either Semantic convention generator build-tools#5 needs to be merged first, or the GH action needs to be disabled/commented out for now.
  • The docfx check needs to be fixed (the nettransport-attribute in span-general.md anchor is missing):
[20-08-18 11:34:37.018]Warning:[Postprocess.HandlePostProcessorsWithIncremental.HandlePostProcessors.Processing html.ValidateBookmark](specification/trace/semantic_conventions/span-general.md#L12)Invalid link: '[`net.transport` attribute](#nettransport-attribute)'. The file specification/trace/semantic_conventions/span-general.md doesn't contain a bookmark named 'nettransport-attribute'.
[20-08-18 11:34:37.018]Warning:[Postprocess.HandlePostProcessorsWithIncremental.HandlePostProcessors.Processing html.ValidateBookmark](specification/trace/semantic_conventions/rpc.md#L64)Invalid link: '[net.transport][]'. The file specification/trace/semantic_conventions/span-general.md doesn't contain a bookmark named 'nettransport-attribute'.
[20-08-18 11:34:37.018]Warning:[Postprocess.HandlePostProcessorsWithIncremental.HandlePostProcessors.Processing html.ValidateBookmark](specification/trace/semantic_conventions/messaging.md#L126)Invalid link: '[`net.transport`][]'. The file specification/trace/semantic_conventions/span-general.md doesn't contain a bookmark named 'nettransport-attribute'.
[20-08-18 11:34:37.018]Warning:[Postprocess.HandlePostProcessorsWithIncremental.HandlePostProcessors.Processing html.ValidateBookmark](specification/trace/semantic_conventions/database.md#L42)Invalid link: '[`net.transport`][]'. The file specification/trace/semantic_conventions/span-general.md doesn't contain a bookmark named 'nettransport-attribute'.
	4 Warning(s)

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Aug 27, 2020
@arminru
Copy link
Member

arminru commented Aug 27, 2020

Still pending on open-telemetry/build-tools#5 to be merged (or the build action to be disabled for now).

@bogdandrutu
Copy link
Member

Rebase?

Makefile Outdated Show resolved Hide resolved
@SergeyKanzhelev
Copy link
Member

it shows me this strange UI without Update Branch or Merge button:

image

@bogdandrutu does it show the same for you?

@arminru
Copy link
Member

arminru commented Aug 27, 2020

@SergeyKanzhelev
@bogdandrutu enabled a setting that all branches have to be brought up to date manually before merging until all PRs picked up some linting fixes he recently added (see #512). I updated the branch manually just now.

@SergeyKanzhelev SergeyKanzhelev merged commit a107e69 into open-telemetry:master Aug 27, 2020
GA Spec Burndown automation moved this from Required for GA, approved to Required for GA, done Aug 27, 2020
@SergeyKanzhelev
Copy link
Member

@SergeyKanzhelev
@bogdandrutu enabled a setting that all branches have to be brought up to date manually before merging until all PRs picked up some linting fixes he recently added (see #512). I updated the branch manually just now.

Understood. On other PRs I see a button to update a branch so I can do it myself:

image

Maybe this is because I don't have write permissions on DynaTrace repo to update the branch myself before merging.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions release:required-for-ga Must be resolved before GA release, or nice to have before GA Stale
Projects
No open projects
GA Spec Burndown
  
Required/Allowed for GA, resolved.
Development

Successfully merging this pull request may close these issues.

Proposal: Define semantic conventions in JSON/YAML
8 participants