Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add semantic conventions for process metrics #2061

Merged
merged 2 commits into from
Jan 25, 2022

Conversation

djaglowski
Copy link
Member

@djaglowski djaglowski commented Oct 26, 2021

Changes

Adds semantic conventions for process metrics

Resolves #1817

Prerequisite for (collector) #4232

Notes

I would appreciate perspectives on process.cpu.time, in regards to nuances described here.

@djaglowski djaglowski marked this pull request as ready for review October 26, 2021 19:07
@djaglowski djaglowski requested review from a team as code owners October 26, 2021 19:07
@tigrannajaryan
Copy link
Member

Some additional context: we need to use this in 2 places right now:

  1. The hostmetrics receiver in the Collector, which already collects process metrics.
  2. The Collector's own process metrics, which it already emits.

Sadly 1 and 2 emit the process metrics in somewhat different ways (different metric names, different attributes, see Dan's comparison). We need to make this part of the spec and follow it ourselves.

@jmacd
Copy link
Contributor

jmacd commented Oct 27, 2021

💯

@tigrannajaryan
Copy link
Member

Do we also need to make the "process id" an attribute for all process metrics?

@djaglowski
Copy link
Member Author

Do we also need to make the "process id" an attribute for all process metrics?

It looks like it is already defined as a resource attribute.

@tigrannajaryan
Copy link
Member

Do we also need to make the "process id" an attribute for all process metrics?

It looks like it is already defined as a resource attribute.

Right!

@djaglowski
Copy link
Member Author

djaglowski commented Oct 29, 2021

@tigrannajaryan I misunderstood your point. I think you are suggesting we should explicitly state the requirement in this document. I'll add this!

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems to me that the process collector + the general naming conventions for metrics don't really line up. If we're going to create semconv, we should figure out how to line up with those naming conventions or purposefully deviate.

@djaglowski
Copy link
Member Author

It seems to me that the process collector + the general naming conventions for metrics don't really line up. If we're going to create semconv, we should figure out how to line up with those naming conventions or purposefully deviate.

I'm not sure what you're saying here. Are you saying that the proposed metric names don't line up with general naming conventions? (Certainly there are some metrics within the collector that do not, but those will be changed based on the outcome of this PR.)

@jsuereth
Copy link
Contributor

jsuereth commented Nov 1, 2021

I'm not sure what you're saying here. Are you saying that the proposed metric names don't line up with general naming conventions? (Certainly there are some metrics within the collector that do not, but those will be changed based on the outcome of this PR.)

Yes, the proposed names don't line up. See: https://github.com/open-telemetry/opentelemetry-specification/blob/main/specification/metrics/semantic_conventions/README.md#general-metric-semantic-conventions

@djaglowski
Copy link
Member Author

Yes, the proposed names don't line up.

I tried to follow those guidelines as closely as possible, but I don't think they cover every scenario or generalize well enough to be taken as clear rules in all cases.

Here's how I'm looking at each of the proposed names. I'm happy to reconsider but you'll have to be more specific.

Clearly following the guidelines:

  • process.cpu.time
  • process.disk.io

Naming guidelines appear to contradict aggregation rule of thumb:

  • process.memory.physical_usage
  • process.memory.virtual_usage

I took it for granted that uptime is a special case, but the naming guidelines suggest this should be process.time. It seems to me that uptime communicates the meaning much more intuitively though.

  • process.uptime

There's an open question here about whether there is an issue with the root namespace. Otherwise, it's not clear to me how these go against the naming guidelines.

  • process.runtime.memory.heap.allocated
  • process.runtime.memory.heap.allocated_cumulative
  • process.runtime.memory.system

@jsuereth
Copy link
Contributor

jsuereth commented Nov 2, 2021

Naming guidelines appear to contradict aggregation rule of thumb:

process.memory.physical_usage
process.memory.virtual_usage

Yeah, I found this last week but didn't get a chance to open a bug on it.

Otherwise, it's not clear to me how these go against the naming guidelines.

I'll try to be less lazy (more clear) in my objections going forward. I'm only objecting to the physical vs. virtual usage. I think you could mark an argument that "limit" doesn't make sense for virtual usage but does for physical, so they need to be called out and I could by that (and would want that decision recorded somewhere for all time).

If physical_usage and virutal_usage are better documented, then I have no objections besides runtime.process vs. process.runtime. Specifically, I suspect that distinction should be covered by instrumentation-library going forward, but we just don't have enough prototypes to justify such a stringent change to semconv.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@djaglowski
Copy link
Member Author

I think you could mark an argument that "limit" doesn't make sense for virtual usage but does for physical, so they need to be called out and I could by that (and would want that decision recorded somewhere for all time).

@jsuereth I've renamed the memory metrics, so that physical memory is clearly described as a usage metric, while virtual is not. I've also proposed some additional language on the general guidelines for usage metrics. Let me know what you think.

@github-actions github-actions bot removed the Stale label Nov 11, 2021
@tigrannajaryan
Copy link
Member

@open-telemetry/specs-approvers please review.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Dec 16, 2021
@djaglowski
Copy link
Member Author

@open-telemetry/specs-approvers please review.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@github-actions github-actions bot added the Stale label Dec 25, 2021
@github-actions
Copy link

github-actions bot commented Jan 1, 2022

Closed as inactive. Feel free to reopen if this PR is still being worked on.

@tigrannajaryan
Copy link
Member

@open-telemetry/specs-metrics-approvers please review.

Copy link
Contributor

@jsuereth jsuereth left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Just one minor nit remaining.

@tigrannajaryan
Copy link
Member

@djaglowski please resolve any comments that are addressed. If none are remaining we can start the countdown to merge the PR.

@open-telemetry/specs-approvers please review.

@tigrannajaryan
Copy link
Member

This has enough approvals and the last change was more than 2 days ago, this is good to merge. @djaglowski please rebase from main so that we can merge.

@github-actions
Copy link

This PR was marked stale due to lack of activity. It will be closed in 7 days.

@tigrannajaryan
Copy link
Member

@djaglowski please resolve the conflicts and ping me to merge.

@djaglowski
Copy link
Member Author

@tigrannajaryan, should be good to go.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area:semantic-conventions Related to semantic conventions spec:metrics Related to the specification/metrics directory
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Define metric semantic conventions for OS processes
8 participants