Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement metrics #7

Closed
6 tasks done
Tracked by #14
unflxw opened this issue Mar 24, 2023 · 6 comments
Closed
6 tasks done
Tracked by #14

Implement metrics #7

unflxw opened this issue Mar 24, 2023 · 6 comments
Assignees
Labels

Comments

@unflxw
Copy link
Contributor

unflxw commented Mar 24, 2023

Add metrics to the Python integration using OpenTelemetry.

To do

  • Research how to implement metrics
    • OpenTelemetry metrics
      • Add OpenTelemetry metrics support to the agent (include the OpenTelemetry metrics protobuf)
  • Convert absolute counters to incremental counters. Now it's stored as a gauge, but we should calculate the delta and send that. -- https://github.com/appsignal/appsignal-agent/pull/1050
  • Add metrics exporter to the Python package - Add OpenTelemetry metrics exporter #121
  • Document how to use OpenTelemetry metrics to report custom metrics.
    • Document which OpenTelemetry metric types we convert to which AppSignal metric types.
    • Add note that we don't support histograms.

When this is done we have custom metrics support which is good enough. Close the issue and spin the next steps off into separate issues.

Next steps

@backlog-helper
Copy link

backlog-helper bot commented Mar 24, 2023

✔️ All good!

New issue guide | Backlog management | Rules | Feedback

@unflxw
Copy link
Contributor Author

unflxw commented Jun 28, 2023

Made some progress on this task, as mentioned on Slack.

First, a note: the OpenTelemetry docs say that there is no OTLP/HTTP exporter for metrics, but this isn't actually true. The exporter was implemented months ago and it's functional, but it's not documented.

Once that was figured out, I managed to get some metrics sent from the Python integration to the agent, and parsed on the agent’s side. Here’s the debug output from the agent: https://gist.github.com/unflxw/d34dd4c37a72269d5b1e1615a53b4411

Here’s the changes in the Python integration and in the Flask test setup, and here’s the (extremely work-in-progress!) changes in the agent.

The next step will be to make the agent turn these into AppSignal metrics. Some, such as counters, have a clear mapping to existing metrics. Others, such as histograms, not so much.

Finally, a caveat: in the current work-in-progress implementation, the agent's HTTP handler tries to parse the same bytes with different protobuf definitions, first as traces, then as metrics. What we're actually supposed to do, according to the OTLP/HTTP spec, is to differentiate them by the path that is requested (/v1/traces or /v1/metrics), which makes sense. We may or may not need to add these paths manually to the HTTP endpoint setting in the exporters -- it's unclear. Regardless, this will require some changes to the agent's HTTP handler.

@unflxw
Copy link
Contributor Author

unflxw commented Jun 30, 2023

Made some more progress on this. OpenTelemetry delta sum metrics are now reported to AppSignal as counters, and both OpenTelemetry cumulative sum metrics and OpenTelemetry gauges are now reported to AppSignal as gauges:

The agent branch linked in the above comment should contain the latest (still work in progress) changes.

Screenshot 2023-06-30 at 16 17 19

Short term to do list

  • Use the HTTP path to determine whether to attempt parsing as traces or metrics (see above comment)
    • For compatibility, empty path (server root) should mean traces
  • Use data point attributes as metric tags
    • There will need to be a denylist for attributes, as OpenTelemetry metrics throw a lot of attributes of very little relevance.
    • We may want to always report a metric with and without the hostname
    • We may want to report combinatorials of the attributes, for small enough numbers of attributes? So users can filter the metrics by different attribute sets. This is probably only meaningful to do for (monotonic?) counters, not gauges, as they do not have defined aggregation semantics.
  • See if we can make magic dashboards out of this
    • There may not be much to report, using only counters/gauges, in the Python instrumentations we support. If so, do histograms first (see below)
  • Figure out a way to support histograms (see Basecamp)

@tombruijn tombruijn self-assigned this Jul 3, 2023
@backlog-helper

This comment has been minimized.

1 similar comment
@backlog-helper

This comment has been minimized.

tombruijn pushed a commit that referenced this issue Aug 25, 2023
Add the metrics exporter by default so users are able to send metrics
using OpenTelemetry to AppSignal.

This requires an agent update.

Tracking issue: #7
tombruijn pushed a commit that referenced this issue Aug 30, 2023
Add the metrics exporter by default so users are able to send metrics
using OpenTelemetry to AppSignal.

This requires an agent update.

Tracking issue: #7
@backlog-helper
Copy link

  • This issue has not had any activity in 14 days. Please provide a status update if it is still relevant. Closed it if it is no longer relevant. Or move it to another column if it's blocked or requires another look at it. - (More info)

This is a message from the daily scheduled checks.

New issue guide | Backlog management | Rules | Feedback

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants