Improve telemetry output #186

Merged
rubberduck203 merged 1 commit into Gusto:main from rubberduck203:cmac/telemetry
Feb 11, 2025

Conversation

rubberduck203 (Contributor) commented Feb 11, 2025

  1. Actions and groups are now marked as errored when they fail
  2. The span's `otel.name` is now set so that it includes the group or action name

This lets us easily query/aggregate which groups or actions are failing most often.

This fixes #185 and improves on the solution for #154 implemented in #157.

The dynamic span naming relies on one of the "special fields" documented by tracing-opentelemetry:
https://docs.rs/tracing-opentelemetry/latest/tracing_opentelemetry/
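
As a rough illustration of the naming scheme, here is a minimal, stdlib-only sketch. The helper `otel_span_name` and the exact `"{kind} {name}"` format are assumptions made for illustration and are not taken from this PR's diff. With the `tracing` crate, the resulting string would typically be recorded on the span as the `otel.name` field (shown only in a comment, since it is not compiled here).

```rust
/// Hypothetical helper: builds a span name of the form "{kind} {name}",
/// loosely mirroring the "{method} {url}" shape of the HTTP convention.
fn otel_span_name(kind: &str, name: &str) -> String {
    format!("{} {}", kind, name)
}

fn main() {
    let group = otel_span_name("group", "setup");
    let action = otel_span_name("action", "install-deps");

    // With `tracing` + `tracing-opentelemetry`, the name would be attached
    // roughly like this (assumption, not compiled in this sketch):
    //   tracing::info_span!("action", otel.name = %action);
    println!("{}", group);
    println!("{}", action);
}
```

Because the name carries both the kind and the specific group or action, a tracing backend can group or filter on the span name directly.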

The naming convention is loosely based on the OTEL HTTP semantic convention of "{method} {url}".
Several of the other semantic conventions follow a similar pattern. https://opentelemetry.io/docs/specs/semconv/http/http-spans/

To access the OTEL Span's `set_status()` method, I needed to upgrade all of our opentelemetry dependencies to the latest version.
After the upgrade, traces were no longer being flushed properly before the application exited, so I updated the implementation based on the opentelemetry-otlp example from the tracing-opentelemetry repository.
https://github.com/tokio-rs/tracing-opentelemetry/blob/v0.1.x/examples/opentelemetry-otlp.rs
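
One common way to guarantee that buffered traces are flushed before exit is a guard value whose `Drop` implementation performs the flush. The sketch below shows that pattern with the standard library only. `BufferingExporter`, `FlushGuard`, and `run_with_guard` are hypothetical stand-ins rather than real opentelemetry types, and this is not the code from this PR.

```rust
use std::cell::RefCell;
use std::rc::Rc;

/// Hypothetical stand-in for a batching exporter: spans wait in `pending`
/// until `flush` moves them to `exported`.
#[derive(Default)]
struct BufferingExporter {
    pending: Vec<String>,
    exported: Vec<String>,
}

impl BufferingExporter {
    fn record(&mut self, span_name: &str) {
        self.pending.push(span_name.to_string());
    }

    fn flush(&mut self) {
        self.exported.append(&mut self.pending);
    }
}

/// Hypothetical guard: flushes the exporter when it goes out of scope.
struct FlushGuard {
    exporter: Rc<RefCell<BufferingExporter>>,
}

impl Drop for FlushGuard {
    fn drop(&mut self) {
        self.exporter.borrow_mut().flush();
    }
}

/// Records the given span names while a guard is alive, then returns the
/// exporter so the caller can inspect what was flushed.
fn run_with_guard(span_names: &[&str]) -> Rc<RefCell<BufferingExporter>> {
    let exporter = Rc::new(RefCell::new(BufferingExporter::default()));
    {
        let _guard = FlushGuard { exporter: Rc::clone(&exporter) };
        for name in span_names {
            exporter.borrow_mut().record(name);
        }
        // `_guard` is dropped at the end of this block, which triggers the flush.
    }
    exporter
}

fn main() {
    let exporter = run_with_guard(&["group setup", "action install-deps"]);
    let exporter = exporter.borrow();
    println!(
        "pending: {}, exported: {}",
        exporter.pending.len(),
        exporter.exported.len()
    );
}
```

Note that `std::process::exit` does not run destructors, so a guard like this only helps if it is dropped before any such call.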

Screenshot of what the traces look like now: [image]

rubberduck203 enabled auto-merge (squash) February 11, 2025 20:44
rubberduck203 merged commit 16842f9 into Gusto:main Feb 11, 2025
rubberduck203 deleted the cmac/telemetry branch February 11, 2025 20:51
rubberduck203 added a commit that referenced this pull request Mar 18, 2025
Between opentelemetry-otlp v0.15 and v0.16, a breaking change landed:
- https://github.com/pitoniak32/opentelemetry-rust/blob/main/opentelemetry-otlp/CHANGELOG.md#v0160
- open-telemetry/opentelemetry-rust#1706

So when we upgraded to 0.27 (#186),
we inadvertently broke people using `SCOPE_OTEL_PROTOCOL=http`. It did
not affect people using `grpc`.

To maintain consistency with grpc (and compatibility with versions prior
to PR #186), we append `v1/traces` to the `SCOPE_OTEL_ENDPOINT` when the
protocol is set to `http`.
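
A minimal sketch of the endpoint handling described above, using only the standard library. The function name `traces_endpoint` and the trailing-slash trimming are assumptions added for illustration. Only the `v1/traces` suffix and the `http`/`grpc` distinction come from the commit message.

```rust
/// Hypothetical helper: returns the endpoint the exporter should use.
/// For `http`, the OTLP traces path is appended; any other protocol
/// (for example `grpc`) gets the endpoint back unchanged.
fn traces_endpoint(protocol: &str, endpoint: &str) -> String {
    match protocol {
        // Trimming a trailing slash first (an assumption) avoids "//v1/traces".
        "http" => format!("{}/v1/traces", endpoint.trim_end_matches('/')),
        _ => endpoint.to_string(),
    }
}

fn main() {
    // In the real tool these values would come from SCOPE_OTEL_PROTOCOL and
    // SCOPE_OTEL_ENDPOINT; they are hard-coded here to keep the sketch runnable.
    println!("{}", traces_endpoint("http", "http://localhost:4318"));
    println!("{}", traces_endpoint("grpc", "http://localhost:4317"));
}
```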

Development

Successfully merging this pull request may close these issues.

Telemetry: Actions and Groups aren't properly marked as errored