Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Experimental OpenTelemetry Core #437

Merged
merged 5 commits into from
Mar 12, 2024
Merged

Experimental OpenTelemetry Core #437

merged 5 commits into from
Mar 12, 2024

Conversation

mtmk
Copy link
Collaborator

@mtmk mtmk commented Mar 8, 2024

The idea with this PR is to introduce OTEL with a minimal set of changes covering only Core so that we can get feedback and improve in follow up PRs. We would mark this feature as experimental to start with.

Bulk of the work here is thanks to @Cooksauce's excellent work in #124 and feedback from the community.

resolves #53

Example Trace

See the instructions in the example to run it on your machine.

image

@mtmk mtmk requested a review from caleblloyd March 8, 2024 12:35
Copy link
Collaborator

@caleblloyd caleblloyd left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM - probably need to run some perf tests to make sure nothing is out of the ordinary but then should be good to merge

@oising
Copy link

oising commented Mar 8, 2024

Superb!

@stebet
Copy link
Collaborator

stebet commented Mar 8, 2024

Might be good to get input from @lmolkova and @martinjt on the span names to be in sync with the OTel Semantic Conventions (https://opentelemetry.io/docs/specs/semconv/messaging/messaging-spans/)

@mtmk mtmk added this to the 2.1.3 milestone Mar 9, 2024
@mtmk
Copy link
Collaborator Author

mtmk commented Mar 9, 2024

Might be good to get input from @lmolkova and @martinjt on the span names to be in sync with the OTel Semantic Conventions (https://opentelemetry.io/docs/specs/semconv/messaging/messaging-spans/)

activity source: NATS.Net

Span names are inspired from the NATS protocol somewhat mimicking HTTP calls e.g. POST /url/path

Publish: PUB <subject-or-inbox>
Subscribe: SUB <subject-or-inbox>
Subscriber receive: MSG <subject-or-inbox>
Request-reply Publish: REQ <subject-or-inbox>

happy to changed them based on your recommendations.

@martinjt
Copy link

martinjt commented Mar 9, 2024

Have you checked out the spec for these yet?

https://opentelemetry.io/docs/specs/semconv/messaging/messaging-spans/

There are formats recommended there for the span names, and also what attributes must be present.

@mtmk
Copy link
Collaborator Author

mtmk commented Mar 10, 2024

New span naming:

Publish: <dest> publish
Subscribe: <dest> subscribe
Subscriber receive: <dest> receive
Request-reply Publish: <dest> request

<dest>: inbox or first two tokens of the subjects for lower cardinality but should still be useful.

@mtmk
Copy link
Collaborator Author

mtmk commented Mar 11, 2024

Performance should not be affected for applications without OTel or when they are not listening to NATS.Net activities. We should investigate in follow up PRs if can improve the performance when OTel is turned on.

main:

Method Iter Mean Error StdDev Allocated
PublishAsync 64 132.0 us 16.80 us 0.92 us 367 B
PublishAsync 512 277.9 us 33.37 us 1.83 us 437 B
PublishAsync 1024 439.6 us 53.63 us 2.94 us 566 B

this PR:

Method Iter Mean Error StdDev Allocated
PublishAsync 64 131.3 us 17.47 us 0.96 us 391 B
PublishAsync 512 280.1 us 35.17 us 1.93 us 419 B
PublishAsync 1024 443.8 us 77.65 us 4.26 us 405 B

with OTel:

Method Iter Mean Error StdDev Gen0 Gen1 Gen2 Allocated
PublishAsync 64 311.8 us 1,506.9 us 82.60 us 8.7891 7.8125 - 129.37 KB
PublishAsync 512 1,248.2 us 2,190.4 us 120.06 us 70.3125 27.3438 11.7188 946.12 KB
PublishAsync 1024 2,300.7 us 3,095.8 us 169.69 us 148.4375 93.7500 31.2500 1994.66 KB

mtmk added a commit that referenced this pull request Mar 11, 2024
@mtmk mtmk mentioned this pull request Mar 11, 2024
@martinjt
Copy link

For looking at performance improvements, the biggest ones are

  • Test ActivitySource.HasListeners instead of calling StartActivity and checking for nulls
  • Prefer adding tags during StartActivity wherever possible.

This is a good read on reducing allocations in Akka, definitely relevant beyond just actors though.

https://petabridge.com/blog/phobos-2.4-opentelemetry-optimizations/

@mtmk
Copy link
Collaborator Author

mtmk commented Mar 12, 2024

This is a good read on reducing allocations in Akka, definitely relevant beyond just actors though.

https://petabridge.com/blog/phobos-2.4-opentelemetry-optimizations/

It looks like fair amount of performance drop is expected e.g. from ~6M down to ~2M - ~2.5M in Akka.Net's case. Numbers or ratios aren't directly comparable but I feel we aren't far off the expected drop in this case. I can't immediately see any low hanging fruit to improve the performance at this stage and if necessary we'd need to profile to see if the allocations can be reduced to start with in a follow up PR in the future.

@mtmk mtmk merged commit a371623 into main Mar 12, 2024
10 checks passed
@mtmk mtmk deleted the experimental-otel-core branch March 12, 2024 08:46
mtmk added a commit that referenced this pull request Mar 12, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

Support distributed tracing / OpenTelemetry
5 participants