Optimize getContext and getContextAndTags #253
Conversation
It is not necessary to do multiple allocations and copying, single pass is enough.
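The single-pass idea can be sketched as follows: compute the exact final length first, then write everything into one pre-sized buffer. This is a minimal illustration, not the client's actual code; the separator constants and the `name:tags` layout are assumptions for the example.

```go
package main

import (
	"fmt"
	"strings"
)

// Hypothetical separators; the real client defines its own constants.
const (
	nameSeparatorSymbol = ":"
	tagSeparatorSymbol  = ","
)

// getContext joins a metric name and its tags into a single context key
// using one pre-sized allocation instead of repeated appends and copies.
func getContext(name string, tags []string) string {
	if len(tags) == 0 {
		return name + nameSeparatorSymbol
	}
	// Compute the exact final length up front so the builder
	// allocates its backing buffer exactly once.
	n := len(name) + len(nameSeparatorSymbol) + len(tagSeparatorSymbol)*(len(tags)-1)
	for _, t := range tags {
		n += len(t)
	}
	var b strings.Builder
	b.Grow(n)
	b.WriteString(name)
	b.WriteString(nameSeparatorSymbol)
	for i, t := range tags {
		if i > 0 {
			b.WriteString(tagSeparatorSymbol)
		}
		b.WriteString(t)
	}
	return b.String()
}

func main() {
	fmt.Println(getContext("http.requests", []string{"env:prod", "region:eu"}))
}
```

Because `b.Grow(n)` reserves the full capacity in advance, every `WriteString` appends into the same backing array and no intermediate slices are copied.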
Thanks a lot for this PR @martin-sucha!
I added a comment about an edge case. Besides this, it looks ready to be merged.
With zero tags, the code did two allocations instead of one: len(tagSeparatorSymbol)*(len(tags)-1) evaluated to a negative number, so the computed capacity was too small and the buffer had to grow once more. Thanks @hush-hush for pointing out this edge case in code review.
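The pitfall is easy to reproduce in isolation. The sketch below uses a hypothetical separator constant to show how the naive capacity formula goes negative for an empty tag slice, and how an early length check avoids it (the fix applied in this PR, per the surrounding discussion):

```go
package main

import "fmt"

// Hypothetical separator; the real client defines its own constant.
const tagSeparatorSymbol = ","

// naiveExtra is the unguarded capacity formula: with zero tags it
// returns a negative number, so a buffer sized with it is too small
// and a later append forces a second allocation.
func naiveExtra(tags []string) int {
	return len(tagSeparatorSymbol) * (len(tags) - 1)
}

// safeExtra guards the zero-tag case before applying the formula.
func safeExtra(tags []string) int {
	if len(tags) == 0 {
		return 0
	}
	return len(tagSeparatorSymbol) * (len(tags) - 1)
}

func main() {
	fmt.Println(naiveExtra(nil)) // -1: capacity underestimate
	fmt.Println(safeExtra(nil))  // 0: correct
}
```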
One last nit-pick
There is at least one tag because we checked for len(tags) == 0 in the beginning.
Thanks for the PR!
I'll merge and will release it with the next version of the client (hopefully next week).
I'm curious about your use case and what brought you to optimizing this part of the client. Would you mind sharing how many points per second you're sending, what type of metrics ...
We switched away from datadog-go several years ago to another statsd library because datadog-go did not have aggregation support at the time and was spending too much time sending packets. Now that datadog-go has aggregation support, I was checking whether we can switch back, since the other library does not support distribution metrics, which I'd like to use in some places.

As part of that experiment, I profiled both versions. As you can see in the image from the profiler in the original post, datadog-go accounted for about 2.2% of CPU time in the staging environment, and getContext/getContextAndTags was the majority of that. At the same time it was obvious from the flamegraph that the function could be optimized pretty easily. Now the profiler shows about 1.6% CPU for datadog-go. I also tried the Prometheus Go client (with counter vectors only) for comparison, and that is around 1.4% CPU, so it is much closer now.
In the staging environment where I tested this, there were about 42k metrics per second in one pod before aggregation (as shown by