fix(metrics): collect gauge values asynchronously #213
When using the "otel" collector for metrics-reporter, you will get a log message like:
WARNING: Instrument X has recorded multiple values for the same attributes.
This is because the Gauge builder in the OpenTelemetry Java SDK can only collect gauge values asynchronously. The old code essentially redefined the gauge on every update, each time with a callback that always produces the same value. The first registration succeeds since no gauge is defined yet, but subsequent gauge updates fail because of the duplicate attributes.
This fix attempts to solve this by using a synchronous map and defining the gauge only once. The callback uses the sync map to get the latest value. Subsequent calls to gauge update the value in the map rather than redefining the gauge.
This is sort of a hack. The OpenTelemetry API is designed so that you define your gauge up front, passing a callback that the SDK invokes periodically to perform the measurements. However, this API does not conform to the MetricClient interface, so this adapter is necessary unless the metrics client is refactored to support this use case.
Fixes #15623
@xiaohansong would perhaps be the best person to review, according to git history.
- Add some comments to explain the implementation and why it is done this way.
- A little refactoring of the method body.
Thanks for your contribution!
I'm thinking maybe gauge registration should happen in the initialization function, while the gauge function simply registers the metric/attributes into the map you defined above.
MetricClient is supposed to be a singleton. The callback from gauge initialization will query the map, just the way you implemented it. What do you think?
airbyte-metrics/metrics-lib/src/main/java/io/airbyte/metrics/lib/OpenTelemetryMetricClient.java
We could create all gauges in initialization; however, I think that means it's possible to call the gauge method with a metric name that does not exist in the map. It also means that we'll have to remember to add new gauges here or it will fail (perhaps silently, depending on implementation). I'll defer to your judgement on whether that's acceptable.
@justenwalker Ah I misunderstood the change. Sounds like each metric will need to register onto opentelemetry gauge and their corresponding callback functions will be triggered periodically. This makes sense, please ignore my previous comment. Thanks!
Hi, OP for airbytehq/airbyte#15623. We tested the PR using a local Airbyte v0.42.1 stack (similar to the one documented on the Discourse thread), and:
Gauges are properly emitted, and we see no errors in the
/create-oss-pr
I ran the format fix and then merged the PR: 13b08e6. Thanks for the contribution!
Thanks @justenwalker for the contribution and @xiaohansong for merging. One quick remark though: the resulting commit, 13b08e6, does not reference the original PR ( I guess The commit does not even reference the original author, and there are 20 people identified as co-authors, which feels kind of weird? I have wondered at this several times since the Could such commits credit the original author(s), and reference the original issues and PRs, e.g. by adding links in the commit message?
Ack, sorry about that. I think we identified a bug in our merging command. These 20 people mostly come from empty commits created by merging the main branch into this one. I understand it's frustrating and discouraging to lose the author's information. We are pretty new when converting to this
What
When using the "otel" collector for metrics-reporter, you will get a log message like:
WARNING: Instrument X has recorded multiple values for the same attributes.
This is because the Gauge builder in the OpenTelemetry Java SDK can only collect gauge values asynchronously.
The old code essentially redefined the gauge on every update, each time with a callback that always produces the same value. The first registration succeeds since no gauge is defined yet, but subsequent gauge updates fail because of the duplicate attributes.
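The failure mode described above can be reproduced in miniature without the SDK. The following is only a sketch: `FakeMeter` and `NaiveClient` are hypothetical stand-ins invented for illustration (they are not OpenTelemetry or Airbyte classes), and a single instrument with a single attribute set is assumed. Each call to `gauge` registers one more callback, so a collection cycle sees several readings for the same instrument.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.function.DoubleSupplier;

/** Illustrative only: mimics re-registering a gauge callback on every update. */
class NaiveGaugeSketch {

  /** Hypothetical stand-in for an SDK meter that owns one asynchronous gauge. */
  static class FakeMeter {
    private final List<DoubleSupplier> callbacks = new ArrayList<>();

    void registerGauge(DoubleSupplier callback) {
      callbacks.add(callback);
    }

    /** One collection cycle: every registered callback reports a reading. */
    List<Double> collect() {
      List<Double> readings = new ArrayList<>();
      for (DoubleSupplier callback : callbacks) {
        readings.add(callback.getAsDouble());
      }
      return readings;
    }
  }

  /** Hypothetical client that behaves like the old code path. */
  static class NaiveClient {
    private final FakeMeter meter;

    NaiveClient(FakeMeter meter) {
      this.meter = meter;
    }

    void gauge(double value) {
      // A new callback per update, each frozen at the value it captured.
      meter.registerGauge(() -> value);
    }
  }

  public static void main(String[] args) {
    FakeMeter meter = new FakeMeter();
    NaiveClient client = new NaiveClient(meter);
    client.gauge(3);
    client.gauge(7);
    // Two readings for the same instrument in a single cycle.
    System.out.println(meter.collect());
  }
}
```

In this toy model the second update adds a reading instead of replacing the first one, which is the situation the warning complains about.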
How
This fix attempts to solve this by using a synchronous map and defining the gauge only once. The callback uses the sync map to get the latest value. Subsequent calls to gauge update the value in the map rather than redefining the gauge.
This is sort of a hack. The OpenTelemetry API is designed so that you define your gauge up front, passing a callback that the SDK invokes periodically to perform the measurements. However, this API does not conform to the MetricClient interface, so this adapter is necessary unless the metrics client is refactored to support this use case.
Fixes airbytehq/airbyte#15623
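The register-once approach described above could look roughly like the sketch below. It is not the code from this PR: `FakeMeter` and `GaugeAdapter` are hypothetical names, the map is keyed by metric name only (attributes are ignored for brevity), and `registerGauge` stands in for what would be a `gaugeBuilder(...).buildWithCallback(...)` call in the real SDK.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentHashMap;
import java.util.function.DoubleSupplier;

/** Illustrative only: register each gauge once, then update a shared map. */
class GaugeAdapterSketch {

  /** Hypothetical stand-in for an SDK meter with asynchronous gauges. */
  static class FakeMeter {
    private final Map<String, List<DoubleSupplier>> callbacks = new HashMap<>();

    void registerGauge(String name, DoubleSupplier callback) {
      callbacks.computeIfAbsent(name, k -> new ArrayList<>()).add(callback);
    }

    /** One collection cycle: readings reported per gauge name. */
    Map<String, List<Double>> collect() {
      Map<String, List<Double>> readings = new HashMap<>();
      callbacks.forEach((name, registered) -> {
        List<Double> values = new ArrayList<>();
        for (DoubleSupplier callback : registered) {
          values.add(callback.getAsDouble());
        }
        readings.put(name, values);
      });
      return readings;
    }
  }

  /** Hypothetical adapter exposing a synchronous gauge(name, value) method. */
  static class GaugeAdapter {
    private final FakeMeter meter;
    private final Map<String, Double> latest = new ConcurrentHashMap<>();

    GaugeAdapter(FakeMeter meter) {
      this.meter = meter;
    }

    void gauge(String name, double value) {
      // put() returns null only for the first value stored under this name,
      // so the callback is registered exactly once per gauge.
      if (latest.put(name, value) == null) {
        meter.registerGauge(name, () -> latest.get(name));
      }
    }
  }

  public static void main(String[] args) {
    FakeMeter meter = new FakeMeter();
    GaugeAdapter adapter = new GaugeAdapter(meter);
    adapter.gauge("queue_size", 3);
    adapter.gauge("queue_size", 7);
    // One reading per cycle, carrying the latest value: {queue_size=[7.0]}
    System.out.println(meter.collect());
  }
}
```

Storing the value before registering the callback means the callback never observes a missing entry, and using the return value of `put` keeps the first-registration check atomic when several threads report the same gauge.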
Recommended reading order
airbyte-metrics/metrics-lib/src/main/java/io/airbyte/metrics/lib/OpenTelemetryMetricClient.java
airbyte-metrics/metrics-lib/src/test/java/io/airbyte/metrics/lib/OpenTelemetryMetricClientTest.java
Can this PR be safely reverted / rolled back?
🚨 User Impact 🚨
Should be no user impact (aside from fixing the bug)