Consider OTLP export failures handleable errors #1565
@@ -61,6 +61,7 @@ def initialize(endpoint: nil,
   @http = http_connection(@uri, ssl_verify_mode, certificate_file)

   @path = @uri.path
+  @uri_string = @uri.to_s
Doesn't feel worthwhile caching this just for use in the error message.
      log_request_failure(response.code)
      FAILURE
    when Net::HTTPRequestTimeOut, Net::HTTPGatewayTimeOut, Net::HTTPBadGateway
      response.body # Read and discard body
      redo if backoff?(retry_count: retry_count += 1, reason: response.code)
      log_request_failure(response.code)
These two calls will double count if `backoff?` returns false, because `backoff?` also does `@metrics_reporter.add_to_counter('otel.otlp_exporter.failure' ...)`.
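To make the double count concrete, here is a minimal sketch. It assumes, as the comment above implies, that both `backoff?` and `log_request_failure` increment the same failure counter. `CountingReporter`, `SketchExporter`, `MAX_RETRIES`, and `give_up` are hypothetical stand-ins, not the exporter's real classes or methods.

```ruby
# Hypothetical stand-in for a metrics reporter: it only tallies increments per metric name.
class CountingReporter
  attr_reader :counts

  def initialize
    @counts = Hash.new(0)
  end

  def add_to_counter(name, increment: 1, labels: {})
    @counts[name] += increment
  end
end

# Simplified exporter (an assumption for illustration, not the real OTLP exporter).
class SketchExporter
  MAX_RETRIES = 2 # hypothetical retry limit

  def initialize(reporter)
    @metrics_reporter = reporter
  end

  # Records a failure on every call, then reports whether another retry is allowed.
  def backoff?(retry_count:, reason:)
    @metrics_reporter.add_to_counter('otel.otlp_exporter.failure', labels: { 'reason' => reason })
    retry_count <= MAX_RETRIES
  end

  # Assumed to record a failure as well, which is what causes the double count.
  def log_request_failure(code)
    @metrics_reporter.add_to_counter('otel.otlp_exporter.failure', labels: { 'reason' => code })
  end

  # Simulates the final attempt: retries are exhausted, so backoff? returns false
  # and log_request_failure runs right after it.
  def give_up(code)
    retry_count = MAX_RETRIES
    return :retry if backoff?(retry_count: retry_count += 1, reason: code)

    log_request_failure(code)
    :failure
  end
end

reporter = CountingReporter.new
SketchExporter.new(reporter).give_up('503')
puts reporter.counts['otel.otlp_exporter.failure'] # two increments for one failed attempt
```

Under these assumptions, one way out is to increment the counter in only one of the two methods.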
Nice catch 😅 - will take another look and fix it 💻
Co-authored-by: Francis Bogsanyi <francis.bogsanyi@shopify.com>
I'm so glad you submitted this PR!
I've manually added logging like this around these errors so many times, and it'll be great to have that baked in. 🍻
Hey! 👋
Going back a bit to what you said there, we appreciate the visibility 💯, but at least for us it would make more sense if the failures coming from retries were logged at debug, info, or warning level, since the request is retried anyway (it would be a different situation if it were still failing after all retry attempts). What do you think? Are you all still open to discussing it? 😄
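The suggestion above could look roughly like the following sketch: intermediate failures are logged at warning level, and only the final failure, after all retries are used up, is logged as an error. `export_with_retries` and its `max_retries` parameter are hypothetical names for illustration. The sketch uses Ruby's standard `Logger` rather than the exporter's own logging path.

```ruby
require 'logger'

# Hypothetical retry loop: the block performs one export attempt and returns
# true on success, false on a retryable failure.
def export_with_retries(logger, max_retries: 2)
  attempt = 0
  loop do
    attempt += 1
    return :success if yield(attempt)

    if attempt <= max_retries
      # The request will be retried, so a lower severity is enough here.
      logger.warn("export attempt #{attempt} failed, retrying")
    else
      # Retries are exhausted: this is the failure worth surfacing as an error.
      logger.error("export failed after #{attempt} attempts")
      return :failure
    end
  end
end

logger = Logger.new($stdout)
# The second attempt succeeds, so only one warning is logged and no error.
export_with_retries(logger) { |attempt| attempt == 2 }
```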
😬 yeah, that's bad. We shouldn't call
Sorry for the trouble @muripic, we will address this.
Thank you! ❤️
Of course! Should be addressed by #1589
This PR attempts to address #1160.
The general idea is that:
- We let the `handle_error` method decide whether or not these particular failures are important.
- Not everyone has a metrics reporter installed, so some log-level visibility might be desirable.

Alternatively, we could `OpenTelemetry::Logger.warn` (or `.debug` or `.info`, whatever) in these scenarios. I'm happy with that outcome too. Or whatever other great ideas folks may have 😄.
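As a rough illustration of the first idea, the sketch below routes export failures through one pluggable handler, so each application can decide which failures matter. `SketchTelemetry` is a hypothetical stand-in written for this example, not the SDK module. Its keyword arguments `exception:` and `message:` are modelled on the SDK's `handle_error` but should be treated as an assumption here.

```ruby
# Hypothetical stand-in for an error-handling hook with a replaceable handler.
module SketchTelemetry
  class << self
    attr_writer :error_handler

    # Default handler: print every reported failure to stderr.
    def error_handler
      @error_handler ||= ->(exception: nil, message: nil) { warn("ERROR: #{message}") }
    end

    # Single entry point the exporter would call on a failed export.
    def handle_error(exception: nil, message: nil)
      error_handler.call(exception: exception, message: message)
    end
  end
end

# An application that treats 503 responses as noise could install its own handler.
SketchTelemetry.error_handler = lambda do |exception: nil, message: nil|
  warn("export failure: #{message}") unless message.to_s.include?('http.code=503')
end

SketchTelemetry.handle_error(message: 'OTLP exporter received http.code=503') # ignored
SketchTelemetry.handle_error(message: 'OTLP exporter received http.code=400') # reported
```

The trade-off compared with a plain logger call is that the severity decision moves out of the exporter and into whatever handler the application installs.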