The OpenTelemetry extension allows us to view telemetry data and export it to Prometheus.
I was thinking about #342 and observability in general. I was able to identify the root cause of the issue using custom OpenTelemetry spans to track the tasks placed in the taskqueue.
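To make the idea concrete, here is a minimal, dependency-free sketch of what tracking queued tasks could look like. It is not the instrumentation used for #342 and it does not use the OpenTelemetry API: `TaskSpan` and `TracedTaskQueue` are hypothetical names, and the record is only a stand-in for the data a real span would carry (name, enqueue/start/end timestamps, and any error).

```java
import java.time.Duration;
import java.time.Instant;
import java.util.List;
import java.util.concurrent.CopyOnWriteArrayList;
import java.util.concurrent.ExecutorService;
import java.util.concurrent.Executors;
import java.util.concurrent.TimeUnit;

// Hypothetical stand-in for a span: one record per task that went through the queue.
record TaskSpan(String name, Instant enqueued, Instant started, Instant ended, Throwable error) {
    Duration queueTime() { return Duration.between(enqueued, started); }
    Duration runTime() { return Duration.between(started, ended); }
    boolean failed() { return error != null; }
}

// Hypothetical wrapper around a single-threaded executor that records a TaskSpan per task.
class TracedTaskQueue {
    private final ExecutorService executor = Executors.newSingleThreadExecutor();
    private final List<TaskSpan> spans = new CopyOnWriteArrayList<>();

    // Capture the enqueue time now, and the start/end times when the task actually runs.
    void submit(String name, Runnable task) {
        Instant enqueued = Instant.now();
        executor.execute(() -> {
            Instant started = Instant.now();
            Throwable error = null;
            try {
                task.run();
            } catch (Throwable t) {
                error = t; // keep the failure attached to the span
            } finally {
                spans.add(new TaskSpan(name, enqueued, started, Instant.now(), error));
            }
        });
    }

    // Stop accepting tasks, wait up to 10 seconds for queued ones, and return what was recorded.
    List<TaskSpan> shutdownAndCollect() {
        executor.shutdown();
        try {
            executor.awaitTermination(10, TimeUnit.SECONDS);
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
        }
        return List.copyOf(spans);
    }
}
```

Submitting tasks through `submit("name", runnable)` and then calling `shutdownAndCollect()` gives a list that can be filtered for failed or slow tasks. With real spans the same information would be exported to a tracing backend and queried there, instead of being held in memory.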
How do we want to observe a running prod instance of Horreum, what info do we want to track, and how do we want to access it? For example, there is not enough information in the logs, even at debug level, to understand what is happening in #342.
Do we want to rely on grepping debug logs to find information, or do we want some form of observability tool that lets us query recorded events and obtain the information needed to understand what is happening in the running instance?
It is relatively simple to add OpenTelemetry as a telemetry backend, but additional infrastructure is required to process the telemetry data (OTel Collector, Prometheus, Jaeger, etc.). The flip side is that the Quarkus OpenTelemetry plugin provides insight and error tracing in some of the Quarkus subsystems, including parts of the system we would not naturally think to instrument, e.g.;
I think this would help us a lot when we encounter issues in Horreum and have trouble finding the errors in the logs. Even though log reporting should be better with 0.10, this would improve things further.
Originally posted by @johnaohara in #365