Horreum observability #845

johnaohara · 2023-10-30T08:01:16Z

The OpenTelemetry extension allows us to view telemetry data and export it to Prometheus.

I was thinking about #342 and observability in general. I was able to identify the root cause of that issue by using custom OpenTelemetry spans to track the tasks placed in the task queue.
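To make the idea of span-style task tracking concrete, here is a minimal sketch using only the JDK. It is not the instrumentation used for #342 and it does not call the OpenTelemetry API; `TaskSpanSketch`, `Recorder`, `TaskSpan`, and the `taskqueue.process` / `task.id` names are all hypothetical. It only illustrates the principle: wrap each queued task, record its timing, attributes, and any error as a structured event, and query those events afterwards.

```java
import java.time.Instant;
import java.util.ArrayList;
import java.util.List;
import java.util.Map;
import java.util.concurrent.ConcurrentLinkedQueue;
import java.util.stream.Collectors;

// Hypothetical stand-in for span-based task tracking; not Horreum or OpenTelemetry code.
class TaskSpanSketch {

    /** One finished unit of work, similar in spirit to a tracing span. */
    public static final class TaskSpan {
        public final String name;
        public final Map<String, String> attributes;
        public final Instant start;
        public final Instant end;
        public final String error; // null when the task completed normally

        TaskSpan(String name, Map<String, String> attributes,
                 Instant start, Instant end, String error) {
            this.name = name;
            this.attributes = attributes;
            this.start = start;
            this.end = end;
            this.error = error;
        }

        public boolean failed() {
            return error != null;
        }
    }

    /** Collects finished spans in memory so they can be queried rather than grepped. */
    public static final class Recorder {
        private final ConcurrentLinkedQueue<TaskSpan> spans = new ConcurrentLinkedQueue<>();

        /** Runs the task and records its timing and any runtime exception as a span. */
        public void trace(String name, Map<String, String> attributes, Runnable task) {
            Instant start = Instant.now();
            String error = null;
            try {
                task.run();
            } catch (RuntimeException e) {
                error = e.getClass().getSimpleName() + ": " + e.getMessage();
                throw e; // recording must not swallow the failure
            } finally {
                spans.add(new TaskSpan(name, Map.copyOf(attributes), start, Instant.now(), error));
            }
        }

        /** Returns a snapshot of every recorded span. */
        public List<TaskSpan> all() {
            return new ArrayList<>(spans);
        }

        /** Returns only the spans whose task threw an exception. */
        public List<TaskSpan> failedSpans() {
            return spans.stream().filter(TaskSpan::failed).collect(Collectors.toList());
        }
    }

    public static void main(String[] args) {
        Recorder recorder = new Recorder();
        recorder.trace("taskqueue.process", Map.of("task.id", "1"), () -> { });
        try {
            recorder.trace("taskqueue.process", Map.of("task.id", "2"), () -> {
                throw new IllegalStateException("simulated failure");
            });
        } catch (IllegalStateException expected) {
            // The failure still propagates; the span has already been recorded.
        }
        for (TaskSpan span : recorder.failedSpans()) {
            System.out.println(span.name + " " + span.attributes + " failed: " + span.error);
        }
    }
}
```

In a real deployment the in-memory recorder would presumably be replaced by a tracing API and an exporter, so that the spans end up in a backend that can be queried.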

How do we want to observe a running prod instance of Horreum? What info do we want to track, and how do we want to access it? For example, the logs do not contain enough information to understand what is happening in #342, even at debug level.

Do we want to rely on grepping debug logs to find information, or do we want some form of observability tool that lets us query recorded events and obtain the information needed to understand what is happening in the running instance?

It is relatively simple to add OpenTelemetry as a telemetry backend, but it requires the necessary infrastructure to process the telemetry data (OTel Collector, Prometheus, Jaeger, etc.). On the flip side, the Quarkus OpenTelemetry extension provides insight and error tracing in some of the Quarkus subsystems, including parts of the system we would not naturally think to instrument, e.g.;

Originally posted by @johnaohara in #365

stalep · 2023-10-30T11:39:45Z

I think this would help us a lot when we encounter issues in Horreum and have trouble finding the errors in the logs. Log reporting should already be better with 0.10, but this would improve it further.

johnaohara added the type/enhancement, branch/master, and area/backend labels on Oct 30, 2023
