-
Notifications
You must be signed in to change notification settings - Fork 3
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Telemetry with two implementations: basic usage logging and opentelemetry.io observability #92
Conversation
telemetry.setHeaders(headersMap); | ||
if (headers != null) { | ||
for (String header : headers) { | ||
String lookup = parser.get(String.format(TELEMETRY_FIELDS_HEADER, header), header); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Took a while to understand how this works, but it should.
if (collectionsWildcard == null) { | ||
return ServiceTelemetry.NOP; | ||
|
||
} else if (service != null && "*".equals(collectionsWildcard)) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
if service is null here then everything is probably messed up badly and we should just fail.
Should use TELEMETRY_COLLECTIONS_WILDCARD
instead of "*"
.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
|
||
public interface TelemetrySpan extends AutoCloseable { | ||
|
||
default void counts(int count) {}; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Not sure if the default implementation should be a NOP. I would've marked these as required to implement and only implemented these as empty functions in the NOP implementation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'll remove default method declarations
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
request.getFormat().getResponseHeaders(request).forEach((k, v) -> builder.header(k, v)); | ||
builder.entity(baos.toByteArray()); | ||
return builder.build(); | ||
span.counts(1); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If telemetry fails then the whole operation fails. This seems more auditlog-gy (logging the operation is an essential part of the operation) - this is probably by design, just wanted to make sure.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think telemetry should not prevent operation.
Maybe we'll need a bunch of try catch-all additions to telemetry?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In GetCollectionItemById it's probably simple as we prefetch the feature into memory and always call counts(1), so probably something like:
final Response response = builder.entity(baos.toByteArray()).build();
// Nothing can fail after this line, except the Telemetry, but we can catch it
try (TelemetrySpan span = ftt.span()) {
span.counts(1);
} catch (Exception e) {
// log or ignore the exception
}
return response;
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Spans have the ability to record the duration of operations.
We would lose that if we are not wrapping the actual operations with spans.
Maybe there should be a contract that "telemetry shall never fail"?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In GetCollectionItemsOperation is probably a bit trickier due to streaming. I guess we could move WriteReport report
to outside of the inner try block, e.g.:
WriteReport report = null;
try (FeatureStream features = producer.getFeatures(request, c);
FeatureCollectionWriter writer = request.getFormat().getFeatureCollectionWriter()) {
...
report = SimpleFeatureWriter.writeFeatureCollection(writer, ft, c.getProperties(), features, request.getOffset(), request.getLimit());
...
}
if (report != null) {
try (TelemetrySpan span = ftt.span()) {
span.counts(report);
} catch (Exception e) {
// Log or ignore
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comment as before - we would lose duration tracking with that change
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh okay. Yes then it would be best that telemetry implementations can never throw exceptions (checked or unchecked)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
kept this as is to have the possibility to trace spans and span durations
|
||
int srid = request.getSRID(); | ||
int srid = request.getSRID(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Some whitespace changes here make it harder to tell what actually changed in this file. Not necessary to change, should look into moving towards some projection wide linting solution.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes, that would be nice.
I had to reindent everything when coding jsonfg due missing indentation settings, sorry about that
public class TracingConfiguration { | ||
|
||
/** The number of milliseconds between metric exports. */ | ||
private static final long METRIC_EXPORT_INTERVAL_MS = 800L; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should this be configurable?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes
I'll add that to parse()
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
fixed
Added lifecycle methods for ServiceTelemetry and added telemetry to toClose collection in context listener |
Looks good to me |
Fixes #90
This pull request adds telemetry to hakunapi.
Example configuration
-- https://opentelemetry.io/docs/collector/
-- https://github.com/open-telemetry/opentelemetry-java-instrumentation/tree/main/instrumentation
Example logging (using io.opentelemetry.exporter.logging.LoggingSpanExporter )