Support span event listeners #8619

tjquinno · 2024-04-05T04:45:05Z

Description

Resolves #8352 using restrictive adapters around existing types (Span, Span.Builder, Scope) passed to the developer's callback methods so user code cannot change the state of the span, span builder, or scope.

How it works

Helidon adds the io.helidon.tracing.SpanListener interface for the life cycle listener behavior.
At runtime our provider implementations (OpenTelemetry, OpenTracing, Zipkin, Jaeger) of the tracing types (Tracer, Span.Builder, Span, Scope) invoke the known listeners at the right times, passing proxies for the relevant objects--span builders, spans, and scopes--to prevent user code in the listener from interfering with the key characteristics or life cycle of the actual objects.
How does Helidon know about listener instances?
- Explicit registration
  
  Developer explicitly invokes aTracer.register(myListener). Many users would probably use Tracer.global() anyway but we wouldn't force that.
- Service loading
  
  Developer implements SpanListener and prepares the META-INF/services/io.helidon.tracing.SpanListener file to refer to the implementation and, if appropriate, adds a provides...with to the application or library module-info.java. The service loader mechanism lets developers register a listener with every tracer without modifying the code that creates each tracer.
Most developers use Tracer.global() rather than creating individual Tracer instances in their code, but explicit registration on individual Tracer objects remains useful, and it simplifies unit testing.
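
The two registration paths can be sketched roughly as follows. This is a minimal illustration, not Helidon's implementation: `DemoListener` and `DemoTracer` are hypothetical stand-ins for `SpanListener` and a `Tracer` implementation, and only `java.util.ServiceLoader` is a real API here.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.ServiceLoader;

// Hypothetical stand-in for io.helidon.tracing.SpanListener (one callback only).
interface DemoListener {
    void started(String spanName);
}

// Hypothetical stand-in for a Tracer implementation that tracks its listeners.
class DemoTracer {
    private final List<DemoListener> listeners = new ArrayList<>();

    DemoTracer() {
        // Service loading: picks up every implementation declared in a
        // META-INF/services file (or a module-info "provides ... with" clause).
        ServiceLoader.load(DemoListener.class).forEach(listeners::add);
    }

    // Explicit registration: returns the tracer so calls can be chained.
    DemoTracer register(DemoListener listener) {
        listeners.add(listener);
        return this;
    }

    // Notifies every known listener, however it was registered.
    void startSpan(String spanName) {
        listeners.forEach(listener -> listener.started(spanName));
    }

    int listenerCount() {
        return listeners.size();
    }
}

class RegistrationDemo {
    public static void main(String[] args) {
        DemoTracer tracer = new DemoTracer()
                .register(name -> System.out.println("started " + name));
        tracer.startSpan("checkout");
    }
}
```

Keeping one listener list per tracer is what allows a unit test to register a listener on a private tracer without affecting any global one.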

Key added type

`io.helidon.tracing.SpanListener`

| Method | When invoked | Main usage of parameter(s) |
|---|---|---|
| `starting(Span.Builder<?>)` | Before a span is started from its builder. (formerly `beforeStart`) | Assign tags (name/value pairs). |
| `started(Span)` | After a span has started. (formerly `afterStart`) | Assign tags, add events, update baggage. |
| `activated(Span, Scope)` † | After a span has been activated, creating a new scope in the process. (formerly `afterActivate`) | Note span becoming active. |
| `closed(Span, Scope)` † | After a scope has been closed. (formerly `afterClose`) | Note span is no longer active. |
| `ended(Span)` * | After a span has ended successfully. (formerly `afterEnd`) | Note completion of span. |
| `ended(Span, Throwable)` * | After a span has ended unsuccessfully. (formerly `afterEnd`) | Note completion of span. |

† Not all spans are activated; it is up to the application or library code that creates and manages the span. As a result Helidon might not invoke the listeners' activated and closed methods for every span.

* The successful or unsuccessful nature of a span's end is not about whether the tracing or telemetry system failed to end the span. Rather, it indicates whether the code that ended the span indicated some error in the processing which the span represents.
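
The callback order described in the table and its footnotes can be sketched like this. Everything here is a simplified assumption: `LifeCycleListener`, `RecordingListener`, and `LifeCycleDemo` are hypothetical, and the callbacks take a span name instead of the real `Span.Builder`, `Span`, and `Scope` arguments so that the sketch compiles without Helidon.

```java
import java.util.ArrayList;
import java.util.List;

// Hypothetical, simplified listener that mirrors the method names in the table.
interface LifeCycleListener {
    void starting(String span);
    void started(String span);
    void activated(String span);
    void closed(String span);
    void ended(String span);
    void ended(String span, Throwable error);
}

// Records the order in which the callbacks arrive.
class RecordingListener implements LifeCycleListener {
    final List<String> events = new ArrayList<>();

    public void starting(String span) { events.add("starting"); }
    public void started(String span) { events.add("started"); }
    public void activated(String span) { events.add("activated"); }
    public void closed(String span) { events.add("closed"); }
    public void ended(String span) { events.add("ended"); }
    public void ended(String span, Throwable error) {
        events.add("ended with " + error.getMessage());
    }
}

class LifeCycleDemo {
    // Runs the work inside a span. activate=false models a span that the
    // application never makes current, so activated/closed are not invoked.
    static void runSpan(LifeCycleListener listener, String span, boolean activate, Runnable work) {
        listener.starting(span);
        listener.started(span);
        if (activate) {
            listener.activated(span);
        }
        RuntimeException failure = null;
        try {
            work.run();
        } catch (RuntimeException e) {
            failure = e; // the traced work failed, not the tracing system
        } finally {
            if (activate) {
                listener.closed(span);
            }
        }
        if (failure == null) {
            listener.ended(span);
        } else {
            listener.ended(span, failure);
        }
    }

    public static void main(String[] args) {
        RecordingListener listener = new RecordingListener();
        runSpan(listener, "ok", true, () -> { });
        runSpan(listener, "failing", false, () -> { throw new IllegalStateException("boom"); });
        System.out.println(listener.events);
    }
}
```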

Error handling

Helidon catches any exception thrown by listener methods or the Helidon-provided proxies and logs a warning message describing the exception. This approach insulates the original developer's code which deals with the span from having to know whether listeners are present, whether they might throw exceptions, etc.
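
A minimal sketch of this insulation, assuming a hypothetical `SafeNotifier` helper (the name and shape are illustrative, not taken from the PR) and `java.util.logging` for the warning:

```java
import java.util.List;
import java.util.function.Consumer;
import java.util.logging.Level;
import java.util.logging.Logger;

// Hypothetical helper: invokes a callback on each listener and keeps
// listener failures away from the code that is working with the span.
class SafeNotifier {
    private static final Logger LOGGER = Logger.getLogger(SafeNotifier.class.getName());

    static <L> void notifyListeners(List<L> listeners, Consumer<L> action) {
        for (L listener : listeners) {
            try {
                action.accept(listener);
            } catch (Exception e) {
                // Log and continue: later listeners still run, and the caller
                // never sees the exception.
                LOGGER.log(Level.WARNING, "Listener " + listener + " failed", e);
            }
        }
    }

    public static void main(String[] args) {
        List<Runnable> listeners = List.of(
                () -> { throw new IllegalStateException("bad listener"); },
                () -> System.out.println("second listener still invoked"));
        notifyListeners(listeners, Runnable::run);
    }
}
```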

Limitations on what operations the listener can invoke on parameters

Lifecycle listeners cannot alter the lifecycle or the essential nature of the parameters they are passed. The arguments are proxies that implement the interfaces (Span.Builder, Span, Scope): state-changing and other forbidden operations throw UnsupportedOperationException, while permitted operations delegate to the actual underlying object.
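
The adapter idea can be sketched as follows, assuming a much-reduced, hypothetical `MiniSpan` interface in place of the real `Span`; the class names are illustrative only.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical, much-reduced span interface (the real Span has more methods).
interface MiniSpan {
    MiniSpan tag(String key, String value);
    void end();
}

// Stand-in for a provider's actual span.
class RealSpan implements MiniSpan {
    final Map<String, String> tags = new HashMap<>();
    boolean ended;

    public MiniSpan tag(String key, String value) {
        tags.put(key, value);
        return this;
    }

    public void end() {
        ended = true;
    }
}

// Adapter handed to listeners: permitted calls delegate, forbidden calls throw.
class LimitedSpan implements MiniSpan {
    private final MiniSpan delegate;

    LimitedSpan(MiniSpan delegate) {
        this.delegate = delegate;
    }

    public MiniSpan tag(String key, String value) {
        delegate.tag(key, value);
        return this; // return the adapter so chained calls stay restricted
    }

    public void end() {
        throw new UnsupportedOperationException("Listeners may not end a span");
    }
}

class AdapterDemo {
    public static void main(String[] args) {
        RealSpan real = new RealSpan();
        MiniSpan limited = new LimitedSpan(real);
        limited.tag("tenant", "acme");
        try {
            limited.end();
        } catch (UnsupportedOperationException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```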

`io.helidon.tracing.Span.Builder`

| Method | Purpose | OK? |
|---|---|---|
| `build()` | Starts the span. | - |
| `end` methods | Ends the span. | - |
| `get()` | Starts the span. | - |
| `kind(Kind)` | Sets the "kind" of span (server, client, internal, etc.) | - |
| `parent(SpanContext)` | Sets the parent of the span to be created from the builder. | - |
| `start()` | Starts the span. | - |
| `start(Instant)` | Starts the span. | - |
| `tag` methods | Add a tag to the builder before the span is built. | ✓ |
| `unwrap(Class)` | Cast the builder to the specified implementation type. † | ✓ |

† Helidon returns the unwrapped object, not a safe adapter to it.

`io.helidon.tracing.Span`

| Method | Purpose | OK? |
|---|---|---|
| `activate()` | Makes the span "current", returning a `Scope`. * | - |
| `addEvent` methods | Associate a string (and optionally other info) with a span. | ✓ |
| `baggage()` | Returns the `Baggage` instance associated with the span. | ✓ |
| `context()` | Returns the `SpanContext` associated with the span. | ✓ |
| `status(Status)` | Sets the status of the span. | - |
| any `tag` method | Add a tag to the span. | ✓ |
| `unwrap(Class)` | Cast the span to the specified implementation type. † | ✓ |

* Helidon throws UnsupportedActivationException, a Helidon exception which extends UnsupportedOperationException and adds Scope scope(). This allows the caller to catch the exception and close the Scope which was created before a problem occurred in invoking the listeners.

† Helidon returns the unwrapped object, not a safe adapter to it.
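
The exception described in the first footnote could look roughly like the sketch below. All names are hypothetical stand-ins (`DemoScope`, `DemoActivationException`, `ActivationDemo`), and the commit list for this PR later refers to the custom exception as defunct, so this illustrates the design as described here rather than the merged code.

```java
// Hypothetical stand-in for io.helidon.tracing.Scope.
interface DemoScope extends AutoCloseable {
    @Override
    void close();

    boolean isClosed();
}

// Shaped like the exception in the footnote: it extends
// UnsupportedOperationException and carries the scope that already exists.
class DemoActivationException extends UnsupportedOperationException {
    private final transient DemoScope scope;

    DemoActivationException(String message, DemoScope scope) {
        super(message);
        this.scope = scope;
    }

    DemoScope scope() {
        return scope;
    }
}

class ActivationDemo {
    // Models activate(): the scope is created before listeners run, so a
    // listener failure has to hand the scope back through the exception.
    static DemoScope activate(boolean listenerMisbehaves) {
        DemoScope scope = new DemoScope() {
            private boolean closed;

            public void close() { closed = true; }

            public boolean isClosed() { return closed; }
        };
        if (listenerMisbehaves) {
            throw new DemoActivationException("listener attempted a forbidden operation", scope);
        }
        return scope;
    }

    // Caller-side recovery: retrieve the scope from the exception and close it.
    static DemoScope activateAndRecover() {
        try {
            return activate(true);
        } catch (DemoActivationException e) {
            DemoScope scope = e.scope();
            scope.close();
            return scope;
        }
    }

    public static void main(String[] args) {
        System.out.println("closed after recovery: " + activateAndRecover().isClosed());
    }
}
```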

`io.helidon.tracing.Scope`

| Method | Purpose | OK? |
|---|---|---|
| `close()` | Close the scope. | - |
| `isClosed()` | Reports whether the scope is closed. | ✓ |

`io.helidon.tracing.SpanContext`

| Method | Purpose | OK? |
|---|---|---|
| `asParent(Span.Builder)` | Sets this context as the parent of a new span builder. | ✓ |
| `baggage()` | Returns `Baggage` instance associated with the span context. | ✓ |
| `spanId()` | Returns the span ID. | ✓ |
| `traceId()` | Returns the trace ID. | ✓ |

Documentation

Included in the PR.

Test changes

The PR includes two new general testing types added to helidon-common-testing-junit5.

`InMemoryLoggingHandler`

Tests use static factory methods to create a handler and add it to a logger. The handler stores each LogRecord in a list and exposes the list so tests can examine the accumulated log records.
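
A handler of this kind might be sketched as follows. `RecordingHandler` is a hypothetical name and shape, not the class added by the PR; it uses only `java.util.logging` and keeps a strong reference to the logger it is attached to.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.logging.Handler;
import java.util.logging.LogRecord;
import java.util.logging.Logger;

// Hypothetical sketch of an in-memory handler for tests.
class RecordingHandler extends Handler implements AutoCloseable {
    private final Logger logger; // strong reference keeps the logger alive
    private final List<LogRecord> records = new ArrayList<>();

    private RecordingHandler(Logger logger) {
        this.logger = logger;
    }

    // Static factory: creates the handler and attaches it to the logger.
    static RecordingHandler create(Logger logger) {
        RecordingHandler handler = new RecordingHandler(logger);
        logger.addHandler(handler);
        return handler;
    }

    @Override
    public synchronized void publish(LogRecord record) {
        records.add(record);
    }

    @Override
    public void flush() {
        // Nothing is buffered elsewhere, so there is nothing to flush.
    }

    // Clears the stored records and detaches the handler from its logger.
    @Override
    public synchronized void close() {
        records.clear();
        logger.removeHandler(this);
    }

    // Snapshot of the records captured so far.
    synchronized List<LogRecord> logRecords() {
        return List.copyOf(records);
    }

    public static void main(String[] args) {
        Logger logger = Logger.getLogger("demo.recording");
        try (RecordingHandler handler = RecordingHandler.create(logger)) {
            logger.warning("listener misbehaved");
            System.out.println("captured " + handler.logRecords().size() + " record(s)");
        }
    }
}
```

Used in a try-with-resources block, the handler is detached when the test finishes, so records from one test do not leak into the next.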

`LogRecordMatcher`

A Hamcrest matcher for the LogRecord type that currently matches on either the thrown type or the log message contents of a log record.

ljnelson

Most of my comments are probably preference-oriented, so choosing Request changes feels weird but I'm ticking that box. Nice work.

docs/src/main/asciidoc/includes/tracing/common-callbacks.adoc

ljnelson · 2024-04-16T03:02:42Z

docs/src/main/asciidoc/includes/tracing/common-callbacks.adoc

+
+|====
+
+{empty}* Helidon throws the link:{tracing-javadoc}/UnsupportedActivationException.html[`UnsupportedActivationException`] if a listener attempts an illegal operation from inside its `afterActivation` method. This Helidon exception extends `UnsupportedOperationException` and adds the `Scope scope()` method. Callers should catch this exception and close the `Scope`; Helidon will have activated the span and created the scope _before_ it invoked the listeners.


Controversial opinion: perhaps the Exception itself should implement AutoCloseable?

tjquinno · 2024-04-16

I guess I'm not seeing what problem this solves and how it does so.

ljnelson · 2024-04-16T17:10:36Z

Fair enough. A fair warning, which I know you already know: no one is ever going to close that Scope! I was looking for some way to make it more obvious that you have to.

tjquinno · 2024-04-16

If a developer fails to close that Scope and asks why the scope remains active in the error situation, we have the mechanism in place they can use to fix their code.

This approach also gives the developer the option to not immediately close the Scope. After all, the problem indicated by the exception is that an ill-mannered listener improperly tried to close the scope from its activated method. By throwing the exception Helidon both lets the developer's code know that and also communicates the scope that was created so the meaningful work that was supposed to take place within the scope could proceed (if that makes sense in the developer's use case) and then the developer's code can close the scope at the right time.

It's messy and involved but, I think, covers the corners.

A completely different alternative would be for Helidon to detect the listener methods' attempts to improperly alter the life cycle of the span or scope (as it does in the PR) but then log a warning instead of throw an exception.

An advantage to this approach is that the developer who wrote the main code being traced does not need to ever account for the possible presence of errant listeners. The life cycle that developer wrote is always followed regardless of what happens in the listeners, if there even are any. No special error handling code for dealing with bad listeners needs to clutter up the developer's code.

tracing/tracing/src/main/java/io/helidon/tracing/SpanLifeCycleListener.java

ljnelson · 2024-04-16T03:22:58Z

tracing/tracing/src/main/java/io/helidon/tracing/Tracer.java

+     * @param listener the {@link SpanLifeCycleListener} to register
+     * @return the updated {@code Tracer}
+     */
+    Tracer register(SpanLifeCycleListener listener);


I'm old school but I'd prefer addSpanListener(SpanListener listener). I'm sure others will disagree.

ljnelson · 2024-04-16T03:23:38Z

tracing/tracing/src/main/java/io/helidon/tracing/UnsupportedActivationException.java

+ *     when Helidon throws this exception due to an error in a listener, the caller has no access to the {@code Scope} return value
+ *     return value.
+ */
+public class UnsupportedActivationException extends UnsupportedOperationException {


Controversial opinion: consider implementing AutoCloseable directly.

tomas-langer · 2024-04-18T11:58:33Z

UnsupportedOperationException is designed for collections that do not support a subset of operations (such as write operations on a read-only collection).
I do not think it is relevant for this use case, as all operations must be (by design of the API) supported.
I think extending RuntimeException is more aligned with what this does.

ljnelson

Naming and such, primarily.

...ovider-tests/src/main/java/io/helidon/tracing/providers/tests/TestSpanLifeCycleListener.java

...providers/jaeger/src/main/java/io/helidon/tracing/providers/jaeger/JaegerTracerProvider.java

tracing/providers/zipkin/src/main/java/io/helidon/tracing/providers/zipkin/ZipkinTracer.java

tracing/tracing/src/main/java/io/helidon/tracing/SpanListener.java

tracing/tracing/src/main/java/io/helidon/tracing/Tracer.java

tracing/tracing/src/main/java/io/helidon/tracing/UnsupportedActivationException.java

ljnelson

A typo, a couple of nits, and a question on the documentation of SpanListener. Nothing big.

ljnelson · 2024-04-17T19:57:53Z

...on/testing/junit5/src/main/java/io/helidon/common/testing/junit5/InMemoryLoggingHandler.java

+ *     test--using try-with-resource--will automatically clear the handler's log records and detach the handler from the logger.
+ * </p>
+ */
+public class InMemoryLoggingHandler extends Handler implements AutoCloseable {


(Hackles go up; there are all sorts of classloading issues with logging that I never remember. Be careful this doesn't introduce some leak somewhere.)

There is also https://docs.oracle.com/en/java/javase/21/docs/api/java.logging/java/util/logging/MemoryHandler.html in case you want to extend something that presumably does everything correctly in this area.

tjquinno · 2024-04-17

The issue I know about is that the LogManager keeps only weak refs to loggers, so if you don't keep a ref yourself a given named logger might come and go seemingly randomly. Not sure if that's what you mean by the class loading issues.

I'm glad to know of MemoryHandler! I'll look at leveraging it.

tjquinno · 2024-04-17

Upon further review, the MemoryHandler seems designed as a front end to a downstream target handler to which log records are pushed under certain conditions.

The constructor uses either logging config to set it up or explicit constructor arguments, in either case rejecting a null target handler. The test use case for InMemoryLoggingHandler certainly does not need or want the target or pushing logic.

The only class loading action I see in MemoryHandler is when it locates the target handler class by name when it is specified in logging config.

I'll plan to stick with InMemoryLoggingHandler as-is unless other information comes to light.

ljnelson · 2024-04-17T20:00:20Z

docs/src/main/asciidoc/includes/tracing/common-callbacks.adoc

+A listener cannot affect the lifecycle of a span or scope it is notified about, but it can add tags and events and update the baggage associated with a span.
+Often a listener does additional work that does not change the span or scope such as logging a message.
+
+When Helidon invokes the listener's methods it passes proxies for the parameter types. These proxies limit the access the listener has to the span builder, span, or scope, as summarized in the following table. If a listener method tries to invoke an forbidden operation, the proxies throw an `UnsupportedOperationException` and Helidon then logs a `WARNING` message describing the invalid operation invocation.


"proxies for the parameter types" ▶️ "proxies for the arguments"

Perhaps not thinking clearly, but if the outcome of this sort of thing is a warning message, then do you even have to construct/throw the UnsupportedOperationException? Can't you just do a no-op instead?

tjquinno · 2024-04-17

Text fixed.

Throwing the exception easily captures the stack trace, which handily incriminates the offending listener code, and it's consistent with the general practice of "forbidden" methods throwing UnsupportedOperationException.
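
The stack-trace point can be illustrated with a small sketch. All names are hypothetical; only `Throwable.getStackTrace()` and `StackTraceElement.getClassName()` are real APIs.

```java
class StackTraceDemo {
    // Stand-in for a proxy method that listeners are not allowed to call.
    static void forbiddenOperation() {
        throw new UnsupportedOperationException("forbidden in a listener");
    }

    // Stand-in for an errant listener.
    static class ErrantListener {
        void activated() {
            forbiddenOperation();
        }
    }

    // Returns the class name of the code that called the forbidden operation.
    static String offender() {
        try {
            new ErrantListener().activated();
            return null;
        } catch (UnsupportedOperationException e) {
            // Frame 0 is forbiddenOperation itself; frame 1 is its caller.
            return e.getStackTrace()[1].getClassName();
        }
    }

    public static void main(String[] args) {
        System.out.println("offending class: " + offender());
    }
}
```

A no-op would avoid the cost of creating the exception but would lose this frame information unless the stack were captured some other way.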

...providers/jaeger/src/main/java/io/helidon/tracing/providers/jaeger/JaegerTracerProvider.java

tracing/providers/zipkin/src/main/java/io/helidon/tracing/providers/zipkin/ZipkinSpan.java

...ng/providers/zipkin/src/main/java/io/helidon/tracing/providers/zipkin/ZipkinSpanBuilder.java

tracing/providers/zipkin/src/main/java/io/helidon/tracing/providers/zipkin/ZipkinTracer.java

tracing/tracing/src/main/java/io/helidon/tracing/SpanListener.java

...telemetry/src/main/java/io/helidon/tracing/providers/opentelemetry/HelidonOpenTelemetry.java

tracing/providers/opentelemetry/src/main/java/module-info.java

...roviders/opentracing/src/main/java/io/helidon/tracing/providers/opentracing/OpenTracing.java

tracing/tracing/src/main/java/module-info.java

tomas-langer · 2024-04-22T18:11:12Z

tracing/provider-tests/pom.xml

+    See the License for the specific language governing permissions and
+    limitations under the License.
+  -->
+<project xmlns:xsi="http://www.w3.org/2001/XMLSchema-instance"


This module is in a strange location.
In general integration tests should be under module tracing/tests or tests/integration as in other modules. This module should not be here.

tjquinno · 2024-04-22

This is the same approach we chose for metrics. These are tests which all providers need to pass:

    metrics/
        provider-tests
        providers/
            micrometer

and, in this PR following the same pattern:

    tracing/
        provider-tests
        providers/
            jaeger
            opentelemetry
            opentracing
            zipkin

...entelemetry/src/main/java/io/helidon/tracing/providers/opentelemetry/OpenTelemetryScope.java

tracing/tracing/src/main/java/module-info.java

tjquinno self-assigned this Apr 5, 2024

oracle-contributor-agreement bot added the OCA Verified All contributors have signed the Oracle Contributor Agreement. label Apr 5, 2024

tjquinno marked this pull request as draft April 5, 2024 05:11

tjquinno marked this pull request as ready for review April 7, 2024 16:06

tjquinno requested review from tomas-langer, ljnelson, spericas, barchetta, arjav-desai and ljamen April 15, 2024 20:26

ljnelson requested changes Apr 16, 2024

tjquinno force-pushed the 4.x-span-listeners branch from b9f6e91 to 41b860a Compare April 16, 2024 16:10

tjquinno requested a review from ljnelson April 16, 2024 17:07

ljnelson requested changes Apr 16, 2024

tjquinno requested a review from ljnelson April 16, 2024 19:34

tjquinno marked this pull request as draft April 16, 2024 22:27

tjquinno marked this pull request as ready for review April 17, 2024 15:55

ljnelson requested changes Apr 17, 2024

tjquinno requested a review from ljnelson April 17, 2024 21:26

tomas-langer reviewed Apr 18, 2024

tjquinno requested a review from tomas-langer April 19, 2024 10:43

tjquinno added 6 commits April 19, 2024 06:51

Support span event listeners

bf2e824

Accept registrations of listeners directly to the Tracer

a594d5b

Do not add tracing providers tests to bom quite yet

0bae59c

No need to use unmodifiable list internally

2e70d73

Add provider-tests run for all tracing providers; add span listener tests

7fc8edc

Improve tests; add auto-loaded listener; fix up old tests that left stale context in place

3c3e0eb

tjquinno added 13 commits April 19, 2024 06:51

Introduce our own exception to indicate failed activation so caller can close the scope

49a1e87

Update javadoc and doc pages

b47da07

Remove commented code

5870fe4

Correct refcs to SpanInfo, etc. types - replace with Span, etc.

9801765

Enhance unwrap; add test to check unwrapping as a stand-in for JFR mapping

d3a2b28

Review comments

22b0e1f

Mild doc improvements; change copyright again in otherwise untouched file

94f23b4

More review comments

a629bc1

Move test support types to common/testing/junit5; enhance tests to use new test types

476f9c5

Remove unused type

c9b3086

Revise doc to remove references to the now-defunct custom exception and update description of exeption handling and WARNING logging

1a8c945

Review comments

cf16b52

Review comments

05cfd39

tjquinno force-pushed the 4.x-span-listeners branch from c6a20a8 to 05cfd39 Compare April 19, 2024 10:52

tomas-langer reviewed Apr 22, 2024

Fix formatting (review comments)

66388a0

tjquinno requested a review from tomas-langer April 22, 2024 19:58

Use fully-qualified name in uses (review comment)

0fc66e3

tomas-langer approved these changes Apr 24, 2024

ljnelson approved these changes Apr 24, 2024

tjquinno merged commit e93ee0c into helidon-io:main Apr 25, 2024
12 checks passed
