Skip to content

add initial logback-access-jetty-12 instrumentation #13927

New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Open
wants to merge 11 commits into
base: main
Choose a base branch
from

Conversation

cuichenli
Copy link
Contributor

part of #12597

one basic instrumentation for logback-access-jetty-12. it only records limited attributes and will add more in follow up pr.

@cuichenli cuichenli requested a review from a team as a code owner May 27, 2025 08:44
@otelbot-java-instrumentation
Copy link
Contributor

🔧 The result from spotlessApply was committed to the PR branch.

@cuichenli
Copy link
Contributor Author

@jaydeluca have updated the code accordingly and the ci pass now . please take another look. thanks

Co-authored-by: Jay DeLuca <jaydeluca4@gmail.com>
@cuichenli
Copy link
Contributor Author

@laurit both updated, please take another look. thanks!

Copy link
Contributor

@laurit laurit left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Comment on lines +39 to +50
.setAttribute(
HttpAttributes.HTTP_RESPONSE_STATUS_CODE, Long.valueOf(accessEvent.getStatusCode()))
.setAttribute(ClientAttributes.CLIENT_ADDRESS, accessEvent.getRemoteAddr())
.setAttribute(ServerAttributes.SERVER_ADDRESS, accessEvent.getServerName())
.setAttribute(
UserAgentAttributes.USER_AGENT_ORIGINAL,
accessEvent.getRequestHeader("HTTP User-Agent"));
if (uri != null) {
builder
.setAttribute(UrlAttributes.URL_PATH, uri.getPath())
.setAttribute(UrlAttributes.URL_SCHEME, uri.getScheme());
}
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Our convention is that by default we only emit attributes that have semantic conventions. Although these attributes are define in semantic conventions there are no conventions specifically for logs or access logs. @trask could you clarify whether such usage is ok or whether there should be a flag or whether this instrumentation should be disabled by default.
Assuming that using attributes from http server semantic conventions is acceptable here you should try to follow them. Have a look at HttpServerAttributesExtractor there are some nuances on how these attributes are filled. For example as defined in spec http.request.method should only contain know request methods and otherwise use _OTHER. Perhaps here we should also try to use HttpServerAttributesExtractor for collecting the attributes and implement the getter interfaces for extracting info from the logging event?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there is an attempt to define access logs in semconv (open-telemetry/semantic-conventions#982), though I wouldn't follow the current draft state of it, and instead would follow @laurit's advice here to follow http semconv as much as possible

I suspect this should be disabled by default, since much of it is duplicative of the span data, and probably somewhat duplicative of the log data captured via the logback appender

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've been thinking about those comments for a while, and here are my two cents. Please feel free to correct me if I misunderstood anything.

I absolutely agree that we should aim to follow the semantic conventions as much as possible. However, I feel that strict adherence should mainly apply to the semantic convention names themselves. Since these are logs—not spans or metrics—the situation is a bit different. Only access logs are somewhat "structured," and they don’t have a true body, so we need to construct the log records from attributes.

If we adjust the attribute values to align too closely with the semantic conventions, I worry it might cause confusion. For example, take http.request.method: in the console, a method might be logged as XXX, and naturally, a user would search for XXX in the exported OpenTelemetry logs. But if they only see _OTHER instead, they may be confused—why doesn’t it match what was actually logged?

So my feeling is that we should treat the actual log content as the source of truth and avoid altering it too much just to conform to the semantic conventions.

Again, just my two cents—happy to be corrected if I'm off base. 🙇

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I absolutely agree that we should aim to follow the semantic conventions as much as possible. However, I feel that strict adherence should mainly apply to the semantic convention names themselves.

google defines the word semantic as relating to meaning in language or logic. They key part of semantic conventions is that besides a name they define a meaning for the attributes. To me assigning a different meaning to an attribute defined in the semantic conventions seems confusing.

If we adjust the attribute values to align too closely with the semantic conventions, I worry it might cause confusion. For example, take http.request.method: in the console, a method might be logged as XXX, and naturally, a user would search for XXX in the exported OpenTelemetry logs. But if they only see _OTHER instead, they may be confused—why doesn’t it match what was actually logged?

Semantic conventions define http.request.method_original for the original value of the request method.

So my feeling is that we should treat the actual log content as the source of truth and avoid altering it too much just to conform to the semantic conventions.

I think you should open a discussion in semantic conventions repo to get this clarified.

Copy link
Contributor Author

@cuichenli cuichenli Jun 10, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

google defines the word semantic as relating to meaning in language or logic. They key part of semantic conventions is that besides a name they define a meaning for the attributes. To me assigning a different meaning to an attribute defined in the semantic conventions seems confusing.

i think that makes sense. given we have attributes like http.request.method_original, do you think it is reasonable for this PR to start with some basic semantic convention attributes, which uses the original value would be sufficient? otherwise, the scope of this pr would be out of my expectation. thanks for your response

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

do you think it is reasonable for this PR to start with some basic semantic convention attributes

sure, though I'd still urge you to consider implementing HttpServerAttributesGetter and using HttpServerAttributesExtractor to do the attribute extraction. If that works you'd get the same attributes as the http server span.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd still urge you to consider implementing HttpServerAttributesGetter and using HttpServerAttributesExtractor to do the attribute extraction.

Please feel free to correct me if I’m mistaken. My current understanding is that HttpServerAttributesGetter is tightly coupled with the Instrumenter class—so far, I haven't seen it used outside of that context. However, the Instrumenter class is specifically designed for spans (at least that is my understanding, still, it could be wrong), whereas we are dealing with logs. If we were to use it, it would result in new spans being created, which is not ideal, as they would duplicate the spans already generated by the jetty instrumentation.


builder
.setSeverity(Severity.UNDEFINED_SEVERITY_NUMBER)
.setEventName(ACCESS_EVENT_NAME)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

does event name need to be speced?

}

builder
.setSeverity(Severity.UNDEFINED_SEVERITY_NUMBER)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

undefined is the default, no need to explicitly set it

// here.
}

builder
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd imagine that setting the context with setContext(Context.current()) could also be useful here to allow for correlating traces and logs. Idk whether the context is available here.


@Advice.OnMethodExit(onThrowable = Throwable.class, suppress = Throwable.class)
public static void methodExit(@Advice.Local("otelCallDepth") CallDepth callDepth) {
callDepth.decrementAndGet();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

there will be a NPE here when event isn't an instance of IAccessEvent

static JettyServer jettyFixture;
static String jettyFixtureUrlStr;

static final int RANDOM_SERVER_PORT = RandomUtil.getRandomServerPort();
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ideally you let the server bind to port 0 and later ask what was the actual port the os gave it. If that is not an option then use our PortUtils.findOpenPort(). RandomUtil.getRandomServerPort() just picks a random number, it may collide with a port that is used in some other test.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants