Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add trace.id to request trace logs #91772

Merged
merged 14 commits into from Nov 23, 2022
Merged

Add trace.id to request trace logs #91772

merged 14 commits into from Nov 23, 2022

Conversation

thecoop
Copy link
Member

@thecoop thecoop commented Nov 21, 2022

Very dirty solution to #88174 - just add the trace id to the log message, rather than using thread context

@elasticsearchmachine elasticsearchmachine added needs:triage Requires assignment of a team area label v8.7.0 labels Nov 21, 2022
@thecoop thecoop added :Core/Infra/Core Core issues without another label and removed needs:triage Requires assignment of a team area label labels Nov 21, 2022
@elasticsearchmachine
Copy link
Collaborator

Pinging @elastic/es-core-infra (Team:Core/Infra)

@elasticsearchmachine elasticsearchmachine added the Team:Core/Infra Meta label for core/infra team label Nov 21, 2022
@thecoop thecoop added >bug Team:Core/Infra Meta label for core/infra team and removed Team:Core/Infra Meta label for core/infra team labels Nov 21, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @thecoop, I've created a changelog YAML for you.

@thecoop thecoop added :Core/Infra/REST API REST infrastructure and utilities and removed :Core/Infra/Core Core issues without another label labels Nov 21, 2022
server/src/main/java/org/elasticsearch/tasks/Task.java Outdated Show resolved Hide resolved
restRequest.getRequestId(),
restRequest.header(Task.X_OPAQUE_ID_HTTP_HEADER),
restRequest.method(),
restRequest.uri(),
restRequest.getHttpChannel()
restRequest.getHttpChannel(),
Task.extractTraceId(restRequest.header(Task.TRACE_PARENT_HTTP_HEADER)).map(t -> " trace.id: " + t).orElse("")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a test ensuring the new code produces the expected result? I know logger is hard to test, so perhaps we can extract the code into a method? I also recall we had a way to ensure logs had a certain message in some tests, but I can't seem to find it now.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some of those tests are in JsonLoggerTests

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Unfortunately that's in a different package, and HttpTracer is package private. I'll see what I can do

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I've added a separate test

Copy link
Contributor

@pgomulka pgomulka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think the workaround is ok.
bear in mind that this still will produce a different result in request than in response (the trace.id will be inside a message field". Therefore it will be harder to automatically parse.
I suspect that HttpTracer class was meant to be used for human debugging - hence the trace level which will be unfeasible for investigating problems happening in real clusters.
request after changes:
{ ... "message": "[24][SomeUniqueValue][GET][/test-index/_search] received request from [Netty4HttpChannel{localAddress=/172.18.0.2:9200, remoteAddress=/172.18.0.1:62278}] trace.id: 4bf92f3577b34da6a3ce929d0e0e4736",
response
"message": "[24][SomeUniqueValue][OK][application/json; charset=UTF-8][1005] sent response to [Netty4HttpChannel{localAddress=/172.18.0.2:9200, remoteAddress=/172.18.0.1:62278}] success [true]",..., "trace.id": "4bf92f3577b34da6a3ce929d0e0e4736"

restRequest.getRequestId(),
restRequest.header(Task.X_OPAQUE_ID_HTTP_HEADER),
restRequest.method(),
restRequest.uri(),
restRequest.getHttpChannel()
restRequest.getHttpChannel(),
Task.extractTraceId(restRequest.header(Task.TRACE_PARENT_HTTP_HEADER)).map(t -> " trace.id: " + t).orElse("")
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some of those tests are in JsonLoggerTests

@@ -57,12 +57,13 @@ HttpTracer maybeLogRequest(RestRequest restRequest, @Nullable Exception e) {
if (logger.isTraceEnabled() && TransportService.shouldTraceAction(restRequest.uri(), tracerLogInclude, tracerLogExclude)) {
logger.trace(
() -> format(
"[%s][%s][%s][%s] received request from [%s]",
"[%s][%s][%s][%s] received request from [%s]%s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

should the new %s be also in square brackets?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I was aiming to get it looking as similar as possible to the response log format

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we actually don't want to keep this the same in both request and response. so that it would be easier to parse. the same as it is done for x-opaque-id. That means that trace.id or x-opaque-id are duplicated in a response. Present in both message and its own field.
I leave that up to you, or feel free to bring it up with the team

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Getting the trace id in there is not completely simple, so I would prefer to leave this hack as small a footprint as possible - if people do complain about the difference, we can then spend more time on this looking at it properly

@thecoop
Copy link
Member Author

thecoop commented Nov 22, 2022

This was originally reported on 7.17, should this fix be backported at all?

Copy link
Contributor

@pgomulka pgomulka left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM, I left one comment. Feel free to apply this or merge as it is.

@@ -57,12 +57,13 @@ HttpTracer maybeLogRequest(RestRequest restRequest, @Nullable Exception e) {
if (logger.isTraceEnabled() && TransportService.shouldTraceAction(restRequest.uri(), tracerLogInclude, tracerLogExclude)) {
logger.trace(
() -> format(
"[%s][%s][%s][%s] received request from [%s]",
"[%s][%s][%s][%s] received request from [%s]%s",
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wonder if we actually don't want to keep this the same in both request and response. so that it would be easier to parse. the same as it is done for x-opaque-id. That means that trace.id or x-opaque-id are duplicated in a response. Present in both message and its own field.
I leave that up to you, or feel free to bring it up with the team

Copy link
Contributor

@grcevski grcevski left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM! Great test!

@thecoop thecoop added v7.17.8 auto-backport-and-merge Automatically create backport pull requests and merge when ready labels Nov 23, 2022
@elasticsearchmachine
Copy link
Collaborator

Hi @thecoop, I've updated the changelog YAML for you.

@thecoop thecoop merged commit f62db9f into elastic:main Nov 23, 2022
@thecoop thecoop deleted the http-trace-id branch November 23, 2022 15:41
thecoop added a commit to thecoop/elasticsearch that referenced this pull request Nov 23, 2022
This adds trace.id, extracted from traceparent header, so its present in the request trace log. It is already included in response trace logs via the threadcontext, but that is not set at request log time.
@elasticsearchmachine
Copy link
Collaborator

💔 Backport failed

Status Branch Result
7.17 Commit could not be cherrypicked due to conflicts
8.6
8.5

You can use sqren/backport to manually backport by running backport --upstream elastic/elasticsearch --pr 91772

thecoop added a commit to thecoop/elasticsearch that referenced this pull request Nov 23, 2022
This adds trace.id, extracted from traceparent header, so its present in the request trace log. It is already included in response trace logs via the threadcontext, but that is not set at request log time.
elasticsearchmachine pushed a commit that referenced this pull request Nov 23, 2022
This adds trace.id, extracted from traceparent header, so its present in the request trace log. It is already included in response trace logs via the threadcontext, but that is not set at request log time.
thecoop added a commit that referenced this pull request Nov 24, 2022
* Add trace.id to request trace logs (#91772)

This adds trace.id, extracted from traceparent header, so its present in the request trace log. It is already included in response trace logs via the threadcontext, but that is not set at request log time.
thecoop added a commit that referenced this pull request Nov 24, 2022
* Add trace.id to request trace logs (#91772)

This adds trace.id, extracted from traceparent header, so its present in the request trace log. It is already included in response trace logs via the threadcontext, but that is not set at request log time.
@thecoop
Copy link
Member Author

thecoop commented Nov 24, 2022

All backports applied

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
auto-backport-and-merge Automatically create backport pull requests and merge when ready >bug :Core/Infra/REST API REST infrastructure and utilities Team:Core/Infra Meta label for core/infra team v7.17.8 v8.5.3 v8.6.1 v8.7.0
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants