-
Notifications
You must be signed in to change notification settings - Fork 1.4k
refactor query logging to allow separate log file #9513
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Jackie-Jiang
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is awesome!
| private final RateLimiter _droppedLogRateLimiter; | ||
| private final AtomicLong _numDroppedLogs = new AtomicLong(0L); | ||
|
|
||
| public QueryLogger(PinotConfiguration config) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(format) Can you please reformat all the changes with Pinot Style
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yup :) thanks for the review
| // Extract source info from incoming request | ||
| logBrokerResponse(requestId, query, requestContext, tableName, numUnavailableSegments, serverStats, brokerResponse, | ||
| totalTimeMs, requesterIdentity); | ||
| _queryLogger.log( |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am guessing (hoping actually) the log response format / structure is not changed by this PR ? I ask because we have few internal scripts / tools for debug etc that parse the broker query log and so are aware of the log pattern / regex.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes this is correct, I triple/quadruple checked (and going forward there's a test to make sure it doesn't)!
| * entry, add it to the end of the existing list. | ||
| */ | ||
| private enum QueryLogEntry { | ||
| REQUEST_ID("requestId") { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think my previous question is addressed by this. Seems like the order is same as today.
Codecov Report
@@ Coverage Diff @@
## master #9513 +/- ##
=============================================
+ Coverage 28.31% 68.56% +40.25%
- Complexity 53 5205 +5152
=============================================
Files 1912 1927 +15
Lines 102058 102673 +615
Branches 15501 15566 +65
=============================================
+ Hits 28893 70395 +41502
+ Misses 70348 27293 -43055
- Partials 2817 4985 +2168
Flags with carried forward coverage won't be shown. Click here to find out more.
📣 We’re building smart automated test selection to slash your CI/CD build times. Learn more |
| <!-- Direct most logs to the log file --> | ||
| <AppenderRef ref="brokerLog"/> | ||
| </Root> | ||
| <!-- Output query logs to a dedicated file--> |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't see pinot-broker-log4j2.xml being used in any pom.xml (maybe I'm missing it). Did we check if this resulted in a separate log file for Querylogger?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You're right, this specific change isn't used anywhere (though I did test this locally with a log4j file I constructed myself to make sure that the change works).
The intent of this change was to have a sample configuration that people can pick up, I suspect most deployments run with their own log4j configuration anyway (or the docker files, which now include the change).
vvivekiyer
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM
fixes #9042
This PR refactors the query logging capabilities into its own log4j logger so that production servers can choose to configure that logging capability separately from the rest of the production logging. Example changes have been done in two
log4j.xmlfiles in this PR (please let me know if any other files need to update as well).As part of this refactor I also cleaned up the formatting of query logging so that adding new lines to the log isn't error-prone (which will also make it easier in the future if we want to add an option to log in JSON instead of a custom CSV format), and I fixed a race condition around logging the number of dropped queries.
Lastly, I added test coverage :)
cc @npawar