New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Only log events at debug level #37229
Only log events at debug level #37229
Conversation
This pull request does not have a backport label.
To fixup this pull request, you need to add the backport labels for the needed
|
The Elasticsearch client was logging raw events in error and warn level, this commit makes it only log the raw events or the returned message from Elasticsearch, that might contain some fields from the event, on debug level. Log messages containing only the returned status from Elasticsearch are maintained on their original log level.
7ae0bb9
to
9b4430c
Compare
client.log.Warnf("Cannot index event (status=%v): dropping event!", status) | ||
client.log.Debugf("Cannot index event %#v (status=%v): %s, dropping event!", data[i], status, msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are we able to safely log the msg
? Does it contain event data if you simulate a mapping conflict (map something as keyword and then write a json object to it or something similar)? I would hope just the field names involved.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If it is just the fields names then I guess we can remove the debug log and only log the msg right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Correct, I want us to keep the error message if we can since that is the most valuable part. It usually tells you enough to know exactly what is wrong. We just need to do a quick test to confirm this is safe.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unfortunately Elasticsearch returns both the field name and its value, I updated the PR description with a detailed reasoning, but here is the TLDR:
Given the event:
{"message":"index failure","int":"not a number","string":10}
The field int
is supposed to be an number (see the "how to test this PR above"), that causes the mapping failure with the following message:
{
"type": "document_parsing_exception",
"reason": "[1:437] failed to parse field [int] of type [long] in document with id 'CdYVHIwBAsOaLnJkwT8o'. Preview of field's value: 'not a number'",
"caused_by": {
"type": "illegal_argument_exception",
"reason": "For input string: \"not a number\""
}
}
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We have seen instances of payload events showing up on the RLC logs on Synthetics service when there were mapping conflicts (Mostly likely customer has modified mappings on their ES stack). Especially happened for some of the network payload events which had cookies/auth headers which shouldn't be logged.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@belimawr I agree with what Craig just said above. Let's change the log level for now but remove the mention of Preview of field's value: 'xxx'"
in a follow up PR.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@belimawr I think we should open an issue in the Elasticsearch repo to change this error message to not include this information by default. This would benefit not only Beats, but also Logstash and language clients by providing a more secure-by-default behavior.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear - we can continue with this PR as written to get a fix in the short-term, but I do think losing the error message is going to make debugging issues quite painful. We should have Elasticsearch change this (or add a different field with just the error message without the field value, or an opt out flag) and then come back and log this information again when it's sure it doesn't contain sensitive information.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agreed, there is already a follow up issue in the security repository about having ES make a change here as well.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I added @cmacknz's suggestions, the way it is now:
- The original log entries are in debug level
- For all entries I changed the level, there is a shorter one in the original log level stating that more details are available in debug level.
Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
The Elasticsearch client was logging raw events in error and warn level, this commit makes it only log the raw events (or any event data) at debug level. This means the error message returned by Elasticsearch is now only available at debug level because it can contain the whole value of a field causing a mapping conflict. --------- Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> (cherry picked from commit ac7309a)
The Elasticsearch client was logging raw events in error and warn level, this commit makes it only log the raw events (or any event data) at debug level. This means the error message returned by Elasticsearch is now only available at debug level because it can contain the whole value of a field causing a mapping conflict. --------- Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> (cherry picked from commit ac7309a)
The Elasticsearch client was logging raw events in error and warn level, this commit makes it only log the raw events (or any event data) at debug level. This means the error message returned by Elasticsearch is now only available at debug level because it can contain the whole value of a field causing a mapping conflict. --------- Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> (cherry picked from commit ac7309a) Co-authored-by: Tiago Queiroz <tiago.queiroz@elastic.co>
The Elasticsearch client was logging raw events in error and warn level, this commit makes it only log the raw events (or any event data) at debug level. This means the error message returned by Elasticsearch is now only available at debug level because it can contain the whole value of a field causing a mapping conflict. --------- Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co> (cherry picked from commit ac7309a)
One question, and one request ! Request : Question :
This isn't useful (sorry !). There is plenty of useful information about this error in the response from elasticsearch which (as an operator of elastic-agent/beats) I would like to see. Needing to enable debug level logs for every 400 we may see isn't a great experience. |
Agreed we need a better solution for this. The fundamental problem is we don't want secrets or PII ending up in logs that are shipped to remote clusters that do not want to store this information, but the Elasticsearch errors themselves can contain field values. What we want is a sanitized version of Example:
|
The Elasticsearch client was logging raw events in error and warn level, this commit makes it only log the raw events (or any event data) at debug level. This means the error message returned by Elasticsearch is now only available at debug level because it can contain the whole value of a field causing a mapping conflict. --------- Co-authored-by: Craig MacKenzie <craig.mackenzie@elastic.co>
Proposed commit message
The Elasticsearch client was logging raw events in error and warn level, this commit makes it only log the raw events (or any event data) at debug level. This means the error message returned by Elasticsearch is now only available at debug level because it can contain the whole value of a field causing a mapping conflict.
Why is Elasticsearch message only logged at debug level?
The new code removes two things from any level other then debug:
Given the steps to test below, we get the following raw logs indicating indexing failure:
The first line is at level warn with no event data, the second one is at level debug with the raw event and the response sent by Elasticsearch.
It is referring to the third log (line from the example below)
Making the second log entry more human readable we have:
Looking at the code that generates the debug line, we're interested in the
msg
argument.beats/libbeat/outputs/elasticsearch/client.go
Line 451 in 9b4430c
This happens to be in JSON format, making it more human readable we have:
Which contains the whole value of the field that caused the index failure.
Checklist
- [ ] I have commented my code, particularly in hard-to-understand areas- [ ] I have made corresponding changes to the documentation- [ ] I have made corresponding change to the default configuration files- [ ] I have added tests that prove my fix is effective or that my feature worksCHANGELOG.next.asciidoc
orCHANGELOG-developer.next.asciidoc
.## Author's ChecklistHow to test this PR locally
Start Filebeat with the following configuration
Create the log file
/tmp/flog.log
with the following content:Raw events should only be logged at debug level.
## Related issues## Use cases## ScreenshotsLogs
How the new log lines look like when there is an event dropped