Improve syslog parser/processor error handling #31305
- Use multierr to catch multiple errors during processing.
- The priority field for RFC 3164 messages is now optional, as some syslog providers omit the priority part of the header.
- If an error occurs, any fields that were already parsed are returned alongside error.message.
- Added a flag recording whether the timestamp was set, which prevents setting the event timestamp to the zero time if it was omitted or not parsed.
faae2d5 to b6ebd19
b6ebd19 to 9f88332
Initial comments.
libbeat/reader/syslog/message.go
Outdated
}
if v, ok := mapIndexToString(m.facility, facilityLabels); ok {
_, _ = f.Put("log.syslog.facility.name", v)
if m.prioritySet {
I'd replace all these _, _ = discards with a single nolint on the if statements.
If you're referring to removing the explicit discards on the return values, I'd rather keep them in place. While it is more verbose, it makes it more clear to the reader that return values are being ignored here.
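For readers unfamiliar with the two styles being weighed here, a minimal sketch follows. The fields type and its Put method are hypothetical stand-ins (assumed to return a previous value and an error); only the "log.syslog.facility.name" key comes from the diff above, and the //nolint:errcheck directive assumes golangci-lint is the linter in use.

```go
package main

import "fmt"

// fields is a hypothetical stand-in for the event field map used in the PR.
// Put is assumed to return the previous value and an error.
type fields map[string]any

func (f fields) Put(key string, v any) (any, error) {
	old := f[key]
	f[key] = v
	return old, nil
}

func main() {
	f := fields{}

	// Style kept in the PR: the blank identifiers make it explicit at the
	// call site that both return values are deliberately ignored.
	_, _ = f.Put("log.syslog.facility.name", "user-level")

	// Style suggested in review: drop the discards and silence the linter
	// with a directive instead (the key here is illustrative only).
	f.Put("example.key", "value") //nolint:errcheck // errors ignored on purpose in this sketch

	fmt.Println(f)
}
```

Both forms compile to the same behavior; the difference is only in how visibly the ignored return values are signalled to the reader.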
This pull request is now in conflicts. Could you fix it? 🙏
Pinging @elastic/security-external-integrations (Team:Security-External Integrations)
if layout == time.Stamp {
t = t.AddDate(time.Now().In(loc).Year(), 0, 0)
}
t = t.AddDate(time.Now().In(loc).Year(), 0, 0)
t = t.AddDate(time.Now().In(loc).Year()-t.Year()+1, 0, 0)
or add a comment about why this is being done (the parameter provided context, but now that is lost).
Alternatively, the parameter could come back and the two functions be merged into func mustParseTimeLoc(layout string, value string, loc *time.Location) time.Time, where the loc is not used for formats that provide it (e.g. called like mustParseTime(time.RFC3339Nano, "2003-10-11T22:14:15.003Z", nil) and mustParseTime(time.Stamp, "Oct 11 22:14:15", cfgtype.MustNewTimezone("America/Chicago").Location())).
This is one of the reasons why I didn't want to remove the parameter. Context is being lost on why this is being done. What I had before wasn't perfect and only added the year when time.Stamp was used. Alternatively, I can look at the time's year and if it's zero I'll enrich it with the current year. I'd rather do that than blindly apply this to all timestamps that are seen (no need to do it if the time already has a valid year).
I like what you're suggesting in the second paragraph. I never liked getting rid of the layout parameter since meaning was lost on the caller side. I'll combine the two functions into one and update the calls, and add some doc strings as well.
This is the unfortunate consequence of including style linting in the CI failure path; it is entirely possible for a linter to push towards worse code.
return m.fields(), m.timestamp, nil
return m.fields(), m.timestamp, err
Can we differentiate error states that are recoverable from error states that give us no usable information? Also add a note on the godoc to the effect that the error doesn't necessarily preclude use of the fields.
At the moment, I don't think so. The parser isn't set up to recover from an error state yet. It can either try again for the exact same result, or it tumbles out with a bunch of errors and partially filled fields; it all depends on where and how it fails. That's something I plan to address in the next round of changes.
Thanks.
@@ -99,10 +110,6 @@ var parseRFC3164Cases = map[string]struct {
In: "<-1>Oct 11 22:14:15 test-host this is the message",
WantErr: ErrPriority,
},
"err-pri-missing-brackets": {
Why did you remove this test case? It is true that priority might be missing, but what if the priority is supposed to be there but formatted incorrectly? I think this test case is still valid.
It isn't, at least in terms of checking if the priority is correct or not. The parser looks for a starting <; if it's not present, it goes to the next state, which is the timestamp. It can do this because priority values are now optional. Since we start with a number here, it will try to parse it as an RFC3339 timestamp, which will ultimately fail.
I'll restore the test, but it will have to switch to checking for an ErrTimestamp instead of ErrPriority for the reason stated above.
/test
I'm going to abandon this PR in favor of my next round of changes that I'm working on, which are nearly done. I'll apply the discussions we've had here towards the new code as well (it was building on everything we talked about here).
What does this PR do?
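- Use multierr to catch multiple errors during processing.
- The priority field for RFC 3164 messages is now optional, as some syslog providers omit the priority part of the header.
- If an error occurs, any fields that were already parsed are returned alongside error.message.
- Added a flag recording whether the timestamp was set, which prevents setting the event timestamp to the zero time if it was omitted or not parsed.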
Why is it important?
We should give as much data back to the user as possible, even if an error occurs. This also lays the foundation for further improvements to the parser itself. Currently, when a parsing error is encountered, the parser stops processing and exits early. While this is fine for early EOFs, it can be problematic if we hit a more minor problem, such as a slight deviation from the RFC. I attempted to change the parser to keep going past such minor problems, but it proved too time consuming and complicated, at least for right now.
Checklist
- I have commented my code, particularly in hard-to-understand areas
- I have made corresponding changes to the documentation
- I have made corresponding change to the default configuration files
- CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.
How to test this PR locally
Related issues