Improve syslog parser/processor error handling #31305

Closed

taylor-swanson
Contributor

@taylor-swanson taylor-swanson commented Apr 14, 2022

What does this PR do?

  • Use multierr to catch multiple errors during processing.
  • The priority field for RFC 3164 messages is now optional, as some
    syslog providers omit the priority part of the header.
  • If an error occurs, any fields that were already parsed will
    be returned, alongside error.message.
  • Added a flag that records whether the timestamp was set, which prevents
    setting the event timestamp to the zero time if it was omitted or not parsed.
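
The behaviour described in this list can be sketched as follows. This is a minimal illustration, not the PR's code: the `message` type, the `parse` and `toEvent` functions, and the `log.syslog.priority` and `message` field names are hypothetical, and the standard library's `errors.Join` (Go 1.20+) stands in for multierr so the example has no external dependencies.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"time"
)

// message is a hypothetical, trimmed-down stand-in for the parser's message type.
type message struct {
	priority     int
	prioritySet  bool
	timestamp    time.Time
	timestampSet bool
	msg          string
}

// fields returns only the values that were actually parsed.
func (m message) fields() map[string]any {
	f := map[string]any{}
	if m.prioritySet {
		f["log.syslog.priority"] = m.priority
	}
	if m.msg != "" {
		f["message"] = m.msg
	}
	return f
}

// parse keeps going after a failure and collects every error it sees.
// errors.Join is used here in place of multierr (an assumption of this sketch).
func parse(pri, ts, msg string) (message, error) {
	var m message
	var errs []error
	if pri != "" { // the priority is optional
		if v, err := strconv.Atoi(pri); err != nil {
			errs = append(errs, fmt.Errorf("priority: %w", err))
		} else {
			m.priority, m.prioritySet = v, true
		}
	}
	if t, err := time.Parse(time.RFC3339, ts); err != nil {
		errs = append(errs, fmt.Errorf("timestamp: %w", err))
	} else {
		m.timestamp, m.timestampSet = t, true
	}
	m.msg = msg
	return m, errors.Join(errs...) // nil when errs is empty
}

// toEvent returns the partial fields plus error.message, and falls back to the
// received time unless a timestamp was really parsed.
func toEvent(m message, err error, received time.Time) (map[string]any, time.Time) {
	f := m.fields()
	if err != nil {
		f["error.message"] = err.Error()
	}
	ts := received
	if m.timestampSet {
		ts = m.timestamp
	}
	return f, ts
}

func main() {
	m, err := parse("x", "not-a-time", "this is the message")
	f, ts := toEvent(m, err, time.Now())
	fmt.Println(f, ts)
}
```

With both the priority and the timestamp malformed, the caller still receives the message text, a single combined error string, and a non-zero event timestamp.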

Why is it important?

We should be giving back as much data to the user as possible, even if an error occurs. This will also lay the foundation for further improvements to the parser itself. Currently, when a parsing error is encountered, the parser stops processing and exits early. While this is fine for early EOFs, it can be problematic if we hit a more minor problem, such as a slight deviation from the RFC. I attempted to change the parser to continue past such minor problems, but it proved to be too time consuming and complicated, at least for right now.

Checklist

  • My code follows the style guidelines of this project
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation
  • I have made corresponding changes to the default configuration files
  • I have added tests that prove my fix is effective or that my feature works
  • I have added an entry in CHANGELOG.next.asciidoc or CHANGELOG-developer.next.asciidoc.

How to test this PR locally

go test github.com/elastic/beats/v7/libbeat/reader/syslog
go test github.com/elastic/beats/v7/libbeat/processors/syslog

Related issues

@taylor-swanson taylor-swanson requested a review from a team April 14, 2022 14:31
@taylor-swanson taylor-swanson self-assigned this Apr 14, 2022
@botelastic botelastic bot added needs_team Indicates that the issue/PR needs a Team:* label and removed needs_team Indicates that the issue/PR needs a Team:* label labels Apr 14, 2022
@elasticmachine
Collaborator

elasticmachine commented Apr 14, 2022

💚 Build Succeeded

Build stats

  • Start Time: 2022-04-26T13:50:44.667+0000

  • Duration: 31 min 40 sec

Test stats 🧪

Test Results
Failed 0
Passed 3978
Skipped 915
Total 4893

💚 Flaky test report

Tests succeeded.

🤖 GitHub comments

To re-run your PR in the CI, just comment with:

  • /test : Re-trigger the build.

  • /package : Generate the packages and run the E2E tests.

  • /beats-tester : Run the installation tests with beats-tester.

  • run elasticsearch-ci/docs : Re-trigger the docs validation. (use unformatted text in the comment!)

Contributor

@efd6 efd6 left a comment

Initial comments.

libbeat/reader/syslog/message.go (outdated, resolved)
libbeat/reader/syslog/message.go (outdated, resolved)
}
if v, ok := mapIndexToString(m.facility, facilityLabels); ok {
_, _ = f.Put("log.syslog.facility.name", v)
if m.prioritySet {
Contributor

I'd replace all these "_, _ =" discards with a single nolint on the if statements.

Contributor Author

If you're referring to removing the explicit discards on the return values, I'd rather keep them in place. While it is more verbose, it makes it more clear to the reader that return values are being ignored here.

libbeat/reader/syslog/syslog.go (outdated, resolved)
@mergify
Contributor

mergify bot commented Apr 15, 2022

This pull request is now in conflicts. Could you fix it? 🙏
To fix up this pull request, you can check it out locally. See documentation: https://help.github.com/articles/checking-out-pull-requests-locally/

git fetch upstream
git checkout -b syslog-error-handling upstream/syslog-error-handling
git merge upstream/main
git push upstream syslog-error-handling

@taylor-swanson taylor-swanson marked this pull request as ready for review April 19, 2022 15:24
@taylor-swanson taylor-swanson requested a review from a team as a code owner April 19, 2022 15:24
@taylor-swanson taylor-swanson requested review from cmacknz and kvch and removed request for a team April 19, 2022 15:24
@elasticmachine
Collaborator

Pinging @elastic/security-external-integrations (Team:Security-External Integrations)

@taylor-swanson taylor-swanson requested a review from a team April 19, 2022 15:35
- if layout == time.Stamp {
- 	t = t.AddDate(time.Now().In(loc).Year(), 0, 0)
- }
+ t = t.AddDate(time.Now().In(loc).Year(), 0, 0)
Contributor

t = t.AddDate(time.Now().In(loc).Year()-t.Year()+1, 0, 0) or add a comment about why this is being done (the parameter provided context, but now that is lost).

Alternatively, the parameter could come back and the two functions be merged into func mustParseTimeLoc(layout string, value string, loc *time.Location) time.Time where the loc is not used for formats that provide it (e.g. called like mustParseTime(time.RFC3339Nano, "2003-10-11T22:14:15.003Z", nil) and mustParseTime(time.Stamp, "Oct 11 22:14:15", cfgtype.MustNewTimezone("America/Chicago").Location())).

Contributor Author

This is one of the reasons why I didn't want to remove the parameter. Context is being lost on why this is being done. What I had before wasn't perfect and only added the year when time.Stamp was used. Alternatively, I can look at the time's year and if it's zero I'll enrich it with the current year. I'd rather do that than blindly apply this to all timestamps that are seen (no need to do it if the time already has a valid year).

I like what you're suggesting in the second paragraph. I never liked getting rid of the layout parameter since meaning was lost on the caller side. I'll combine the two functions into one and update the calls, and add some doc strings as well.
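
A merged helper along the lines discussed above might look like the sketch below. It is an assumption about shape, not the code that was eventually written: the signature follows the reviewer's suggestion, the year is enriched only when the parsed year is zero (as proposed in the reply), and `time.FixedZone` stands in for `cfgtype.MustNewTimezone(...).Location()` to keep the example self-contained.

```go
package main

import (
	"fmt"
	"time"
)

// mustParseTimeLoc parses value using layout and panics on failure, which is
// acceptable for a test helper. loc may be nil for layouts that carry their
// own zone information. Layouts without a year (such as time.Stamp) parse to
// year 0, so the current year is added in that case only.
func mustParseTimeLoc(layout, value string, loc *time.Location) time.Time {
	var (
		t   time.Time
		err error
	)
	if loc != nil {
		t, err = time.ParseInLocation(layout, value, loc)
	} else {
		t, err = time.Parse(layout, value)
	}
	if err != nil {
		panic(err)
	}
	if t.Year() == 0 {
		// Enrich year-less timestamps with the current year in the same zone.
		t = t.AddDate(time.Now().In(t.Location()).Year(), 0, 0)
	}
	return t
}

func main() {
	// Hypothetical fixed offset used in place of a named time zone.
	loc := time.FixedZone("UTC-5", -5*60*60)

	fmt.Println(mustParseTimeLoc(time.RFC3339Nano, "2003-10-11T22:14:15.003Z", nil))
	fmt.Println(mustParseTimeLoc(time.Stamp, "Oct 11 22:14:15", loc))
}
```

Keeping the layout in the call makes it visible at the call site why a given timestamp gets a year added, and timestamps that already carry a valid year are left untouched.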

Contributor

This is the unfortunate consequence of including style linting in the CI failure path; it is entirely possible for a linter to push towards worse code.


- return m.fields(), m.timestamp, nil
+ return m.fields(), m.timestamp, err
Contributor

Can we differentiate error states that are recoverable from error states that give us no usable information? Also add a note to the godoc to the effect that the error doesn't necessarily preclude use of the fields.

Contributor Author

At the moment, I don't think so. The parser isn't set up to recover from an error state yet. It can either try again for the exact same result, or it tumbles out with a bunch of errors and partially filled fields; it all depends on where and how it fails. That's something I plan to address in the next round of changes.

Contributor

Thanks.

@@ -99,10 +110,6 @@ var parseRFC3164Cases = map[string]struct {
In: "<-1>Oct 11 22:14:15 test-host this is the message",
WantErr: ErrPriority,
},
"err-pri-missing-brackets": {
Contributor

Why did you remove this test case? It is true that priority might be missing, but what if the priority is supposed to be there but formatted incorrectly? I think this test case is still valid.

Contributor Author

It isn't, at least in terms of checking whether the priority is correct. The parser looks for a starting <; if it's not present, it moves on to the next state, which is the timestamp. It can do this because priority values are now optional. Since the input starts with a number here, the parser will try to parse it as an RFC3339 timestamp, which will ultimately fail.

I'll restore the test, but it will have to switch to checking for an ErrTimestamp instead of ErrPriority for the reason stated above.
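
The state transition described in this reply can be sketched roughly as follows. `parseHeader` is a hypothetical, heavily simplified stand-in for the real state machine, and the error variables only mirror the names used in the test cases. The upper bound of 191 assumes the usual facility*8+severity encoding with facilities 0 to 23.

```go
package main

import (
	"errors"
	"fmt"
	"strconv"
	"strings"
	"time"
)

var (
	ErrPriority  = errors.New("invalid priority")
	ErrTimestamp = errors.New("invalid timestamp")
)

// parseHeader reads an optional <PRI> part followed by a timestamp.
// Without a leading '<' it goes straight to the timestamp state.
func parseHeader(in string) (pri int, hasPri bool, rest string, err error) {
	if strings.HasPrefix(in, "<") {
		end := strings.IndexByte(in, '>')
		if end < 0 {
			return 0, false, in, ErrPriority
		}
		v, convErr := strconv.Atoi(in[1:end])
		if convErr != nil || v < 0 || v > 191 {
			return 0, false, in, ErrPriority
		}
		pri, hasPri, in = v, true, in[end+1:]
	}

	// Timestamp state: try the fixed-width BSD format first.
	if n := len(time.Stamp); len(in) >= n {
		if _, tsErr := time.Parse(time.Stamp, in[:n]); tsErr == nil {
			return pri, hasPri, strings.TrimPrefix(in[n:], " "), nil
		}
	}
	// Then try RFC 3339 on the first space-delimited token.
	token, after, _ := strings.Cut(in, " ")
	if _, tsErr := time.Parse(time.RFC3339, token); tsErr == nil {
		return pri, hasPri, after, nil
	}
	return pri, hasPri, in, ErrTimestamp
}

func main() {
	for _, in := range []string{
		"<13>Oct 11 22:14:15 test-host this is the message",
		"Oct 11 22:14:15 test-host this is the message",
		"13>Oct 11 22:14:15 test-host this is the message",
		"<-1>Oct 11 22:14:15 test-host this is the message",
	} {
		pri, hasPri, rest, err := parseHeader(in)
		fmt.Println(pri, hasPri, rest, err)
	}
}
```

In this sketch an input with the opening bracket missing never enters the priority branch, so the only error it can produce is the timestamp one, which matches the reasoning for changing the restored test's expected error.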

@taylor-swanson
Contributor Author

/test

@taylor-swanson
Contributor Author

I'm going to abandon this PR in favor of my next round of changes that I'm working on, which are nearly done. I'll apply the discussions we've had here towards the new code as well (it was building on everything we talked about here).

@taylor-swanson taylor-swanson deleted the syslog-error-handling branch May 3, 2022 19:03
Labels
backport-v8.2.0 Automated backport with mergify enhancement :Processors
4 participants