Skip to content
This repository was archived by the owner on Aug 20, 2025. It is now read-only.

Comments

METRON-1750 Metron Parser for valid RFC 5424 Syslog messages#1175

Closed
ottobackwards wants to merge 11 commits intoapache:masterfrom
ottobackwards:syslog-5424
Closed

METRON-1750 Metron Parser for valid RFC 5424 Syslog messages#1175
ottobackwards wants to merge 11 commits intoapache:masterfrom
ottobackwards:syslog-5424

Conversation

@ottobackwards
Copy link
Contributor

This is a simple parser for valid RFC 5424 messages.
It produces JSON for the messages, including support for structured data.
All structured data will be flattened.

It's configuration allows for specifying how missing fields are handled ( "-" in the spec ).
They can be:

  • DASH ( the default and same as spec)
  • OMIT : the field will not be in the return set
  • NULL : the field will be in the return set, but as null

The parser uses the simple-syslog-5424 library.

Testing:

  • Build should work ;)
  • in full dev, parser configuration should be available
  • if you have a syslog 5424 compliant data source, hook it up

In order to streamline the review of the contribution we ask you follow these guidelines and ask you to double check the following:

For all changes:

  • Is there a JIRA ticket associated with this PR? If not one needs to be created at Metron Jira.
  • Does your PR title start with METRON-XXXX where XXXX is the JIRA number you are trying to resolve? Pay particular attention to the hyphen "-" character.
  • Has your PR been rebased against the latest commit within the target branch (typically master)?

For code changes:

  • [-] Have you included steps to reproduce the behavior or problem that is being changed or addressed?

  • Have you included steps or a guide to how the change may be verified and tested manually?

  • Have you ensured that the full suite of tests and checks have been executed in the root metron folder via:

    mvn -q clean integration-test install && dev-utilities/build-utils/verify_licenses.sh 
    
  • Have you written or updated unit tests and or integration tests to verify your changes?

  • If adding new dependencies to the code, are these dependencies licensed in a way that is compatible for inclusion under ASF 2.0?

  • Have you verified the basic functionality of the build by building and running locally with Vagrant full-dev environment or the equivalent?

For documentation related changes:

  • [-] Have you ensured that format looks appropriate for the output in which it is rendered by building and verifying the site-book? If not then run the following commands and the verify changes via site-book/target/site/index.html:

    cd site-book
    mvn site
    

Note:

Please ensure that once the PR is submitted, you check travis-ci for build issues and submit an update to your PR as soon as possible.
It is also recommended that travis-ci is set up for your personal repository such that your branches are built there before submitting a pull request.

@ottobackwards ottobackwards changed the title METRON-1453 Metron Parser for valid RFC 5424 Syslog messages METRON-1750 Metron Parser for valid RFC 5424 Syslog messages Aug 25, 2018

String originalString = new String(rawMessage);

SyslogParser parser = new SyslogParserBuilder().withNilPolicy(nilPolicy).build();
Copy link
Contributor

@justinleet justinleet Aug 27, 2018

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is half comment, half question.

It seems odd to recreate the SyslogParserBuilder every parse. I dug in a bit, and I expected (personally expected, not "Metron itself expects x condition") this process to be somewhat similar to the enrichment where the bolt implements the reloadCallback method (e.g. UnifiedEnrichmentBolt) which would then delegate updating the config of the underlying pieces.

E.g. here I would expect the SyslogParser to be created a priori and then when the parser config gets updated reloadCallback would be called and this would be updated (and in this case recreated).

Looking into it a bit further, it looks like ParserBolt has the appropriate method passed down to it, but chooses not to implement it. I suspect it's because nothing none of the underlying parsers don't update configs, although I didn't check all of them.

Would it be reasonable to get that setup in the ParserBolt and then handle the SyslogParser object that way, rather than recreating it every time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You ok with that as part of this pr?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm okay with a follow-on (particularly if it's a lot of work, or if it's risky). I'd just make sure the ticket includes updating any parsers to use the functionality.

It's probably not too hard to add (and I may be being incredibly blasé about something here). I'd expect it to be:

  • Add update method to MessageParser
  • Implement reloadCallback to update the parser configs and make sure anything else in the bolt gets updated as necessary.
  • Tests and docs

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think I want to touch all the parsers for this PR , there might be more than one parser follow on depending on how many have configurations. Can you create a jira or shall I?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth moving the SyslogParserBuilder to the configure() method and just storing it off, rather than recreating every time?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

yeah

Copy link
Member

@cestella cestella left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great!

// be sure to put in the original string, and the timestamp.
// we wil just copy over the timestamp from the syslog
jsonObject.put("original_string", originalString);
jsonObject.put("timestamp", jsonObject.get(SyslogFieldKeys.HEADER_TIMESTAMP.getField()));
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If we aren't able to parse the timestamp here, I presume there will be an exception in the parser, right? I just want to make sure there's no way for the parser to fail to return a timestamp.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Based on looking at the docs for the syslog library, it looks like it's possible to not have a timestamp (or to not validly parse a timestamp). If we have a nil for timestamp here, we probably want to default like we do elsewhere, which is to current time. What do you think?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Great point, latest commit has the change and the test

@JonZeolla
Copy link
Member

JonZeolla commented Aug 31, 2018

It looks like the upstream palindromicity/simple-syslog-5424 assumes that the PRI will be included in a log. While this is in the spec/RFC and sent on the network, it is standard practice to not write this to disk, but instead it is used by syslog software to choose which file to write it to, and strip it before writing to disk so the first component of the log is the date/timestamp. Situations where syslog is pulled from disk and sent into Metron will all fail with a syntax error. I would suggest that you work with the upstream lib (yourself) to make the PRI field optional =)

Some evidence of my claims:

  • rsyslog documentation explaining that PRI fields are sent but not recorded here.
  • The rsyslog built-in templates for writing to disk exclude PRI (details). Note that they are retained in some cases, but in situations such as forwarding to another syslog server or in the still-draft ietf-syslog-protocol-23 format.
  • Even legacy file formats only include PRI when forwarding (details).
  • Back in 2010 the SUSE syslog-ng format defaults to writing without PRI.

@ottobackwards
Copy link
Contributor Author

Can you log an issue in upstream with your excellent description please?

@kylerichardson
Copy link
Contributor

kylerichardson commented Sep 2, 2018 via email

@JonZeolla
Copy link
Member

Filed an issue upstream.

palindromicity/simple-syslog-5424#15

@ottobackwards
Copy link
Contributor Author

@kylerichardson Let's talk over on the upstream issue

@ottobackwards
Copy link
Contributor Author

Fixed in upstream 0.0.8
I will update when it posts / tomorrow

@ottobackwards
Copy link
Contributor Author

New upstream integrated now.

@JonZeolla
Copy link
Member

Can you deconflict please? package-lock.json is not happy.

@ottobackwards
Copy link
Contributor Author

Hopefully it is all set now

@cestella
Copy link
Member

This looks good to me; I'm +1 on it by inspection. Good job, otto ;)

@justinleet
Copy link
Contributor

I agree, +1 by inspection, pending Travis. This is definitely something I'm glad to see go in.

@ottobackwards
Copy link
Contributor Author

@JonZeolla let me know if you are all set

Copy link
Member

@JonZeolla JonZeolla left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Minor nit, otherwise +1

%changelog
* Thu Aug 30 2018 Apache Metron <dev@metron.apache.org> - 0.6.1
- Update compiled css file name for Alerts UI
* Fri Aug 24 2018 Apache Metron <dev@metron.apache.org> - 0.5.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't this be 0.6.1?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fixed

- Add syslog5424 parser
* Tue Aug 21 2018 Apache Metron <dev@metron.apache.org> - 0.6.1
- Add Profiler for REPL
* Tue Aug 14 2018 Apache Metron <dev@metron.apache.org> - 0.5.1
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It looks like all of these are newer than 0.6.0, but we haven't updated to 0.6.1, 0.7.0, etc. across the codebase. @nickwallen does it sound right to change the Spark profiler line item to be 0.6.1?

@JonZeolla
Copy link
Member

LGTM +1, thanks Otto this is awesome!

@mmiklavc mmiklavc mentioned this pull request Dec 18, 2018
6 tasks
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

5 participants