add filename based sorting to pdbtool merge in a similar vein to what run-parts does #294

faxm0dem · 2014-11-06T12:09:54Z

https://bugzilla.balabit.com/show_bug.cgi?id=211

This issue is still open and IMHO very important.
My use case: when merging a pam_unix ruleset that matches PROGRAM==sshd with an openssh ruleset that also matches PROGRAM==sshd, the results are unpredictable unless both rulesets are in the same file.


bazsi · 2014-11-11T10:15:57Z

I've tried to reproduce the issue in the original bugzilla ticket, but so far without success. db-parser() tries to find the longest match if there are multiple rules with the same prefix.

This is what I've done with the PostgreSQL sample:

patterndb:

<patterndb version='3' pub_date='2010-02-22'>

 <ruleset name='testset' id='1'>
   <patterns>
     <pattern>pgsql</pattern>
   </patterns>
   <rule provider='test' id='11' class='system'>
     <patterns>
       <pattern>xlog: backup pg_xlog/@SET:xid:0123456789ABCDEF@</pattern>
     </patterns>
   </rule>
 </ruleset>
 <ruleset name='testset' id='2'>
   <patterns>
     <pattern>pgsql</pattern>
   </patterns>
   <rule provider='test' id='12' class='system'>
     <patterns>
       <pattern>xlog: backup pg_xlog/@SET:xid:0123456789ABCDEF@ failed</pattern>
     </patterns>
   </rule>
 </ruleset>

</patterndb>

It always matched the longer pattern, regardless of the order of the ruleset blocks. I've tested it via:

$ pdbtool match -p pgsql.xml  -M 'xlog: backup pg_xlog/000000010000014700000076 failed' -P pgsql

bazsi · 2014-11-11T10:16:04Z

I'll check the ssh example.

bazsi · 2014-11-11T10:30:59Z

Do you have an sshd sample message that I should test with?

bazsi · 2014-11-11T10:46:52Z

I think I've found a sample message:

          <pattern>Accepted @ESTRING:usracct.authmethod: @for @ESTRING:usracct.username: @from @ESTRING:usracct.device: @port @ESTRING:: @@ANYSTRING:usracct.service@</pattern>

patterndb would merge these patterns iff the variable extraction is the same. If they are different, then the order in which the patterns were loaded is significant.

I think we could (and probably should) report this as an error and refuse to load the patterns or explicitly ignore the second and subsequent conflicting rules.

What do you think?

bazsi · 2014-11-12T11:11:35Z

I was giving this issue a bit more thought, and I think I came up with a way to solve it.

The problem is caused by the fact that patterndb doesn't do an exhaustive search of the rules, but rather tries to locate the first one that matches.

However rules can overlap, sometimes two distinct rules use different parsers to match the same input. This complicates things, as this is not immediately obvious.

Also, dbparser is willing to accept a partial match, if the rule matches the prefix of the message.

The problem is at its worst when two rules collide (with different patterns) and, at the same time, the first of these matches only partially. I would expect the longer pattern to prevail, but currently that only happens if the prefixes of the two rules are completely identical.

The solution I came up with is twofold:

1. Always prefer a complete match over a partial match. This slows down dbparser a bit, but more specific rules will clearly be preferred.
2. Add conflict search capabilities to pdbtool test.

pdbtool test would do an exhaustive search of all potential matches, and warn if there is more than one.

What do you think?
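
A minimal sketch of what such a conflict search might look like, using the two PostgreSQL rules from the earlier sample. This is not pdbtool's implementation: the helper names `to_regex` and `all_matches` are hypothetical, only the SET parser is modelled, and it is assumed to match one or more of the listed characters.

```python
import re

# Toy model: only the @SET:name:chars@ parser is understood, and it is
# assumed to match one or more of the listed characters.
SET_PARSER = re.compile(r"@SET:([^:@]*):([^@]*)@")

def to_regex(pattern):
    """Translate a pattern containing @SET:...@ parsers into a compiled regex."""
    parts, pos = [], 0
    for m in SET_PARSER.finditer(pattern):
        parts.append(re.escape(pattern[pos:m.start()]))   # literal text
        parts.append("[" + re.escape(m.group(2)) + "]+")  # the SET parser
        pos = m.end()
    parts.append(re.escape(pattern[pos:]))                # trailing literal
    return re.compile("".join(parts))

def all_matches(rules, message):
    """Test every rule instead of stopping at the first hit."""
    hits = []
    for rule_id, pattern in rules.items():
        rx = to_regex(pattern)
        if rx.fullmatch(message):
            hits.append((rule_id, "complete"))
        elif rx.match(message):  # pattern matches only a prefix of the message
            hits.append((rule_id, "partial"))
    return hits

rules = {
    "11": "xlog: backup pg_xlog/@SET:xid:0123456789ABCDEF@",
    "12": "xlog: backup pg_xlog/@SET:xid:0123456789ABCDEF@ failed",
}
message = "xlog: backup pg_xlog/000000010000014700000076 failed"

hits = all_matches(rules, message)
if len(hits) > 1:
    print("warning: more than one rule matches:", hits)
```

With the sample message, both rules are reported: rule 11 as a partial match and rule 12 as a complete one.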

faxm0dem · 2014-11-12T12:26:33Z

What is a partial match? do you have an example?

bazsi · 2014-11-12T12:59:01Z

A partial match is when the message is longer than the pattern.

It happens when dbparser feels there are no further rules to process (for a
longer match), but in reality there might be other rules, which use
slightly different parsers.

That's why prioritizing complete matches helps, as you can always add a more
specific rule.

faxm0dem · 2014-11-12T13:25:32Z

you mean longer than in bytes?
Sorry, still confused - do you have an example?

bazsi · 2014-11-12T13:48:34Z

Sorry, I was typing my last entry from my phone, and that's why it was so concise. Sorry about that.

If you have this as a pattern:

   <pattern>This is a message</pattern>

Then a message that has the pattern above as a prefix would match, as long as there's no better rule.

msg="This is a message with a tail"

This would match the rule above, unless you have a more specific rule. The issue is that the radix tree behind patterndb may not always determine correctly whether there's a more specific rule.

As long as the prefixes of two rules are completely the same, patterndb would properly merge them internally:

   <pattern>This is a @ESTRING:foo: @</pattern>
   <pattern>This is a @ESTRING:foo: @with a tail</pattern>

In this case, db-parser would always match the 2nd rule, as the ESTRING parser is merged and the rules fork into two alternatives at the literal string following the parser node.

However, if any of the parameters of the parsers differ, for instance the name of the name-value pair, they wouldn't get merged:

   <pattern>This is a @ESTRING:foo1: @</pattern>
   <pattern>This is a @ESTRING:foo2: @with a tail</pattern>

This would already fork into two branches at the parser nodes, and db-parser iterates over those sequentially. In this case the first rule would match (partially, as the "with a tail" portion is not in the rule, only its prefix matches), and db-parser nevertheless accepts this as a hit.

With the change I propose (and which is implemented on the branch f/patterndb-prefer-complete-match), db-parser would always aim for a complete match first, and only if that is not found does it fall back to partial matching.

Is this clearer now?
Thanks for the feedback.
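
The difference between the two behaviours can be sketched with the foo1/foo2 rules above. This is only an illustration, not db-parser's code: the rule names and both functions are hypothetical, and ESTRING is approximated by a hand-written regex meaning "everything up to the next space".

```python
import re

# Hand-translated stand-ins for the two patterns; ESTRING with a " "
# delimiter is approximated as "[^ ]* " (an assumption of this sketch).
RULES = [
    ("rule-foo1", re.compile(r"This is a (?P<foo1>[^ ]*) ")),
    ("rule-foo2", re.compile(r"This is a (?P<foo2>[^ ]*) with a tail")),
]

def first_match(msg):
    """Accept the first rule, in load order, whose pattern matches a prefix."""
    for name, rx in RULES:
        if rx.match(msg):
            return name
    return None

def prefer_complete(msg):
    """Look for a complete match first; fall back to a partial match."""
    for name, rx in RULES:
        if rx.fullmatch(msg):
            return name
    return first_match(msg)

msg = "This is a message with a tail"
print(first_match(msg))      # rule-foo1 (partial match wins by load order)
print(prefer_complete(msg))  # rule-foo2 (complete match is preferred)
```

A message such as "This is a message and more" has no complete match, so `prefer_complete` falls back to the partial hit on the first rule.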

faxm0dem · 2014-11-12T14:21:05Z

yes, much clearer thanks for taking the time :-)
What's still not clear in my mind is how rulesets relate to one another. You mention "rule merging". Does that happen independently of the rulesets?

bazsi · 2014-11-12T14:27:41Z

It happens in the data structure behind db-parser. It's a radix tree:

http://en.wikipedia.org/wiki/Radix_tree

Although the plain radix tree is extended with parser nodes that extract
information from the input and represent a series of characters.
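
To illustrate why identical parsers merge while differently named ones fork, here is a much simplified sketch. It is a token-level trie built from nested dicts rather than a real radix tree, end-of-rule markers are omitted, "@@" escapes are not handled, and the function names are hypothetical.

```python
import re

# Splits a pattern into literal chunks and @PARSER:...@ tokens.
TOKEN = re.compile(r"(@[A-Z]+:[^@]*@)")

def tokenize(pattern):
    return [tok for tok in TOKEN.split(pattern) if tok]

def insert(tree, pattern):
    """Walk or extend the tree one token at a time; equal tokens share a node."""
    node = tree
    for tok in tokenize(pattern):
        node = node.setdefault(tok, {})

def parser_branches(patterns, literal="This is a "):
    """Build a tree and list the branches that follow the given literal."""
    tree = {}
    for pattern in patterns:
        insert(tree, pattern)
    return sorted(tree[literal])

same = parser_branches(["This is a @ESTRING:foo: @",
                        "This is a @ESTRING:foo: @with a tail"])
diff = parser_branches(["This is a @ESTRING:foo1: @",
                        "This is a @ESTRING:foo2: @with a tail"])
print(same)  # ['@ESTRING:foo: @']
print(diff)  # ['@ESTRING:foo1: @', '@ESTRING:foo2: @']
```

In the first case both rules share a single parser node, so the choice between them is made at the literal that follows. In the second case the tree already has two branches at the parser level, which is where iteration order starts to matter.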

Bazsi


ihrwein · 2015-03-09T16:03:29Z

@faxm0dem do you have any questions? I'm checking the old pending issues.

faxm0dem · 2015-03-09T16:12:49Z

Sure. @bazsi, can you confirm this is solved by the recent changes in pdb?

bazsi · 2015-03-10T05:59:58Z

Well, rule conflict handling was improved a lot with a recent fix and the
complete match approach taken in db-parser. But I am afraid there is only so
much we can do: if there are two independent rules in two separate files, the
load order will matter and only one of them will match (the first one).

I am afraid update-patterndb does not sort the input filenames, which it
should, so that you could at least prioritize one file over another by
renaming it.

So we should probably file an issue to sort the input files in pdbtool
merge. It is not an issue if you are using a filesystem that implicitly
sorts filenames within directories (which a number of filesystems do);
however, this is not the best thing to rely on.

I would suggest renaming this issue, or filing a new issue, with the
title: "add filename based sorting to pdbtool merge in a similar vein to
what run-parts does"
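
A rough sketch of the idea, assuming the files to merge live in one directory and end in ".pdb" (both are assumptions of this example, as is the function name). Sorting by name makes the load order depend on the filenames rather than on whatever order the filesystem happens to return.

```python
import tempfile
from pathlib import Path

def ordered_pdb_files(directory, suffix=".pdb"):
    """Return the matching files sorted by name, independent of readdir order."""
    files = [p for p in Path(directory).iterdir()
             if p.is_file() and p.suffix == suffix]
    return sorted(files, key=lambda p: p.name)

# Demo in a throwaway directory; the file names are made up for illustration.
with tempfile.TemporaryDirectory() as tmp:
    for name in ("20-openssh.pdb", "10-pam_unix.pdb", "notes.txt"):
        (Path(tmp) / name).write_text("")
    print([p.name for p in ordered_pdb_files(tmp)])
    # ['10-pam_unix.pdb', '20-openssh.pdb']
```

With numeric prefixes like these, one file can be prioritized over another simply by renaming it.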

Bazsi


faxm0dem · 2015-03-11T11:35:46Z

Fair enough

faxm0dem mentioned this issue Nov 6, 2014

Additions balabit/syslog-ng-patterndb#4

Open

faxm0dem changed the title ~~dbparser pattern order breaks validation~~ add filename based sorting to pdbtool merge in a simimlar vein to what run-parts does Mar 11, 2015

faxm0dem changed the title ~~add filename based sorting to pdbtool merge in a simimlar vein to what run-parts does~~ add filename based sorting to pdbtool merge in a similar vein to what run-parts does Mar 11, 2015

dougburks mentioned this issue Jul 10, 2015

Added Windows and Cisco VPN ELSA parsing Security-Onion-Solutions/securityonion-elsa-extras#3

Merged

presidento added the bug label Jun 23, 2017

alltilla mentioned this issue Apr 8, 2019

patterndb: Add sort option to pdbtool merge #2664

Merged

MrAnno closed this as completed in #2664 Apr 27, 2019
