Matches as an array (4.0) #3948
|
Build SUCCESS |
7c1cd66 to
255d394
Compare
|
Build FAILURE |
|
@kira-syslogng test this please; |
|
Build SUCCESS |
255d394 to
124b385
Compare
|
Build SUCCESS |
|
Just a note from our discussion: investigate the possibility of unsetting the stale/obsolete matches from earlier (i.e. when we have fewer matches in a subsequent matching than before). |
124b385 to
6c792a6
Compare
It seems that I had already implemented this and then forgot about it. It is this patch: c67006b63d05d60530f1a435750b423281935ffe. This means that if we set non-consecutive match values, then anything between the old range and the new one is unset explicitly. I've now noticed that there's a function to reset the number of matches: log_msg_clear_matches() so I've committed changes to |
|
Build SUCCESS |
gaborznagy
left a comment
I've checked it, and it looks good to me!
I have only some minor remarks about commit history, about a unit test, and some general remarks.
I don't have a better, more general alternative to the term "matches" in set-matches or unset-matches.
I'm not sure of their general nature either: users of $1 or set-matches() should be careful of regex capture groups that could overwrite their explicitly set matches.
In Bash, $*, $1, etc. all refer to positional parameters.
In Perl it was related to multiline matching in regular expressions, but not since Perl 5.10.
|
I've sent a PR about the news entries. |
|
Build FAILURE |
|
@kira-syslogng retest this please; |
|
Build FAILURE |
|
@bazsi Kira fails due to the git security patch to CVE-2022-24765 (https://github.blog/2022-04-12-git-security-vulnerability-announced/) |
|
We'll be adding the suggested |
2d7925e to
1089ad1
Compare
|
Thanks @gaborznagy for the good feedback. I've now addressed all your review notes and pushed out another iteration. There seems to be some style-check related issue in the CI about a missing Python command. Otherwise things seem to be green (unless I am breaking something with the last force-push :) Reviewing this one would be appreciated. |
|
Build SUCCESS |
gaborznagy
left a comment
Thanks for resolving my comments.
|
@bazsi this needs one last rebase to fix the Light style-check issue in GitHub Actions. |
modules/json/json-parser.c
Outdated
log {
  source(s_src);
  rewrite {
    set("foo bar" value('1'));
    unset-matches();
  };
  destination { file("/dev/stdout" template("$(format-json --scope nv-pairs --key 1)\n")); };
};
With this patch we can still query the $1 value after the unset-matches(); is it an intended behaviour?
I would argue that after unset-matches() is called every $n should be empty (""), or at least that is what I expect.
Good catch. log_msg_values_foreach() did not filter out out-of-range matches. I'm pushing a fix for this; however, I am starting to feel that it'd be much better to unset all matches explicitly, rather than trying to track the number of matches and then checking that number in all the code paths that query name-value pairs.
It might even be faster to do it only once, instead of at every log_msg_values_foreach() iteration.
This is the fix for this: e234146b1ae44af0cdfff06e65917d40fa14c275
I don't want to nitpick, especially since I've approved this PR once, but the fix in e234146 only covers the case when a cleared match is referenced through value-pairs, e.g. through format-json.
If the match (e.g. $1) is used later as an input in a parser or rewrite rule, then it would still hold the previous (cleared) value.
Maybe this scenario is rather a programming error, e.g. the writer of an SCL block should be aware of this.
A dummy example SCL:
parser example-parser() {
  ....
  set("foo bar" value('1'));
  unset-matches();
  syslog-parser(template('$1'));
};
TLDR: how about using log_msg_unset_match() instead of log_msg_clear_matches()?
In the last iteration, I have changed the approach of resetting values, so the branch now contains a patch that changes things - again.
With that said, this case has worked correctly before, '$1' was unset because of this branch in log_msg_get_match():
+const gchar *
+log_msg_get_match_if_set_with_type(const LogMessage *self, gint index_, gssize *value_len,
+                                   LogMessageValueType *type)
+{
+  g_assert(index_ >= 0 && index_ < LOGMSG_MAX_MATCHES);
+
+  if (index_ >= self->num_matches)
+    return NULL;
+
+  return nv_table_get_value_if_set(self->payload, match_handles[index_], value_len, type);
+}
+
That is, if a match number was outside the range of num_matches, we always considered it to be unset (NOTE the NULL return). This worked correctly even before. The value-pairs code path had a bug, as @OverOrion correctly noticed: it didn't use log_msg_get_match(), but rather iterated over all name-value pairs using log_msg_values_foreach(), which did not filter out $1 in this case. That's what my last patch fixed.
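To make the difference between the two code paths concrete, here is a minimal, self-contained sketch. It is not the syslog-ng implementation: ToyMsg and every toy_* function are hypothetical stand-ins, and the only things borrowed from the discussion are the num_matches counter, the range check with its NULL return, and an iterator that can skip that check.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_MAX_MATCHES 256

/* Hypothetical stand-in for LogMessage: stored match values plus a counter. */
typedef struct
{
  const char *matches[TOY_MAX_MATCHES]; /* NULL means "never set" */
  int num_matches;                      /* indexes >= this count as unset */
} ToyMsg;

/* Store a match value and grow the counter if needed. */
static void
toy_set_match(ToyMsg *msg, int index, const char *value)
{
  assert(msg != NULL);
  if (index < 0 || index >= TOY_MAX_MATCHES)
    return;
  msg->matches[index] = value;
  if (index >= msg->num_matches)
    msg->num_matches = index + 1;
}

/* Counter-only reset: the stored values are left behind. */
static void
toy_clear_matches(ToyMsg *msg)
{
  msg->num_matches = 0;
}

/* Range-checked getter: anything at or above num_matches reads as unset. */
static const char *
toy_get_match_if_set(const ToyMsg *msg, int index)
{
  if (index < 0 || index >= msg->num_matches)
    return NULL;
  return msg->matches[index];
}

/* Iterator-style query: counts the matches a consumer would see.
 * With filter_out_of_range == 0 it walks the raw storage and ignores
 * num_matches, so stale values leak through. */
static int
toy_count_visible(const ToyMsg *msg, int filter_out_of_range)
{
  int count = 0;

  for (int i = 0; i < TOY_MAX_MATCHES; i++)
    {
      if (msg->matches[i] == NULL)
        continue;
      if (filter_out_of_range && i >= msg->num_matches)
        continue;
      count++;
    }
  return count;
}
```

In this toy model, after toy_set_match(&msg, 1, ...) followed by toy_clear_matches(&msg), the getter returns NULL for index 1, while the unfiltered iterator still reports one visible match. Every query path has to repeat the range check, which is the cost of tracking only a counter.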
After all these iterations, however, I've decided in the end to go back to the original concept and call log_msg_unset() on all matches whenever num_matches would change. This eliminates the need to handle matches separately from other name-value pairs in the query APIs.
This is the patch that reverses most of the above.
commit 855e0c4174ebbba5db05d9395a2d33efb6b1cc4f
Author: Balazs Scheidler <bazsi77@gmail.com>
Date: Tue May 3 13:48:12 2022 +0200
logmsg: unset matches proactively
With that we are failing the cisco-parser() testcase in light, so I am now going to find that issue.
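As a rough illustration of the "unset matches proactively" idea, the sketch below clears the stored values at the moment the match count shrinks, so a plain lookup needs no range check against num_matches. EagerMsg and the eager_* helpers are hypothetical names invented for this example; the real patch works on LogMessage and its name-value table, which is not modeled here.

```c
#include <assert.h>
#include <stddef.h>

#define EAGER_MAX_MATCHES 256

/* Hypothetical stand-in for LogMessage. Invariant kept by the helpers:
 * every slot at index >= num_matches is NULL. */
typedef struct
{
  const char *matches[EAGER_MAX_MATCHES];
  int num_matches;
} EagerMsg;

/* Store a match value and grow the counter if needed. */
static void
eager_set_match(EagerMsg *msg, int index, const char *value)
{
  assert(index >= 0 && index < EAGER_MAX_MATCHES);
  msg->matches[index] = value;
  if (index >= msg->num_matches)
    msg->num_matches = index + 1;
}

/* Change the number of matches; values that fall outside the new range
 * are unset right away instead of being hidden behind a counter. */
static void
eager_truncate_matches(EagerMsg *msg, int new_count)
{
  assert(new_count >= 0 && new_count <= EAGER_MAX_MATCHES);
  for (int i = new_count; i < msg->num_matches; i++)
    msg->matches[i] = NULL;
  msg->num_matches = new_count;
}

/* Plain lookup: no comparison against num_matches is required, because
 * stale values no longer exist in the storage. */
static const char *
eager_get_match(const EagerMsg *msg, int index)
{
  assert(index >= 0 && index < EAGER_MAX_MATCHES);
  return msg->matches[index];
}
```

The work is done once, when the count changes, rather than in every query; any iterator over the storage sees the same state as the getter.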
425227b to
68dbb05
Compare
|
Build SUCCESS |
Signed-off-by: Gabor Nagy <gabor.nagy@oneidentity.com>
…hes_too Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
The input parameters might refer to a borrowed reference we might clobber with log_msg_set_value() calls. In those cases, we save the original value into result.source_value so use that instead of the parameter. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
2b67be9 to
16f1c76
Compare
|
Build SUCCESS |
We only have up to $255 so don't set any more than that. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
LogMessage has traditionally aborted whenever we tried to access match variables that are out-of-range, however with the adoption of set-matches() and the use of match variables for array-like use-cases this is not the right choice anymore; ignore the set/get operation in these cases instead. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
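The two commit messages above about the $255 limit and out-of-range access can be sketched as follows. This is an assumption-laden toy, not the actual change: SafeMsg and the safe_* functions are hypothetical, and the list helper assumes values are stored starting at index 0, which may not match how set-matches() numbers its items.

```c
#include <assert.h>
#include <stddef.h>

#define SAFE_MAX_MATCHES 256 /* indexes 0..255, i.e. up to $255 */

/* Hypothetical stand-in for LogMessage. */
typedef struct
{
  const char *matches[SAFE_MAX_MATCHES];
  int num_matches;
} SafeMsg;

static int
safe_index_ok(int index)
{
  return index >= 0 && index < SAFE_MAX_MATCHES;
}

/* Out-of-range sets are ignored instead of aborting the process. */
static void
safe_set_match(SafeMsg *msg, int index, const char *value)
{
  assert(msg != NULL);
  if (!safe_index_ok(index))
    return;
  msg->matches[index] = value;
  if (index >= msg->num_matches)
    msg->num_matches = index + 1;
}

/* Out-of-range gets read as "unset". */
static const char *
safe_get_match(const SafeMsg *msg, int index)
{
  if (!safe_index_ok(index))
    return NULL;
  return msg->matches[index];
}

/* Store a list as matches, dropping anything past the last valid index.
 * Returns the number of values actually stored. */
static int
safe_set_matches(SafeMsg *msg, const char *const *values, int count)
{
  int stored = count < SAFE_MAX_MATCHES ? count : SAFE_MAX_MATCHES;

  for (int i = 0; i < stored; i++)
    safe_set_match(msg, i, values[i]);
  return stored;
}
```

Treating an oversized list as truncated input rather than a programming error fits the array-like use of match variables, where the length comes from message data and not from a fixed regular expression.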
16f1c76 to
7b9dd74
Compare
|
Build SUCCESS |
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
…obbering() Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
…ndle that as well Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
0c0c8b1 to
30ad5b6
Compare
|
I have found yet another case that clobbers the borrowed input of LogMatcher instances, which caused the change in |
|
Build SUCCESS |
OverOrion
left a comment
LGTM!
Also, thank you for not only addressing the review notes, but writing additional tests for them as well!
|
regexp-parser() still uses Maybe |
gaborznagy
left a comment
Approved. This took a very long time and a lot of effort from all of us, but I think it was really worth it.
Sorry for taking so long. :)
Otherwise I would ask to squash some patches (especially since some features took opposite directions), but that would again require a lot of effort from all of us.
The patches are self-contained (not fixup patches), so we can merge it in this state.
This builds on top of #3888 and implements $* as a list and other match related improvements.
These are the new features: