-
Notifications
You must be signed in to change notification settings - Fork 468
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Fix grouping-by() use through parser references #3957
Conversation
Build FAILURE |
c31fac1
to
7ab37f3
Compare
This pull request fixes 1 alert when merging 7ab37f3ba232312552d821709750bee1681db79f into ba656ba - view on LGTM.com fixed alerts:
|
Build FAILURE |
IRL we discussed this topic and concluded that we shouldn't share state accross multiple references of the same grouping-by() instance. |
I have now removed the state sharing piece, as discussed, so this is ready for review/merge. |
f7544f5
to
753b86a
Compare
Build SUCCESS |
This pull request fixes 1 alert when merging 753b86a7afa1470f30b9aabc17a5e066dc049689 into 9ee50ad - view on LGTM.com fixed alerts:
|
05ae876
to
ca197a5
Compare
Build SUCCESS |
This pull request fixes 1 alert when merging ca197a5a177c08fa067e0361b86fdb72f464c835 into c5e823f - view on LGTM.com fixed alerts:
|
ca197a5
to
9b27547
Compare
Rebased against current master. A review of this would be very much appreciated as I have some patches against grouping-by in parallel branches which have a high probability of conflicting with this one. |
Build SUCCESS |
This pull request fixes 1 alert when merging 9b275471a89c8ae55874836a463533bd57c9d111 into 06fa9d7 - view on LGTM.com fixed alerts:
|
…imer() Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Correlation is a combination of state and associated timers, so move them into the same struct. This is a preparatory step for sharing state between cloned GroupingBy instances. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Since the state hash, timer_wheel and last_tick form the stateful part of correlation, move them into the same struct. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
As that lock protects the state, move it within the structure that holds the state. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Instead of overloading the lock for the correlation state, use a separate lock for the ruleset. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
With the new ruleset_lock added in the last commit, the GRWLock can simply be changed to GMutex. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
This patch contains changes to rename internal functions to use an internal name. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
All non-transaction related methods of CorrelationState do their own locking, so do that even in correlation_state_set_time() Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
The hash used for validating the persist names stored the original pointer as returned by the log_pipe_get_persist_name() function, which is usually a static string. This means that if we have two instances of the same object in our config, its persist name will be formatted in the same spot, effectively changing the key as stored in the GHashTable. Clearly this is not what we want. Make sure we strdup the key before storing it. Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
After careful consideration, we decided not to share the state between grouping-by() clones, which get created if the same parser statement is referenced from multiple parts of the configuration. The rationale is summarized below. Example configuration: @Version: 3.36 parser p_gby { grouping-by( key("$PROGRAM") aggregate( value("aggr" "$(list-slice :-1 $(context-values $mod))") ) timeout(3) ); }; log { source { tcp(port(2000)); }; if ("$(% $SEC 2)" == "0") { rewrite { set("0" value("mod")); }; parser(p_gby); } else { rewrite { set("1" value("mod")); }; parser(p_gby); }; filter { "$aggr" ne "" }; destination { file("aggr.log" template("$(format-json aggr)\n")); }; }; With the example config above, parser object named "p_gby" is referenced twice, in two parallel branches of the configuration: 1) once when the timestamp has an even second value, 2) once when the timestamp has an odd second value If state sharing would be implemented, both references would store messages in the same state object, e.g. aggregation contexts would contain a mixture of messages, received using either of the two branches. This consequence is already pretty difficult to follow and can be confusing. To add to the confusion, another problem comes up. Once the timeout in grouping-by() elapses, which branch would emit the "aggregate" message? Following our existing configuration logic, we should emit the aggregate twice. This means that our destination in the example config above, would receive the same, identical message twice both of which contains a mixture of messages. With that in perspective we decided not to share the state: * this matches other existing cases, db-parser() or rate-limit() does not share state either. * with unshared state, we'd basically have two grouping-by() instances that are configured the same but with distinct states. * each grouping-by() clone would process messages only on its respective branch * each aggregate would contain a distinct set of messages, they are not mixed up Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
Produced by "make style-format" Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
9b27547
to
8a610a3
Compare
Build SUCCESS |
This pull request fixes 1 alert when merging 8a610a3 into 8b7c37e - view on LGTM.com fixed alerts:
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Nice refactors.
We should write a news entry for this.
Signed-off-by: Balazs Scheidler <bazsi77@gmail.com>
added a news entry. thanks for the review. |
This pull request fixes 1 alert when merging 473a5d0 into 4aabae7 - view on LGTM.com fixed alerts:
|
Build SUCCESS |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thank you.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM, thank you!
grouping-by() had an incomplete log_pipe_clone() implementation, which meant that whenever we were using a grouping-by() through a reference (e.g. not as an inline block), not all settings propagated to different references.
This branch is a much larger than I anticipated as
it also implements state sharing between these clones andrefactors a lot of similarities between db-parser() and grouping-by() as they share a common infrastructure.I added a test to get at least some unit test coverage for grouping-by(), but that's still far from complete. db-parser() has a more complete testsuite and a lot of the code is shared between the code. I do hope that @kira-syslogng and light has some end-to-end tests.
Albeit this branch makes it possible it does not add similar state sharing to db-parser(), yet. I opened the PR as I felt I needed to get some feedback (from both reviewers and kira).Also, it's a tricky question which of the clones would handle timer expirations. I do know the answer, but it's a great question to answer if you are doing the review.I've used this config to test: