Owner

@hlein hlein commented Nov 17, 2025

Various cleanups and improvements to parser regexes in .yaml config files.


Enter [N/A] in the box, if an item is not applicable to your change.

Testing
Before we can approve your change, please submit the following in a comment:

  • Example configuration file for the change <- the commits are config files
  • Debug log output from testing the change
  • [N/A] Attached Valgrind output that shows no leaks or memory corruption was found

If this is a change to packaging of containers or native binaries then please confirm it works for all targets.

  • [N/A] Run local packaging test showing all targets (including any new ones) build.
  • [N/A] Set ok-package-test label to test for all targets (requires maintainer to do).

Documentation

  • [N/A] Documentation required for this feature

Backporting

  • Backport to latest stable release.

Does not really need backporting. Similar changes might be needed in .conf files if they are not pruned (see fluent#11161 (comment))


Fluent Bit is licensed under Apache 2.0, by submitting this pull request I understand that this code will be released under the terms of that license.

Owner Author

hlein commented Nov 17, 2025

Multiple test runs.

Simplified output after the single-quote ('') standardization of the regexes:

fluent-bit/tests/internal/data/config_format/convert $ ./run_tests.sh | egrep '^##'
### 'parser_custom.external-dns.test'
### OK 'parser_custom.external-dns.test'
### 'parser_custom.neo4j.test'
### OK 'parser_custom.neo4j.test'
### 'parser_custom.rabbitmq.test'
### OK 'parser_custom.rabbitmq.test'
### 'parsers.cri.test'
### OK 'parsers.cri.test'
### 'parsers.envoy.test'
### OK 'parsers.envoy.test'
### 'parsers.istio-envoy-proxy.test'
### OK 'parsers.istio-envoy-proxy.test'
### 'parsers.k8s-nginx-ingress.test'
### OK 'parsers.k8s-nginx-ingress.test'
### 'parsers.kmsg-netfilter-log.test'
### OK 'parsers.kmsg-netfilter-log.test'
### 'parsers.syslog-rfc3164.test'
### OK 'parsers.syslog-rfc3164.test'
### 'parsers_cinder.ceph.test'
### OK 'parsers_cinder.ceph.test'
### 'parsers_extra.chefclient.test'
### OK 'parsers_extra.chefclient.test'
### 'parsers_extra.couchbase_java_multiline.test'
### OK 'parsers_extra.couchbase_java_multiline.test'
### 'parsers_extra.couchbase_simple_log_mixed.test'
### OK 'parsers_extra.couchbase_simple_log_mixed.test'
### 'parsers_extra.couchbase_simple_log_utc.test'
### OK 'parsers_extra.couchbase_simple_log_utc.test'
### 'parsers_extra.crowbar.test'
### OK 'parsers_extra.crowbar.test'

kmsg-netfilter-log after the first commit changing the regex to (?x) multiline but otherwise unchanged:

fluent-bit/tests/internal/data/config_format/convert $ ./run_tests.sh parsers.kmsg-netfilter-log.test -v
### 'parsers.kmsg-netfilter-log.test'
CONF: {"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp1s0","macsrc":"94:2b:f3:a0:10:af","macdst":"00:00:5e:00:01:2b","ethtype":"08:00","saddr":"192.168.1.123","daddr":"192.168.1.1","len":"40","tos":"0x00","prec":"0x00","ttl":"239","id":"34391","proto":"TCP","sport":"42694","dport":"10005","window":"1024","res":"0x00","flag":"SYN","urgp":"0"}
{"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp3s0.12","out":"enp3s0.11","macsrc":"94:2b:f3:a0:10:af","macdst":"9c:b6:d0:d6:a1:af","ethtype":"08:00","saddr":"192.168.91.200","daddr":"192.168.96.3","len":"152","tos":"0x00","prec":"0x00","ttl":"63","id":"7769","proto":"UDP","sport":"41641","dport":"41641","protolen":"132"}
YAML: {"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp1s0","macsrc":"94:2b:f3:a0:10:af","macdst":"00:00:5e:00:01:2b","ethtype":"08:00","saddr":"192.168.1.123","daddr":"192.168.1.1","len":"40","tos":"0x00","prec":"0x00","ttl":"239","id":"34391","proto":"TCP","sport":"42694","dport":"10005","window":"1024","res":"0x00","flag":"SYN","urgp":"0"}
{"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp3s0.12","out":"enp3s0.11","macsrc":"94:2b:f3:a0:10:af","macdst":"9c:b6:d0:d6:a1:af","ethtype":"08:00","saddr":"192.168.91.200","daddr":"192.168.96.3","len":"152","tos":"0x00","prec":"0x00","ttl":"63","id":"7769","proto":"UDP","sport":"41641","dport":"41641","protolen":"132"}
### OK 'parsers.kmsg-netfilter-log.test'
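
For readers unfamiliar with (?x), here is a minimal sketch of why that kind of rewrite can leave behavior unchanged. It uses Python's re module purely for illustration (Fluent Bit uses its own regex engine, whose details may differ), and the pattern and group names are hypothetical, not the ones in the parser file. Under (?x), unescaped whitespace and # comments in the pattern are ignored, so a literal space has to be escaped or bracketed.

```python
import re

# Hypothetical one-line pattern, in the style of the original regex.
compact = re.compile(r"IN=(?P<iface_in>\S*) OUT=(?P<iface_out>\S*)")

# The same pattern in extended mode: layout and comments are ignored.
extended = re.compile(r"""(?x)
    IN=(?P<iface_in>\S*)     # inbound interface, may be empty
    [ ]                      # a literal space must be bracketed or escaped
    OUT=(?P<iface_out>\S*)   # outbound interface, may be empty
""")

sample = "IN=enp3s0.12 OUT=enp3s0.11"
print(compact.search(sample).groupdict())
print(extended.search(sample).groupdict())
```

Both patterns should yield the same captures, which is what the identical CONF and YAML output above reflects for the real parser.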

After the final commit, behavior changes:

fluent-bit/tests/internal/data/config_format/convert $ ./run_tests.sh parsers.kmsg-netfilter-log.test -v
### 'parsers.kmsg-netfilter-log.test'
CONF: {"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp1s0","macsrc":"94:2b:f3:a0:10:af","macdst":"00:00:5e:00:01:2b","ethtype":"08:00","saddr":"192.168.1.123","daddr":"192.168.1.1","len":"40","tos":"0x00","prec":"0x00","ttl":"239","id":"34391","proto":"TCP","sport":"42694","dport":"10005","window":"1024","res":"0x00","flag":"SYN","urgp":"0"}
{"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp3s0.12","out":"enp3s0.11","macsrc":"94:2b:f3:a0:10:af","macdst":"9c:b6:d0:d6:a1:af","ethtype":"08:00","saddr":"192.168.91.200","daddr":"192.168.96.3","len":"152","tos":"0x00","prec":"0x00","ttl":"63","id":"7769","proto":"UDP","sport":"41641","dport":"41641","protolen":"132"}
YAML: {"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp1s0","macdst":"94:2b:f3:a0:10:af","macsrc":"00:00:5e:00:01:2b","ethtype":"08:00","saddr":"192.168.1.123","daddr":"192.168.1.1","len":"40","tos":"0x00","prec":"0x00","ttl":"239","id":"34391","proto":"TCP","sport":"42694","dport":"10005","window":"1024","res":"0x00","flag":"SYN","urgp":"0"}
{"date":1234567890.123456,"pri":"4","host":"gw-stl-a1","logprefix":"FIREWALL","in":"enp3s0.12","out":"enp3s0.11","macdst":"94:2b:f3:a0:10:af","macsrc":"9c:b6:d0:d6:a1:af","ethtype":"08:00","saddr":"192.168.91.200","daddr":"192.168.96.3","len":"152","tos":"0x00","prec":"0x00","ttl":"63","id":"7769","proto":"UDP","sport":"41641","dport":"41641","protolen":"132"}
### FAIL 'parsers.kmsg-netfilter-log.test'

But of course that's not really a fail, because macsrc/macdst flipping is the point.

Same thing with syslog-rfc5424, we get another match now:

fluent-bit/tests/internal/data/config_format/convert $ ./run_tests.sh parsers.syslog-rfc5424.test -v
### 'parsers.syslog-rfc5424.test'
CONF: {"date":1234567890.123456,"pri":"34","time":"2003-10-11T22:14:15.003Z","host":"mymachine.example.com","ident":"su","pid":"-","msgid":"ID47","extradata":"-","message":"BOM'su root' failed for lonvick on /dev/pts/8"}
{"date":1234567890.123456,"pri":"165","time":"2003-10-11T22:14:15.003Z","host":"mymachine.example.com","ident":"evntslog","pid":"-","msgid":"ID47","extradata":"[exampleSDID@32473 iut=\"3\" eventSource= \"Application\" eventID=\"1011\"]","message":"BOMAn application event log entry..."}
YAML: {"date":1234567890.123456,"pri":"34","time":"2003-10-11T22:14:15.003Z","host":"mymachine.example.com","ident":"su","pid":"-","msgid":"ID47","extradata":"-","message":"BOM'su root' failed for lonvick on /dev/pts/8"}
{"date":1234567890.123456,"pri":"165","time":"2003-10-11T22:14:15.003Z","host":"mymachine.example.com","ident":"evntslog","pid":"-","msgid":"ID47","extradata":"[exampleSDID@32473 iut=\"3\" eventSource= \"Application\" eventID=\"1011\"]","message":"BOMAn application event log entry..."}
{"date":1234567890.123456,"pri":"165","time":"2003-10-11T22:14:15.003Z","host":"mymachine.example.com","ident":"evntslog","pid":"-","msgid":"ID47","extradata":"[exampleSDID@32473 iut=\"3\" eventSource= \"Application\" eventID=\"1011\"][examplePriority@32473 class=\"high\"]"}
### FAIL 'parsers.syslog-rfc5424.test'

iptables tests after the latest improvements:

fluent-bit/tests/internal/data/config_format/convert $ ./run_tests.sh parsers_extra.iptables.test -c
### 'parsers_extra.iptables.test': 12 lines
CONF:      11      13    4310
YAML:      12      14    4909
### NO MATCH 'parsers_extra.iptables.test'

hlein added 15 commits November 19, 2025 22:28
Single-quote regexes and do not use unnecessary / / delimiters.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
No change in behavior, confirmed w/test harness.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
Linux kernel uptime timestamps use "[%5lu.%06lu]", meaning there
are leading spaces inside the [ ] until uptime reaches 10,000 secs.
The existing test-cases both have 6-digit seconds, so this wasn't
noticed.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
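
To illustrate the padding issue, a small sketch in Python (used only for demonstration; the pattern and names below are assumptions, not the parser's actual regex). Python's % formatting follows the same width rules as the kernel's "[%5lu.%06lu]", so short uptimes come out with leading spaces that a pattern must tolerate.

```python
import re

# Hypothetical pattern: optional whitespace after "[" covers short uptimes.
KTIME = re.compile(r"\[\s*(?P<sec>\d+)\.(?P<usec>\d{6})\]")

# A stricter pattern with no allowance for padding, for comparison.
KTIME_STRICT = re.compile(r"\[(?P<sec>\d+)\.(?P<usec>\d{6})\]")

def format_ktime(sec, usec):
    # Same layout as "[%5lu.%06lu]": seconds padded to width 5 with spaces.
    return "[%5d.%06d]" % (sec, usec)

for uptime in (42, 9999, 123456):
    stamp = format_ktime(uptime, 789)
    print(repr(stamp),
          "padded-ok:", KTIME.match(stamp) is not None,
          "strict-ok:", KTIME_STRICT.match(stamp) is not None)
```

Only the 6-digit uptime matches the strict pattern, which is consistent with the existing test cases not catching the problem.
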
Linux firewall logs' MAC= field is in wire order - dst MAC, then
src MAC, then ethertype.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
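
A minimal sketch of splitting the field in wire order, again in Python for illustration only. The pattern is hypothetical, and the sample MAC= value is reconstructed from the test output earlier in this thread rather than copied from the test input.

```python
import re

# Hypothetical pattern: 6 octets of dst MAC, 6 of src MAC, 2 of ethertype.
MAC_FIELD = re.compile(
    r"MAC=(?P<macdst>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}):"
    r"(?P<macsrc>(?:[0-9a-f]{2}:){5}[0-9a-f]{2}):"
    r"(?P<ethtype>[0-9a-f]{2}:[0-9a-f]{2})"
)

# Reconstructed sample: the first 6 octets are the destination.
sample = "MAC=94:2b:f3:a0:10:af:00:00:5e:00:01:2b:08:00"
fields = MAC_FIELD.search(sample).groupdict()
print(fields)
```

With this ordering, 94:2b:f3:a0:10:af lands in macdst and 00:00:5e:00:01:2b in macsrc, matching the YAML output after the final commit.
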
Pulled these out of https://hackmd.io/@njjack/syslogformat, I _think_
all three are valid. The current pattern only matches on the first two.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
See https://hackmd.io/@njjack/syslogformat

With this change, we match the third test-case as well.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
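
As a rough sketch of what the third test case needs: the structured-data field is either "-" or one or more bracketed elements back to back. The Python pattern below is an assumption made for illustration and is not the regex from this change; the sample strings are taken from the test output in the earlier comment.

```python
import re

# Hypothetical pattern: "-" (nil) or one or more [ ... ] elements in a row.
# Inside an element, a backslash escapes the next character, e.g. "\]".
SD = re.compile(r'(?P<extradata>-|(?:\[(?:\\.|[^\]\\])*\])+)')

samples = [
    '-',
    '[exampleSDID@32473 iut="3" eventSource= "Application" eventID="1011"]',
    '[exampleSDID@32473 iut="3" eventSource= "Application" eventID="1011"]'
    '[examplePriority@32473 class="high"]',
]
for s in samples:
    print(SD.fullmatch(s) is not None, s[:40])
```

A pattern that allows only a single bracketed element would stop after the first "]" in the third sample, which may be why that line did not match before.
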
… case

Signed-off-by: Hank Leininger <hlein@korelogic.com>
No change in behavior, confirmed w/test harness.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
Signed-off-by: Hank Leininger <hlein@korelogic.com>
Signed-off-by: Hank Leininger <hlein@korelogic.com>
A strict OK/FAIL verdict on whether the before and after outputs
match is less useful when we are making improvements that change
behavior on purpose. Change the labels to MATCH/NO MATCH, and also
add -c, which shows the wc of the output lines; that makes it much
easier to confirm when updates cause more tests to match successfully.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
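
The idea can be sketched as follows. This is a hypothetical Python stand-in, not the logic of run_tests.sh itself, and the function names are invented for illustration; it compares the two outputs and reports line, word, and byte counts in the order wc prints them.

```python
def wc(text):
    # (lines, words, bytes), in the same order that `wc` prints them.
    return text.count("\n"), len(text.split()), len(text.encode("utf-8"))

def compare(name, conf_out, yaml_out):
    # Label the result without implying that a difference is a failure.
    label = "MATCH" if conf_out == yaml_out else "NO MATCH"
    report = [
        "CONF: %7d %7d %7d" % wc(conf_out),
        "YAML: %7d %7d %7d" % wc(yaml_out),
        "### %s '%s'" % (label, name),
    ]
    return "\n".join(report)

# Made-up outputs: the second run parses one more line than the first.
print(compare("example.test", '{"a":"1"}\n', '{"a":"1"}\n{"a":"2"}\n'))
```

A higher line count on the YAML side, as in the iptables run above, suggests that more input lines were parsed.
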
Also more whitespace and comments for legibility.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
The previous implementation would clobber earlier matches and only
remember one. This field name is now a misnomer; pkt_flags or
individual ones for pkt_{cwr,ece,urg...} might be better, but would
break backwards compat more substantially.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
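
One way such clobbering can happen, shown with Python's re module as an illustration: a named group inside a repetition keeps only its last iteration. Both patterns and the sample line are hypothetical, and this is not necessarily how the previous parser regex was written.

```python
import re

# Made-up log fragment with several TCP flags set.
line = "RES=0x00 ACK PSH FIN URGP=0"

# A named group inside a repetition keeps only its last iteration.
clobbering = re.compile(r"RES=\S+ (?:(?P<flag>[A-Z]+) )+URGP=")

# Wrapping the whole run of flags in one group keeps all of them.
keeping = re.compile(r"RES=\S+ (?P<flag>[A-Z]+(?: [A-Z]+)*) URGP=")

print(clobbering.search(line).group("flag"))
print(keeping.search(line).group("flag"))
```

The second form returns the flags as a single space-separated string, which is why a singular field name starts to read as a misnomer.
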
…rted

Other firewall-building tools (UFW, firewalld, etc.) craft their
line prefix differently, causing the rest of the regex to fail.
Also some errors and more exotic message contents.

Signed-off-by: Hank Leininger <hlein@korelogic.com>
Address prefixes added by different tools, parse payloads in
ICMP errors, support --log-uid logs, etc. Note that this _does_
change or rename a few fields, it is not strictly additive.
Also switch regex reference to one that matches fluent-bit behavior.

Signed-off-by: Hank Leininger <hlein@korelogic.com>