Scheduled Release 8.2112.0 (aka 2021.12) 2021-12-??
- 2021-11-22: new contribution: URL parser module function using libfa
Thanks to Théo Bertin for the patch.
- 2021-11-18: mmanon: relax IPv6 detection - improve anonymization
We so far tried to ensure a value is really an IPv6 address, in order
to avoid mangling merely similar-looking information elements.
However, this led to misdetection for unusual formats, e.g. when a
port is appended to a numerical IPv6 address given without brackets [].
This has been changed now. In a sense, we now prefer to err on the
side of privacy.
Previously, a suspect value was not anonymized, and thus some other
elements (like some MAC addresses) were preserved. Now the opposite is
true, and we anonymize anything that looks close enough to be an
IPv6 address. This improves anonymization.
- 2021-11-10: ruleset bugfix: ruleset queue was incorrectly named
The ruleset was incorrectly and unusably named. This was a regression
from 4a63f8e9629c3c9481a8b6f9d7787e3b3304320b.
Many thanks to github user digirati82 for alerting us.
- 2021-11-10: omsnmp: update module to current IP best practices
The omsnmp module uses the inet_addr() function to convert the Internet host address
from IPv4 numbers-and-dots notation into binary data in network byte order. If the input
is invalid, INADDR_NONE (usually -1) is returned. Use of this function is problematic
because -1 is a valid address (255.255.255.255). We should avoid its use in favor of
inet_aton(), inet_pton(3), or getaddrinfo(3), which provide a cleaner way to indicate
error return [1].
This is just a request to satisfy covscan, so no error is reported at all.
Thanks to Attila Lakatos for the patch.
- 2021-10-27: ommysql: fix threading bug
When the MariaDB connection was (re)established, an old or NULL handle
could be used. This is fixed now.
We need to synchronize access to the mysql handle, because multiple threads
use it and we may need to (re)init it during processing. This could lead to
races with potentially wrong addresses or NULL accesses. Whether this really
matters depends mostly on the MariaDB/MySQL client library. It looks like
they guard against fatal failures. Anyhow, logging errors inside rsyslog
could happen in any case.
- 2021-10-25: testbench: false positive when impstats was not built
Test omfwd_fast_imuxsock failed when impstats was not built. This
has been corrected; the test is now only executed when impstats is built.
- 2021-10-25: imtcp: add support for permittedPeers setting at input() level
The permittedPeers setting was actually forgotten during the refactoring
of TLS input() level settings. This functionality is now added.
Scheduled Release 8.2110.0 (aka 2021.10) 2021-10-19
- 2021-10-13: config bugfix: global(security.abortonidresolutionfail=) did not work
When used with rscript-based configuration, it was not checked.
- 2021-10-13: config bugfix: global param $privDropToUser did not work correctly
The parameter was not implemented for rscript based configuration and
did not properly apply to legacy configuration. In essence, it almost always
did not work as expected.
- 2021-10-12: rscript bugfix: ruleset called async when ruleset had queue.type="direct"
The call rscript statement is able to call a rule set either synchronously or
asynchronously. We did this, because practice showed that both modes
are needed. For various reasons we decided to make async
calls if the ruleset has a queue assigned and sync if not.
To know if a "queue is assigned" we just checked if queue parameters were
given. We overlooked the case of someone explicitly specifying a
"direct queue", aka "no queue". As such, queue.type="direct" triggered async
calls. That in turn meant that when a write operation to a variable was
made inside that rule set, other rulesets could or could not see the
write. While it was often not seen, this was a data race where the
change could also be seen by the outside.
This is now fixed. No matter if queue.type="direct" is specified or
left out, the call will always be synchronous. Any values written to
variables will also be seen by the "outside world" in later processing.
Note that this has some potential to BREAK EXISTING CONFIGURATIONS.
We deem this acceptable because:
1. this was racy in any case, so unexpected behaviour could always occur
2. it is actually unlikely that someone used the triggering conditions
in practice. But we cannot rule this out, especially when the
configuration was auto-generated.
Potential compatibility issues can be solved by defining a small
array-memory queue on the ruleset in question instead of specifying
direct type.
Again, we expect that almost all users will never experience any
problems. If you do, however, please let us know: we may add an
option to re-enable the bug.
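The workaround described above could look roughly like the following sketch. The ruleset name, queue size and file path are hypothetical and chosen only for illustration; "FixedArray" is the array-memory queue type.

```
# hypothetical ruleset that should keep being called asynchronously:
# give it a small in-memory array queue instead of queue.type="direct"
ruleset(name="async_rs" queue.type="FixedArray" queue.size="1000") {
    action(type="omfile" file="/var/log/async_rs.log")
}

# with a real queue assigned, this call is asynchronous;
# without queue parameters (or with queue.type="direct") it is synchronous
call async_rs
```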
- 2021-10-12: ksi bugfix: locking bug fixed in rsksiCtxOpenFile
Thanks to Taavi Valjaots for the patch.
- 2021-10-11: core bugfix: fix typo in error message
Thanks to github user jkschulz for the patch.
- 2021-10-11: tcpsrv bugfix: compilation without exceptions
tcpsrv.c:992:1: error: label at end of compound statement
Quoting from pthread.h:
pthread_cleanup_push and pthread_cleanup_pop are macros and must always
be used in matching pairs at the same nesting level of braces.
Amends commit bcdd220142ec9eb106550195ba331fd114adb0bd.
Thanks to Orgad Shaneh for the patch.
- 2021-10-11: mmkubernetes bugfix: no connection retry to kubernetes API
When connection to the kubernetes API was not possible, mmkubernetes
did not retry. This does now happen via regular rsyslog retry
Thanks to github user jayme-github for the analysis and patch.
- 2021-10-11: openssl bugfix: Correct gnutlsPriorityString (custom ciphers) behaviour
- Only apply default anon ciphers if gnutlsPriorityString is NULL and
Authentication Mode is set to anon. Otherwise we do not set them
as they overwrite custom Ciphers.
- Added two tests for custom cipher configuration (anon/certvalid mode).
- Add call for applyGnutlsPriorityString if gnutlsPriorityString changes.
- Merged openssl init code from Connect into osslInitSession
- 2021-10-11: build issue: handle undefined MAXPATHLEN, PATH_MAX
While we handled missing PATH_MAX, we did not handle missing MAXPATHLEN.
This happens under GNU/Hurd, because there is no official limit. However,
extremely long paths are extremely uncommon, so we do not want to
use slow dynamic alloc each time we need to build paths. So we
impose a limit of 4KiB, which should be sufficient. Note that
this obviously increases stack requirements in GNU/Hurd.
As suggested by Michael Biebl, we have now implemented a generic
approach to handle this via autoconf.
- 2021-09-12: openssl: extended output information on connection failure
Now includes the remote client/server IP address in the log output.
- 2021-09-12: imhttp enhancements - query parameter ingestion & basic auth support
- Basic Authentication support & tests
* configured via imhttp option "basicAuthFile". This option should be configured
to point to your htpasswd file generated via a standard htpasswd tool.
- Query parameter ingestion capability & tests
use the `addmetadata` option to inject query parameters into
metadata for imhttp input.
libaprutil (libaprutil1-dev on debian'ish, apr-util-devel on Red Hat)
Thanks to Nelson Yen for the patch.
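A minimal sketch of how the two options might be combined on one input. The endpoint, ruleset name and file paths are hypothetical; "basicAuthFile" and "addmetadata" are the options named above, and the htpasswd file is assumed to exist already.

```
module(load="imhttp")

# hypothetical ruleset receiving the HTTP input
ruleset(name="http_auth_in") {
    action(type="omfile" file="/var/log/http_auth_in.log")
}

input(type="imhttp"
      endpoint="/secure-ingest"
      ruleset="http_auth_in"
      basicAuthFile="/etc/rsyslog.d/htpasswd"   # generated with a standard htpasswd tool
      addmetadata="on")                         # inject query parameters into metadata
```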
- 2021-09-07: testbench bugfix: privdrop tests under root user did not work
When running under root, the privdrop tests did not work properly. This
patch fixes the issue and skips tests where necessary.
This also includes some modernization of the related tests.
- 2021-09-07: core/ratelimiting: fix rate limiting for already parsed messages
Rate limiting may not have worked if the considered message had already
been parsed (not having NEEDS_PARSING in msgFlags).
This also affects imuxsock in its default configuration
(useSpecialParser="true" and ratelimit.severity="1")
- 2021-09-07: core bugfix: use of property $wday terminates string
When $wday is used inside a template, all template parts after it
are ignored. For example:
template(name="json_filename" type="string" string="/var/log/%$wday%.log")
would generate something like "/var/log/0" - the ".log" part would be
missing. For the same reason, $wday cannot be reliably checked in script.
Thanks to Alain Thivillon for reporting the bug and providing an
excellent analysis, which essentially was exactly this fix here.
- 2021-09-07: core/queue bugfix: potential misaddressing when queue discarded messages
When a discard mark was set and the queue was very busy and discarding messages, a
NULL pointer access could happen. Depending on circumstances, several problems
could occur, including a SEGFAULT. This is now fixed.
- 2021-09-07: imdiag bugfix: iOverallQueueSize calculation could be incorrect
This issue only affects testbench and rsyslog development debugging. The active
messages counter, used for synchronizing test steps, went wrong when the queue
discarded messages on its consumer thread. Now fixed.
- 2021-09-06: gnutls driver: SAN priority did not work correctly on server side
PrioritizeSAN was not propagated when accepting a new connection, this is now fixed.
Thanks to Attila Lakatos for the patch.
- 2021-08-24: config: implement script-equivalent for $PrivDrop* statements
Scheduled Release 8.2108.0 (aka 2021.08) 2021-08-17
- 2021-08-16: openssl tls: Improved error message output on tls failures.
- 2021-08-16: impstats: add percentile metrics tracking functionality
Brief overview:
To configure tracking of percentile metrics in rainerscript, the user
needs to define:
- which percentiles to track, such as [p50, p99, etc.]
- window size - note, this correlates directly with memory usage to
track the percentiles.
To track a value, the user calls the built-in function `percentile_observe()` in their configuration to
record an integer value, and percentile metrics are emitted every
impstats interval.
Thanks to Nelson Yen for the patch.
- 2021-08-12: imfile: add parameter "ignoreolderthanoption"
Instructs imfile not to ingest a file that has not been modified in the
specified number of seconds.
Thanks to github user yanjunli76 for the patch (submitted from Nelson Yen)
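A minimal sketch of such an input. The spelling "ignoreOlderThan" for the parameter is an assumption based on the imfile documentation; the file pattern, tag and the value of one day are hypothetical.

```
module(load="imfile")

# skip files that have not been modified within the last 86400 seconds (one day)
input(type="imfile"
      File="/var/log/app/*.log"
      Tag="app:"
      ignoreOlderThan="86400")
```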
- 2021-08-10: imklog bugfix: invalid memory addressing, could cause abort
This is a regression from commit 94c4a87. It introduced a free() call
using an object that was no longer valid: the main pointer to the
to-be-freed object was already freed at time of use. This could
cause various issues, including a segfault.
Note: this bug was triggered only during the late phase of rsyslog
shutdown, so it did not affect regular operation.
Special thanks to github user wxiaoguang for analyzing the issue
and providing a draft fix proposal, on which this patch builds.
- 2021-08-09: imfile bugfix: deleteStateOnFileDelete missed some state files
When the log file is deleted, imfile would attempt to delete the statefile but it
was missing the file_id part of the statefile name. This means the statefiles were
only removed if the log file was smaller than 512 characters, because for very small
files the file ID hash is not created. This led to some state files not being deleted.
Thanks to pearseimperva for the patch.
- 2021-08-09: imfile bugfix: hash char invalidly added in readmode != 0
If imfile is ingesting log files with readMode set to 2 or 1, the resulting
messages all have a '#' character at the end. This patch corrects the behaviour.
Note: if some external script "supported" the bug of extra hash character at
the end of line, it may be necessary to update it.
- 2021-08-09: omelasticsearch bugfix: errorFile mutex was not consistently locked
Lock the file during SIGHUPs to avoid issues with concurrent accesses by
Thanks to François Poirotte for the patch.
- 2021-08-09: imudp: add socket type (IPv4 vs. 6) to input name
Most importantly, the input name is used for stats counter names as
well. Previously, the same name was used for IPv4 and IPv6, so we had
two counters with an equal name. That left users puzzled.
Unfortunately, this change can potentially require changes to existing
analysis scripts, as the name is now slightly different.
- 2021-08-06: omfwd: add capability for action-specific TLS certificate settings
This permits overriding the global definitions for TLS certificates
at the action() level.
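As a rough sketch, an action with its own certificate set might look as follows. Target, peer name and file paths are hypothetical; the StreamDriver.* file parameters are the action-level overrides as we understand them from the omfwd documentation.

```
action(type="omfwd"
       target="logs.example.net" port="6514" protocol="tcp"
       StreamDriver="gtls"
       StreamDriverMode="1"
       StreamDriverAuthMode="x509/name"
       StreamDriverPermittedPeers="logs.example.net"
       # action-specific certificates, overriding the global() definitions
       StreamDriver.CAFile="/etc/rsyslog.d/tls/ca.pem"
       StreamDriver.CertFile="/etc/rsyslog.d/tls/client-cert.pem"
       StreamDriver.KeyFile="/etc/rsyslog.d/tls/client-key.pem")
```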
- 2021-08-06: imfile bugfix: file handle leak if "freshStartTail" was turned on
- 2021-08-05: imtcp: permit to use different certificate files per input/action
This completes the ability to override global/default TLS settings at the imtcp
input() level. Support for using multiple CAs/Certs per Connection is now provided.
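A sketch of two TLS listeners using different certificates. Ports and file paths are hypothetical, and the streamDriver.* names are taken from the imtcp documentation as we understand it.

```
module(load="imtcp")

# listener for internal clients, hypothetical internal CA
input(type="imtcp" port="6514"
      streamDriver.name="gtls" streamDriver.mode="1" streamDriver.authMode="x509/name"
      streamDriver.CAFile="/etc/rsyslog.d/tls/internal-ca.pem"
      streamDriver.CertFile="/etc/rsyslog.d/tls/internal-cert.pem"
      streamDriver.KeyFile="/etc/rsyslog.d/tls/internal-key.pem")

# listener for partner systems, hypothetical partner CA
input(type="imtcp" port="6515"
      streamDriver.name="gtls" streamDriver.mode="1" streamDriver.authMode="x509/name"
      streamDriver.CAFile="/etc/rsyslog.d/tls/partner-ca.pem"
      streamDriver.CertFile="/etc/rsyslog.d/tls/partner-cert.pem"
      streamDriver.KeyFile="/etc/rsyslog.d/tls/partner-key.pem")
```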
- 2021-08-04: imptcp bugfix: keep alive interval was incorrectly set
The interval was accidentally set to keep alive interval. This has been corrected.
- 2021-07-08: openssl network driver bugfix: small memory leak
Fixes a static, non-growing memory leak which existed when parameter
"GnuTLSPriorityString" was used. This was primarily a cosmetic issue,
but caused some grief during development in regard to memory leak
Note: yes, this is for openssl -- the parameter name is historical.
- 2021-07-07: tcpsrv bugfix: abort if no listener could be started
Modules (like imtcp and imdiag) which use tcpsrv could abort or
otherwise malfunction if no listener for a specific input could
be started.
Found while implementing a new feature, no report from practice.
But could very well happen.
- 2021-07-07: mmkubernetes bugfix: apiserver error handling
- Added graceful handling of apiserver errors with unexpected responses,
i.e., anything other than 200, 404, or 429. Idea is that apiserver
transient error state will recover. We don't want mmkubernetes to miss
metadata resolution for containers that don't have cached metadata.
During these transient error states, mmkubernetes will provide basic
container file path based resolution of namespace and pod metadata for
new pods whose metadata is not yet cached. After this error state
recovers, mmkubernetes is expected to resume its metadata resolution as
- Added a unit test case for apiserver return 500 with changes to mock server
- Fixed existing unit test that was failing due to missing expected results file
- Added mmkubernetes unit tests to testbench
Thanks to Abdul Waheed for the patch (submitted from Nelson Yen).
- 2021-07-07: ommongodb bugfixes
- Fix Segmentation fault when server is down
- Add server connection check while resuming
Thanks to Kevin Guillemot for the patch.
- 2021-06-28: omkafka improvements
- drain librdkafka queues and retry later during rsyslog restart or hup. This
re-injects messages into rsyslog's native queues.
- add statsname on per kafka instance for better visibility
- omkafka - count errors related to ssl as "errors_ssl"
Thanks to Nelson Yen for the patch.
- 2021-06-23: some CI/QA improvements, Travis-CI disabled
For the time being, Travis CI is disabled because it was outdated and Travis also
changed their system. We will re-evaluate if we re-enable it. For quite a while
the Travis tests were redundant with the rest of CI, so this does not reduce
- 2021-06-23: omhttp bugfix: dynrestpath param in batch mode invalid
When batchmode was used, the templates could not be used to
expand dynrestpath. We are now storing the restpath param
within the batch data if we are in batch mode.
When we are in batch mode, and the restpath value changes, the
batch is submitted and reinitialized
- 2021-06-17: add predefined template RSYSLOG_SyslogRFC5424Format
This is essentially the same as RSYSLOG_SyslogProtocol23Format with
a better name and a fix to remove the unnecessary LF at the end of
the message.
The different name also enables us to fix the LF issue without
any concern about backwards compatibility.
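Using the new template only requires naming it on an action; the output file path below is hypothetical.

```
# write messages in RFC 5424 format, without the extra trailing LF
action(type="omfile"
       file="/var/log/rfc5424.log"
       template="RSYSLOG_SyslogRFC5424Format")
```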
- 2021-06-17: impstats/bugfix: _sender_stats reports integer counter as string
Note that this introduces a small backwards incompatibility: in previous output
the field was of string type, now it is integer (as intended). We discussed this
on the mailing list and the overwhelming thought was that this is not a problem
because almost all analysis backends are able to cover that format change. This made
the bugfix essentially cosmetic.
HOWEVER, if you still experience issues, please let us know. We can add an option
to provide the previous format, and only refrained from doing so because there was no
evidence it was needed.
Scheduled Release 8.2106.0 (aka 2021.06) 2021-06-15
NOTE: the prime new feature is support for TLS and non-TLS connections
via imtcp in parallel. Furthermore, most TLS parameters can now be overridden
at the input() level. The notable exceptions are certificate files, something
that is due to be implemented as next step.
- 2021-06-14: new global option "parser.supportCompressionExtension"
This permits to turn off rsyslog's single-message compression extension
when it interferes with non-syslog message processing (the parser
subsystem expects syslog messages, not generic text)
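A minimal sketch of turning the extension off, assuming the option takes the usual on/off values:

```
# disable single-message compression detection for setups that
# also process non-syslog text
global(parser.supportCompressionExtension="off")
```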
- 2021-05-12: imtcp: add more override config params to input()
It is now possible to override all module parameters at the input() level. Module
parameters serve as defaults. Existing configs need no modification.
- 2021-05-06: imtcp: add stream driver parameter to input() configuration
This permits to have different inputs use different stream drivers
and stream driver parameters.
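A sketch of a plain and a TLS listener side by side. Port numbers are hypothetical, and the certificate settings for the gtls driver are assumed to be configured elsewhere (for example in global()).

```
module(load="imtcp")

# plain TCP listener, default (ptcp) stream driver
input(type="imtcp" port="514")

# TLS listener on a second port, using the gtls stream driver
input(type="imtcp" port="6514"
      streamDriver.name="gtls"
      streamDriver.mode="1"
      streamDriver.authMode="anon")
```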
- 2021-04-29: imtcp: permit to run multiple inputs in parallel
Previously, a single server was used to run all imtcp inputs. This
had a couple of drawbacks. First and foremost, we could not use
different stream drivers in the various inputs. This patch now
provides a baseline to do that, but still does not implement the
capability (in this sense it is a staging patch).
Secondly, we now ensure that each input has at least one exclusive
thread for processing, untangling the performance of multiple
inputs from each other.
- 2021-04-27: tcpsrv bugfix: potential sluggishness and hang on shutdown
tcpsrv is used by multiple other modules (imtcp, imdiag, imgssapi, and,
in theory, also others - even ones we do not know about). However, the
internal synchronization did not properly take multiple tcpsrv users
into consideration.
As such, a single user could hang under some circumstances. This was
caused by improperly awaking all users from a pthread condition wait.
That in turn could lead to some sluggish behaviour and, in rare cases,
a hang at shutdown.
Note: it was highly unlikely to experience real problems with the
officially provided modules.
- 2021-04-22: refactoring of syslog/tcp driver parameter passing
This has now been generalized to a parameter block, which makes it much cleaner and
also easier to add new parameters in the future.
- 2021-04-22: config script: add re_match_i() and re_extract_i() functions
This provides case-insensitive regex functionality.
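A short sketch of both functions, assuming they take the same arguments as re_match() and re_extract(). The patterns, the variable name and the file path are hypothetical.

```
# matches "error", "ERROR", "Error", ...
if re_match_i($msg, "error|failed") then {
    action(type="omfile" file="/var/log/problems.log")
}

# extract the first submatch of the first match, "unknown" if nothing is found
set $.user = re_extract_i($msg, "user=([a-z0-9_]+)", 0, 1, "unknown");
```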
Scheduled Release 8.2104.0 (aka 2021.04) 2021-04-20
- 2021-04-19: new contributed module imhiredis
Thanks to Théo Bertin (frikilax) for the patch.
- 2021-04-19: new built-in function get_property() to access property vars
Provides ability to evaluate a rsyslog variable using dynamically
evaluated parameters.
1st param is the rsyslog param, 2nd param is a key, can be an array
index or key string.
Useful for accessing json sub-objects, where a key
needs to be evaluated at runtime. Can be used to access arrays as well.
Thanks to Nelson Yen for contributing this module.
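A minimal sketch of a runtime-evaluated key lookup; all variable names and values are hypothetical.

```
set $!app!name = "web";
set $.key = "name";

# look up the sub-object key held in $.key at runtime
set $.value = get_property($!app, $.key);
```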
- 2021-04-19: mmdblookup: add support for mmdb DB reload on HUP
Thanks to Théo Bertin (frikilax) for the patch.
- 2021-04-19: script bugfix: empty array in foreach() improperly handled
When running a foreach() loop inside a ruleset, if the json array/object iterated
over is empty but valid, the foreach makes message processing in the
ruleset abort, and no following operation (such as actions) is
executed after this.
Thanks to Théo Bertin (frikilax) for the patch.
- 2021-04-19: imjournal bugfixes (handle leak, empty file)
Flush the FILE* buffer before rename & fsync in order
to not end up syncing an empty file.
Also, close WorkDir on fsync in order to prevent
file descriptor leakage.
Thanks to github user gerd-rausch for the fix.
- 2021-04-06: new contributed function module fmunflatten
This commit adds a new rainerscript function to unflatten keys in a JSON tree. It
provides a way to expand dot separated fields.
<result> = unflatten(<source-tree>, <key-separator-character>);
It allows for instance to produce this: { "source": { "ip": "", "port": 443 } }
from this source data: { "source.ip": "", "source.port": 443 }
Thanks to Julien Thomas for the contribution.
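In a configuration this could be used roughly as follows, assuming the function module is built and that the message's JSON tree already contains dot-separated keys. The target variable name is hypothetical.

```
module(load="fmunflatten")

# expand all dot-separated keys below the root into nested objects
set $.tree = unflatten($!, ".");
```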
- 2021-02-22: test bugfix: some tests did not work with newer TLS library versions
Newer versions provide TLS versions that cannot be disabled in older versions as they
are unknown there. This is solved by setting restrictions in multiple steps. For
older library versions, the final step will error out, but the others will be applied.
This permits to achieve proper test results.
- some improvements to project CI
Scheduled Release 8.2102.0 (aka 2021.02) 2021-02-16
- 2021-02-15: omfwd: add stats counter for sent bytes
Thanks to John Chivian for suggesting this feature.
- 2021-02-15: omfwd: add error reporting configuration option
Rsyslog over plain TCP cannot guarantee message delivery
without using the RELP protocol. Besides that, the logs may be
flooded with connection errors, making the rest of the messages
difficult to find. To alleviate the problem (see issue 3910),
this patch adds a configuration option that makes it possible to reduce
the number of network errors logged and reported.
For example, if only every 10th network error message should be logged,
the rsyslog configuration has to be updated as follows:
action(type="omfwd" Target="<IP_ADDR>" Port="<PORT>" Protocol="tcp" ConErrSkip="10")
Thanks to Libor Bukata for the patch.
- 2021-02-15: action stats counter bugfix: failure count was not properly incremented
In some cases the counter was not incremented, most notably with transaction-enabled
Thanks to github user thinkst-marco for the patch.
- 2021-02-15: action stats counter bugfix: resume count was not incremented
And so it always stayed at zero.
Thanks to github user thinkst-marco for the patch.
- 2021-02-15: omfwd bugfix: segfault or error if port not given
If omfwd is configured via RainerScript config format and the "port"
parameter is not given, a segfault will most likely happen on
connection establishment for TCP connections. For UDP, this is
usually not the case.
Alternatively, in any case, errors may happen.
Note that the segfault will usually happen right on restart so this
was easy to detect.
We did not receive reports from practice. Instead, we found the bug
while conducting other work.
- 2021-01-29: lookup table bugfix: data race on lookup table reload
A data race could happen when a lookup table was reloaded. We found
this while moving to newer version of TSAN, but have no matching
report from practice. However, there is a potential for this to cause
a segfault under "bad circumstances".
- 2021-01-18: testbench modernization
Bump dependency versions, use newer distro versions for some tests.
Make kafka distcheck separate to help diagnose flaky kafka tests.
- 2021-01-16: testbench: fix invalid sequence of kafka tests runs
kafka tests can not run well in parallel (mostly due to resource
constraints on CI machines). Accidentally, this was not enforced for
one of the tests. That could lead to random failures and false positives.
- 2021-01-14: testbench: fix kafkacat issues
The kafkacat tool has an upper limit of how many messages it can send
at once. Going over that limit causes message loss. The exact limit
seems to depend on the environment. This causes testbench false positives.
This commit fixes two related issues:
- errors during kafkacat run were not detected - this has been added
- we now have a "max messages at once" setting, after which kafkacat
is restarted for the next batch of messages. It currently is set
to 25,000 msgs per incarnation. All tests loop now to send the
required number of messages. This has been fixed at the testbench
framework level, so no need to adjust individual tests.
- 2021-01-14: testbench: fix year-dependent clickhouse test
A test had the year value hardcoded and as such failed whenever the
year changed. This patch corrects that.
Scheduled Release 8.2012.0 (aka 2020.12) 2020-12-08
- 2020-12-07: testbench bugfix: some tests did not work in make distcheck
- certificate file missing in dist tarball
- some test cases did not properly specify path to cert file
Thanks to Michael Biebl for alerting us and providing part of
the fix.
- 2020-12-07: immark: rewrite with many improvements
- mark message text can now be specified
- support for rulesets
- support for using syslog API vs. regular internal interface
- support for output template system
- ability to specify if mark message flag can be set
- minor changes and improvements
- 2020-11-30: usability: re-phrase error message to help users better understand cause
- 2020-11-10: add new system property $now-unixtimestamp
Among others, this may be used as a monotonic counter
for doing load-balancing and other things.
Thanks to Nicholas Brown for suggesting this feature.
- 2020-11-04: omfwd: add new rate limit option
Adds new rate limit options to omfwd for rate limiting
syslog messages sent to the remote server.
One option specifies the rate-limiting interval in seconds.
Its default value is 0, which turns off rate limiting.
The other specifies the rate-limiting burst in number of messages.
Thanks to Dinesh-Ramakrishnan for the patch.
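A sketch of a rate-limited forwarding action. The parameter names "ratelimit.interval" and "ratelimit.burst" are an assumption based on the omfwd documentation, as the names are not given above; target and values are hypothetical.

```
# allow at most 200 messages per 5-second interval to the remote server
action(type="omfwd"
       target="192.0.2.10" port="514" protocol="tcp"
       ratelimit.interval="5"
       ratelimit.burst="200")
```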
- 2020-11-03: omfwd bug: param "StreamDriver.PermitExpiredCerts" is not "off" by default
The default behaviour for expired certificates of the stream driver in TLS mode should
have been that the TCP transmission is closed due to expired certificates, and
error messages are emitted in rsyslog status. This was not the case. That in turn could
lead to permitting sessions which should not be permitted.
Thanks to Vincent Zhu for alerting us and providing a great problem analysis.
Scheduled Release 8.2010.0 (aka 2020.10) 2020-10-20
- 2020-10-13: gnutls TLS subsystem bugfix: handshake error handling
If the tls handshake does not immediately finish, gnutls_handShake is called in
doRetry handler again. However the error handling was not
complete in the doRetry handler. A failed gnutls_handShake call
did not abort the connection and properly caused unexpected
problems like in issues:
- 2020-10-13: core/msg bugfix: memory leak
There is a missing call to json_object_put(json) if the call to
jsonPathFindParent() failed. It's leaking memory. Depending on workload and config,
this leak can potentially grow large (albeit we did not see reports from practice).
Thanks to Julien Thomas for the patch.
- 2020-10-13: core/msg bugfix: segfault in jsonPathFindNext() when <root> not an object
The segfault happens when <bCreate> is 1 and when the <root>
container where to insert the <namebuf> key is not an object.
Here is a simple reproducible test case:
// ensure we start fresh
// unnecessary if there was no previous set
unset $!;
set $! = "";
set $!event!created = 123;
Thanks to Julien Thomas for the patch.
- 2020-10-13: openssl TLS subsystem: improvements of error and status messages
Adding error logs at the ssl handshake failure scenarios.
Adding the header "nsd_ossl:" tag to these logs to identify
the origin module from which logs are generated.
Thanks to Anusha Pai G for the patch.
- 2020-10-06: add 'exists()' script function to check if variable exists
This implements a way to check if a rsyslog variable (e.g. '$!path!var') is
currently set or not.
Sample: if exists($!somevar) then ...
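A slightly fuller sketch, supplying a default only when a variable is missing; the variable name and the default value are hypothetical.

```
# set a default only if the variable has not been set before
if not exists($!event!created) then {
    set $!event!created = "unknown";
}
```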
- 2020-10-03: core bugfix: do not create empty JSON objects on non-existent key access
Performing a condition (eg: check for an empty string) on a subtree key that does not
exist (depth > 1 from the root container), creates an empty "parent" object.
Depending on your context, you may end up with (kind of...) annoying garbage when
producing object documents (for instance to index in ES).
Also fixes a hypothetical hang condition with an almost (?) unused plugin parameter
passing mode, for details see
Thanks to Julien Thomas for the patch.
- 2020-09-28: gnutls subsystem bugfix: potential hang on session closure
Some TLS servers don't reply to graceful shutdown requests "for
optimization". This causes rsyslog's omfwd+gtls client to wait
forever for a reply of the TLS server which never comes, due to shutting
down the connection with gnutls_bye(GNUTLS_SHUT_RDWR).
On systemd systems, commands such as "systemctl restart rsyslog" just
hang for 1m30 and rsyslogd gets killed upon timeout by systemd.
This is fixed by replacing the call to gnutls_bye(GNUTLS_SHUT_RDWR) by calls to
gnutls_bye(GNUTLS_SHUT_WR) which is sufficient and doesn't wait for a
server reply.
As an example, Kiwi Syslog server is known to cause this issue.
Thanks to Renaud Métrich for the patch.
- 2020-09-23: core/network bugfix: obey net.enableDNS=off when querying local hostname
Local hostname resolution used DNS queries even if the enableDNS was set to off, and
this could cause unexpected delays in the HUP signal handling if the DNS server was
not responsive.
Thanks to Samu Nuutamo for the fix.
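The setting referred to above is a global() parameter; a minimal sketch:

```
# do not use DNS lookups; with this fix the local hostname query obeys it as well
global(net.enableDNS="off")
```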
- 2020-09-14: core bugfix: potential segfault on query of PROGRAMNAME property
A data race can happen on variable iLenProgram as it is not guarded
by the message mutex at time of query. This can lead to it being
non -1 while the buffer has not yet been properly set up.
Thanks to Leo Fang for alerting us and a related
patch proposal.
- 2020-09-14: imtcp bugfix: broken connection not necessarily detected
Due to an invalid return code check, broken TCP sessions could not
necessarily be detected "right in time". This can result in the loss
of one message.
Thanks to Leo Fang for the patch.
- 2020-09-14: new module: imhttp - http input
permits to receive log data via HTTP.
uses http library to provide http input.
user would need to configure an 'endpoint' as input, along
with a ruleset, defining how the input should be routed in
Thanks to Nelson Yen for contributing this module.
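A minimal sketch of an endpoint bound to a ruleset; the endpoint path, ruleset name and output file are hypothetical.

```
module(load="imhttp")

# hypothetical ruleset that defines how HTTP input is routed
ruleset(name="http_in") {
    action(type="omfile" file="/var/log/http_in.log")
}

input(type="imhttp" endpoint="/ingest" ruleset="http_in")
```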
- 2020-09-11: mmdarwin bugfix: potential zero uuid when reusing existing one
- fix a use-after-free variable during darwin uuid message extraction
- improve debug/output by logging uuid parse errors
Thanks to github user frikilax for the patch.
- 2020-09-10: imdocker bugfix: build issue on some platforms
An invalid variable type was used, leading to compile errors at least on
all platforms that use gcc 10 and above. Otherwise, however, it looks like the
issue caused no real harm.
- 2020-09-07: omudpspoof bugfix: make compatible with Solaris build
Thanks to Dagobert Michelsen for the patch.
- 2020-09-03: testbench fix: python 3 incompatibility
- 2020-09-02: core bugfix: segfault if disk-queue file cannot be created
When using Disk Queue and a queue.filename that can not be created
by rsyslog, the service does not switch to another queue type as
supposed to and crashes at a later step.
- 2020-08-26: cosmetic: fix dummy module name in debug output
When we have optional components (like imjournal) a dummy module
is used. Its sole purpose is to emit "this module is not available".
During init, the module emitted an invalid module name into the debug
log. This has now been replaced by the generic term "dummy".
Note: it is highly unlikely that someone will ever see that message
at all, as it is unlikely for the dummy modules to be built.
Thanks to Thomas D. (whissi) for the patch.
- 2020-08-26: config bugfix: intended warning emitted as error
When there are actions configured after a STOP, a warning should be
emitted. In fact, an error message is generated. This prevents the
construct, which may have some legit uses in exotic settings. It
may also break older configs, but as the message has been an error
for so long now, this should no longer be of concern.
Scheduled Release 8.2008.0 (aka 2020.08) 2020-08-25
- 2020-08-25: imdocker bugfix: error reporting not always correct
A wrong function to obtain the error code was used. This
could lead to invalid error messages.
Thanks to Steve Grubb for the bug report and fix proposal.
- 2020-08-25: imptcp: add max sessions config parameter
The max is per-instance, not global across all instances.
This also includes a bugfix: if epoll failed, a session could possibly be
left linked in the list of sessions. The code now unlinks it.
Thanks to Alfred Perlstein for the patch.
- 2020-08-24: omelasticsearch bugfix: reply buffer reset after health check
The issue happens when more than one server is defined on the
action. On that condition a health check is made through
checkConn() before sending the POST. The replyLen should be
set back to 0 after the health check, otherwise the response
data received from the POST gets appended to the end of the
last health check.
Thanks to Julien Thomas for the patch.
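The effect of the missing reset can be illustrated with a minimal sketch. The class and function names below are hypothetical stand-ins, not the module's actual code; only the idea of setting the reply length back to 0 after the health check is taken from the entry above.

```python
class ReplyBuffer:
    """Hypothetical stand-in for the per-worker reply buffer."""

    def __init__(self):
        self.data = bytearray()

    def on_data(self, chunk: bytes) -> None:
        # Mimics a write callback: it always appends at the current end.
        self.data += chunk

    def reset(self) -> None:
        # Equivalent in spirit to setting replyLen back to 0.
        self.data.clear()


def send_bulk(buf: ReplyBuffer, health_reply: bytes, post_reply: bytes,
              reset_after_check: bool) -> bytes:
    buf.on_data(health_reply)   # the health check response arrives first
    if reset_after_check:
        buf.reset()             # the fix: forget the health check reply
    buf.on_data(post_reply)     # response to the actual POST
    return bytes(buf.data)
```

Without the reset, the POST response ends up glued to the end of the health check reply, which is the behavior the fix removes.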
- 2020-08-14: omfile: no longer limit dynafile cache size in legacy format
When using obsolete legacy config format, omfile had a hard limit of
1,000 dynafile cache entries. This does not play well with very
large installations. This limit is now removed and converted into
a warning if cache size > 25,000 is specified.
Note: the problem can easily be worked-around by using modern
config format (RainerScript).
- 2020-08-13: imudp: fix very small, static memory leak
When ruleset support was used, the ruleset name was not freed upon rsyslog
termination. While this has no consequences for regular runs, it generates
leak errors under memory debuggers and as such makes debugging harder than
Thanks to github user frikilax for the patch.
- 2020-08-13: omelasticsearch: add parameter skipPipelineIfEmpty
When POST'ing a document, Elasticsearch does not allow an empty pipeline
parameter value. This patch introduces boolean option skipPipelineIfEmpty
to the omelasticsearch action. When set to true, the pipeline parameter
won't be posted. Default is false so we do not modify current behavior.
Thanks to Julien Thomas for the patch.
- 2020-08-12: systemd service file removed from project
This was done as distros nowadays have very different service files and it no
longer is useful to provide a "generic" (sic) example.
see also:
- 2020-08-11: gnutls TLS driver bugfix: EKU check not done properly
When the server accepted a new connection, it did not properly set the
dataTypeCheck field based on the listening socket. That resulted in
skipping ExtendedKeyUsage (EKU) check on the client.
Thanks to Daiki Ueno for the patch.
- 2020-08-06: MMDARWIN:: improve configuration flexibility and UUID fix
- now able to get fields from local variables ($.)
- now able to configure a custom root container for mmdarwin fields
- now able to put nested keys ($!key1!key2)
- don't regenerate a UUID each time, but instead check if one exists before
creating it (allow successive calls without losing previous UUID)
Thanks to github user frikilax for the contribution.
- 2020-08-06: add --enable-imjournal=optional ./configure option
- 2020-08-06: IMPCAP::Fixes: segfault, memory and build corrections
* fix bug in ethernet packets parsing
* fix build error with gcc10: 'multiple definition of...'
* resolve memory leak during interface init failure (device not freed after post-create error)
* add test 'impcap_bug_ether' to prove ethernet parser fix is working
Thanks to github user frikilax for the contribution.
- 2020-07-14: CI: add support for github actions
- 2020-07-14: imklog: add ruleset support
see also:
see also:
- 2020-07-06: config system fix: ChkDisabled method to make config.enabled work
There was wrong negation in the method so it returned 0/1 in reverse
and also it did not mark the node to not be reported as unknown at all
times which is needed after all.
Thanks to Jiri Vymazal for the patch.
Scheduled Release 8.2006.0 (aka 2020.06) 2020-06-23
- 2020-06-22: queue: permit ability to double size at shutdown
This prevents message loss due to "queue full" when re-enqueueing data
under quite exotic settings.
see also
- 2020-06-22: Fixing imfile segfaulting on selinux denial
If imfile is denied access to a file watched through a symlink, there is
an unchecked condition resulting in access to uninitialized memory.
- 2020-06-22: openssl: Fixed memory leak when tls handshake failed.
- 2020-06-22: change systemd service file to wait for network
Now that rsyslog is usually only installed for real syslog servers,
we should assume that some network listening or forwarding happens
on start. As such we need to start a bit later, after the network.
This poses no problem as systemd nowadays comes with journal which
is in almost all cases configured to buffer log data while
rsyslog is not yet running.
see also
- 2020-06-22: NEW INPUT MODULE:: impcap, network packets input parser
Thanks to github user frikilax for the contribution.
- 2020-06-22: ksi bugfix: Optimized code in KSI module initialization fixed.
KSI module initialization no longer gets stuck in an infinite loop when
the code is built with optimization -O2.
- 2020-06-05: operatingstatefile bugfix: month was given too low
The month was printed with the range 0 (January) to 11 (December).
This has now been corrected.
- 2020-06-05: build system: add "optional" build functionality to some components
If used, builds a dummy module which just emits a "module not supported
on this platform" error message when loaded.
Primary use case for this system is Debian-ish builds on SUSE OBS,
where we prefer to have a single package definition for all versions
(else things get much more complicated).
- 2020-05-23: config system bugfix: backticks cat segfault if file cannot be opened
When a `cat <filename>` construct is used in rsyslog.conf and <filename> cannot
be accessed (does not exist, no permissions, ...), rsyslog segfaults.
Thanks to Michael Skeffington for notifying us and providing root cause analysis.
- 2020-05-15: imtcp bugfix: octet framing/stuffing problem with discardTruncatedMsg on
When "discardTruncatedMsg" was enabled in imtcp, messages were incorrectly
skipped if the last character before the truncation was the LF delimiter.
Also adds two testbench tests for this case.
- 2020-05-12: ompipe bugfix: race during HUP
When HUP was received, the write mutex was not acquired. This could
lead to unexpected invalidation of the output file descriptor.
Thanks to Julien Thomas for alerting us on this issue.
see also
- 2020-05-12: ompipe: add action parameter tryResumeReopen
Sometimes we need to reopen a pipe after an ompipe action gets
suspended. Sending an HUP signal to rsyslog does the job but requires
an interaction with rsyslog. The patch adds support for a new boolean
option, tryResumeReopen, for the ompipe action. It mimics what an HUP
signal would do.
Thanks to Julien Thomas for the patch.
- 2020-05-12: imjournal: remove strcat call
Thanks to Jeff Marckel for the patch.
- 2020-05-12: build system: libzcmq version requirement needs to be bumped
Thanks to Thomas Deutschmann for pointing this out.
- 2020-05-12: testbench: download ElasticSearch binaries from
The official ElasticSearch download site sometimes denies the download.
- 2020-05-11: openssl netstream driver bugfix: context leak
The context object was not properly freed.
Thanks to Michael Zimmermann for the fix.
- 2020-05-11: omhttp: Add support for multiple http headers
Allows the inclusion of multiple http headers on the REST call.
Thanks to callmegar for the patch.
- 2020-04-29: core bugfix: group id could not be obtained for very large groups
Thanks to github user emilbart for the patch.
- 2020-04-29: testbench additions (relp broken connection test)
- 2020-04-29: omudpspoof bugfix: issues with oversized messages
The first issue was an incorrect packet length in the UDP header. It has to be the
length of the FULL UDP packet, regardless of the MTU setting. As a result, regardless
of IP fragmentation, the MTU setting also limited the max size of the UDP message.
The second issue was an incorrect calculation of the UDP checksum with libnet if
IP fragmentation was used (based on the MTU setting). As a result, the network packets were
dropped by the tcp stack before they could even reach their target. The workaround for this
problem is that we set the UDP checksum to 0x0000, which allows skipping of the checksum
test. Fixing the problem by calculating the correct UDP checksum would require some
code changes in libnet.
Also fixed the omudpspoof bigmsg test and increased the testing size to 16KB.
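As a rough illustration of the two header fields involved, the following sketch packs a UDP header by hand. It is not the module's libnet-based code; the function name and port numbers are assumptions made for the example.

```python
import struct

UDP_HEADER_LEN = 8  # bytes: source port, destination port, length, checksum


def build_udp_header(src_port: int, dst_port: int, payload: bytes) -> bytes:
    # The length field covers the header plus the complete payload,
    # independent of any MTU-driven IP fragmentation that happens later.
    length = UDP_HEADER_LEN + len(payload)
    if length > 0xFFFF:
        raise ValueError("payload too large for a single UDP datagram")
    checksum = 0x0000  # over IPv4, zero means "no checksum was computed"
    return struct.pack("!HHHH", src_port, dst_port, length, checksum)
```

For a 16KB payload, as used by the enlarged bigmsg test, the length field holds the payload size plus the 8 header bytes, no matter how many IP fragments the datagram is later split into.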
- 2020-04-29: omprog: fix assert failed on HUP with output flag
If the 'output' setting of omprog was used and rsyslog received a HUP
signal just after starting (and before the omprog action received the
first log to process), an internal assertion could fail, causing
rsyslog to terminate. The failure message was "rsyslogd: omprog.c:660:
closeOutputFile: Assertion `pCtx->bIsRunning' failed."
The failure could also occur if rsyslog received a HUP signal during
the shutdown sequence.
This bug was introduced in v8.2004 by PR
Although a test already existed that checked the interaction of HUPs
with the 'output' setting, it didn't always fail in this particular case
due to timing conditions. The test has been improved to cover this case
more reliably.
Thanks to Joan Sala Isern for the patch.
Scheduled Release 8.2004.0 (aka 2020.04) 2020-04-28
- 2020-04-28: ksi bugfix: When KSI module is suddenly closed, files are finalized
In async. mode all pending signature requests are closed immediately and
unsigned block marker is attached with message about sudden closure.
Similar approach is used for blocks that already contain some records.
Empty blocks are just closed without any metadata.
Thanks to Taavi Väljaots for the patch.
- 2020-04-28: ksi bugfix: Signer thread initialization is verified before usage.
When the signer thread is created in rsksiInitModule, successful thread
initialization is verified before the function returns. This
prevents adding records to an uninitialized module, and in case of an
error the signature files opened will contain only magic bytes.
Thread flags replaced with thread state.
When init module fails, module is disabled.
Thanks to Taavi Väljaots for the patch.
- 2020-04-28: ksi bugfix: Hardcoded default hash algorithm replaced with 'default'
Instead of hardcoded SHA-256 KSI_getHashAlgorithmByName("default")
is used to get default hash function.
Function rsksiSetHashFunction and SetCnfParam updated.
Thanks to Taavi Väljaots for the patch.
- 2020-04-28: imfile bugfix: potential segfault in stream object on file read
- if cstrLen(pThis->prevMsgSegment) > maxMsgSize, then the len calculation
becomes negative if cstrLen(thisLine) < cstrLen(pThis->prevMsgSegment).
This causes illegal access to a memory location and thus a segfault.
- len is now set to 0 if cstrLen(pThis->prevMsgSegment) > maxMsgSize so that
the correct memory location is accessed.
Thanks to github user jaankit
- 2020-04-28: openssl TLS drivers: made more reliable for older openssl versions
OpenSSL can retry some failed operations, but older versions need an explicit
opt-in to do so. This is now done.
- 2020-04-28: omprog: fix bad fd errors in daemon mode
When omprog was used with the 'forceSingleInstance=on' option, and/or
the 'output' setting, "bad file descriptor" errors occurred, which
prevented the external program from being executed and/or the program output
from being correctly captured. The bug could also manifest as "resource
temporarily unavailable" errors, or other errors related to the use of
invalid/reassigned file descriptors. These errors only happened when
rsyslog ran in daemon mode (i.e. they didn't happen if rsyslogd was
run with the '-n' option).
The cause of the bug was that omprog opened the pipe fds needed by
these flags during the configuration load phase (in the 'newActInst'
module entrypoint). This is a bad place since the fork of the daemon
occurs after this phase, and all fds are closed when the daemon process
is started (see 'initAll' in rsyslogd.c), hence invalidating the
previously opened fds.
To correct this, the single child process and the output capture thread
are now started later, when the first log message is received by the
first worker thread. (Note: the 'activateCnf' module entrypoint, despite
being invoked after the fork, cannot be used for this purpose, since it
is invoked per module, not per action instance.)
Currently no automated test exists for this use case since the testbench
always runs rsyslog in non-daemon mode.
Affected versions: v8.38 and later
Thanks to Joan Sala Isern for the patch.
- 2020-04-28: omfile bugfix: $outchannel split log lines at rotation time
- 2020-04-17: openssl: add support for libreSSL
Disable use of "@SECLEVEL" in default cipher string and
avoid SSL_CONF_CTX_set_flags() API when LIBRESSL is used.
This means tlscommands will not work.
- 2020-03-04: imudp bugfix: build problems on some Linux kernel versions
Thanks to Wen Yang for the patch.
- 2020-03-02: conf output bugfix: -o produces missing space between call and rulename
Thanks to Tetiana Ohnieva for the patch.
Scheduled Release 8.2002.0 (aka 2020.02) 2020-02-25
- 2020-02-25: imfile: add per minute rate limiting
Add MaxBytesPerMinute and MaxLinesPerMinute options.
These take integer values and, respectively, limit the number
of bytes or lines that may be sent in a minute.
This can be used to put a limit on the count or volume of logs
that may be sent for an imfile.
Thanks to Greg Farrell for the patch.
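The idea of a per-minute cap on lines and bytes can be sketched as a fixed-window counter. This is only an illustration under assumptions: the class below is hypothetical, it treats 0 as "no limit", and it simply reports whether a line fits in the current minute. What imfile does with lines that exceed the limit is not stated in this entry.

```python
class PerMinuteLimiter:
    """Hypothetical fixed-window limit on lines and bytes per minute."""

    def __init__(self, max_lines: int = 0, max_bytes: int = 0):
        self.max_lines = max_lines   # 0 means "no limit" in this sketch
        self.max_bytes = max_bytes
        self.window = None
        self.lines = 0
        self.bytes = 0

    def allow(self, line: bytes, now: float) -> bool:
        window = int(now // 60)
        if window != self.window:    # a new minute has started
            self.window, self.lines, self.bytes = window, 0, 0
        if self.max_lines and self.lines + 1 > self.max_lines:
            return False
        if self.max_bytes and self.bytes + len(line) > self.max_bytes:
            return False
        self.lines += 1
        self.bytes += len(line)
        return True
```

Passing the current time in as an argument keeps the sketch deterministic and easy to test.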
- 2020-02-24: core: add global parameter "security.abortOnIDResolutionFail"
This parameter controls whether or not rsyslog aborts when a name ID
lookup fails (for user and group names). This is necessary as a security
measure, as otherwise the wrong permissions can be assigned or privileges
are not dropped.
The default for this parameter is "on". In previous versions, the default
was "off" (by virtue of this parameter not existing). As such, existing
configurations may now error out.
We have decided to accept this change of behavior because of the potential
security implications.
- 2020-02-24: openssl TLS driver bugfix: chained certificates were not accepted
This has always been supported by the GnuTLS driver, but was missing from the openssl one.
- 2020-02-24: core bugfix: too early parsing of incoming messages
In theory, rsyslog should call parsers on the queue worker threads whenever
possible. This enables the parsers to be executed in parallel. There are
some cases where parsers need to be called earlier, namely when parsed
data is needed for rate-limiting.
The logic to do this previously did not work correctly and was fixed six
years ago (!) by b51dd22. Unfortunately, b51dd22 was overly aggressive:
it actually makes the early parser call now mandatory, effectively moving
parsing to the input side where there is little to no concurrency.
We still do not need to call the parser when all messages, regardless of
severity, need to be rate-limited. This is the default and very frequent
case. This patch introduces support for this and as such makes parsers
able to run in parallel in the frequent case again.
- 2020-02-20: testbench bugfix: two minor issues in test
lead to false positives during test runs (depending on circumstances)
- 2020-02-20: testbench: set max extra data length for tcpflood from 200 to 512KiB
Added a imrelp test for big messages (256KB).
- 2020-02-20: config system bugfix: 'config.enabled' directive oddities
Previously the directive was processed way too late which caused false
errors whenever it was set to 'off' and possibly other problems.
Thanks to Jiri Vymazal for the patch.
- 2020-02-09: imfile bugfix: timeout did not work on very busy system
The timeout feature was solely based on timeouts of the poll()
system call. On a very busy system, this would probably happen
very seldom. Moreover, the timeout could occur later than
expected on any system with high load.
The issue was not reported from practice but discovered during
CI system improvements.
- 2020-01-30: build system: change --enable-imfile-tests default to "yes"
This was accidentally set to "no" some time ago (actual commit unknown). Tests for
imfile should by default run when imfile is enabled.
see also
- 2020-01-27: build system: add option --enable-gnutls-tests
This enables us to build GNUtls support but not necessarily
test it in CI. This is useful for some specialised subcomponent
test. The default is enabled if gnutls is enabled and disabled if not.
- 2020-01-26: testbench: new test for loadbalancing via global vars
This is a popular functionality which had not been routinely tested
in the past.
- 2020-01-26: mmdblookup bugfix: invalid data returned when no entry found
Since the upgrade of the package libmaxminddb on FreeBSD (1.3.2_2 -> 1.4.2),
the module mmdblookup returns the first entry of the mmdb database even if the entry
is not found. After some debug, I found the solution in the official maxminddb
repository : to check if the entry is in database, we must check the found_entry
attribute, otherwise the function MMDB_get_entry_data_list will return the first
entry of the database if the entry is not found in it.
Thanks to Kevin Guillemot for the patch.
- 2020-01-23: oversize message log bugfix: do not close fd -1
The oversize message log fd is always closed on HUP, even if it never
was opened (and thus has -1 value). This patch corrects the issue.
The bug had no known bad effect in practice other than getting an
(ignored) error status from close(). However, it introduced warnings
in test runs (e.g. when running under valgrind).
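The guard described here amounts to checking for the "never opened" marker before closing. A minimal sketch, with a hypothetical class name and -1 as the marker value mentioned above:

```python
import os


class OversizeLog:
    """Hypothetical holder for the oversize message log descriptor."""

    def __init__(self):
        self.fd = -1            # -1 marks "never opened"

    def close(self) -> None:
        # Only close a descriptor that was actually opened.
        if self.fd != -1:
            os.close(self.fd)
            self.fd = -1
```

Calling close() on an instance that never opened a file is then a harmless no-op instead of a failing close() call.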
- 2020-01-22: imfile bugfix: saving of old file_id for statefiles
Previously we saved the old file_id unconditionally, which led to old
statefiles not being deleted if files changed while rsyslog was not running.
Now it should work correctly.
Thanks to Jiri Vymazal for the patch.
- 2020-01-22: imfile bugfix: misaddressing and potential segfault
Commit 3f72e8c introduced an invalid memory allocation size. This led to
a too-short alloc and thus to overwrite of non-owned memory. That in turn
could lead to segfaults or other hard to find problems.
The issue was detected by our upgraded CI system. We did not receive
any problem reports in practice. Nevertheless, the problem is real and
people should update affected versions to patched ones.
The bug was present in scheduled stable release 8.1911.0 and 8.2001.0.
see also:
see also:
- 2020-01-20: core bugfix: potential race during HUP
When rsyslog is HUPed immediately after startup and before it is fully
initialized, there is a potential race with the list of loaded modules.
This patch ensures no bad things can happen in that case.
Detected by LLVM TSAN, not seen in practice.
- 2020-01-20: testbench improvements and fixes
modernize tests, improve robustness against slow machines, provide some
test framework functional enhancements, and optimize some tests.
Also includes some code changes to C testing components. Among others,
tests have been sped up slightly by reducing the wait time at queue
shutdown. This is possible because of better overall completion checks.
Scheduled Release 8.2001.0 (aka 2020.01) 2020-01-14
- 2020-01-12: core bugfix: race condition related to libfastjson when using DA queue
Rsyslogd aborts when writing to disk queue from multiple workers simultaneously.
It is assumed that libfastjson is not thread-safe.
Resolve libfastjson race condition when writing to disk queue.
see also
Thanks to MIZUTA Takeshi for the fix.
- 2020-01-12: omfwd bugfix: parameter streamdriver.permitexpiredcerts did not work
- 2020-01-11: Bugfix: KSI module + dynafile in asynchronous mode fixed
Thanks to Taavi Valjaots for the patch
- 2020-01-08: tls driver: add support to configure certificate verify depth
Support added in omfwd as instance parameter:
Support added in imtcp as module parameter:
Can be 2 or higher.
Support added into ossl driver
Support added into gtls driver
Added testcases for both drivers.
- 2020-01-08: modernization of testbench
moved some tests to newer standards, hardened them against slow testbench machines,
kafka component download improvements, and prevent dangling left-over test tool
instances from aborted tests
- 2020-01-07: tls subsystem bugfix: default for permitExpiredCerts was invalidly "on"
The problem occurred with commit 3d9b8df in December 2018 and went into
scheduled stable 8.1901.0. Unfortunately, the change in default was not detected
until a year later. This commit re-enables the previous default ("off"), which is
also the only sensible default from a security PoV. Unfortunately, new 2019
deployments may begin to see connection rejection when using expired certs. As
expired certs should not be used, this hopefully will not cause problems in
Thanks to Jiri Vymazal for the patch.
- 2020-01-01: testbench: improve ElasticSearch test speed
We now support re-using suitable running ES instances, which reduces the
number of restarts.
- 2019-12-31: omelasticsearch: improve curl reply buffer handling
The curl reply buffer (pWrkrData->reply) was allocated, realloced and freed with
each request. This has now been reduced to once per module, slightly increasing
overall performance.
- 2019-12-31: config system: emit proper error message on $ in double-quoted string
- 2019-12-30: core bugfix: rsyslog aborts when config parse error is detected
In default settings, rsyslog tries to continue to run, but some data
structures are not properly initialized due to the config parsing error.
This causes a segfault.
In the following tracker, this is the root cause of the abort:
see also
- 2019-12-30: fix some alignment issues
So far, this worked everywhere (for years). But it may still have
caused issues on some platforms.
- 2019-12-27: core bugfix: APP-NAME fields could become empty
RFC 5424 specifies that an empty APP-NAME needs to be indicated by
"-". Instead, the field could become empty under certain conditions.
If so, outgoing 5424 messages were invalidly formatted.
This happened under quite unusual conditions, but could be seen
in practice.
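The rule can be sketched as a small formatter that never leaves a header field blank. The function and its parameter names are hypothetical and only illustrate the "-" placeholder that RFC 5424 calls the NILVALUE; this is not rsyslog's template code.

```python
NILVALUE = "-"  # RFC 5424 placeholder for an absent header field


def format_rfc5424(pri, timestamp, hostname, app_name, procid, msgid, msg):
    def field(value):
        # An absent or empty field is written as "-" and never left blank.
        return value if value else NILVALUE

    header = " ".join([
        f"<{pri}>1",
        field(timestamp), field(hostname), field(app_name),
        field(procid), field(msgid),
    ])
    # A single "-" stands in for the (absent) structured data.
    return f"{header} {NILVALUE} {msg}"
```

With an empty APP-NAME, the output keeps the field count intact, so a receiving parser does not shift the remaining fields by one position.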
- 2019-12-27: core bugfix: reopen /dev/urandom file descriptor after fork on Linux
This patch updates prepareBackground() in tools/rsyslogd.c to reopen any file
descriptors used for random number generation in the child process. This fixes
an issue on Linux systems where the file descriptor obtained for /dev/urandom
by seedRandomNumber() in runtime/srutils.c was left closed after the fork. This
could be observed in procfs, where /proc/<pid>/fd/ would show no open descriptors to
/dev/urandom in the forked process. /dev/urandom is reopened as the child may be
operating in a jail, and so should not continue to use file descriptors from
outside the jail (i.e. inherited from the parent process).
I found that this issue led to rsyslog intermittently hanging during seedIV()
in runtime/libgcry.c. After the fork, the closed file descriptor number tended
to get re-assigned. randomNumber() would then read from an incorrect (although
still valid) file descriptor, and could block (depending on the state of that
file descriptor). This gave rise to the intermittent hang that I observed.
Thanks to Simon Haggett for the patch.
- 2019-12-20: imdocker bugfix: did not compile without atomic operations
- 2019-12-20: omclickhouse: new parameter "timeout"
Thanks to Pavlo Bashynskiy for the patch.
- 2019-12-20: omhiredis: add 'set' mode plus some fixes
- new mode 'set' to send SET/SETEX commands
- new parameter 'expiration' to send SETEX instead of SET commands (only applicable to 'set' mode)
- fixes to missing frees
Thanks to github user frikilax for the patch.
- 2019-12-18: relp: Add support setting openssl configuration commands.
Add new configuration parameter tls.tlscfgcmd to omrelp and imrelp.
(Using relpSrvSetTlsConfigCmd and relpCltSetTlsConfigCmd)
OpenSSL Version 1.0.2 or higher is required for this feature.
A list of possible commands and their valid values can be found in the
The setting can be single or multiline, each configuration command is
separated by linefeed (\n). Command and value are separated by
equal sign (=). Here are a few samples:
Add two new testcases for librelp and tlscfgcmd.
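The format described above (one command per line, command and value separated by an equal sign) can be sketched with a small parser. This only illustrates the string format; it is not how librelp or OpenSSL process the setting, and the function name is hypothetical. The command names used in the usage note are examples of OpenSSL configuration commands, chosen for illustration.

```python
def parse_tls_config_commands(setting: str):
    """Split a command string into (command, value) pairs."""
    pairs = []
    for line in setting.split("\n"):
        line = line.strip()
        if not line:
            continue                      # tolerate blank lines
        command, sep, value = line.partition("=")
        if not sep or not command.strip():
            raise ValueError(f"expected command=value, got {line!r}")
        pairs.append((command.strip(), value.strip()))
    return pairs
```

A multiline value such as "MinProtocol=TLSv1.2\nCipherString=HIGH" yields two pairs, one per command.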
- 2019-12-18: bugfix core: potential segfault in template engine
under some circumstances (not entirely clear right now), memory
was freed but later re-used as state-tracking structures were not
properly maintained. Github issue mentioned below has full details.
Thanks to github user snaix for analyzing this issue and providing
a patch. I am committing as myself as snaix did not disclose his or
her identity.
- 2019-12-18: fixed some minor issues detected by clang static analyzer 9
- 2019-12-10: core/config bugfix: false error msg when config.enabled="on" is used
When the 'config.enabled="on"' config parameter was used, an invalid error message
was emitted that this parameter is not supported. However, it was still
applied properly. This commit removes the invalid error message.
- 2019-12-03: omsnmp bugfix: "traptype" parameter invalidly rejected value 6
"Traptype" needs to support values 0 to 6.
However, if value 6(ENTERPRISESPECIFIC) was set, an invalid error message
was emitted. Otherwise processing was correct.
This could lead to problems with automatic config deployment,
as valid configurations were invalidly reported as incorrect.
That in turn could make a deployment fail.
- 2019-12-03: omsnmp: add new parameter "snmpv1dynsource"
If set, the source field from SNMPv1 trap can be overwritten
with a template, default is "%fromhost-ip%". The content should be a
valid IPv4 Address that can be passed to inet_addr(). If the content
is not a valid IPv4 Address, the source will not be set.
- 2019-12-02: imfile bugfix: state file renaming sometimes did not work properly
Now checking if file-id changes and renaming - cleaning state file
accordingly and always checking and cleaning old inode-only style
state files.
Thanks to Jiri Vymazal for the patch.
- 2019-12-02: ratelimit: increase rate limit interval parameter max value
The burst parameter in the ratelimit was increased to an unsigned int
but the interval remained an unsigned short. While it may be unusual,
there is possibly a chance to need to represent an interval longer than
about 3/4 of a day.
While here, go through and normalize all the various incarnations of
rate limiting to be explicitly unsigned int for the burst and interval.
Thanks to github user frikilax for the patch.
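The "about 3/4 of a day" figure follows from the range of a 16-bit value. The arithmetic below assumes a 16-bit unsigned short and a 32-bit unsigned int, which is typical but platform dependent; the helper function is hypothetical and only models how an out-of-range value wraps when stored in an unsigned C integer.

```python
USHORT_MAX = 2**16 - 1   # 65535, assuming a 16-bit unsigned short
UINT_MAX = 2**32 - 1     # assuming a 32-bit unsigned int


def store_interval(seconds: int, max_value: int) -> int:
    # Unsigned C integers wrap modulo (max_value + 1) on conversion.
    return seconds % (max_value + 1)


one_day = 24 * 60 * 60                        # 86400 seconds
longest_short_interval_hours = USHORT_MAX / 3600
```

Under these assumptions the longest interval an unsigned short can hold is roughly 18.2 hours, and a one-day interval would silently wrap to 20864 seconds.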
- 2019-12-02: ommongodb: Add other supported formats for 'time' and 'date' fields
Thanks to github user frikilax for the patch.
- 2019-12-02: imjournal bugfix: too many messages in error case
Under certain error conditions, `ignorePreviousMessages="on"` could be ignored
and existing messages be processed.
Thanks to github user 3chas3 for the patch.
- 2019-11-27: core bugfix: action on retry mangles messages
When a failed action goes into retry, template content is rendered
invalid if the action uses more than 1 template.
Thanks to Mikko Kortelainen for the patch.
- 2019-11-27: testbench: improve mysql testing support
tests can now run in parallel and are hardened against several glitches
- 2019-11-22: omhttp: add basic support for Loki Rest
Loki is a new message indexer and querier from Grafana Labs. See for details on Loki.
This change provides the initial message structure to send bulk message
payloads to the Loki Rest endpoint. omhttp received a new bulk message
format called lokirest. Additionally, the plugin relies on the user to
provide the correct "stream" read message format.
A loki template must be json compatible and include a "stream" key of
key value tags, and a values key of an array of 2 element arrays, where
each 2 element array is the unix epoch in nanoseconds followed by an
unstructured message.
An example:
template(name="array_loki" type="string" string="{\"stream\":{\"host\":\"%HOSTNAME%\",\"facility\":\"%syslogfacility-text%\",\"priority\":\"%syslogpriority-text%\",\"syslogtag\":\"%syslogtag%\"},\"values\": [[ \"%timegenerated:::date-unixtimestamp%000000000\", \"%msg%\" ]]}")
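The structure the template has to produce can also be shown by building the same object programmatically. This is a sketch of the data shape only: the function name and label values are hypothetical, and it covers a single stream object as described above, not whatever envelope omhttp adds around it.

```python
import json


def loki_payload(stream_labels: dict, entries: list) -> str:
    """Build one stream object; entries are (unix_seconds, message) pairs."""
    values = [
        # Append nine zeros to express whole seconds as nanoseconds,
        # just as the template above appends "000000000".
        [f"{int(seconds)}000000000", message]
        for seconds, message in entries
    ]
    return json.dumps({"stream": stream_labels, "values": values})
```

Each element of "values" is a two-element array: the epoch timestamp in nanoseconds as a string, followed by the unstructured message.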
- 2019-11-22: testbench: obtain python binary path via AM_PATH_PYTHON
see also
- 2019-11-22: omprog: detect violation of interface protocol
The spec for the omprog interaction with the program it calls specifies
that the program receives one message via one line. In other words:
it must be a string terminated by LF.
However, omprog does currently rely on a proper template to fulfill this
requirement. If the template does not provide for the LF, it is never
written. For the called program, this looks like it does not receive any
input at all. Even if it finally reads data (e.g. due to full buffer),
it will not properly be able to discern the messages.
This handling is improved with this commit.
We cannot just check the template, because at the end of the template
may be a non-constant value. As such, we do not know at config load
time if there is this problem or not.
So the correct approach is to, during runtime, check if each message
is properly terminated. For those that are not:
* we append a LF, because anything else makes matters worse
* log a warning message, at least for a sample of the messages
The warning is useful in the (expected most often) case that the template
is simply missing the LF. While appending works, it slows down processing.
As such the user should be given a chance to correct the config bug.
To avoid clutter, the warning is emitted at most once every 30 seconds.
This value is hardcoded as we do not envision a need to adjust it. Usually
users should quickly fix the template.
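The runtime check described above can be sketched as follows. The class and callback names are hypothetical; only the behavior (append the missing LF, warn at most once every 30 seconds) is taken from this entry.

```python
WARN_INTERVAL = 30.0  # seconds between warnings, as described above


class LineWriter:
    """Hypothetical sketch: ensure each message ends in LF, warn sparingly."""

    def __init__(self, write, warn):
        self.write = write            # callable receiving the final bytes
        self.warn = warn              # callable receiving a warning string
        self.last_warning = None

    def send(self, message: bytes, now: float) -> None:
        if not message.endswith(b"\n"):
            message += b"\n"          # repair the message rather than drop it
            if self.last_warning is None or now - self.last_warning >= WARN_INTERVAL:
                self.warn("message not terminated by LF; check the template")
                self.last_warning = now
        self.write(message)
```

Every message still reaches the program as exactly one line, while the warning is throttled so that a misconfigured template does not flood the log.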
- 2019-11-19: core queue: emit warning if parameters are set for direct queue
Direct queues do not apply queue parameters because they are actually
no physical queue. As such, any parameter set is ignored. This can
lead to unintentional results.
The new code detects this case and warns the user.
- 2019-11-19: imjournal bugfix: do not wait too long on recovery try
When trying to recover journal errors, imjournal waited a hardcoded
period of 10s between tries. This was pretty long and could lead to
loss of journal data.
This commit adjusts it to 100ms, which should still be fully sufficient
to prevent the journal from "hammering" the CPU.
It may be worth considering to make this setting configurable - but
let's first see if there is real demand to actually do that.
- 2019-11-19: mmutf8fix: enhance handling of incorrect UTF-8 sequences
1. Invalid utf8 detection didn't handle 3 and 4-byte overlong encodings (2
byte overlong encodings were handled explicitly by rejecting the C0 and C1
start bytes). Unified checks for overlong encodings.
2. Surrogates U+D800..U+DFFF are not valid codepoints (Unicode Standard, D92)
3. Replacement of characters in invalid 3 or 4-bytes encodings was too
eager. It must not replace bytes which are valid UTF-8 sequences. For
example, in [0xE0 0xC2 0xA7] sequence the 0xC2 is invalid as a continuation
byte, but it starts a valid UTF8 symbol [0xC2 0xA7]. That is, with current
code processing the sequence will result in "???" but the correct result is "?§"
(provided that the replacement character is "?").
4. Various tests for UTF-8 invalid/valid sequences.
Thanks to Sergei Turchanov for the patch.
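The replacement rule from item 3 can be reproduced with Python's built-in UTF-8 decoder and a custom error handler. This is an independent sketch, not the module's C code: the handler name "qmark_sketch" and the helper functions are made up for the example, and "?" is assumed as the replacement character.

```python
import codecs


def _replace_with_qmark(error):
    # Replace only the bytes the decoder rejected (one "?" per byte), then
    # resume right after them so a following valid sequence is still decoded.
    if isinstance(error, UnicodeDecodeError):
        return ("?" * (error.end - error.start), error.end)
    raise error


codecs.register_error("qmark_sketch", _replace_with_qmark)


def fix_utf8(data: bytes) -> str:
    return data.decode("utf-8", errors="qmark_sketch")


def is_valid_utf8(data: bytes) -> bool:
    try:
        data.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False
```

For the sequence [0xE0 0xC2 0xA7] only the stray 0xE0 is replaced and the valid [0xC2 0xA7] survives as "§". The strict decoder also rejects overlong encodings and surrogate code points, which corresponds to items 1 and 2.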
- 2019-11-14: imfile: add new input parameter escapeLF.replacement
The new parameter permits to specify a replacement to be configured
when "escapeLF" is set to "on". Previously, a fixed replacement string
was used ("#012"/"\n") depending on circumstances. If the parameter is
set to an empty string, the LF is simply discarded.
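The effect of the parameter can be sketched as a plain string substitution. The function name is hypothetical, and the default of "#012" is an assumption based on one of the previously fixed replacement strings mentioned above.

```python
def escape_lf(text: str, replacement: str = "#012") -> str:
    # Every embedded LF is swapped for the replacement string;
    # an empty replacement simply removes the LF.
    return text.replace("\n", replacement)
```

For example, a replacement of " " joins a multi-line message into one line with spaces, while an empty replacement concatenates the parts directly.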
Scheduled Release 8.1911.0 (aka 2019.11) 2019-11-12
- 2019-11-12: core queue: add config param "queue.takeFlowCtlFromMsg"
This is a fine-tuning option which permits to control whether or not
rsyslog shall always take the flow control setting from the message. If
so, non-primary queues may also block when reaching high water mark.
This permits to add some synchronous processing to rsyslog core engine.
However, it is dangerous, as improper use may make the core engine
stall. As such, enabling this option requires very careful planning
of the rsyslog configuration and deep understanding of the consequences.
Note that the option is applied to individual queues, so a configuration
with a large number of queues can (and must, if used) be fine-tuned to
the exact use case.
The rsyslog team strongly recommends leaving the option turned off,
which is the default setting.
see also
- 2019-11-12: imrelp: add new config parameter "flowcontrol"
This permits fine-tuning the flowControl parameter. Possible values are
"no", "light", and "full". "light" is the default and was previously the
only value.
Changing the flow control setting may be useful for some rare applications,
but be sure to know exactly what you are doing when changing this setting.
Most importantly, rsyslog as a whole may block and become unresponsive if you
change flowcontrol to "full". While this may be a desired effect when
intentionally trying to make it most unlikely that rsyslog needs to
lose/discard messages, usually this is not what you want.
see also
- 2019-11-11: imrelp: remove unsafe debug instrumentation
dbgprintf, which is not signal safe, was called from a signal handler
to get better understanding during debugging. While this usually works,
it can occasionally (5%) lead to a hang during shutdown. We have now
removed that debug info as it is no longer vital.
Note: this could only happen during debug runs. Production mode was
not affected. As such, this fix is only relevant to developers.
However, it caused some confusion in the following issue tracker.
see also
- 2019-11-06: ossl driver bugfix: fix wrong OpenSSL Version check
Fix OpenSSL Version check in:
- SetGnutlsPriorityString function in nsd_ossl.c
- initTLS() function tcpflood.c
for more.
This bug led to some functionality not being enabled correctly.
Removed "MinProtocol=TLSv1.1" from two testcases because MinProtocol
is only supported by OpenSSL 1.1.0 or higher and was not really
necessary for the testcases.
- 2019-11-05: mmdarwin: Optimizations, new parameters, update to protocol header
- use permanent worker-dependent buffers to avoid malloc/free for each entry
- move socket structures to worker data, remove global mutex
- add log lines for parameters and general workflow
- don't send body if empty/incomplete (see new parameters)
- don't close/reopen socket every time -> leave session open or create new every X
entries (see new parameters)
- clean up code
- added 'send_partial', to let mmdarwin send body if not all fields were
retrieved, or not; default false = only send complete bodies
- added 'socket_max_use' to open new session every X packets, useful for
some versions of Darwin (prior to 1.1)
default is 0 = do not open new session/keep only one
- added 'evt_id' to the darwin header (Darwin v1+ compatibility)
Note: mmdarwin is a contributed module
Thanks to github user frikilax for the patch.
- 2019-11-01: mmkubernetes bugfix: improper use of realloc()
could cause problems under extreme memory shortage - very unlikely
credits to LGTM.COM for detecting this
- 2019-10-31: imjournal: set the journal data threshold to MaxMessageSize
When data is read from the journal using sd_journal_get_data it may be
truncated to a certain threshold (64K by default).
If the rsyslog MaxMessageSize is larger than the threshold, there is a
chance rsyslog will receive incomplete messages from the journal.
Empirically, this appears to happen reliably when XZ compression is
used by journald. Systems where journald uses LZ4 compression do not
appear to suffer this issue reliably--if at all.
This change sets the threshold to the MaxMessageSize when the
journal is opened.
Thanks to Robert Winslow Dalpe for the patch.
- 2019-10-30: improg bugfix: allow improg to handle multi-line inputs
miscellaneous bug fixes in improg:
* properly truncate string after an input event is submitted
* set msgoffset to 0.
* tests added to check above fixes
Thanks to Nelson Yen for the fix.
- 2019-10-30: mmdblookup bugfix: missing space in city name
This fixes the issue that spaces in city names are dropped. However, the
fix is more or less a work-around. As it turns out, the libmaxminddb API
is not correctly used. In the somewhat longer term, we should fix this.
see also
- 2019-10-30: core/queue: provide ability to run diskqueue on multiple threads
Up until this release, disk queues could only use a single thread,
which limited their performance with outputs like ElasticSearch.
Now disk queues can utilize multiple threads just like any other
queue type. Most importantly, the disk queue part of a DA queue
now inherits the max number of threads from its memory queue.
NOTE: the new multi-threaded DA disk queue is actually a change of
behavior. We have not guarded it by a new config switch as we
assume the new behavior is most often exactly within user
expectations. In any case, we cannot see any harm from running
the disk queue on multiple threads.
see also
- 2019-10-25: omfile bugfix: file handle leak
The stream class does not close re-opened file descriptors.
This led to leaking file handles and ultimately to the inability
to open any files/sockets/etc as rsyslog ran out of handles.
The bug was depending on timing. This involved different OS
thread scheduler timing as well as workload. The bug was more
common under the following conditions:
- async writing of files
- dynafiles
- not committing file data at end of transaction
However it could be triggered under other conditions as well.
The refactoring done in 8.1908 increased the likelihood of
experiencing this bug. But it was not a real regression, the new
code was valid, but changed the timing so that the race was more likely.
Thanks to Michael Biebl for reporting this bug and helping to
analyze it.
- 2019-10-22: imfile bugfix: improper use of calloc()
could cause problems under extreme memory shortage - very unlikely
credits to LGTM.COM for detecting this
- 2019-10-22: TLS driver bugfix: improper use of calloc()
can cause problems under extreme memory shortage - very unlikely
credits to LGTM.COM for detecting this
- 2019-10-22: imuxsock bugfix: improper use of calloc()
can cause problems under extreme memory shortage - very unlikely
credits to LGTM.COM for detecting this
- 2019-10-17: build system bugfix: incorrect default in ./configure help text
Thanks to Michael Biebl for pointing this out.
- 2019-10-17: mmkubernetes bugfix: improper use of calloc()
can cause problems under extreme memory shortage - very unlikely
credits to LGTM.COM for detecting this
- 2019-10-16: core queue bugfix: propagate batch size to DA queue
This was a long-standing bug where the DA queue always had a fixed small batch
size because the setting was not propagated from the memory queue. This also
removes a needless and counter-productive "debug aid" which seemed to be in
the code for quite some while. It did not cause harm because of the batch
size issue.
- 2019-10-16: testbench: fix unreliable gzipwrite test
The test was timing-sensitive as we did not properly check all data
was output to the output file - we just relied on sleep periods.
This has been changed. Also, we made some changes to the testing
framework to fully support sequence checking of multiple ZIP files.
- 2019-10-16: core queue bugfix: handle multi-queue-file delete correctly
Rsyslog may leave some dangling disk queue files under the following conditions:
- batch sizes and/or messages are large
- queue files are comparatively small
- a batch spans more than two queue files (from n to n+m with m>1)
In this case, queue files n+1 to (n+m-1) are not deleted. This can
lead to problems when the queue is re-opened again. In extreme cases
this can also lead to stalled processing when the max disk space is
used up by such left-over queue files.
Using defaults this scenario is very unlikely, but it can happen,
especially when large messages are being processed.
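The file arithmetic described above can be sketched as follows. This only illustrates the index ranges from the entry text; the function names are hypothetical and the queue code itself works differently.

```python
def files_to_delete(first, last):
    """Queue files fully consumed by a batch spanning files first..last.

    The last file is still in use, so files first..last-1 can go.
    """
    return list(range(first, last))


def leftover_before_fix(first, last):
    """Files that were left dangling before the fix: n+1 .. n+m-1."""
    return list(range(first + 1, last))
```

For a batch from file 3 to file 6 (n=3, m=3), files 3, 4 and 5 should be deleted; before the fix, files 4 and 5 were left behind. With m=1 nothing is left over, which is consistent with the m>1 condition.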
- 2019-10-16: imjournal: fix regression from yesterday's patch
commit 78976a9bc059 introduced a regression that caused writing
the journal state file to fail. This happens when the state file
is given as relative file name and the working directory is also
a relative path. This situation is very uncommon. So most deployments
will never experience it. We discovered the issue during CI runs
where the trigger condition is given. Note that it also takes
loading the journal multiple times to actually see the bug.
see also
- 2019-10-15: imjournal plugin code restructuring, added remote option
Decomposed ReadJournal() a bit and coupled the journald variables in
one struct. Added a few warning messages and debug prints to help with
future bug hunts, and got rid of two needless journald calls.
WorkAroundJournalBug is now deprecated.
Added an option to pull journald records from outside the local machine.
Thanks to Jiri Vymazal for the patch.
- 2019-10-11: core bugfix: potential abort on very long action name
The action name is stored in modified form for the debug header and
some messages. If it is extremely long, a buffer can be overrun,
resulting in misaddressing and potential segfault for rsyslog. This
can also happen if the action is NOT named, but a custom path to
the output module is given and that path is very long. This triggers
the same issue because by default the module load path is included
in the action name.
This patch corrects the problem and truncates overly long names
when being used for name generation.
The problem was detected during testbench work. We never received
a bug report from practice.
- 2019-10-10: testbench: add test for mmpstrucdata with RFC5424 escape sequences
Scheduled Release 8.1910.0 (aka 2019.10) 2019-10-01
- 2019-10-01: core bugfix: incorrect error message on duplicate module load
A Null-pointer was passed to printf instead of the module name.
On some platforms this may lead to a segfault. On most platforms
printf checks for NULL pointers and uses the string "(null)"
instead. In any case, the module name is missing from the error message.
- 2019-10-01: imczmq nitfix: potential NULL ptr in printf on out-of-memory condition
Very unlikely to happen; if it does, it causes no real issue on most platforms.
- 2019-10-01: work around some compiler warning messages induced by pthreads API
- 2019-10-01: core ratelimiting: more verbose message when rate-limiting happens
When messages are rate-limited, the error message now also contains the
rate limiter setting. This enables the user to more quickly understand what
the problem is (especially if default values apply).
Thanks to Jiri Vymazal for the patch.
- 2019-10-01: openssl TLS driver: do not emit unnecessary error message
On older openssl versions, an API was missing to set user-defined parameters. If we
had such an older version, rsyslog emitted an error message even if the user did
not configure such parameters. This has been corrected, so that a message is only
emitted if there really is a problem. Based on user feedback the severity has also
been downgraded to "warning".
- 2019-10-01: pmcisconames (contributed module) bugfix: potential misaddressing
- 2019-09-30: pmaixforwardedfrom (contributed module) bugfix: potential misaddressing
- 2019-09-30: pmdb2diag (contributed module) bugfix: Out of bounds issue
Add a new sanity check after determining the level len.
Thanks to Philippe Duveau for the patch.
see also:
- 2019-09-02: ability to set stricter TLS operation modes
- checking of extendedKeyUsage certificate field
- stricter checking of certificate name/addresses
Thanks to Jiri Vymazal for the patch.
- 2019-08-21: testbench: add basic test for immark
- 2019-08-20: core: do not unnecessarily set hostname on each HUP
- 2019-08-20: build system: support cross-platform build for mysql/mariadb
rsyslog fails to cross build from source, because it uses mysql_config
and mysql_config is unfixably broken for cross compilation. It would be
better to use pkg-config. The attached patch makes rsyslog try
pkg-config first and fall back to mysql_config.
Thanks to Helmut Grohne for providing a base patch.
- 2019-08-20: core/tcpsrv: potential race on startup/shutdown
if the tcpsrv component is started and quickly terminated, it may hang
for a short period of time. Also a very small amount of memory is leaked
immediately before shutdown. While this leak is irrelevant in practice
(the OS cleans up the process anyways), it leads to CI failures. The hang,
however, can lead to longer than expected shutdown times for rsyslog.
The problem can be experienced via imtcp, imgssapi and imdiag (users
of affected core component).
Scheduled Release 8.1908.0 (aka 2019.08) 2019-08-20
- 2019-08-19: testbench: add test for $allowedSender functionality
- 2019-08-19: testbench: harden some tests against very slow CI machines
- 2019-08-16: testbench: make most tests use a port file and assign listen port 0
This makes the test much more robust against heavily loaded test systems.
- 2019-08-16: core/action: guard action.externalstate.file content against whitespace
remove trailing whitespace before checking the status string. This is
most important as a line usually ends with \n, which is considered
trailing whitespace. Accepting this increases usability.
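A minimal sketch of the idea, assuming a hypothetical helper state_matches and using "SUSPENDED" purely as a placeholder status string (the actual strings accepted by rsyslog are not stated in this entry):

```python
def state_matches(file_content: str, expected: str) -> bool:
    """Hypothetical helper: compare a state file's content to `expected`.

    Trailing whitespace, including the usual final newline, is removed
    before the comparison.
    """
    return file_content.rstrip() == expected
```

With this, a file written via a plain "echo" (which appends \n) is treated the same as one written without a trailing newline.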
- 2019-08-16: imtcp bugfix: multiple listenerPortFile parameter did not work
... because they were treated as module-global. If we had multiple imtcp
listeners with multiple port files, only the last filename was always used.
- 2019-08-16: testbench: improve testbench plumbing for gzip and fail cases
We have added new capabilities to the testbench plumbing to automatically
deal with gzip-compressed files. This also permits using the wait_seq_check
function for gzip tests as well. The known-timing-sensitive
gzipwr_large test now makes use of the new capabilities. This enables us
to more reliably detect when we can safely shut down the tested instance.
This commit also adds an ability to "abort" the full testbench run on
first test failure. This is especially useful during CI.
- 2019-08-13: testbench: add test for imuxsock legacy format
This was never tested. Ensures we don't accidentally break existing
- 2019-08-13: omelasticsearch bugfix: segfault on unknown retryRuleset
omelasticsearch does some "interesting tricks" for an output module.
This causes a segfault if the retryRuleset is not known.
The action module interface currently expects that all config errors
be detected during instance creation. Instead omelasticsearch defers
the retry ruleset check to a later state. The reason is that it wants
to support using the same ruleset name it is defined in - and this
is not yet available at action parsing.
We fix this by ensuring that any deleted instance is properly unlinked
from the instance list. One may argue the module interface should get
upgraded for such cases, but this is a longer-term approach.
- 2019-08-12: imptcp bugfix: port="0" parameter did not work as expected
when multiple interfaces and/or protocols could be bound, each of
them was assigned a different listener port. While this is
basically correct, it makes things unusable, especially as
listenPortFileName will only contain the port number used for
the latest listener.
This patch now follows the model of nsd_ptcp.c to assign only
the first port randomly and then use that port consistently.
- 2019-08-10: omelasticsearch bugfix: potential resource leak with "rebindinterval"
If the "rebindInterval" parameter was used, connections could be leaked. This
was especially the case with small intervals (such as "2"). This is fixed by
forcing libcurl to close the connection on rebind.
Thanks to Noriko Hosoi for providing the patch.
- 2019-08-10: imjournal bugfix: state file close with fsync() was incorrect
This led to fsync() not always being applied where expected.
Thanks to Jiri Vymazal for the patch.
- 2019-08-10: testbench: add addtl test for multithreading and HUP
- 2019-08-10: imptcp bugfix: received bytes counter improperly maintained
imptcp counts the number of bytes received. However, receives
happen on different worker threads. The access to the counter
was not synchronized, which can cause loss of updates. Also,
thread debuggers validly flag this as an error, which creates
problems under CI.
This commit fixes the situation via atomic operations and
falls back to mutex calls if they are not available.
Detected by LLVM thread sanitizer.
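The mutex fallback mentioned above can be sketched as below. This is not the imptcp code: Python has no atomic integer type, so only the lock-based variant is shown, and the class and function names are hypothetical.

```python
import threading


class ByteCounter:
    """Hypothetical received-bytes counter shared by worker threads."""

    def __init__(self):
        self._lock = threading.Lock()
        self._value = 0

    def add(self, nbytes):
        # the read-modify-write must not interleave with other workers
        with self._lock:
            self._value += nbytes

    def value(self):
        with self._lock:
            return self._value


def simulate(workers=4, receives=10000, nbytes=100):
    """Run several workers that each account for `receives` reads."""
    counter = ByteCounter()

    def worker():
        for _ in range(receives):
            counter.add(nbytes)

    threads = [threading.Thread(target=worker) for _ in range(workers)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return counter.value()
```

Because every update happens under the lock, simulate() always returns workers * receives * nbytes; without synchronization, updates could be lost.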
- 2019-08-07: testbench: add basic tests for omusrmsg
- 2019-08-05: omhttp bugfix: enable checkpath configuration parameter
The omhttp 'checkpath' option was not configurable in the past.
- add 'checkpath' to the cnfparamdescr table.
- fix issue with checkpath passing extra garbage characters in string.
- add 'checkpath' into unit test -
Thanks to Nelson Yen for the fix.
- 2019-08-05: testbench bugfix: some tests were executed when req module was missing
Specifically, if --enable-impstats was not given, some other tests failed.
- 2019-08-03: iminternal bugfix: race on termination
This could in theory lead to loss of shutdown messages, but was mostly a
cosmetic issue. We primarily fixed it to get TSAN-clean so that we can
utilize LLVM TSAN in CI.
- 2019-08-02: testbench: new test for omfile outchannel functionality
- 2019-08-02: core/janitor bugfix: properly maintain dynafile cache
When the janitor cleans out timed-out files, it does not
properly indicate the entry is gone. Especially when running
in async mode this can lead to use-after-free and thus
memory corruption or segfault.
see also
- 2019-08-01: omfile bugfix: race file when async writing is enabled
This seems to be a long-standing bug, introduced around 7 years ago.
It became more visible by properly closing files during HUP, which
was done in 8.1905.0 (and was another bugfix). Note that due to this
race a memory corruption can occur under bad circumstances. As such,
this may have also caused segfaults or system hangs (mutexes could
have been affected).
- 2019-08-01: testbench: additional tests for HUP
- 2019-07-31: imrelp bugfix: hang after HUP
termination condition was not properly checked; this led to
premature termination after patch 1c8712415b9 was applied.
It is open to debate if patch 1c8712415b9 changed the module
interface. Actually it looks like this was previously not
well thought out.
- 2019-07-24: mmdarwin: add new module
This is a contributed module. For details see doc.
Thanks to the Advens team for contributing it.
- 2019-07-23 iminternal bugfix: suppress mutex double-unlock
If there is a burst of log messages during a time when rsyslog is unable
to output (either during log rotation, an out-of-space condition, or
some other similar condition), rsyslog can SEGFAULT due to a mutex double-unlock.
- 2019-07-23 imtcp: enable listenPortFileName parameter
this parameter was added, but it had no effect as it was not
passed down to the driver layer. This has been fixed. That also
now enables us to use dynamically-assigned ports, which are
very useful for further testbench stabilization. Quite some
false positives occurred because the pre-selected port was
already in use again when rsyslog started.
- 2019-07-19 imtcp: enable listenPortFileName parameter
this parameter was added, but it had no effect as it was not
passed down to the driver layer. This has been fixed. That also
now enables us to use dynamically-assigned ports, which are
very useful for further testbench stabilization. Quite some
false positives occurred because the pre-selected port was
already in use again when rsyslog started.
- 2019-07-18 core/action: no error file written if act suspended on TX commit
when an action was already disabled at the time its commit was
attempted, no error file was written. Note that this state is highly
unlikely to happen. Most probably, it can only happen if parameter
action.externalstate.file is used.
Version 8.1907.0 (aka 2019.07) 2019-07-09
NOTE TO MAINTAINERS: libee has not been used by rsyslog for quite some time.
However, we never included this info into the changelog. So if you still
make rsyslog depend on libee (some do this), you should stop doing so now.
Libee is dead and no longer maintained or hosted by us. Old versions
can still be found at github for those in need.
GENERAL NOTE: during 8.1907 scheduled release timeframe we changed the ChangeLog
format to include the date a change went into master branch. This is to provide
an easy way to identify which changes went into the respective daily stable.
- 2019-07-05 imuxsock: support FreeBSD 12 out of the box
FreeBSD 12 uses RFC5424 on the system log socket by default. This
format is not supported by the special parser used in imuxsock.
Thus for FreeBSD the default needs to be changed to use the
regular parser chain by default. That is all this commit does.
- 2019-07-05 function bugfix: "ipv42num" misspelled as "ip42mum" (without "v")
To fix the issue but keep compatible with existing deployments
both function names are now supported.
- 2019-07-04 fix leading double space in rsyslog startup messages
see also
- omamqp1: port to latest api, add tests
This brings omamqp1 up-to-date with the latest qpid-proton-c
api version. This also adds a test for the plugin, to test
the basic functionality. The test requires the user to
install qdrouterd and the python qpid-proton library in order
to use the test program.
Thanks to Richard Megginson for the patch.
- omclickhouse bugfix: potential segfault on omclickhouse batchmode
segfault happened when the template did not contain the string
Thanks to github user wdjwxh for the fix.
- core bugfix: message duplication copied incorrect timestamp
MsgDup() placed timereported into timegenerated property, resulting
in invalid property values. Original timegenerated was lost. This
always occurred when a message needed to be duplicated. Most
importantly this is the case when queues are used.
- core bugfix: segfault on startup depending on queue file names
rsyslog will segfault on startup when a main queue file name has
been set and at least one other queue contains a file name. This
was caused by too-early freeing of config error-detection data
structures. It is a regression caused by commit e22fb205a3.
Thanks to Wade Simmons for reporting this issue and providing
detailed analysis. That greatly helped in fixing it quickly.
- core "bugfix": alignment issue
This was not a hard error on current platforms, but a
to-be-considered compiler warning regarding invalid alignment.
While it works well on current platforms, alignment issues may
turn into real issues in future platforms. So we try to fix them
if possible. As more than just a side effect, this resolves compiler
warnings even on current platforms.
This fix has some regression potential. If so, the problems
may occur during IP address resolution.
see also
- omfile bugfix: potential hang/segfault on HUP of dynafile action
when omfile was HUPed it did not sufficiently clear all dynafile
cache maintenance data structures. This usually led to misaddressing
and could result in various issues, including a hang of rsyslog
processing or segfaults. It could also have "no effect" by pure
luck of not hitting anything important. This actually seems to
have been the most frequent case.
This seems to be a long-standing bug, but the likelihood of its
appearance seems to have been increased by commit 62fbef7
introduced in 8.1905. Note: the commit itself has no regression,
just increases the likelihood to trigger the pre-existing bug.
special thanks to Alexandre Guédon for his help in analyzing
the issue - without him, we would probably still not know
what actually went wrong.
- imjournal bugfix: potential message duplication
When the journal was preloaded from a previously saved cursor, it was not advanced
to the next entry, so reading began from the last message, which was therefore
duplicated.
Thanks to Jiri Vymazal for the patch.
- rfc5424 parser bugfix: leading space sometimes lost
if structured data is present a leading space in MSG field is lost
- queue subsystem bugfix: oversize queue warning message shown as error
The warning message was emitted as an error message, which is misleading
and may also break some automated procedures.
- core bugfix: HUP did not work reliably on all platforms
most notably not on FreeBSD, maybe others. The reason was obviously
different handling of signals in respect to multiple threads.
- build system bugfix: missing files in distribution tarball
- testbench
* fixed "make distcheck" settings which were missing some modules
This led to incomplete "make distcheck" run; some errors were not
detected due to that.
* testbench framework: use ip tool instead of outdated ifconfig
The framework now first checks if "ip" is available and falls back
to "ifconfig" only if this is not the case.
Thanks to Michael Biebl for the suggestion.
Version 8.1905.0 (aka 2019.05) 2019-05-28
- templates: add datatype template option for JSON generation
The new "datatype" and "onEmpty" template options permit generating
non-string data rather easily. It works together with
jsonf formatting, which is what people should use nowadays.
- config processing: check disk queue file is unique
If the same name is specified for multiple queues, the queue files
will become corrupted. This commit adds a check during config parsing.
If duplicate names are detected the config parser errors out and the
related object is not created.
Note: this may look like a change of behavior to some users. However,
this never worked and it was pure luck that these users did not run
into big problems (e.g. DA queues were never going to disk at the
same time). So it is acceptable to error out in this hard error case.
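The kind of check described here can be sketched as follows. It is an illustration only; the function name and the (object name, queue file name) input format are assumptions, not the config parser's data structures.

```python
def find_duplicate_queue_files(queues):
    """Hypothetical check: report queue file names used more than once.

    `queues` is a list of (object_name, queue_file_name) pairs in config
    order. Returns (offending_object, file_name, first_owner) tuples; the
    offending object is the one that would not be created.
    """
    first_owner = {}
    errors = []
    for obj, fname in queues:
        if fname in first_owner:
            errors.append((obj, fname, first_owner[fname]))
        else:
            first_owner[fname] = obj
    return errors
```

Given two objects that both name "fwdq" as their queue file, the second one is reported and the first keeps the name.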
- global config: new parameters for ruleset queue defaults
* default.ruleset.queue.timeoutshutdown
* default.ruleset.queue.timeoutactioncompletion
* default.ruleset.queue.timeoutenqueue
* default.ruleset.queue.timeoutworkerthreadshutdown
- add capability to write full config file (-o cmdline option)
Introduces the capability to create an output config file that explodes
all "includes" into a single file. This provides a much better overview
of how exactly the configuration is crafted. That could often be a great
troubleshooting aid.
This commit also contains some slight not-really-related cleanup.
- queue subsystem: permit to disable "light delay mark"
New semantic: if lightDelayMark is 0, it is set to the max queue
size, effectively disabling the "light delay" functionality.
Thanks to Yury Bushmelev for mentioning issues related to light
delay mark and proposing the solution (which actually is what
this commit does).
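The new semantic amounts to a one-line rule. The helper name below is hypothetical and the sketch only restates the rule; it is not the queue code.

```python
def effective_light_delay_mark(light_delay_mark, max_queue_size):
    """Hypothetical helper: resolve the configured lightDelayMark.

    A configured value of 0 moves the mark to the maximum queue size,
    which effectively disables the "light delay" functionality.
    """
    if light_delay_mark == 0:
        return max_queue_size
    return light_delay_mark
```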
- queue subsystem: provide better user status messages
The queue subsystem now provides additional information messages which
may help a regular user to maintain system health. Most importantly,
DA queues now output when they persist queue data at end of run and
when they restart the queue based on persisted data.
- core: emit a warning message for ultra-large queue size definitions
We see error reports from users who have configured excessively large queues
and receive an OOM condition or other problems.
With that patch we generate a warning message if a queue is configured very
large. "Very large" is defined to be in excess of 500000 messages.
see also
- new global config parameter "internalmsg.severity"
permits specifying a severity filter for internal messages. Only
messages with this severity level or more severe are logged.
Originally this was done in rsyslog.conf as usual: you can filter
rsyslog messages on severity, just like any other. But with systemd,
we now emit primarily to the journal, and this is outside of rsyslog's
rule engine and so regular filters do not apply (at least in regard
to the journal). Logging to journal is good, because finally
folks begin to see the messages (traditional distro configs discard
them, for whatever reason).
This commit implements a global setting for a severity-based filter
for internal messages, before submitted to journal. So it's not 100%
of what rsyslog can do, but at least some way to customize.
see also
- config processing bugfix: error messages if config.enabled="off" is used
Using config.enabled="off" could lead to error messages on
"parameter xxx not known", which were invalid. They occurred
because the config handler expected them to be used, which
was not the case due to being disabled.
This commit fixes that issue.
- core portability bugfix: harden shutdown processing on FreeBSD
On FreeBSD, rsyslog does not always terminate immediately on SIGTERM.
Root cause seems to be that SIGTERM is delivered differently under
FreeBSD. This causes the main thread to not be awakened, and so it
takes until the next janitor interval to come back to life - which
can be far too long. Fixed this bug by explicitly awaking the main thread.
- imtcp bugfix: oversize message truncation causes log to be garbled
The actual problem is in the tcpserver component. However, the prime user
is imtcp and so users will likely experience this as imtcp problem.
When a too-long message is truncated, the byte after the truncation
position becomes the first byte of the next message. This will garble
the next message and in almost all cases render it syslog-noncompliant.
The same problem does NOT occur when the message is split.
This commit fixes the issue. It also includes a testbench fix.
Unfortunately the test for exactly this feature was not properly
crafted and so could not detect the problem.
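To illustrate the intended behavior, here is a minimal sketch of LF-delimited framing with truncation. It is not the tcpserver code; the function name and the choice to emit the truncated message at the next LF are assumptions. The key point is that bytes beyond the truncation position are discarded up to the frame delimiter instead of being treated as the start of the next message.

```python
def split_messages(stream: bytes, max_len: int):
    """Hypothetical framer: split on LF, truncating oversize messages."""
    messages = []
    current = bytearray()
    discarding = False  # True while skipping the tail of an oversize message
    for byte in stream:
        if byte == 0x0A:  # LF ends the current frame
            if current:
                messages.append(bytes(current))
            current = bytearray()
            discarding = False
        elif discarding:
            continue  # drop excess bytes, do not start a new message
        elif len(current) < max_len:
            current.append(byte)
        else:
            discarding = True  # message is full, drop this byte as well
    if current:
        messages.append(bytes(current))
    return messages
```

For the input b"AAAAAAAA\nBBB\n" with max_len=4, this produces b"AAAA" and b"BBB"; the second message is not polluted by the leftover A bytes.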
- omfile bugfix: FlushOnTXEnd does not work reliably with dynafiles
The flush was only done to the last dynafile in use at end of
transactions. Dynafiles that were also modified during the
transaction were not flushed.
Special thanks to Duy Nguyen for pointing us to the bug and
suggesting a solution.
This commit also contains a bit of cosmetic cleanup inside
the file stream class.
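The fix can be pictured as tracking every dynafile touched during a transaction and flushing all of them at its end. The sketch below is an assumption-laden illustration (class and attribute names are hypothetical), not the omfile or stream class code.

```python
class DynaFileWriter:
    """Hypothetical writer that buffers per-file data within a transaction."""

    def __init__(self):
        self.pending = {}   # file name -> data buffered in this transaction
        self.flushed = {}   # file name -> data already flushed
        self.dirty = set()  # files modified during the current transaction

    def write(self, name, data):
        self.pending.setdefault(name, []).append(data)
        self.dirty.add(name)

    def end_transaction(self):
        """Flush every file modified in the transaction, not just the last."""
        flushed_now = sorted(self.dirty)
        for name in flushed_now:
            self.flushed.setdefault(name, []).extend(self.pending.pop(name))
        self.dirty.clear()
        return flushed_now
```

If a transaction writes to a.log, then b.log, then a.log again, end_transaction() flushes both files rather than only the one used last.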
- lmcry_gcry build bugfix: was not always properly built
Due to an invalid definition in the build system this seems to have not
been correctly built on at least some platforms (but it worked on
others as it passed CI testing). This has now been corrected.
Thanks to Remi Locherer for the patch.
- dnscache bugfix: very unlikely memory leak
This fixes a memory leak that can only occur under OOM conditions.
Detected by Coverity Scan, CID 203717
- testbench bugfix: wrong parameter check in (tcpflood())
When the first parameter is check_only, the tcpflood function shall not
abort the test itself (The fail is intended if this option is set).
closes issue #3625
- testbench bugfix: imfile-symlink test failed w/ parallel test run
The test sometimes failed. It used a symlink to a hardcoded name
rsyslog-link.*.log. This symlink was created but then disappeared.
The reason is that upon (every!) test exit, rsyslog-link.*.log is
deleted. So a parallel test running the exit procedure just at the
"right" time can remove that file.
The bug is that the file name should be created using the test's
dynamic name. This is done now.
Version 8.1904.0 (aka 2019.04) 2019-04-16
- omfile: provide more helpful error message on file write errors
now contains actual file name plus a link to probable causes for this type
of problem
- imfile: emit error on startup if no working directory is set
When the work directory has not been set or is invalid, state files
are created in the root of the file system. This is neither expected
nor desirable. We now complain loudly about this fact. For backwards
compatibility reasons, we still need to support running imfile in
this case.
- dnscache: add global parameter dnscache.default.ttl
This permits controlling the default TTL for cache entries. If set
to 0, the DNS cache is effectively disabled.
- omelasticsearch: new parameter rebindinterval
Thanks to Richard Megginson for the patch.
- omelasticsearch: new parameter skipverifyhost
Add ability to specify the libcurl CURLOPT_SSL_VERIFYHOST
option to skip verification of the hostname in the peer cert.
WARNING: This option is insecure, and should only be used
for testing. The default value is off, meaning, the hostname
will be verified by default.
Thanks to Richard Megginson for the patch.
- omelasticsearch: set rawmsg to data from original request
Previously, when constructing the message to submit for a retry
for an original request, if the original request did not contain
the field `message`, the system property `rawmsg` was set to
the entire metadata + data from the original request. This was
causing problems with Elasticsearch. This patch changes
the code so that the `rawmsg` will be set to only the data part
of the original request if there is no `message` field.
Thanks to Richard Megginson for the patch.
- mmkubernetes - support for metadata cache expiration
New parameters for mmkubernetes (module and action):
* `cacheexpireinterval`
If `cacheexpireinterval` is -1, then do not check for cache expiration.
If `cacheexpireinterval` is 0, then check for cache expiration.
If `cacheexpireinterval` is greater than 0, check for cache expiration
if the last time we checked was more than this many seconds ago.
* `cacheentryttl` - maximum age in seconds for cache entries
New statistics counters:
* `podcachenumentries` - the number of entries in the pod metadata cache.
* `namespacecachenumentries` - the number of entries in the namespace
metadata cache.
* `podcachehits` - the number of times a requested entry was found in the
pod metadata cache.
* `namespacecachehits` - the number of times a requested entry was found
in the namespace metadata cache.
* `podcachemisses` - the number of times a requested entry was not found
in the pod metadata cache, and had to be requested from Kubernetes.
* `namespacecachemisses` - the number of times a requested entry was not
found in the namespace metadata cache, and had to be requested from
Kubernetes.
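The three `cacheexpireinterval` cases and the role of `cacheentryttl` can be sketched as follows. This is an illustration under assumptions, not the module's code: the function names and the cache layout are hypothetical, and any negative interval is treated like -1.

```python
def should_check_expiration(cache_expire_interval, last_check, now):
    """Decide whether to scan the cache for expired entries (times in seconds)."""
    if cache_expire_interval < 0:       # -1: never check (assumption: any negative)
        return False
    if cache_expire_interval == 0:      # 0: check every time
        return True
    # > 0: check only if the last check is more than this many seconds ago
    return now - last_check > cache_expire_interval


def expire_entries(cache, cache_entry_ttl, now):
    """Drop entries older than cache_entry_ttl; cache maps key -> (value, created)."""
    stale = [key for key, (_, created) in cache.items()
             if now - created > cache_entry_ttl]
    for key in stale:
        del cache[key]
    return len(stale)
```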
- imdocker: new contributed module
imdocker will get (docker) container logs from a host as well as filling
out some basic container metadata such as id, name, image, labels.
Thanks to Nelson Yen for the contribution.
- mmtaghostname: new contributed module
This module allows forcing the hostname after parsing to the localhostname of
rsyslog and/or add a tag to messages received from input modules without
tag parameter.
Thanks to Philippe Duveau for the contribution.
- imbatchreport: new contributed input module
This input module manages batch reports: a complete file is handled as a single log.
Thanks to Philippe Duveau for the contribution.
- imtuxedoulog: new contributed input module for Tuxedo ULOG
Thanks to Philippe Duveau for the contribution.
- openssl network driver: added support for setting openssl configcommands
We use the gnutlsPriorityString setting variable to pass
configuration commands to openssl.
- omkafka: drop messages rejected due to being too large
Drop messages that were rejected due to
Thanks to Nelson Yen for the patch
- core/action: implement capability to resume/suspend via external file
It has been reported that some TCP receivers exist that accept syslog TCP
messages at any rate, even if they do not manage to actually process them.
Instead, they silently drop the message. This behavior is not configurable.
All in all, it can lead to considerable message loss.
To support such use cases, we need to provide an ability to externally
trigger action suspension and resumption.
We do this via a configured file which contains the status of the action.
Rsyslog periodically reads the file and if it contains "SUSPEND", it
suspends the action (and likewise for resume).
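A minimal sketch of the state-file check, assuming only what is described above. The function name is hypothetical, and treating a missing or unreadable file as "not suspended" is an assumption made for illustration.

```python
from pathlib import Path


def action_is_suspended(state_file):
    """Return True if the external state file asks for suspension.

    Assumption: content starting with "SUSPEND" suspends the action; a
    missing or unreadable file, or any other content, lets it run.
    """
    try:
        content = Path(state_file).read_text()
    except OSError:
        return False
    return content.strip().startswith("SUSPEND")
```

An external monitor would write the file when the receiver is overloaded and rewrite it once the receiver has recovered. The periodic caller then suspends or resumes the action based on the return value.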
- improg bugfix: some memory leaks
Thanks to Philippe Duveau for the contribution.
- msg object bugfix: regression from 1255a67
- pmnormalize: fix memory leaks, improve tests
This patch fixes a set of problems plus provides more and enhanced
tests for the module.
The most important problem was a memory leak that occurred when a message
could not be parsed at all. For each message that could not be parsed,
memory of at least the size of the message is leaked. Depending on
traffic pattern this can quickly lead to OOM. Note, however, that
this leak was never reported - it was discovered as part of code
- omkafka bugfix: build failure due to inconsistent type
fails depending on platform and settings; was somehow undetected by CI
- imjournal bugfix: potential segfault on some API failure returns
In one case it was possible that the free()'d value of the journal
cursor was not reset, causing a double-free and crash later on.
- openssl subsystem bugfix: better error handling
Handling of SSL_ERROR_SYSCALL has been hardened.
Handling for SSL_Shutdown errors has been corrected.
Also fixed SSL Shutdown handling in tcpflood (openssl code).
If SSL_Shutdown returns error, we call SSL_read as described in
the documentation to do a bidirectional shutdown.
- imjournal bugfix: Fetching journal cursor only for valid journal
The sd_journal_get_cursor() got called regardless of previous
retcodes from other journal calls which flooded logs with journald
errors. Now skipping the call in case of previous journal call
non-zero result. Fixed success checking of get_cursor() call
to eliminate double-free possibility.
Also, making WorkAroundJournalBug true by default, as there were no
confirmed performance regressions for a quite long time.
Thanks to Jiri Vymazal for the patch.
- omamqp: fix build errors
They occur on some, newer, platforms. We do not really fix them, but rather
make the compiler ignore them. This is not really good, but the module is
contributed and so that's for now the best thing we can do.
- testbench: change to use a larger connection count again
not sure why it was reduced, maybe related to
also, modernize this and another test
- tcpflood bugfix: make soft connection limit work again
It looks like the soft limit became defunct when tcpflood was enhanced to
request more open file handles from OS.
- testbench bugfix: omhttp tests were not run during "make distcheck"
- build system bugfix: omhttp test files were not included in dist tarball
Thanks to Thomas D. (whissi) for the patch.
Version 8.1903.0 (aka 2019.03) 2019-03-05
- omrabbitmq: add features (RabbitMQ HA management, templatize routing_key,
populate amqp message headers, delivery_mode and expiration parameters)
- improg: create input module to use external program as input data
- imtuxedoulog: create input module to consume Tuxedo ULOG files
- omhttp: rewritten with large feature enhancements
Many thanks to Gabriel Intrator for this work. Gabriel also has adopted the
module and plans to support it in the future.
- pmdb2diag: create parser module for DB2 diag logs
- TLS subsystem: add support for certless communication
both openssl and GnuTLS drivers have been updated to support certless
communications. In this case e.g. Diffie-Hellman is used.
NOTE: this is an insecure mode, as it does NOT guard against
man-in-the-middle attacks. We implemented it because of the large demand,
not because we think it makes sense to use this mode. We strongly recommend
against it.
- imrelp/omrelp: add capability to specify tlslib for librelp
- build system: introduce a better way to handle compiler pragmas
we now use macros and _Pragma(). This requires less code lines and is more
- omkafka: add support for dynamic keys
A new configuration property "dynaKey" is added that, when "on", changes the
value of property "key" to a template name instead of a constant value.
This is similar in approach to the DynaTopic implementation.
Thanks to Ludo Brands for the patch.
- AIX port: add AIX linking extensions on many plugins and contributions to
allow building them on this OS.
- template: add Time-Related System Property $wday which is the day of week
This allows week-based rotation of logs, as AIX does.
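The idea behind week-based rotation can be sketched like this: the weekday number becomes part of the file name, so each file is reused one week later. The function name, the file name pattern, and the numbering (0 for Sunday) are assumptions for illustration.

```python
from datetime import date


def weekly_log_name(day, prefix="messages"):
    """Build a per-weekday file name, so a file is reused one week later.

    Assumption: weekday numbering starts at 0 for Sunday.
    """
    wday = day.isoweekday() % 7     # isoweekday(): Monday=1 .. Sunday=7
    return f"{prefix}.{wday}.log"
```

For example, `weekly_log_name(date(2019, 3, 5))` (a Tuesday) yields `messages.2.log`.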
- ksi subsystem: add high availability mode
Note: ksi subsystem now REQUIRES libksi 3.19.0 or above
Thanks to Allan Park for the patch.
- imfile bugfix: file reader could get stuck
State file handling was invalid. When a file was moved and re-created
rsyslog could use the file_id of the new file to write the old file's
state file. This could make the file reader stuck until it reached the
previous offset. Depending on file sizes this could never happen AND
would cause large message loss. This situation was timing dependent
(a race) and most frequently occurred under log rotation. In polling
mode the bug was less likely, but could also occur.
- imfile bugfix: potential segfault when working with directories or symlinks
see also
Thanks to Nelson Yen for the patch
- omhttp bugfix: header items could not have spaces in them
Thanks to Nathan Brown for the patch.
- core bugfix: enlarged msg offset types for bigger structured messages
using a large enough (dozens of kBs) structured message
it is possible to overflow the signed short type which leads
to rsyslog crash. (applies to msg.c, the message object)
Thanks to Jiri Vymazal for the patch.
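The overflow itself is easy to reproduce outside of rsyslog. The sketch below only demonstrates what a 16-bit signed field holds after a large offset is stored in it; the helper names are hypothetical and unrelated to msg.c.

```python
import ctypes


def as_signed_short(offset):
    """Value a 16-bit signed field holds after storing `offset`."""
    return ctypes.c_int16(offset).value   # ctypes truncates without checking


def fits_signed_short(offset):
    """True if `offset` can be represented in a 16-bit signed field."""
    return -32768 <= offset <= 32767
```

An offset of 40000 into a message of a few dozen kB comes back as -25536, which is the kind of value that breaks later pointer arithmetic.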
- core bugfix for AIX: timeval2syslogTime now handles the bias according to
local time zone as documented by IBM.
- imfile feature: add configuration parameter to force parsing of read logs
- imczmq bugfix:
Release zframe following read from socket
Make the 0MQ frame pointer local to the receive loop and destroy the
frame as soon as the contents have been copied. This avoids:
* a memory leak should the receive loop execute more than once
* referencing an un-initialized value during cleanup (finalize_it)
Thanks to Mark Gillott for the patch.
- omclickhouse bugfix: default template unusable
STDSQL option added to the default template used in output module of clickhouse
Thanks to gagandeep trivedi for the patch.
- omclickhouse "bugfix": work-around failed error detection
omclickhouse uses a questionable method to check if a request generated
an error. We have seen the method fail when we slightly upgraded clickhouse
server in CI testing.
This commit makes the method a bit more reliable without really fixing it.
But it's at least a short-term solution.
This should be changed to a proper status check. I assume such is possible.
see also
- imptcp bugfix: overly long socket bind path can lead to segfault
if the `path` input parameter is overly long (e.g. more than 108
characters on some platforms) a non-terminated string is generated
and then passed to OS API. This can lead to all sorts of problems
including segfault.
We detected that based on gcc-8 warnings during code inspection.
No real-world problem case is known.
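The kind of length check that avoids a non-terminated path can be sketched as follows. The constant 108 is taken from the example above and is platform dependent; the function name is hypothetical and this is not the imptcp code.

```python
SUN_PATH_SIZE = 108  # assumption: buffer size on some platforms, others differ


def checked_socket_path(path, size=SUN_PATH_SIZE):
    """Return the path as NUL-terminated bytes, or raise if it cannot fit."""
    raw = path.encode()
    if len(raw) + 1 > size:         # +1 for the terminating NUL byte
        raise ValueError(
            f"socket path too long: {len(raw)} bytes, max {size - 1}")
    return raw + b"\0"
```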
- ommongodb bugfix: improper stpncpy() calls
- testbench tcpflood: add new transport option relp-tls
Tcpflood can now send messages via relp with tls support.
- testbench: mmdb valgrind tests failed if srcdir env was not set
- testbench: add omclickhouse tests
- testbench bugfix: some long-running tests had too low runtime allowance
- testbench bugfix: daqueue-dirty-shutdown test
This test occasionally failed with left-over spool files. As far as we
have analyzed, this is due to the use of an invalid shutdown timeout
(very short) in the second phase of the test. It looks like this is
actually a copy&paste error from phase one. Behavior of rsyslog was
correct, but the test itself created a false positive.
We have corrected the timeout now and also modernized the test
a bit.
- testbench bugfix: some omhttp tests had compatibility issues with Python 3
Thanks to Thomas D. (whissi) for the patch.
Version 8.1901.0 (aka 2019.01) 2019-01-22
- new version scheme: 8.yymm.0 - version now depends on release date
see also
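The mapping from release date to version string can be written down in a few lines. This sketch only mirrors the 8.yymm.0 pattern stated above; the function name is hypothetical.

```python
from datetime import date


def version_for_release(release_date, major=8, patch=0):
    """Format a version as major.yymm.patch from the release date."""
    return f"{major}.{release_date:%y%m}.{patch}"
```

For example, the release of 2019-04-16 maps to `8.1904.0`.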
- queue: add support for minimum batch sizes
- change queue.timeoutshutdown default to 10 for action queues
The previous default of 0 gave action queues no real chance to
shutdown - at the time they were applied, they were usually already
expired (computing the absolute timeout took a small amount of time).
So we change this now to 10ms, which still is very quick but gives
the queue at least a chance to shutdown itself. That in turn
smooths the whole shutdown process.
If a very large number of action queues is used this may lead
to a very slightly longer shutdown time, albeit this is very
- omclickhouse: new output module for clickhouse
This output module adds the possibility to send
INSERT queries to a Clickhouse database. See doc for details.
The messages are sent via a REST interface.
This commit also adds support of the testbench
for clickhouse tests, as well as various tests.
- omkafka: Add ability to dump librdkafka statistics to a file
Use statsFile to specify statistics output file; also requires
setting confparam to a non-zero value.
Thanks to github user pcullen65 for the contribution.
- tls(ossl/gtls): add new Option "StreamDriver.PermitExpiredCerts"
The new Option can have one of the following values:
on = Expired certificates are allowed
off = Expired certificates are not allowed
warn = Expired certificates are allowed but warning will be logged (Default)
Includes necessary tests to validate new code.
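The three values can be read as a small decision table. The sketch below illustrates that table only; the function name and the log text are hypothetical, and the real drivers do this inside the TLS handshake checks.

```python
def handle_expired_cert(mode, log=print):
    """Return True if a connection with an expired certificate may proceed."""
    if mode == "on":
        return True                 # expired certificates are allowed
    if mode == "off":
        return False                # expired certificates are rejected
    if mode == "warn":              # the default: allow, but log a warning
        log("warning: peer certificate has expired")
        return True
    raise ValueError(f"invalid mode: {mode!r}")
```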
- action: add "action.resumeIntervalMax" parameter
This parameter permits to set an upper limit on the growth of the
retry interval. This is most useful when a target has extended
outage, in which case retries can happen very infrequently.
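The effect of an upper limit can be sketched as below. For illustration this assumes the retry interval grows linearly with the number of failed attempts; rsyslog's actual growth rule may differ, and the function name is hypothetical.

```python
def next_retry_interval(base_interval, failed_attempts, interval_max):
    """Retry interval in seconds, capped at interval_max.

    Assumption: linear growth with the number of failed attempts.
    """
    grown = base_interval * (failed_attempts + 1)
    return min(grown, interval_max)
```

Without the cap, 100 failed attempts at a base of 30 seconds would mean waiting 3030 seconds; with `interval_max=1800` the wait never exceeds 30 minutes.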
- report child process exit status according to config parameter
Add new global setting 'reportChildProcessExits' with possible values
'none|errors|all' (default 'errors'), and new global function
'glblReportChildProcessExit' to report the exit status of a child
process according to the setting.
Invoke the report function whenever rsyslog reaps a child, namely in:
- rsyslogd.c (SIGCHLD signal handler)
- omprog
- mmexternal
- srutils.c (execProg function, invoked from stream.c and omshell)
Remove redundant "reaped by main loop" info log in omprog.
Promote debug message in mmexternal indicating that the child has
terminated prematurely to a warning log, like in omprog.
Thanks to Joan Sala for contributing this.
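The three settings amount to a simple filter on the child's exit status. A minimal sketch follows, with a hypothetical function name, and simplified to plain exit codes (termination by signal is not modeled here).

```python
def should_report_child_exit(setting, exit_code):
    """Decide whether a reaped child's exit status is reported.

    setting is 'none', 'errors' (the default) or 'all'; exit_code 0 is success.
    """
    if setting == "none":
        return False
    if setting == "all":
        return True
    if setting == "errors":
        return exit_code != 0
    raise ValueError(f"invalid setting: {setting!r}")
```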
- build system: add capability to turn off helgrind tests
we add configure switch --enable-helgrind. We need to turn helgrind off
when we use clang coverage instrumentation. The instrumentation injects
mt-unsafe counter updates which we seem to be unable to suppress.
Note: for gcc this was possible, because they all occurred in a utility
function. For clang, they are inlined so we get many -and changing- violations.
see also
- imzmq3/omzmq3: remove modules
according to @brianknox (their author) these modules are outdated:
They are replaced by imczmq/omczmq and are no longer maintained. We put a
deprecation notice into the modules a year ago, and now it finally is time
to remove them. They do NOT build in any case, except if very old versions
of the 0mq ecosystem are used.
see also
- bugfix omusrmsg: don't overwrite previously set _PATH_DEV value
Since commit 56ace5e418d149af27586c7c1264fccfbc6badf1, omusrmsg was broken
because "memcpy()" is not a suitable substitute for "strncat()" in this
context: it actually replaces the previously added content.
Thanks to Thomas D. (whissi) for the patch.
- bugfix ossl TLS driver: fixed authentication mode anon
authentication mode "anon" was not properly supported in ossl TLS
driver; if selected, did still require a full certificate.
- bugfix tls subsystem: Receiver hang due to insufficient TLS buffersize.
gtls and ossl driver used a default buffersize of 8KiB to store received
TLS packets. When tls read returned more than buffersize, the additional
buffer was not processed until new data arrived on the socket again.
TLS RFCs require up to 16KiB+1 buffer size for a single TLS record.
- bugfix pmpanngfw: build issue due to non-matching data types in comparison
Thanks to Narasimha Datta for the patch.
- omfile: work-around for "Bad file descriptor" errors
This works-around an issue we can reproduce e.g. via the test. Here, omfile gets a write
error with reason EBADF. So far, I was not able to see an actual
coding error. However I traced this down to a multithreaded race
on open and close calls. I am very surprised to see this type
of issue, as I think the kernel guarantees that it does not happen.
Here is what I see in strace -f:
openssl accepts a socket:
[pid 66386] accept(4, {sa_family=AF_INET, sin_port=htons(59054), sin_addr=inet_addr("")}, [128->16]) = 10
then, it works a bit with that socket, detects a failure and shuts it down. Sometimes, at the very same instant omfile on another thread tries to open an output file. Then the following happens:
[pid 66386] close(10) = 0
[pid 66389] openat(AT_FDCWD, "./rstb_356100_31fa9d20.out.log", O_WRONLY|O_CREAT|O_NOCTTY|O_APPEND|O_CLOEXEC, 0644 <unfinished ...>
[pid 66386] close(10 <unfinished ...>
[pid 66389] <... openat resumed> ) = 10
[pid 66386] <... close resumed> ) = 0
[pid 66386] poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}], 2, -1 <unfinished ...>
[pid 66389] write(2, "file './rstb_356100_31fa9d20.out"..., 66file './rstb_356100_31fa9d20.out.log' opened as #10 with mode 420
) = 66
[pid 66389] ioctl(10, TCGETS, 0x7f59aeb89540) = -1 EBADF (Bad file descriptor)
This is **literally** from the log, without deleting or reordering
lines. I read it so that there is a race between `open` and `close`
where fd 10 is reused, but seemingly closed - resulting in the `EBADF`.
While it smells like a kernel issue, it may be a well-hidden program
bug - if so, one I currently do not find. HOWEVER, this commit
works around the issue by reopening the file when we receive EBADF.
That's the best thing to do in that case, especially if it really is
a kernel bug. Data loss should not occur, as the previous writes
succeeded in that case.
The drawback of this work-around is that it only "fixes" omfile. In
theory every part of rsyslog can be affected by this issue (queue
files, for example). So this is not to be considered a final solution
of the root issues (but a big step forward for known problem cases).
see also
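The reopen-on-EBADF idea described in this entry can be sketched as follows. This is an illustration of the technique, not the omfile code; the class name is hypothetical and the sketch retries exactly once.

```python
import errno
import os


class ReopeningWriter:
    """Append-only writer that reopens its file once if a write hits EBADF."""

    def __init__(self, path):
        self.path = path
        self.fd = self._open()

    def _open(self):
        return os.open(self.path,
                       os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o644)

    def write(self, data):
        try:
            return os.write(self.fd, data)
        except OSError as err:
            if err.errno != errno.EBADF:
                raise                   # other errors are not handled here
            self.fd = self._open()      # descriptor went bad: reopen the file
            return os.write(self.fd, data)  # retry the same data once
```

Because the file is opened in append mode, the retried write lands after the data written before the descriptor became invalid.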
- omhttp bugfix: segfault due to NULL pointer access
many thanks to Gerardo Puerta for the patch
- omkafka bugfix: segfault when running in debug mode using dynamic topics
This should only affect test environments, as debug mode is not
suitable for production (and really does not work when running for
extended period of time).
- testbench bugfix: TLS syslog tests for "anon" mode were broken
They did not detect when "anon" mode was not properly supported by the
- test tooling bugfix: correct tcpflood error messages
it looks like tcpflood's openssl code stems partly back to tcpdump, at
least the error messages indicate this. Thankfully tcpdump is BSD licensed,
so this should not be a big issue. Nevertheless, the incorrect program name
in error messages needs to be corrected, and this is what this commit does.
- tcpflood bugfix: tool did not terminate on certificate error
when tcpflood detected a certificate error, it reported an
error message but did not abort. This could make errors undetectable
during CI runs.
also fix tests which did not properly provide CA cert (which then
caused the error).
- testbench: fix issues with journal testing
The configure/Makefile checks were not correct, leading to the
build of journal components when not necessary, even if not
supported by the platform. This led to invalid build and test
- testbench: add tests for "certless" tcp/tls
This adds a test to ensure that a client without certificate can
connect to a server with certificates. So it is not exactly
The prime intent of this test is to match config suggestions given
by log hosting companies (like loggly) and so ensure that we do
not accidentally break them. This is especially important as the
capability for certless clients was not properly documented and
was also forgotten by the rsyslog team.
see also
- CI
- further improve testbench robustness against slow machines
- testbench: add tests for parser.EscapeControlCharacterTab global option
- testbench: Updated all expired x.509 certs
- fix a potential race in CI debug mode which can lead to segfault
only when instructed to do so, rsyslog may emit a "final worker thread shutdown"
message. This is usually only enabled in CI and/or other testing. If enabled,
the code has a race on the pWti object which can lead to segfault or abort.
Only systems which explicitly enable this CI aid are affected (running in debug
mode alone is NOT sufficient).
This is a regression from 8.40.0.
- testbench: improve robustness against slow CI, gen. improvements
* add an overall timeout value for tests - if running longer,
testbench framework tries to FAIL and end test. Note that
this is not bullet-proof and not intended to be so.
* guard against hanging rsyslog instances via a new imdiag
feature to abort after n number of seconds; among others,
this guards us against timeout-cancel in CI, which is always
pretty hard to diagnose - now we see these errors in test-suite.log
* fix a bug in tcp zip test, which actually did not use zip mode
* experimentally add debug output to better understand
shutdown_when_empty operation; goal is to improve understanding
and then remove that code again.
* improve shutdown predicate for a couple of tests
* made travis run make check with two parallel threads, for which
we seem ready now. Nevertheless, it's still experimental and we
may roll this back if required.
* testbench: disable omprog tests that hang under coverage instrumentation
When gcc coverage instrumentation is used, these tests hang. They work
with clang coverage instrumentation, but for some reason clang does not
give us full reports (at least not when used together with
We have tried to troubleshoot this for hours and hours - now is time to
give up until someone comes up with a bright idea. So we make the affected
tests skip themselves when they detect gcc with coverage instrumentation.
* testbench: add new test for imfile and logrotate in copytruncate mode
* testbench: add new omkafka tests for dynamic topics
* travis: no longer run 0mq tests
This often causes trouble when the packages are rebuilt by the 0mq project
(which happens frequently). We already do intensive testing of the 0mq
components in the buildbot infrastructure, where we use dedicated containers.
This is reliable, as the containers already contain everything needed and so
do not need to reach out to the 0mq package archives. In the light of this,
let's save us the trouble of Travis failures. The only downside is that
users cannot pre-test with their local Travis when modifying 0mq modules,
which is quite acceptable.
Version 8.40.0 [v8-stable] 2018-12-11
- mmkubernetes: add support for sslpartialchain for openssl
If `"on"`, this will set the OpenSSL certificate store flag
`X509_V_FLAG_PARTIAL_CHAIN`. This will allow you to verify the Kubernetes API
server cert with only an intermediate CA cert in your local trust store, rather
than having to have the entire intermediate CA + root CA chain in your local
trust store. See also `man s_client` - the `-partial_chain` flag.
This option is only available if rsyslog was built with support for OpenSSL and
only if the `X509_V_FLAG_PARTIAL_CHAIN` flag is available. If you attempt to
set this parameter on other platforms, you will get an `INFO` level log
message. This was done so that you could use the same configuration on
different platforms.
- openssl driver: improved error messages
also fixes misleading wording of some error messages
- imfile: disable file vs directory error on symlinks
The file/directory node-object alignment now ignores symlinks. Previously
it reported error on each directory symlink spamming user error logs.
Thanks to Jiri Vymazal for the patch.
- cleanup: remove no longer needed --enable-rtinst code
configure option --enable-rtinst has been gone for a while, but there was
still some supporting code left. It required careful analysis what could
actually be removed. This is now done and the code fully cleaned up. This
greatly simplifies the code and also makes it more readable for
developers who are not deep inside the rsyslog code base.
As a positive side effect, we could eliminate mutex calls inside
the debug system. This means we are more likely to reproduce race
conditions in runs with debugging enabled.
- bugfix imfile: rsyslog re-sends data for files larger than 2GiB
This occurs always if and only if
- reopenOnTruncate="on" is set
- file grows over 2GiB in size
Then, the data is continuously re-sent until the file becomes smaller
than 2GiB (due to truncation) or is deleted.
It is a regression introduced by 2d15cbc8221e385c5aa821e4a851d7498ed81850
- config: fix segfault in backticks "echo" expansion of undefined variables
The bug was introduced in commit abe0434 (config: enhance backticks "echo"
capability). The getenv() result passed to strlen() and es_addBuf() may be
NULL if the environment variable does not exist, resulting in a segfault.
Thanks to Julien Thomas for the patch.
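The failure mode is a lookup result that may be absent being used as if it were always a string. A Python analogue of the guarded lookup is sketched below; the function name is hypothetical, and expanding an undefined variable to the empty string is an assumption about the intended behavior.

```python
import os


def expand_echo_var(name, environ=os.environ):
    """Expand $NAME for a backticks "echo"; undefined variables become ""."""
    value = environ.get(name)   # None when unset, like getenv() returning NULL
    if value is None:
        return ""               # guard before any length or append operation
    return value
```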
- bugfix imsolaris: message timestamps on Solaris
On Solaris messages don't have their time directly in the raw body but in
a separate log_ctl structure which is currently not used.
When message is logged and processed, rsyslogd gives it current time because
it ignores the actual one. That means that old messages (e.g. from system
reboot) get timestamp of processing instead of the reboot itself (it is
not a problem for live logging where now is used anyway).
Thanks to Jakub Kulik for the patch.
- bugfix build system: "make distcheck" did not work for mysql tests
- bugfix build system: don't link liblogging-stdlog when available but not enabled
When liblogging-stdlog was available but configure option "--disable-liblogging-stdlog"
was set, rsyslog was still linking against liblogging-stdlog.
This commit will ensure that rsyslog will only link against liblogging-stdlog when
"--enable-liblogging-stdlog" was set.
see also:
- bugfix RainerScript: abs() could return negative value, now in range [0..max]
Thanks to Harshvardhan Shrivastava for providing the patch
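The changelog does not state how the negative result came about. One well-known way for an abs() on fixed-width integers to return a negative value is two's complement wraparound on the minimum value, which the sketch below reproduces for 64-bit integers. The clamped variant is an assumption about what "in range [0..max]" could mean, and both function names are hypothetical.

```python
import ctypes

INT64_MIN = -2**63
INT64_MAX = 2**63 - 1


def wrapping_abs64(value):
    """abs() as naive 64-bit code computes it (two's complement wraparound)."""
    return ctypes.c_int64(-value if value < 0 else value).value


def clamped_abs64(value):
    """abs() that stays within [0..INT64_MAX] (assumed fix, for illustration)."""
    if value == INT64_MIN:
        return INT64_MAX        # the true result 2**63 is not representable
    return -value if value < 0 else value
```

Here `wrapping_abs64(INT64_MIN)` returns `INT64_MIN` again, because 2**63 does not fit into a signed 64-bit integer.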
- bugfix debug output: date property options output wrongly
inside debug logging, the date property options were not all
properly converted into strings. Some of the newer ones were
invalidly flagged as "UNKNOWN". This is primarily a cosmetic
problem and has no effect other than puzzling folks looking at
the debug log.
- bugfix omhttp: did not compile on some platforms
- CI
* made mysql-based tests (ommysql and omlibdbi) work inside containers
* bugfix testbench: do not execute libgcrypt tests if disabled
* testbench: grep failed when string starting with "-" was used
The search term was mistakenly interpreted as an option.
* testbench: support auto-start/-stop of mysqld
This is required to run mysql/mariadb tests inside containers.
* improve bash coding style and fix some bugs in testbench
- duplicate init call was not detected due to typo
- queue-persists test did not work correctly
- some general testbench framework improvements
issues found by shellcheck, fixes brought up other work to do
* testbench: improve journal tests and testbench framework
improving both style and reliability of journal tests; along that way
also improve testbench framework:
- do cleanup on error_exit and skip
- explicit skip handler (vs exit 77)
this permits us to do better cleanup
- new testbench functions for journal-specific functionality
reduce code duplication and make things easier to maintain in the
- provide a way to do valgrind and non-valgrind tests with a single
test file
see also
* testbench: improve framework, harden rscript http test
- the test now tries to detect unavailable http server, which
should not result in test failure
- equivalent valgrind test changed to new method, removing code
- testbench supports
* new exit code 177, which indicates environment error, makes
test SKIP but still reports the failure
* new exitcode, logurl stats reporting fields
* report buildbot builder (if provided) in failure report
* testbench: add test for mmjsonparse with unparsable data
* testbench: make es-bulk-retry test more reliable
We now no longer depend on a fixed 'sleep' command but rather
check the output file for what we expect. This is much more
robust on slow test machines.
We believe this closes the below-mentioned issue. If not, it
should be re-opened.
* testbench: suppress valgrind error caused by pthreads lib
finally I give up and honestly think this is a problem in pthreads and
not in rsyslog code. See issue below and previous commit for more
Unfortunately, this will also mask off cases where we do not properly
call pthread_join() albeit it is needed. Nevertheless, this bug is
causing so much CI grief that it is definitely worth it.
* testbench: made a couple of (unnamed due to too many) tests more robust
against slow (CI) machines
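The approach described for the es-bulk-retry test above, checking the output file for the expected content instead of sleeping for a fixed time, can be sketched like this. The testbench itself is written in shell; the function name and parameters below are hypothetical.

```python
import time


def wait_for_content(path, expected, timeout=30.0, poll_interval=0.1):
    """Poll `path` until it contains `expected`; return False on timeout."""
    deadline = time.monotonic() + timeout
    while True:
        try:
            with open(path) as f:
                if expected in f.read():
                    return True
        except FileNotFoundError:
            pass                    # file not created yet, keep waiting
        if time.monotonic() >= deadline:
            return False
        time.sleep(poll_interval)
```

On a fast machine this returns almost immediately. On a slow one it waits only as long as needed, up to the timeout.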
Version 8.39.0 [v8-stable] 2018-10-30
- imfile: improve truncation detection
previously, truncation was only detected at end of file. Especially with
busy files that could cause loss of data and possibly also stall imfile
reading. The new code now also checks during each read. Obviously, there
is some additional overhead associated with that, but this is unavoidable.
It still is highly recommended NOT to turn on "reopenOnTruncate" in imfile.
Note that there are also inherent reliability issues. There is no way to
"fix" these, as they are caused by races between the process(es) who truncate
and rsyslog reading the file. But with the new code, the "problem window"
should be much smaller and, more importantly, imfile should not stall.
see also
- imjournal: work around journald excessive reloading behavior
This is a workaround for possible imjournal interaction with systemd
where journal invalidate fix is not present. The code tries to
detect SD_JOURNAL_INVALIDATE loop and not reload after each call.
Thanks to Jiri Vymazal for the patch.
- errmsg: remove no longer needed code
refactored code (over a long time) so that object-ish style is no longer
needed and could now finally be removed; We also refactored the last
component (omhttp contrib module) that used the old interface.
- queue bugfix: invalid error message on queue startup
due to some old regression (commit not exactly identified, but for
sure a regression, 9 years ago it was correct) an error message
is emitted when no .qi file exists on startup of the queue, which
is a normal condition.
Actually, the code should not have tried to open the .qi file in
the first place because it detected that it did not exist. That
(necessary) shortcut had been removed a while ago.
- bugfix imrelp: regression with legacy configuration startup fail
Startup of a relp listener failed if legacy configuration was used.
caused by commit: 32b71daa8aadb8f16fe0ca2945e54d593f47a824
- bugfix imudp: stall of connection and/or potential segfault
There was a regression in 493279b790a8cdace8ccbc2c5136985e820dd2fa.
This regression may cause stop (or delay) of reception from some systems
and may also cause a segfault. Triggering condition is that at least
one listener could not be created.
Thanks to Jens Låås for the patch.
- bugfix gcry crypto driver: small memleak
If a crypto key is specified directly via the key="" parameter,
the storage for that key is not freed, causing a small memleak.
Note that the problem occurs only once per context, so this
should not cause real issues. Even more so, as specifying a
key directly is meant only for testing purposes and is strongly
discouraged for production use.
Detected by internal testing, no actual fail case known.
- fix potential misaddressing in encryption subsystem
could happen if e.g. disk queues were encrypted
not seen in practice but caught by testbench test
- ksi subsystem changes
* enhance debug logging
* disable unsafe SHA1 algorithm
Thanks to Allan Park for the patch.
- bugfix core: regex compile error messages could be incorrect
- bugfix core: potential hang on rsyslog termination
The root cause was a deadlock during worker startup. This could
happen for example when a DA queue needed to persist data during
Fail condition:
* startup request for a new worker
* initialization of that worker
* immediate detection that the worker can or must shutdown
* main thread waiting for worker running state, which it skips,
and so the main thread hangs inside a loop
- bugfix imkafka: system hang when backgrounded
imkafka initializes librdkafka too early (before the fork). This leads
to hangs in various parts of the system - not only in imkafka but
other functions as well (e.g. getaddrinfo() calls).
- bugfix imfile: file change was not reliably detected
A change in the inode was not detected under all circumstances,
most importantly not in some logrotate cases.
Includes new tests made by Andre Lorbach. They now use the
logrotate tool natively to reproduce the issue.
- bugfix imrelp: do not fail build if librelp does not have relpSrvSetLstnAddr
- bugfix queue subsystem: DA queue did ignore encryption settings
- bugfix KSI: lmsig-ksils12 module skips signing the last block
Thanks to Allan Park for the patch.
- bugfix fmhash: function hash64mod sometimes returned wrong result
Thanks to Harshvardhan Shrivastava for providing the patch
- bugfix core/debug: data written to random fd 2 under some debug settings
This happens only during auto-backgrounding, where we can no longer
access stderr. Whatever is opened as fd 2 receives some debug messages.
Note that the specific feature is usually turned on only in CI runs.
- cleanup: removed no longer needed code
Code that was unused for quite a while or did not really belong to the
project identified and removed.
- overall code cleanup
e.g. remove unused code, replace bad bash constructs, etc...
- CI:
* some small improvements in testbench plumbing
e.g. (`cmd` replaced by $(cmd), removed useless use of cat, ...)
* testbench: improve plumbing for kafka tests
- Removed all sleeps where possible.
- Moved all kafka start/stop/download logic into functions.
- Moved kafka/zookeeper stop into error_exit and exit_test.
- Kafka/Zookeeper cleanup only done on success now.
- Kafka/Zookeeper logfiles automatically dumped on error_exit only now.
- Added cleanup for Kafka/Zookeeper instances into CI/
- added new tests
* testbench: fix incompatibility of one omprog test with Python3
Python3 writes to stderr immediately, and this caused the
captured output to differ with respect to Python2. Simplified
the test to do a single write to stderr. Also a cast to int
was needed when calculating 'numRepeats'.
* testbench: fixed imfile parallel issues
- Fixed timing issues in some imfile wildcard/regex tests
- Added touch command in imfile wildcard tests to make sure directories
exist before files are created in it if IO is under stress.
- changed content checking in some tests to use "content_check_with_count"
with check timeouts instead of using fixed sleeptimes.
* testbench: new basic tests
These ensure that for some modules that did not have any tests at all
we have at least a minimal coverage (module loads, activates, is able
to emit error messages). Of course, further improvements would make
much sense. Modules:
- ommail
- testbench: new tests for disk queue encryption
- testbench: improved auto-diagnostics for hanging instance
- testbench: hardened kafka test against failing kafka subsystem,
not in 100% of the cases, but at least in some that frequently occur
- failing tests now report failure status so that we can get stats
on unreliable tests
- testbench tooling: fix incorrect tcpflood TLS parameter check
could lead to segfault when started
- bugfix testbench tooling: tcpflood invalid type in calloc (openssl mode)
It is unlikely that this has caused a real issue, as long as pointers
are all of the same size (which is highly probable).
detected by cppcheck via
Version 8.38.0 [v8-stable] 2018-09-18
- AIX: make basic modules work again
- make rsyslog build on AIX again
... at least for a limited set of default modules
- imfile: support for endmsg.regex
This adds support for endmsg.regex. It is similar to
startmsg.regex except that it matches the line that denotes
the end of the message, rather than the start of the next message.
This is primarily for container log file use cases such as this:
date stdout P start of message
date stdout P middle of message
date stdout F end of message
The `F` means this is the line which contains the final part of
the message. The fully assembled message should be
`start of message middle of message end of message`.
`endmsg.regex="^[^ ]+ stdout F "` will match.
Thanks to Richard Megginson for the patch.
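As an illustration only, here is a minimal Python sketch of how an
end-of-message regex groups the container log lines shown above. It is not
imfile's implementation: the function name, the sample timestamps, the
prefix stripping and the single-space join are assumptions, chosen to
reproduce the assembled text quoted in this entry.

```python
import re


def assemble_messages(lines, endmsg_regex):
    """Collect lines into one message until a line matches endmsg_regex."""
    end_re = re.compile(endmsg_regex)
    # Assumed prefix layout "<date> stdout <P|F> "; stripped for readability only.
    prefix_re = re.compile(r"^[^ ]+ stdout [PF] ")
    messages, parts = [], []
    for line in lines:
        parts.append(prefix_re.sub("", line, count=1))
        if end_re.search(line):
            # The final ("F") line closes the current message.
            messages.append(" ".join(parts))
            parts = []
    return messages


log_lines = [
    "2018-09-18T10:00:00Z stdout P start of message",
    "2018-09-18T10:00:01Z stdout P middle of message",
    "2018-09-18T10:00:02Z stdout F end of message",
]
print(assemble_messages(log_lines, r"^[^ ]+ stdout F "))
```

Lines that never see a closing match stay buffered in this sketch; how
imfile deals with such leftovers (e.g. via timeouts) is not modeled here.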
- imkafka: add parameter "parseHostName"
This enables imkafka to parse the hostname from the log message.
Previously that was not possible. It was most likely a bug, but
one that users may count on. The new parameter "ParseHostName"
(default is off) controls this behavior. Default is to NOT
parse the hostname.
Thanks to github user snaix for the contribution.
- im[p]tcp: improve error message on connect failure
Now a message with the actual OS error is emitted, making things far
easier to troubleshoot.
- imkafka: implement multithreading support for kafka consumers.
Each consumer runs in its own consumer thread now. New tests have also
been added for this.
- omelasticsearch: write all header metadata to $.omes for retries
Write all of the original request metadata fields to $.omes for
the retry, if present. This may include all of the following:
_index, _type, _id, _parent, pipeline
This is in addition to the fields from the response. If the same
field name exists in the request metadata and the response, the
field from the request will be used, in order to facilitate
retrying the exact same request.
Thanks to Richard Megginson for the patch.
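The precedence rule described above (request metadata wins over response
fields of the same name) can be sketched as follows. This is a hedged
Python illustration, not the module's code; the function name and the
plain-dict representation of $.omes are assumptions.

```python
# Request metadata fields named in this entry.
REQUEST_META_FIELDS = ("_index", "_type", "_id", "_parent", "pipeline")


def merge_retry_metadata(request_meta, response_fields):
    """Combine response fields with request metadata; the request wins."""
    merged = dict(response_fields)
    for name in REQUEST_META_FIELDS:
        if name in request_meta:  # "if present"
            merged[name] = request_meta[name]
    return merged


omes = merge_retry_metadata(
    {"_index": "logs-a", "_id": "1"},
    {"_index": "logs-b", "status": 429},
)
print(omes)
```

Keeping the request's values means a retry can resend exactly what was
originally submitted, which is the stated goal of the change.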
- core: improve error message on module load fail
The error message now lists all dlopen() errors in depth. This is
especially useful if the error is due to missing symbols or file
format errors.
- core/queue: add error message if queue file cannot be accessed
If a disk-assisted queue lacks permission to write to the specified
queue file, an error message is now generated.
- imtcp/imudp: new option preservecase for managing the case of FROMHOST value
default is left at current behavior
see also
see also
- omprog: add feedback timeout and keep-alive feature
- Restart the program if it does not respond within timeout.
- New setting 'confirmTimeout' (default 10 seconds).
- Allow the program to provide keep-alive feedback when a
message requires long-running processing.
- Improve efficiency when reading feedback line (use buffer).
Retry interrupted writes/reads to/from pipe.
- New setting 'reportFailures' for reporting error messages
from the program.
- Report child termination when writing to pipe.
- Minor refactor: renamed writePipe function to sendMessage,
renamed readPipe to readStatus.
Thanks to Joan Sala for contributing this.
- omprog: fix forceSingleInstance configuration option
The forceSingleInstance option did not work as intended. Even
if set, multiple instances were spawned. This most probably
was a regression from 0453b1670fc34c96d31ee7c9a370f0f5ec24744a.
The code was broken roughly 3.5 years ago, so it looks like the
issue was little-noticed. This also means that potentially some users
may see the bugfix as change of behavior. If so, just remove
the option.
Thanks to Joan Sala for contributing this.
- imfile: implement file-id, used in state file
This ensures that files with the same inodes are not accidentally treated
as equal, at least within the limits of the file id hash (see doc for
We use the siphash reference implementation to generate our non-cryptographic
- imfile: experimental input throttling feature
The new input parameter delay.message has been added. It specifies
a delay in microseconds after each line read.
- core: do not emit TZ warning on startup on Linux outside containers
On Linux it seems common that the TZ variable is NOT properly set.
There are some concerns that the warning related to rsyslog correcting
this confuses users. It also seems that the corrective action rsyslog
takes is right, and so there is no hard need to inform users on that.
In Linux containers, however, the warning seems to be useful as the
timezone setup there seems to be frequently-enough different and
rsyslog's corrective action may not be correct.
So we now check if we are running under Linux and not within a container.
If so, we do not emit the warning. In all other cases, we do. This is
based on the assumption that other unixoid systems still should have
TZ properly set.
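The decision described above boils down to a small truth table. A minimal
Python sketch, with hypothetical parameter names (how rsyslog actually
detects Linux or a container is not shown here):

```python
def should_warn_about_tz(tz_is_set, on_linux, in_container):
    """Return True if the TZ startup warning should be emitted."""
    if tz_is_set:
        return False  # nothing to correct, so nothing to warn about
    if on_linux and not in_container:
        return False  # plain Linux host: the corrective action is assumed right
    return True       # containers and other unixoid systems still get the warning


for case in [(False, True, False), (False, True, True), (False, False, False)]:
    print(case, should_warn_about_tz(*case))
```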
- omkafka:
* better debug information
* Fixed minor issue in omkafka producing wrong kafka timestamps when
msgTimestamp was NULL.
* Setting RD_KAFKA_V_KEY(NULL, 0) in rd_kafka_producev now when KEY is not
* Fixed minor issue when rsyslog is compiled with --enable-debug and
librdkafka is too old.
- omfile bugfix: errant error message when dynafile param needed
also fixes related message in contributed module omfile-hardened
Thanks to Frank Bicknell for the patch
- omhttp: new contributed module
Thanks to Christian Tramnitz for contributing it.
Some more info at
- mmkubernetes: action fails preparation cycle if kubernetes API ...
... destroys resource during bootup sequence
The plugin was not handling 404 Not Found correctly when looking
up pods and namespaces. In this case, we assume the pod/namespace
was deleted, annotate the record with whatever metadata we have,
and cache the fact that the pod/namespace is missing so we don't
attempt to look it up again.
In addition, the plugin was not handling error 429 Busy correctly.
In this case, it should also annotate the record with whatever
metadata it has, and _not_ cache anything. By default the plugin
will retry every 5 seconds to connect to Kubernetes. This
behavior is controlled by the new config param `busyretryinterval`.
This commit also adds impstats counters so that admins can
view the state of the plugin to see if the lookups are working
or are returning errors. The stats are reported per-instance
or per-action to facilitate using multiple different actions
for different Kubernetes servers.
This commit also adds support for client cert auth to
Kubernetes via the two new config params `tls.mycert` and
Thanks to Richard Megginson for the patch.
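The 404/429 handling described above can be sketched roughly like this.
It is a hedged Python illustration, not the plugin's code: the class, the
fetch callable, the counter names and the idea of skipping lookups while
the busy interval runs are all assumptions made for the example.

```python
import time

MISSING = object()  # cache marker for "pod/namespace was deleted"


class MetadataLookup:
    def __init__(self, fetch, busy_retry_interval=5.0, clock=time.monotonic):
        self._fetch = fetch  # hypothetical callable: key -> (http_status, metadata)
        self._cache = {}
        self._busy_until = 0.0
        self._interval = busy_retry_interval
        self._clock = clock
        self.stats = {"found": 0, "notfound": 0, "busy": 0}

    def annotate(self, key, known):
        """Return record metadata, starting from what is already known."""
        cached = self._cache.get(key)
        if cached is MISSING:
            return dict(known)          # known to be gone: no new lookup
        if cached is not None:
            return {**known, **cached}
        if self._clock() < self._busy_until:
            return dict(known)          # server was busy: wait before retrying
        status, meta = self._fetch(key)
        if status == 200:
            self._cache[key] = meta
            self.stats["found"] += 1
            return {**known, **meta}
        if status == 404:
            self._cache[key] = MISSING  # cache the fact that it is missing
            self.stats["notfound"] += 1
            return dict(known)
        if status == 429:
            # Busy: annotate with what we have and do NOT cache anything.
            self._busy_until = self._clock() + self._interval
            self.stats["busy"] += 1
            return dict(known)
        raise RuntimeError("unexpected status %d" % status)
```

The per-object counters mirror the idea of per-instance statistics; an
admin would read them to see whether lookups succeed or return errors.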
- bugfix pmnormalize/core: several memory leaks, invalid property handling
- major memory leak which occurred once per message processed
So this could lead to OOM. Caused by improper free of json
- another two major leaks of similar magnitude could occur if
"fromhost-ip" and/or "fromhost" properties were set
- minor leaks upon termination. These were unproblematic as they were
static and only occurred immediately before shutdown.
But they triggered memory debugger errors.
- fixed test which did not check for mem leaks albeit it should
- core invalid handling of the "fromhost" property, if set via
the MsgSetPropsViaJSON() call. This was primarily of concern
for pmnormalize and mmexternal, and only if these properties
were used by either the rulebase or the external program
Actually, most of the leaks go back to rsyslog core, but that
core functionality was not used by other modules in the same
way. If another module had used it the same way, the effects would
have been the same (so be aware if you wrote custom modules).
- bugfix imptcp: fixed pointers for session counting
imptcp open, failedopen, and closed pstats counters were assigned the wrong
name, thus pstats values did provide a totally wrong picture of what was
going on.
Thanks to github user jeverakes for the patch.
- bugfix omprog: invalid memory access on partial writes to pipe
When sending logs to the program, in case of a partial write to the pipe,
invalid data was sent, or an invalid memory access could occur. (A
partial write can occur if the syscall is interrupted or the pipe is full.)
Thanks to Joan Sala for contributing this.
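For readers unfamiliar with partial writes, the general technique is to
keep writing from the first unsent byte until everything is out. A minimal
Python sketch under stated assumptions: the function name and the
injectable "write" parameter are hypothetical, and omprog itself is C code
that is not reproduced here.

```python
import os


def write_all(fd, data, write=os.write):
    """Write all of data to fd, resuming after short or interrupted writes."""
    view = memoryview(data)
    total = 0
    while total < len(view):
        try:
            # write() may accept fewer bytes than offered (e.g. pipe is full).
            written = write(fd, view[total:])
        except InterruptedError:
            continue  # syscall interrupted before writing anything: retry
        total += written  # continue after the bytes that were really sent
    return total


read_end, write_end = os.pipe()
write_all(write_end, b"log line for the program\n")
print(os.read(read_end, 64))
```

The important detail is the offset: resending from the start of the buffer
or from a wrong position is what produces duplicated or invalid data.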
- bugfix omprog: rsyslog's environment was not passed to script
- bugfix omprog: severity of some log messages in waitForChild corrected
Log some messages related to child process termination as info/warn
instead of error.
- bugfix imfile: files which were loaded via symlink were not always followed
They were no longer watched after being rotated.
Thanks to Jiri Vymazal for the patch.
- bugfix imfile: potential misaddressing when processing symlinks
Fixed parent name when processing symlinks. Detected during code review.
There was a garbage byte left before which could cause errors down the
Thanks to Jiri Vymazal for the patch.
- bugfix ommongodb: build issue if mongo-c-driver is not compiled with TLS
Let the ommongodb module work even if mongo-c-driver is not compiled with SSL support.
Thanks to Jérémie Jourdin for the patch.
- CI:
* many changes with the goal to support parallel test execution, e.g.
use dynamic ports and file names, changes to testing tools, etc.
* kafka tests re-enabled, as they should now no longer be racy. However,
this has yet to be proven in practice.
* upgrading kafka server version to current
* Fixed server configuration issues holding the kafka tests back from working
* Fixed some config issues in all sndrcv kafka tests.
* Kafka topics are now generated dynamically for each kafka test.
* Reenabled kafka_multi test which runs a test on 3 kafka/zookeeper instances
Version 8.37.0 [v8-stable] 2018-08-07
- build system: add --enable-default-tests ./configure option
This permits control of the "default tests" in testbench runs. These
are those tests that do not need a special configure option. There are
some situations where we really want to turn them off so that we can
run tests only for a specific component (e.g. ElasticSearch).
This commit also removes the --enable-testbench[12] configure switches,
which were introduced just to work-around travis runtime restrictions.
With the new CI setup and new options we could reduce the Travis runtime
dramatically and so we do not need them any longer.
- overall adaptation to gcc 8 which emits new warnings
- fix some build warnings on 32bit systems, namely armhf architecture
- ommail change of behavior: "enable.body" default now "on"
This was always documented to be "on", but actually was "off". Usually, we
fix the doc, but after long discussion the agreement was that in this
specific case it was actually better to change the default.
see also:
- core/omfile: race in async writing mode
mutex was not properly locked at all times when the async writing buffer
was flushed
Thanks to Radovan Sroka for the patch.
- core: provide a somewhat better default action name
We now include the module name (e.g. "omelasticsearch" or "builtin:omfile")
as part of the name. This is still not perfect, but hopefully a bit
easier to grasp.
see also
- new global() parameter "abortOnUncleanConfig"
This provides a new-style alternative to $AbortOnUncleanConfig.
- tcpflood no longer links with -lgcrypt
as this is no longer necessary for GnuTLS
Thanks to Michael Biebl for the patch.
- imjournal: add journal-specific impstats counters
these provide some additional insight into journal operations
Thanks to Abdul Waheed for the patch.
- imjournal: fixed startup on missing state file
When starting rsyslog with imjournal for the first time, it outputs
an error and the plugin does not run because no state file exists yet.
Now it skips the loading and creates the state file on first persist.
Thanks to Jiri Vymazal for the patch.
- imjournal: fetching cursor on readJournal() and simplified pollJournal()
Fetching the journal cursor in persistJournal could cause us to save an
invalid cursor, leading to duplicated messages further on. When the new
WorkAroundJournalBug option is set, we now save it on each
readJournal(), where we know that the state is good.
pollJournal() is now cleaner and faster, correctly handles INVALIDATE
status from journald and is able to continue polling after journal
flush. Also reduced POLL_TIMEOUT a bit as it caused rsyslog to exit
with error in corner cases for some ppc when left at full second.
re-factored imjournal CI tests with journal_print tool to have more
detailed error reporting.
Thanks to Jiri Vymazal for the patch.
- config: enhance backticks "echo" capability
This is now more along the lines of what bash does. We now support
multiple environment variable expansions as well as constant text
between them.
env SOMEPATH is set to "/var/log/custompath"
config is: param=`echo $SOMEPATH/myfile`
param then is expanded to "/var/log/custompath/myfile"
among others, this is also needed inside the testbench to properly
support "make distcheck".
Note: testbench tests follow via a separate commit. There will be
no special test, as the testbench itself requires the functionality
at several places, so the coverage will be very good even without
a dedicated test.
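To make the expansion rule above concrete, here is a small Python sketch
of "multiple environment variables plus constant text". It is only an
illustration: the function name is hypothetical, and replacing unset
variables with an empty string is an assumption borrowed from shell
behavior, not something this entry states.

```python
import re

# Matches $NAME where NAME is a typical environment variable name.
_VAR_RE = re.compile(r"\$([A-Za-z_][A-Za-z0-9_]*)")


def expand_backtick_echo(value, env):
    """Expand a config value of the form `echo ...` using the mapping env."""
    if not (value.startswith("`echo ") and value.endswith("`")):
        raise ValueError("only `echo ...` is handled in this sketch")
    body = value[len("`echo "):-1]
    # Each $NAME is replaced; text between variables is kept as-is.
    return _VAR_RE.sub(lambda m: env.get(m.group(1), ""), body)


env = {"SOMEPATH": "/var/log/custompath"}
print(expand_backtick_echo("`echo $SOMEPATH/myfile`", env))
```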
- imrelp: add support for setting address to bind to (#894)
This adds a new optional `address` parameter to `imrelp` inputs in order
to specify an address to bind to.
Based on support added by rsyslog/librelp@96eb5be
Thanks to Simon Wachter for the patch.
- omrelp: permit all authmodes; updated tests
omrelp for some time limited authentication modes to those
that were known. While this was OK, it prevented the easy
introduction of new auth modes into librelp.
This has now been changed; omrelp now checks the validity of
the authmode directly via librelp by doing some librelp calls
upon processing the configuration.
Also, some tests have been updated to check this feature and
also ensure that the new librelp mode "certvalid" works
(if it is available).
- regexp.c: reduce lock contention when using glibc.
When using glibc, we enable per-thread regex to avoid lock contention.
This should not affect BSD as they don't seem to take a lock in regexec.
NOTE: it is assumed that we can craft an even better solution than
this patch, but it improves the situation and we do not have time to
craft more. So we decided to merge. For details see
- mmpstrucdata: better error message, support $! in var names
see also
- more explicit error msg with message modification mod on queue
Message modification modules do not work if used with a non-direct queue.
We now make this more explicit in the config parsing error message.
- omrabbitmq: improve high-load performance
A different pthread mutex is created for each connection (action)
instead of a single one shared by all connections. This will
improve performance when using multiple concurrent connections
to a single (or multiple) RabbitMQ instance(s) (e.g. for load balancing)
Thanks to github user micoq for contributing the patch.
- imudp: replace select() calls by poll()
This improves reliability in extreme cases (more than 1024 fds open when
imudp begins to listen) and potentially improves performance a little.
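Background for the change above: select() works on fixed-size fd sets
(FD_SETSIZE, commonly 1024), while poll() takes a list of descriptors and
has no such per-descriptor limit. A minimal Python sketch of waiting on
listeners with poll(), for Unix-like systems; the helper name and the
socket pair standing in for UDP listeners are assumptions for the example.

```python
import select
import socket


def wait_readable(socks, timeout_ms):
    """Return the sockets that have data pending, using poll()."""
    poller = select.poll()
    by_fd = {s.fileno(): s for s in socks}
    for fd in by_fd:
        # Works for any descriptor number, including values above 1023.
        poller.register(fd, select.POLLIN)
    ready = poller.poll(timeout_ms)
    return [by_fd[fd] for fd, events in ready if events & select.POLLIN]


# Stand-in for listener sockets: a connected local socket pair.
sender, listener = socket.socketpair()
sender.send(b"<14>test message")
print(wait_readable([listener], 1000) == [listener])
```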
- ommysql: support mysql unix domain socket:
via action(.. socket="/tmp/mysqld.sock" ..)
Thanks to JoungKyun Kim for contributing this.
- impstats: emit warning if log.syslog="off" and ruleset name given
With this config, "ruleset" is silently ignored, which is probably
not obvious to a user.
- build system cleanup: remove no longer needed --enable-memcheck
This was used for a very old testing capability, no longer functional but
causes build to fail if enabled. Replaced by ASAN/valgrind.
Issue detected while testing some other CI settings.
- tools: Updated python based statslog analyzer sample scripts
- developer tools: make devcontainer tool more developer friendly
slight improvement for easy interactive use
- enable better testing via "make distcheck"
Also a couple of changes to testbench worth mentioning:
* use cp -f to ensure files can be overwritten in VBUILD
* fix issue of missing include test file in EXTRA_DIST
* new suppressions
* testbench: try to use local system dependency cache
avoid going to Internet repos if not absolutely necessary. For
development containers, they should be pre-populated with the
important dependencies.
* do not enable libfaketime if ASAN is selected
unfortunately, libfaketime does not work in that case
Note: for modules with non-standard dependencies (e.g. databases),
"make distcheck" only enables what was enabled on the original
./configure line. This is done in order to ensure that "distcheck" adapts
to what is actually available on the system in question. Rsyslog's
own CI system installs the maximum set of possible dependencies and
so tries the maximum set "make distcheck" can support on a platform.
see also
- add new global config parameter "inputs.timeout.shutdown"
- omusrmsg: do not fall back to max username length of 8
This happens if utmp.h and friends are not available and stems back to
the original syslogd. Nowadays, 32 is more appropriate and now being used
in that (now very unlikely) case. The detection logic for UT_NAMESIZE has
also been streamlined.
- bugfix build system: fix race in parallel builds
If is built later than, there is a failure:
|../aarch64-wrs-linux-libtool --tag=CC --mode=link aarch64-wrs-linux-gcc
-o lmcry_gcry_la-lmcry_gcry.lo -lgcrypt
|aarch64-wrs-linux-libtool: error: cannot find the library ''
or unhandled argument ''
|Makefile:1049: recipe for target '' failed
|make[2]: *** [] Error 1
The LIBADD of contains, we should also add
Thanks to Hongxu Jia for the patch.
- bugfix imfile: memory leak upon shutdown (cosmetic)
When rsyslog shuts down and imfile is inside a change polling loop,
it does not properly free memory returned by glob(). This is a cosmetic
bug as the process terminates within the next few milliseconds. However,
it causes memory analyzer reports and thus makes CI fail.
- bugfix core msg: potential deadlock (and rsyslog hang)
can happen e.g. with headerless messages when app-name
property is used
- bugfix core: do not abort startup on problems setting scheduling policy
rsyslog creates a default scheduling policy on startup. This code
invalidly used CHKiRet (our exception handler) to check pthreads
return codes, which this macro cannot do. This led to hard to
diagnose startup problems in cases where there were problems
setting the scheduling defaults (e.g. when rsyslog is set to run
at idle priority). Even more so, this blocked startup altogether,
which is not the right thing to do. Actually, this can be considered
a regression from commit 7742b21. That commit was 8 years ago, so
in general this cannot be a big issue ;-)
The code now emits proper error messages (to stderr, as at this point
no other output is available as it is during the initial state of
rsyslog initialization) and continues the startup.
- bugfix core: input shutdown timeout not properly applied
The timeout could be reduced by mutex wait time, which was not the
intended behavior and could lead to the input thread being
cancelled while it would have been perfectly legal to shut it down
Noticed during working on the CI system. May explain some testbench
instability and may have caused trouble with state files (not)
properly being written by inputs.
- bugfix config optimizer: error in constant folding
did not work properly if a string and a number were to be folded.
Detected by gcc 8.
- build: fix improper function casts
no real issue, but generated warnings under gcc 8 and thus
broke CI
- bugfix omlibdbi: fix potential small memory leak
detected by clang static analyzer
- bugfix ommysql: unsafe use of strncpy()
also now reports oversize names as user error vs. silent truncation
overly long names only could affect config load phase
- bugfix omhttpfs: fix insecure usage of strncmp()
consequences not evaluated as this is a contributed module.
Detected by gcc 8.
- bugfix mmgrok: cosmetic build issue - compiler warnings
caused build under gcc 7 to fail with warning
- bugfix mmkubernetes: stops working with non-kubernetes container names
When mmkubernetes encounters a record with a CONTAINER_NAME field,
but the value does not match the rulebase, mmkubernetes returns
an error, and mmkubernetes does not do any further processing
of any records.
The fix is to check the return value of ln_normalize to see if
it is a "hard" error or a "does not match" error.
This also adds a test for pod names with dots in them.
Thanks to Richard Megginson for the patch.
- bugfix mmkubernetes: potential NULL pointer access
If token file could not be opened, fclose() was passed a NULL pointer.
Thanks to github user jvymazal for finding and Richard Megginson
for fixing the issue.
- bugfix omsnmp: invalid traptype was not detected
this could leave config errors unreported and cause unexpected
- bugfix mmkubernetes: default rules use container_name_and_id
also include rulebase files in dist and fix rule so that dot inside
pod name is supported.
Thanks to Richard Megginson for fixing the issue.
- bugfix omelasticsearch: build regression
Commit 6d4635efbb13907bf651b1a6e5a545effe84d9d9 introduced some compile
problems, which were only detected on CentOS6, which unfortunately did
not compile omelasticsearch during CI runs
- bugfix ommongodb: do not force MongoDB to use "PLAIN" auth mechanism
... which also seems not to be handled by current MongoDB.
Remove ?authMechanism=PLAIN URI part to let the mongo library choose the
default mechanism. One can force a specific authentication mechanism by
adding ?authMechanism=XXX into the uristr argument of the module
Thanks to Jérémie Jourdin for the fix.
- build system: do not disable tests via --disable-liblogging-stdlog
This setting controlled both the actual rsyslog functionality as well
as some testbench tests, which use liblogging-stdlog to provide some
specific functionality. This meant those tests were not run since
changing the default. Now untangling the dependency.
- CI:
* most test refactored to use newer testbench plumbing
while no functional change, this permits further enhancements
* ElasticSearch startup timeout in tests increased to care for
slower test systems
* imjournal: fixed tests to actually test plugin functionality
Thanks to Jiri Vymazal for the patch.
* new test for gnutls priority string in librelp
Thanks to github user jvymazal for the patch
* testbench: relax hanging instance detection
This does not work reliably if multiple instances of rsyslog
builds run on a single machine. We need to improve, but this
commit makes conflict less likely and provides some diagnostic
info to help guide us towards a final solution.
* testbench: fix tests that look awfully wrong
These tests indicated they terminate rsyslog forcefully without
draining the queues, but then checked if they were drained (all
messages processed). That does not make sense, and we cannot
envision why this was written in the first place. So we assume some
copy&paste problem was the root of that.
* testbench: refactor tests which used "nettester" tool
Some old tests are carried out via the nettester tool. This was
our initial shot at a testbench a couple of years ago. While it
worked back then, the testbench framework has been much enhanced.
These old tests are nowadays very hard to handle, as they miss
debug support etc. So it is time to refactor them to new style.
As a side-activity, the testbench plumbing has been enhanced to
support some operations commonly needed by these tests. Contrary
to pre-existing plumbing, these new operations are now crafted
using bash functions, which we consider superior to the current
method. So this is also the start of converting the older-style
functionality into bash functions. We just did this now because
it was required and we entangled it into the test refactoring
because it was really needed. Else we had to write old-style
operations and convert them in another commit, which would
have been a waste of time.
Special thanks to Pascal Withopf for the initial step of taking
old tests and putting config as well as test data together into
the refactored tests, on which Rainer Gerhards then could build
to create the new tests and update testbench plumbing.
* testbench: ensure uxsock test leaves no dangling listener instances
in case the test aborts. We utilize the timeout utility for now
to prevent this.
* testbench: make port for imdiag dynamic
This is prep work to support parallel test runs
Version 8.36.0 [v8-stable] 2018-06-26
- build system change:
Liblogging-stdlog was introduced to provide a broader ability to send rsyslog
internal logs to different sources. However, most distros did not pick up
that capability and so instead we do a regular syslog() call. We assume that
the actual functionality is never used in practice, so we plan to retire it.
That makes building rsyslog from source easier.
The plan is to disable use of liblogging-stdlog by default during
configure. So users (and distros!) can still opt-in to have it enabled if
they desire.
A couple of releases later, we want to completely remove the functionality,
unless enough demand has been shown in the meantime to justify keeping it.
This version disabled liblogging-stdlog by default. We now also
emit a warning message ("liblogging-stdlog will go away") so that users
know what is going on and may react.
see also
- add openssl driver alongside GnuTLS one for TLS communication
The openssl driver is currently experimental. It will become the new preferred
driver as it permits us to provide much better end-user error messages than
we could provide with GnuTLS. It is also less picky with certificate files
and provides specific error messages if there are certificate problems.
- GnuTLS TLS driver: support intermediate certificates
this is necessary for certificate chains
Thanks to Arne Nordmark for providing the patch.
- omelasticsearch: write op types; bulk rejection retries
* Add support for a 'create' write operation type in addition to
the default 'index'. Using create allows specifying a unique id
for each record, and allows duplicate document detection.
* Add support for checking each record returned in a bulk index
request response. Allow specifying a ruleset to send each failed
record to. Add a local variable `omes` which contains the
information in the error response, so that users can control how
to handle responses e.g. retry, or send to an error file.
* Add support for response stats - count successes, duplicates, and
different types of failures.
* Add testing for bulk index rejections.
Thanks to Richard Megginson for the patch.
- lookup tables: reload message now with "info" severity (was "error")
thanks to Adam Chalkley for the patch
- imptcp: add support for regex-based framing
for complex multi-line messages (XML in particular), the multiLine method
does not work well. We now have a capability to specify via a regex when
a frame starts (and the previous thus ends).
adds imptcp input parameter "framing.delimiter.regex"
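The idea of regex-based framing (a match marks the start of a new frame
and thereby ends the previous one) can be sketched in Python as below.
This is not imptcp's implementation: operating on already-split lines,
joining with newlines and flushing the last frame at end of input are
assumptions made for the example, as are the function name and the
sample XML.

```python
import re


def split_frames(lines, delimiter_regex):
    """Group lines into frames; a matching line starts a new frame."""
    start_re = re.compile(delimiter_regex)
    frames, current = [], []
    for line in lines:
        if start_re.search(line) and current:
            # A new frame begins, so the previous one is complete.
            frames.append("\n".join(current))
            current = []
        current.append(line)
    if current:
        frames.append("\n".join(current))  # assumed: flush at end of input
    return frames


stream = ["<event>", " <a/>", "</event>", "<event>", "</event>"]
for frame in split_frames(stream, r"^<event>"):
    print(repr(frame))
```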
- imjournal: add statistics counter
the following statistics counters are now supported by imjournal
- submitted = total number of messages submitted for processing
- config: permit 4-digit file creation modes
permit 4-digit file creation modes (actually 5 with the leading zero) so
that the setgid bit can also be set (and anything else on that position).
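What the extra digit buys can be shown with a short Python sketch. The
parser below is hypothetical (rsyslog's config parser is not reproduced);
it assumes modes are written with a leading zero followed by three or
four octal digits, as described above.

```python
import stat


def parse_create_mode(text):
    """Parse an octal mode string such as "0644" or "02755" into an int."""
    if len(text) not in (4, 5) or not text.startswith("0"):
        raise ValueError("expected a leading zero plus 3 or 4 octal digits")
    if any(ch not in "01234567" for ch in text):
        raise ValueError("mode must consist of octal digits only")
    return int(text, 8)


mode = parse_create_mode("02755")
# The fourth digit carries setuid/setgid/sticky; here the setgid bit is set.
print(oct(mode), bool(mode & stat.S_ISGID))
```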
- ommongodb: add possibility to ignore some insertion error codes
new config parameter "allowed_error_codes" lists error codes which will be ignored if
they happen. For example, 11000 DuplicateKey in case of collection
containing a unique field.
Thanks to Hugo Soszynski for contributing this work
- omprog: simplify '' example
Make the skeleton easier to understand by removing transaction support.
Also, transaction failures did not work as explained in the skeleton,
because of issue #2420. In the future, a ''
example can be added, ideally once the issue is solved.
Thanks to Joan Sala for contributing this.
- core: misaddressing when writing disk queue files
when writing disk queue files during shutdown, access to freed
memory can occur under these circumstances:
- action A is processing data, but could not complete it
most importantly, the current in-process batch needs not to
be totally completed. Most probable cause for this scenario
is a suspended action in retry mode.
- action A is called from a ruleset RA which
- does not have a queue assigned
- where RA is called from a ruleset RO which is bound
to the input from which the message originated
- RO must be defined before RA inside the expanded config
- Disk queues (or the disk part of a DA queue) must be utilized by A
When re-injecting the unprocessed messages from A into the disk queue, the
name of ruleset RO is accessed (for persisting to disk). However, RO is
already destructed at this point in time.
The patch changes the shutdown processing of rulesets, so that all
shutdown processing is done before any ruleset data is destructed. This
ensures that all data items which potentially need to be accessed
remain valid as long as some part may potentially try to access them.
This follows the approach used in
where obviously that part of the problem was not noticed.
see also
- core: fix message loss on target unavailability during shutdown
Triggering condition:
- action queue in disk mode (or DA)
- batch is being processed by failed action in retry mode
- rsyslog is shut down without resuming action
In these cases messages may be lost by not properly writing them
back to the disk queue.
- imrelp bugfix: error message "librelp too old" is always emitted ...
... even if librelp is current. The condition check was actually missing.
This commit adds it.
- imrelp: segfault on startup when cert without priv key is configured
- omrelp bugfix: segfault on first message sent when authmode was wrong
A segfault could occur if the authmode was configured to an invalid value.
This is now caught during config processing and an error is reported.
- imfile bugfix: double-free on module shutdown
detected by code review, not seen in practice
- imfile/core bugfix: potential misaddressing in string copy routine
This can be exposed via imfile, as follows:
- use a regex to process multiline messages
- configure timeouts
- make sure imfile reads a partial message
- wait so that at least one timeout occurs
- add the message termination sequence
This leads to misaddressing, whose effects may range from nothing obvious
up to a segfault.
- imfile bugfix: if freshStartTail is set some initial file lines missing
When the option is set and a new file is created after rsyslog startup,
freshStartTail is also applied to it. That is, data written quickly to it
(before rsyslog can process it) will potentially be discarded. If so,
and how much, depends on the timing between rsyslog and the logging process.
This problem is most likely to be seen in polling mode, where a relatively
long time may be required for rsyslog to find the new file.
This is changed so that now freshStartTail only applies to files that
are already-existing during rsyslog's initial processing of the file
monitors. HOWEVER, depending on the number and location (network?) of
existing files, this initial startup processing may take some time as
well. If another process creates a new file at exactly the time of
startup processing and writes data to it, rsyslog might detect this
file and its data as preexisting and may skip it. This race is inevitable.
So when freshStartTail is used, some risk of data loss exists. The same
holds true if between the last shutdown of rsyslog and its restart log
file content has been added. This is no rsyslog bug if it occurs.
As such, the rsyslog team advises against activating the freshStartTail
option.
- core: fix undefined behavior (unsigned computation may lead to value < 0)
This was detected by LLVM UBSAN. On some platforms re-setting the rawmsg
inside the message object could lead to invalid computation due to the
fact that the computation was carried out as unsigned and only then
converted to integer.
No known problem in practice.
- CI/QA:
- improved Elasticsearch tests so they can now be run without system-
installed ES service; also enables us to specify specific ES versions
and should now make the tests executable inside a container
Version 8.35.0 [v8-stable] 2018-05-15
- imptcp: add ability to configure socket backlog
this can be useful under heavy load.
For a detailed discussion see
Thanks to Maxime Graff for implementing this.
- omfile: do not permit filename that only consists of whitespace
- fmhash: new hash function module
implements hash32() and hash64() functions
Thanks to Harshvardhan Shrivastava for implementing these
- some better error messages
- imklog: add ratelimiting capability
On Linux, kernel logs are ratelimited only for messages using
printk_ratelimit(). Some logs do not use this facility, so
we ratelimit kernel messages ourselves.
Thanks to Berend De Schouwer for the patch.
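To illustrate the general idea of interval/burst rate limiting, here is a
minimal Python sketch. It is not the imklog implementation; the class name,
the "interval" and "burst" arguments and the fixed-window policy are
assumptions made for this example.

```python
class RateLimiter:
    """Fixed-window limiter: at most `burst` messages per `interval` seconds.

    Hypothetical sketch; the caller passes the current time so that the
    behavior is deterministic and easy to test.
    """

    def __init__(self, interval, burst):
        self.interval = interval
        self.burst = burst
        self.window_start = None
        self.count = 0
        self.dropped = 0

    def allow(self, now):
        # Start a new window on first use or once the interval has elapsed.
        if self.window_start is None or now - self.window_start >= self.interval:
            self.window_start = now
            self.count = 0
        self.count += 1
        if self.count <= self.burst:
            return True
        self.dropped += 1  # message is discarded by the limiter
        return False

limiter = RateLimiter(interval=5, burst=2)
decisions = [limiter.allow(t) for t in (0, 1, 2, 5)]
print(decisions, "dropped:", limiter.dropped)
```

A fixed window is the simplest policy; a token bucket would smooth bursts at
window boundaries at the cost of slightly more state.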
- omkafka: added impstats counters for librdkafka returned statistics
* statscallback counters
* librdkafka failure and error counters
* acked message counter
Thanks to Abdul Waheed for implementing this.
- imudp
* use rsyslog message rate-limiter instead of home-grown one
imudp introduced its own (feature-limited) rate-limiting capability for
messages from disallowed senders before we had central rate-limiters
inside rsyslog. Also, that code evolved from running on a single
thread to running on multiple threads, which introduced data races
and so made it unreliable.
Now we removed the old rate-limiting capability and depend on the
system rate limiter for internal rsyslog messages.
* add stats counter "disallowed"
counts the number of messages discarded due to being received from
disallowed senders
see also
- imrelp: add parameter "oversizeMode"
Instructs librelp how to handle oversize messages. The new default
is to truncate messages. Previously, the connection was aborted, which often
led to stuck messages at the sender side. Now, there are three options passed
down to librelp:
* abort - same behavior as previously, connection is aborted on error
* truncate - do not abort but instead truncate oversize message to
configured max size
* accept - accept all oversize messages (note: this can cause security issues,
see doc for details)
see also
see also
- core: consistent handling of oversize input messages
In the community we frequently discuss handling of oversize messages.
David Lang rightfully suggested creating a central capability inside
rsyslog core to handle them.
We need to make a distinction between input and output messages. Also,
input messages frequently need to have some size restrictions done at
a lower layer (e.g. protocol layer) for security reasons. Nevertheless,
we should have a central capability
* for cases where it need not be handled at a lower level
* as a safeguard when a module invalidly emits it (imfile is an example,
see for a try to fix it
on the module level - we will replace that with the new capability
described here).
The central capability works on message submission, and so cannot be
circumvented. It has these capabilities:
* oversize message handling modes:
- truncate message
- split message
this is of questionable use, but also often requested. In that mode,
the oversize message content is split into multiple messages. Usually,
this ends up with message segments where all but the first is lost
anyhow as the regular filter rules do not match the other fragments.
As it is requested, we still implemented it.
- accept message as is, even if oversize
This may be required for some cases. Most importantly, it makes
quite some sense when writing messages to file, where oversize
does not matter (except from a DoS PoV).
* report message to a special "oversize message log file" (not via the
regular engine, as that would obviously cause another oversize message)
This commit, as the title says, handles oversize INPUT messages.
see also
Note: this commit adds global parameters:
* "oversizemsg.errorfile",
is used to specify the location of the oversize message log file.
* "",
is used to control if an error shall be reported when an oversize
message is seen. The default is "on".
* "oversizemsg.input.mode",
is used to specify the mode with which oversized messages will
be handled.
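The three handling modes described above can be sketched in a few lines of
Python. This is only an illustration of the semantics, not the rsyslog core
code; the function name handle_oversize and its arguments are hypothetical,
and the mode names are taken from the description above.

```python
def handle_oversize(msg, max_size, mode):
    """Return the list of messages to submit for one input message.

    msg is a bytes object; max_size is the configured maximum message size.
    """
    if len(msg) <= max_size:
        return [msg]                      # not oversize, pass through
    if mode == "truncate":
        return [msg[:max_size]]           # keep only the first max_size bytes
    if mode == "split":
        # cut the content into consecutive fragments of at most max_size bytes
        return [msg[i:i + max_size] for i in range(0, len(msg), max_size)]
    if mode == "accept":
        return [msg]                      # accept as is, even if oversize
    raise ValueError("unknown oversize mode: %r" % mode)

sample = b"abcdefghij"
for mode in ("truncate", "split", "accept"):
    print(mode, handle_oversize(sample, 4, mode))
```

Note that in "split" mode only the first fragment still carries the original
message header, which is why later fragments usually fail to match regular
filter rules.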
- omfwd: add support for bind-to-address for UDP
To allow the same source address to be used regardless of the egress
interface taken, an option is added for an address to bind the datagram
socket to. Similarly to imudp, it is necessary to add an ipfreebind
option which is set by default, so as to avoid an excess of errors at
startup before the network interface has come up. This enhancement
allows a use case on networking devices, by which a source interface
that is typically a loopback is specified, on which an address to bind
to is configured. This is so that the same source address is used for
all packets from rsyslog.
Thanks to Mike Manning for the patch.
- template systemd service file proposes higher permitted file handle limit
Especially on busy systems the defaults are too low. Please keep in mind
that on a very busy system even the now-proposed setting may be too low.
Thanks to github user jvymazal for the patch.
- imuxsock: replace select() call by poll()
While extremely unlikely, imuxsock could abort if a file descriptor
> 1024 was received during the startup phase (never occurred in
practice, but theoretically could if imfile monitored a large number
of files and were loaded before imuxsock - and maybe other
strange cases).
see also
- nsdsel_ptcp: replace select() by poll()
This rids us of problems with fds > 1024. The performance will
probably also increase in most cases.
Note this is not a replacement for the epoll drivers, but a general
stability improvement when epoll() is not available for some reason.
see also
- omprog: refactor tests, fix child closing issues
Refactor omprog tests. Fix sync issues in these tests by
using the feedback mode (confirmMessages=on) to synchronize
the test with the external program. Closes #2403 (I hope)
Fix omprog not properly closing child process when
signalOnClose=on. Needed for the new tests. Closes #2599
Fix omprog not waiting for the child process to terminate
when signalOnClose=off. Needed for the new tests. Closes #2600
Close all fds before executing the child even when valgrind
is enabled (--enable-valgrind). Needed for the new tests.
Fix memory leak when the xxxTransactionMark parameters were
Thanks to Joan Sala for the patch.
- core: config optimizer did not handle call_indirect
This also caused the emission of an "internal error" error message
- debug support: add capability to print testbench-specific timeout reports
done by setting RSYSLOG_DEBUG_TIMEOUTS_TO_STDERR to "on"
this is by default activated inside the testbench
- mmgrok: fix potential segfault
The module used strtok(), which is not thread-safe. So it will potentially
segfault when multiple instances are spawned (what e.g. happens on busy
This patch replaces strtok() with its thread-safe counterpart
see also
- imrelp bugfix: maxDataSize could be set lower than maxMessageSize
maxDataSize specifies the length which will still be accepted.
It previously could be set to any value, including values lower than the
configured rsyslog max message size, which makes no sense. Now this is
checked and an error message is emitted if the size is set too low.
- build system bugfix: build broken if liblogging-stdlog installed in custom path
Thanks to Dirk Hörner for the patch.
- core bugfix: segfault on queue shutdown
if a ruleset queue is in direct mode, a segfault can occur during
rsyslog shutdown. The root cause is that a direct queue does not
have an associated worker thread pool, but the ruleset destructor
does not anticipate that and tries to destruct the worker thread
pool. It needs to do this itself, as otherwise we get a race
between rulesets on shutdown.
This was a regression from
- imfile bugfix: statefiles contain invalid JSON
When imfile rewrites state files, it does not truncate previous
content. If the new content is smaller than the existing one, the
existing part will not be overwritten, resulting in invalid JSON.
That in turn can lead to some other failures.
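The failure mode is easy to reproduce outside of rsyslog. The Python sketch
below rewrites a file with shorter content, once without and once with
truncation, and checks whether the result still parses as JSON. The function
names and the state file fields are made up for this example and do not
reflect the actual imfile state file layout.

```python
import json
import os
import tempfile

def write_state(path, text, truncate):
    """Rewrite a state file, optionally truncating old content first."""
    flags = os.O_WRONLY | os.O_CREAT
    if truncate:
        flags |= os.O_TRUNC
    fd = os.open(path, flags, 0o600)
    try:
        os.write(fd, text.encode("utf-8"))
    finally:
        os.close(fd)

def is_valid_json(path):
    with open(path, encoding="utf-8") as f:
        try:
            json.load(f)
            return True
        except ValueError:  # json.JSONDecodeError is a ValueError subclass
            return False

def demo(truncate):
    # Field names are hypothetical; only the differing lengths matter.
    long_state = json.dumps({"filename": "/var/log/app.log", "curr_offs": 1234567})
    short_state = json.dumps({"filename": "/var/log/app.log", "curr_offs": 12})
    with tempfile.TemporaryDirectory() as tmp:
        path = os.path.join(tmp, "state.json")
        write_state(path, long_state, truncate=True)
        write_state(path, short_state, truncate=truncate)
        return is_valid_json(path)

print("valid without truncation:", demo(truncate=False))
print("valid with truncation:", demo(truncate=True))
```

Without truncation the tail of the longer, older content remains after the
new closing brace, so the parser sees trailing garbage.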
- omfile bugfix: segfault if empty filename was given
- fix build issues when atomic operations are not present
for details, see
- lmsig_ksils12 bugfix: build and static analyzer issues
The module had a couple of problems building as well as some potential
errors detected by clang static analyzer. These have been fixed.
Thanks to Allan Park for the patch.