Scheduled Release 8.2306.0 (aka 2023.06) 2023-06-??
- 2023-06-05: GnuTLS driver: fix memory leaks in gtlsInitCred
A missing CA certificate or multiple connections caused
a memory leak in pThis->xcred, as it was allocated each time in
gtlsInitCred by gnutls_certificate_allocate_credentials.
- 2023-05-24: CI: update base ubuntu image for github actions
Scheduled Release 8.2304.0 (aka 2023.04) 2023-04-18
- 2023-04-17: imptcp bugfix: spam log on oversize message
If an oversize message was received by imptcp, imptcp reported
one error message for EACH oversize character. This could
result in a potentially very large number of similar (and
useless) messages.
This is a regression from commit f052717178.
- 2023-04-17: core/bugfix: using $uuid msg prop can deadlock rsyslog on shutdown
This problem can occur if a large number of threads is used and rsyslog
cannot shut down all queues etc within the regular time interval. In this
case, it cancels some threads. That can leave the mutex guarding libuuid
calls locked and thus prevents other, not yet cancelled threads from
progressing. Assuming pthread_mutex_lock() is not a cancellation point,
this will cause these other threads to hang forever and thus create a
deadlock situation.
- 2023-04-17: Do not preserve capabilities when changing credentials
In configurations where $PrivDropToGroup or $PrivDropToUser are used,
rsyslogd changes uid/gid to a non-privileged user. As part of that
change, all capabilities should be lost. However, if rsyslog is
compiled with --enable-libcap-ng option, some capabilities are
preserved due to using capng_change_id() instead of setgid()and
This function preserves capabilities while changing uid/gid, causing
rsyslogd to run as non-root user, but with some root capabilities.
Unfortunately, rsyslogd will run with higher privileges than before.
The patch also removes CAP_SETPCAP, because the capability set does
not need to be altered at a later phase.
Thanks to Attila Lakatos for the patch.
Scheduled Release 8.2302.0 (aka 2023.02) 2023-02-21
- 2023-01-27: core/template: implement negative
This will easily permit to drop the last n characters from a property
without the need to know the exact length of the string. This is
especially useful as the exact length is most often not known
- 2023-01-18: Introduce --enable-libcap-ng configure option
The option allows dropping capabilities to only
the necessary set, to minimize security exposure in
case there was ever a mistake in a networking
plugin or some other input resource. Moreover, it adds
the ability to change uid and gid while retaining the
previously specified capabilities.
Thanks to Attila Lakatos for the patch.
- 2023-01-16:
- omfile: add action parameters "rotation.*"
Add new action parameters
- rotation.sizeLimit
- rotation.sizeLimitCommand
These provide automatic output file rotation functionality that is feature-wise
equivalent to legacy $outchannel. This finally permits using
this feature set in rscript.
- core substring function: enhancement and hardening
Now, length can have a negative value -n to denote that the
substring should be built between startpos and the character
n chars from the end. This is a shortcut for stripping characters
on "both ends" of the string. See doc for details on the enhanced
Also, some hardening against invalid startpos and length has
been added.
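The negative-length semantics above can be modelled roughly as follows. This is an illustrative Python sketch only, not rsyslog code: the helper name `substring`, the zero-based start position, and the choice to return an empty string for invalid arguments are all assumptions made for the example.

```python
def substring(s: str, startpos: int, length: int) -> str:
    """Illustrative model of the described semantics, not rsyslog's implementation."""
    # Assumed hardening: an out-of-range start position yields an empty string.
    if startpos < 0 or startpos >= len(s):
        return ""
    if length >= 0:
        return s[startpos:startpos + length]
    # Negative length -n: stop n characters before the end of the string.
    end = len(s) + length
    if end <= startpos:
        return ""
    return s[startpos:end]
```

With this model, `substring("[message]", 1, -1)` strips one character on both ends without the caller knowing the string length.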
- core bugfix: wrong type conversion in internal string class could lead to segfault
This could only happen with very unusually large strings
Thanks to Flos Lonicerae for the patch.
- QA: changed to CodeQL scanning on github as LGTM replacement
- bugfix: wrong version number on daily stable builds
- CI: use newer version of zookeeper (needed modernization)
- ffaup bugfix: memory corruption with concurrent workers
The ffaup function fails to work properly when it is used with multiple workers.
The faup_handler_t struct is not supposed to be shared between threads.
This may have caused memory corruptions and race conditions when used
inside of actions.
Thanks to Thibaud Cartegnie for the fix.
- openssl bugfix: undefined reference error on OpenSSL 1.1 or higher.
This could have prevented ossl components from being loaded/used.
- 2023-01-02: core bugfix: template system may generate invalid JSON
Invalid JSON is generated if
- a list template
- is created with option.jsonf="on"
- and the last list element is a property with onEmpty="skip"
- and that property is actually empty.
The JSON string in this case ends with ", " instead of "}\n". This
patch fixes the issue.
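The failure mode can be illustrated with a small Python sketch. It is not the rsyslog template code: the function name `render_jsonf` and the tuple layout are hypothetical. The point is that separators are placed only between elements that are actually emitted, so a skipped last element cannot leave a dangling ", ".

```python
import json

def render_jsonf(fields):
    """fields: list of (name, value, skip_on_empty) tuples (hypothetical layout)."""
    parts = []
    for name, value, skip_on_empty in fields:
        if skip_on_empty and value == "":
            # A skipped element contributes nothing, including no separator.
            continue
        parts.append(json.dumps(name) + ":" + json.dumps(value))
    # Join after filtering, then close the object and terminate the line.
    return "{" + ", ".join(parts) + "}\n"
```

Rendering a list whose last element is empty and marked as skippable still yields a string that a JSON parser accepts.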
Scheduled Release 8.2212.0 (aka 2022.12) 2022-12-06
- 2022-12-05: testbench: make python http server based tests more reliable
Harden them against races during server port assignment. Prevents
testbench flakes.
- 2022-12-05: omprog bugfix: invalid status handling at called program startup
There is a bug when external program *startup* does not return "OK". This
can also lead to misaddressing, potentially with a segfault (very unlikely).
Note that no problem exists once the initialization phase of the external
program is finished and regular message transfer runs.
The problem basically is that for a startup failure, the control data for
that external program instance is freed on error. Unfortunately, that state
data is needed later on to detect a suspended instance. We now keep the control
data even on init failure (as we then need to do normal control options).
- 2022-11-29: testbench bugfix: wrong message injection object of instance 1
In some client-server test cases, messages are supposed to be injected into
instance 2 (client), but they were actually injected into instance 1 (server),
which may lead to false negative results. This patch fixes it by replacing
'injectmsg' with 'injectmsg2', and deals with some minor issues.
Thanks to Guodong Zhu for the patch.
- 2022-11-21: rsyslog.conf man page bugfix: description of selectors
Document historic difference to BSD syslog selectors.
- 2022-11-18: imtcp bugfix: legacy config directives no longer worked
Many "$InputTCPServer..." config directives no longer worked
and were completely ignored (e.g. "$InputTCPServerStreamDriverMode").
This was a regression from a08591be5d9 (May 5th, 2021).
- 2022-11-16: ksi bugfix: sending of too many signing requests fixed.
As there is a bug in libksi where too many signing requests may have been sent
out, the number of signing requests will be limited by the KSI module until the fix
is implemented.
Thanks to Taavi Valjaots for the patch.
- 2022-11-14: bugfix: prevent potential segfault when switching to queue emergency mode
When switching to Disk queue emergency mode, we destructed the in-memory
queue object. Practice has shown that this MAY cause races during
destruction which themselves can lead to a segfault. For that reason, we
now keep the disk queue object. This will keep some resources,
including disk space, allocated. But we prefer that over a segfault.
After all, it only happens after a serious queue error when we are
already at the edge of hard problems.
see also:
- 2022-11-08: ksi bugfix: Segmentation fault in async mode fixed
Thanks to Taavi Valjaots for the patch.
- 2022-11-02: imjournal: add second fallback to _COMM
If SYSLOG_IDENTIFIER is not present in the journal message,
then lookup the _COMM field, which stands for the name
of the process the journal entry originates from. This is
needed in order to be in compliance with the journalctl
Thanks to Attila Lakatos for the patch.
- 2022-10-25: core bugfix: local hostname invalid if no global() config object given
The local hostname is invalidly set to "[localhost]" on rsyslog startup
if no global() config object is present in rsyslog.conf. Sending a HUP
corrects the hostname.
This is a regression from ba00a9f25293f
- 2022-10-25: testbench bugfix: fixed timing issue that sometimes led to test failure
Timing caused a race in test tool sync and could lead to premature termination of
tools, which in turn caused test failure.
Scheduled Release 8.2210.0 (aka 2022.10) 2022-10-18
- 2022-10-13: fix NetBSD build issue
On NetBSD, time_t has for a long time now been __int64_t.
On 32-bit CPUs, the compiler is not obliged to define
__sync_bool_compare_and_swap_8, so instead this ends up
as an undefined symbol when linking rsyslog. This makes
the code fall back to the pthread / locking method on these
systems, but at least lets the program build.
Thanks to Havard Eidnes for the patch.
- 2022-10-12: omrabbitmq: Add TLS support
Thanks to github user 21stcavenan for the patch.
- 2022-09-14: config: add "abortOnFailedQueueStartup" global config parameter
Similar to "abortOnUncleanConfig", this parameter aborts rsyslog
when a queue has problems during startup. Some users prefer rsyslog
to terminate in this case. By default, nothing changes.
- 2022-09-07: core bugfix: leak in helper function SetString
A part of rsyslog runtime, SetString(), had a small memory leak when a value was
assigned multiple times. While this could potentially consume larger amounts of
memory, this did not happen in practice. The reason is that multiple assignments
to the same object occur very seldom.
Thanks to github user seuzw930 for the patch.
- 2022-09-07: core bugfix: correct local host name after config processing
rsyslog.conf may affect the host's local name. These changes were
so far only activated after the first HUP. This patch now ensures
that the configured local host name is applied correctly throughout
all processing, including early startup.
This patch causes a slight change of behaviour. However, the behaviour
was inconsistent before. Now it is consistent and according to the config.
Please note: this patch also exposes a global entry point via "regular"
dynamic loading as this makes things much easier to do. This is in-line
with ongoing simplification effort.
Finally, we also remove a CI test that we no longer need because
the problem covered is now addressed differently and the original issue
can no longer occur.
- 2022-08-31: imtcp: add option notifyonconnectionopen
Add this as both a module and an input parameter. Complements already-existing
config param notifyonconnectionclose and mirrors the similar feature from
The module parameter acts as default, similarly to notifyonconnectionclose.
Note that in contrast to imptcp, we emit IP addresses and not host
names. This sticks with the traditional semantics of imtcp.
Note that we also fixed a misleading error message in the case when a
disallowed sender tried to connect.
Thanks to John Chivian for suggesting the addition.
- 2022-08-26: openssl TLS driver: add mechanism to include extra CA files parameter
This change allows including extra CA files so that no "unable to get issuer
certificates" error occurs when using chained cert files. New parameter name is
Thanks to Sergio Arroutbi for the patch.
- 2022-08-19: fix compile issue with older gcc compilers
Thanks to Julien Thomas for the contribution.
Scheduled Release 8.2208.0 (aka 2022.08) 2022-08-09
- 2022-08-09: ksi bugfix: request cache size and send timeout issue fixed.
Async service send timeout is not configurable and request cache size is too
small to handle a large number of signing requests with a small number of permitted
requests per aggregation round. For example, a user with max_requests = 4 gets a
cache size of 5 * max_requests or at least 256. When signing 300 log files, the cache
will be too small, resulting in several unsigned blocks. When signing 200 log files, the
cache will be adequate, but with a rate of 4 signatures per second, it is only
possible to sign 4 * 10 blocks before all requests that are not sent out will
The fix is to make the send timeout configurable and make the size of the
cache depend on the value of the send timeout. A new configuration value
sig.block.signtimeout="time, s" was introduced that defines the time window within which
the block has to be signed. The size of the request cache is increased to
3 * max_requests * sign_timeout or at least 256.
Thanks to Taavi Valjaots for the patch.
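The two cache sizing rules quoted in this entry can be compared with a short Python sketch. The function names and the example timeout values are hypothetical; only the formulas and the minimum of 256 come from the text above.

```python
MIN_CACHE_SIZE = 256  # lower bound stated in the entry

def old_cache_size(max_requests: int) -> int:
    # Previous rule: 5 * max_requests, or at least 256.
    return max(5 * max_requests, MIN_CACHE_SIZE)

def new_cache_size(max_requests: int, sign_timeout: int) -> int:
    # New rule: 3 * max_requests * sign_timeout (seconds), or at least 256.
    return max(3 * max_requests * sign_timeout, MIN_CACHE_SIZE)
```

With max_requests = 4, the old rule always stays at the minimum, while the new rule grows once the configured timeout is large enough for the product to exceed 256.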
- 2022-08-09: imjournal bugfix: segmentation fault in close journal
Thanks to github user t-feng for the patch.
- 2022-08-09: net subsystem: support sha256 for StreamDriverAuthMode="x509/fingerprint"
Thanks to github user codemaker219 for the patch.
- 2022-08-05: imfile bugfix: message loss/duplication when monitored file is rotated
When a to-be-monitored file is being rotated, some messages may be lost or
duplicated. In case of duplication, many file lines may be duplicated
depending on actual timing. The whole bug was primarily timing dependent
in general. It most often was visible in practice when the monitored
file was very frequently rotated (we had some report with every few
Note that while we try hard to not lose any messages, input file
rotation always has some loss potential. This is inevitable if
the monitored file is being truncated.
Also note that this bugfix affects imfile only. It has no relation
to rsyslog output files being rotated on HUP.
- 2022-08-05: ksi bugfix: optimize processing of signer queue to fix delays.
There is a worker queue where the rsyslog KSI module collects events and signing
requests. While the queue is processed, the thread is periodically put to sleep. The previous
implementation handled signature requests well but slept every time after
handling a new file open / close event. When several log files were opened or
closed simultaneously, processing was significantly slowed down. Another issue was
that the thread always slept 1000ms, which may be 2x longer than the aggregation round.
This slowed down the overall signing process.
The fix is to simply not sleep after a file open / close event if there
are further items to be processed. To speed up the signing process, rsyslog uses
the KSI aggregator conf. to obtain the aggregation period that is used for the sleep
time configuration.
Thanks to Taavi Valjaots for the patch.
- 2022-08-04: ksi bugfix: possible crash fixed when several log files are opened.
The KSI module in async mode used to request the aggregator conf. every time a log
file was opened. When several log files were opened simultaneously, a
corresponding number of pointless concurrent conf. requests were posted.
Concurrent conf. requests triggered a bug in libksi, where the internal count of
pending requests was not decremented correctly, causing the system to crash.
The fix is to optimize the frequency of conf. requests so that only
one conf. request is handled at once. Instead of checking the conf. every time a
log file is opened, the conf. is requested periodically after the conf. timeout. This will
affect both sync and async mode.
New option for KSI module introduced - sig.confinterval="time, s".
Thanks to Taavi Valjaots for the patch.
- 2022-08-04: openssl: add support to split tls commands by semicolon
- Add support to split tls commands by semicolon.
- Changed one test with multiple tls commands to use semicolon as
separator instead of newline.
- 2022-08-04: openssl subsystem bugfix: build issue on Solaris
Needed header file was added. Platforms other than Solaris did not actually need it,
so this bug was discovered late.
Thanks to Jakub Kulík for the patch.
Import <strings.h> when index() is used.
- 2022-08-04: openssl: add more details to error messages
- Avoid LogMsg output in osslEndSess on successfully terminated
connection. Only LogMsg if the connection was terminated
- Handle SSL_ERROR_SYSCALL in both Send / osslRecordRecv,
do not log as error if underlying socket was terminated
(ECONNRESET). Log as information instead.
- 2022-08-04: omclickhouse: capture additional exceptions
- DB::NetException
- DB::ParsingExceptions
Thanks to Victor Kustov for the patch.
- 2022-08-04: mmanon bugfix: Simplified and fixed IPv4 digit detection.
- Fixed an issue with numbers above int64 in syntax_ipv4.
Numbers that were up to 256 above the max of an int64
could incorrectly be detected as valid ipv4 digit.
- Simplified the IPv4 digit detection function and renamed
to isPosByte.
- added test cases for malformed IPv4 addresses
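One way to make such a digit check immune to oversized numbers is to bound the digit count before converting. The Python sketch below illustrates that idea only; it borrows the name isPosByte from the entry as `is_pos_byte`, but the logic is an assumption and not the actual mmanon code.

```python
def is_pos_byte(text: str) -> bool:
    """Illustrative check for one IPv4 octet; not the mmanon implementation."""
    # Only ASCII digits are accepted, and at most three of them, so the
    # numeric value is bounded by 999 before any conversion takes place.
    if not (text.isascii() and text.isdigit()) or len(text) > 3:
        return False
    return int(text) <= 255
```

In a fixed-width integer language the length check is what prevents wraparound; in Python it simply documents the intent.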
- 2022-07-21: imptcp: slight tuning
- reduce indirect addressing to obtain more speed
- also a fix for an annoying typo
- minor other optimizations
- modernization of one test
- 2022-07-20: template processing/json: performance optimization
- 2022-07-19: core bugfix: memory leak when freeing the action worker data table
When the action worker data table was freed during action destruction, worker
instances in the table were not null. This resulted in a memory leak.
Thanks to github user seuzw930 for the patch.
- 2022-07-13: omfile: support for zstd compression
The zstd library provides better and faster compression than zlib.
This patch integrates zstd as a dynamically-loadable functionality.
As such, no further dependencies need to be added to the rsyslog
base package.
Due to the increased performance, usage of zstd is highly recommended
for high-volume use cases.
This patch also refactors zlib compression in order to unify handling
in both compression cases.
- 2022-07-07: stream cleanup: move error message to debug log, only
This error message is most probably rooted in a kernel problem. At
least nobody knows how it can happen. It's definitely not a
rsyslog issue. We have also been able to recover from it for a long time now,
so there is no reason to irritate users by emitting this
"error" message.
- 2022-07-04: mmdblookup bugfix: Don't crash Rsyslog on mmdb file errors
Thanks to Théo Bertin (frikilax) for the patch.
- 2022-06-28: build error fix: libbson requires out-of-date language constructs
- 2022-06-27: OpenSSL: fix deprecated API issues for OpenSSL 3.x
- OpenSSL error strings are loaded automatically now
- Debug Callback has changed
- See for more:
Scheduled Release 8.2206.0 (aka 2022.06) 2022-06-14
- 2022-05-25: omelasticsearch: allow omitting _type field
Allow omitting the _type field by setting it to an empty string.
Setting this field has been deprecated since 6.0, and support will
be removed in 8.0
Also add testbench test for empty searchType with ES 7.0
This checks for messages in the deprecation log and also
avoids deprecation messages from usage of transport.tcp.port in the
test configuration
Thanks to Jarkko Oranen for the patch.
- 2022-05-18: tcpsrv/imtcp: slight performance improvements
This change slightly improves performance for tcpsrv-based servers.
This affects imtcp and imgssapi as well as some helpers.
No other functional change is included in this change.
- 2022-05-12: imptcp bugfix: worker thread starvation on extreme traffic
When connections were totally busy, without any pause, the assigned worker
never terminated its reading loop. As such, it could not service any
other connections. If this happened multiple times and to all configured
workers, all other connections could not be processed at all. This extreme
scenario is very unlikely, as the whole issue is relatively unlikely.
In practice, the issue could lead to somewhat degraded performance and
resolved itself after some time (in practice no connection is 100% busy
for an extended period of time).
Note that this patch sets a fixed limit of 16 iterations for very busy
connections. This sounds like a good compromise between non-starvation
and performance. The exact number may be made configurable if there
is a real need to.
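The effect of a per-turn iteration limit can be sketched in Python. This is a simplified single-worker model, not imptcp code: connections are stand-in queues of pending chunks, and the names `service` and `MAX_READS_PER_TURN` are hypothetical. Only the limit of 16 comes from the entry.

```python
from collections import deque

MAX_READS_PER_TURN = 16  # the fixed limit named in the entry

def service(connections):
    """connections: dict mapping a name to a deque of pending chunks."""
    order = []                      # records which connection was read, in sequence
    ready = deque(connections)      # connections waiting for a worker
    while ready:
        name = ready.popleft()
        pending = connections[name]
        for _ in range(MAX_READS_PER_TURN):
            if not pending:
                break
            pending.popleft()       # stands in for one read from the socket
            order.append(name)
        if pending:
            ready.append(name)      # still busy: requeue behind the others
    return order
```

A connection with a large backlog is interrupted after 16 reads, so a quiet connection queued behind it gets serviced before the busy one continues.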
- 2022-05-11: omelasticsearch: several support options for ElasticSearch 8
- config params searchIndex and documentType can be empty
- support for Data Stream API
- new config param esVersion.major
Thanks to github user EHerzog76 for these changes.
- 2022-05-09: tcp receiver bugfix: delay/potential hang on some error conditions
Errors were not correctly handled in some cases for imtcp and imgssapi. This could
lead to a temporary stall of some connections. For ultra-low traffic systems, this
stall could stay for a long period of time. In most cases, it was resolved very quickly.
Note that imptcp was not affected.
Thanks to Iwan Timmer for the fix.
- 2022-05-05: net bugfix: potential buffer overrun
There is a heap buffer overflow vulnerability in rsyslog tcp reception components.
This can only happen in octet-counted mode, which is enabled by default.
Affected components: imtcp, imptcp, imhttp, imgssapi, imdiag when octet-counted
framing was enabled.
If the receiver ports are exposed to the public Internet AND are used
without authentication, this can lead to remote DoS and potentially to
remote code execution. It is unclear if remote code execution is
actually possible. If so, it needs a very sophisticated attack.
When syslog best practices with proper firewalling and authentication
are used, the attack can only be carried out from within the Intranet
and authorized systems. This limits the severity of the vulnerability
considerably (it would obviously require an attacker already to be
present inside the internal network).
Credits to Peter Agten for initially reporting the issue and working
with us on the resolution.
fixes CVE-2022-24903
- 2022-05-05: imptcp: set OS worker thread name
We now set the worker thread names to "imptcp/<thrd nbr>" where
<thrd nbr> is the numerical index (0, 1, ...) of the worker thread.
This makes it possible to distinguish individual worker threads in OS tools like
htop. That is useful for performance testing and system monitoring.
The chosen name format is consistent with other similar thread
names inside rsyslog. For imptcp, worker threads were not yet
given individual names.
Note: "in:imptcp" is imptcp's "main" thread, which also is used
as a worker in some scenarios. This name was not modified.
- 2022-04-26: mmanon bugfix: shortened IPv6 form not always anonymized
If the IPv6 is in non-recommended form followed by a 5 digit port number, it
is not anonymized.
A reproducer for this is: 1a00:c820:1180:c84c::ad3f:d991:ec2e:49255
- 2022-04-22: mmdblookup fix: wrong copy of buffer
...following parse of libmaxminddb's return after a successful search sometimes
failed to return specific field from data.
Thanks to Théo Bertin for the patch.
- 2022-04-22: mmdblookup: several enhancements
- support arrays in MMDB entry
- support escaped quotes '"' in MMDB entry
- support '<' characters in MMDB entry, when in a field
- support '}' characters in MMDB entry, when in a field
Thanks to Théo Bertin for the patch.
Scheduled Release 8.2204.1 (aka 2022.04) 2022-05-05
- security bugfix: potential buffer overrun in imptcp, imtcp, imgssapi and others
This addresses CVE-2022-24903
see also
Scheduled Release 8.2204.0 (aka 2022.04) 2022-04-19
- 2022-04-18: gnutls bugfix: possibility of infinite loop
There was a rare possibility that the E_AGAIN/E_INTERRUPT handling
could cause an infinite loop (100% CPU Usage), for example when a TLS
handshake is interrupted at a certain stage.
* After gnutls_record_recv is called, and E_AGAIN/E_INTERRUPT error
occurs, we need to do additional read/write direction handling
with gnutls_record_get_direction.
* After the second call of gnutls_record_recv (Expand buffer)
we needed to also check the error codes for E_AGAIN/E_INTERRUPT
to do proper error handling.
* Add extra debug output based on ossl driver.
* Potential fix for 100% CPU Loop Receiveloop after gtlsRecordRecv
in doRetry call.
- 2022-04-17: core/bugfix: errorfile could grow over max configured size
When the action.errorfile.maxsize configuration option is enabled and the error file
already has a certain size smaller than the configured max size, it can grow
beyond the configured max size because the code assumes the error file size to be zero.
This fix reads the current error file size and limits the size to the configured
maximum.
Thanks to Sergio Arroutbi for the patch.
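The idea behind the fix, starting from the file's real size instead of assuming zero, can be sketched in Python. The function name `append_limited` is hypothetical, and truncating the final write to fit is an assumption made for the example; rsyslog may handle a record that does not fit differently.

```python
import os

def append_limited(path: str, data: bytes, max_size: int) -> int:
    """Append data without letting the file exceed max_size; return bytes written."""
    # Start from the file's current size rather than assuming it is empty.
    current = os.path.getsize(path) if os.path.exists(path) else 0
    room = max(max_size - current, 0)
    chunk = data[:room]  # assumption: truncate to the remaining room
    if chunk:
        with open(path, "ab") as f:
            f.write(chunk)
    return len(chunk)
```

If a file already holds 90 bytes and the limit is 100, only 10 more bytes are accepted, regardless of how much the caller tries to append.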
- 2022-04-17: omkafka bugfix: potential misaddressing
The `failedmsg_entry` expects a null-terminated string in `key`, but
here we allocate with malloc and copy a string-with-length-n into only
the first n bytes. If the final byte is null, this is by coincidence.
This was observed by means of seeing random binary data appended to
keys submitted to kafka apparently at random. This could also result
in more severe problems, including a segfault.
Thanks to David Buckley for the patch.
- 2022-04-06: added new "FullJSONFmt" standard template (with addtl fields)
This comes in handy for a number of use cases, especially with ElasticSearch.
Thanks to Art O Cathain for the patch.
- 2022-04-04: imfile: potential processing delay
This was mentioned by Mikko Kortelainen without exact details on what exactly
this could cause in practice. But we were confident enough that it is worth
merging (though it does not look like something that brought real problems in
practice, as we do not know any related reports).
see also:
Thanks to Mikko Kortelainen for the patch.
- 2022-04-04: bugfix: cosmetic data races
There was a more or less cosmetic data race which could happen when child
processes died in quick sequence. Even then, no real harm happened, as all
children were reaped eventually.
A similar data race exists for HUP processing.
However, these races polluted TSAN test runs, and so we fixed them.
- 2022-04-01: add property options to support ISO week/year number
Thanks to Mattia Barbon for the patch.
- 2022-04-01: core bugfix: "action suspended" message was emitted even when turned off
Most messages were disabled, but there was one part of the code that ignored the
user configuration.
Thanks to Deyneko Aleksey for the patch.
- 2022-03-31: testbench: add more tests for rscript comparison operations
- 2022-03-31: core bugfix: make internal logs emitted during HUP processing appear quicker
After the call to doHUP(), there is probably an internal log message in the list. However, it
will not be written out immediately, because the mainloop will be blocked at
pselect in wait_timeout() until a long timeout or the next message occurs.
Worse, the log message may be lost if the daemon exits unexpectedly.
We might as well put processImInternal() after doHUP(), so that the message
will be flushed out immediately.
Fixes: 723f6fdfa6(rsyslogd: Fix race between signals and main loop timeout)
Thanks to Yun Zhou for the patch.
- 2022-03-20: refactor: Move the parser directive to the main config
Thanks to Attila Lakatos for the patch.
- 2022-03-16: refactor: make the main message queue part of the config
The intent of this patch is to make the main message queue part of the main config.
It will help us to proceed towards dynamic configuration reload.
- regression bugfix: rsyslog may segfault during startup
glblGetMaxLine() might be called even before the main configuration file exists,
resulting in unexpected behavior, most probably a segmentation fault. This is addressed
by re-introducing the old default of 8KiB. The problem was introduced earlier in
- regression fix: script string comparison did not work correctly
In rscript, comparison operations on strings did not work correctly
and returned false results. This is caused by a regression in commit
5cec5dd634e0. While it fixed number comparisons, it introduced new
problems in string comparisons, which were not present before. Note
that most items in rsyslog are strings, so this can actually cause
some problems.
Scheduled Release 8.2202.0 (aka 2022.02) 2022-02-15
- 2022-02-14: imfile bugfix: remove cause for "internal error message" (not causing harm)
When any message is output into a renamed input file, rsyslogd outputs the following:
imfile: internal error? inotify provided watch descriptor 7 which we could not find
in our tables - ignored
When rsyslogd detects the inode change, it deletes the entry from wdmap[]. But
the watch descriptor is not removed. Some applications like sssd output some messages
(like "HUP signal was received!!") after the HUP signal is received and before switching
to the new log file. As a result, the above message can be output on every log rotation.
This situation is now resolved.
Thanks to Masahiro Matsuya for the patch.
- 2022-02-04: rscript bugfix: literal numbers were not compared correctly
This problem occurred when numbers were used in rsyslog.conf in
the set statement, e.g.
set $nbr = 1234;
In this case, during comparisons, the number was actually interpreted
as a string with digits. Thus numerical comparisons led to unexpected
results. Even more so, as in other places of the code they were
treated as native numbers.
This is now fixed. We cannot rule out that this causes, in border cases,
a change of behavior in existing configs. But it is unlikely and the
previous behaviour was a clear bug and very unintuitive. Thus, in our
opinion, it is justified to risk a breaking change for an expected
very minor subset of installations, if any such exists at all.
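Why treating a literal number as a digit string gives surprising results can be shown with a short Python sketch. It only demonstrates the general difference between lexicographic and numeric ordering; the function names are hypothetical and this is not how rscript is implemented.

```python
def less_than_as_strings(a, b) -> bool:
    # Lexicographic comparison: characters are compared left to right.
    return str(a) < str(b)

def less_than_as_numbers(a, b) -> bool:
    # Numeric comparison: the values themselves are compared.
    return int(a) < int(b)
```

For 1234 and 999 the string comparison reports "less than" because "1" sorts before "9", while the numeric comparison does not.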
- 2022-02-04: omelasticsearch bugfix: indexSuccess impstats counter in bulkmode wrong
When bulkmode is enabled, and a batch was processed without any
failures (errors is false), the code that increments the indexSuccess
impstats counter was never reached.
- 2022-01-17: imkmsg bugfix: effectively disabled input on error reading kmsg
Due to a program bug, imkmsg could not recover from a kmsg read error.
Note that recovering is possible and was intended.
Thanks to Kailash Sethuraman for the patch.
- 2022-01-17: imtcp bugfix: worker threads were not properly terminated
Graceful shutdown of Rsyslog could lead to segmentation faults when
multiple imtcp inputs were being used. That is because the rest of the
tcpsrv threads are left behind running, while their underlying objects
are being disposed by the main thread as part of the module
Thanks to Gabor Orosz for the analysis and patch.
- 2022-01-07: omlibdbi bugfix: use-after-free bug
This occurred when the sqlite driver was used. Depending on circumstances, the
effects ranged from no visible issues (often) to an rsyslog segfault. The busier rsyslog is, the more
likely a bad outcome.
- 2022-01-06: omhttp bugfix: memory leak in lokirest batchmode
A JSON object was created (valueObj) but not used and also not released, causing a
memory leak. Over time, this could lead to memory overcommitment.
Scheduled Release 8.2112.0 (aka 2021.12) 2021-12-16
- 2021-12-14: refactor: deallocate outchannel resources in rsconf destructor
Thanks to Attila Lakatos for the patch.
- 2021-12-14: refactor: use runConf instead of loadConf in ratelimiting during runtime
Thanks to Attila Lakatos for the patch.
- 2021-11-22: new contribution: URL parser module function using libfa
Thanks to Théo Bertin for the patch.
- 2021-11-18: mmanon: relax IPv6 detection - improve anonymization
We so far tried to ensure a value is really an IPv6 address, in order
to avoid mangling merely similar-looking information elements.
However, this led to misdetection for unusual formats, e.g. when a
port is appended to a numerical IPv6 address given without brackets [].
This has been changed now. In a sense, we now prefer to err on the
side of privacy.
Previously, a suspect value was not anonymized, and thus some other
elements (like some MAC addresses) preserved. Now the opposite is
true, and we anonymize anything that looks close enough to be an
IPv6 address. This improves anonymization.
- 2021-11-10: ruleset bugfix: ruleset queue was incorrectly named
The ruleset was incorrectly and unusably named. This was a regression
from 4a63f8e9629c3c9481a8b6f9d7787e3b3304320b.
Many thanks to github user digirati82 for alerting us.
- 2021-11-10: omsnmp: update module to current IP best practices
The omsnmp module uses the inet_addr() function to convert the Internet host address
from IPv4 numbers-and-dots notation into binary data in network byte order. If the input
is invalid, INADDR_NONE (usually -1) is returned. Use of this function is problematic
because -1 is a valid address (255.255.255.255). We should avoid its use in favor of
inet_aton(), inet_pton(3), or getaddrinfo(3), which provide a cleaner way to indicate
error return [1].
This is just a request to satisfy covscan, so no error is reported at all.
Thanks to Attila Lakatos for the patch.
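The ambiguity described in this entry can be sketched in a few lines of C. This is a minimal illustration, not the omsnmp code: the helper names parse_ipv4() and inet_addr_is_ambiguous() are hypothetical, and only inet_pton() and inet_addr() are real library calls.

```c
#include <arpa/inet.h>
#include <assert.h>
#include <netinet/in.h>
#include <stddef.h>

/* Hypothetical helper (not part of omsnmp): convert dotted-quad text to a
 * binary IPv4 address in network byte order. Returns 1 on success and 0 on
 * invalid input, so the error indication is separate from the address. */
int parse_ipv4(const char *text, struct in_addr *out)
{
    assert(text != NULL && out != NULL);
    /* inet_pton() returns 1 on success and 0 if the text is not a valid
     * address for the given family. */
    return inet_pton(AF_INET, text, out) == 1;
}

/* With inet_addr(), the broadcast address cannot be told apart from a
 * failure: both calls below yield INADDR_NONE. */
int inet_addr_is_ambiguous(void)
{
    return inet_addr("255.255.255.255") == INADDR_NONE
        && inet_addr("not-an-address") == INADDR_NONE;
}
```

With parse_ipv4(), "255.255.255.255" is accepted as a valid address while malformed text is rejected, which is the distinction inet_addr() cannot express.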
- 2021-10-27: ommysql: fix threading bug
When the MariaDB connection was (re)established, an old or NULL handle
could be used. This is fixed now.
We need to synchronize access to the mysql handle, because multiple threads
use it and we may need to (re)init it during processing. This could lead to
races with potentially wrong addresses or NULL accesses. Whether this really
matters depends mostly on the MariaDB/MySQL client library. It looks like
they guard against fatal failures. Anyhow, logging errors inside rsyslog
could happen in any case.
- 2021-10-25: testbench: false positive when impstats was not built
Test omfwd_fast_imuxsock failed when impstats was not built. This
has been corrected, test is now only executed when impstats is built.
- 2021-10-25: imtcp: add support for permittedPeers setting at input() level
The permittedPeers setting was actually forgotten during the refactoring
of TLS input() level settings. This functionality is now added.
Scheduled Release 8.2110.0 (aka 2021.10) 2021-10-19
- 2021-10-13: config bugfix: global(security.abortonidresolutionfail=) did not work
When used with rscript based configuration, it was not checked.
- 2021-10-13: config bugfix: global param $privDropToUser did not work correctly
The parameter was not implemented for rscript based configuration and
did not properly apply to legacy configuration. In essence, it almost always
did not work as expected.
see also:
see also:
- 2021-10-12: rscript bugfix: ruleset called async when ruleset had queue.type="direct"
The call rscript statement is able to call a rule set either synchronously or
asynchronously. We did this, because practice showed that both modes
are needed. For various reasons we decided to make async
calls if the ruleset has a queue assigned and sync if not.
To know if a "queue is assigned" we just checked if queue parameters were
given. We overlooked the case of someone explicitly specifying a
"direct queue", aka "no queue". As such, queue="direct" triggered async
calls. That in turn meant that when a write operation to a variable was
made inside that rule set, other rulesets could or could not see the
write. While it was often not seen, this was a data race where the
change could also be seen by the outside.
This is now fixed. No matter if queue.type="direct" is specified or
left out, the call will always be synchronous. Any values written to
variables will also be seen by the "outside world" in later processing.
Note that this has some potential to BREAK EXISTING CONFIGURATIONS.
We deem this acceptable because:
1. this was racy anyway, so unexpected behaviour could always occur
2. it is actually unlikely that someone used the triggering conditions
in practice. But we cannot rule this out, especially when the
configuration was auto-generated.
Potential compatibility issues can be solved by defining a small
array-memory queue on the ruleset in question instead of specifying
direct type.
Again, we expect that almost all users will never experience any
problems. If you do, however, please let us know: we may add an
option to re-enable the bug.
- 2021-10-12: ksi bugfix: locking bug fixed in rsksiCtxOpenFile
Thanks to Taavi Valjaots for the patch.
- 2021-10-11: core bugfix: fix typo in error message
Thanks to github user jkschulz for the patch.
- 2021-10-11: tcpsrv bugfix: compilation without exceptions
tcpsrv.c:992:1: error: label at end of compound statement
Quoting from pthread.h:
pthread_cleanup_push and pthread_cleanup_pop are macros and must always
be used in matching pairs at the same nesting level of braces.
Amends commit bcdd220142ec9eb106550195ba331fd114adb0bd.
Thanks to Orgad Shaneh for the patch.
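As a reminder of what the quoted pthread.h rule means in practice, here is a minimal sketch. It is not the tcpsrv code: locked_increment() and unlock_mutex() are hypothetical names. The push and pop calls sit in the same function at the same brace level, because the two macros may expand to an opening and a closing brace.

```c
#include <assert.h>
#include <pthread.h>

/* Cleanup handler: releases the mutex if the thread is cancelled while
 * holding it, or when pthread_cleanup_pop() is called with a nonzero
 * argument. */
static void unlock_mutex(void *arg)
{
    pthread_mutex_unlock((pthread_mutex_t *)arg);
}

/* Hypothetical worker step: increments *counter under the lock and returns
 * the new value. */
int locked_increment(pthread_mutex_t *mtx, int *counter)
{
    int value;

    assert(mtx != NULL && counter != NULL);
    pthread_mutex_lock(mtx);
    pthread_cleanup_push(unlock_mutex, mtx); /* may open a block */
    value = ++*counter;
    pthread_cleanup_pop(1);                  /* closes it and runs the handler */
    return value;
}
```

Passing 1 to pthread_cleanup_pop() runs the handler on the normal path as well, so the mutex is released exactly once whether or not the thread is cancelled.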
- 2021-10-11: mmkubernetes bugfix: no connection retry to kubernetes API
When connection to the kubernetes API was not possible, mmkubernetes
did not retry. This does now happen via regular rsyslog retry
Thanks to github user jayme-github for the analysis and patch.
- 2021-10-11: openssl bugfix: Correct gnutlsPriorityString (custom ciphers) behaviour
- Only apply default anon ciphers if gnutlsPriorityString is NULL and
Authentication Mode is set to anon. Otherwise we do not set them
as they overwrite custom Ciphers.
- Added two tests for custom cipher configuration (anon/certvalid mode).
- Add call for applyGnutlsPriorityString if gnutlsPriorityString changes.
- Merged openssl init code from Connect into osslInitSession
- 2021-10-11: build issue: handle undefined MAXPATHLEN, PATH_MAX
While we handled missing PATH_MAX, we did not handle missing MAXPATHLEN.
This happens under GNU/Hurd, because there is no official limit. However,
extremely long paths are extremely uncommon, so we do not want to
use slow dynamic alloc each time we need to build paths. So we
impose a limit of 4KiB, which should be sufficient. Note that
this obviously increases stack requirements in GNU/Hurd.
As suggested by Michael Biebl, we have now implemented a generic
approach to handle this via autoconf.
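The idea behind this entry can be sketched with plain preprocessor fallbacks. This is an assumption-laden illustration: the actual fix is done via autoconf, and build_path() is a hypothetical helper. Only the 4KiB limit is taken from the entry.

```c
#include <assert.h>
#include <limits.h>     /* may define PATH_MAX */
#include <stdio.h>
#include <sys/param.h>  /* may define MAXPATHLEN */

/* Fallbacks for platforms without an official limit (e.g. GNU/Hurd).
 * 4096 mirrors the 4KiB limit mentioned in the entry. */
#ifndef PATH_MAX
#  ifdef MAXPATHLEN
#    define PATH_MAX MAXPATHLEN
#  else
#    define PATH_MAX 4096
#  endif
#endif
#ifndef MAXPATHLEN
#  define MAXPATHLEN PATH_MAX
#endif

/* Hypothetical helper: join dir and file into a caller-provided buffer,
 * typically a stack array of MAXPATHLEN bytes. Returns 0 on success and
 * -1 if the result would not fit. */
int build_path(char *buf, size_t buflen, const char *dir, const char *file)
{
    assert(buf != NULL && dir != NULL && file != NULL);
    int n = snprintf(buf, buflen, "%s/%s", dir, file);
    return (n < 0 || (size_t)n >= buflen) ? -1 : 0;
}
```

A caller would declare `char path[MAXPATHLEN];` on the stack and check the return value, which avoids a dynamic allocation per path at the cost of the larger stack frame noted in the entry.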
- 2021-09-12: openssl: extended output information on connection failure
Now includes the remote client/server IP address in the log output.
- 2021-09-12: imhttp enhancements - query parameter ingestion & basic auth support
- Basic Authentication support & tests
* configured via imhttp option "basicAuthFile". This option should be configured
to point to your htpasswd file generated via a standard htpasswd tool.
- Query parameter ingestion capability & tests
use the `addmetadata` option to inject query parameters into
metadata for imhttp input.
libaprutil (libaprutil1-dev on debian'ish, apr-util-devel on Red Hat)
Thanks to Nelson Yen for the patch.
- 2021-09-07: testbench bugfix: privdrop tests under root user did not work
When running under root, the privdrop tests did not properly work. This
patch fixes the issue and skips test where necessary.
This also includes some modernization of the related tests.
- 2021-09-07: core/ratelimiting: fix rate limiting for already parsed messages
Rate limiting may not have worked if the considered message had already
been parsed (not having NEEDS_PARSING in msgFlags).
This affects also imuxsock in its default configuration
(useSpecialParser="true" and ratelimit.severity="1")
- 2021-09-07: core bugfix: use of property $wday terminates string
When $wday is used inside a template, all template parts after it
are ignored. For example:
template(name="json_filename" type="string" string="/var/log/%$wday%.log")
would generate something like "/var/log/0" - the ".log" part would be
missing. For the same reason, $wday cannot reliably be checked in script
Thanks to Alain Thivillon for reporting the bug and providing an
excellent analysis, which essentially was exactly this fix here.
- 2021-09-07: core/queue bugfix: potential misaddressing when queue discarded messages
When a discard mark was set, the queue was very busy and discarded messages, a
NULL pointer access could happen. Depending on circumstances, several problems
could occur, including a SEGFAULT. This is now fixed.
- 2021-09-07: imdiag bugfix: iOverallQueueSize calculation could be incorrect
This issue only affects testbench and rsyslog development debugging. The active
messages counter, used for synchronizing test steps, went wrong when the queue
discarded messages on its consumer thread. Now fixed.
- 2021-09-06: gnutls driver: SAN priority did not work correctly on server side
PrioritizeSAN was not propagated when accepting a new connection, this is now fixed.
Thanks to Attila Lakatos for the patch.
- 2021-08-24: config: implement script-equivalent for $PrivDrop* statements
Scheduled Release 8.2108.0 (aka 2021.08) 2021-08-17
- 2021-08-16: openssl tls: Improved error message output on tls failures.
- 2021-08-16: impstats: add percentile metrics tracking functionality
Brief overview:
To configure tracking percentile metrics in rainerscript:
User would need to define:
- which percentile to track, such as [p50, p99, etc.]
- window size - note, this correlates directly with memory usage to
track the percentiles.
To track a value, user would call built-in function `percentile_observe()` in their configurations to
record an integer value, and percentile metrics would be emitted every
impstats interval.
Thanks to Nelson Yen for the patch.
- 2021-08-12: imfile: add parameter "ignoreolderthanoption"
instructs imfile not to ingest a file that has not been modified in the
specified number of seconds.
Thanks to github user yanjunli76 for the patch (submitted from Nelson Yen)
- 2021-08-10: imklog bugfix: invalid memory addressing, could cause abort
This is a regression from commit 94c4a87. It introduced a free() call
using an object that was no longer valid (the main pointer to the
to-be-freed object was already freed at time of use). This could
cause various issues, including a segfault.
Note: this bug was triggered only during late phase of rsyslog
shutdown, so it did not affect regular operation.
Special thanks to github user wxiaoguang for analyzing the issue
and providing a draft fix proposal, on which this patch builds.
see also
- 2021-08-09: imfile bugfix: deleteStateOnFileDelete missed some state files
When the log file is deleted, imfile would attempt to delete the statefile but it
was missing the file_id part of the statefile name. This means the statefiles were
only removed if the log file was less than 512 characters, because for very small
files the file ID hash is not created. This led to some state files not being deleted.
Thanks to pearseimperva for the patch.
- 2021-08-09: imfile bugfix: hash char invalidly added in readmode != 0
If imfile is ingesting log files with readMode set to 2 or 1, the resulting
messages all have a '#' character at the end. This patch corrects the behaviour.
Note: if some external script "supported" the bug of extra hash character at
the end of line, it may be necessary to update them.
- 2021-08-09: omelasticsearch bugfix: errorFile mutex was not consistently locked
Lock the file during SIGHUPs to avoid issues with concurrent accesses by
Thanks to François Poirotte for the patch.
- 2021-08-09: imudp: add socket type (IPv4 vs. 6) to input name
Most importantly, the input name is used for stats counter names as
well. Previously, the same name was used for IPv4 and IPv6, so we had
two counters with an equal name. That left users puzzled.
Unfortunately, this change can potentially require changes to existing
analysis scripts, as the name is now slightly different.
- 2021-08-06: omfwd: add capability for action-specific TLS certificate settings
This permits to override the global definitions for TLS certificates
at the action() level.
- 2021-08-06: imfile bugfix: file handle leak if "freshStartTail" was turned on
- 2021-08-05: imtcp: permit to use different certificate files per input/action
This completes the ability to override global/default TLS settings at the imtcp
input() level. Support for using multiple CAs/Certs per Connection is now provided.
- 2021-08-04: imptcp bugfix: keep alive interval was incorrectly set
The interval was accidentally set to keep alive interval. This has been fixed.
- 2021-07-08: openssl network driver bugfix: small memory leak
Fixes a static, non-growing memory leak which existed when parameter
"gnutlsPriorityString" was used. This was primarily a cosmetic issue,
but caused some grief during development in regard to memory leak
Note: yes, this is for openssl -- the parameter name is historical.
- 2021-07-07: tcpsrv bugfix: abort if no listener could be started
Modules (like imtcp and imdiag) which use tcpsrv could abort or
otherwise malfunction if no listener for a specific input could
be started.
Found during implementing a new feature, no report from practice.
But could very well happen.
- 2021-07-07: mmkubernetes bugfix: apiserver error handling
- Added graceful handling of apiserver errors with unexpected responses,
i.e., anything other than 200, 404, or 429. Idea is that apiserver
transient error state will recover. We don't want mmkubernetes to miss
metadata resolution for containers that don't have cached metadata.
During these transient error states, mmkubernetes will provide basic
container file path based resolution of namespace and pod metadata for
new pods whose metadata is not yet cached. After this error state
recovers, mmkubernetes is expected to resume its metadata resolution as
- Added a unit test case for apiserver return 500 with changes to mock server
- Fixed existing unit test that was failing due to missing expected results file
- Added mmkubernetes unit tests to testbench
Thanks to Abdul Waheed for the patch (submitted from Nelson Yen).
- 2021-07-07: ommongodb bugfixes
- Fix Segmentation fault when server is down
- Add server connection check while resuming
Thanks to Kevin Guillemot for the patch.
- 2021-06-28: omkafka improvements
- drain librdkafka queues and retry later during rsyslog restart or hup. This
re-injects messages into rsyslog's native queues.
- add statsname on per kafka instance for better visibility
- omkafka - count errors related ssl as "errors_ssl"
Thanks to Nelson Yen for the patch.
- 2021-06-23: some CI/QA improvements, Travis-CI disabled
For the time being, Travis CI is disabled because it was outdated and Travis also
changed their system. We will re-evaluate if we re-enable it. Since quite a while
the Travis tests were redundant with the rest of CI, so this does not reduce
- 2021-06-23: omhttp bugfix: dynrestpath param in batch mode invalid
When batchmode was used, the templates could not be used to
expand dynrestpath. We are now storing the restpath param
within the batch data if we are in batch mode.
When we are in batch mode, and the restpath value changes, the
batch is submitted and reinitialized
- 2021-06-17: add predefined template RSYSLOG_SyslogRFC5424Format
This is essentially the same as RSYSLOG_SyslogProtocol23Format with
a better name and a fix to remove the unnecessary LF at the end of
the message.
The different name also enables us to fix the LF issue without
any concern about backwards compatibility.
- 2021-06-17: impstats/bugfix: _sender_stats reports integer counter as string
Note that this introduces a small backwards incompatibility: in previous output
the field was of string type, now it is integer (as intended). We discussed this
on the mailing list and the overwhelming thought was that this is not a problem
because almost all analysis backends are able to cover that format change. This made
the bugfix essentially cosmetic.
HOWEVER, if you still experience issues, please let us know. We can add an option
to provide the previous format, and only refrained from doing so because there was no
evidence it was needed.
Scheduled Release 8.2106.0 (aka 2021.06) 2021-06-15
NOTE: the prime new feature is support for TLS and non-TLS connections
via imtcp in parallel. Furthermore, most TLS parameters can now be overridden
at the input() level. The notable exceptions are certificate files, something
that is due to be implemented as next step.
- 2021-06-14: new global option "parser.supportCompressionExtension"
This permits to turn off rsyslog's single-message compression extension
when it interferes with non-syslog message processing (the parser
subsystem expects syslog messages, not generic text)
- 2021-05-12: imtcp: add more override config params to input()
It is now possible to override all module parameters at the input() level. Module
parameters serve as defaults. Existing configs need no modification.
- 2021-05-06: imtcp: add stream driver parameter to input() configuration
This permits to have different inputs use different stream drivers
and stream driver parameters.
- 2021-04-29: imtcp: permit to run multiple inputs in parallel
Previously, a single server was used to run all imtcp inputs. This
had a couple of drawbacks. First and foremost, we could not use
different stream drivers in the various inputs. This patch now
provides a baseline to do that, but does still not implement the
capability (in this sense it is a staging patch).
Secondly, we now ensure that each input has at least one exclusive
thread for processing, untangling the performance of multiple
inputs from each other.
see also:
- 2021-04-27: tcpsrv bugfix: potential sluggishness and hang on shutdown
tcpsrv is used by multiple other modules (imtcp, imdiag, imgssapi, and,
in theory, also others - even ones we do not know about). However, the
internal synchronization did not properly take multiple tcpsrv users
in consideration.
As such, a single user could hang under some circumstances. This was
caused by improperly awaking all users from a pthread condition wait.
That in turn could lead to some sluggish behaviour and, in rare cases,
a hang at shutdown.
Note: it was highly unlikely to experience real problems with the
officially provided modules.
- 2021-04-22: refactoring of syslog/tcp driver parameter passing
This has now been generalized to a parameter block, which makes it much cleaner and
also easier to add new parameters in the future.
- 2021-04-22: config script: add re_match_i() and re_extract_i() functions
This provides case-insensitive regex functionality.
Scheduled Release 8.2104.0 (aka 2021.04) 2021-04-20
- 2021-04-19: new contributed module imhiredis
Thanks to Théo Bertin (frikilax) for the patch.
- 2021-04-19: new built-in function get_property() to access property vars
Provides ability to evaluate a rsyslog variable using dynamically
evaluated parameters.
1st param is the rsyslog param, 2nd param is a key, can be an array
index or key string.
Useful for accessing json sub-objects, where a key
needs to be evaluated at runtime. Can be used to access arrays as well.
Thanks to Nelson Yen for contributing this module.
- 2021-04-19: mmdblookup: add support for mmdb DB reload on HUP
Thanks to Théo Bertin (frikilax) for the patch.
- 2021-04-19: script bugfix: empty array in foreach() improperly handled
When running a foreach() loop inside a ruleset, if the json array/object iterated
over is empty but valid, the foreach will make the message processing in the
ruleset abort operation, no following operation (such as actions) will be
executed after this.
Thanks to Théo Bertin (frikilax) for the patch.
- 2021-04-19: imjournal bugfixes (handle leak, empty file)
Flush the FILE* buffer before rename & fsync in order
to not end up syncing an empty file.
Also, close WorkDir on fsync in order to prevent
file descriptor leakage.
Thanks to github user gerd-rausch for the fix.
- 2021-04-06: new contributed function module fmunflatten
This commit adds a new rainerscript function to unflatten keys in a JSON tree. It
provides a way to expand dot separated fields.
<result> = unflatten(<source-tree>, <key-separator-character>);
It allows for instance to produce this: { "source": { "ip": "", "port": 443 } }
from this source data: { "source.ip": "", "source.port": 443 }
Thanks to Julien Thomas for the contribution.
- 2021-02-22: test bugfix: some tests did not work with newer TLS library versions
Newer versions provide TLS versions that cannot be disabled in older versions as they
are unknown there. This is solved by setting restrictions in multiple steps. For
older library versions, the final step will error out, but the other ones will be applied.
This permits to achieve proper test results.
- some improvements to project CI
Scheduled Release 8.2102.0 (aka 2021.02) 2021-02-16
- 2021-02-15: omfwd: add stats counter for sent bytes
Thanks to John Chivian for suggesting this feature.
- 2021-02-15: omfwd: add error reporting configuration option
Rsyslog over plain TCP cannot guarantee message delivery
without using the RELP protocol. Besides that, the logs may be
flooded with connection errors making the rest of messages
difficult to find. To alleviate the problem (see issue 3910),
this patch adds a configuration option that enables to reduce
the number of network errors logged and reported.
For example, if each 10th network error message should be logged,
the rsyslog configuration has to be updated as follows.
action(type="omfwd" Target="<IP_ADDR>" Port="<PORT>" Protocol="tcp" ConErrSkip="10")
Thanks to Libor Bukata for the patch.
- 2021-02-15: action stats counter bugfix: failure count was not properly incremented
In some cases the counter was not incremented, most notably with transaction-enabled
Thanks to github user thinkst-marco for the patch.
- 2021-02-15: action stats counter bugfix: resume count was not incremented
And so it always stayed at zero.
Thanks to github user thinkst-marco for the patch.
- 2021-02-15: omfwd bugfix: segfault or error if port not given
If omfwd is configured via RainerScript config format and the "port"
parameter is not given, a segfault will most likely happen on
connection establishment for TCP connections. For UDP, this is
usually not the case.
Alternatively, in any case, errors may happen.
Note that the segfault will usually happen right on restart so this
was easy to detect.
We did not receive reports from practice. Instead, we found the bug
while conducting other work.
- 2021-01-29: lookup table bugfix: data race on lookup table reload
A data race could happen when a lookup table was reloaded. We found
this while moving to newer version of TSAN, but have no matching
report from practice. However, there is a potential for this to cause
a segfault under "bad circumstances".
- 2021-01-18: testbench modernization
Bump dependency versions, use newer distro versions for some tests.
Make kafka distcheck separate to help diagnose flaky kafka tests.
- 2021-01-16: testbench: fix invalid sequence of kafka tests runs
kafka tests cannot run well in parallel (mostly due to resource
constraints on CI machines). Accidentally, this was not enforced for
one of the tests. That could lead to random failures and false positives.
- 2021-01-14: testbench: fix kafkacat issues
The kafkacat tool has an upper limit of how many messages it can send
at once. Going over that limit causes messages loss. The exact limit
seems to depend on the environment. This causes testbench false positives.
This commit fixes two related issues:
- errors during kafkacat run were not detected - this has been added
- we now have a "max messages at once" setting, after which kafkacat
is restarted for the next batch of messages. It currently is set
to 25,000 msgs per incarnation. All tests loop now to send the
required number of messages. This has been fixed at the testbench
framework level, so no need to adjust individual tests.
- 2021-01-14: testbench: fix year-dependent clickhouse test
A test had the year value hardcoded and as such failed whenever the
year changed. This patch corrects that.
Scheduled Release 8.2012.0 (aka 2020.12) 2020-12-08
- 2020-12-07: testbench bugfix: some tests did not work in make distcheck
- certificate file missing in dist tarball
- some test cases did not properly specify path to cert file
Thanks to Michael Biebl for alerting us and providing part of
the fix.
- 2020-12-07: immark: rewrite with many improvements
- mark message text can now be specified
- support for rulesets
- support for using syslog API vs. regular internal interface
- support for output template system
- ability to specify if mark message flag can be set
- minor changes and improvements
- 2020-11-30: usability: re-phrase error message to help users better understand cause
see also
- 2020-11-10: add new system property $now-unixtimestamp
Among others, this may be used as a monotonic counter
for doing load-balancing and other things.
Thanks to Nicholas Brown for suggesting this feature.
- 2020-11-04: omfwd: add new rate limit option
Adding new rate limit option to omfwd for rate limiting
syslog messages sent to the remote server
Specifies the rate-limiting interval in seconds.
Default value is 0, which turns off rate limiting.
Specifies the rate-limiting burst in number of messages.
Thanks to Dinesh-Ramakrishnan for the patch.
- 2020-11-03: omfwd bug: param "StreamDriver.PermitExpiredCerts" is not "off" by default
The default behaviour of expired certificates of stream driver in TLS mode, should
have been that the TCP transmission is closed due to expired certificates, and
error messages emitted in rsyslog status. This was not the case. That in turn could
lead to permitting sessions which should not be permitted.
Thanks to Vincent Zhu for alerting us and providing a great problem analysis
Scheduled Release 8.2010.0 (aka 2020.10) 2020-10-20
- 2020-10-13: gnutls TLS subsystem bugfix: handshake error handling
If the tls handshake does not immediately finish, gnutls_handshake is called in
doRetry handler again. However the error handling was not
complete in the doRetry handler. A failed gnutls_handshake call
did not abort the connection and probably caused unexpected
problems like in issues:
- 2020-10-13: core/msg bugfix: memory leak
There is a missing call to json_object_put(json) if the call to
jsonPathFindParent() failed. It's leaking memory. Depending on workload and config,
this leak can potentially grow large (albeit we did not see reports from practice).
Thanks to Julien Thomas for the patch.
- 2020-10-13: core/msg bugfix: segfault in jsonPathFindNext() when <root> not an object
The segfault happens when <bCreate> is 1 and when the <root>
container where to insert the <namebuf> key is not an object.
Here is simple reproducible test case:
// ensure we start fresh
// unnecessary if there was no previous set
unset $!;
set $! = "";
set $!event!created = 123;
Thanks to Julien Thomas for the patch.
- 2020-10-13: openssl TLS subsystem: improvements of error and status messages
Adding error logs at the ssl handshake failure scenarios.
Adding the header "nsd_ossl:" tag to these logs to identify
the origin module from which logs are generated.
Thanks to Anusha Pai G for the patch.
- 2020-10-06: add 'exists()' script function to check if variable exists
This implements a way to check if a rsyslog variable (e.g. '$!path!var') is
currently set or not.
Sample: if exists($!somevar) then ...
- 2020-10-03: core bugfix: do not create empty JSON objects on non-existent key access
Performing a condition (eg: check for an empty string) on a subtree key that does not
exist (depth > 1 from the root container), creates an empty "parent" object.
Depending on your context, you may end up with (kind of...) annoying garbage when
producing object documents (for instance to index in ES).
Also fixes a hypothetical hang condition with an almost (?) unused plugin parameter
passing mode, for details see
Thanks to Julien Thomas for the patch.
- 2020-09-28: gnutls subsystem bugfix: potential hang on session closure
Some TLS servers don't reply to graceful shutdown requests "for
optimization". This results in rsyslog's omfwd+gtls client to wait
forever for a reply of the TLS server which never comes, due to shutting
down the connection with gnutls_bye(GNUTLS_SHUT_RDWR).
On systemd systems, commands such as "systemctl restart rsyslog" just
hang for 1m30 and rsyslogd gets killed upon timeout by systemd.
This is fixed by replacing the call to gnutls_bye(GNUTLS_SHUT_RDWR) by calls to
gnutls_bye(GNUTLS_SHUT_WR) which is sufficient and doesn't wait for a
server reply.
As an example, Kiwi Syslog server is known to cause this issue.
Thanks to Renaud Métrich for the patch.
- 2020-09-23: core/network bugfix: obey net.enableDNS=off when querying local hostname
Local hostname resolution used DNS queries even if the enableDNS was set to off, and
this could cause unexpected delays in the HUP signal handling if the DNS server was
not responsive.
Thanks to Samu Nuutamo for the fix.
- 2020-09-14: core bugfix: potential segfault on query of PROGRAMNAME property
A data race can happen on variable iLenProgram as it is not guarded
by the message mutex at time of query. This can lead to it being
non -1 while the buffer has not yet been properly set up.
Thanks to Leo Fang for alerting us and a related
patch proposal.
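The pattern behind this fix can be sketched as follows. This is a simplified, hypothetical model and not rsyslog's msg object: struct msg, msg_init() and msg_get_progname_len() are illustrative names, and the program name extraction is reduced to cutting the tag at the first '[' or ':'. The point is that the -1 sentinel is checked and updated only while the message mutex is held.

```c
#include <assert.h>
#include <pthread.h>
#include <string.h>

/* Hypothetical, simplified message object; field names are illustrative. */
struct msg {
    pthread_mutex_t mut;
    const char *tag;        /* e.g. "sshd[123]:" */
    char progname[64];
    int len_progname;       /* -1 means "not yet extracted" */
};

void msg_init(struct msg *m, const char *tag)
{
    assert(m != NULL && tag != NULL);
    pthread_mutex_init(&m->mut, NULL);
    m->tag = tag;
    m->progname[0] = '\0';
    m->len_progname = -1;
}

/* Sentinel check and buffer setup both happen under the mutex, so no
 * reader can observe a length other than -1 before the buffer is ready. */
int msg_get_progname_len(struct msg *m)
{
    int len;

    assert(m != NULL);
    pthread_mutex_lock(&m->mut);
    if (m->len_progname == -1) {
        size_t n = strcspn(m->tag, "[:");
        if (n >= sizeof(m->progname))
            n = sizeof(m->progname) - 1;
        memcpy(m->progname, m->tag, n);
        m->progname[n] = '\0';
        m->len_progname = (int)n;   /* set last, still under the lock */
    }
    len = m->len_progname;
    pthread_mutex_unlock(&m->mut);
    return len;
}
```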
- 2020-09-14: imtcp bugfix: broken connection not necessarily detected
Due to an invalid return code check, broken TCP sessions could not
necessarily be detected "right in time". This can result in the loss
of one message.
Thanks to Leo Fang for the patch.
- 2020-09-14: new module: imhttp - http input
permits to receive log data via HTTP.
uses http library to provide http input.
user would need to configure an 'endpoint' as input, along
with a ruleset, defining how the input should be routed in
Thanks to Nelson Yen for contributing this module.
- 2020-09-11: mmdarwin bugfix: potential zero uuid when reusing existing one
- fix a use-after-free variable during darwin uuid message extraction
- improve debug/output by logging uuid parse errors
Thanks to github user frikilax for the patch.
- 2020-09-10: imdocker bugfix: build issue on some platforms
An invalid variable type was used, leading to compile errors at least on
all platforms that use gcc 10 and above. Otherwise, however, it looks like the
issue caused no real harm.
- 2020-09-07: omudpspoof bugfix: make compatible with Solaris build
Thanks to Dagobert Michelsen for the patch.
- 2020-09-03: testbench fix: python 3 incompatibility
- 2020-09-02: core bugfix: segfault if disk-queue file cannot be created
When using Disk Queue and a queue.filename that can not be created
by rsyslog, the service does not switch to another queue type as
supposed to and crashes at a later step.
- 2020-08-26: cosmetic: fix dummy module name in debug output
When we have optional components (like imjournal) a dummy module
is used. Its sole purpose is to emit "this module is not available".
During init, the module emitted an invalid module name into the debug
log. This has now been replaced by the generic term "dummy".
Note: it is highly unlikely that someone will ever see that message
at all, as it is unlikely for the dummy modules to be built.
see also:
Thanks to Thomas D. (whissi) for the patch.
- 2020-08-26: config bugfix: intended warning emitted as error
When there are actions configured after a STOP, a warning should be
emitted. In fact, an error message is generated. This prevents the
construct, which may have some legit uses in exotic settings. It
may also break older configs, but as the message is an error
for so long now, this should be no longer of concern.
Scheduled Release 8.2008.0 (aka 2020.08) 2020-08-25
- 2020-08-25: imdocker bugfix: error reporting not always correct
A wrong function to obtain the error code was used. This
could lead to invalid error messages.
Thanks to Steve Grubb for the bug report and fix proposal.
- 2020-08-25: imptcp: add max sessions config parameter
The max is per-instance, not global across all instances.
There is also a bugfix where if epoll failed I think we could leave a
session linked in the list of sessions, this code unlinks it.
Thanks to Alfred Perlstein for the patch.
- 2020-08-24: omelasticsearch bugfix: reply buffer reset after health check
The issue happens when more than one server is defined on the
action. On that condition a health check is made through
checkConn() before sending the POST. The replyLen should be
set back to 0 after the health check, otherwise the response
data received from the POST gets appended to the end of the
last health check.
Thanks to Julien Thomas for the patch.
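The effect of the missing reset can be shown with a small sketch. It assumes a simplified reply buffer and is not the omelasticsearch implementation: struct reply_buf, reply_append() and reply_reset() are hypothetical names standing in for the buffer and the replyLen field described in the entry.

```c
#include <assert.h>
#include <string.h>

/* Hypothetical reply buffer, loosely modelled on the description above. */
struct reply_buf {
    char data[256];
    size_t len;          /* number of valid bytes in data */
};

/* Append a chunk the way an HTTP write callback would. Returns the number
 * of bytes actually stored (input is truncated if the buffer is full). */
size_t reply_append(struct reply_buf *rb, const char *chunk, size_t n)
{
    assert(rb != NULL && chunk != NULL);
    if (n > sizeof(rb->data) - 1 - rb->len)
        n = sizeof(rb->data) - 1 - rb->len;   /* keep room for the NUL */
    memcpy(rb->data + rb->len, chunk, n);
    rb->len += n;
    rb->data[rb->len] = '\0';
    return n;
}

/* Must be called between requests that share the buffer; without it the
 * next response is appended to the previous one. */
void reply_reset(struct reply_buf *rb)
{
    assert(rb != NULL);
    rb->len = 0;
    rb->data[0] = '\0';
}
```

Calling reply_reset() after the health check and before the POST leaves only the POST response in the buffer. Skipping it would leave the health check reply in front of the JSON that is parsed afterwards.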
- 2020-08-14: omfile: no longer limit dynafile cache size in legacy format
When using obsolete legacy config format, omfile had a hard limit of
1,000 dynafile cache entries. This does not play well with very
large installations. This limit is now removed and converted into
a warning if cache size > 25,000 is specified.
Note: the problem can easily be worked-around by using modern
config format (RainerScript).
- 2020-08-13: imudp: fix very small, static memory leak
When ruleset support was used, the ruleset name was not freed upon rsyslog
termination. While this has no consequences for regular runs, it generates
leak errors under memory debuggers and as such makes debugging harder than
necessary.
Thanks to github user frikilax for the patch.
- 2020-08-13: omelasticsearch: add parameter skipPipelineIfEmpty
When POST'ing a document, Elasticsearch does not allow an empty pipeline
parameter value. This patch introduces boolean option skipPipelineIfEmpty
to the omelasticsearch action. When set to true, the pipeline parameter
won't be posted. Default is false so we do not modify current behavior.
Thanks to Julien Thomas for the patch.
- 2020-08-12: systemd service file removed from project
This was done as distros nowadays have very different service files and it no
longer is useful to provide a "generic" (sic) example.
- 2020-08-11: gnutls TLS driver bugfix: EKU check not done properly
When the server accepted a new connection, it did not properly set the
dataTypeCheck field based on the listening socket. That resulted in
skipping ExtendedKeyUsage (EKU) check on the client.
Thanks to Daiki Ueno for the patch.
- 2020-08-06: MMDARWIN:: improve configuration flexibility and UUID fix
- now able to get fields from local variables ($.)
- now able to configure a custom root container for mmdarwin fields
- now able to put nested keys ($!key1!key2)
- don't regenerate a UUID each time, but instead check if one exists before
creating it (allow successive calls without losing previous UUID)
Thanks to github user frikilax for the contribution.
- 2020-08-06: add --enable-imjournal=optional ./configure option
- 2020-08-06: IMPCAP::Fixes: segfault, memory and build corrections
* fix bug in ethernet packets parsing
* fix removes build error with gcc10: 'multiple definition of...'
* resolve memory leak during interface init failure (device not freed after post-create error)
* add test 'impcap_bug_ether' to prove ethernet parser fix is working
Thanks to github user frikilax for the contribution.
- 2020-07-14: CI: add support for github actions
- 2020-07-14: imklog: add ruleset support
- 2020-07-06: config system fix: ChkDisabled method to make config.enabled work
There was a wrong negation in the method, so it returned 0/1 in reverse.
It also did not always mark the node as not to be reported as unknown,
which is needed after all.
Thanks to Jiri Vymazal for the patch.
Scheduled Release 8.2006.0 (aka 2020.06) 2020-06-23
- 2020-06-22: queue: permit ability to double size at shutdown
This prevents message loss due to "queue full" when re-enqueueing data
under quite exotic settings.
- 2020-06-22: Fixing imfile segfaulting on selinux denial
If imfile is denied access to a file watched through a symlink, there is
an unchecked condition resulting in access to uninitialized memory.
- 2020-06-22: openssl: Fixed memory leak when tls handshake failed.
- 2020-06-22: change systemd service file to wait for network
now that rsyslog is usually only installed for real syslog servers,
we should assume that some network listening or forwarding happens
on start. As such we need to start a bit later, after the network.
This poses no problem as systemd nowadays comes with journal which
is in almost all cases configured to buffer log data while
rsyslog is not yet running.
- 2020-06-22: NEW INPUT MODULE:: impcap, network packets input parser
Thanks to github user frikilax for the contribution.
- 2020-06-22: ksi bugfix: Optimized code in KSI module initialization fixed.
KSI module initialization will no longer get stuck in an infinite loop
when the code is built with optimization -O2.
- 2020-06-05: operatingstatefile bugfix: month was given too low
The month was printed with the range 0 (January) to 11 (December).
This has now been corrected.
- 2020-06-05: build system: add "optional" build functionality to some components
If used, builds a dummy module which just emits a "module not supported
on this platform" error message when loaded.
Primary use case for this system is Debian-ish builds on SUSE OBS,
where we prefer to have a single package definition for all versions
(else things get much more complicated).
- 2020-05-23: config system bugfix: backticks cat segfault if file cannot be opened
When a `cat <filename>` construct is used in rsyslog.conf and <filename> cannot
be accessed (does not exist, no permissions, ...), rsyslog segfaults.
Thanks to Michael Skeffington for notifying us and providing root cause analysis.
- 2020-05-15: imtcp bugfix: octet framing/stuffing problem with discardTruncatedMsg on
When "discardTruncatedMsg" was enabled in imtcp, messages were incorrectly
skipped if the last character before the truncation was the LF delimiter.
Also adds two testbench tests for this case.
- 2020-05-12: ompipe bugfix: race during HUP
When HUP was received, the write mutex was not acquired. This could
lead to unexpected invalidation of the output file descriptor.
Thanks to Julien Thomas for alerting us on this issue.
- 2020-05-12: ompipe: add action parameter tryResumeReopen
Sometimes we need to reopen a pipe after an ompipe action gets
suspended. Sending an HUP signal to rsyslog does the job but requires
an interaction with rsyslog. The patch adds support for a new boolean
option, tryResumeReopen, for the ompipe action. It mimics what an HUP
signal would do.
Thanks to Julien Thomas for the patch.
- 2020-05-12: imjournal: remove strcat call
Thanks to Jeff Marckel for the patch.
- 2020-05-12: build system: libczmq version requirement needs to be bumped
Thanks to Thomas Deutschmann for pointing this out.
- 2020-05-12: testbench: download ElasticSearch binaries from
The official ElasticSearch download site sometimes denies the download.
- 2020-05-11: openssl netstream driver bugfix: context leak
The context object was not properly freed.
Thanks to Michael Zimmermann for the fix.
- 2020-05-11: omhttp: Add support for multiple http headers
Allows the inclusion of multiple http headers on the REST call.
Thanks to callmegar for the patch.
- 2020-04-29: core bugfix: group id could not be obtained for very large groups
Thanks to github user emilbart for the patch.
- 2020-04-29: testbench additions (relp broken connection test)
- 2020-04-29: omudpspoof bugfix: issues with oversized messages
First issue was an incorrect packet length in the UDP header. It has to be the FULL UDP packet
length regardless of the MTU setting. As a result, regardless of IP fragmentation, the MTU setting
also limited the maximum size of the UDP message.
The second issue was an incorrect calculation of the UDP checksum with libnet if
IP fragmentation was used (based on the MTU setting). As a result, the network packets were
dropped by the TCP/IP stack before they could even reach their target. The workaround for this
problem is that we set the UDP checksum to 0x0000, which allows skipping of the checksum
test. Fixing the problem by calculating the correct UDP checksum would require some
code changes in libnet.
Also fixed the omudpspoof bigmsg test and increased the testing size to 16KB.
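The two points above (length field covers the full datagram, checksum 0x0000 means "no checksum" on IPv4) can be sketched in Python. This is an illustration only and not the omudpspoof or libnet code: the function names, the port number, and the fixed 20-byte IP header are assumptions made for the example.

```python
import struct

IP_HEADER_LEN = 20   # assumed: IPv4 header without options
UDP_HEADER_LEN = 8


def build_udp_datagram(src_port: int, dst_port: int, payload: bytes) -> bytes:
    # The length field covers the full datagram (header + payload) and is
    # independent of the MTU. A checksum of 0 means "no checksum" on IPv4.
    length = UDP_HEADER_LEN + len(payload)
    header = struct.pack("!HHHH", src_port, dst_port, length, 0)
    return header + payload


def fragment(datagram: bytes, mtu: int) -> list:
    # Every fragment except the last must carry a multiple of 8 bytes,
    # because the IP fragment offset is counted in 8-byte units.
    step = (mtu - IP_HEADER_LEN) // 8 * 8
    return [datagram[i:i + step] for i in range(0, len(datagram), step)]


datagram = build_udp_datagram(514, 514, b"x" * 16384)
fragments = fragment(datagram, mtu=1500)
length_field, checksum = struct.unpack("!HH", datagram[4:8])
```

Here `length_field` is 16392 (8 header bytes plus the 16 KB payload) no matter which MTU is passed to `fragment`. The MTU only decides into how many pieces the datagram is cut.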
- 2020-04-29: omprog: fix assert failed on HUP with output flag
If the 'output' setting of omprog was used and rsyslog received a HUP
signal just after starting (and before the omprog action received the
first log to process), an internal assertion could fail, causing
rsyslog to terminate. The failure message was "rsyslogd: omprog.c:660:
closeOutputFile: Assertion `pCtx->bIsRunning' failed."
The failure could also occur if rsyslog received a HUP signal during
the shutdown sequence.
This bug was introduced in v8.2004 by PR
Although a test already existed that checked the interaction of HUPs
with the 'output' setting, it didn't always fail in this particular case
due to timing conditions. The test has been improved to cover this case
more reliably.
Thanks to Joan Sala Isern for the patch.
Scheduled Release 8.2004.0 (aka 2020.04) 2020-04-28
- 2020-04-28: ksi bugfix: When KSI module is suddenly closed, files are finalized
In async. mode all pending signature requests are closed immediately and
unsigned block marker is attached with message about sudden closure.
Similar approach is used for blocks that already contain some records.
Empty blocks are just closed without any metadata.
Thanks to Taavi Väljaots for the patch.
- 2020-04-28: ksi bugfix: Signer thread initialization is verified before usage.
When the signer thread is created in rsksiInitModule, its successful
initialization is verified before the function returns. This will
prevent adding records to an uninitialized module, and in case of an
error, opened signature files will contain only magic bytes.
Thread flags replaced with thread state.
When init module fails, module is disabled.
Thanks to Taavi Väljaots for the patch.
- 2020-04-28: ksi bugfix: Hardcoded default hash algorithm replaced with 'default'
Instead of hardcoded SHA-256 KSI_getHashAlgorithmByName("default")
is used to get default hash function.
Function rsksiSetHashFunction and SetCnfParam updated.
Thanks to Taavi Väljaots for the patch.
- 2020-04-28: imfile bugfix: potential segfault in stream object on file read
- if cstrLen(pThis->prevMsgSegment) > maxMsgSize, then the len calculation
becomes negative if cstrLen(thisLine) < cstrLen(pThis->prevMsgSegment).
This causes an illegal memory access and thus a segfault.
- len is now set to 0 if cstrLen(pThis->prevMsgSegment) > maxMsgSize, so that
the correct memory location is accessed.
Thanks to github user jaankit
- 2020-04-28: openssl TLS drivers: made more reliable for older openssl versions
OpenSSL can retry some failed operations, but older versions need an explicit
opt-in to do so. This is now done.
- 2020-04-28: omprog: fix bad fd errors in daemon mode
When omprog was used with the 'forceSingleInstance=on' option, and/or
the 'output' setting, "bad file descriptor" errors occurred, which
prevented the external program from being executed and/or the program output
from being correctly captured. The bug could also manifest as "resource
temporarily unavailable" errors, or other errors related to the use of
invalid/reassigned file descriptors. These errors only happened when
rsyslog ran in daemon mode (i.e. they didn't happen if rsyslogd was
run with the '-n' option).
The cause of the bug was that omprog opened the pipe fds needed by
these flags during the configuration load phase (in the 'newActInst'
module entrypoint). This is a bad place since the fork of the daemon
occurs after this phase, and all fds are closed when the daemon process
is started (see 'initAll' in rsyslogd.c), hence invalidating the
previously opened fds.
To correct this, the single child process and the output capture thread
are now started later, when the first log message is received by the
first worker thread. (Note: the 'activateCnf' module entrypoint, despite
being invoked after the fork, cannot be used for this purpose, since it
is invoked per module, not per action instance.)
Currently no automated test exists for this use case since the testbench
always runs rsyslog in non-daemon mode.
Affected versions: v8.38 and later
Thanks to Joan Sala Isern for the patch.
- 2020-04-28: omfile bugfix: $outchannel split log lines at rotation time
- 2020-04-17: openssl: add support for libreSSL
Disable use of "@SECLEVEL" in default cipher string and
avoid SSL_CONF_CTX_set_flags() API when LIBRESSL is used.
This means tlscommands will not work.
- 2020-03-04: imudp bugfix: build problems on some Linux kernel versions
Thanks to Wen Yang for the patch.
- 2020-03-02: conf output bugfix: -o produces missing space between call and rulename
Thanks to Tetiana Ohnieva for the patch.
Scheduled Release 8.2002.0 (aka 2020.02) 2020-02-25
- 2020-02-25: imfile: add per minute rate limiting
Add MaxBytesPerMinute and MaxLinesPerMinute options.
These take integer values and, respectively, limit the number
of bytes or lines that may be sent in a minute.
This can be used to put a limit on the count or volume of logs
that may be sent for an imfile.
Thanks to Greg Farrell for the patch.
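A per-minute limit of this kind can be modelled as a fixed 60-second window with two counters. The Python sketch below is an assumption-laden illustration, not imfile's implementation: the class name, the treatment of 0 as "unlimited", and the choice to simply report over-limit lines to the caller are all made up for the example.

```python
class PerMinuteLimiter:
    """Allow at most max_lines lines and max_bytes bytes per 60-second window.

    A limit of 0 is treated as "unlimited" in this sketch (an assumption).
    """

    def __init__(self, max_lines: int = 0, max_bytes: int = 0):
        self.max_lines = max_lines
        self.max_bytes = max_bytes
        self.window_start = None
        self.lines = 0
        self.bytes = 0

    def allow(self, line: bytes, now: float) -> bool:
        # Start a new window once a minute has passed.
        if self.window_start is None or now - self.window_start >= 60:
            self.window_start = now
            self.lines = 0
            self.bytes = 0
        if self.max_lines and self.lines + 1 > self.max_lines:
            return False
        if self.max_bytes and self.bytes + len(line) > self.max_bytes:
            return False
        self.lines += 1
        self.bytes += len(line)
        return True


limiter = PerMinuteLimiter(max_lines=2)
results = [limiter.allow(b"msg", now=t) for t in (0, 1, 2, 61)]
```

With a limit of two lines per minute, the third line inside the first window is refused and the line at second 61 is accepted again because a new window has started.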
- 2020-02-24: core: add global parameter "security.abortOnIDResolutionFail"
This parameter controls whether or not rsyslog aborts when a name ID
lookup fails (for user and group names). This is necessary as a security
measure, as otherwise the wrong permissions can be assigned or privileges
are not dropped.
The default for this parameter is "on". In previous versions, the default
was "off" (by virtue of this parameter not existing). As such, existing
configurations may now error out.
We have decided to accept this change of behavior because of the potential
security implications.
- 2020-02-24: openssl TLS driver bugfix: chained certificates were not accepted
This has always been supported by the GnuTLS driver, but was missing from the openssl one.
- 2020-02-24: core bugfix: too early parsing of incoming messages
In theory, rsyslog should call parsers on the queue worker threads whenever
possible. This enables the parsers to be executed in parallel. There are
some cases where parsers need to be called earlier, namely when parsed
data is needed for rate-limiting.
The logic to do this previously did not work correctly and was fixed six
years ago (!) by b51dd22. Unfortunately, b51dd22 was overly aggressive:
it actually makes the early parser call now mandatory, effectively moving
parsing to the input side where there is little to no concurrency.
We still do not need to call the parser when all messages, regardless of
severity, need to be rate-limited. This is the default and very frequent
case. This patch introduces support for this and as such makes parsers
able to run in parallel in the frequent case again.
- 2020-02-20: testbench bugfix: two minor issues in test
lead to false positives during test runs (depending on circumstances)
- 2020-02-20: testbench: set max extra data length for tcpflood from 200 to 512KiB
Added an imrelp test for big messages (256KB).
- 2020-02-20: config system bugfix: 'config.enabled' directive oddities
Previously the directive was processed way too late which caused false
errors whenever it was set to 'off' and possibly other problems.
Thanks to Jiri Vymazal for the patch.
- 2020-02-09: imfile bugfix: timeout did not work on very busy system
The timeout feature was solely based on timeouts of the poll()
system call. On a very busy system, this would probably happen
very seldom. Moreover, the timeout could occur later than
expected on any system with high load.
The issue was not reported from practice but discovered during
CI system improvements.
- 2020-01-30: build system: change --enable-imfile-tests default to "yes"
This was accidentally set to "no" some time ago (actual commit unknown). Tests for
imfile should by default run when imfile is enabled.
- 2020-01-27: build system: add option --enable-gnutls-tests
This enables us to build GNUtls support but not necessarily
test it in CI. This is useful for some specialised subcomponent
test. The default is enabled if gnutls is enabled and disabled if not.
- 2020-01-26: testbench: new test for loadbalancing via global vars
This is a popular functionality which had not been routinely tested
in the past.
- 2020-01-26: mmdblookup bugfix: invalid data returned when no entry found
Since the upgrade of the package libmaxminddb on FreeBSD (1.3.2_2 -> 1.4.2),
the module mmdblookup returns the first entry of the mmdb database even if the entry
is not found. After some debugging, I found the solution in the official maxminddb
repository: to check if the entry is in database, we must check the found_entry
attribute, otherwise the function MMDB_get_entry_data_list will return the first
entry of the database if the entry is not found in it.
Thanks to Kevin Guillemot for the patch.
- 2020-01-23: oversize message log bugfix: do not close fd -1
The oversize message log fd is always closed on HUP, even if it never
was opened (and thus has -1 value). This patch corrects the issue.
The bug had no known bad effect in practice other than getting an
(ignored) error status from close(). However, it introduced warnings
in test runs (e.g. when running under valgrind).
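The guard amounts to closing a descriptor only when it was actually opened. A minimal Python sketch of that pattern follows; the class `OversizeLog` and its methods are hypothetical names for illustration and do not mirror the rsyslog source.

```python
import os


class OversizeLog:
    """Toy stand-in for a lazily opened log file descriptor."""

    def __init__(self):
        self.fd = -1  # -1 means "never opened"

    def open(self, path: str) -> None:
        self.fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_APPEND, 0o600)

    def on_hup(self) -> None:
        # Close only a descriptor that was actually opened.
        if self.fd != -1:
            os.close(self.fd)
            self.fd = -1


log = OversizeLog()
log.on_hup()  # no close() call is made, so no "bad file descriptor" error
```

Calling `os.close(-1)` directly would raise `OSError`, which corresponds to the ignored error status mentioned in the entry.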
- 2020-01-22: imfile bugfix: saving of old file_id for statefiles
Previously we saved old file_id unconditionally, which led to not
deleting old statefiles if files changed while rsyslog was not running.
Now it should work correctly.
Thanks to Jiri Vymazal for the patch.
- 2020-01-22: imfile bugfix: misaddressing and potential segfault
Commit 3f72e8c introduced an invalid memory allocation size. This led to
a too-short alloc and thus to overwrite of non-owned memory. That in turn
could lead to segfaults or other hard to find problems.
The issue was detected by our upgraded CI system. We did not receive
any problem reports in practice. Nevertheless, the problem is real and
people should update affected versions to patched ones.
The bug was present in scheduled stable release 8.1911.0 and 8.2001.0.
- 2020-01-20: core bugfix: potential race during HUP
when rsyslog is HUPed immediately after startup and before it is fully
initialized, there is a potential race with the list of loaded modules.
This patch ensures no bad things can happen in that case.
Detected by LLVM TSAN, not seen in practice.
- 2020-01-20: testbench improvements and fixes
modernize tests, improve robustness against slow machines, provide some
test framework functional enhancements, and optimize some tests.
Also includes some code changes to C testing components. Among others,
tests have slightly been sped up by reducing the wait time at queue
shutdown. This is possible because of better overall completion checks.
Scheduled Release 8.2001.0 (aka 2020.01) 2020-01-14
- 2020-01-12: core bugfix: race condition related to libfastjson when using DA queue
Rsyslogd aborts when writing to disk queue from multiple workers simultaneously.
It is assumed that libfastjson is not thread-safe.
Resolve libfastjson race condition when writing to disk queue.
Thanks to MIZUTA Takeshi for the fix.
- 2020-01-12: omfwd bugfix: parameter streamdriver.permitexpiredcerts did not work
- 2020-01-11: Bugfix: KSI module + dynafile in asynchronous mode fixed
Thanks to Taavi Väljaots for the patch.
- 2020-01-08: tls driver: add support to configure certificate verify depth
Support added in omfwd as instance parameter:
Support added in imtcp as module parameter:
Can be 2 or higher.
Support added into ossl driver
Support added into gtls driver
Added testcases for both drivers.
- 2020-01-08: modernization of testbench
moved some tests to newer standards, hardened them against slow testbench machines,
kafka component download improvements, and prevent dangling left-over test tool
instances from aborted tests
- 2020-01-07: tls subsystem bugfix: default for permitExpiredCerts was invalidly "on"
The problem occurred with commit 3d9b8df in December 2018 and went into
scheduled stable 8.1901.0. Unfortunately, the change in default was not detected
until a year later. This commit re-enables the previous default ("off"), which is
also the only sensible default from a security PoV. Unfortunately, new 2019
deployments may begin to see connection rejection when using expired certs. As
expired certs should not be used, this hopefully will not cause problems in
practice.
Thanks to Jiri Vymazal for the patch.
- 2020-01-01: testbench: improve ElasticSearch test speed
We now support re-using suitable running ES instances, which reduces the
number of restarts.
- 2019-12-31: omelasticsearch: improve curl reply buffer handling
The curl reply buffer (pWrkrData->reply) was allocated, realloced and freed with
each request. This has now been reduced to once per module, slightly increasing
overall performance.
- 2019-12-31: config system: emit proper error message on $ in double-quoted string
- 2019-12-30: core bugfix: rsyslog aborts when config parse error is detected
In default settings, rsyslog tries to continue to run, but some data
structures are not properly initialized due to the config parsing error.
This causes a segfault.
In the following tracker, this is the root cause of the abort:
see also
- 2019-12-30: fix some alignment issues
So far, this worked everywhere (for years). But it may still have
caused issues on some platforms.
- 2019-12-27: core bugfix: APP-NAME fields could become empty
RFC 5424 specifies that an empty APP-NAME needs to be indicated by
"-". Instead, the field could become empty under certain conditions.
If so, outgoing 5424 messages were invalidly formatted.
This happened under quite unusual conditions, but could be seen
in practice.
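The RFC 5424 rule referred to here is that header fields are never left empty; a missing value is written as the NILVALUE "-". The following Python sketch shows that substitution. The function name and the sample values are hypothetical, and STRUCTURED-DATA and MSG are left out for brevity.

```python
NILVALUE = "-"


def rfc5424_header(pri, timestamp, hostname, app_name, procid, msgid):
    # RFC 5424 header fields must not be empty; a missing value becomes "-".
    fields = [timestamp, hostname, app_name, procid, msgid]
    fields = [f if f else NILVALUE for f in fields]
    return "<{}>1 {}".format(pri, " ".join(fields))


header = rfc5424_header(13, "2019-12-27T10:00:00Z", "host1", "", "", "")
```

An empty APP-NAME left as is would produce two consecutive spaces in the header, which a strict receiver cannot parse into the expected fields.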
- 2019-12-27: core bugfix: reopen /dev/urandom file descriptor after fork on Linux
This patch updates prepareBackground() in tools/rsyslogd.c to reopen any file
descriptors used for random number generation in the child process. This fixes
an issue on Linux systems where the file descriptor obtained for /dev/urandom
by seedRandomNumber() in runtime/srutils.c was left closed after the fork. This
could be observed in procfs, where /proc/fd/ would show no open descriptors to
/dev/urandom in the forked process. /dev/urandom is reopened as the child may
be operating in a jail, and so should not continue to use file descriptors from
outside the jail (i.e. inherited from the parent process).
I found that this issue led to rsyslog intermittently hanging during seedIV()
in runtime/libgcry.c. After the fork, the closed file descriptor number tended
to get re-assigned. randomNumber() would then read from an incorrect (although
still valid) file descriptor, and could block (depending on the state of that
file descriptor). This gave rise to the intermittent hang that I observed.
Thanks to Simon Haggett for the patch.
- 2019-12-20: imdocker bugfix: did not compile without atomic operations
- 2019-12-20: omclickhouse: new parameter "timeout"
Thanks to Pavlo Bashynskiy for the patch.
- 2019-12-20: omhiredis: add 'set' mode plus some fixes
- new mode 'set' to send SET/SETEX commands
- new parameter 'expiration' to send SETEX instead of SET commands (only applicable to 'set' mode)
- fixes to missing frees
Thanks to github user frikilax for the patch.
- 2019-12-18: relp: Add support setting openssl configuration commands.
Add new configuration parameter tls.tlscfgcmd to omrelp and imrelp.
(Using relpSrvSetTlsConfigCmd and relpCltSetTlsConfigCmd)
OpenSSL Version 1.0.2 or higher is required for this feature.
A list of possible commands and their valid values can be found in the
The setting can be single or multiline, each configuration command is
separated by linefeed (\n). Command and value are separated by
equal sign (=). Here are a few samples:
Add two new testcases for librelp and tlscfgcmd.
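The setting format described in this entry (one command per line, command and value separated by an equal sign) can be sketched with a small Python parser. The function name is hypothetical, and the sample string is an assumption for illustration: `Protocol` and `MinProtocol` are OpenSSL configuration command names, but the values are not taken from this changelog.

```python
def parse_tls_cfg_cmds(setting: str) -> list:
    """Split a multi-line setting into (command, value) pairs."""
    pairs = []
    for line in setting.split("\n"):
        line = line.strip()
        if not line:
            continue
        # Split at the first "=" only, so values may contain "=" themselves.
        command, _, value = line.partition("=")
        pairs.append((command.strip(), value.strip()))
    return pairs


cmds = parse_tls_cfg_cmds("Protocol=ALL,-SSLv2,-SSLv3\nMinProtocol=TLSv1.2")
```

Each resulting pair corresponds to one configuration command that would be handed to the TLS library.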
- 2019-12-18: bugfix core: potential segfault in template engine
under some circumstances (not entirely clear right now), memory
was freed but later re-used as state-tracking structures were not
properly maintained. Github issue mentioned below has full details.
Thanks to github user snaix for analyzing this issue and providing
a patch. I am committing as myself as snaix did not disclose his or
her identity.
- 2019-12-18: fixed some minor issues detected by clang static analyzer 9
- 2019-12-10: core/config bugfix: false error msg when config.enabled="on" is used
When the 'config.enabled="on"' config parameter was used, an invalid error message
was emitted that this parameter is not supported. However, it was still
applied properly. This commit removes the invalid error message.
- 2019-12-03: omsnmp bugfix: "traptype" parameter invalidly rejected value 6
"Traptype" needs to support values 0 to 6.
However, if value 6 (ENTERPRISESPECIFIC) was set, an invalid error message
was emitted. Otherwise processing was correct.
This could lead to problems with automatic config deployment,
as valid configurations were invalidly reported as incorrect.
That in turn could make a deployment fail.
- 2019-12-03: omsnmp: add new parameter "snmpv1dynsource"
If set, the source field from SNMPv1 trap can be overwritten
with a template, default is "%fromhost-ip%". The content should be a
valid IPv4 Address that can be passed to inet_addr(). If the content
is not a valid IPv4 Address, the source will not be set.
- 2019-12-02: imfile bugfix: state file renaming sometimes did not work properly
Now checking if file-id changes and renaming - cleaning state file
accordingly and always checking and cleaning old inode-only style
state files.
Thanks to Jiri Vymazal for the patch.
- 2019-12-02: ratelimit: increase rate limit interval parameter max value
The burst parameter in the ratelimit was increased to an unsigned int
but the interval remained an unsigned short. While it may be unusual,
there is possibly a chance to need to represent an interval longer than
about 3/4 of a day.
While here, go through and normalize all the various incarnations of
rate limiting to be explicitly unsigned int for the burst and interval.
Thanks to github user frikilax for the patch.
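The "about 3/4 of a day" figure follows from the value range of the old type. A short worked calculation in Python, assuming the interval is counted in seconds, that unsigned short is 16 bits wide, and that unsigned int is 32 bits wide (typical, but platform dependent):

```python
USHRT_MAX = 2**16 - 1   # assuming a 16-bit unsigned short
UINT_MAX = 2**32 - 1    # assuming a 32-bit unsigned int

SECONDS_PER_DAY = 24 * 60 * 60

# Largest interval the old type could hold, in days and in hours.
old_max_days = USHRT_MAX / SECONDS_PER_DAY
old_max_hours = USHRT_MAX / 3600

# Largest interval the new type can hold, in years.
new_max_years = UINT_MAX / SECONDS_PER_DAY / 365
```

Under these assumptions the old maximum is 65535 seconds, roughly 18.2 hours or 0.76 days, while the new maximum is well over a century.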
- 2019-12-02: ommongodb: Add other supported formats for 'time' and 'date' fields
Thanks to github user frikilax for the patch.
- 2019-12-02: imjournal bugfix: too many messages in error case
Under certain error conditions, `ignorePreviousMessages="on"` could be ignored
and existing messages be processed.
Thanks to github user 3chas3 for the patch.
- 2019-11-27: core bugfix: action on retry mangles messages
When a failed action goes into retry, template content is rendered
invalid if the action uses more than 1 template.
Thanks to Mikko Kortelainen for the patch.
- 2019-11-27: testbench: improve mysql testing support
tests can now run in parallel and are hardened against several glitches
- 2019-11-22: omhttp: add basic support for Loki Rest
Loki is a new message indexer and querier from Grafana Labs.
This change provides the initial message structure to send bulk message
payloads to the Loki Rest endpoint. omhttp received a new bulk message
format called lokirest. Additionally, the plugin relies on the user to
provide the correct "stream" read message format.
A loki template must be json compatible and include a "stream" key of
key value tags, and a values key of an array of 2 element arrays, where
each 2 element array is the unix epoch in nanoseconds followed by an
unstructured message.
An example:
template(name="array_loki" type="string" string="{\"stream\":{\"host\":\"%HOSTNAME%\",\"facility\":\"%syslogfacility-text%\",\"priority\":\"%syslogpriority-text%\",\"syslogtag\":\"%syslogtag%\"},\"values\": [[ \"%timegenerated:::date-unixtimestamp%000000000\", \"%msg%\" ]]}")
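For comparison, the same structure can be built programmatically. The Python sketch below produces one entry with a "stream" object of labels and a "values" array of [timestamp, message] pairs, mirroring the template above. The function name, label values, and timestamp are hypothetical.

```python
import json


def loki_entry(labels: dict, epoch_seconds: int, message: str) -> dict:
    # The timestamp is the unix epoch in nanoseconds, encoded as a string.
    # Appending nine zeros mirrors what the template does.
    timestamp_ns = str(epoch_seconds) + "000000000"
    return {"stream": labels, "values": [[timestamp_ns, message]]}


entry = loki_entry({"host": "web1", "facility": "daemon"}, 1574380800, 'say "hi"')
payload = json.dumps(entry)
```

Unlike a plain string template, `json.dumps` escapes quotes inside the message, so the payload stays valid JSON even when the message contains them.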
- 2019-11-22: testbench: obtain python binary path via AM_PATH_PYTHON
- 2019-11-22: omprog: detect violation of interface protocol
The spec for the omprog interaction with the program it calls specifies
that the program receives one message via one line. In other words:
it must be a string terminated by LF.
However, omprog currently relies on a proper template to fulfill this
requirement. If the template does not provide for the LF, it is never
written. For the called program, this looks like it does not receive any
input at all. Even if it finally reads data (e.g. due to full buffer),
it will not properly be able to discern the messages.
This handling is improved with this commit.
We cannot just check the template, because at the end of the template
may be a non-constant value. As such, we do not know at config load
time if there is this problem or not.
So the correct approach is to, during runtime, check if each message
is properly terminated. For those that are not:
* we append a LF, because anything else makes matters worse
* log a warning message, at least for a sample of the messages
The warning is useful in the (expected most often) case that the template
is simply missing the LF. While appending works, it slows down processing.
As such the user should be given a chance to correct the config bug.
To avoid clutter, the warning is emitted at most once every 30 seconds.
This value is hardcoded as we do not envision a need to adjust it. Usually
users should quickly fix the template.
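The runtime check described in this entry (append a missing LF, warn at most once every 30 seconds) can be sketched in Python as follows. This is an illustration of the approach only: the class name, the warning text, and the way warnings are collected in a list are assumptions, not omprog's code.

```python
class LineTerminator:
    WARN_INTERVAL = 30  # seconds between warnings

    def __init__(self):
        self.last_warning = None
        self.warnings = []

    def prepare(self, msg: bytes, now: float) -> bytes:
        if msg.endswith(b"\n"):
            return msg
        # Rate-limited warning, so that a misconfigured template does not
        # flood the log.
        if self.last_warning is None or now - self.last_warning >= self.WARN_INTERVAL:
            self.warnings.append("message not terminated by LF, appending one")
            self.last_warning = now
        return msg + b"\n"


t = LineTerminator()
samples = ((b"a", 0), (b"b\n", 1), (b"c", 5), (b"d", 31))
out = [t.prepare(m, now) for m, now in samples]
```

All four messages end up LF-terminated, but only the unterminated messages at seconds 0 and 31 trigger a warning; the one at second 5 falls inside the 30-second window.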
- 2019-11-19: core queue: emit warning if parameters are set for direct queue
Direct queues do not apply queue parameters because they are not actually
physical queues. As such, any parameter set is ignored. This can
lead to unintentional results.
The new code detects this case and warns the user.
- 2019-11-19: imjournal bugfix: do not wait too long on recovery try
When trying to recover journal errors, imjournal waited a hardcoded
period of 10s between tries. This was pretty long and could lead to
loss of journal data.
This commit adjusts it to 100ms, which should still be fully sufficient
to prevent the journal from "hammering" the CPU.
It may be worth considering to make this setting configurable - but
let's first see if there is real demand to actually do that.
- 2019-11-19: mmutf8fix: enhance handling of incorrect UTF-8 sequences
1. Invalid utf8 detection didn't handle 3 and 4-byte overlong encodings (2
byte overlong encodings were handled explicitly by rejecting C0 and C1
start bytes). Unified checks for overlong encodings.
2. Surrogates U+D800..U+DFFF are not valid codepoints (Unicode Standard, D92)
3. Replacement of characters in invalid 3 or 4-bytes encodings was too
eager. It must not replace bytes which are valid UTF-8 sequences. For
example, in [0xE0 0xC2 0xA7] sequence the 0xC2 is invalid as a continuation
byte, but it starts a valid UTF8 symbol [0xC2 0xA7]. That is, with current
code processing the sequence will result in "???" but the correct result is "?§"
(provided that the replacement character is "?").
4. Various tests for UTF-8 invalid/valid sequences.
Thanks to Sergei Turchanov for the patch.
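The rules listed above can be tried out with Python's built-in UTF-8 decoder, which also rejects overlong encodings and surrogates and resumes decoding at the first byte that could start a valid sequence. This is a different implementation from mmutf8fix and serves only as an illustration. The error handler name "qmark" and the helper functions are made up for the example.

```python
import codecs


def _question_mark(err):
    # Replace only the bytes the decoder flagged, then resume right after them.
    return ("?" * (err.end - err.start), err.end)


codecs.register_error("qmark", _question_mark)


def fix_utf8(raw: bytes) -> str:
    return raw.decode("utf-8", errors="qmark")


def is_valid_utf8(raw: bytes) -> bool:
    try:
        raw.decode("utf-8")
        return True
    except UnicodeDecodeError:
        return False


fixed = fix_utf8(bytes([0xE0, 0xC2, 0xA7]))
```

For the sequence [0xE0 0xC2 0xA7], `fixed` is "?§": only the stray 0xE0 is replaced, and the valid pair [0xC2 0xA7] is kept.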
- 2019-11-14: imfile: add new input parameter escapeLF.replacement
The new parameter permits specifying the replacement that is used
when "escapeLF" is set to "on". Previously, a fixed replacement string
was used ("#012"/"\n") depending on circumstances. If the parameter is
set to an empty string, the LF is simply discarded.
Scheduled Release 8.1911.0 (aka 2019.11) 2019-11-12
- 2019-11-12: core queue: add config param "queue.takeFlowCtlFromMsg"
This is a fine-tuning option which permits to control whether or not
rsyslog shall always take the flow control setting from the message. If
so, non-primary queues may also block when reaching high water mark.
This permits to add some synchronous processing to rsyslog core engine.
However, it is dangerous, as improper use may make the core engine
stall. As such, enabling this option requires very careful planning
of the rsyslog configuration and deep understanding of the consequences.
Note that the option is applied to individual queues, so a configuration
with a large number of queues can (and must, if used) be fine-tuned to
the exact use case.
The rsyslog team strongly recommends leaving the option turned off,
which is the default setting.
- 2019-11-12: imrelp: add new config parameter "flowcontrol"
This permits fine-tuning of the flowControl parameter. Possible values are
"no", "light", and "full", with "light" being the default and previously
the only value.
Changing the flow control setting may be useful for some rare applications,
but be sure to know exactly what you are doing when changing this setting.
Most importantly, rsyslog as a whole may block and become unresponsive if you
change flowcontrol to "full". While this may be a desired effect when
intentionally trying to make it most unlikely that rsyslog needs to
lose/discard messages, usually this is not what you want.
- 2019-11-11: imrelp: remove unsafe debug instrumentation
dbgprintf, which is not signal safe, was called from a signal handler
to get better understanding during debugging. While this usually works,
it can occasionally (5%) lead to a hang during shutdown. We have now
removed that debug info as it is no longer vital.
Note: this could only happen during debug runs. Production mode was
not affected. As such, this fix is only relevant to developers.
However, it caused some confusion in the issue tracker.
- 2019-11-06: ossl driver bugfix: fix wrong OpenSSL Version check
Fix OpenSSL version check in:
- SetGnutlsPriorityString function in nsd_ossl.c
- initTLS() function in tcpflood.c
This bug led to some functionality not being enabled correctly.
Removed "MinProtocol=TLSv1.1" from two testcases because MinProtocol
is only supported by OpenSSL 1.1.0 or higher and was not really
necessary for the testcases.
- 2019-11-05: mmdarwin: Optimizations, new parameters, update to protocol header
- use permanent worker-dependent buffers to avoid malloc/free for each entry
- move socket structures to worker data, remove global mutex
- add log lines for parameters and general workflow
- don't send body if empty/incomplete (see new parameters)
- don't close/reopen socket every time -> keep session open or create a new
one every X entries (see new parameters)
- clean up code
- added 'send_partial', to let mmdarwin send body if not all fields were
retrieved, or not; default false = only send complete bodies
- added 'socket_max_use' to open new session every X packet, useful for
some versions of Darwin (prior to 1.1)
default is 0 = do not open new session/keep only one
- added 'evt_id' to the darwin header (Darwin v1+ compatibility)
Note: mmdarwin is a contributed module
Thanks to github user frikilax for the patch.
- 2019-11-01: mmkubernetes bugfix: improper use of realloc()
could cause problems under extreme memory shortage - very unlikely
credits to LGTM.COM for detecting this
- 2019-10-31: imjournal: set the journal data threshold to MaxMessageSize
When data is read from the journal using sd_journal_get_data it may be
truncated to a certain threshold (64K by default).
If the rsyslog MaxMessageSize is larger than the threshold, there is a
chance rsyslog will receive incomplete messages from the journal.
Empirically, this appears to happen reliably when XZ compression is
used by journald. Systems where journald uses LZ4 compression do not
appear to suffer this issue reliably--if at all.
This change sets the threshold to the MaxMessageSize when the
journal is opened.
Thanks to Robert Winslow Dalpe for the patch.
- 2019-10-30: improg bugfix: allow improg to handle multi-line inputs
miscellaneous bug fixes in improg:
* properly truncate string after an input event is submitted
* set msgoffset to 0.
* tests added to check above fixes
Thanks to Nelson Yen for the fix.
- 2019-10-30: mmdblookup bugfix: missing space in city name
This fixes the issue that spaces in city names are dropped. However, the
fix is more or less a work-around. As it turns out, the libmaxminddb API
is not correctly used. In the somewhat longer term, we should fix this.
- 2019-10-30: core/queue: provide ability to run diskqueue on multiple threads
Up until this release, disk queues could only use a single thread,
which limited their performance with outputs like ElasticSearch.
Now disk queues can utilize multiple threads just like any other
queue type. Most importantly, the disk queue part of a DA queue
now inherits the max number of threads from its memory queue.
NOTE: the new multi-threaded DA disk queue is actually a change of
behavior. We have not guarded it by a new config switch as we
assume the new behavior is most often exactly within user
expectations. In any case, we cannot see any harm from running
the disk queue on multiple threads.
- 2019-10-25: omfile bugfix: file handle leak
The stream class does not close re-opened file descriptors.
This led to leaking file handles and ultimately to the inability
to open any files/sockets/etc as rsyslog ran out of handles.
The bug was depending on timing. This involved different OS
thread scheduler timing as well as workload. The bug was more
common under the following conditions:
- async writing of files
- dynafiles
- not committing file data at end of transaction
However it could be triggered under other conditions as well.
The refactoring done in 8.1908 increased the likelihood of
experiencing this bug. But it was not a real regression, the new
code was valid, but changed the timing so that the race was more likely.
Thanks to Michael Biebl for reporting this bug and helping to
analyze it.
- 2019-10-22: imfile bugfix: improper use of calloc()
could cause problems under extreme memory shortage - very unlikely
credits to LGTM.COM for detecting this
- 2019-10-22: TLS driver bugfix: improper use of calloc()
can cause problems under extreme memory shortage - very unlikely
credits to LGTM.COM for detecting this
- 2019-10-22: imuxsock bugfix: improper use of calloc()
can cause problems under extreme memory shortage - very unlikely
credits to LGTM.COM for detecting this
- 2019-10-17: build system bugfix: incorrect default in ./configure help text
Thanks to Michael Biebl for pointing this out.
- 2019-10-17: mmkubernetes bugfix: improper use of calloc()
can cause problems under extreme memory shortage - very unlikely
credits to LGTM.COM for detecting this
- 2019-10-16: core queue bugfix: propagate batch size to DA queue
This was a long-standing bug where the DA queue always had a fixed small batch
size because the setting was not propagated from the memory queue. This also
removes a needless and counter-productive "debug aid" which seemed to be in
the code for quite some while. It did not cause harm because of the batch
size issue.
- 2019-10-16: testbench: fix unreliable gzipwrite test
The test was timing-sensitive as we did not properly check all data
was output to the output file - we just relied on sleep periods.
This has been changed. Also, we made some changes to the testing
framework to fully support sequence checking of multiple ZIP files.
- 2019-10-16: core queue bugfix: handle multi-queue-file delete correctly
Rsyslog may leave some dangling disk queue files under the following conditions:
- batch sizes and/or messages are large
- queue files are comparatively small
- a batch spans more than two queue files (from n to n+m with m>1)
In this case, queue files n+1 to (n+m-1) are not deleted. This can
lead to problems when the queue is re-opened again. In extreme cases
this can also lead to stalled processing when the max disk space is
used up by such left-over queue files.
Using defaults this scenario is very unlikely, but it can happen,
especially when large messages are being processed.
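The cleanup rule implied by this entry can be sketched in C as follows. This is only an illustration under the assumption that, once a batch spanning files n to n+m has been dequeued, every file before the last one is fully consumed; `files_to_delete` is a hypothetical helper, not a function from the rsyslog queue code.

```c
#include <stddef.h>

/* Hypothetical helper: collect the numbers of all queue files that are
 * fully consumed after a batch spanning files first..last was dequeued.
 * The last file is still in use, so only first..last-1 are reported.
 * Returns the number of entries written to out (at most max_out). */
static size_t files_to_delete(unsigned first, unsigned last,
                              unsigned *out, size_t max_out)
{
    size_t count = 0;
    for (unsigned n = first; n < last && count < max_out; ++n)
        out[count++] = n;
    return count;
}
```

For a batch spanning files 3 to 6 this reports 3, 4 and 5, including the intermediate files that previously were left behind.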
- 2019-10-16: imjournal: fix regression from yesterday's patch
commit 78976a9bc059 introduced a regression that caused writing
the journal state file to fail. This happens when the state file
is given as relative file name and the working directory is also
a relative path. This situation is very uncommon. So most deployments
will never experience it. We discovered the issue during CI runs
where the trigger condition is given. Note that it also takes
loading the journal multiple times to actually see the bug.
- 2019-10-15: imjournal plugin code restructuring, added remote option
Decomposed ReadJournal() a bit and now couple the journald
variables in one struct. Added a few warning messages and debug
prints to help with bug hunts in the future, and got rid of two
needless journald calls. WorkAroundJournalBug is now deprecated.
Added an option to pull journald records from outside the local machine.
Thanks to Jiri Vymazal for the patch.
- 2019-10-11: core bugfix: potential abort on very long action name
The action name is stored in modified form for the debug header and
some messages. If it is extremely long, a buffer can be overrun,
resulting in misaddressing and potential segfault for rsyslog. This
can also happen if the action is NOT named, but a custom path to
the output module is given and that path is very long. This triggers
the same issue because by default the module load path is included
in the action name.
This patch corrects the problem and truncates overly long names
when being used for name generation.
The problem was detected during testbench work. We never received
a bug report from practice.
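A minimal C sketch of bounded name generation, to illustrate the kind of truncation described above. The buffer size, the name format and the function `build_action_name` are assumptions made for this example and are not taken from the rsyslog sources.

```c
#include <stdio.h>

/* Hypothetical limit, chosen only for this sketch. */
#define NAME_BUF_SIZE 32

/* Build a name of the form "<modpath> action-<n>" into a fixed buffer.
 * snprintf() never writes more than bufsize bytes and always
 * NUL-terminates, so an overly long module path is cut off instead of
 * overrunning the buffer.
 * Returns 1 if the name was truncated (or formatting failed), else 0. */
static int build_action_name(char *buf, size_t bufsize,
                             const char *modpath, int action_nbr)
{
    int needed = snprintf(buf, bufsize, "%s action-%d", modpath, action_nbr);
    return needed < 0 || (size_t)needed >= bufsize;
}
```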
- 2019-10-10: testbench: add test for mmpstrucdata with RFC5424 escape sequences
Scheduled Release 8.1910.0 (aka 2019.10) 2019-10-01
- 2019-10-01: core bugfix: incorrect error message on duplicate module load
A NULL pointer was passed to printf instead of the module name.
On some platforms this may lead to a segfault. On most platforms
printf checks for NULL pointers and uses the string "(null)"
instead. In any case, the module name is missing from the error message.
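The portability issue can be illustrated with a small defensive idiom in C. This is a sketch only: the actual fix passes the real module name, and both `safe_str` and the message text below are hypothetical.

```c
#include <stdio.h>

/* Return a printable stand-in when the string is missing, so that
 * "%s" never receives a NULL pointer (which is undefined behavior). */
static const char *safe_str(const char *s)
{
    return s != NULL ? s : "(null)";
}

/* Format a (hypothetical) duplicate-load error message into buf. */
static int format_dup_load_error(char *buf, size_t bufsize, const char *modname)
{
    return snprintf(buf, bufsize, "module '%s' already loaded",
                    safe_str(modname));
}
```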
- 2019-10-01: imczmq nitfix: potential NULL ptr in printf on out-of-memory condition
Very unlikely to happen; if it does, it causes no real issue on most platforms.
- 2019-10-01: work around some compiler warning messages induced by pthreads API
- 2019-10-01: core ratelimiting: more verbose message when rate-limiting happens
When messages are rate-limited, the error message now also contains the
rate limiter setting. This enables the user to more quickly understand what
the problem is (especially if default values apply).
Thanks to Jiri Vymazal for the patch.
- 2019-10-01: openssl TLS driver: do not emit unnecessary error message
On older openssl versions, an API was missing to set user-defined parameters. If we
had such an older version, rsyslog emitted an error message even if the user did
not configure such parameters. This has been corrected, so that a message is only
emitted if there really is a problem. Based on user feedback the severity has also
been downgraded to "warning".
- 2019-10-01: pmcisconames (contributed module) bugfix: potential misaddressing
- 2019-09-30: pmaixforwardedfrom (contributed module) bugfix: potential misaddressing
- 2019-09-30: pmdb2diag (contributed module) bugfix: Out of bounds issue
Add a new sanity check after determining the level len.
Thanks to Philippe Duveau for the patch.
- 2019-09-02: ability to set stricter TLS operation modes
- checking of extendedKeyUsage certificate field
- stricter checking of certificate name/addresses
Thanks to Jiri Vymazal for the patch.
- 2019-08-21: testbench: add basic test for immark
- 2019-08-20: core: do not unnecessarily set hostname on each HUP
- 2019-08-20: build system: support cross-platform build for mysql/mariadb
rsyslog fails to cross build from source, because it uses mysql_config
and mysql_config is unfixably broken for cross compilation. It would be
better to use pkg-config. The attached patch makes rsyslog try
pkg-config first and fall back to mysql_config.
Thanks to Helmut Grohne for providing a base patch.
- 2019-08-20: core/tcpsrv: potential race on startup/shutdown
if the tcpsrv component is started and quickly terminated, it may hang
for a short period of time. Also a very small amount of memory is leaked
immediately before shutdown. While this leak is irrelevant in practice
(the OS cleans up the process anyway), it leads to CI failures. The hang,
however, can lead to longer than expected shutdown times for rsyslog.
The problem can be experienced via imtcp, imgssapi and imdiag (users
of affected core component).
Scheduled Release 8.1908.0 (aka 2019.08) 2019-08-20
- 2019-08-19: testbench: add test for $allowedSender functionality
- 2019-08-19: testbench: harden some tests against very slow CI machines
- 2019-08-16: testbench: make most tests use a port file and assign listen port 0
This makes the test much more robust against heavily loaded test systems.
- 2019-08-16: core/action: guard action.externalstate.file content against whitespace
remove trailing whitespace before checking the status string. This is
most important as a line usually ends with \n, which is considered
trailing whitespace. Accepting this increases usability.
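A minimal C sketch of stripping trailing whitespace before comparing a status line. The keyword "SUSPEND" is taken from the 8.1904.0 entry that introduced the external state file; the helper names are hypothetical and this is not the code from the action subsystem.

```c
#include <ctype.h>
#include <string.h>

/* Remove trailing whitespace (including the usual "\n") in place. */
static void rtrim(char *s)
{
    size_t len = strlen(s);
    while (len > 0 && isspace((unsigned char)s[len - 1]))
        s[--len] = '\0';
}

/* Return 1 if the trimmed line is exactly the (assumed) keyword. */
static int state_is_suspend(char *line)
{
    rtrim(line);
    return strcmp(line, "SUSPEND") == 0;
}
```

With this, a file written via `echo SUSPEND > statefile` is accepted even though it ends in a newline.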
- 2019-08-16: imtcp bugfix: multiple listenerPortFile parameter did not work
... because they were treated as module-global. If we had multiple imtcp
listeners with multiple port files, only the last filename was always used.
- 2019-08-16: testbench: improve testbench plumbing for gzip and fail cases
We have added new capabilities to the testbench plumbing to automatically
deal with gzip-compressed files. This also permits to use the wait_seq_check
function to work for gzip tests as well. The known-timing-sensitive
gzipwr_large test now makes use of the new capabilities. This enables us
to more reliably detect when we can safely shutdown the tested instance.
This commit also adds an ability to "abort" the full testbench run on
first test failure. This is especially useful during CI.
- 2019-08-13: testbench: add test for imuxsock legacy format
This was never tested. Ensures we don't accidentally break existing
- 2019-08-13: omelasticsearch bugfix: segfault on unknown retryRuleset
omelasticsearch does some "interesting tricks" for an output module.
This causes a segfault if the retryRuleset is not known.
The action module interface currently expects that all config errors
be detected during instance creation. Instead omelasticsearch defers
the retry ruleset check to a later stage. The reason is that it wants
to support use of the same ruleset name it is defined in - and this
is not yet available at action parsing.
We fix this by ensuring that any deleted instance is properly unlinked
from the instance list. One may argue the module interface should get
upgraded for such cases, but this is a longer-term approach.
- 2019-08-12: imptcp bugfix: port="0" parameter did not work as expected
When multiple interfaces and/or protocols could be bound, a different
listener port was assigned to each of them. While this is
basically correct, it makes things unusable, especially as
listenPortFileName will only contain the port number used for
the latest listener.
This patch now follows the model of nsd_ptcp.c to assign only
the first port randomly and then use that port consistently.
- 2019-08-10: omelasticsearch bugfix: potential resource leak with "rebindinterval"
If the "rebindInterval" parameter was used connections could be leaked. This
was especially the case with small intervals (such as "2"). This is fixed by
forcing libcurl to close the connection on rebind.
Thanks to Noriko Hosoi for providing the patch.
- 2019-08-10: imjournal bugfix: state file close with fsync() was incorrect
This led to fsync() not always being applied where expected.
Thanks to Jiri Vymazal for the patch.
- 2019-08-10: testbench: add addtl test for multithreading and HUP
- 2019-08-10: imptcp bugfix: received bytes counter improperly maintained
imptcp counts the number of bytes received. However, receives
happen on different worker threads. The access to the counter
was not synchronized, which can cause loss of updates. Also,
thread debuggers validly flag this as an error, which creates
problems under CI.
This commit fixes the situation via atomic operations and
falls back to mutex calls if they are not available.
Detected by LLVM thread sanitizer.
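The idea of an atomically maintained counter can be sketched with C11 atomics. This is an illustration only: the names are hypothetical, rsyslog uses its own atomic helper macros, and the mutex fallback mentioned above is not shown.

```c
#include <stdatomic.h>
#include <stdint.h>

/* Shared counter updated by several worker threads. */
static atomic_uint_fast64_t bytes_received;

/* Add nbytes without losing updates when called concurrently.
 * Relaxed ordering suffices because the value is a pure statistic. */
static void count_received(uint64_t nbytes)
{
    atomic_fetch_add_explicit(&bytes_received, nbytes, memory_order_relaxed);
}

/* Read the current total. */
static uint64_t get_received(void)
{
    return atomic_load_explicit(&bytes_received, memory_order_relaxed);
}
```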
- 2019-08-07: testbench: add basic tests for omusrmsg
- 2019-08-05: omhttp bugfix: enable checkpath configuration parameter
The omhttp 'checkpath' option was not configurable in the past.
- add 'checkpath' to the cnfparamdescr table.
- fix issue with checkpath passing extra garbage characters in string.
- add 'checkpath' to the unit test.
Thanks to Nelson Yen for the fix.
- 2019-08-05: testbench bugfix: some tests were executed when req module was missing
Specifically, if --enable-impstats was not given, some other tests failed.
- 2019-08-03: iminternal bugfix: race on termination
This could in theory lead to loss of shutdown messages, but was mostly a
cosmetic issue. We primarily fixed it to get TSAN-clean so that we can
utilize LLVM TSAN in CI.
- 2019-08-02: testbench: new test for omfile outchannel functionality
- 2019-08-02: core/janitor bugfix: properly maintain dynafile cache
When the janitor cleans out timed-out files, it does not
properly indicate the entry is gone. Especially when running
in async mode this can lead to use-after-free and thus
memory corruption or segfault.
- 2019-08-01: omfile bugfix: race file when async writing is enabled
This seems to be a long-standing bug, introduced around 7 years ago.
It became more visible by properly closing files during HUP, which
was done in 8.1905.0 (and was another bugfix). Note that due to this
race a memory corruption can occur under bad circumstances. As such,
this may have also caused segfaults or system hangs (mutexes could
have been affected).
- 2019-08-01: testbench: additional tests for HUP
- 2019-07-31: imrelp bugfix: hang after HUP
termination condition was not properly checked; this led to
premature termination after patch 1c8712415b9 was applied.
It is open to debate if patch 1c8712415b9 changed the module
interface. Actually it looks like this was previously not
well thought out.
- 2019-07-24: mmdarwin: add new module
This is a contributed module. For details see doc.
Thanks to the Advens team for contributing it.
- 2019-07-23 iminternal bugfix: suppress mutex double-unlock
If there is a burst of log messages during a time when rsyslog is unable
to output (either during log rotation, an out-of-space condition, or
some other similar condition), rsyslog can SEGFAULT due to a mutex
double-unlock.
- 2019-07-23 imtcp: enable listenPortFileName parameter
this parameter was added, but it had no effect as it was not
passed down to the driver layer. This has been fixed. That also
now enables us to use dynamically-assigned ports, which are
very useful for further testbench stabilization. Quite some
false positives occurred because the pre-selected port was
already in use again when rsyslog started.
- 2019-07-19 imtcp: enable listenPortFileName parameter
this parameter was added, but it had no effect as it was not
passed down to the driver layer. This has been fixed. That also
now enables us to use dynamically-assigned ports, which are
very useful for further testbench stabilization. Quite some
false positives occurred because the pre-selected port was
already in use again when rsyslog started.
- 2019-07-18 core/action: no error file written if act suspended on TX commit
When an action was already disabled at the time a commit was attempted
for it, no error file was written. Note that this state is highly
unlikely to happen. Most probably, it can only happen if parameter
action.externalstate.file is used.
Version 8.1907.0 (aka 2019.07) 2019-07-09
NOTE TO MAINTAINERS: libee has not been used by rsyslog for quite some while.
However, we never included this info in the changelog. So if you still
make rsyslog depend on libee (some do this), you should stop doing so now.
Libee is dead and is no longer maintained or hosted by us. Old versions
can still be found on GitHub for those in need.
GENERAL NOTE: during 8.1907 scheduled release timeframe we changed the ChangeLog
format to include the date a change went into master branch. This is to provide
an easy way to identify which changes went into the respective daily stable.
- 2019-07-05 imuxsock: support FreeBSD 12 out of the box
FreeBSD 12 uses RFC5424 on the system log socket by default. This
format is not supported by the special parser used in imuxsock.
Thus for FreeBSD the default needs to be changed to use the
regular parser chain by default. That is all this commit does.
- 2019-07-05 function bugfix: "ipv42num" misspelled as "ip42num" (without "v")
To fix the issue but keep compatible with existing deployments
both function names are now supported.
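Keeping both names working can be sketched as a lookup table in which two entries point to the same implementation. This C example is an assumption-laden illustration: the table, `lookup_func` and `ipv4_to_num` are hypothetical, the legacy spelling is assumed to be the name without "v", and the conversion shown (dotted quad to a 32-bit number, -1 on malformed input) is not rsyslog's actual code.

```c
#include <stdint.h>
#include <string.h>

/* Convert "a.b.c.d" to its numeric value; return -1 on malformed input. */
static int64_t ipv4_to_num(const char *s)
{
    uint32_t result = 0;
    for (int octet = 0; octet < 4; ++octet) {
        unsigned value = 0;
        int digits = 0;
        while (*s >= '0' && *s <= '9') {
            value = value * 10 + (unsigned)(*s - '0');
            if (value > 255 || ++digits > 3)
                return -1;
            ++s;
        }
        if (digits == 0)
            return -1;              /* empty octet, e.g. "1..2.3" */
        if (octet < 3) {
            if (*s != '.')
                return -1;          /* too few octets or stray character */
            ++s;
        }
        result = (result << 8) | value;
    }
    return *s == '\0' ? (int64_t)result : -1;
}

typedef int64_t (*str_func)(const char *);

struct func_entry {
    const char *name;
    str_func    fn;
};

/* Both the corrected and the legacy name map to one implementation. */
static const struct func_entry func_table[] = {
    { "ipv42num", ipv4_to_num },
    { "ip42num",  ipv4_to_num },   /* assumed legacy spelling */
};

/* Find a function by name; NULL if the name is unknown. */
static str_func lookup_func(const char *name)
{
    for (size_t i = 0; i < sizeof(func_table) / sizeof(func_table[0]); ++i)
        if (strcmp(func_table[i].name, name) == 0)
            return func_table[i].fn;
    return NULL;
}
```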
- 2019-07-04 fix leading double space in rsyslog startup messages
- omamqp1: port to latest api, add tests
This brings omamqp1 up-to-date with the latest qpid-proton-c
api version. This also adds a test for the plugin, to test
the basic functionality. The test requires the user to
install qdrouterd and the python qpid-proton library in order
to use the test program.
Thanks to Richard Megginson for the patch.
- omclickhouse bugfix: potential segfault on omclickhouse batchmode
segfault happened when the template did not contain the string
Thanks to github user wdjwxh for the fix.
- core bugfix: message duplication copied incorrect timestamp
MsgDup() placed timereported into timegenerated property, resulting
in invalid property values. Original timegenerated was lost. This
occurred always when a message needed to be duplicated. Most
importantly this is the case when queues are used.
- core bugfix: segfault on startup depending on queue file names
rsyslog will segfault on startup when a main queue file name has
been set and at least one other queue contains a file name. This
was caused by freeing config error-detection data structures too
early. It is a regression caused by commit e22fb205a3.
Thanks to Wade Simmons for reporting this issue and providing
detailed analysis. That greatly helped in fixing it quickly.
- core "bugfix": alignment issue
This was not a hard error on current platforms, but a
to-be-considered compiler warning regarding invalid alignment.
While it works well on current platforms, alignment issues may
turn into real issues in future platforms. So we try to fix them
if possible. As a side-effect, this also resolves compiler
warnings on current platforms.
This fix has some regression potential. If so, the problems
may occur during IP address resolution.
- omfile bugfix: potential hang/segfault on HUP of dynafile action
when omfile was HUPed it did not sufficiently clear all dynafile
cache maintenance data structures. This usually led to misaddressing
and could result in various issues, including a hang of rsyslog
processing or segfaults. It could also have "no effect" by pure
luck of not hitting anything important. This actually seems to
have been the most frequent case.
This seems to be a long-standing bug, but the likelihood of its
appearance seems to have been increased by commit 62fbef7
introduced in 8.1905. Note: the commit itself has no regression,
just increases the likelihood to trigger the pre-existing bug.
special thanks to Alexandre Guédon for his help in analyzing
the issue - without him, we would probably still not know
what actually went wrong.
- imjournal bugfix: potential message duplication
When the journal was preloaded from a previously saved cursor it was not advanced
to the next entry, so reading began from the last message, which was therefore
duplicated.
Thanks to Jiri Vymazal for the patch.
- rfc5424 parser bugfix: leading space sometimes lost
if structured data is present a leading space in MSG field is lost
- queue subsystem bugfix: oversize queue warning message shown as error
The warning message was emitted as an error message, which is misleading
and may also break some automated procedures.
- core bugfix: HUP did not work reliably on all platforms
most notably not on FreeBSD, maybe others. The reason was obviously
different handling of signals in respect to multiple threads.
- build system bugfix: missing files in distribution tarball
- testbench
* fixed "make distcheck" settings which were missing some modules
This led to an incomplete "make distcheck" run; some errors were not
detected due to that.
* testbench framework: use ip tool instead of outdated ifconfig
The framework now first checks if "ip" is available and falls back
to "ifconfig" only if this is not the case.
Thanks to Michael Biebl for the suggestion.
Version 8.1905.0 (aka 2019.05) 2019-05-28
- templates: add datatype template option for JSON generation
The new "datatype" and "onEmpty" template options permit to
generate non-string data rather easily. It works together with
jsonf formatting, which is what people should use nowadays.
- config processing: check disk queue file is unique
If the same name is specified for multiple queues, the queue files
will become corrupted. This commit adds a check during config parsing.
If duplicate names are detected the config parser errors out and the
related object is not created.
Note: this may look like a change of behavior to some users. However,
this never worked and it was pure luck that these users did not run
into big problems (e.g. DA queues were never going to disk at the
same time). So it is acceptable to error out in this hard error case.
- global config: new parameters for ruleset queue defaults
* default.ruleset.queue.timeoutshutdown
* default.ruleset.queue.timeoutactioncompletion
* default.ruleset.queue.timeoutenqueue
* default.ruleset.queue.timeoutworkerthreadshutdown
- add capability to write full config file (-o cmdline option)
Introduces the capability to create an output config file that explodes
all "includes" into a single file. This provides a much better overview
of how exactly the configuration is crafted. That could often be a great
troubleshooting aid.
This commit also contains some slight not-really-related cleanup.
- queue subsystem: permit to disable "light delay mark"
New semantic: if lightDelayMark is 0, it is set to the max queue
size, effectively disabling the "light delay" functionality.
Thanks to Yury Bushmelev for mentioning issues related to light
delay mark and proposing the solution (which actually is what
this commit does).
- queue subsystem: provide better user status messages
The queue subsystem now provides additional information messages which
may help a regular user to maintain system health. Most importantly,
DA queues now output when they persist queue data at end of run and
when they restart the queue based on persisted data.
- core: emit a warning message for ultra-large queue size definitions
We see error reports from users who have configured excessively large queues
and receive an OOM condition or other problems.
With that patch we generate a warning message if a queue is configured very
large. "Very large" is defined to be in excess of 500000 messages.
- new global config parameter "internalmsg.severity"
permits to specify a severity filter for internal messages. Only
messages with this severity level or more severe are logged.
Originally this was done in rsyslog.conf as usual: you can filter
rsyslog messages on severity, just like any other. But with systemd,
we now emit primarily to the journal, and this is outside of rsyslog's
rule engine and so regular filters do not apply (at least in regard
to the journal). Logging to journal is good, because finally
folks begin to see the messages (traditional distro configs discard
them, for whatever is the reason).
This commit implements a global setting for a severity-based filter
for internal messages, before they are submitted to the journal. So it's not 100%
of what rsyslog can do, but at least some way to customize.
- config processing bugfix: error messages if config.enabled="off" is used
Using config.enabled="off" could lead to error messages on
"parameter xxx not known", which were invalid. They occurred
because the config handler expected them to be used, which
was not the case due to being disabled.
This commit fixes that issue.
- core portability bugfix: harden shutdown processing on FreeBSD
On FreeBSD, rsyslog does not always terminate immediately on SIGTERM.
Root cause seems to be that SIGTERM is delivered differently under
FreeBSD. This causes the main thread not to be awakened, and so it
takes until the next janitor interval to come back to life - which
can be far too long. Fixed this bug by explicitly awaking the main thread.
- imtcp bugfix: oversize message truncation causes log to be garbled
The actual problem is in the tcpserver component. However, the prime user
is imtcp and so users will likely experience this as imtcp problem.
When a too-long message is truncated, the byte after the truncation
position becomes the first byte of the next message. This will garble
the next messages and in almost all cases render them syslog-noncompliant.
The same problem does NOT occur when the message is split.
This commit fixes the issue. It also includes a testbench fix.
Unfortunately the test for exactly this feature was not properly
crafted and so could not detect the problem.
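The intended behavior can be sketched in C: once a message is truncated, the remaining bytes up to the frame delimiter are skipped rather than becoming the start of the next message. This is a simplified illustration that assumes LF-delimited framing and a tiny hypothetical size limit; `struct framer`, `feed` and `submit` are made-up names and not part of the tcpserver component.

```c
#include <stddef.h>
#include <string.h>

#define MAX_MSG 8   /* hypothetical tiny limit to keep the example readable */
#define MAX_OUT 8   /* capacity of the demo output list */

struct framer {
    char   cur[MAX_MSG + 1];          /* message being assembled */
    size_t len;                       /* bytes stored in cur */
    int    discarding;                /* 1 while skipping an oversize tail */
    char   out[MAX_OUT][MAX_MSG + 1]; /* submitted messages */
    size_t nout;
};

/* Hand the current message over to the output list and reset the buffer. */
static void submit(struct framer *f)
{
    if (f->nout < MAX_OUT) {
        memcpy(f->out[f->nout], f->cur, f->len);
        f->out[f->nout][f->len] = '\0';
        f->nout++;
    }
    f->len = 0;
}

/* Process n received bytes; messages are delimited by LF. */
static void feed(struct framer *f, const char *data, size_t n)
{
    for (size_t i = 0; i < n; ++i) {
        char c = data[i];
        if (c == '\n') {
            if (!f->discarding)
                submit(f);            /* regular end of message */
            f->discarding = 0;        /* next byte starts a new message */
        } else if (f->discarding) {
            continue;                 /* drop the tail of an oversize message */
        } else if (f->len == MAX_MSG) {
            submit(f);                /* emit the truncated message ... */
            f->discarding = 1;        /* ... and skip until the delimiter */
        } else {
            f->cur[f->len++] = c;
        }
    }
}
```

Feeding `"0123456789ABC\nshort\n"` yields the two messages `01234567` and `short`; the bytes `89ABC` do not leak into the second message.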
- omfile bugfix: FlushOnTXEnd does not work reliably with dynafiles
The flush was only done to the last dynafile in use at end of
transactions. Dynafiles that were also modified during the
transaction were not flushed.
Special thanks to Duy Nguyen for pointing us to the bug and
suggesting a solution.
This commit also contains a bit of cosmetic cleanup inside
the file stream class.
- lmcry_gcry build bugfix: was not always properly built
Due to an invalid definition in build system this seems to have not
been correctly built on at least some platforms (but it worked on
others as it passed CI testing). This has now been corrected.
Thanks to Remi Locherer for the patch.
- dnscache bugfix: very unlikely memory leak
This fixes a memory leak that can only occur under OOM conditions.
Detected by Coverity Scan, CID 203717
- testbench bugfix: wrong parameter check in (tcpflood())
When the first parameter is check_only, the tcpflood function shall not
abort the test itself (The fail is intended if this option is set).
closes issue #3625
- testbench bugfix: imfile-symlink test failed w/ parallel test run
The test sometimes failed. It used a symlink to a hardcoded name
rsyslog-link.*.log. This symlink was created but then disappears.
The reason is that upon (every!) test exit, rsyslog-link.*.log is
deleted. So a parallel test running the exit procedure just at the
"right" time can removed that file.
The bug is that the file name should be created using the tests's
dynamic name. This is done now.
Version 8.1904.0 (aka 2019.04) 2019-04-16
- omfile: provide more helpful error message on file write errors
now contains actual file name plus a link to probable causes for this type
of problem
- imfile: emit error on startup if no working directory is set
When the work directory has not been set or is invalid, state files
are created in the root of the file system. This is neither expected
nor desirable. We now complain loudly about this fact. For backwards
compatibility reasons, we still need to support running imfile in
this case.
- dnscache: add global parameter dnscache.default.ttl
This permits to control default TTL for cache entries. If set
to 0, the DNS cache is effectively disabled.
- omelasticsearch: new parameter rebindinterval
Thanks to Richard Megginson for the patch.
- omelasticsearch: new parameter skipverifyhost
Add ability to specify the libcurl CURLOPT_SSL_VERIFYHOST
option to skip verification of the hostname in the peer cert.
WARNING: This option is insecure, and should only be used
for testing. The default value is off, meaning, the hostname
will be verified by default.
Thanks to Richard Megginson for the patch.
- omelasticsearch: set rawmsg to data from original request
Previously, when constructing the message to submit for a retry
for an original request, if the original request did not contain
the field `message`, the system property `rawmsg` was set to
the entire metadata + data from the original request. This was
causing problems with Elasticsearch. This patch changes
the code so that the `rawmsg` will be set to only the data part
of the original request if there is no `message` field.
Thanks to Richard Megginson for the patch.
- mmkubernetes - support for metadata cache expiration
New parameters for mmkubernetes (module and action):
* `cacheexpireinterval`
If `cacheexpireinterval` is -1, then do not check for cache expiration.
If `cacheexpireinterval` is 0, then check for cache expiration.
If `cacheexpireinterval` is greater than 0, check for cache expiration
if the last time we checked was more than this many seconds ago.
* `cacheentryttl` - maximum age in seconds for cache entries
New statistics counters:
* `podcachenumentries` - the number of entries in the pod metadata cache.
* `namespacecachenumentries` - the number of entries in the namespace
metadata cache.
* `podcachehits` - the number of times a requested entry was found in the
pod metadata cache.
* `namespacecachehits` - the number of times a requested entry was found
in the namespace metadata cache.
* `podcachemisses` - the number of times a requested entry was not found
in the pod metadata cache, and had to be requested from Kubernetes.
* `namespacecachemisses` - the number of times a requested entry was not
found in the namespace metadata cache, and had to be requested from
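The `cacheexpireinterval` rules listed above can be read as a small decision function. The following is only a sketch of that reading, not code from mmkubernetes: the function name and signature are hypothetical, and treating every negative value like -1 is an assumption.

```c
#include <stdbool.h>
#include <time.h>

/* Hypothetical helper, NOT taken from the mmkubernetes source.
 * Decides whether a cache expiration check should run now. */
static bool should_check_cache_expiration(long cacheexpireinterval,
                                          time_t last_check, time_t now)
{
    if (cacheexpireinterval < 0)      /* -1: never check (assumption: any negative value) */
        return false;
    if (cacheexpireinterval == 0)     /* 0: check every time */
        return true;
    /* > 0: check only if the last check is more than that many seconds ago */
    return (now - last_check) > cacheexpireinterval;
}
```

With an interval of 60, a check that last ran at second 1000 is due again at second 1061, but not yet at second 1060.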
- imdocker: new contributed module
imdocker will get (docker) container logs from a host as well as filling
out some basic container metadata such as id, name, image, labels.
Thanks to Nelson Yen for the contribution.
- mmtaghostname: new contributed module
This module allows one to force hostname after parsing to the localhostname of
rsyslog and/or add a tag to messages received from input modules without
tag parameter.
Thanks to Philippe Duveau for the contribution.
- imbatchreport: new contributed input module
This input module manages batch reports: a complete file is handled as a single log.
Thanks to Philippe Duveau for the contribution.
- imtuxedoulog: new contributed input module for Tuxedo ULOG
Thanks to Philippe Duveau for the contribution.
- openssl network driver: added support for setting openssl configuration commands
We are using the gnutlsPriorityString setting variable, to pass
configuration commands to openssl.
- omkafka: drop messages rejected due to being too large
Drop messages that were rejected due to
Thanks to Nelson Yen for the patch
- core/action: implement capability to resume/suspend via external file
It has been reported that some TCP receivers exist that accept syslog tcp
messages at any rate, even if they do not manage to actually process them.
Instead, they silently drop the message. This behavior is not configurable.
All in all, it can lead to considerable message loss.
To support such use cases, we need to provide an ability to externally
trigger action suspension and resumption.
We do this via a configured file which contains the status of the action.
Rsyslog periodically reads the file and if it contains "SUSPEND", it
suspends the action (and likewise for resume).
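As an illustration of the state-file idea described above, here is a minimal sketch of how a periodic check might look. It is not the rsyslog implementation: the helper name is hypothetical, only the "SUSPEND" keyword comes from the text, and treating a missing or unreadable file as "not suspended" is an assumption.

```c
#include <stdbool.h>
#include <stdio.h>
#include <string.h>

/* Hypothetical helper, NOT rsyslog code: returns true if the first line of
 * the state file contains "SUSPEND". A missing or unreadable file is treated
 * as "not suspended" (assumption). */
static bool state_file_requests_suspend(const char *path)
{
    char buf[64];
    bool suspend = false;
    FILE *fp = fopen(path, "r");

    if (fp == NULL)
        return false;
    if (fgets(buf, sizeof(buf), fp) != NULL)
        suspend = (strstr(buf, "SUSPEND") != NULL);
    fclose(fp);
    return suspend;
}
```

A caller would invoke such a check on a timer and suspend or resume the action according to the result.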
- improg bugfix: some memory leaks
Thanks to Philippe Duveau for the contribution.
- msg object bugfix: regression from 1255a67
- pmnormalize: fix memory leaks, improve tests
This patch fixes a set of problems plus provides more and enhanced
tests for the module.
Most important problem was a memory leak that occurred when a message
could not be parsed at all. For each message that could not be parsed
memory of at least the size of the message is leaked. Depending on
traffic pattern this can quickly lead to OOM. Note, however, that
this leak was never reported - it was discovered as part of code
- omkafka bugfix: build failure due to inconsistent type
fails depending on platform and settings; was somehow undetected by CI
- imjournal bugfix: potential segfault on some API failure returns
In one case there was possibility of free()'d value of journal
cursor not being reset, causing double-free and crash later on.
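The general pattern behind this kind of fix is to reset a pointer right after freeing it, so that a later cleanup path cannot free it a second time. The sketch below shows the idiom only; the helper name is hypothetical and it is not the imjournal code.

```c
#include <stdlib.h>

/* Hypothetical helper illustrating the idiom: free the buffer and reset the
 * caller's pointer so that a second call is a harmless free(NULL). */
static void free_and_reset(char **ptr)
{
    free(*ptr);
    *ptr = NULL;
}
```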
- openssl subsystem bugfix: better error handling
Handling of SSL_ERROR_SYSCALL has been hardened.
Handling for SSL_Shutdown errors has been corrected.
Also fixed SSL Shutdown handling in tcpflood (openssl code).
If SSL_Shutdown returns error, we call SSL_read as described in
the documentation to do a bidirectional shutdown.
- imjournal bugfix: Fetching journal cursor only for valid journal
The sd_journal_get_cursor() got called regardless of previous
retcodes from other journal calls which flooded logs with journald
errors. Now skipping the call in case of previous journal call
non-zero result. Fixed success checking of get_cursor() call
to eliminate double-free possibility.
Also, making WorkAroundJournalBug true by default, as there were no
confirmed performance regressions for a quite long time.
Thanks to Jiri Vymazal for the patch.
- omamqp: fix build errors
They occur on some, newer, platforms. We do not really fix them, but rather
make the compiler ignore them. This is not really good, but the module is
contributed and so that's for now the best thing we can do.
- testbench: change to use a larger connection count again
not sure why it was reduced, maybe related to
also, modernize this and another test
- tcpflood bugfix: make soft connection limit work again
It looks like the soft limit became defunct when tcpflood was enhanced to
request more open file handles from OS.
- testbench bugfix: omhttp tests were not run during "make distcheck"
- build system bugfix: omhttp test files were not included in dist tarball
Thanks to Thomas D. (whissi) for the patch.
Version 8.1903.0 (aka 2019.03) 2019-03-05
- omrabbitmq: add features (RabbitMQ HA management, templatize routing_key,
populate amqp message headers, delivery_mode and expiration parameters)
- improg: create input module to use external program as input data
- imtuxedoulog: create input module to consume Tuxedo ULOG files
- omhttp: rewritten with large feature enhancements
Many thanks to Gabriel Intrator for this work. Gabriel also has adopted the
module and plans to support it in the future.
- pmdb2diag: create parser module for DB2 diag logs
- TLS subsystem: add support for certless communication
both openssl and GnuTLS drivers have been updated to support certless
communications. In this case e.g. Diffie-Hellman is used.
NOTE: this is an insecure mode, as it does NOT guard against
man-in-the-middle attacks. We implemented it because of the large demand,
not because we think it makes sense to use this mode. We strongly recommend
against it.
- imrelp/omrelp: add capability to specify tlslib for librelp
- build system: introduce a better way to handle compiler pragmas
we now use macros and _Pragma(). This requires less code lines and is more
- omkafka: add support for dynamic keys
A new configuration property "dynaKey" is added that, when "on", changes the
value of property "key" to a template name instead of a constant value.
This is similar in approach to the DynaTopic implementation.
Thanks to Ludo Brands for the patch.
- AIX port: add AIX linking extensions on many plugins and contributions to
allow building them on this os.
- template: add Time-Related System Property $wday which is the day of week
This allows one to get a week based rotation of log as AIX does.
- ksi subsystem: add high availability mode
Note: ksi subsystem now REQUIRES libksi 3.19.0 or above
Thanks to Allan Park for the patch.
- imfile bugfix: file reader could get stuck
State file handling was invalid. When a file was moved and re-created
rsyslog could use the file_id of the new file to write the old file's
state file. This could make the file reader stuck until it reached the
previous offset. Depending on file sizes this could never happen AND
would cause large message loss. This situation was timing dependent
(a race) and most frequently occurred under log rotation. In polling
mode the bug was less likely, but could also occur.
- imfile bugfix: potential segfault when working with directories or symlinks
see also
Thanks to Nelson Yen for the patch
- omhttp bugfix: header items could not have spaces in them
Thanks to Nathan Brown for the patch.
- core bugfix: enlarged msg offset types for bigger structured messages
using a large enough (dozens of kBs) structured message
it is possible to overflow the signed short type which leads
to rsyslog crash. (applies to msg.c, the message object)
Thanks to Jiri Vymazal for the patch.
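To illustrate why a signed short is too small for offsets into messages of several dozen kB, the following sketch checks whether an offset is still representable. The helper name is hypothetical; it is not part of msg.c.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical helper: true if the offset can be stored in a signed short
 * without overflow. SHRT_MAX is 32767 on common platforms, so an offset into
 * a structured message of a few dozen kB may no longer fit. */
static bool offset_fits_in_short(size_t offset)
{
    return offset <= (size_t)SHRT_MAX;
}
```

On a platform with 16-bit shorts, an offset of 1024 fits, while an offset of 48 KiB does not, which is why a wider type is needed.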
- core bugfix for AIX: timeval2syslogTime now handles the bias according to
local time zone as documented by IBM.
- imfile feature: add configuration parameter to force parsing of read logs
- imczmq bugfix:
Release zframe following read from socket
Make the 0MQ frame pointer local to the receive loop and destroy the
frame as soon as the contents have been copied. This avoids:
* a memory leak should the receive loop execute more than once
* referencing an un-initialized value during cleanup (finalize_it)
Thanks to Mark Gillott for the patch.
- omclickhouse bugfix: default template unusable
STDSQL option added to the default template used in output module of clickhouse
Thanks to gagandeep trivedi for the patch.
- omclickhouse "bugfix": work-around failed error detection
omclickhouse uses a questionable method to check if a request generated
an error. We have seen the method fail when we slightly upgraded clickhouse
server in CI testing.
This commit makes the method a bit more reliable without really fixing it.
But it's at least a short-term solution.
This should be changed to a proper status check. I assume such is possible.
see also
- imptcp bugfix: overly long socket bind path can lead to segfault
if the `path` input parameter is overly long (e.g. more than 108
characters on some platforms) a non-terminated string is generated
and then passed to OS API. This can lead to all sorts of problems
including segfault.
We detected that based on gcc-8 warnings during code inspection.
No real-world problem case is known.
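A defensive way to avoid the non-terminated string described above is to reject paths that do not fit into `sun_path` including the terminating NUL. This is a sketch of that check with a hypothetical helper name, not the actual imptcp fix.

```c
#include <string.h>
#include <sys/socket.h>
#include <sys/un.h>

/* Hypothetical helper, NOT the imptcp code: fills a unix socket address.
 * Returns 0 on success, -1 if the path (plus terminating NUL) does not fit. */
static int fill_unix_addr(struct sockaddr_un *addr, const char *path)
{
    size_t len = strlen(path);

    if (len >= sizeof(addr->sun_path))
        return -1;                          /* would not be NUL-terminated */
    memset(addr, 0, sizeof(*addr));
    addr->sun_family = AF_UNIX;
    memcpy(addr->sun_path, path, len + 1);  /* copy including the NUL */
    return 0;
}
```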
- ommongodb bugfix: improper stpncpy() calls
- testbench tcpflood: add new transport option relp-tls
Tcpflood can now send messages via relp with tls support.
- testbench: mmdb valgrind tests failed if srcdir env was not set
- testbench: add omclickhouse tests
- testbench bugfix: some long-running tests had too low runtime allowance
- testbench bugfix: daqueue-dirty-shutdown test
This test occasionally failed with left-over spool files. As far as we
have analyzed, this is due to the use of an invalid shutdown timeout
(very short) in the second phase of the test. It looks like this is
actually a copy&paste error from phase one. Behavior of rsyslog was
correct, but the test itself created a false positive.
We have corrected the timeout now and also modernized the test
a bit.
- testbench bugfix: some omhttp tests had compatibility issues with Python 3
Thanks to Thomas D. (whissi) for the patch.
Version 8.1901.0 (aka 2019.01) 2019-01-22
- new version scheme: 8.yymm.0 - version now depends on release date
see also
- queue: add support for minimum batch sizes
- change queue.timeoutshutdown default to 10 for action queues
The previous default of 0 gave action queues no real chance to
shutdown - at the time they were applied, they were usually already
expired (computing the absolute timeout took a small amount of time).
So we change this now to 10ms, which still is very quick but gives
the queue at least a chance to shutdown itself. That in turn
smooths the whole shutdown process.
If a very large number of action queues is used this may lead
to a very slightly longer shutdown time, albeit this is very
- omclickhouse: new output module for clickhouse
This output module adds the possibility to send
INSERT queries to a Clickhouse database. See doc for details.
The messages are sent via a REST interface.
This commit also adds support of the testbench
for clickhouse tests, as well as various tests.
- omkafka: Add ability to dump librdkafka statistics to a file
Use statsFile to specify statistics output file; also requires
setting confparam to a non-zero value.
Thanks to github user pcullen65 for the contribution.
- tls(ossl/gtls): add new Option "StreamDriver.PermitExpiredCerts"
The new Option can have one of the following values:
on = Expired certificates are allowed
off = Expired certificates are not allowed
warn = Expired certificates are allowed but warning will be logged (Default)
Includes necessary tests to validate new code.
- action: add "action.resumeIntervalMax" parameter
This parameter permits setting an upper limit on the growth of the
retry interval. This is most useful when a target has extended
outage, in which case retries can happen very infrequently.
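The effect of such an upper limit can be sketched as a capped retry interval. The growth formula used here, (retries / 10 + 1) * base interval, is an assumption based on how the rsyslog documentation describes `action.resumeInterval`; the function name is hypothetical and this is not the rsyslog implementation.

```c
/* Hypothetical sketch: compute the retry interval in seconds and cap it.
 * ASSUMPTION: the interval grows as (num_retries / 10 + 1) * base.
 * A max_interval of 0 or less is treated as "no cap" in this sketch. */
static long effective_resume_interval(long base, long num_retries,
                                      long max_interval)
{
    long interval = (num_retries / 10 + 1) * base;

    if (max_interval > 0 && interval > max_interval)
        interval = max_interval;
    return interval;
}
```

Under these assumptions, a base of 30 seconds and a cap of 1800 seconds give 60 seconds after 10 retries and never more than 30 minutes, however long the outage lasts.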
- report child process exit status according to config parameter
Add new global setting 'reportChildProcessExits' with possible values
'none|errors|all' (default 'errors'), and new global function
'glblReportChildProcessExit' to report the exit status of a child
process according to the setting.
Invoke the report function whenever rsyslog reaps a child, namely in:
- rsyslogd.c (SIGCHLD signal handler)
- omprog
- mmexternal
- srutils.c (execProg function, invoked from stream.c and omshell)
Remove redundant "reaped by main loop" info log in omprog.
Promote debug message in mmexternal indicating that the child has
terminated prematurely to a warning log, like in omprog.
Thanks to Joan Sala for contributing this.
- build system: add capability to turn off helgrind tests
we add configure switch --enable-helgrind. We need to turn helgrind off
when we use clang coverage instrumentation. The instrumentation injects
mt-unsafe counter updates which we seem to be unable to suppress.
Note: for gcc this was possible, because they all occurred in a utility
function. For clang, they are inlined so we get many -and changing- violations.
see also
- imzmq3/omzmq3: remove modules
according to @brianknox (their author) these modules are outdated:
They are replaced by imczmq/omczmq and are no longer maintained. We put a
deprecation notice into the modules a year ago, and now it finally is time
to remove them. They do NOT build in any case, except if very old versions
of the 0mq ecosystem are used.
see also
- bugfix omusrmsg: don't overwrite previously set _PATH_DEV value
Since commit 56ace5e418d149af27586c7c1264fccfbc6badf1, omusrmsg was broken
because "memcpy()" is not a suitable substitute for "strncat()" in this
context: it actually replaces the previously added content.
Thanks to Thomas D. (whissi) for the patch.
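The difference matters because `memcpy()` to the start of a buffer overwrites what is already there, while an append must copy to the end of the existing string. The following sketch shows a bounds-checked append; the helper name is hypothetical and this is not the omusrmsg code.

```c
#include <string.h>

/* Hypothetical helper: append src to the NUL-terminated string in dst.
 * Returns 0 on success, -1 if the result would not fit into dstlen bytes. */
static int append_str(char *dst, size_t dstlen, const char *src)
{
    size_t used = strlen(dst);
    size_t add = strlen(src);

    if (used + add + 1 > dstlen)
        return -1;
    /* Copy to the END of the existing content. memcpy(dst, src, ...) would
     * instead replace the prefix that was stored before. */
    memcpy(dst + used, src, add + 1);
    return 0;
}
```

Starting from a buffer holding "/dev/", appending "tty1" yields "/dev/tty1", whereas copying to the start of the buffer would leave only "tty1".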
- bugfix ossl TLS driver: fixed authentication mode anon
authentication mode "anon" was not properly supported in ossl TLS
driver; if selected, did still require a full certificate.
- bugfix tls subsystem: Receiver hang due to insufficient TLS buffersize.
gtls and ossl driver used a default buffersize of 8KiB to store received
TLS packets. When tls read returned more than buffersize, the additional
buffer was not processed until new data arrived on the socket again.
TLS RFCs require up to 16KiB+1 buffer size for a single TLS record.
- bugfix pmpanngfw: build issue due to non-matching data types in comparison
Thanks to Narasimha Datta for the patch.
- omfile: work-around for "Bad file descriptor" errors
This works-around an issue we can reproduce e.g. via the test. Here, omfile gets a write
error with reason EBADF. So far, I was not able to see an actual
coding error. However I traced this down to a multithreaded race
on open and close calls. I am very surprised to see this type
of issue, as I think the kernel guarantees that it does not happen.
Here is what I see in strace -f:
openssl accepts a socket:
[pid 66386] accept(4, {sa_family=AF_INET, sin_port=htons(59054), sin_addr=inet_addr("")}, [128->16]) = 10
then, it works a bit with that socket, detects a failure and shuts it down. Sometimes, at the very same instant omfile on another thread tries to open an output file. Then the following happens:
[pid 66386] close(10) = 0
[pid 66389] openat(AT_FDCWD, "./rstb_356100_31fa9d20.out.log", O_WRONLY|O_CREAT|O_NOCTTY|O_APPEND|O_CLOEXEC, 0644 <unfinished ...>
[pid 66386] close(10 <unfinished ...>
[pid 66389] <... openat resumed> ) = 10
[pid 66386] <... close resumed> ) = 0
[pid 66386] poll([{fd=4, events=POLLIN}, {fd=5, events=POLLIN}], 2, -1 <unfinished ...>
[pid 66389] write(2, "file './rstb_356100_31fa9d20.out"..., 66file './rstb_356100_31fa9d20.out.log' opened as #10 with mode 420
) = 66
[pid 66389] ioctl(10, TCGETS, 0x7f59aeb89540) = -1 EBADF (Bad file descriptor)
This is **literally** from the log, without deleting or reordering
lines. I read it so that there is a race between `open` and `close`
where fd 10 is reused, but seemingly closed - resulting in the `EBADF`
While it smells like a kernel issue, it may be a well-hidden program
bug - if so, one I currently do not find. HOWEVER, this commit
works around the issue by reopening the file when we receive EBADF.
That's the best thing to do in that case, especially if it really is
a kernel bug. Data loss should not occur, as the previous writes
succeeded in that case.
The drawback of this work-around is that it only "fixes" omfile. In
theory every part of rsyslog can be affected by this issue (queue
files, for example). So this is not to be considered a final solution
of the root issues (but a big step forward for known problem cases).
see also
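The work-around described in the omfile entry above, reopening the file when a write fails with EBADF, can be sketched as follows. This is an illustration with a hypothetical helper name and simplified open flags, not the omfile code.

```c
#include <errno.h>
#include <fcntl.h>
#include <sys/types.h>
#include <unistd.h>

/* Hypothetical helper, NOT the omfile code: write to *fd and, if the
 * descriptor turns out to be invalid (EBADF), reopen the file in append
 * mode and retry the write once. */
static ssize_t write_reopen_on_ebadf(int *fd, const char *path,
                                     const void *buf, size_t len)
{
    ssize_t n = write(*fd, buf, len);

    if (n < 0 && errno == EBADF) {
        int newfd = open(path, O_WRONLY | O_CREAT | O_APPEND, 0644);
        if (newfd < 0)
            return -1;
        *fd = newfd;
        n = write(*fd, buf, len);
    }
    return n;
}
```

Only the failed write is retried; earlier writes are not repeated, which matches the entry's note that they had already succeeded.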
- omhttp bugfix: segfault due to NULL pointer access
many thanks to Gerardo Puerta for the patch
- omkafka bugfix: segfault when running in debug mode using dynamic topics
This should only affect test environments, as debug mode is not
suitable for production (and really does not work when running for
extended period of time).
- testbench bugfix: TLS syslog tests for "anon" mode were broken
They did not detect when "anon" mode was not properly supported by the
- test tooling bugfix: correct tcpflood error messages
it looks like tcpflood's openssl code stems partly back to tcpdump, at
least the error messages indicate this. Thankfully tcpdump is BSD licensed,
so this should not be a big issue. Nevertheless, the incorrect program name
in error messages needs to be corrected, and this is what this commit does.
- tcpflood bugfix: tool did not terminate on certificate error
when tcpflood detected a certificate error, it reported an
error message but did not abort. This could make errors undetectable
during CI runs.
also fix tests which did not properly provide CA cert (which then
caused the error).
- testbench: fix issues with journal testing
The configure/Makefile checks were not correct, leading to the
build of journal components when not necessary, even if not
supported by the platform. This led to invalid build and test
- testbench: add tests for "certless" tcp/tls
This adds a test to ensure that a client without certificate can
connect to a server with certificates. So it is not exactly
The prime intent of this test is to match config suggestions given
by log hosting companies (like loggly) and so ensure that we do
not accidentally break them. This is especially important as the
capability for certless clients was not properly documented and
had also been forgotten by the rsyslog team.
see also
- CI
- further improve testbench robustness against slow machines
- testbench: add tests for parser.EscapeControlCharacterTab global option
- testbench: Updated all expired x.509 certs
- fix a potential race in CI debug mode which can lead to segfault
only when instructed to do so, rsyslog may emit a "final worker thread shutdown"
message. This is usually only enabled in CI and/or other testing. If enabled,
the code has a race on the pWti object which can lead to segfault or abort.
Only systems which explicitly enable this CI aid are affected (running in debug
mode alone is NOT sufficient).
This is a regression from 8.40.0.
- testbench: improve robustness against slow CI, gen. improvements
* add an overall timeout value for tests - if running longer,
testbench framework tries to FAIL and end test. Note that
this is not bullet-proof and not intended to be so.
* guard against hanging rsyslog instances via a new imdiag
feature to abort after n number of seconds; among others,
this guards us against timeout-cancel in CI, which is always
pretty hard to diagnose - now we see these errors in test-suite.log
* fix a bug in tcp zip test, which actually did not use zip mode
* experimentally add debug output to better understand
shutdown_when_empty operation; goal is to improve understanding
and then remove that code again.
* improve shutdown predicate for a couple of tests
* made travis run make check with two parallel threads, for which
we seem ready now. Nevertheless, it's still experimental and we
may roll this back if required.
* testbench: disable omprog tests that hang under coverage instrumentation
When gcc coverage instrumentation is used, these tests hang. They work
with clang coverage instrumentation, but for some reason clang does not
give us full reports (at least not when used together with
We have tried to troubleshoot this for hours and hours - now is time to
give up until someone comes up with a bright idea. So we make the affected
tests skip themselves when they detect gcc with coverage instrumentation.
* testbench: add new test for imfile and logrotate in copytruncate mode
* testbench: add new omkafka tests for dynamic topics
* travis: no longer run 0mq tests
This often causes trouble when the packages are rebuilt by the 0mq project
(which happens frequently). We already do intensive testing of the 0mq
components in the buildbot infrastructure, where we use dedicated containers.
This is reliable, as the containers already contain everything needed and so
do not need to reach out to the 0mq package archives. In the light of this,
let's save us the trouble of Travis failures. The only downside is that
users cannot pre-test with their local Travis when modifying 0mq modules,
which is quite acceptable.
Version 8.40.0 [v8-stable] 2018-12-11
- mmkubernetes: add support for sslpartialchain for openssl
If `"on"`, this will set the OpenSSL certificate store flag
`X509_V_FLAG_PARTIAL_CHAIN`. This will allow you to verify the Kubernetes API
server cert with only an intermediate CA cert in your local trust store, rather
than having to have the entire intermediate CA + root CA chain in your local
trust store. See also `man s_client` - the `-partial_chain` flag.
This option is only available if rsyslog was built with support for OpenSSL and
only if the `X509_V_FLAG_PARTIAL_CHAIN` flag is available. If you attempt to
set this parameter on other platforms, you will get an `INFO` level log
message. This was done so that you could use the same configuration on
different platforms.
- openssl driver: improved error messages
also fixes misleading wording of some error messages
- imfile: disable file vs directory error on symlinks
The file/directory node-object alignment now ignores symlinks. Previously
it reported an error on each directory symlink, spamming user error logs.
Thanks to Jiri Vymazal for the patch.
- cleanup: remove no longer needed --enable-rtinst code
configure option --enable-rtinst has been gone for a while, but there was
still some supporting code left. It required careful analysis of what could
actually be removed. This is now done and the code fully cleaned up. This
greatly simplifies the code and also makes it more readable for
developers who are not deep inside the rsyslog code base.
As a positive side effect, we could eliminate mutex calls inside
the debug system. This means we are more likely to reproduce race
conditions in runs with debugging enabled.
- bugfix imfile: rsyslog re-sends data for files larger than 2GiB
This occurs always if and only if
- reopenOnTruncate="on" is set
- file grows over 2GiB in size
Then, the data is continuously re-sent until the file becomes smaller
than 2GiB (due to truncation) or is deleted.
It is a regression introduced by 2d15cbc8221e385c5aa821e4a851d7498ed81850
- config: fix segfault in backticks "echo" expansion of undefined variables
The bug was introduced in commit abe0434 (config: enhance backticks "echo"
capability). The getenv() result passed to strlen() and es_addBuf() may be
NULL if the environment variable does not exist, resulting in a segfault.
Thanks to Julien Thomas for the patch.
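The underlying C pitfall is that `getenv()` returns NULL for an undefined variable, and passing NULL to `strlen()` is undefined behavior. A minimal guard can look like the sketch below; the helper name is hypothetical and this is not the actual patch.

```c
#include <stdlib.h>
#include <string.h>

/* Hypothetical helper: like getenv(), but returns an empty string instead
 * of NULL when the variable is not set, so the result can be passed to
 * strlen() and similar functions safely. */
static const char *getenv_or_empty(const char *name)
{
    const char *val = getenv(name);

    return (val != NULL) ? val : "";
}
```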
- bugfix imsolaris: message timestamps on Solaris
On Solaris messages don't have their time directly in the raw body but in
a separate log_ctl structure which is currently not used.
When message is logged and processed, rsyslogd gives it current time because
it ignores the actual one. That means that old messages (e.g. from system
reboot) get timestamp of processing instead of the reboot itself (it is
not a problem for live logging where now is used anyway).
Thanks to Jakub Kulik for the patch.
- bugfix build system: "make distcheck" did not work for mysql tests
- bugfix build system: don't link liblogging-stdlog when available but not enabled
When liblogging-stdlog was available but configure option "--disable-liblogging-stdlog"
was set, rsyslog was still linking against liblogging-stdlog.
This commit will ensure that rsyslog will only link against liblogging-stdlog when
"--enable-liblogging-stdlog" was set.
see also:
- bugfix RainerScript: abs() could return negative value, now in range [0..max]
Thanks to Harshvardhan Shrivastava for providing the patch
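One common way an abs() implementation can return a negative value is the most negative integer, whose magnitude cannot be represented in the same signed type. Whether this was the exact cause here is an assumption; the sketch below only shows how a result can be kept in the range [0..max], using a hypothetical helper name.

```c
#include <limits.h>

/* Hypothetical helper, NOT the RainerScript implementation: absolute value
 * that saturates at LLONG_MAX instead of overflowing for LLONG_MIN. */
static long long saturating_abs(long long v)
{
    if (v == LLONG_MIN)
        return LLONG_MAX;   /* -LLONG_MIN is not representable */
    return (v < 0) ? -v : v;
}
```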
- bugfix debug output: date property options output wrongly
inside debug logging, the date property options were not all
properly converted into strings. Some of the newer ones were
invalidly flagged as "UNKNOWN". This is primarily a cosmetic
problem and has no effect other than puzzling folks looking at
the debug log.
- bugfix omhttp: did not compile on some platforms
- CI
* made mysql-based tests (ommysql and omlibdbi) work inside containers
* bugfix testbench: do not execute libgcrypt tests if disabled
* testbench: grep failed when string starting with "-" was used
The search term was mistakenly interpreted as an option.
* testbench: support auto-start/-stop of mysqld
This is required to run mysql/mariadb tests inside containers.
* improve bash coding style and fix some bugs in testbench
- duplicate init call was not detected due to typo
- queue-persists test did not work correctly
- some general testbench framework improvements
issues found by shellcheck, fixes brought up other work to do
* testbench: improve journal tests and testbench framework
improving both style and reliability of journal tests; along that way
also improve testbench framework:
- do cleanup on error_exit and skip
- explicit skip handler (vs exit 77)
this permits us to do better cleanup
- new testbench functions for journal-specific functionality
reduce code duplication and make things easier to maintain in the
- provide a way to do valgrind and non-valgrind tests with a single
test file
see also
* testbench: improve framework, harden rscript http test
- the test now tries to detect unavailable http server, which
should not result in test failure
- equivalent valgrind test changed to new method, removing code
- testbench supports
* new exit code 177, which indicates environment error, makes
test SKIP but still reports the failure
* new exitcode, logurl stats reporting fields
* report buildbot builder (if provided) in failure report
* testbench: add test for mmjsonparse with unparsable data
* testbench: make es-bulk-retry test more reliable
We now no longer depend on a fixed 'sleep' command but rather
check the output file for what we expect. This is much more
robust on slow test machines.
We believe this closes the below-mentioned issue. If not, it
should be re-opened.
* testbench: suppress valgrind error caused by pthreads lib
finally I give up and honestly think this is a problem in pthreads and
not in rsyslog code. See issue below and previous commit for more
Unfortunately, this will also mask off cases where we do not properly
call pthread_join() albeit it is needed. Nevertheless, this bug is
causing so much CI grief that it is definitely worth it.
* testbench: made a couple of (unnamed due to too many) tests more robust
against slow (CI) machines
Version 8.39.0 [v8-stable] 2018-10-30
- imfile: improve truncation detection
previously, truncation was only detected at end of file. Especially with
busy files that could cause loss of data and possibly also stall imfile
reading. The new code now also checks during each read. Obviously, there
is some additional overhead associated with that, but this is unavoidable.
It still is highly recommended NOT to turn on "reopenOnTruncate" in imfile.
Note that there are also inherent reliability issues. There is no way to
"fix" these, as they are caused by races between the process(es) who truncate
and rsyslog reading the file. But with the new code, the "problem window"
should be much smaller and, more importantly, imfile should not stall.
see also
see also
- imjournal: work around journald excessive reloading behavior
This is workaround for possible imjournal interaction with systemd
where journal invalidate fix is not present. The code tries to
detect SD_JOURNAL_INVALIDATE loop and not reload after each call.
Thanks to Jiri Vymazal for the patch.
- errmsg: remove no longer needed code
refactored code (over a long time) so that object-ish style is no longer
needed and could now finally be removed; We also refactored the last
component (omhttp contrib module) that used the old interface.
- queue bugfix: invalid error message on queue startup
due to some old regression (commit not exactly identified, but for
sure a regression, 9 years ago it was correct) an error message
is emitted when no .qi file exists on startup of the queue, which
is a normal condition.
Actually, the code should not have tried to open the .qi file in
the first place because it detected that it did not exist. That
(necessary) shortcut had been removed a while ago.
- bugfix imrelp: regression with legacy configuration startup fail
Startup of a relp listener failed if legacy configuration was used.
caused by commit: 32b71daa8aadb8f16fe0ca2945e54d593f47a824
- bugfix imudp: stall of connection and/or potential segfault
There was a regression in 493279b790a8cdace8ccbc2c5136985e820dd2fa.
This regression may cause stop (or delay) of reception from some systems
and may also cause a segfault. Triggering condition is that at least
one listener could not be created.
Thanks to Jens Låås for the patch.
- bugfix gcry crypto driver: small memleak
If a crypto key is specified directly via the key="" parameter,
the storage for that key is not freed, causing a small memleak.
Note that the problem occurs only once per context, so this
should not cause real issues. Even more so, as specifying a
key directly is meant only for testing purposes and is strongly
discouraged for production use.
Detected by internal testing, no actual fail case known.
- fix potential misaddressing in encryption subsystem
could happen if e.g. disk queues were encrypted
not seen in practice but caught by testbench test
- ksi subsystem changes
* enhance debug logging
* disable unsafe SHA1 algorithm
Thanks to Allan Park for the patch.
- bugfix core: regex compile error messages could be incorrect
- bugfix core: potential hang on rsyslog termination
The root cause was a deadlock during worker startup. This could
happen for example when a DA queue needed to persist data during
Fail condition:
* startup request for a new worker
* initialization of that worker
* immediate detection that the worker can or must shutdown
* main thread waiting for worker running state, which it skips,
and so the main thread hangs inside a loop
- bugfix imkafka: system hang when backgrounded
imkafka initializes librdkafka too early (before the fork). This leads
to hangs in various parts of the system - not only in imkafka but
other functions as well (e.g. getaddrinfo() calls).
- bugfix imfile: file change was not reliably detected
A change in the inode was not detected under all circumstances,
most importantly not in some logrotate cases.
Includes new tests made by Andre Lorbach. They now use the
logrotate tool natively to reproduce the issue.
- bugfix imrelp: do not fail build if librelp does not have relpSrvSetLstnAddr
- bugfix queue subsystem: DA queue did ignore encryption settings
- bugfix KSI: lmsig-ksils12 module skips signing the last block
Thanks to Allan Park for the patch.
- bugfix fmhash: function hash64mod sometimes returned wrong result
Thanks to Harshvardhan Shrivastava for providing the patch
- bugfix core/debug: data written to random fd 2 under some debug settings
This happens only during auto-backgrounding, where we cannot any longer
access stderr. Whatever is opened with fd2 receives some debug messages.
Note that the specific feature is usually turned on only in CI runs.
- cleanup: removed no longer needed code
Code that was unused for quite a while or did not really belong to the
project identified and removed.
- overall code cleanup
e.g. remove unused code, replace bad bash constructs, etc...
- CI:
* some small improvements in testbench plumbing
e.g. (`cmd` replaced by $(cmd), removed useless use of cat, ...)
* testbench: improve plumbing for kafka tests
- Removed all sleeps where possible.
- Moved all kafka start/stop/download logic into functions.
- Moved kafka/zookeeper stop into error_exit and exit_test.
- Kafka/Zookeeper cleanup only done on success now.
- Kafka/Zookeeper logfiles automatically dumped on error_exit only now.
- Added cleanup for Kafka/Zookeeper instances into CI/
- added new tests
* testbench: fix incompatibility of one omprog test with Python3
Python3 writes to stderr immediately, and this caused the
captured output to differ with respect to Python2. Simplified
the test to do a single write to stderr. Also a cast to int
was needed when calculating 'numRepeats'.
* testbench: fixed imfile parallel issues
- Fixed timing issues in some imfile wildcard/regex tests
- Added touch command in imfile wildcard tests to make sure directories
exist before files are created in them if IO is under stress.
- changed content checking in some tests to use "content_check_with_count"
with check timeouts instead of using fixed sleeptimes.
* testbench: new basic tests
These ensure that for some modules that did not have any tests at all
we have at least a minimal coverage (module loads, activates, is able
to emit error messages). Of course, further improvements would make
much sense. Modules:
- ommail
- testbench: new tests for disk queue encryption
- testbench: improved auto-diagnostics for hanging instance
- testbench: hardened kafka test against failing kafka subsystem,
not in 100% of the cases, but at least in some that frequently occur
- failing tests now report failure status so that we can get stats
on unreliable tests
- testbench tooling: fix incorrect tcpflood TLS parameter check
could lead to segfault when started
- bugfix testbench tooling: tcpflood invalid type in calloc (openssl mode)
It is unlikely that this has caused a real issue, as long as pointers
are all of the same size (which is highly probable).
detected by cppcheck via
Version 8.38.0 [v8-stable] 2018-09-18
- AIX: make basic modules work again
- make rsyslog build on AIX again
... at least for a limited set of default modules
- imfile: support for endmsg.regex
This adds support for endmsg.regex. It is similar to
startmsg.regex except that it matches the line that denotes
the end of the message, rather than the start of the next message.
This is primarily for container log file use cases such as this:
date stdout P start of message
date stdout P middle of message
date stdout F end of message
The `F` means this is the line which contains the final part of
the message. The fully assembled message should be
`start of message middle of message end of message`.
`endmsg.regex="^[^ ]+ stdout F "` will match.
Thanks to Richard Megginson for the patch.
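The grouping rule described above can be sketched in Python. This is an illustration only, not imfile's implementation: `assemble` is a hypothetical helper, and the fourth sample line is made up to show a single-line message. Only the regex and the first three lines come from the entry itself.

```python
import re

def assemble(lines, endmsg_regex):
    """Group lines into messages; a line matching endmsg_regex closes a message."""
    end = re.compile(endmsg_regex)
    messages, pending = [], []
    for line in lines:
        pending.append(line)
        if end.search(line):
            # This line carries the final part, so the message is complete.
            messages.append(pending)
            pending = []
    # pending holds the lines of a message that has not been closed yet.
    return messages, pending

lines = [
    "date stdout P start of message",
    "date stdout P middle of message",
    "date stdout F end of message",
    "date stdout F single-line message",  # made-up extra line
]
messages, pending = assemble(lines, r"^[^ ]+ stdout F ")
```

Unlike startmsg.regex, the message is complete as soon as the matching line is seen, so no following line is needed to flush it.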
- imkafka: add parameter "parseHostName"
This enables imkafka to parse the hostname from the log message.
Previously that was not possible. It was most likely a bug, but
one that users may count on. The new parameter "ParseHostName"
(default is off) controls this behavior. Default is to NOT
parse the hostname.
Thanks to github user snaix for the contribution.
- im[p]tcp: improve error message on connect failure
Now a message with the actual OS error is emitted, making things far
easier to troubleshoot.
- imkafka: implement multithreading support for kafka consumers.
Each consumer runs in its own consumer thread now. New tests have also
been added for this.
- omelasticsearch: write all header metadata to $.omes for retries
Write all of the original request metadata fields to $.omes for
the retry, if present. This may include all of the following:
_index, _type, _id, _parent, pipeline
This is in addition to the fields from the response. If the same
field name exists in the request metadata and the response, the
field from the request will be used, in order to facilitate
retrying the exact same request.
Thanks to Richard Megginson for the patch.
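The precedence rule can be sketched in Python as a dictionary merge. This is a simplified picture, not the module's code: `build_omes` is a hypothetical helper, and the sample values (index name, id, status) are invented for the example.

```python
def build_omes(request_meta, response_fields):
    """Combine response fields with request metadata; the request wins on clashes."""
    omes = dict(response_fields)   # start from what the response reported
    omes.update(request_meta)      # request metadata overrides same-named fields
    return omes

# Invented sample data; only the field names come from the entry above.
request_meta = {"_index": "logs-a", "_id": "42", "pipeline": "p1"}
response_fields = {"_index": "logs-b", "status": 429}

omes = build_omes(request_meta, response_fields)
```

Here `omes["_index"]` ends up as the request's value, which is what allows the retry to repeat the exact same request.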
- core: improve error message on module load fail
The error message now lists all dlopen() errors in depth. This is
especially useful if the error is due to missing symbols or file
format errors.
- core/queue: add error message if queue file cannot be accessed
If a disk-assisted queue does not have permission to write to the specified
queue file, an error message is now generated.
- imtcp/imudp: new option preservecase for managing the case of FROMHOST value
default is left at current behavior
- omprog: add feedback timeout and keep-alive feature
- Restart the program if it does not respond within timeout.
- New setting 'confirmTimeout' (default 10 seconds).
- Allow the program to provide keep-alive feedback when a
message requires long-running processing.
- Improve efficiency when reading feedback line (use buffer).
Retry interrupted writes/reads to/from pipe.
- New setting 'reportFailures' for reporting error messages
from the program.
- Report child termination when writing to pipe.
- Minor refactor: renamed writePipe function to sendMessage,
renamed readPipe to readStatus.
Thanks to Joan Sala for contributing this.
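The timeout handling can be pictured with the following Python sketch. It is not omprog's code (omprog is written in C): `read_status` borrows only its name from the refactoring noted above, and the pipe, the 4096-byte read size and the `OK` confirmation line used in the demo are assumptions made for illustration.

```python
import os
import select
import time

def read_status(fd, timeout_s):
    """Return one feedback line read from fd, or None if none arrives in time."""
    deadline = time.monotonic() + timeout_s
    buf = b""
    while b"\n" not in buf:
        remaining = deadline - time.monotonic()
        if remaining <= 0:
            return None
        ready, _, _ = select.select([fd], [], [], remaining)
        if not ready:
            return None  # timed out: the caller would restart the program
        chunk = os.read(fd, 4096)  # buffered read instead of byte-by-byte
        if not chunk:
            return None  # EOF: the program closed its end of the pipe
        buf += chunk
    return buf.split(b"\n", 1)[0].decode()

# Demo with a plain pipe standing in for the program's stdout.
r, w = os.pipe()
silent = read_status(r, 0.05)      # nothing written yet, so this times out
os.write(w, b"OK\n")
confirmed = read_status(r, 0.05)   # a confirmation line is now available
os.close(r)
os.close(w)
```

A caller would treat a `None` result as "no response within the timeout" and restart the program, as the first item in the list describes.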
- omprog: fix forceSingleInstance configuration option
The forceSingleInstance option did not work as intended. Even
if it was set, multiple instances were spawned. This most probably
was a regression from 0453b1670fc34c96d31ee7c9a370f0f5ec24744a
The code was broken roughly 3.5 years ago, so it looks like the
issue was little-noticed. This also means that potentially some users
may see the bugfix as a change of behavior. If so, just remove
the option.
Thanks to Joan Sala for contributing this.
- imfile: implement file-id, used in state file
This ensures that files with the same inodes are not accidentally treated
as equal, at least within the limits of the file id hash (see doc for
We use the siphash reference implementation to generate our non-cryptographic
- imfile: experimental input throttling feature
The new input parameter delay.message has been added. It specifies
a delay in microseconds after each line read.
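As a rough Python sketch of what a per-line delay means (an illustration under assumptions, not imfile's implementation): `read_throttled` and its injectable `sleep` argument are hypothetical, and only the microsecond unit comes from the entry above.

```python
import time

def read_throttled(lines, delay_us, sleep=time.sleep):
    """Yield each line, then pause delay_us microseconds before the next one."""
    for line in lines:
        yield line
        if delay_us > 0:
            sleep(delay_us / 1_000_000)  # convert microseconds to seconds

# Demo: record the requested pauses instead of actually sleeping.
pauses = []
out = list(read_throttled(["line 1", "line 2"], 500, sleep=pauses.append))
```

With a delay of 500 microseconds, each line read is followed by a pause of 0.0005 seconds, which caps the read rate at roughly 2000 lines per second.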
- core: do not emit TZ warning on startup on non-container Linux
On Linux it seems common that the TZ variable is NOT properly set.
There are some concerns that the warning related to rsyslog correcting
this confuses users. It also seems that the corrective action rsyslog
takes is right, and so there is no hard need to inform users on that.
In Linux containers, however, the warning seems to be useful as the
timezone setup there seems to be frequently-enough different and
rsyslog's corrective action may not be correct.
So we now check if we are running under Linux and not within a container.
If so, we do not emit the warning. In all other cases, we do. This is
based on the assumption that other unixoid systems still should have
TZ properly set.
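The decision described above reduces to a small truth table, sketched here in Python. `should_warn_about_tz` is a hypothetical name, and how rsyslog detects Linux or a container is not covered by this entry, so both facts are passed in as plain booleans. The function covers only the platform check, for the case where rsyslog would otherwise warn.

```python
def should_warn_about_tz(is_linux, in_container):
    """Suppress the TZ warning only on Linux outside of a container."""
    return not (is_linux and not in_container)

# All four combinations of (is_linux, in_container).
table = {
    (linux, container): should_warn_about_tz(linux, container)
    for linux in (True, False)
    for container in (True, False)
}
```

Only the (Linux, not in a container) case maps to no warning. Linux containers and all non-Linux systems still get it.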
- omkafka:
* better debug information
* Fixed minor issue in omkafka producing wrong kafka timestamps when
msgTimestamp was NULL.
* Setting RD_KAFKA_V_KEY(NULL, 0) in rd_kafka_producev now when KEY is not
* Fixed minor issue when rsyslog is compiled with --enable-debug and