Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

SIGBUS after journal rotation #4645

Open
amdrsantos opened this issue Sep 21, 2023 · 6 comments
Open

SIGBUS after journal rotation #4645

amdrsantos opened this issue Sep 21, 2023 · 6 comments
Labels

Comments

@amdrsantos
Copy link

syslog-ng

Version of syslog-ng

syslog-ng 4 (4.1.0)
Config version: 4.0
Installer-Version: 4.1.0
Revision: 4.1.0-1
Compile-Date: Feb 28 2023 15:33:51
Module-Directory: /usr/lib/syslog-ng/4.1
Module-Path: /usr/lib/syslog-ng/4.1
Include-Path: /usr/share/syslog-ng/include
Available-Modules: disk-buffer,regexp-parser,affile,correlation,rate-limit-filter,afsql,tags-parser,afmongodb,syslogformat,afuser,pseudofile,csvparser,linux-kmsg-format,afprog,metrics-probe,cryptofuncs,afsocket,system-source,confgen,kvformat,basicfuncs,pacctformat,cef,timestamp,sdjournal,hook-commands,appmodel,json-plugin
Enable-Debug: off
Enable-GProf: off
Enable-Memtrace: off
Enable-IPv6: on
Enable-Spoof-Source: on
Enable-TCP-Wrapper: on
Enable-Linux-Caps: on
Enable-Systemd: on

Service file

[Unit]
Description=System Logger Daemon
Documentation=man:syslog-ng(8)
After=network.target

[Service]
Type=notify
ExecStart=/usr/sbin/syslog-ng -F $SYSLOGNG_OPTS --caps "cap_net_bind_service,cap_net_broadcast,cap_net_raw,cap_dac_read_search,cap_chown,cap_fowner=p cap_dac_override,cap_syslog=ep"
ExecReload=/bin/kill -HUP $MAINPID
EnvironmentFile=-/etc/default/syslog-ng
EnvironmentFile=-/etc/sysconfig/syslog-ng
StandardOutput=journal
StandardError=journal
Restart=on-failure

[Install]
WantedBy=multi-user.target

Platform

PRETTY_NAME="Debian GNU/Linux 12 (bookworm)"
NAME="Debian GNU/Linux"
VERSION_ID="12"
VERSION="12 (bookworm)"
VERSION_CODENAME=bookworm
ID=debian
HOME_URL="https://www.debian.org/"
SUPPORT_URL="https://www.debian.org/support"
BUG_REPORT_URL="https://bugs.debian.org/"

Debug bundle

Not Aviable

Issue

Failure

2023-09-20T21:24:59.455535+0000 machine systemd[1]: Started systemd-coredump@0-18522-0.service - Process Core Dump (PID 18522/UID 0).
2023-09-20T21:25:00.259365+0000 machine systemd-coredump[18523]: [🡕] Process 2984 (syslog-ng) of user 0 dumped core.

                                                                  Module libsystemd.so.0 from deb systemd-252.6-1.amd64
                                                                  Stack trace of thread 17642:
                                                                  #0  0x00007f5d61dd4f9e n/a (libsystemd.so.0 + 0x5ff9e)
                                                                  #1  0x00007f5d6147671e n/a (libsdjournal.so + 0x771e)
                                                                  #2  0x00007f5d61476ac9 n/a (libsdjournal.so + 0x7ac9)
                                                                  #3  0x00007f5d62345061 n/a (libsyslog-ng-4.1.so.0 + 0x58061)
                                                                  #4  0x00007f5d61e49692 n/a (libivykis.so.0 + 0x5692)
                                                                  #5  0x00007f5d61e48043 n/a (libivykis.so.0 + 0x4043)
                                                                  #6  0x00007f5d61e4dfe5 n/a (libivykis.so.0 + 0x9fe5)
                                                                  #7  0x00007f5d61e4a259 n/a (libivykis.so.0 + 0x6259)
                                                                  #8  0x00007f5d61e4b841 iv_main (libivykis.so.0 + 0x7841)
                                                                  #9  0x00007f5d61e49463 n/a (libivykis.so.0 + 0x5463)
                                                                  #10 0x00007f5d61e4cb75 n/a (libivykis.so.0 + 0x8b75)
                                                                  #11 0x00007f5d6204ffd4 n/a (libc.so.6 + 0x88fd4)
                                                                  #12 0x00007f5d620cf8d0 __clone (libc.so.6 + 0x1088d0)

                                                                  Stack trace of thread 2984:
                                                                  #0  0x00007f5d620cfcb6 epoll_wait (libc.so.6 + 0x108cb6)
                                                                  #1  0x00007f5d61e4ddc5 n/a (libivykis.so.0 + 0x9dc5)
                                                                  #2  0x00007f5d61e4a381 n/a (libivykis.so.0 + 0x6381)
                                                                  #3  0x00007f5d61e4b841 iv_main (libivykis.so.0 + 0x7841)
                                                                  #4  0x00007f5d62343e69 main_loop_run (libsyslog-ng-4.1.so.0 + 0x56e69)
                                                                  #5  0x000056528d39d62b main (syslog-ng + 0x262b)
                                                                  #6  0x00007f5d61fee18a n/a (libc.so.6 + 0x2718a)
                                                                  #7  0x00007f5d61fee245 __libc_start_main (libc.so.6 + 0x27245)
                                                                  #8  0x000056528d39d701 _start (syslog-ng + 0x2701)

                                                                  Stack trace of thread 17640:
                                                                  #0  0x00007f5d620cfcb6 epoll_wait (libc.so.6 + 0x108cb6)
                                                                  #1  0x00007f5d61e4ddc5 n/a (libivykis.so.0 + 0x9dc5)
                                                                  #2  0x00007f5d61e4a259 n/a (libivykis.so.0 + 0x6259)
                                                                  #3  0x00007f5d61e4b841 iv_main (libivykis.so.0 + 0x7841)
                                                                  #4  0x00007f5d61e49463 n/a (libivykis.so.0 + 0x5463)
                                                                  #5  0x00007f5d61e4cb75 n/a (libivykis.so.0 + 0x8b75)
                                                                  #6  0x00007f5d6204ffd4 n/a (libc.so.6 + 0x88fd4)
                                                                  #7  0x00007f5d620cf8d0 __clone (libc.so.6 + 0x1088d0)

                                                                  Stack trace of thread 17639:
                                                                  #0  0x00007f5d620cfcb6 epoll_wait (libc.so.6 + 0x108cb6)
                                                                  #1  0x00007f5d61e4ddc5 n/a (libivykis.so.0 + 0x9dc5)
                                                                  #2  0x00007f5d61e4a259 n/a (libivykis.so.0 + 0x6259)
                                                                  #3  0x00007f5d61e4b841 iv_main (libivykis.so.0 + 0x7841)
                                                                  #4  0x00007f5d61e49463 n/a (libivykis.so.0 + 0x5463)
                                                                  #5  0x00007f5d61e4cb75 n/a (libivykis.so.0 + 0x8b75)
                                                                  #6  0x00007f5d6204ffd4 n/a (libc.so.6 + 0x88fd4)
                                                                  #7  0x00007f5d620cf8d0 __clone (libc.so.6 + 0x1088d0)

                                                                  Stack trace of thread 18311:
                                                                  #0  0x00007f5d620cfcb6 epoll_wait (libc.so.6 + 0x108cb6)
                                                                  #1  0x00007f5d61e4ddc5 n/a (libivykis.so.0 + 0x9dc5)
                                                                  #2  0x00007f5d61e4a259 n/a (libivykis.so.0 + 0x6259)
                                                                  #3  0x00007f5d61e4b841 iv_main (libivykis.so.0 + 0x7841)
                                                                  #4  0x00007f5d62329ede n/a (libsyslog-ng-4.1.so.0 + 0x3cede)
                                                                  #5  0x00007f5d62344f01 n/a (libsyslog-ng-4.1.so.0 + 0x57f01)
                                                                  #6  0x00007f5d62226cfd n/a (libglib-2.0.so.0 + 0x7ecfd)
                                                                  #7  0x00007f5d6204ffd4 n/a (libc.so.6 + 0x88fd4)
                                                                  #8  0x00007f5d620cf8d0 __clone (libc.so.6 + 0x1088d0)

                                                                  Stack trace of thread 17641:
                                                                  #0  0x00007f5d620cfcb6 epoll_wait (libc.so.6 + 0x108cb6)
                                                                  #1  0x00007f5d61e4ddc5 n/a (libivykis.so.0 + 0x9dc5)
                                                                  #2  0x00007f5d61e4a259 n/a (libivykis.so.0 + 0x6259)
                                                                  #3  0x00007f5d61e4b841 iv_main (libivykis.so.0 + 0x7841)
                                                                  #4  0x00007f5d61e49463 n/a (libivykis.so.0 + 0x5463)
                                                                  #5  0x00007f5d61e4cb75 n/a (libivykis.so.0 + 0x8b75)
                                                                  #6  0x00007f5d6204ffd4 n/a (libc.so.6 + 0x88fd4)
                                                                  #7  0x00007f5d620cf8d0 __clone (libc.so.6 + 0x1088d0)
                                                                  ELF object binary architecture: AMD x86-64
2023-09-20T21:25:00.287037+0000 machine kernel: syslog-ng (2984) used greatest stack depth: 11544 bytes left
2023-09-20T21:25:00.303109+0000 machine systemd[1]: syslog-ng.service: Main process exited, code=killed, status=7/BUS
2023-09-20T21:25:00.303327+0000 machine systemd[1]: syslog-ng.service: Failed with result 'signal'.
2023-09-20T21:25:00.303642+0000 machine systemd[1]: syslog-ng.service: Consumed 3min 24.547s CPU time.

Steps to reproduce

I think it happening after the journal rotation:

2023-09-20T21:24:59.412233+0000 machine systemd-journald[1554]: Data hash table of /var/log/journal/397d4151bc50a273e5879e5064b7d611/system.journal has a fill level at 75.0 (51202 of 68266 items, 25165824 file size, 491 bytes per hash table item), suggesting rotation.
2023-09-20T21:24:59.412800+0000 machine systemd-journald[1554]: /var/log/journal/397d4151bc50a273e5879e5064b7d611/system.journal: Journal header limits reached or header out-of-date, rotating.

Configuration

In attachment GX-101569.syslog-ng.zip

Input and output logs (if possible)

<30>1 2023-09-20T21:24:47.412+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2941818"] Reloading syslog-ng.service - System Logger Daemon...
<30>1 2023-09-20T21:24:47.416+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2941821"] Reloaded syslog-ng.service - System Logger Daemon.

<46>1 2023-09-20T21:24:59.412+00:00 GX-208-xmm4-1-1 systemd-journald 1554 - [meta sequenceId="2943204"] Data hash table of /var/log/journal/397d4151bc50a273e5879e5064b7d611/system.journal has a fill level at 75.0 (51202 of 68266 items, 25165824 file size, 491 bytes per hash table item), suggesting rotation.
<46>1 2023-09-20T21:24:59.412+00:00 GX-208-xmm4-1-1 systemd-journald 1554 - [meta sequenceId="2943205"] /var/log/journal/397d4151bc50a273e5879e5064b7d611/system.journal: Journal header limits reached or header out-of-date, rotating.
<30>1 2023-09-20T21:24:59.452+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943348"] Created slice system-systemd\x2dcoredump.slice - Slice /system/systemd-coredump.
<30>1 2023-09-20T21:24:59.455+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943349"] Started systemd-coredump@0-18522-0.service - Process Core Dump (PID 18522/UID 0).
<28>1 2023-09-20T21:25:00.303+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943352"] syslog-ng.service: Main process exited, code=killed, status=7/BUS
<28>1 2023-09-20T21:25:00.304+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943353"] syslog-ng.service: Failed with result 'signal'.
<30>1 2023-09-20T21:25:00.304+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943354"] syslog-ng.service: Consumed 3min 24.547s CPU time.
<30>1 2023-09-20T21:25:00.311+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943355"] systemd-coredump@0-18522-0.service: Deactivated successfully.
<30>1 2023-09-20T21:25:00.523+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943356"] syslog-ng.service: Scheduled restart job, restart counter is at 1.
<30>1 2023-09-20T21:25:00.524+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943357"] Stopped syslog-ng.service - System Logger Daemon.
<30>1 2023-09-20T21:25:00.524+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943358"] syslog-ng.service: Consumed 3min 24.547s CPU time.
<30>1 2023-09-20T21:25:00.525+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943359"] Starting syslog-ng.service - System Logger Daemon...
<45>1 2023-09-20T21:25:00.662+00:00 GX-208-xmm4-1-1 syslog-ng 18560 - [meta sequenceId="2906730"] syslog-ng starting up; version='4.1.0'
<30>1 2023-09-20T21:25:03.956+00:00 GX-208-xmm4-1-1 systemd 1 - [meta sequenceId="2943744"] Started syslog-ng.service - System Logger Daemon.
@amdrsantos amdrsantos added the bug label Sep 21, 2023
@amdrsantos
Copy link
Author

Still visible in version 4.4:

root@machine:/home# coredumpctl dump 24897
           PID: 24897 (syslog-ng)
           UID: 0 (root)
           GID: 0 (root)
        Signal: 7 (BUS)
     Timestamp: Thu 2023-10-19 15:52:25 UTC (9min ago)
  Command Line: /usr/sbin/syslog-ng -F
    Executable: /usr/sbin/syslog-ng
 Control Group: /system.slice/syslog-ng.service
          Unit: syslog-ng.service
         Slice: system.slice
       Boot ID: ac13e43f9b6e41c89d2f4568c4337440
    Machine ID: 397d4151bc50a273e5879e5064b7d611
      Hostname: xmm4-1-1
       Storage: /var/lib/systemd/coredump/core.syslog-ng.0.ac13e43f9b6e41c89d2f4568c4337440.24897.1697730745000000.zst (present)
  Size on Disk: 1.2M
       Message: Process 24897 (syslog-ng) of user 0 dumped core.

                Module libsystemd.so.0 from deb systemd-252.6-1.amd64
                Stack trace of thread 27185:
                #0  0x00007f5ba0370f9e n/a (libsystemd.so.0 + 0x5ff9e)
                #1  0x00007f5b9fb3d86e n/a (libsdjournal.so + 0x786e)
                #2  0x00007f5b9fb3dc19 n/a (libsdjournal.so + 0x7c19)
                #3  0x00007f5ba0907db2 n/a (libsyslog-ng-4.4.so.0 + 0x5cdb2)
                #4  0x00007f5ba03e5692 n/a (libivykis.so.0 + 0x5692)
                #5  0x00007f5ba03e4043 n/a (libivykis.so.0 + 0x4043)
                #6  0x00007f5ba03e9fe5 n/a (libivykis.so.0 + 0x9fe5)
                #7  0x00007f5ba03e6259 n/a (libivykis.so.0 + 0x6259)
                #8  0x00007f5ba03e7841 iv_main (libivykis.so.0 + 0x7841)
                #9  0x00007f5ba03e5463 n/a (libivykis.so.0 + 0x5463)
                #10 0x00007f5ba03e8b75 n/a (libivykis.so.0 + 0x8b75)
                #11 0x00007f5ba060dfd4 start_thread (libc.so.6 + 0x88fd4)
                #12 0x00007f5ba068d820 __clone (libc.so.6 + 0x108820)

                Stack trace of thread 27183:
                #0  0x00007f5ba068dc06 epoll_wait (libc.so.6 + 0x108c06)
                #1  0x00007f5ba03e9dc5 n/a (libivykis.so.0 + 0x9dc5)
                #2  0x00007f5ba03e6259 n/a (libivykis.so.0 + 0x6259)
                #3  0x00007f5ba03e7841 iv_main (libivykis.so.0 + 0x7841)
                #4  0x00007f5ba08eb3ce n/a (libsyslog-ng-4.4.so.0 + 0x403ce)
                #5  0x00007f5ba0907c31 n/a (libsyslog-ng-4.4.so.0 + 0x5cc31)
                #6  0x00007f5ba07e4cfd n/a (libglib-2.0.so.0 + 0x7ecfd)
                #7  0x00007f5ba060dfd4 start_thread (libc.so.6 + 0x88fd4)
                #8  0x00007f5ba068d820 __clone (libc.so.6 + 0x108820)

                Stack trace of thread 24897:
                #0  0x00007f5ba068dc06 epoll_wait (libc.so.6 + 0x108c06)
                #1  0x00007f5ba03e9dc5 n/a (libivykis.so.0 + 0x9dc5)
                #2  0x00007f5ba03e6381 n/a (libivykis.so.0 + 0x6381)
                #3  0x00007f5ba03e7841 iv_main (libivykis.so.0 + 0x7841)
                #4  0x00007f5ba0906a89 main_loop_run (libsyslog-ng-4.4.so.0 + 0x5ba89)
                #5  0x0000560a3daad58a main (syslog-ng + 0x258a)
                #6  0x00007f5ba05ac18a __libc_start_call_main (libc.so.6 + 0x2718a)
                #7  0x00007f5ba05ac245 __libc_start_main_impl (libc.so.6 + 0x27245)
                #8  0x0000560a3daad6f1 _start (syslog-ng + 0x26f1)

                Stack trace of thread 27186:
                #0  0x00007f5ba068dc06 epoll_wait (libc.so.6 + 0x108c06)
                #1  0x00007f5ba03e9dc5 n/a (libivykis.so.0 + 0x9dc5)
                #2  0x00007f5ba03e6259 n/a (libivykis.so.0 + 0x6259)
                #3  0x00007f5ba03e7841 iv_main (libivykis.so.0 + 0x7841)
                #4  0x00007f5ba03e5463 n/a (libivykis.so.0 + 0x5463)
                #5  0x00007f5ba03e8b75 n/a (libivykis.so.0 + 0x8b75)
                #6  0x00007f5ba060dfd4 start_thread (libc.so.6 + 0x88fd4)
                #7  0x00007f5ba068d820 __clone (libc.so.6 + 0x108820)
                ELF object binary architecture: AMD x86-64

@bazsi
Copy link
Collaborator

bazsi commented Dec 10, 2023

SIGBUS indicates that the backing pages of a mmap()-ed region is gone. Normally, this should not happen with a normal file system, as if the file itself is removed, as long as there's a reference to it (e.g. mmap), it should not be removed in reality, the backing storage would remain to be allocated while the filename itself is removed from the containing directory.

What kind of filesytem hosts the systemd journal file? Is it a normal file system or a ramdisk, perhaps something else? We really can't do too much in this situation, at least if my assumptions are correct.

But the question remains, what do you have as backing storage for the journal files?

@bazsi
Copy link
Collaborator

bazsi commented Dec 10, 2023

This can be somewhat related to: systemd/systemd#12042

As you can see, sometimes the journald will truncate a file, which might cause a SIGBUS in the reading side (e.g. syslog-ng)

@bazsi
Copy link
Collaborator

bazsi commented Dec 10, 2023

And here's the possible fix: systemd/systemd#24320

It is fixed in systemd v255, I am not sure which one you are running. But the same bug affects rsyslog and filebeat

@amdrsantos
Copy link
Author

The journal is in a type ext4 partition of an SSD.
The machine is still using: systemd 252 (252.17-1~deb12u1).

Probably the issue that you mentioned can be the cause. I will let you know when we decide to upgrade.

@bazsi
Copy link
Collaborator

bazsi commented Dec 12, 2023 via email

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

No branches or pull requests

2 participants