Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

logging with short hostname instead of FQDN until first HUP #5218

Closed
mbiebl opened this issue Aug 30, 2023 · 4 comments · Fixed by #5289
Closed

logging with short hostname instead of FQDN until first HUP #5218

mbiebl opened this issue Aug 30, 2023 · 4 comments · Fixed by #5289
Assignees
Labels
Milestone

Comments

@mbiebl
Copy link
Contributor

mbiebl commented Aug 30, 2023

Version: 8.2302.0-1

Originally filed as downstream bug https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1050874
But it's also reproducible with the latest version 8.2308.0-1, notice the usage of $PreserveFQDN on

Hi,

it looks like the rsyslog version in bookworm is affected by
something related to #4975,
quoting from there:

| After rsyslog start, local hostname is short name only. It changes to FQDN after the first HUP.

I observed this and can reproduce it on several bookworm systems:

| synpromika@example01 ~ % hostname
| example01
| synpromika@example01 ~ % hostname --fqdn
| example01.in.example.com
| synpromika@example01 ~ % logger mika-was-here
| synpromika@example01 ~ % sudo tail /var/log/syslog
| [...]
| 2023-08-30T16:54:12.429181+02:00 example01 synpromika: mika-was-here

Only when sending HUP to rsyslog it behaves as expected:

| synpromika@example01 ~ % sudo pkill -HUP rsyslogd
| synpromika@example01 ~ % logger mika-was-here2
| synpromika@example01 ~ % sudo tail /var/log/syslog
| [....]
| 2023-08-30T16:54:12.429181+02:00 example01 synpromika: mika-was-here
| 2023-08-30T16:54:20.194863+02:00 example01.in.example.com rsyslogd: [origin software="rsyslogd" swVersion="8.2302.0" x-pid="5181" x-info=["https://www.rsyslog.com"](https://www.rsyslog.com/)] rsyslogd was HUPed
| 2023-08-30T16:54:23.708793+02:00 example01.in.example.com synpromika: mika-was-here2

FTR, the systems have the default /etc/rsyslog.conf:

| synpromika@example01 ~ % grep -v '^#' /etc/rsyslog.conf | sed '/^$/d'
| module(load="imuxsock") # provides support for local system logging
| module(load="imklog")   # provides kernel logging support
| $FileOwner root
| $FileGroup adm
| $FileCreateMode 0640
| $DirCreateMode 0755
| $Umask 0022
| $WorkDirectory /var/spool/rsyslog
| $IncludeConfig /etc/rsyslog.d/*.conf
| *.*;auth,authpriv.none          -/var/log/syslog
| auth,authpriv.*                 /var/log/auth.log
| cron.*                          -/var/log/cron.log
| kern.*                          -/var/log/kern.log
| mail.*                          -/var/log/mail.log
| user.*                          -/var/log/user.log
| *.emerg                         :omusrmsg:*

And /etc/rsyslog.d/postfix.conf as shipped by the postfix package,
plus a custom /etc/rsyslog.d/graylog.conf for central logging are
also in place:

| synpromika@example01 ~ % grep -v '^#' /etc/rsyslog.d/*.conf | sed '/^$/d'
| /etc/rsyslog.d/graylog.conf:$PreserveFQDN on
| /etc/rsyslog.d/graylog.conf:*.* @logging01.in.example.com:5140;RSYSLOG_SyslogProtocol23Format
| /etc/rsyslog.d/postfix.conf:$AddUnixListenSocket /var/spool/postfix/dev/log

This is especially tricky, as rsyslog's logrotate configuration also
sends the HUP:

| synpromika@example01 ~ % cat /etc/logrotate.d/rsyslog
| /var/log/syslog
| /var/log/mail.log
| /var/log/kern.log
| /var/log/auth.log
| /var/log/user.log
| /var/log/cron.log
| {
|         rotate 4
|         weekly
|         missingok
|         notifempty
|         compress
|         delaycompress
|         sharedscripts
|         postrotate
|                 /usr/lib/rsyslog/rsyslog-rotate
|         endscript
| }
| synpromika@example01 ~ % cat /usr/lib/rsyslog/rsyslog-rotate
| #!/bin/sh
|
| if [ -d /run/systemd/system ]; then
|     systemctl kill -s HUP rsyslog.service
| fi

So this causes systems to log with short hostname after fresh system
start (and also after every single rsyslog service restart), but
once logrotate kicks in it switches to logging with FQDN / full
hostname instead:

| 2023-08-18T00:00:08.311354+02:00 testsuitecrm01 systemd[1]: Reloaded apache2.service - The Apache HTTP Server.
| 2023-08-18T00:00:08.331611+02:00 testsuitecrm01.in.example.com systemd[1]: rsyslog.service: Sent signal SIGHUP to main process 550 (rsyslogd) on client request.
| 2023-08-18T00:00:08.332430+02:00 testsuitecrm01.in.example.com rsyslogd: [origin software="rsyslogd" swVersion="8.2302.0" x-pid="550" x-info=["https://www.rsyslog.com"](https://www.rsyslog.com/)] rsyslogd was HUPed
| 2023-08-18T00:00:08.553169+02:00 testsuitecrm01.in.example.com systemd[1]: logrotate.service: Deactivated successfully.
| 2023-08-18T00:00:08.553771+02:00 testsuitecrm01.in.example.com systemd[1]: Finished logrotate.service - Rotate log files.

Especially for folks with log monitoring/alerting based on FQDNs
(e.g. with central loggging or logwatch) this might result in
quite some unexpected behavior.

The issue feels similar to what was reported and supposedly fixed
with https://bugs.debian.org/cgi-bin/bugreport.cgi?bug=1022128 and
#5004, though it looks like
to be incomplete yet?

@mika
Copy link

mika commented Sep 8, 2023

FYI, I'm the original bug reporter towards Debian and am subscribed to the issue also here, let me know if I can provide any further information or alike. :)

Thanks @mbiebl for forwarding this towards upstream.

@eciii
Copy link

eciii commented Nov 10, 2023

I also run into this problem and think I know the reason why it is happening.

The function net.getLocalHostname (in runtime/net.c) is the one that actually tries to query the local FQDN of the machine. It tries to do so in two steps:

  1. It uses gethostname to query the machine's hostname. If that hostname has a dot in it it means that it is already a FQDN so that hostname is returned.
  2. If the hostname has no dot in it that means it is unqualified. In this case the function tries to query the canonical name associated to that hostname by using getaddrinfo. But there is one caveat. Before actually calling getaddrinfo the function first need to check if the user didn't disabled DNS in the configuation (by using net.EnableDNS). To do that the function needs access to the runConf object. But if the runConf object is not yet initialized then the function cannot check if the user disabled DNS, the call to getaddrinfo is skipped and the unqualified hostname is returned.

All this happens during the startup of rsyslogd (in the function initAll in tools/rsyslogd.c). In fact, early in the startup process, the function rsconf.Load (in runtime/rsconf.c) is called to load the configuration. At some point during the execution of this function the following call stack is created:

  • tellCoreConfigLoadDone
  • glblDoneLoadCnf (in runtime/glbl.c)
  • queryLocalHostname (back again in tools/rsyslogd.c)
  • net.getLocalHostname (the function that I described above).

Since at that point the runConf object is not yet initialized rsyslog is not able to query the FQDN of the host and has to continue with the unqualified name. In fact, the runConf object is initialized only later in the startup process when rsconf.Activate is called.

Since I'm not familiar with the source code of rsyslog I cannot tell how difficult would it be to fix the issue. I'd love to have more time to get familiar with the code and attempt to provide a PR but unfortunately I'm currently quite busy with other stuff. Hopefully someone else find this comment useful to fix the issue :)

@rgerhards
Copy link
Member

@eciii Thanks for the great analysis. I'll see what I can pin down.

@rgerhards rgerhards self-assigned this Nov 10, 2023
@rgerhards rgerhards added the bug label Nov 10, 2023
@rgerhards rgerhards added this to the v8.2312 milestone Nov 10, 2023
rgerhards added a commit to rgerhards/rsyslog that referenced this issue Dec 6, 2023
Even if hostname FQDN is configured, rsyslog internal messages generated
after rsyslog startup and before the first HUP will not necessarily have
FQDN but instead only the shortname of the local host. This commit
fixes the situation.

Special thanks to github user eciii for doing a great bug analysis
and helping us considerably to fix the issue.

closes rsyslog#5218
@rgerhards
Copy link
Member

I have refactored some internal processing in #5289, but as of my experience, this now fixes the issue. If anyone wants to try, please feel free to do so.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging a pull request may close this issue.

4 participants