Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

config: ensure SASL socket file is not inside a volume mount #3131

Merged
merged 5 commits into from Mar 3, 2023

Conversation

georglauterbach
Copy link
Member

Description

Change the SASL auth socket path. The SASL socket was on a path that would end up getting volume mounted (/var/mail-state) in some setups, this could cause issues. The socket is not actually state, so it doesn't need to end up there.

Fixes #3110
Superseeds #3125

Type of change

  • Bug fix (non-breaking change which fixes an issue)

Checklist:

  • My code follows the style guidelines of this project
  • I have performed a self-review of my own code
  • I have commented my code, particularly in hard-to-understand areas
  • I have made corresponding changes to the documentation (README.md or the documentation under docs/)
  • If necessary I have added tests that prove my fix is effective or that my feature works
  • New and existing unit tests pass locally with my changes

And drop redundant repetition of default value in `master.cf`.
The `user-patches.sh` was not very robust, hence I adopted the solution
@polarathene proposed by adding a dedicated file in Dovecot's
configuration for the LMTP service.
This commit also includes the removal of files as mentioned in #3118 (see
#3116 (comment)).

This is a cleanup measure. The old socket file is the primary focus of
this commit (and its assocated PR), but I figured this would be _the_
chance to incorporate the removal of the other, unneeded files in order
to not open yet another PR.
@georglauterbach
Copy link
Member Author

georglauterbach commented Feb 28, 2023

What even is this:

not ok 85 [General] system: postfix should not log to syslog in 123ms
# (from function `assert_failure' in file test/test_helper/bats-assert/src/assert_failure.bash, line 66,
#  in test file test/tests/serial/tests.bats, line 208)
#   `assert_failure' failed
#
# -- command succeeded, but it was expected to fail --
# output : Feb 28 14:51:12 mail rsyslogd: cannot create '/var/spool/postfix/dev/log': No such file or directory [v8.2102.0 try https://www.rsyslog.com/e/2176 ]
# --
#

Can somebody please tell me what's happening here? 🤣 🤣 🤣 Why, if Postfix does not log to syslog, is syslog still trying to create /var/spool/postfix/dev/log? @casperklein any idea? I mean, the reason it fails at the moment is obvious (the directory does not exist in the first place), but I don't see why that would be necessary either way? On my setup, after deleting /var/spool/postfix/dev/, Postfix (presumably?) created the directory anew. The file /var/spool/postfix/dev/log is also there, but empty, and therefore pretty useless?!

polarathene
polarathene previously approved these changes Feb 28, 2023
@polarathene
Copy link
Member

polarathene commented Feb 28, 2023

Can somebody please tell me what's happening here? Why, if Postfix does not log to syslog, is syslog still trying to create /var/spool/postfix/dev/log? @casperklein any idea? I mean, the reason it fails at the moment is obvious (the directory does not exist in the first place)

This reminds me that I forgot to look into an extra change for dropping the chroot Postfix config... it was about logging 😅

There is some Postfix docs about configuring for chroot:

Additionally, you almost certainly need to configure syslogd so that it listens on a socket inside the Postfix queue directory.
Examples of syslogd command line options that achieve this for specific systems:

Linux, OpenBSD: syslogd -a /var/spool/postfix/dev/log

So it's possibly due to syslogd being configured to setup logging there or something?


EDIT: And here it is:

$ cat /etc/rsyslog.d/postfix.conf

# Create an additional socket in postfix's chroot in order not to break
# mail logging when rsyslog is restarted.  If the directory is missing,
# rsyslog will silently skip creating the socket.
$AddUnixListenSocket /var/spool/postfix/dev/log

And here is the Postfix docs I recalled for adjusting the logs. AFAIK, we'd want to set main.cf:maillog_file to /dev/stdout instead of a log file, this way supervisor service running postfix start-fg should be capturing the logs and writing that to a file AFAIK?

This would just require the following:

# Dockerfile (after packages installed, or potentially part of postfix package install script)
RUN rm /etc/rsyslog.d/postfix.conf

# /etc/postfix/master.cf
postlog   unix-dgram n  -       n       -       1       postlogd

# /etc/postfix/main.cf
maillog_file = /dev/stdout

Should I raise a separate PR for that, or do you want to bundle those changes into this PR?

There are some limitations cited in those Postfix logging docs. Not entirely clear if syslog is still relevant there. I think we'd be okay, but another set of eyes going over those limitations might be good 👍

@georglauterbach
Copy link
Member Author

georglauterbach commented Feb 28, 2023

Should I raise a separate PR for that, or do you want to bundle those changes into this PR?

Since I can review quickly, and only 1 review is required ATM, please raise another PR :)

The removal of the syslog config file for postfix that you proposed to do in the Dockerfile, if you can also do that in the scripts, I'd do it in the scripts :)

@georglauterbach georglauterbach added the pr/waiting for other PR to get merged This PR is waiting for another / other PR(s) to get merged label Feb 28, 2023
@polarathene
Copy link
Member

polarathene commented Feb 28, 2023

Since I can review quickly, and only 1 review is required ATM, please raise another PR :)

Done, feel free to merge after approving 👍

EDIT: Oh awesome.. failing tests, should have expected that 😂 I'll look into resolving them.

@casperklein
Copy link
Member

Forgive me, I haven't read in detail the initial bug report. I was wondering if the other sockets that are also volume mounted are also problematic:

Details
./mailstate/lib-amavis/amavisd.sock
./mailstate/spool-postfix/public/cleanup
./mailstate/spool-postfix/public/qmgr
./mailstate/spool-postfix/public/showq
./mailstate/spool-postfix/public/sender-cleanup
./mailstate/spool-postfix/public/flush
./mailstate/spool-postfix/private/rewrite
./mailstate/spool-postfix/private/relay
./mailstate/spool-postfix/private/verify
./mailstate/spool-postfix/private/proxywrite
./mailstate/spool-postfix/private/maildrop
./mailstate/spool-postfix/private/anvil
./mailstate/spool-postfix/private/bsmtp
./mailstate/spool-postfix/private/tlsmgr
./mailstate/spool-postfix/private/error
./mailstate/spool-postfix/private/tlsproxy
./mailstate/spool-postfix/private/ifmail
./mailstate/spool-postfix/private/scalemail-backend
./mailstate/spool-postfix/private/defer
./mailstate/spool-postfix/private/proxymap
./mailstate/spool-postfix/private/retry
./mailstate/spool-postfix/private/mailman
./mailstate/spool-postfix/private/smtpd
./mailstate/spool-postfix/private/dnsblog
./mailstate/spool-postfix/private/auth
./mailstate/spool-postfix/private/virtual
./mailstate/spool-postfix/private/uucp
./mailstate/spool-postfix/private/local
./mailstate/spool-postfix/private/lmtp
./mailstate/spool-postfix/private/policyd-spf
./mailstate/spool-postfix/private/trace
./mailstate/spool-postfix/private/smtp
./mailstate/spool-postfix/private/bounce
./mailstate/spool-postfix/private/discard
./mailstate/spool-postfix/private/scache
./mailstate/spool-postfix/private/smtp-amavis

@polarathene
Copy link
Member

I was wondering if the other sockets that are also volume mounted are also problematic

Are all those postfix files sockets? I wasn't sure why they seemed to be empty when I saw them. All I know is they're various programs that Postfix uses to run (bulk are configured in master.cf).

I know they're added in a fresh image without a volume mounted, but I didn't investigate if they were created as part of the package install or from Postfix service being started. I think it's the latter due to this being the queue_directory location where Postfix does it thing (chroot jail or not).


amavisd.sock would probably fall into the same category if the location was an issue. I'm fairly positive that it's unlikely anyone is running a host with an old linux kernel, so the original Docker with sockets in locations using overlayfs bug should already be fixed (since July 2016, but broader deployment adoption would take a while).

AFAIK, something caused a users permissions for the socket to change from the configured 0666 to 0660, which would fail due to the docker / docker UID + GID ownership not applying to the postfix user trying to access it. The other failure case we know about seems related to rootless containers (several reports).

Correcting the user and group assigned to the socket should resolve the main issue. The new socket location is likely not necessary, just following the same workaround that was used for supervisor (Sep 2017, a time when the Docker bug was likely on systems with old enough kernels).

@casperklein
Copy link
Member

Are all those postfix files sockets? I wasn't sure why they seemed to be empty when I saw them.

Yes (find -type s).

in some setups, this could cause issues

In detail, Podman rootless? Not going to block this PR, just curious about this, because afaik we didn't include patches for "not supported" stuff in the past?

@polarathene
Copy link
Member

polarathene commented Mar 3, 2023

EDIT: Whoops, forgot to read the test content that it failed with 😝

  • It's just rsyslog config not able to create a socket due to missing dev/ directory 🤦‍♂️
  • I have confirmed that Postfix without the chroot jail defaults to /dev/log, so everything is actually working fine 👍

Revised the PR to handle that.


Alright, I have since learned that /dev/log is created by rsyslog (or any other log service that emits syslog format?).

Notes on logging in container
  • If we replaced rsyslog in future, such as with Vector, it just needs to create a socket at /dev/log to receive logs from various processes emitting them there, and we're good. I've verified this works with Vector.
  • The failing test was implemented back in Sep 2016 to prevent duplicate content on disk.
  • Their solution to prevent it was to opt-out of logs of a given facilitiy (such as mail) with the .none priority, after the initial grab everything (*.*), which prevents those logs from going to /var/log/syslog.
  • rsyslog (at least in Debian, not the original Ubuntu base image) was configured (via postfix package install) to create a separate socket to listen to, which Postfix used in chroot jail to receive logs from (as it could not send to /dev/log due to the jail). Presumably the Ubuntu base image may have been different, since the test was specifically to check for Postfix logs, so no chroot jail back then perhaps (which has been the default for Postfix for a while, but Debian package defaults to chroot config).

Presently this is what /etc/rsyslog.conf has configured for this log destination:

*.*;mail.none;mail.error;auth,authpriv.none             -/var/log/syslog

NOTE: Github Actions service was acting up, I went to inspect the error logs, and got an error about the workflow run logs not being available anymore, so re-ran the test, and now it's only run the lint test instead of the test suite, hence the PR status of "All checks passed" 😅 (hence the tests will still fail)

This still seems to be having issues, so will need to wait to re-run tests again.

@polarathene
Copy link
Member

In detail, Podman rootless? Not going to block this PR, just curious about this

There is no confirmation yet that /dev/shm works any better for those users. The choice to move the socket there is due to:

  • Avoiding creation in /var/mail-state since these have no relevance to persist, only relevant at runtime.
    • Potentially additional complications from volume mounting (storage driver, host platform, container runtime, SELinux).
  • /dev/shm chosen only because we currently use it for the supervisor socket (but this workaround is probably no longer relevant).

Socket files could instead live at /run / /var/run or similar appropriate locations, in which case we probably should move the Supervisor one too.

The reports from users experiencing issues related to sockets AFAIK is specific to this Dovecot created auth socket and a lack of permissions for Postfix (and Dovecot in some cases IIRC) from accessing it. It's highly likely it was just related to bad config on our end with the ownership.

@georglauterbach
Copy link
Member Author

So what do we want to do now? (in one sentence please). Move all sockets to /run/? That's fine with me. How do we move the other socket files (that @casperklein mentioned) too?

@polarathene
Copy link
Member

So what do we want to do now? (in one sentence please). Move all sockets to /run/?

No action, leave as-is with /dev/shm for now.

How do we move the other socket files (that @casperklein mentioned) too?

This PR does not need to worry about that.

  • Amavis socket could be part of the Amavis config in it's /etc config folder, or possibly a default with a setting that could change it.
  • The others are for Postfix programs. Presumably they shouldn't be persisted, but these will be created at the queue_directory setting, where other relevant state is.

@casperklein
Copy link
Member

How do we move the other socket files (that @casperklein mentioned) too?

I have no idea at the moment. Just mentioned it, while I stumbled upon it.

Dockerfile Outdated Show resolved Hide resolved
@casperklein
Copy link
Member

Move all sockets to /run/?

I would stick with "standards", so yes. I am not sure if switching to using a tmpfs could not introduce other issues in the future 🤷

However, there is no urgent need. We can do it later.

Co-authored-by: Casper <casperklein@users.noreply.github.com>
@georglauterbach georglauterbach removed the pr/waiting for other PR to get merged This PR is waiting for another / other PR(s) to get merged label Mar 3, 2023
@georglauterbach georglauterbach enabled auto-merge (squash) March 3, 2023 22:29
@georglauterbach
Copy link
Member Author

Auto-merge is on, just FYI :)

@georglauterbach georglauterbach merged commit f0edcc2 into master Mar 3, 2023
@georglauterbach georglauterbach deleted the config/sasl-socket-file branch March 3, 2023 22:42
@polarathene
Copy link
Member

polarathene commented Mar 3, 2023

I have no idea at the moment.

AFAIK, they need to reside in /var/spool/postfix (the main.cf:queue_directory location).

I'm not sure why the socket files are created in the public/ + private/ dirs there if not using a chroot jail, I haven't found a way to configure that differently. Everything else AFAIK is state that should persist.

We could handle it as a special-case I guess, here's two approaches to symlink subdirs and avoid some.


I would stick with "standards", so yes.

v13 could look at changing that to /run/dms/socket-name (overlayFS) and moving any other sockets like Amavis there.

I am not sure if switching to using a tmpfs could not introduce other issues in the future

df -h

Filesystem           Size  Used Avail Use% Mounted on
overlay              120G  115G  5.1G  96% /
tmpfs                 64M     0   64M   0% /dev
shm                   64M     0   64M   0% /dev/shm
/dev/mapper/ventoy2  120G  115G  5.1G  96% /etc/hosts
tmpfs                 16G     0   16G   0% /proc/asound
tmpfs                 16G     0   16G   0% /proc/acpi
tmpfs                 16G     0   16G   0% /proc/scsi
tmpfs                 16G     0   16G   0% /sys/firmware
  • /etc/hosts is from host OS, so whatever filesystem that belongs to on the host will be displayed for it in the container (same with Size of my host partition).
  • These are the only tmpfs locations available, unless mounting a tmpfs storage driver volume.
  • All I know is we've had supervisor socket moved to /dev/shm and that doesn't appear to be causing any problems.

@casperklein
Copy link
Member

v13 could look at changing that to /run/dms/socket-name (overlayFS) and moving any other sockets like Amavis there.

👍

All I know is we've had supervisor socket moved to /dev/shm and that doesn't appear to be causing any problems.

Sure. I once had a problem with /dev/shm, but can't remember exactly atm what that was 😞

@casperklein
Copy link
Member

My memory returned 😆 Back then I could not execute files in /dev/shm. But I later figured out, that it wasn't a tmpfs issue, but how it was mounted (with the noexec option):

tmpfs on /dev/shm type tmpfs (rw,nosuid,nodev,noexec,relatime,size=3221180k)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

[BUG] cannot send email, private/auth failed: permission denied
3 participants