Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Socketmap Permission denied / Postfix chroot problem? #141

Closed
onlime opened this issue Feb 26, 2023 · 13 comments
Closed

Socketmap Permission denied / Postfix chroot problem? #141

onlime opened this issue Feb 26, 2023 · 13 comments

Comments

@onlime
Copy link

onlime commented Feb 26, 2023

I am trying to get PostSRSd 2.0.2 running on a Debian Bullseye host with Postfix 3.5.17, but can't get it working (PostSRSd 1.12 working fine) with the following Postfix setup (see #76 - Only rewrite sender when forwarding and dynamically exclude local domains):

### main.cf
default_transport = smtp:127.0.0.1:10027
recipient_canonical_maps = socketmap:unix:srs:reverse
recipient_canonical_classes= envelope_recipient,header_recipient


### master.cf
cleanup-srs  unix  n    -       -       -       0       cleanup
  -o syslog_name=postfix-srs
  -o sender_canonical_maps=mysql:/etc/postfix/mysql-virtual_local_email.cf,socketmap:unix:srs:forward
  -o sender_canonical_classes=envelope_sender

127.0.0.1:10027  inet  n  -     -       -       -       smtpd
  -o syslog_name=postfix-srs
  -o smtpd_milters=
  -o cleanup_service_name=cleanup-srs
  -o smtpd_tls_security_level=none
  -o content_filter=smtp:
  -o smtpd_recipient_restrictions=permit_mynetworks,reject

### mysql-virtual_local_email.cf
hosts = 127.0.0.1
user = mailmap
password = ***********
dbname = mailsync
query = SELECT '%s' FROM `maildomains` WHERE `domain` = '%d' AND `active` = '1'

Installed PostSRSd 2.0.2 (over previous 1.12) from sources like this:

$ git clone https://github.com/roehling/postsrsd.git
$ cd postsrsd
$ git checkout 2.0.2
$ mkdir _build && cd _build
$ cmake .. -DCMAKE_BUILD_TYPE=Release -DCMAKE_INSTALL_PREFIX=/usr/local
$ make -j
$ make install

$ cp postsrsd.conf /usr/local/etc/postsrsd.conf
# + some minor tuning (srs-domain)

$ systemctl stop postsrsd
$ make install
$ systemctl daemon-reload
$ systemctl start postsrsd

Previous PostSRSd 1.12 config vars (merged from /usr/local/share/postsrsd/postsrsd-systemd-launcher and /etc/default/postsrsd):

SRS_FORWARD_PORT=10001
SRS_REVERSE_PORT=10002
SRS_DOMAIN="`postconf -h mydomain || true`"
SRS_SECRET="/etc/postsrsd.secret"
SRS_SEPARATOR="="
SRS_HASHLENGTH=4
SRS_HASHMIN=4
SRS_EXTRA_OPTIONS=
RUN_AS=nobody
SRS_LISTEN_ADDR=127.0.0.1
CHROOT="/usr/local/lib/postsrsd"
SRS_EXCLUDE_DOMAINS=

New PostSRSd 2.0.2 /usr/local/etc/postsrsd.conf (copied from _build/postsrsd.conf):

domains = {}
#domains-file =
srs-domain = example.com
socketmap = unix:/var/spool/postfix/srs
keep-alive = 30
#milter =
original-envelope = embedded
#envelope-database = "sqlite:./senders.db"
secrets-file = "/usr/local/etc/postsrsd.secret"
separator = "="
hash-length = 4
hash-minimum = 4
always-rewrite = off
unprivileged-user = "nobody"
chroot-dir = "/usr/local/var/lib/postsrsd"
syslog = off

I am getting a cannot chdir to home directory of user nobody warning:

$ systemctl status postsrsd
● postsrsd.service - Sender Rewriting Scheme daemon for Postfix
     Loaded: loaded (/etc/systemd/system/postsrsd.service; enabled; vendor preset: enabled)
     Active: active (running) since Sun 2023-02-26 16:34:16 CET; 3s ago
   Main PID: 2098669 (postsrsd)
      Tasks: 1 (limit: 618968)
     Memory: 204.0K
        CPU: 2ms
     CGroup: /system.slice/postsrsd.service
             └─2098669 /usr/local/sbin/postsrsd -C /usr/local/etc/postsrsd.conf

Feb 26 16:34:16 mail systemd[1]: Started Sender Rewriting Scheme daemon for Postfix.
Feb 26 16:34:16 mail postsrsd[2098669]: postsrsd: warn: cannot chdir to home directory of user nobody: No such file or directory

and in mail.log, I am getting this:

$ tail -f /var/log/mail.log | grep srs
Feb 26 16:35:24 mail postfix/smtpd[2099358]: warning: connect to srs: Permission denied
Feb 26 16:35:24 mail postfix/smtpd[2099358]: warning: table socketmap:unix:srs:reverse lookup error: Permission denied
Feb 26 16:35:24 mail postfix/smtpd[2099358]: warning: socketmap:unix:srs:reverse lookup error for "foo@example.org"

So this looks like some chroot issue. Both /usr/local/lib/postsrsd (from previous PostSRSd 1.12 install) and new /usr/local/var/lib/postsrsd directories exist. I would like to stick with the new recommended pathes, so I kept the ones from the compiled postsrsd.conf.

Can you please give me some advice what is configured wrong here? I know, this doesn't seem to be an issue with PostSRSd, rather a misunderstanding from my side. But it might help others to get it running on a Debian with chrooted Postfix.

@roehling
Copy link
Owner

The warning about the home directory of nobody is harmless and can be ignored. Can you check the permissions assigned to the socket file /var/spool/postfix/srs? It is supposed to be 0666, i.e. read-write permissions for everyone.

@onlime
Copy link
Author

onlime commented Feb 27, 2023

Can you check the permissions assigned to the socket file /var/spool/postfix/srs? It is supposed to be 0666, i.e. read-write permissions for everyone.

Thanks for looking into this. Looks like it has wrong permissions 0755:

srwxr-xr-x+ 1 root root 0 Sun 2023-02-26 16:34:16 /var/spool/postfix/srs

Should I just manually change the permissions or will this file get recreated on every Systemd service restart? Any idea why this happened in the build/install process? maybe because of my umask under root, which should be the default on a Debian system:

$ umask
0022

@onlime
Copy link
Author

onlime commented Feb 27, 2023

I confirm this working with 0666 permissions. But after every Systemd postsrsd.service restart, I need to manually fix permissions again:

$ chmod a+w /var/spool/postfix/srs

As a workaround I tried to put this in a Systemd override, /etc/systemd/system/postsrsd.service.d/override.conf:

[Service]
ExecStartPost=/bin/chmod a+w /var/spool/postfix/srs

but that won't work - that post command seems to get executed but probably to early, so perms are still wrong (0755)

@roehling Can you please fix this upstream by explicitly assigning correct permissions to that socket?

@onlime
Copy link
Author

onlime commented Feb 28, 2023

I had to give up on PostSRSd 2.0.2 again, as again it caused major problems on my production host.

Even with chmod 0666 /var/spool/postfix/srs, after some hours of running I got those errors again:

Feb 28 05:09:32 mail postfix/smtpd[3310311]: warning: connect to srs: Resource temporarily unavailable
Feb 28 05:09:32 mail postfix/smtpd[3310311]: warning: table socketmap:unix:srs:reverse lookup error: Resource temporarily unavailable
Feb 28 05:09:32 mail postfix/smtpd[3310311]: warning: socketmap:unix:srs:reverse lookup error for "foo@example.com"
Feb 28 05:09:32 mail postfix/smtpd[3310311]: NOQUEUE: reject: RCPT from mx1.example.com[46.231.xxx.xxx]: 451 4.3.0 <foo@example.com>: Temporary lookup failure; from=<bar@example.com> to=<foo@example.com> proto=ESMTP helo=<mx1.example.com>

only a systemctl restart postsrsd and resetting the permissions /var/spool/postfix/srs again helped, but unsure for how long, so I had to revert back to PostSRSd 1.12

@roehling
Copy link
Owner

I'm not sure what is going on with your system. The socket file is newly created on startup, but PostSRSd sets the umask first, so the permissions should include write permissions for everyone.

One thing I did notice is

srwxr-xr-x+ 1 root root 0 Sun 2023-02-26 16:34:16 /var/spool/postfix/srs

The + sign in the output indicates that you have an ACL applied, which begs the question whether there is also an ACL on the folder /var/spool/postfix itself that affects the permissions of the socket.

If you cannot figure what messes with your permissions, a possible workaround would be using TCP sockets instead of Unix sockets, i.e. set socketmap = inet:localhost:10023 in postsrsd.conf and configure Postfix with socketmap:localhost:10023:forward and socketmap:localhost:10023:reverse.

@onlime
Copy link
Author

onlime commented Feb 28, 2023

The + sign in the output indicates that you have an ACL applied, which begs the question whether there is also an ACL on the folder /var/spool/postfix itself that affects the permissions of the socket.

Thanks for pointing me into the right direction! For years now I had those ACLs set for Telegraf monitoring Postfix queue directories, a legacy solution which was replaced by Zabbix/Grafana monitoring long time ago:

$ setfacl -m g:telegraf:rX /var/spool/postfix
$ setfacl -Rm g:telegraf:rX /var/spool/postfix/{active,hold,incoming,deferred,maildrop}
$ setfacl -Rdm g:telegraf:rX /var/spool/postfix/{,active,hold,incoming,deferred,maildrop}

I thought this was just about giving extra permissions for group telegraf and having applied this as default ACL to the whole /var/spool/postfix. But you might be right and this was the cause for my problems. I have now removed the ACLs recursively like this:

$ setfacl -Rbk /var/spool/postfix

So far, PostSRSd 2.0.2 runs fine and on a restart generates the socket with 0777 perms:

srwxrwxrwx 1 root root 0 Tue 2023-02-28 14:29:31 /var/spool/postfix/srs

will keep you posted if this keeps running stable. Because last night, it suddenly stopped working after 00:58AM, even though permissions were still correct.

@onlime
Copy link
Author

onlime commented Feb 28, 2023

Still having the same problem, after 1-2hrs, PostSRSd 2.0.2 stops working and I get warning: connect to srs: Resource temporarily unavailable errors, even though /var/spool/postfix/srs permissions stayed untouched.

I have now migrated to socketmap = inet:localhost:10003 (postsrsd.conf) and socketmap:inet:localhost:10003:forward/ socketmap:inet:localhost:10003:reverse (Postfix). Hopefully, this is going to work.

@onlime
Copy link
Author

onlime commented Feb 28, 2023

Bad luck. Even with TCP socketmaps I am experiencing the same issue, PostSRSd service only works for ~80mins and then reports time limit exceeded:

Feb 28 17:52:07 mail postfix-srs/cleanup[3789293]: warning: table socketmap:inet:localhost:10003:forward lookup error: time limit exceeded
Feb 28 17:52:07 mail postfix-srs/cleanup[3789293]: warning: socketmap:inet:localhost:10003:forward lookup error for """"
Feb 28 17:52:07 mail postfix-srs/cleanup[3789293]: warning: B313866FFB2: sender_canonical_maps map lookup problem for "" -- message not accepted, try again later

Feb 28 17:52:12 mail postfix-srs/smtpd[3793281]: warning: table socketmap:inet:localhost:10003:reverse lookup error: time limit exceeded
Feb 28 17:52:12 mail postfix-srs/smtpd[3793281]: warning: socketmap:inet:localhost:10003:reverse lookup error for "foo.bar@hotmail.com"
Feb 28 17:52:12 mail postfix-srs/smtpd[3793281]: NOQUEUE: reject: RCPT from localhost.localdomain[127.0.0.1]: 451 4.3.0 <foo.bar@hotmail.com>: Temporary lookup failure; from=<> to=<foo.bar@hotmail.com> proto=ESMTP helo=<mail.example.com>
Feb 28 17:52:12 mail postfix-srs/smtpd[3793281]: disconnect from localhost.localdomain[127.0.0.1] ehlo=1 mail=1 rcpt=0/1 data=0/1 rset=1 quit=1 commands=4/6

Restarting the service works, but probably just for another 80mins.
So I have again reverted to PostSRSd 1.12

@roehling
Copy link
Owner

roehling commented Mar 1, 2023

Do you have any logging output from PostSRSd that might help tracking down the issue?

@onlime
Copy link
Author

onlime commented Mar 1, 2023

Do you have any logging output from PostSRSd that might help tracking down the issue?

Currently not, as I only noticed the last successful not reverse-rewritten: Not an SRS address. message in syslog and after lookup error: time limit exceeded was popping up, nothing else was written to syslog from postsrsd.
Can you give me a hint how to raise log verbosity? Simply enabling syslog = on in postsrsd.conf?

It looks like this is rather a Postfix issue with some kind of timeout / Postfix not closing socketmap connections and running out of some limit?? But the ~80mins seem quite weird - it was a similar timespan after it stopped working also when using /var/spool/postfix/srs socket instead of TCP connections.

Or was it maybe that weird lookup of "" (instead of a real email) that caused PostSRSd to crash somehow? That was the first lookup that caused problems and after that it stopped working:

Feb 28 17:52:07 mail postfix-srs/cleanup[3789293]: warning: table socketmap:inet:localhost:10003:forward lookup error: time limit exceeded
Feb 28 17:52:07 mail postfix-srs/cleanup[3789293]: warning: socketmap:inet:localhost:10003:forward lookup error for """"
Feb 28 17:52:07 mail postfix-srs/cleanup[3789293]: warning: B313866FFB2: sender_canonical_maps map lookup problem for "" -- message not accepted, try again later

@roehling
Copy link
Owner

roehling commented Mar 1, 2023

I'll look into it and see if I can reproduce the problem. Thank you for now.

roehling added a commit that referenced this issue Mar 3, 2023
This is a follow-up for #141. Now that the main process properly closes
the connection socket, the connection is properly terminated if the
child exits.
@onlime
Copy link
Author

onlime commented Mar 3, 2023

@roehling wonderful! Thanks for the 2.0.3 release. Have it already running in production and will give you a feedback soon.

@onlime
Copy link
Author

onlime commented Mar 3, 2023

I confirm this working. At least for 3h10min now, without any issues. Now running it with default socketmap = unix:/var/spool/postfix/srs
@roehling Thanks a lot for this great product. Pls tell me where I can drop a small donation / Github sponsors?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants