New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
scripts: remove PostSRSD wrapper #3128
Conversation
The setup is now completely done during _actual_ setup phase. The wrapper did not even catch signals (SIGINT, etc.), which I think is strange. I also added all the ENVs the wrapper relied on (which previously could have been unset) to the variables script.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for tackling this! No more wrappers 🎉
@@ -148,7 +148,7 @@ autostart=false | |||
autorestart=true | |||
stdout_logfile=/var/log/supervisor/%(program_name)s.log | |||
stderr_logfile=/var/log/supervisor/%(program_name)s.log | |||
command=/usr/local/bin/postsrsd-wrapper.sh | |||
command=/etc/init.d/postsrsd start |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I am pretty sure this doesn't work, as the command exits and doesn't run in foreground. How would supervisor handle restarting that service?
Edit:
Running DMS with
ENABLE_SRS=1
SUPERVISOR_LOGLEVEL=info
we can see an endless loop of supervisor starting postsrsd:
Details
mailserver | 2023-03-03 01:02:52,127 INFO success: postsrsd entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
mailserver | 2023-03-03 01:02:52,128 INFO exited: postsrsd (exit status 0; expected)
mailserver | 2023-03-03 01:02:52,129 INFO spawned: 'postsrsd' with pid 972
mailserver | 2023-03-03 01:02:52,137 INFO success: postsrsd entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
mailserver | 2023-03-03 01:02:52,139 INFO exited: postsrsd (exit status 0; expected)
mailserver | 2023-03-03 01:02:52,139 INFO spawned: 'postsrsd' with pid 977
mailserver | 2023-03-03 01:02:52,148 INFO success: postsrsd entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
mailserver | 2023-03-03 01:02:52,149 INFO exited: postsrsd (exit status 0; expected)
mailserver | 2023-03-03 01:02:52,150 INFO spawned: 'postsrsd' with pid 982
mailserver | 2023-03-03 01:02:52,158 INFO success: postsrsd entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
mailserver | 2023-03-03 01:02:52,159 INFO exited: postsrsd (exit status 0; expected)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we can see an endless loop of supervisor starting postsrsd:
Could we better detect this issue in tests?
I touched on similar concerns observed in the past with supervisor. In addition to the other improvement we have an existing issue for, I'll add mention of this problem too.
Good catch :)
I just noticed, that the old wrapper also just called So this must be broken for a long time.. 😆 Afaik |
You might want to check with a release prior to
I tried looking into that and that seems to be the case. We could probably open a feature request. The dev is fairly responsive :) They've also released v2, not sure if that'll be part of the upcoming Debian 12 release later this year.
This is an issue specific to Our PostSRSd tests only bother to check config expectations it seems. I also looked over other tests that use |
It looks like we can run the Although with Worth noting, v2 of PostSRSd uses a real config file but would presently require building from source AFAIK. NotesFrom v2 release of PostSRSd:
For reference (with some additional insights in each collapsed section): /etc/init.d/postsrsd (upstream init script)Same content as source (after applying interpolation during install) (last updated Dec 2020 in #! /bin/sh
#
# postsrsd
# start/stop the postsrsd daemon for Postfix
#
### BEGIN INIT INFO
# Provides: postsrsd
# Required-Start: $syslog $network $remote_fs
# Required-Stop: $syslog $network $remote_fs
# Default-Start: 2 3 4 5
# Default-Stop: 0 1 6
# Short-Description: Start/stop the postsrsd daemon
# Description: postsrsd offers Sender Rewriting Scheme support for Postfix
### END INIT INFO
set -e
PATH=/sbin:/bin:/usr/sbin:/usr/bin
DAEMON=/usr/sbin/postsrsd
NAME=postsrsd
DESC="Postfix Sender Rewriting Scheme daemon"
PIDFILE=/var/run/$NAME.pid
SCRIPTNAME=/etc/init.d/$NAME
# Gracefully exit if the package has been removed.
test -x $DAEMON || exit 0
. /lib/lsb/init-functions
# Default configuration
SRS_FORWARD_PORT=10001
SRS_REVERSE_PORT=10002
SRS_DOMAIN=`postconf -h mydomain || true`
SRS_SECRET=/etc/postsrsd.secret
SRS_SEPARATOR==
SRS_HASHLENGTH=4
SRS_HASHMIN=4
SRS_EXTRA_OPTIONS=
RUN_AS=postsrsd
CHROOT=/var/lib/postsrsd
SRS_LISTEN_ADDR=127.0.0.1
SRS_EXCLUDE_DOMAINS=
# Read config file if it is present
if [ -r /etc/default/$NAME ]
then
. /etc/default/$NAME
fi
test -r "$SRS_SECRET" -a -n "$SRS_DOMAIN" || exit 1
ret=0
case "$1" in
start)
log_daemon_msg "Starting $DESC" "$NAME"
if start-stop-daemon --start --oknodo --quiet \
--pidfile $PIDFILE \
--name $NAME \
--startas $DAEMON \
-- $SRS_EXTRA_OPTIONS -f "$SRS_FORWARD_PORT" -r "$SRS_REVERSE_PORT" \
-d "$SRS_DOMAIN" -s "$SRS_SECRET" -a "$SRS_SEPARATOR" \
-n "$SRS_HASHLENGTH" -N "$SRS_HASHMIN" -u "$RUN_AS" -p "$PIDFILE" \
-c "$CHROOT" -l "$SRS_LISTEN_ADDR" -D -X"$SRS_EXCLUDE_DOMAINS"
then
log_end_msg 0
else
ret=$?
log_end_msg 1
fi
;;
stop)
log_daemon_msg "Stopping $DESC" "$NAME"
if start-stop-daemon --stop --oknodo --quiet \
--pidfile $PIDFILE --exec $DAEMON --name $NAME
then
log_end_msg 0
else
ret=$?
log_end_msg 1
fi
rm -f $PIDFILE
;;
reload|force-reload|restart)
$0 stop
$0 start
ret=$?
;;
status)
log_daemon_msg "postsrsd is running"
if [ -s $PIDFILE ]; then
PID=`cat $PIDFILE`
if kill -0 "$PID" 2>/dev/null; then
log_end_msg 0
else
log_end_msg 1
fi
else
log_end_msg 1
fi
;;
*)
echo "Usage: $SCRIPTNAME {start|stop|restart|reload|force-reload|status}" >&2
exit 1
;;
esac
exit $ret `postsrsd -h` (no longer valid in 2.x)
/etc/default/postsrsd (out of sync with upstream)We have been supplying our own copy of the PostSRSd config: Line 104 in 9e2f964
Out of sync with upstream config. Debian
Thus we're not making any changes relevant to DMS during the build and could just use the updated config from the package. Despite being out of sync here, the |
@polarathene could you open an issue about this, possibly describing what we need to do to adopt postsrsd v2? |
Yeah I'll raise some TODO issues tomorrow. Ran out of steam for the day already 😅
We're probably not going to get v2 for a while unless we build it ourselves in a separate stage 🤷♂️ (or get the project to use Github workflow to build and distribute Versions:
Sadly the Aug 2022 release with the 1.12 release that resolves this issue without the extra config we use in tests that have On the plus side, no need to worry about a v2 update arriving with breaking changes to adjust for. |
Yes. I can look into making a PR tomorrow for the two sections below if you like. PostSRSdSo to wrap this up, the init script we're using presently calls But also there is an We can grab them all by filtering out comments and blank lines, then call with env $(grep -vE "^(#.*|\s*)$" /etc/default/postsrsd) postsrsd -e As mentioned before, we don't need our own copy of
PostgreySimilar could be done for Postgrey. #! /bin/bash
source /etc/dms-settings
postgrey \
--inet=127.0.0.1:10023 \
--syslog-facility=mail \
--delay="${POSTGREY_DELAY}" \
--max-age="${POSTGREY_MAX_AGE}" \
--auto-whitelist-clients="${POSTGREY_AUTO_WHITELIST_CLIENTS}" \
--greylist-text="${POSTGREY_TEXT}" Then we don't need the extra ENV in I don't think that requires a wrapper, when the command stops (eg if it
postgrey -h
|
Nice analysis! I'd probably wait though, this does not have a high priority, right? Releasing v12.0.0 and then relaxing the whole situation is better I think. |
The whole point of looking into the above (postsrsd at least) was because of a bug from this PR that @casperklein noticed. Presently I am not able to try verifying the failure locally via a new build atm, but running |
Oh, didn't notice! Yeah, this should be resolved before we publish v12.0.0. 🤣 So running |
I believe so, but like I said I can't verify that right now. I should be able to within 24-48 hours. If you can reproduce the same failure that Casper demonstrated earlier, then apply these changes and the failure no longer reproduces, that'd be a fix :) If I could rely on CI tests for this, I'd happily create a PR, but we don't have a test that catches this (and it's been an issue I've seen with other services during the v12 cycle). |
I did a quick test and it does not work for me. |
I am able to build and run locally again 😅 I tried to reproduce the failure you're experiencing, but I'm not seeing it? Something is different between our systems? $ docker run --rm -it --hostname example.test --env SMTP_ONLY=1 --env SUPERVISOR_LOGLEVEL=info --env ENABLE_SRS=1 mailserver-testing:ci
Restarted supervisord
2023-03-05 04:59:57,227 INFO Included extra file "/etc/supervisor/conf.d/saslauth.conf" during parsing
2023-03-05 04:59:57,227 INFO Included extra file "/etc/supervisor/conf.d/supervisor-app.conf" during parsing
2023-03-05 04:59:57,227 INFO Set uid to user 0 succeeded
2023-03-05 04:59:57,228 INFO RPC interface 'supervisor' initialized
2023-03-05 04:59:57,228 INFO supervisord started with pid 7
2023-03-05 04:59:58,230 INFO spawned: 'mailserver' with pid 30
[ INF ] Welcome to docker-mailserver 11.3.1
2023-03-05 04:59:58,266 INFO success: mailserver entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
[ INF ] Checking configuration
[ INF ] Configuring mail server
[ WARNING ] !! INSECURE !! SSL configured with plain text access - DO NOT USE FOR PRODUCTION DEPLOYMENT
[ INF ] Starting daemons
2023-03-05 04:59:58,968 INFO spawned: 'postsrsd' with pid 310
2023-03-05 04:59:58,969 INFO success: postsrsd entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-03-05 04:59:59,114 INFO spawned: 'cron' with pid 317
2023-03-05 04:59:59,115 INFO success: cron entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-03-05 04:59:59,266 INFO spawned: 'rsyslog' with pid 321
2023-03-05 04:59:59,267 INFO success: rsyslog entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-03-05 04:59:59,414 INFO spawned: 'update-check' with pid 327
2023-03-05 04:59:59,415 INFO success: update-check entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-03-05 04:59:59,713 INFO spawned: 'opendkim' with pid 355
2023-03-05 04:59:59,714 INFO success: opendkim entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-03-05 04:59:59,866 INFO spawned: 'opendmarc' with pid 371
2023-03-05 04:59:59,867 INFO success: opendmarc entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-03-05 05:00:00,016 INFO spawned: 'postfix' with pid 383
2023-03-05 05:00:00,016 INFO success: postfix entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-03-05 05:00:00,173 INFO spawned: 'amavis' with pid 419
2023-03-05 05:00:00,174 INFO success: amavis entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
2023-03-05 05:00:00,334 INFO spawned: 'changedetector' with pid 480
2023-03-05 05:00:00,334 INFO success: changedetector entered RUNNING state, process has stayed up for > than 0 seconds (startsecs)
[ INF ] example.test is up and running
Mar 5 05:00:00 example amavis[419]: starting. /usr/sbin/amavisd-new at example.test amavisd-new-2.11.1 (20181009), Unicode aware, LC_CTYPE="C.UTF-8"
Mar 5 05:00:00 example amavis[419]: perl=5.032001, user=, EUID: 109 (109); group=, EGID: 111 111 (111 111)
Mar 5 05:00:00 example amavis[419]: Net::Server: Group Not Defined. Defaulting to EGID '111 111'
Mar 5 05:00:00 example amavis[419]: Net::Server: User Not Defined. Defaulting to EUID '109'
Mar 5 05:00:00 example amavis[419]: No ext program for .zoo, tried: zoo
Mar 5 05:00:00 example amavis[419]: No ext program for .doc, tried: ripole
Mar 5 05:00:00 example amavis[419]: No decoder for .F
Mar 5 05:00:00 example amavis[419]: No decoder for .doc
Mar 5 05:00:00 example amavis[419]: No decoder for .zoo That is with no changes discussed here. Just git clone and |
Very strange. On debian 11: Details
|
I am able to replicate @casperklein's behaviour on Ubuntu 22.04! |
Could we just set |
This would fix the infinite loop, but would not bring the functionality to proper stop/restart this service. |
As mentioned in #3160 I realized that this was due to my container running with Running with a sane limit via `--ulimit "nofile=1024" reproduces the failure. Adding this as the
This is with command: docker run --rm -it --hostname example.test --env SMTP_ONLY=1 --env SUPERVISOR_LOGLEVEL=info --env ENABLE_SRS=1 --ulimit "nofile=$(ulimit -Sn):$(ulimit -Hn)" mailserver-testing:ci It doesn't exit since it's running in the foreground (unlike the |
Description
The setup is now completely done during actual setup phase. The wrapper did not even catch signals (SIGINT, etc.), which I think is strange.
I also added all the ENVs the wrapper relied on (which previously could have been unset) to the variables script.
Type of change
Checklist:
docs/
)