postfix/postdrop didn't work anymore over NFS after update to 0.7.x #6949

Closed
ggzengel opened this issue Dec 13, 2017 · 10 comments

Comments

@ggzengel
Contributor

ggzengel commented Dec 13, 2017

This issue exists only when NFS and ZFS are combined.
I suspect a compatibility problem between sticky bits and NFS group handling in 0.7.x, but I don't know how to track it down.
I don't use "--manage-gids" on the NFS server.

The problem exists with kernel 3.16 and 4.9.

After upgrading three ZFS systems from 0.6.11 to 0.7.3, none of the diskless VMs could send mail through Postfix any more.
On the third system I first updated every package except ZFS, rebooted, and verified that mail still worked.
Only after that did I update ZFS, and the problem appeared.

Sending mail as a non-root user with /usr/bin/mail (package bsd-mail) invokes postdrop, which tries to write to /var/spool/postfix/maildrop and freezes.

The command is designed to run with set-group ID privileges, so that it can write to the maildrop queue directory and so that it can connect to Postfix daemon processes.

$ ls -la /usr/sbin/postdrop
-r-xr-sr-x 1 root postdrop 14456 Sep 27 04:56 /usr/sbin/postdrop

$ ls -la /var/spool/postfix/maildrop/
drwx-wx--T  2 postfix postdrop  3 Dec 12 22:58 .
drwxr-xr-x 20 root    root     20 Nov 21 02:09 ..
-rw-r--r--  1 nagios  postdrop  0 Dec 12 22:58 887007.1468
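
The `drwx-wx--T` mode in the listing above corresponds to octal 1730: the sticky bit plus rwx for the owner and write/search for the group. As a minimal sketch, the same mode can be reproduced on a throwaway directory without touching the real spool. The function name `show_maildrop_mode` is made up for illustration, and GNU coreutils `stat` is assumed.

```shell
#!/bin/sh
# Sketch: recreate the maildrop directory mode on a scratch directory.
# Assumes GNU coreutils `stat`; nothing here touches the Postfix spool.
# `show_maildrop_mode` is a hypothetical helper name.
show_maildrop_mode() {
    scratch=$(mktemp -d) || return 1
    # 1 = sticky bit, 7 = rwx for owner, 3 = -wx for group, 0 = nothing for others
    chmod 1730 "$scratch"
    # Print the octal mode and the symbolic mode string.
    stat -c '%a %A' "$scratch"
    rmdir "$scratch"
}

show_maildrop_mode
```

The symbolic string printed here should match the one `ls -la` shows for the maildrop directory, which makes it easy to compare a local filesystem against the NFS mount.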

For testing I use the following command, which sends mail successfully on 0.6.11 but not on 0.7.3.
On 0.7.3 it hangs indefinitely and I have to press CTRL-C.
As root I can send mail, so this looks like an access issue on the ZFS/NFS server.

If I replace /var/spool/postfix/maildrop with a tmpfs filesystem, I can send mail as well.

# mount -t tmpfs none /var/spool/postfix/maildrop
# chmod 1730 /var/spool/postfix/maildrop
# chown postfix:postdrop /var/spool/postfix/maildrop
# su nagios -c "mail -vv root" -s /bin/bash
Subject: Test
a
.
Cc: 
Mail Delivery Status Report will be mailed to <nagios>.

# umount /var/spool/postfix/maildrop
# su nagios -c "mail -vv root" -s /bin/bash
Subject: Test
a
.
Cc: 
^C
Session terminated, terminating shell...send-mail: warning: command "/usr/sbin/postdrop -r" exited with status 2
send-mail: fatal: nagios(108): unable to execute /usr/sbin/postdrop -r: Success
Can't send mail: sendmail process failed with error code 75
 ...terminated.

While it hangs, the process tree looks like this:

root       387  0.0  0.1  72020  6300 ?        Ss   22:53   0:00 /usr/sbin/sshd -D
root      1427  0.0  0.1  98940  6816 ?        Ss   22:58   0:00  \_ sshd: root@pts/0
root      1439  0.0  0.1  21024  4940 pts/0    Ss   22:58   0:00  |   \_ -bash
root      3522  0.0  0.0  55780  3124 pts/0    S+   23:17   0:00  |       \_ su nagios -c mail -vv root -s /bin/bash
nagios    3523  0.0  0.0  15408  2292 ?        Ss   23:17   0:00  |           \_ mail -vv root
nagios    3530  0.0  0.1  82200  6532 ?        S    23:17   0:00  |               \_ send-mail -i -t -v
nagios    3531 20.3  0.1  82192  6440 ?        R    23:17   0:04  |                   \_ /usr/sbin/postdrop -r

After each attempt there is one more zero-length file in maildrop:

# ls -la /var/spool/postfix/maildrop/
total 51
drwx-wx--T  2 postfix postdrop  6 Dec 12 23:49 .
drwxr-xr-x 20 root    root     20 Nov 21 02:09 ..
-rw-r--r--  1 nagios  postdrop  0 Dec 12 23:15 790064.3271
-rw-r--r--  1 nagios  postdrop  0 Dec 12 22:58 887007.1468
-rw-r--r--  1 nagios  postdrop  0 Dec 12 23:46 888454.6275
-rw-r--r--  1 nagios  postdrop  0 Dec 12 23:17 898887.3531

postdrop doesn't say a lot, but with NFS/ZFS it freezes and I have to use CTRL-C:

# mount -t tmpfs none /var/spool/postfix/maildrop
# chmod 1730 /var/spool/postfix/maildrop
# chown postfix:postdrop /var/spool/postfix/maildrop
# su nagios -c 'echo t | /usr/sbin/postdrop -rvvv' -s /bin/bash
queue_id1762D9DCBpostdrop: warning: stdin: unexpected EOF in data, record type 116 length 10
postdrop: fatal: uid=108: malformed input

# umount /var/spool/postfix/maildrop
# su nagios -c 'echo t | /usr/sbin/postdrop -rvvv' -s /bin/bash
^C
Session terminated, terminating shell... ...terminated.
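
To make the reproduction scriptable instead of waiting for CTRL-C, the command can be wrapped in a time limit. This is only a sketch: it assumes coreutils `timeout`, which exits with status 124 when the limit is reached, and the helper name `hangs` and the 10-second limit are arbitrary choices for illustration.

```shell
#!/bin/sh
# Sketch: report whether a command is still running after a time limit.
# Assumes coreutils `timeout` (exit status 124 means the limit was hit).
# `hangs` is a hypothetical helper name.
hangs() {
    limit=$1
    shift
    timeout "$limit" "$@" >/dev/null 2>&1
    [ "$?" -eq 124 ]
}

# Example, as root on the affected client (10 seconds is an arbitrary limit):
# if hangs 10 su nagios -c 'echo t | /usr/sbin/postdrop -r' -s /bin/bash; then
#     echo "postdrop did not finish within 10 seconds"
# fi
```

On tmpfs the wrapped command returns at once with the "malformed input" error shown above, so only the NFS/ZFS case should report a hang.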

Client (XEN + diskless)

# uname -a
Linux icinga-ex1.extern1.local 4.9.0-4-amd64 #1 SMP Debian 4.9.65-3 (2017-12-03) x86_64 GNU/Linux

# mount | grep nfs
10.190.0.101:/srv/nfs/xen/icinga-ex1.extern1.local on / type nfs (rw,relatime,vers=3,rsize=1048576,wsize=1048576,namlen=255,hard,nolock,proto=tcp,port=2049,timeo=7,retrans=3,sec=sys,local_lock=all,addr=10.190.0.101)

# cat /proc/cmdline 
boot=nfs ip=10.190.0.249:10.190.0.101:10.190.0.1:255.255.255.0:icinga-ex1.extern1.local:eth0:off nfsroot=10.190.0.101:/srv/nfs/xen/icinga-ex1.extern1.local,rw,hard,intr console=/dev/hvc0

Server

# dkms status
spl, 0.7.3, 4.9.0-0.bpo.4-amd64, x86_64: installed
zfs, 0.7.3, 4.9.0-0.bpo.4-amd64, x86_64: installed

# uname -a
Linux server-ex2 4.9.0-0.bpo.4-amd64 #1 SMP Debian 4.9.51-1~bpo8+1 (2017-10-17) x86_64 GNU/Linux

# cat /etc/exports 
/srv/nfs/ *(async,no_subtree_check,no_all_squash,rw,nohide,fsid=0)

# zfs get sharenfs zpool1/xen/icinga-ex1.extern1.local
NAME                                 PROPERTY  VALUE                                                          SOURCE
zpool1/xen/icinga-ex1.extern1.local  sharenfs  rw,async,no_subtree_check,no_root_squash,no_all_squash,nohide  inherited from zpool1/xen

# exportfs -v | grep icinga
/srv/nfs/xen/icinga.extern1.local
		<world>(rw,async,wdelay,nohide,no_root_squash,no_subtree_check,mountpoint,sec=sys,rw,no_root_squash,no_all_squash)

# cat /etc/xen/icinga-ex1.extern1.local.conf 
kernel  = '/srv/nfs/xen/icinga-ex1.extern1.local/vmlinuz'
ramdisk = '/srv/nfs/xen/icinga-ex1.extern1.local/initrd.img'
memory  = '4096'
vcpus     = '4'
name     = 'icinga-ex1.extern1.local'
hostname = 'icinga-ex1.extern1.local'
vif     = [ 'bridge=br_intern_zmt,mac=02:0A:7D:50:02:21' ]
extra      = 'boot=nfs ip=10.190.0.249:10.190.0.101:10.190.0.1:255.255.255.0:icinga-ex1.extern1.local:eth0:off nfsroot=10.190.0.101:/srv/nfs/xen/icinga-ex1.extern1.local,rw,hard,intr console=/dev/hvc0'
@behlendorf
Contributor

behlendorf commented Dec 13, 2017

@ggzengel can you try 0.7.4, which was released yesterday? It includes a fix, 1030f80, for an incorrect permission-denied error that could occur when using sticky bits over NFS.

[edit] link fixed
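
A quick way to check whether a server already runs a release containing the fix is to compare its version against 0.7.4. The sketch below assumes GNU `sort -V`; the helper name `version_ge` is hypothetical, and reading the loaded module version from /sys/module/zfs/version is an assumption about the server setup.

```shell
#!/bin/sh
# Sketch: succeed if version $1 is greater than or equal to version $2.
# Assumes GNU `sort -V`; `version_ge` is a hypothetical helper name.
version_ge() {
    # The smaller of the two versions sorts first; $1 >= $2 exactly
    # when that smallest entry is $2.
    [ "$(printf '%s\n%s\n' "$1" "$2" | sort -V | head -n 1)" = "$2" ]
}

# Example (assumes the zfs module is loaded and exposes its version here):
# version_ge "$(cat /sys/module/zfs/version)" 0.7.4 && echo "fix should be included"
```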

@ggzengel
Contributor Author

@behlendorf Your link is not working. Please fix it.
I use the Debian packages, which are still at 0.7.3, but there is a request for 0.7.4 at http://lists.alioth.debian.org/pipermail/pkg-zfsonlinux-devel/2017-December/001375.html.
I will post there too, referencing this issue, and hope they will release an update soon.
@aerusso Is this fix included in your current package?

@ggzengel
Contributor Author

@behlendorf I found the right commit. GitHub search didn't find it, so I had to patch the URL of another commit.
1030f80

@ggzengel
Contributor Author

@aerusso I found this patch in your 0.7.4 code base. So I hope they will update soon.

@aerusso
Contributor

aerusso commented Dec 13, 2017

@ggzengel Can you confirm that 0.7.4 fixes your problem? Building the zfs and spl Debian packages should be relatively easy. The branch debian/0.7.4-0 in both of those repositories has an updated changelog.

@ggzengel
Contributor Author

@aerusso The patch makes sense and matches the problem I have.
I haven't tried it yet.
Can you give me a short explanation of how to build the deb files I need?

@aerusso
Contributor

aerusso commented Dec 14, 2017

  1. apt-src build-dep zfsutils-linux
  2. Grab the debian/0.7.4-0 branch of both repositories
  3. In each of those directories, run debuild -b -us -uc
  4. The debs will show up one directory above where you called debuild
  5. dpkg -i all the debs at the same time (they have a bunch of versioned dependencies now to help people upgrade).
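
Steps 3 to 5 could be scripted roughly as follows. This is a sketch, not a tested build recipe: the helper name `build_and_install` and the `DRY_RUN` switch are made up for illustration, the source directories are placeholders for wherever the two branches were checked out, and by default the commands are only printed.

```shell
#!/bin/sh
# Sketch of steps 3-5. With DRY_RUN=1 (the default here) the commands
# are printed instead of executed. `build_and_install` and DRY_RUN are
# hypothetical names; pass the checked-out source trees as arguments.
build_and_install() (
    [ "$#" -gt 0 ] || { echo "usage: build_and_install SRC_DIR..." >&2; exit 2; }
    run=""
    [ "${DRY_RUN:-1}" = "1" ] && run="echo"
    for src in "$@"; do
        # Step 3: build unsigned binary packages inside each source tree.
        ( cd "$src" && $run debuild -b -us -uc ) || exit 1
    done
    # Steps 4-5: the .deb files land one directory above the source tree;
    # install them in a single dpkg call so versioned dependencies resolve.
    parent=$(cd "$1/.." && pwd)
    $run dpkg -i "$parent"/*.deb
)

# Example with placeholder directory names:
# build_and_install ./spl ./zfs
# DRY_RUN=0 build_and_install ./spl ./zfs   # actually build and install (as root)
```

The single `dpkg -i` call assumes both source trees share one parent directory, so that all the debs end up in the same place.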

@janlam7

janlam7 commented Dec 15, 2017

I experienced Postfix hangs too, but they seem random and are not easily reproducible; in my case they occur in LXC containers mounted on GlusterFS hosted on ZFS. As a workaround I only had to put /var/spool/postfix/pid/ on a tmpfs. I'm wondering if it could be related. @ggzengel was your case easy to reproduce?

@ggzengel
Contributor Author

@janlam7 Sending mail from localhost with maildrop never worked again.

@ggzengel
Contributor Author

@aerusso Thanks for your help.
I installed a jessie VM and built the 0.7.4 packages. After testing with zfs-test I installed them on my production system.
Now Postfix is sending mail again.
@happyaron Please release the 0.7.4 Debian packages.
