RHEL7.6: rpcinfo -p fails when starting NFS #2341

hpannenb · 2020-03-17T15:07:43Z

Relax-and-Recover (ReaR) Issue Template

Fill in the following items before submitting a new issue
(quick response is not guaranteed with free support):

ReaR version ("/usr/sbin/rear -V"):
Relax-and-Recover 2.5 / 2019-05-10
OS version ("cat /etc/rear/os.conf" or "lsb_release -a" or "cat /etc/os-release"):
Red Hat Enterprise Linux Server 7.6 (Maipo)
ReaR configuration files ("cat /etc/rear/site.conf" and/or "cat /etc/rear/local.conf"):

[...]
export TMPDIR="/var/tmp"

OUTPUT=PXE
OUTPUT_PREFIX_PXE=pxe-iso-$HOSTNAME

BACKUP=NETFS
BACKUP_URL=nfs://<SERVER>/var/backup/

PXE_TFTP_URL=nfs://<SERVER>/var/lib/tftpboot
PXE_CONFIG_URL=nfs://<SERVER>/var/lib/tftpboot/pxelinux.cfg

USE_DHCLIENT=
USE_STATIC_NETWORKING="y"
USE_RESOLV_CONF=n

MODULES=()
[...]

Hardware (PC or PowerNV BareMetal or ARM) or virtual machine (KVM guest or PoverVM LPAR):
BareMetal
System architecture (x86 compatible or PPC64/PPC64LE or what exact ARM device):
x86
Firmware (BIOS or UEFI or Open Firmware) and bootloader (GRUB or ELILO or Petitboot):
BIOS
Storage (local disk or SSD) and/or SAN (FC or iSCSI or FCoE) and/or multipath (DM or NVMe):
local disk
Storage layout ("lsblk -ipo NAME,KNAME,PKNAME,TRAN,TYPE,FSTYPE,SIZE,MOUNTPOINT" or "lsblk" as makeshift):
...
Description of the issue (ideally so that others can reproduce it):
The recovery fails because the NFSv4 shared backup directory /var/backup on cannot be accessed. rpcinfo -p always fails in NETFS/default/050_start_required_nfs_daemons.sh
Workaround, if any:
In the RESCUE environment adjust the line 58 in file NETFS/default/050_start_required_nfs_daemons.sh from rpcinfo -p to rpcinfo -p <SERVER>. Afterwards rear recover works flawlessly.
Attachments, as applicable ("rear -D mkrescue/mkbackup/recover" debug log files):

+ echo '2020-03-17 15:14:49.690880709 Leaving debugscripts mode (back to previous bash flags and options settings).'
2020-03-17 15:14:49.690880709 Leaving debugscripts mode (back to previous bash flags and options settings).
2020-03-17 15:14:49.697364352 Including verify/NETFS/default/050_start_required_nfs_daemons.sh
2020-03-17 15:14:49.699329362 Entering debugscripts mode via 'set -x'.
+ source /usr/share/rear/verify/NETFS/default/050_start_required_nfs_daemons.sh
+++ url_scheme nfs://ts9esesv805/var/backup/
+++ local url=nfs://ts9esesv805/var/backup/
+++ local scheme=nfs
+++ echo nfs
+++ grep -q :
+++ echo nfs
++ local backup_url_scheme=nfs
++ test nfs = nfs
++ local attempt=
++ local portmapper_program=
++ LogPrint 'Starting required daemons for NFS: RPC portmapper (portmap or rpcbind) and rpc.statd if available.'
++ Log 'Starting required daemons for NFS: RPC portmapper (portmap or rpcbind) and rpc.statd if available.'
++ echo '2020-03-17 15:14:49.709025586 Starting required daemons for NFS: RPC portmapper (portmap or rpcbind) and rpc.statd if availab
le.'
2020-03-17 15:14:49.709025586 Starting required daemons for NFS: RPC portmapper (portmap or rpcbind) and rpc.statd if available.
++ Print 'Starting required daemons for NFS: RPC portmapper (portmap or rpcbind) and rpc.statd if available.'
++ has_binary portmap
++ for bin in '$@'
++ type portmap
/usr/share/rear/lib/_input-output-functions.sh: line 457: type: portmap: not found
++ return 1
++ has_binary rpcbind
++ for bin in '$@'
++ type rpcbind
++ return 0
++ portmapper_program=rpcbind
++ rpcinfo -p
++ rpcbind
++ LogPrint 'Started RPC portmapper '\''rpcbind'\''.'
++ Log 'Started RPC portmapper '\''rpcbind'\''.'
++ echo '2020-03-17 15:14:49.717288722 Started RPC portmapper '\''rpcbind'\''.'
2020-03-17 15:14:49.717288722 Started RPC portmapper 'rpcbind'.
++ Print 'Started RPC portmapper '\''rpcbind'\''.'
+++ seq 5
++ for attempt in '$( seq 5 )'
++ grep -q portmapper
++ rpcinfo -p
++ sleep 1
++ for attempt in '$( seq 5 )'
++ rpcinfo -p
++ grep -q portmapper
++ sleep 1
++ for attempt in '$( seq 5 )'
++ grep -q portmapper
++ rpcinfo -p
++ sleep 1
++ for attempt in '$( seq 5 )'
++ grep -q portmapper
++ rpcinfo -p
++ sleep 1
++ for attempt in '$( seq 5 )'
++ grep -q portmapper
++ rpcinfo -p
++ sleep 1
++ test ok = 5
++ Error 'RPC portmapper '\''rpcbind'\'' unavailable.'
++ PrintError 'ERROR: RPC portmapper '\''rpcbind'\'' unavailable.'
++ PrintError 'Some latest log messages since the last called script 050_start_required_nfs_daemons.sh:'
++ PrintError '  2020-03-17 15:14:49.697364352 Including verify/NETFS/default/050_start_required_nfs_daemons.sh
  2020-03-17 15:14:49.699329362 Entering debugscripts mode via '\''set -x'\''.
  2020-03-17 15:14:49.709025586 Starting required daemons for NFS: RPC portmapper (portmap or rpcbind) and rpc.statd if available.
  /usr/share/rear/lib/_input-output-functions.sh: line 457: type: portmap: not found
  2020-03-17 15:14:49.717288722 Started RPC portmapper '\''rpcbind'\''.'
++ Log 'ERROR: RPC portmapper '\''rpcbind'\'' unavailable.'
++ echo '2020-03-17 15:14:54.761443203 ERROR: RPC portmapper '\''rpcbind'\'' unavailable.'
2020-03-17 15:14:54.761443203 ERROR: RPC portmapper 'rpcbind' unavailable.
++ LogToSyslog 'ERROR: RPC portmapper '\''rpcbind'\'' unavailable.'
++ logger -t rear -i 'ERROR: RPC portmapper '\''rpcbind'\'' unavailable.'
++ echo '===== Stack trace ====='

Execution of rpcinfo in the RESCUE environment:

RESCUE <CLIENT>:~ # rpcinfo -p
rpcinfo: can't contact portmapper: RPC: Remote system error - Connection refused

RESCUE <CLIENT>:~ # rpcinfo -p <SERVER>
   program vers proto   port  service
    100000    4   tcp    111  portmapper
    100000    3   tcp    111  portmapper
    100000    2   tcp    111  portmapper
    100000    4   udp    111  portmapper
    100000    3   udp    111  portmapper
    100000    2   udp    111  portmapper
    100005    1   udp  20048  mountd
    100005    1   tcp  20048  mountd
    100024    1   udp  54592  status
    100024    1   tcp  37424  status
    100005    2   udp  20048  mountd
[...]

The text was updated successfully, but these errors were encountered:

hpannenb · 2020-03-17T16:30:42Z

seems to be related to #2250

pcahyna · 2020-03-17T16:51:28Z

@rmetrich @yontalcar can you please have a look?

hpannenb · 2020-03-17T17:29:35Z

I checked on https://access.redhat.com/solutions/1132143 and I got a trigger looking into /var/log/messages of the RESCUE environment:

Mar 17 18:15:20 CLIENT rpcbind[12127]: cannot get uid of 'rpc': Success

Adding the user "rpc" to the /etc/passwd

[...]
rpc:x:32:32:Rpcbind Daemon:/var/lib/rpcbind:/sbin/nologin
[...]

in the RESCUE environment solves this rpcbind startup failure. So it should be considered to include not only rpcuser but also the rpc user.

I will give it a try with adding this to local.conf:

CLONE_USERS=( ${CLONE_USERS[@]} rpcuser rpc )

which should be changed in Line 12 of 100_check_nfs_version.sh thereafter.

hpannenb · 2020-03-17T18:10:10Z

I changed the mentioned line on my test system, rpcbind is starting and the NFS based recovery is working now.

hpannenb · 2020-03-19T11:17:19Z

@jsmeix so in most of the cases it is better safe than sorry to always enabled the parameter with CLONE_ALL_USERS_GROUPS=yes.

Since Your change and mine were merged into current version I will close the issue.

pcahyna · 2020-03-19T11:22:36Z

@hpannenb thanks, so the addition of the parameter after -p is not needed anymore?

@xjezda00 I am curious about this, IIUC you have been testing restore from NFS regularly, have you seen this problem? If not, why not?

pcahyna · 2020-03-19T11:34:54Z

@hpannenb thanks, so the addition of the parameter after -p is not needed anymore?

Ah I see, it is intentionally called without a parameter according to a comment in the file:

# use plain 'rpcinfo -p' to check if RPC service is available
# instead of using 'rpcinfo -p localhost' because the latter
# does not work on some systems while the former works everywhere
https://github.com/rear/rear/issues/889

nevermind.

Added required NFS users for proper rpcbind startup: Usually 'rpcuser' is used but 'rpc' is used in RHEL7.x and '_rpc' is used in Debian 10, see #2341 and #2250

jsmeix · 2020-03-19T14:50:19Z

With #2342 merged
this issue should be fixed.

@hpannenb
thank you for your analysis what the root cause was
and for your fix!

jsmeix · 2020-03-19T14:53:28Z

@hpannenb
regarding your #2341 (comment)

I did not set CLONE_ALL_USERS_GROUPS="yes".
The default is still CLONE_ALL_USERS_GROUPS="no"
https://github.com/rear/rear/blob/master/usr/share/rear/conf/default.conf#L1412

jsmeix · 2020-03-19T15:12:53Z

I did #2345
to get the question answered if we should have
CLONE_ALL_USERS_GROUPS="yes" by default.

hpannenb mentioned this issue Mar 17, 2020

Add missing but required NFS users for proper rpcbind startup #2342

Merged

jsmeix added enhancement Adaptions and new features minor bug An alternative or workaround exists labels Mar 18, 2020

hpannenb closed this as completed Mar 19, 2020

jsmeix added a commit that referenced this issue Mar 19, 2020

Merge pull request #2342 from hpannenb/master

d9a23d2

Added required NFS users for proper rpcbind startup: Usually 'rpcuser' is used but 'rpc' is used in RHEL7.x and '_rpc' is used in Debian 10, see #2341 and #2250

jsmeix self-assigned this Mar 19, 2020

jsmeix added the fixed / solved / done label Mar 19, 2020

jsmeix mentioned this issue Mar 19, 2020

Have CLONE_ALL_USERS_GROUPS="yes" by default? #2345

Closed

gdha mentioned this issue Sep 5, 2020

RPC Portmapper "rpcbind" unavailable - Proxmox Recovery #2487

Closed

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

RHEL7.6: rpcinfo -p fails when starting NFS #2341

RHEL7.6: rpcinfo -p fails when starting NFS #2341

hpannenb commented Mar 17, 2020

hpannenb commented Mar 17, 2020

pcahyna commented Mar 17, 2020

hpannenb commented Mar 17, 2020 •

edited

hpannenb commented Mar 17, 2020

hpannenb commented Mar 19, 2020

pcahyna commented Mar 19, 2020

pcahyna commented Mar 19, 2020

jsmeix commented Mar 19, 2020

jsmeix commented Mar 19, 2020

jsmeix commented Mar 19, 2020

RHEL7.6: rpcinfo -p fails when starting NFS #2341

RHEL7.6: rpcinfo -p fails when starting NFS #2341

Comments

hpannenb commented Mar 17, 2020

Relax-and-Recover (ReaR) Issue Template

hpannenb commented Mar 17, 2020

pcahyna commented Mar 17, 2020

hpannenb commented Mar 17, 2020 • edited

hpannenb commented Mar 17, 2020

hpannenb commented Mar 19, 2020

pcahyna commented Mar 19, 2020

pcahyna commented Mar 19, 2020

jsmeix commented Mar 19, 2020

jsmeix commented Mar 19, 2020

jsmeix commented Mar 19, 2020

hpannenb commented Mar 17, 2020 •

edited