/.autorelabel on centos/7 sometimes not working #4

Open
gdha opened this issue Sep 28, 2016 · 6 comments

@gdha
Member

commented Sep 28, 2016

On the recovered VM, ReaR touches the /.autorelabel file, but on reboot the file seems to be ignored (seen sometimes with VirtualBox), resulting in a recovered VM where login is not possible.

Only booting with selinux=0 on the kernel command line helps, but then SELinux is disabled.
We haven't seen this with libvirt so far; the reason is unknown.

@gdha gdha self-assigned this Sep 28, 2016

@gdha gdha added the bug label Sep 28, 2016

@gdha

Member Author

commented Sep 28, 2016

Booting with selinux=0 and then running fixfiles relabel does not fix the problem. See also https://www.centos.org/docs/5/html/5.2/Deployment_Guide/sec-sel-fsrelabel.html

@N3WWN

commented Mar 1, 2017

I've got some more info on this...

For the last 3 days or so, I've been beating my head against this exact issue. Upon the first boot from disk, autorelabel works for me no problem, but my coworker, following the same steps, cannot log in due to the autorelabel not actually happening.

We have narrowed it down to how we create our VMs in Virtualbox:

I always add a virtual serial port which has the output sent to a raw file (so I can see everything that was sent to the console, even if the screen is cleared or the VM reboots). My coworker doesn't do this.

If we each restore VMs from the exact same ISO image created by ReaR, his restored VM will not relabel and will not allow logins while mine will relabel, reboot and logins work just fine.

FYI, this is with CentOS 7.2 VMs, Relax-and-Recover 1.17.2 / Git, Vagrant 1.9.1 and Virtualbox 5.1.14 r112924, but we haven't narrowed it down to which component is the culprit yet.

Hopefully I can look into this shortly and either get more info back here... or a patch to resolve it (if it's a ReaR issue)!

-Rich Alloway (RogueWave)

@gdha

Member Author

commented Mar 3, 2017

@N3WWN FYI I never had this issue with libvirt.

In my next workshop I will turn off SELinux altogether to avoid this little annoyance.

@N3WWN

commented Mar 14, 2017

So, here's an update on what I've found...

The crux of the problem is that, if you have console=ttyS0,115200, or some other ttySx serial console definition, as the last console on the kernel command line, the kernel will attempt to activate the serial console regardless of whether or not the serial port actually exists.

The last console definition determines where /dev/console is located.
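To make the "last console wins" rule concrete, here is a minimal Python sketch that picks out the device named by the final console= argument. The function name `last_console` is made up for this illustration, and the two strings are shortened versions of the command lines quoted in this comment; it only parses text and does not inspect the running kernel.

```python
def last_console(cmdline: str):
    """Return the device named by the last console= argument, or None."""
    device = None
    for arg in cmdline.split():
        if arg.startswith("console="):
            # Drop options such as ",115200" that may follow the device name.
            device = arg[len("console="):].split(",", 1)[0]
    return device


# Shortened versions of the failing and working command lines from this issue.
failing = "ro no_timer_check console=tty0 console=ttyS0,115200 net.ifnames=0"
working = "ro no_timer_check console=ttyS0,115200 console=tty0 net.ifnames=0"

print(last_console(failing))  # ttyS0
print(last_console(working))  # tty0
```

On a live system the same function could be fed the contents of /proc/cmdline.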

A StandardInput=tty parameter in the /lib/systemd/system/rhel-autorelabel.service file causes systemd to attempt an ioctl() operation on /dev/console. That operation fails when the serial port does not exist, so any systemd service with StandardInput=tty fails as well:

Mar 13 12:30:56 CentOS7-Virtualbox systemd: Started Mark the need to relabel after reboot.
Mar 13 12:30:56 CentOS7-Virtualbox systemd: Failed at step STDIN spawning /lib/systemd/rhel-autorelabel: Inappropriate ioctl for device
Mar 13 12:30:56 CentOS7-Virtualbox systemd: rhel-autorelabel.service: main process exited, code=exited, status=208/STDIN
Mar 13 12:30:56 CentOS7-Virtualbox systemd: Unit rhel-autorelabel.service entered failed state.
Mar 13 12:30:56 CentOS7-Virtualbox systemd: rhel-autorelabel.service failed.
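If you want to check a recovered system for the same symptom, something like the following Python sketch could scan saved log text for units that exited with status=208/STDIN. The regular expression is an assumption based only on the message format shown in the lines above (systemd as shipped with CentOS 7.2); other systemd versions may word the message differently. The helper name `units_failed_at_stdin` is hypothetical.

```python
import re

# Matches the message format quoted above, for example:
# "rhel-autorelabel.service: main process exited, code=exited, status=208/STDIN"
STDIN_FAILURE = re.compile(
    r"(\S+\.service): main process exited, code=exited, status=208/STDIN"
)


def units_failed_at_stdin(log_text: str):
    """Return the unit names that exited with status=208/STDIN, in order seen."""
    units = []
    for match in STDIN_FAILURE.finditer(log_text):
        if match.group(1) not in units:
            units.append(match.group(1))
    return units


sample = """\
Mar 13 12:30:56 CentOS7-Virtualbox systemd: Failed at step STDIN spawning /lib/systemd/rhel-autorelabel: Inappropriate ioctl for device
Mar 13 12:30:56 CentOS7-Virtualbox systemd: rhel-autorelabel.service: main process exited, code=exited, status=208/STDIN
"""

print(units_failed_at_stdin(sample))  # ['rhel-autorelabel.service']
```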

The kernel cmdline is

BOOT_IMAGE=/vmlinuz-3.10.0-327.13.1.el7.x86_64 root=/dev/mapper/VolGroup00-LogVol00 ro no_timer_check console=tty0 console=ttyS0,115200 net.ifnames=0 biosdevname=0 crashkernel=auto rd.lvm.lv=VolGroup00/LogVol00 rd.lvm.lv=VolGroup00/LogVol01 rhgb quiet LANG=en_US.UTF-8

when this failed.

Swapping the console definitions, which changes the cmdline to

BOOT_IMAGE=/vmlinuz-3.10.0-327.13.1.el7.x86_64 root=/dev/mapper/VolGroup00-LogVol00 ro no_timer_check console=ttyS0,115200 console=tty0 net.ifnames=0 biosdevname=0 crashkernel=auto rd.lvm.lv=VolGroup00/LogVol00 rd.lvm.lv=VolGroup00/LogVol01 rhgb quiet LANG=en_US.UTF-8

results in /dev/console being /dev/tty0, so systemd can perform ioctl() operations successfully. Removing console=ttyS0,115200 also works.
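The swap described above can be sketched in Python as a small text transformation: serial console arguments are moved ahead of the other console arguments, so the last console= entry is a non-serial device whenever one is present. The function name `put_serial_consoles_first` is invented for this example, and the command line is a shortened version of the one quoted above. This is only an illustration of the reordering, not something ReaR does; whether you want /dev/console on the virtual terminal is a decision for your own setup.

```python
def put_serial_consoles_first(cmdline: str) -> str:
    """Reorder console= arguments so that ttyS* entries precede the others.

    The console arguments keep the slots they occupied on the command line;
    only their relative order changes.
    """
    args = cmdline.split()
    slots = [i for i, arg in enumerate(args) if arg.startswith("console=")]
    consoles = [args[i] for i in slots]
    # Stable sort: serial consoles (key False) first, everything else after.
    consoles.sort(key=lambda arg: not arg.startswith("console=ttyS"))
    for i, arg in zip(slots, consoles):
        args[i] = arg
    return " ".join(args)


before = "ro no_timer_check console=tty0 console=ttyS0,115200 net.ifnames=0"
print(put_serial_consoles_first(before))
# ro no_timer_check console=ttyS0,115200 console=tty0 net.ifnames=0
```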

I have been able to replicate this in VirtualBox and libvirt, but NOT in VMWare Fusion.

My guess is that there is something different with how VMWare Fusion handles emulation of the serial hardware regardless of whether or not the serial device is present in the VM configuration.

VirtualBox defaults to not having a virtual serial port present on VMs. libvirt defaults to having a virtual serial port present on VMs. I think this is why you haven't seen this issue with libvirt, @gdha.

If you remove the virtual serial port from a libvirt VM but have the last console definition be a ttySx device, you can recreate the problem within libvirt.

Since we don't know what hardware (virtual or bare metal) the recovery will be performed on, I don't think we can catch this reliably on the system running the backup.

What I think we could do, though, is to warn users that there is a potential issue with having the last console definition reference ttySx IFF the serial device does not exist on the restore system.

If this sounds reasonable, do you think it should only be included in the logfile or sent to the logfile and output to STDERR?
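As a rough illustration of the proposed warning, here is a Python sketch that logs a message when the last console= argument names a ttySx device. ReaR itself is written in bash, so this is not ReaR code; the function name, the logger name and the message wording are all assumptions made for the example. The logger is given a STDERR handler here, and a `logging.FileHandler` could be attached in the same way if the warning should also reach a logfile.

```python
import logging
import sys


def warn_if_serial_console_last(cmdline: str, logger: logging.Logger) -> bool:
    """Log a warning when the last console= argument names a ttyS* device.

    Returns True if a warning was logged, False otherwise.
    """
    consoles = [arg for arg in cmdline.split() if arg.startswith("console=")]
    if not consoles or not consoles[-1].startswith("console=ttyS"):
        return False
    logger.warning(
        "Last console on the kernel command line is %s; if that serial port "
        "is missing on the restore system, services with StandardInput=tty "
        "(such as rhel-autorelabel.service) may fail.",
        consoles[-1][len("console="):],
    )
    return True


# Hypothetical logger name; the handler sends warnings to STDERR.
log = logging.getLogger("console-check")
log.setLevel(logging.WARNING)
log.addHandler(logging.StreamHandler(sys.stderr))

warn_if_serial_console_last("ro console=tty0 console=ttyS0,115200 quiet", log)
```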

-Rich Alloway (RogueWave)


Test environments:

VirtualBox:
CentOS 7.2 VMs, Vagrant 1.9.1 and Virtualbox 5.1.14 r112924

libvirt:
CentOS 7.2 VMs, libvirt 2.0.0-10.el7_3.5.x86_64

VMWare Fusion:
CentOS 7.2 VMs, VMWare Fusion Professional Version 8.5.3 (4696910)

@gdha

Member Author

commented Mar 14, 2017

@N3WWN Perhaps hashicorp/vagrant#536 could fix this issue via the Vagrantfile (adding ttyS0)?

@schlomo

Member

commented May 2, 2017

I reported this bug to CentOS as https://bugs.centos.org/view.php?id=13213 and I also created a Vagrantfile to show the problem and the fix at https://gist.github.com/schlomo/b532ba9bca87ea40d922d90e62b7338c
