recover from failure (update to fedora 31) #1684

Closed
aanno opened this issue Nov 1, 2019 · 14 comments
Comments

@aanno

aanno commented Nov 1, 2019

Yesterday, I updated my PC from fedora 30 to 31. The update turned out to be problematic, and now I can only boot into the 'emergency shell'.

The disk layout of the fedora installation is as follows: all partitions are LUKS encrypted, and root (/) is (directly) on LVM on an SSD. /home and /opt are managed by stratis (HDD partitions with an SSD cache).

Investigation showed the following: the system boots into the 'emergency shell' because /home could not be mounted. However, I am having trouble recovering stratis both from the 'emergency shell' and from the fedora 31 live cd.

In the 'emergency shell', stratisd warns that there is no DBus. Hence the stratis cli is of no use. I have not found a way to successfully start DBus in the 'emergency shell'.

From the live cd, I could install stratisd and stratis-cli. But it is not possible to start stratisd with systemd (some IO Error, permission denied).

In a chroot from the live cd into my installed root (/), it is not possible to start stratisd because systemd does not allow this in a chroot.

So, does anybody have an idea how to recover the system?

(Cross posted on mailing list as well)

@aanno
Author

aanno commented Nov 1, 2019

I opened a fedora bug for this: https://bugzilla.redhat.com/show_bug.cgi?id=1767743

@mulkieran
Member

mulkieran commented Nov 1, 2019

Please give precise behavior when attempting to start w/ systemd, thanks! EDIT: Probably unnecessary, I see there is much more information in bz.

@drckeefe
Member

drckeefe commented Nov 2, 2019

I've performed a basic test of creating a stratis volume on fedora 30 then did an upgrade to fedora 31. There was a minor hang on first boot of the system (not sure what the cause of it was), but rebooted again and the volume mounted fine with all of its data. This test didn't involve LUKS. I'll review the sosreport in bz 1767743.

@drckeefe
Member

drckeefe commented Nov 2, 2019

From comment 3 of https://bugzilla.redhat.com/show_bug.cgi?id=1767743 there is an "os error 13" reported
"Nov 01 11:19:56 blacksnapper stratisd[2585]: IO error: Permission denied (os error 13)"
We've seen a similar error in issue #853.

I believe there was a fix to the lock file location for stratis recently because of a different boot issue. Could this be a similar issue? Still looking...

@drckeefe
Member

drckeefe commented Nov 2, 2019

aanno, is stratisd running after boot? If not, can you try to start it? # systemctl enable stratisd; systemctl start stratisd (please provide the output)
Can you provide the output of # lsblk after trying to start stratisd?
Also, do you know if the LUKS volume is open?
Thank you,
-Dennis

@aanno
Author

aanno commented Nov 2, 2019

Dear drckeefe,

thank you for looking into this. On the affected system, stratisd is not running after boot and I can't start it because of https://bugzilla.redhat.com/show_bug.cgi?id=1767773 .

However, if I first run setenforce 0 in the emergency shell, I can start stratisd and use my data. The problem is not really solved, though, as I either (a) have to use the emergency shell or (b) have to disable selinux permanently.

I don't imply that every update from fedora 30 to fedora 31 is affected. My update was a bit rough because I had to disable some third party repos after the update to regain a usable system. Hence it is possible that all kinds of evil have affected my update.

However, as far as I can tell my system is now in a clean (i.e. fedora 31 repos only plus nvidia driver) state. This ticket is more about my options for recovery. For me it feels a bit problematic that interaction with stratisd is based on DBus, as DBus might not be available in case of a disaster. For example, it is not available in the (fedora) emergency shell.

In summary, stratisd seems not to be the reason for my ongoing problem with the system. There is no data loss, however at present the system is hardly usable. The problem at hand is caused by some strange interaction between selinux and stratisd. So probably only fedora users could encounter the problem...
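
The temporary workaround described above can be sketched as a small shell helper. This is only a sketch under assumptions: the `run` wrapper, the function name, and the `DRY_RUN` variable are hypothetical, and the final `mount -a` assumes the Stratis filesystems are listed in /etc/fstab. Note that `setenforce 0` only changes the running mode, so it has to be repeated after every reboot.

```shell
#!/bin/sh
# Sketch of the workaround from this comment. Run as root.
# DRY_RUN=1 prints the commands instead of executing them (hypothetical helper).

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

start_stratisd_permissive() {
    # Switch SELinux to permissive mode for the running system only.
    run setenforce 0
    # With SELinux permissive, stratisd is no longer blocked at startup.
    run systemctl start stratisd
    # Assumption: /home and /opt are in /etc/fstab, so mount -a picks them up.
    run mount -a
}
```

To preview the commands without changing anything, call `DRY_RUN=1 start_stratisd_permissive`; without `DRY_RUN` the commands are executed for real.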

@aanno
Author

aanno commented Nov 2, 2019

Versions affected:

# rpm -qa selinux-policy* stratis*
stratis-cli-1.0.4-1.module_f31+6320+bf3c8975.x86_64
selinux-policy-targeted-3.14.4-39.fc31.noarch
selinux-policy-3.14.4-39.fc31.noarch
stratisd-1.0.5-1.module_f31+6320+bf3c8975.x86_64

@drckeefe
Member

drckeefe commented Nov 3, 2019

Thank you for digging into this further and providing feedback on the issue; it is very helpful information. I'm interested to know what your thoughts are on a quick process to reproduce this, such as:

  1. install Fedora 30
  2. create Stratis filesystem
  3. persist the mount point in /etc/fstab using
  4. upgrade to Fedora 31

Do you believe that the upgrade process impacts this configuration? Could this happen if a Stratis filesystem was created directly from Fedora 31?

@drckeefe
Member

drckeefe commented Nov 4, 2019

Thanks for the information aanno.

Here was my process to get stratis to fail after installing stratis-cli:

  1. create a vm using the fedora 31 live iso
  2. add a 2gb virtio disk to the vm after the live iso was booted
  3. yum install stratis-cli

Installing:
 stratis-cli                                            x86_64                       1.0.4-1.module_f31+6320+bf3c8975                       fedora-modular                        52 k
Installing dependencies:
 stratisd                                               x86_64                       1.0.5-1.module_f31+6320+bf3c8975                       fedora-modular                       1.5 M
...
  4. systemctl enable stratisd; systemctl start stratisd

Now check the status of stratisd:

[root@localhost-live liveuser]# systemctl status stratisd
● stratisd.service - A daemon that manages a pool of block devices to create flexible file systems
   Loaded: loaded (/usr/lib/systemd/system/stratisd.service; enabled; vendor preset: enabled)
   Active: failed (Result: exit-code) since Mon 2019-11-04 15:11:16 EST; 1min 56s ago
     Docs: man:stratisd(8)
  Process: 3171 ExecStart=/usr/libexec/stratisd --debug (code=exited, status=1/FAILURE)
 Main PID: 3171 (code=exited, status=1/FAILURE)
      CPU: 3ms

Nov 04 15:11:16 localhost-live systemd[1]: Started A daemon that manages a pool of block devices to create flexible file systems.
Nov 04 15:11:16 localhost-live stratisd[3171]: DEBUG libstratis::stratis::buff_log: BuffLogger: pass_through: true hold time: none
Nov 04 15:11:16 localhost-live stratisd[3171]:  INFO stratisd: Using StratEngine
Nov 04 15:11:16 localhost-live stratisd[3171]: IO error: Permission denied (os error 13)
Nov 04 15:11:16 localhost-live systemd[1]: stratisd.service: Main process exited, code=exited, status=1/FAILURE
Nov 04 15:11:16 localhost-live systemd[1]: stratisd.service: Failed with result 'exit-code'.

Trying to create a pool fails because stratisd is not running:
[root@localhost-live liveuser]# stratis pool create p1 /dev/vda
Execution failure caused by:
The name does not have an owner
    which in turn caused:
The name is not activatable

Most likely stratis is unable to connect to the stratisd D-Bus service.

Setting selinux to permissive mode (setenforce 0) resolves the issue:

[root@localhost-live liveuser]# setenforce 0
[root@localhost-live liveuser]# systemctl start stratisd
[root@localhost-live liveuser]# systemctl status stratisd
● stratisd.service - A daemon that manages a pool of block devices to create flexible file systems
   Loaded: loaded (/usr/lib/systemd/system/stratisd.service; enabled; vendor preset: enabled)
   Active: active (running) since Mon 2019-11-04 15:13:58 EST; 3s ago
     Docs: man:stratisd(8)
 Main PID: 3187 (stratisd)
    Tasks: 1 (limit: 2308)
   Memory: 1020.0K
      CPU: 7ms
   CGroup: /system.slice/stratisd.service
           └─3187 /usr/libexec/stratisd --debug

Nov 04 15:13:58 localhost-live systemd[1]: Started A daemon that manages a pool of block devices to create flexible file systems.
Nov 04 15:13:58 localhost-live stratisd[3187]: DEBUG libstratis::stratis::buff_log: BuffLogger: pass_through: true hold time: none
Nov 04 15:13:58 localhost-live stratisd[3187]:  INFO stratisd: Using StratEngine
Nov 04 15:13:58 localhost-live stratisd[3187]: DEBUG stratisd: Engine state:
Nov 04 15:13:58 localhost-live stratisd[3187]: StratEngine {
Nov 04 15:13:58 localhost-live stratisd[3187]:     pools: {},
Nov 04 15:13:58 localhost-live stratisd[3187]:     incomplete_pools: {},
Nov 04 15:13:58 localhost-live stratisd[3187]:     watched_dev_last_event_nrs: {},
Nov 04 15:13:58 localhost-live stratisd[3187]: }
Nov 04 15:13:58 localhost-live stratisd[3187]:  INFO stratisd: D-Bus API is available

[root@localhost-live liveuser]# stratis pool create p1 /dev/vda
[root@localhost-live liveuser]# stratis pool
Name Total Physical Size Total Physical Used
p1 2 GiB 52 MiB

@drckeefe
Member

drckeefe commented Nov 4, 2019

Here are the journal messages related to stratisd being blocked by selinux:

Nov 04 15:33:25 localhost-live systemd[1]: Started A daemon that manages a pool of block devices to create flexible file systems.
Nov 04 15:33:25 localhost-live audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=stratisd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Nov 04 15:33:25 localhost-live stratisd[2607]: DEBUG libstratis::stratis::buff_log: BuffLogger: pass_through: true hold time: none
Nov 04 15:33:25 localhost-live stratisd[2607]: INFO stratisd: Using StratEngine
Nov 04 15:33:25 localhost-live audit[2607]: AVC avc: denied { dac_override } for pid=2607 comm="stratisd" capability=1 scontext=system_u:system_r:stratisd_t:s0 tcontext=system_u:system_r:stratisd_t:s0 tclass=capability permissive=0
Nov 04 15:33:25 localhost-live stratisd[2607]: IO error: Permission denied (os error 13)
Nov 04 15:33:25 localhost-live audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=stratisd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
Nov 04 15:33:25 localhost-live systemd[1]: stratisd.service: Main process exited, code=exited, status=1/FAILURE
Nov 04 15:33:25 localhost-live systemd[1]: stratisd.service: Failed with result 'exit-code'.
Nov 04 15:34:55 localhost-live systemd[1]: Started A daemon that manages a pool of block devices to create flexible file systems.
Nov 04 15:34:55 localhost-live audit[1]: SERVICE_START pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=stratisd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=success'
Nov 04 15:34:55 localhost-live stratisd[2615]: DEBUG libstratis::stratis::buff_log: BuffLogger: pass_through: true hold time: none
Nov 04 15:34:55 localhost-live audit[2615]: AVC avc: denied { dac_override } for pid=2615 comm="stratisd" capability=1 scontext=system_u:system_r:stratisd_t:s0 tcontext=system_u:system_r:stratisd_t:s0 tclass=capability permissive=0
Nov 04 15:34:55 localhost-live stratisd[2615]: INFO stratisd: Using StratEngine
Nov 04 15:34:55 localhost-live stratisd[2615]: IO error: Permission denied (os error 13)
Nov 04 15:34:55 localhost-live systemd[1]: stratisd.service: Main process exited, code=exited, status=1/FAILURE
Nov 04 15:34:55 localhost-live systemd[1]: stratisd.service: Failed with result 'exit-code'.
Nov 04 15:34:55 localhost-live audit[1]: SERVICE_STOP pid=1 uid=0 auid=4294967295 ses=4294967295 subj=system_u:system_r:init_t:s0 msg='unit=stratisd comm="systemd" exe="/usr/lib/systemd/systemd" hostname=? addr=? terminal=? res=failed'
[root@localhost-live liveuser]#
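
The relevant AVC lines are easy to miss between the systemd messages. As a rough sketch, a small sed filter can reduce journal or audit.log text to the process name, the denied permission, and the object class. The function name `avc_denials` is hypothetical, and the pattern only assumes the field layout visible in the log lines quoted in this thread.

```shell
#!/bin/sh
# Sketch: read log text on stdin and print "comm permission tclass"
# for every SELinux AVC denial found. Non-matching lines are dropped.
avc_denials() {
    sed -n 's/.*denied *{ *\([^}]*[^} ]\) *} .*comm="\([^"]*\)".*tclass=\([^ ]*\).*/\2 \1 \3/p'
}
```

A possible use is `journalctl -b | avc_denials | sort | uniq -c`, which counts each distinct denial since the last boot. For the journal above, the filter reduces each AVC line to `stratisd dac_override capability`.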

@aanno
Author

aanno commented Nov 6, 2019

Very well, you have reproduced the problem!

@johnr14

johnr14 commented Nov 15, 2019

I'm having the same problem after a clean fedora 30 to 31 upgrade with the latest stratisd, and I can reproduce it.
There seems to be a commit related to this on https://bugzilla.redhat.com/show_bug.cgi?id=1755396

Also, when I run dnf update, I get a conflicting request:
module(platform:f31) needed by module stratis.

Trying out the nightly selinux-policy build:

dnf copr enable lvrabec/selinux-policy-nightly
dnf update -y selinux-policy
reboot

And it works now! Well, the service starts...

EDIT:
However, I get an error on pool creation. I have to run setenforce 0 for it to work.
It seems more work needs to be done on the permissions.

# stratis pool create strata1 /dev/vdb
Execution failed:
stratisd failed to perform the operation that you requested. It returned the following information via the D-Bus: ERROR: permission denied.
# tail -F /var/log/audit/audit.log
type=AVC msg=audit(1573781075.632:256): avc:  denied  { write } for  pid=1070 comm="stratisd" name="vdb" dev="devtmpfs" ino=13060 scontext=system_u:system_r:stratisd_t:s0 tcontext=system_u:object_r:fixed_disk_device_t:s0 tclass=blk_file permissive=0
# dnf info selinux-policy-targeted.noarch
Last metadata expiration check: 0:08:59 ago on Mon 18 Nov 2019 02:21:05 PM CST.
Modular dependency problem:

 Problem: conflicting requests
  - nothing provides module(platform:f31) needed by module stratis:1:3120190907214611:22d7e2a5-0.x86_64
Installed Packages
Name         : selinux-policy-targeted
Version      : 3.14.5
Release      : 15.fc32.3
Architecture : noarch
Size         : 29 M
Source       : selinux-policy-3.14.5-15.fc32.3.src.rpm
Repository   : @System
From repo    : copr:copr.fedorainfracloud.org:lvrabec:selinux-policy-nightly
Summary      : SELinux targeted base policy
URL          : https://github.com/fedora-selinux/selinux-policy
License      : GPLv2+
Description  : SELinux Reference policy targeted base module.

Thanks

@aanno
Author

aanno commented Dec 19, 2019

Current status: The issue has been handled with https://bugzilla.redhat.com/show_bug.cgi?id=1755396 . However, the selinux problem is still there. Currently, there seems to be no attempt in the Fedora project to rectify the issue.
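
Until the policy is fixed, a narrower alternative to running setenforce 0 for the whole system might be to make only the stratisd domain permissive. This is an assumption and was not tried in this thread: it relies on `semanage permissive` from the policycoreutils tooling and on the `stratisd_t` domain name that appears in the audit messages above. The `run` wrapper, the function name, and `DRY_RUN` are hypothetical.

```shell
#!/bin/sh
# Sketch: toggle permissive mode for the stratisd_t domain only. Run as root.
# DRY_RUN=1 prints the command instead of executing it (hypothetical helper).

run() {
    if [ "${DRY_RUN:-0}" = "1" ]; then
        echo "$*"
    else
        "$@"
    fi
}

stratisd_domain_permissive() {
    case "${1:-add}" in
        # Mark stratisd_t permissive; other domains stay enforcing.
        add)    run semanage permissive -a stratisd_t ;;
        # Undo the exception once a fixed selinux-policy is installed.
        remove) run semanage permissive -d stratisd_t ;;
        *)      echo "usage: stratisd_domain_permissive add|remove" >&2
                return 2 ;;
    esac
}
```

Unlike setenforce 0, a permissive domain set this way should persist across reboots, so it should be removed again with `stratisd_domain_permissive remove` after updating to a fixed policy.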

@mulkieran mulkieran added the linked linked to external bug tracker label Jan 10, 2020
@aanno
Author

aanno commented Feb 15, 2020

The issue has finally been solved with https://bugzilla.redhat.com/show_bug.cgi?id=1794645

@aanno aanno closed this as completed Feb 15, 2020