
The content is gone after the reboot #55

Closed
bulgaru opened this issue Dec 11, 2019 · 20 comments · Fixed by #111

Comments

@bulgaru

bulgaru commented Dec 11, 2019

Bug description

Using a Fusion-io ioScale2 1.65TB card. Compiled the drivers for pve-kernel-5.3.13-1-pve: 5.3.13-1 (Proxmox 6). The drivers compiled normally and the card is visible. It has a GPT partition table and has been added as LVM-thin storage. Weirdly enough, all the content seems to be gone after the reboot. Here's the storage summary:
[screenshot: storage summary]
Here's the storage content (real size is around 0.6-1.2GB):
[screenshot: storage content]

Environment information

  1. Linux kernel compiled against: Linux 5.3.13-1-pve #1 SMP PVE 5.3.13-1
  2. The C compiler version used: gcc (Debian 8.3.0-6) 8.3.0
  3. distribution, and version: Proxmox 6, Debian 10 Buster based
  4. Branch of iomemory-vsl that is being compiled: 5.1.28
  5. FIO device used, if applicable
    • fio-status: ioMono Fusion-io 1.65TB ioScale2
    • lspci -b -nn: Mass storage controller [0180]: SanDisk ioDrive2 [1aed:2001] (rev 04)
@snuf
Collaborator

snuf commented Dec 11, 2019

@bulgaru thanks for filing a bug. Could you provide the following output so I can get an idea of what's going on?

  1. lsmod | grep iomemory, to make sure the module is loaded
  2. the output of dmesg, starting from the first fioinf message after the module is loaded
  3. the full fio-status -a information, so we can see the state of the device

From there we can move on to more troubleshooting if required, and would have to look at the state of the GPT and LVM itself.
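
For reference, these boil down to roughly the following commands (module and device names as used in this thread):

lsmod | grep iomemory        # confirm the iomemory-vsl module is loaded
dmesg | grep -i fioinf       # driver (fioinf) messages from the kernel log
fio-status -a                # full device state from the Fusion-io utilities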

Best,

@bulgaru
Author

bulgaru commented Dec 11, 2019

  1. lsmod | grep iomemory: iomemory_vsl 1286144 0
  2. [ 15.291390] <6>fioinf ioDrive 0000:41:00.0.0: Required PCIE bandwidth 2.000 GBytes per sec
    [ 15.291394] <6>fioinf ioDrive 0000:41:00.0.0: Board serial number is 1412G0900
    [ 15.291395] <6>fioinf ioDrive 0000:41:00.0.0: Adapter serial number is 1412G0900
    [ 15.291401] <6>fioinf ioDrive 0000:41:00.0.0: Default capacity 1650.000 GBytes
    [ 15.291403] <6>fioinf ioDrive 0000:41:00.0.0: Default sector size 512 bytes
    [ 15.291404] <6>fioinf ioDrive 0000:41:00.0.0: Rated endurance 8.00 PBytes
    [ 15.291407] <6>fioinf ioDrive 0000:41:00.0.0: 85C temp range hardware found
    [ 15.291409] <6>fioinf ioDrive 0000:41:00.0.0: Maximum capacity 1650.000 GBytes
    [ 15.291432] <6>fioinf ioDrive 0000:41:00.0.0: Firmware version 7.1.17 116786 (0x700411 0x1c832)
    [ 15.291434] <6>fioinf ioDrive 0000:41:00.0.0: Platform version 19
    [ 15.291435] <6>fioinf ioDrive 0000:41:00.0.0: Firmware VCS version 116786 [0x1c832]
    [ 15.291442] <6>fioinf ioDrive 0000:41:00.0.0: Firmware VCS uid 0xaeb15671994a45642f91efbb214fa428e4245f8a
    [ 15.295298] <6>fioinf ioDrive 0000:41:00.0.0: Powercut flush: Enabled
    [ 15.567414] <6>fioinf ioDrive 0000:41:00.0.0: PCIe power monitor enabled (master). Limit set to 24.750 watts.
    [ 15.567420] <6>fioinf ioDrive 0000:41:00.0.0: Thermal monitoring: Enabled
    [ 15.567423] <6>fioinf ioDrive 0000:41:00.0.0: Hardware temperature alarm set for 85C.
    [ 15.751378] <6>fioinf ioDrive 0000:41:00.0: Found device fct0 (Fusion-io 1.65TB ioScale2 0000:41:00.0) on pipeline 0
    [ 21.041867] <6>fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: probed fct0
    [ 21.162961] <6>fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: sector_size=512
    [ 21.162968] <6>fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: setting channel range data to [2 .. 4095]
    [ 21.340123] <6>fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: Found metadata in EBs 1260-1261, loading...
    [ 22.032594] <6>fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: setting recovered append point 1261+488816640
    [ 22.098885] <6>fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: Creating device of size 1650000000000 bytes with 3222656250 sectors of 512 bytes (118983836 mapped).
    [ 22.103498] fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: Creating block device fioa: major: 252 minor: 0 sector size: 512...
    [ 22.105415] fioa: fioa1
    [ 22.105646] <6>fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: Attach succeeded.
    [ 22.648829] audit: type=1400 audit(1576086304.719:2): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/lxc-start" pid=1871 comm="apparmor_parser"
    [ 22.650574] audit: type=1400 audit(1576086304.719:3): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe" pid=1870 comm="apparmor_parser"
    [ 22.650585] audit: type=1400 audit(1576086304.719:4): apparmor="STATUS" operation="profile_load" profile="unconfined" name="nvidia_modprobe//kmod" pid=1870 comm="apparmor_parser"
    [ 22.659944] audit: type=1400 audit(1576086304.731:5): apparmor="STATUS" operation="profile_load" profile="unconfined" name="/usr/bin/man" pid=1873 comm="apparmor_parser"
    [ 22.659953] audit: type=1400 audit(1576086304.731:6): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_filter" pid=1873 comm="apparmor_parser"
    [ 22.659958] audit: type=1400 audit(1576086304.731:7): apparmor="STATUS" operation="profile_load" profile="unconfined" name="man_groff" pid=1873 comm="apparmor_parser"
    [ 22.677516] audit: type=1400 audit(1576086304.747:8): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default" pid=1874 comm="apparmor_parser"
    [ 22.677524] audit: type=1400 audit(1576086304.747:9): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-cgns" pid=1874 comm="apparmor_parser"
    [ 22.677530] audit: type=1400 audit(1576086304.747:10): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-with-mounting" pid=1874 comm="apparmor_parser"
    [ 22.677536] audit: type=1400 audit(1576086304.747:11): apparmor="STATUS" operation="profile_load" profile="unconfined" name="lxc-container-default-with-nesting" pid=1874 comm="apparmor_parser"
    [ 22.853249] new mount options do not match the existing superblock, will be ignored
    [ 22.859896] vmbr0: port 1(eno3) entered blocking state
    [ 22.859900] vmbr0: port 1(eno3) entered disabled state
    [ 22.860086] device eno3 entered promiscuous mode
    [ 22.989444] softdog: initialized. soft_noboot=0 soft_margin=60 sec soft_panic=0 (nowayout=0)
    [ 23.280709] bpfilter: Loaded bpfilter_umh pid 2034
    [ 26.415797] igb 0000:07:00.0 eno3: igb: eno3 NIC Link is Up 1000 Mbps Full Duplex, Flow Control: RX
    [ 26.416025] vmbr0: port 1(eno3) entered blocking state
    [ 26.416028] vmbr0: port 1(eno3) entered forwarding state
    [ 26.416259] IPv6: ADDRCONF(NETDEV_CHANGE): vmbr0: link becomes ready
  3. Found 1 ioMemory device in this system
    Driver version: 3.2.16 build 1731
    Adapter: ioMono
    Fusion-io 1.65TB ioScale2, Product Number:F11-003-1T65-CS-0001, SN:1412G0900, FIO SN:1412G0900
    ioDrive2 Adapter Controller, PN:PA005004003
    External Power: NOT connected
    PCIe Bus voltage: avg 11.99V
    PCIe Bus current: avg 0.72A
    PCIe Bus power: avg 8.67W
    PCIe Power limit threshold: 24.75W
    PCIe slot available power: 25.00W
    PCIe negotiated link: 4 lanes at 5.0 Gt/sec each, 2000.00 MBytes/sec total
    Connected ioMemory modules:
    fct0: Product Number:F11-003-1T65-CS-0001, SN:1412G0900
    fct0 Attached
    ioDrive2 Adapter Controller, Product Number:F11-003-1T65-CS-0001, SN:1412G0900
    ioDrive2 Adapter Controller, PN:PA005004003
    SMP(AVR) Versions: App Version: 1.0.20.0, Boot Version: 1.0.5.1
    Powerloss protection: protected
    PCI:41:00.0, Slot Number:1
    Vendor:1aed, Device:2001, Sub vendor:1aed, Sub device:2001
    Firmware v7.1.17, rev 116786 Public
    1650.00 GBytes device size
    Format: v500, 3222656250 sectors of 512 bytes
    PCIe slot available power: 25.00W
    PCIe negotiated link: 4 lanes at 5.0 Gt/sec each, 2000.00 MBytes/sec total
    Internal temperature: 48.72 degC, max 50.20 degC
    Internal voltage: avg 1.01V, max 1.01V
    Aux voltage: avg 2.48V, max 2.48V
    Reserve space status: Healthy; Reserves: 100.00%, warn at 10.00%
    Active media: 100.00%
    Rated PBW: 8.00 PB, 99.99% remaining
    Lifetime data volumes:
    Physical bytes written: 601,731,575,416
    Physical bytes read : 106,382,366,064
    RAM usage:
    Current: 675,134,016 bytes
    Peak : 675,134,016 bytes
    Contained VSUs:
    fioa: ID:0, UUID:80bf4175-81c5-408b-aa28-43cb05e3be9e
    fioa State: Online, Type: block device
    ID:0, UUID:80bf4175-81c5-408b-aa28-43cb05e3be9e
    1650.00 GBytes device size
    Format: 3222656250 sectors of 512 bytes

@snuf
Collaborator

snuf commented Dec 12, 2019

@bulgaru nothing strange there. The top part of the dmesg is missing, which shows the driver version and hash etc.; could you paste that part too?

If you look back in time at your log messages before the reboot, did the module get unloaded, and did it give any output when unloading? Generally it's something like this:

[30888.312480] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:06:00.0: Groomer for data log is tearing down
[30888.313368] fiob: data gc halted.
[30888.316189] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:06:00.0: Writing metadata starting at EB 2017 (1838 free 208 used)...
[30888.552133] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:05:00.0: Groomer for data log is tearing down
[30888.553093] fioa: data gc halted.
[30888.554967] <6>fioinf Fusion-io ioDrive Duo 640GB 0000:05:00.0: Writing metadata starting at EB 149 (1735 free 311 used)...
[30888.810031] fioinf Fusion-io ioDrive Duo 640GB 0000:06:00.0: destroying device 1
[30888.818816] fioinf Fusion-io ioDrive Duo 640GB 0000:05:00.0: destroying device 0
[30888.828762] <6>fioinf ioDrive driver unloading...
[30888.930054] <6>fioinf ioDrive driver unloaded...

I'm not familiar with the LVM-thin setup you have, but my guess is there is something off with the metadata on the LVM layer. Looking at lvmdiskscan, do you see the drive?

Oddly enough, when I try to remove the module with an LVM-provisioned VG and LVs, it complains it can't because the volume group is active. Can you check whether your volume group is actually active, or whether you can see it at all?
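
For the record, a quick way to check that with the standard LVM2 tools (<vg> being whatever your volume group is called):

lvmdiskscan                     # is the fio partition listed as an LVM physical volume?
pvs && vgs                      # are the PV and the volume group visible at all?
lvs -o lv_name,lv_attr <vg>     # the 5th attribute character is 'a' when an LV is active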

Best,

@bulgaru
Author

bulgaru commented Dec 12, 2019

These are the only records I see:

[ 12.742456] <6>fioinf VSL configuration hash: 8f82ea05bdf1195cb400fb48e4ef09fc49b3c1aa
[ 12.742579] <6>fioinf
[ 12.742581] <6>fioinf Copyright (c) 2006-2014 Fusion-io, Inc. (acquired by SanDisk Corp. 2014)
[ 12.742581] <6>fioinf Copyright (c) 2014-2016 SanDisk Corp. and/or all its affiliates. (acquired by Western Digital Corp. 2016)
[ 12.742582] <6>fioinf Copyright (c) 2016-2018 Western Digital Technologies, Inc. All rights reserved.
[ 12.742583] <6>fioinf For Terms and Conditions see the License file included
[ 12.742584] <6>fioinf with this driver package.
[ 12.742585] <6>fioinf
[ 12.742586] <6>fioinf ioDrive driver 3.2.16.1731 pinnacles@4870ad45b7ea loading...
[ 12.746359] <6>fioinf ioDrive 0000:41:00.0: mapping controller on BAR 5
[ 12.746480] <6>fioinf ioDrive 0000:41:00.0: MSI enabled
[ 12.746483] <6>fioinf ioDrive 0000:41:00.0: using MSI interrupts
[ 12.782986] <6>fioinf ioDrive 0000:41:00.0.0: Starting master controller
[ 12.814196] cryptd: max_cpu_qlen set to 1000
[ 12.869233] <6>fioinf ioDrive 0000:41:00.0.0: PMP Address: 1 1 1
[ 13.097272] <6>fioinf ioDrive 0000:41:00.0.0: SMP Controller Firmware APP version 1.0.20 0
[ 13.097278] <6>fioinf ioDrive 0000:41:00.0.0: SMP Controller Firmware BOOT version 1.0.5 1

In the syslog I noticed some lines that may or may not be related to the issue:

Dec 12 11:46:00 dc4 iomemory-vsl[2094]: iomemory-vsl is not enabled in the init config '/etc/sysconfig/iomemory-vsl'
Dec 12 11:46:00 dc4 kernel: new mount options do not match the existing superblock, will be ignored

Let me know if you need access to the server - it's available from the internet so you can easily log into Proxmox and run all the tests you need, in case it's quicker and easier.

@bulgaru
Author

bulgaru commented Dec 12, 2019

Additionally, I went to /var/log/kern.log to check the previous session.
I haven't seen anything related to the driver unloading.

As for lvmdiskscan:

/dev/fioa1 [ 1.50 TiB] LVM physical volume
/dev/sda2 [ 512.00 MiB]
/dev/sda3 [ 837.86 GiB]
/dev/sdb2 [ 512.00 MiB]
/dev/sdb3 [ 837.86 GiB]
/dev/sdc1 [ 838.35 GiB]
/dev/sdc9 [ 8.00 MiB]
/dev/sdd1 [ 838.35 GiB]
/dev/sdd9 [ 8.00 MiB]
0 disks
8 partitions
0 LVM physical volume whole disks
1 LVM physical volume

Output of the lvs:

LV VG Attr LSize Pool Origin Data% Meta% Move Log Cpy%Sync Convert
lvol0 vmdata -wi------- 104.00m
lvol1 vmdata -wi------- 100.00m
lvol2 vmdata -wi------- 100.00m
vm-400-disk-0 vmdata Vwi---tz-- 30.00g vmstore
vmstore vmdata twi---tz-- 1.49t
vmstore_meta0 vmdata -wi------- 96.00m

@bulgaru
Author

bulgaru commented Dec 12, 2019

Update #2: After I manually removed the driver via modprobe -r, I got this output:

[ 3390.395044] <6>fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: Groomer for data log is tearing down
[ 3390.403229] fioa: data gc halted.
[ 3390.407273] <6>fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: Writing metadata starting at EB 1266 (2829 free 1263 used)...
[ 3391.720273] fioinf Fusion-io 1.65TB ioScale2 0000:41:00.0: destroying device 0
[ 3391.894750] <6>fioinf ioDrive driver unloading...
[ 3391.982623] <6>fioinf ioDrive driver unloaded...

It is not generated on reboot, so I guess the driver is not detached then. I noticed no errors related to volume groups.

@bulgaru
Author

bulgaru commented Dec 12, 2019

Update #3:
I've run another test: I created a container on the ioScale2 LVM-thin storage to check how its content would react to the driver being unloaded and loaded back. This is the error I get:

modprobe -r iomemory_vsl
modprobe: FATAL: Module iomemory_vsl is in use.

When I try to check what's using the driver, I get this:

lsmod | grep iomemory_vsl
iomemory_vsl 1286144 2

So my best guess is that on reboot the storage is not removed properly, which leads to data corruption.

@snuf
Collaborator

snuf commented Dec 13, 2019

@bulgaru my guess is that the metadata is full or got trashed somehow. Historically this is not uncommon on thin LVM / LVM from what I've read, though in all honesty I've never had it happen myself (knocks on wood). Poking around on the web I did find some pointers: lvs -v should give some more info, and so should lvscan -v. Does the vm-400-disk-0 also live on there, or is that on a separate disk somewhere?

For access where do you want me to send my public key?

@bulgaru
Author

bulgaru commented Dec 13, 2019

@snuf please check your email - the access credentials are there.

There has been a breakthrough thanks to your help. I've discovered that for some reason the volume group is inactive after boot. It may be related to the mechanics of the Fusion-io drivers being loaded into the kernel, or to some Proxmox changes or bugs:

INACTIVE '/dev/vmdata/lvol0' [104.00 MiB] inherit
INACTIVE '/dev/vmdata/lvol1' [100.00 MiB] inherit
INACTIVE '/dev/vmdata/lvol2' [100.00 MiB] inherit
INACTIVE '/dev/vmdata/vmstore' [1.49 TiB] inherit
INACTIVE '/dev/vmdata/vmstore_meta0' [96.00 MiB] inherit
INACTIVE '/dev/vmdata/vm-400-disk-0' [30.00 GiB] inherit
INACTIVE '/dev/vmdata/vm-100-disk-0' [10.00 GiB] inherit

After activating the volume group, all data is back and seems to work perfectly. I've submitted a ticket on the Proxmox forum and we'll see if it can be fixed via some configs. I've never encountered an issue with volume group activation before, even though these cards are used on 4 Proxmox nodes at the moment.
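
For anyone hitting the same symptom, the manual activation amounts to something like:

vgchange -ay vmdata     # activate all logical volumes in the vmdata volume group
lvscan | grep vmdata    # the LVs should now show up as ACTIVE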

@snuf
Collaborator

snuf commented Dec 13, 2019

@bulgaru ah, that is good news!

I sent you an email back. Can we keep this open until we have a conclusive answer to what happened exactly so we can record it for others?

Best,

@bulgaru
Author

bulgaru commented Dec 13, 2019

Agreed! Let's keep it open. I will be glad to post updates related to the issue if something comes out of the Proxmox forum. Thank you again for looking into it!

@snuf
Collaborator

snuf commented Dec 14, 2019

Welcome!

Found a couple of things: vgchange -ay sets everything to active, and upon reboot it's all inactive again. Also, /etc/init.d/iomemory-vsl is not used; it does contain some vgchange logic to stop and start devices when things are started / stopped. I've tried setting:

ENABLED=0
LVM_VGS="/dev/vmdata"
LVM_DETECT=1

in /etc/sysconfig/iomemory-vsl, and then the script works, and after a reboot all LVs are active. This is however probably not what you're looking for? Afterwards I hashed it out again (# ENABLED=0), which disables the startup script, to revert to what it originally was.

Poking in dmesg I found an error for lvm2-activation-generator; it may not be relevant, but it's still noteworthy:

Dec 14 21:26:18 dc4 lvm2-activation-generator: Incorrect number of arguments for activation generator.
Dec 14 21:26:18 dc4 lvm2-activation-generator: Activation generator failed.
Dec 14 21:26:20 dc4 lvm2-activation-generator: Incorrect number of arguments for activation generator.
Dec 14 21:26:20 dc4 lvm2-activation-generator: Activation generator failed.
Dec 14 21:26:24 dc4 lvm2-activation-generator: Incorrect number of arguments for activation generator.
Dec 14 21:26:24 dc4 lvm2-activation-generator: Activation generator failed.
Dec 14 21:26:35 dc4 lvm2-activation-generator: Incorrect number of arguments for activation generator.
Dec 14 21:26:35 dc4 lvm2-activation-generator: Activation generator failed.
Dec 14 21:26:59 dc4 lvm2-activation-generator: Incorrect number of arguments for activation generator.
Dec 14 21:26:59 dc4 lvm2-activation-generator: Activation generator failed.

Another thing I was looking at is that all the Proxmox example configurations show pve as the name used for the thin-provisioned VG. In /etc/lvm/lvm.conf there is a global_filter that contains some directions around it.
My guess is there is something in /etc/lvm/lvm.conf that needs to change. I don't have any more time today; I'll have a look tomorrow if time permits.
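
Purely as an illustration (not verified on this machine), a global_filter that explicitly accepts the Fusion-io device before rejecting others would look roughly like this, using LVM's "a|...|" accept / "r|...|" reject pattern syntax:

# /etc/lvm/lvm.conf, devices { } section - hypothetical example
global_filter = [ "a|/dev/fioa.*|", "r|/dev/zd.*|" ]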

Best,

@bulgaru
Author

bulgaru commented Dec 15, 2019

Hey!

I've played with the LVM for the past 2 days, and here was my starting point: it works great with the SanDisk drivers on Proxmox 5, so I should be able to spot the difference in Proxmox 6. Here are the 4 scenarios I've been looking into:

  1. The issue is with the drivers
  2. The issue is with how the init scripts tie into the OS
  3. The issue is with LVM2 configs, leading to blind spots when initialising vg's
  4. The issue is with OS itself, related to poor configs or poor LVM2 implementation

One by one:

  1. The issue is with the drivers - unlikely. The drivers seem to work perfectly fine after the VGs are initialised: perfect I/O, no corruption, no errors. Basically, the ioDrive2 works just as it did on Proxmox 5 with the OEM drivers. I feel pretty confident that this scenario can be crossed off the list.

  2. The issue is with how the init scripts tie into the OS - this was my main focus. I assumed that certain scripts were either not adjusted to Debian 10, or not fully tailored to the Proxmox Debian 10 based kernel. I think this scenario is also highly unlikely, mostly because Proxmox does not rely on any init scripts and uses the LVM2 tools to initialise disks. Both in Proxmox 5 and 6 the iomemory-vsl script is simply disabled. I feel pretty confident that this scenario can be crossed off the list.

  3. The issue is with LVM2 configs, leading to blind spots when initialising VGs - I've tested several aspects. First, I tried looking into logging (duhhhh...) during activation and, to my surprise, nothing showed up in the logs. I assumed I had messed up the logging settings, but I noticed that even without adjusting them (using defaults), nothing LVM2-related appeared in the logs (unlike in Proxmox 5). My conclusion was that LVM wasn't even trying to activate the VGs (!!!). The simplest explanation would seem to be that global_filter was filtering out the path where the VGs were located. I switched the filters to the ones used in Proxmox 5 and it didn't help (I was pretty sure it would not help, since the regex would not catch the path, so it'd be weird if it did help).

This is where I am at the moment. I think the issue lies somewhere in between poor LVM configs and the way the OS works. I can easily activate the VGs manually, and it would be trivial to add the initialisation at boot (a sketch of one way to do that follows at the end of this comment). The question remains why it does not work out of the box.

Btw, another crossed-out scenario is that the timeline is messed up - that the physical device initialisation occurs after the attempt to initialise the VGs. I'm crossing it out because the device initialisation occurs roughly at the same time in both Proxmox 5 and 6, with almost identical log messages. Yet, unlike in Proxmox 5, where the VGs are initialised by systemd as soon as the device is attached, in Proxmox 6 nothing happens when the device is attached.
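
As for activating at boot, a minimal sketch of such a workaround unit (the unit name and ordering here are made up; only vgchange -ay vmdata comes from this setup):

# /etc/systemd/system/activate-vmdata.service - hypothetical
[Unit]
Description=Activate the vmdata volume group on the Fusion-io card (workaround)
Wants=systemd-udev-settle.service
After=systemd-udev-settle.service

[Service]
Type=oneshot
ExecStart=/sbin/vgchange -ay vmdata

[Install]
WantedBy=multi-user.target

It would be enabled with systemctl enable activate-vmdata.service. This is only a stopgap; the question of why it doesn't work out of the box remains.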

@bulgaru
Author

bulgaru commented Jan 8, 2020

Answer finally found! The reason it works on Proxmox 5 and not Proxmox 6 is a different approach to LVM activation. Activation in Proxmox 6 is based on global/event_activation=1, which, as far as I understand, means that certain events need to be triggered in order for activation to occur. In my case, I need the default activation to occur on boot, and it works great with global/event_activation=0.
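
Concretely, that is this single setting in /etc/lvm/lvm.conf (in the global { } section):

event_activation = 0    # activate VGs directly at boot instead of waiting for udev events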

This is fantastic news, since it basically means that the Fusion-io drives from the 2nd and possibly 3rd generation can be used with Proxmox 6. Thanks for the help and support, @snuf!

@bulgaru bulgaru closed this as completed Jan 8, 2020
@mx-shift
Contributor

I ran into this today. While setting global/event_activation=0 in lvm.conf works, I wanted to figure out what was preventing the activation events from occurring. Looking at /lib/udev/rules.d/69-lvm-metad.rules, a /dev/disk/by-id/lvm-pv-uuid-XXX symlink should be created for any block device that has an LVM fs type as indicated by ENV{ID_FS_TYPE}. Looking at my system after a reboot, that symlink is not created. Let's check udevadm info:

root@clever-chimp:/lib/udev/rules.d# udevadm info /dev/fioa
P: /devices/virtual/block/fioa
N: fioa
L: 0
S: disk/by-path/FIO-
S: dsk/FIO-
E: DEVPATH=/devices/virtual/block/fioa
E: DEVNAME=/dev/fioa
E: DEVTYPE=disk
E: DISKSEQ=34
E: MAJOR=252
E: MINOR=0
E: SUBSYSTEM=block
E: USEC_INITIALIZED=34843909
E: DEVLINKS=/dev/disk/by-path/FIO- /dev/dsk/FIO-
E: TAGS=:systemd:
E: CURRENT_TAGS=:systemd:

Huh. So ID_FS_TYPE isn't being set. That should be set by blkid as noted at the beginning of 69-lvm-metad.rules:

# This rule requires blkid to be called on block devices before so only devices
# used as LVM PVs are processed (ID_FS_TYPE="LVM2_member" or "LVM1_member").

Where does blkid normally get triggered? 60-persistent-storage.rules:

# probe filesystem metadata of disks
KERNEL!="sr*", IMPORT{builtin}="blkid"

So why isn't that happening? Well, much earlier in 60-persistent-storage.rules:

KERNEL!="loop*|mmcblk*[0-9]|msblk*[0-9]|mspblk*[0-9]|nvme*|sd*|sr*|vd*|xvd*|bcache*|cciss*|dasd*|ubd*|ubi*|scm*|pmem*|nbd*|zd*", GOTO="persistent_storage_end"

Oh. I see. Turns out someone already asked to have fio* added to that list in systemd/systemd#3718. Upstream doesn't want to include it since the driver is out-of-tree. I can't really argue with that.

So we need to provide our own udev rules for FusionIO devices. Cribbing from 60-persistent-storage.rules, the following works as a minimal set of udev rules for FusionIO:

# persistent storage links: /dev/disk/{by-id,by-uuid,by-label,by-path}
# scheme based on "Linux persistent device names", 2004, Hannes Reinecke <hare@suse.de>

ACTION=="remove", GOTO="persistent_storage_end"
ENV{UDEV_DISABLE_PERSISTENT_STORAGE_RULES_FLAG}=="1", GOTO="persistent_storage_end"

SUBSYSTEM!="block", GOTO="persistent_storage_end"
KERNEL!="fio*", GOTO="persistent_storage_end"

# ignore partitions that span the entire disk
TEST=="whole_disk", GOTO="persistent_storage_end"

# For partitions import parent disk ID_* information, except ID_FS_*.
#
# This is particularly important on media where a filesystem superblock and
# partition table are found on the same level, e.g. common Linux distro ISO
# installation media.
#
# In the case where a partition device points to the same filesystem that
# was detected on the parent disk, the ID_FS_* information is already
# present on the partition devices as well as the parent, so no need to
# propagate it. In the case where the partition device points to a different
# filesystem, merging the parent ID_FS_ properties would lead to
# inconsistencies, so we avoid doing so.
ENV{DEVTYPE}=="partition", \
  IMPORT{parent}="ID_[!F]*", IMPORT{parent}="ID_", \
  IMPORT{parent}="ID_F[!S]*", IMPORT{parent}="ID_F", \
  IMPORT{parent}="ID_FS[!_]*", IMPORT{parent}="ID_FS"

# probe filesystem metadata of disks
IMPORT{builtin}="blkid"

# by-label/by-uuid links (filesystem metadata)
ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_UUID_ENC}=="?*", SYMLINK+="disk/by-uuid/$env{ID_FS_UUID_ENC}"
ENV{ID_FS_USAGE}=="filesystem|other|crypto", ENV{ID_FS_LABEL_ENC}=="?*", SYMLINK+="disk/by-label/$env{ID_FS_LABEL_ENC}"

# by-partlabel/by-partuuid links (partition metadata)
ENV{ID_PART_ENTRY_UUID}=="?*", SYMLINK+="disk/by-partuuid/$env{ID_PART_ENTRY_UUID}"
ENV{ID_PART_ENTRY_SCHEME}=="gpt", ENV{ID_PART_ENTRY_NAME}=="?*", SYMLINK+="disk/by-partlabel/$env{ID_PART_ENTRY_NAME}"

LABEL="persistent_storage_end"

Put that in /etc/udev/rules.d/60-persistent-storage-fio.rules and ID_FS_TYPE gets populated correctly, which then triggers the creation of the /dev/disk/by-id/lvm-pv-uuid-XXX symlink:

root@clever-chimp:/etc/udev/rules.d# udevadm info /dev/fioa
P: /devices/virtual/block/fioa
N: fioa
L: 0
S: disk/by-path/FIO-
S: dsk/FIO-
S: disk/by-id/lvm-pv-uuid-PSyuFK-r8RJ-KUe9-D0ND-GNGj-0ghp-3l3KYy
E: DEVPATH=/devices/virtual/block/fioa
E: DEVNAME=/dev/fioa
E: DEVTYPE=disk
E: DISKSEQ=34
E: MAJOR=252
E: MINOR=0
E: SUBSYSTEM=block
E: USEC_INITIALIZED=34843909
E: ID_FS_UUID=PSyuFK-r8RJ-KUe9-D0ND-GNGj-0ghp-3l3KYy
E: ID_FS_UUID_ENC=PSyuFK-r8RJ-KUe9-D0ND-GNGj-0ghp-3l3KYy
E: ID_FS_VERSION=LVM2 001
E: ID_FS_TYPE=LVM2_member
E: ID_FS_USAGE=raid
E: SYSTEMD_READY=1
E: SYSTEMD_ALIAS=/dev/block/252:0
E: SYSTEMD_WANTS=lvm2-pvscan@252:0.service
E: DEVLINKS=/dev/disk/by-path/FIO- /dev/dsk/FIO- /dev/disk/by-id/lvm-pv-uuid-PSyuFK-r8RJ-KUe9-D0ND-GNGj-0ghp-3l3KYy
E: TAGS=:systemd:
E: CURRENT_TAGS=:systemd:

Further improvements could be made by shipping a script that parses fio-status output such that udev could import things like the UUIDs for each /dev/fioX. That would allow for creating /dev/disk/by-id/fio-XXX symlinks.

I'm unclear whether this repo is the right place to maintain and install udev rules for these devices. The minimal rules I wrote above do not require any of the programs from fio-util, so maybe that's OK.
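
For anyone trying the rules out without a reboot, standard udevadm usage (nothing specific to this driver) is enough to re-evaluate the device:

udevadm control --reload-rules
udevadm trigger /dev/fioa
udevadm info /dev/fioa | grep ID_FS_TYPE   # should now report ID_FS_TYPE=LVM2_member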

@mx-shift
Contributor

And I just noticed that tools/udev/rules.d/60-persistent-fio.rules already exists. It's missing the import of blkid to discover the LVM PVs and create the matching symlinks. I'll send a PR shortly.

mx-shift added a commit to mx-shift/iomemory-vsl that referenced this issue Oct 22, 2022
LVM auto activation is triggered by udev discovering the LVM PVs and
populating ID_FS_TYPE.  60-persistent-fio.rules already runs blkid on
partitions on the fioX device to create /dev/disk/by-uuid symlinks.
Add an import of filesystem metadata on the fioX device using udev's
blkid builtin.

Fixes RemixVSL#55
@kiloforce

My OSDs kept going offline after a reboot. Was pointed to this issue, and this solved it. Thanks!

This udev rule should really be added to the Wiki page; it's needed for Proxmox 8 as well. It would also be good to create a dedicated Proxmox page on the Wiki. Thanks!

@bplein
Collaborator

bplein commented Jan 28, 2024

We are happy to have you create the documentation and we can approve/add it.

@kiloforce

Want to create a repo for the wiki, so that pull requests can be submitted?

@bplein
Collaborator

bplein commented Jan 29, 2024

For now, just submit a PR that is a file called proxmox.md in the root of the repo and then we can figure out where that lives. We do appreciate the support.
