
Error with secondary and up devices #550

Closed
lars18th opened this issue Jan 9, 2024 · 8 comments

Comments

@lars18th

lars18th commented Jan 9, 2024

Hi @kroese ,

When using the DEVICE variables, only the first one works. When you connect any of DEVICE2 through DEVICE6, the QEMU driver seems to have trouble with the controller and prints this:

dsm  | [   13.230684] scsi host8: Virtio SCSI HBA
dsm  | [   13.390521] scsi 8:0:0:0: Direct-Access     QEMU     QEMU HARDDISK            2.5+ PQ: 0 ANSI: 5
dsm  | [   13.407881] want_idx 0 index 1. delay and reget
dsm  | [   14.413161] want_idx 0 index 1
dsm  | [   14.418849] want_idx 0 index 1. delay and reget
dsm  | [   15.422182] want_idx 0 index 1
[it repeats a lot of times]

And then the disk is not attached.

You can check it using only one device. Change from this:

    environment:
      DEVICE: "/dev/sdc"

which works, to this:

    environment:
      DEVICE2: "/dev/sdc"

And then you'll see the trouble (with DEVICE2, DEVICE6, or any other).

@kroese
Collaborator

kroese commented Jan 9, 2024

Really strange, the relevant code is:

addDevice "userdata7" "$DEVICE" "6" "0xf" || exit $?
addDevice "userdata8" "$DEVICE2" "7" "0x5" || exit $?
addDevice "userdata9" "$DEVICE3" "8" "0x6" || exit $?
addDevice "userdata4" "$DEVICE4" "9" "0x7" || exit $?
addDevice "userdata5" "$DEVICE5" "10" "0x8" || exit $?
addDevice "userdata6" "$DEVICE6" "11" "0x9" || exit $?

As you can see, from disk2 and higher I assign address 0x5 and up. Maybe it is caused by these lower addresses?

You could try replacing 0x5 by 0xd and see if it makes a difference? I cannot test this myself as I have no spare disk devices on my system.
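To make the suggested experiment concrete, here is a hedged sketch of what the change would look like. The addDevice stub below is not the real disk.sh implementation (which builds the actual QEMU disk arguments); it only echoes the PCI address it is given, so the 0x5 → 0xd change is visible:

```shell
# Stub of addDevice purely for illustration -- the real function in
# disk.sh constructs QEMU arguments; this one just prints its inputs.
addDevice() {
  _id="$1"; _dev="$2"; _index="$3"; _addr="$4"
  [ -n "$_dev" ] || return 0          # skip unset DEVICE variables
  echo "device ${_id} (${_dev}) at PCI addr ${_addr}"
}

DEVICE="/dev/sdb"; DEVICE2="/dev/sdc"

addDevice "userdata7" "$DEVICE"  "6" "0xf" || exit $?
addDevice "userdata8" "$DEVICE2" "7" "0xd" || exit $?   # 0xd instead of the original 0x5
```

The only change versus the current code is the last argument of the DEVICE2 line (and likewise 0xe, 0xf, ... for the remaining devices, if the hypothesis about low addresses holds).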

@lars18th
Author

lars18th commented Jan 9, 2024

> You could try replacing 0x5 by 0xd and see if it makes a difference? I cannot test this myself as I have no spare disk devices on my system.

I'll try it.

@kroese
Collaborator

kroese commented Jan 11, 2024

I looked into this again, and to start: these want_idx errors are printed by DSM, not by QEMU.

The first possibility is that vDSM only supports one physical drive (in which case there is nothing we can do about it), but the term idx is short for index, so I am pretty sure it is just something about the way the SCSI disks are presented to DSM.

One thing I noticed is that in disk.sh all disks are added with scsi-id=0, so they all have a duplicate ID (0). This should not matter because they are all on a different bus (controller), but maybe you can change this to scsi-id=$DISK_INDEX so that each device gets a unique SCSI ID, and see if it makes any difference for vDSM.

Another thing that could be the issue is that each disk gets its own controller, and maybe vDSM expects all the drives to be connected to the same controller. So instead of having multiple virtio-scsi-pci devices and multiple scsi-hd devices, it could be worth a try to have one single virtio-scsi-pci device for all the drives.
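As a sketch of that single-controller idea (the device IDs, image paths, and the 0xc address below are illustrative assumptions, not what disk.sh actually emits): one virtio-scsi-pci controller, with every scsi-hd attached to its bus and a unique scsi-id per disk:

```shell
# Illustrative only: build a QEMU argument string with a single
# virtio-scsi-pci controller and all disks on its bus, each disk
# getting a unique scsi-id via an incrementing index.
ARGS="-device virtio-scsi-pci,id=scsi0,addr=0xc"
DISK_INDEX=0
for IMG in /storage/data.img /storage/data2.img; do
  ARGS="$ARGS -drive file=$IMG,if=none,id=drive$DISK_INDEX"
  ARGS="$ARGS -device scsi-hd,bus=scsi0.0,drive=drive$DISK_INDEX,scsi-id=$DISK_INDEX"
  DISK_INDEX=$((DISK_INDEX + 1))
done
echo "$ARGS"
```

Attaching every scsi-hd to bus scsi0.0 is what makes the disks share one controller; the per-disk scsi-id then has to be unique, which is exactly the scsi-id=$DISK_INDEX change suggested above.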

@lars18th
Author

Thank you @kroese. Very useful information. I'll test with the scsi-id= parameter and the virtual controllers.

@kroese
Collaborator

kroese commented Jan 11, 2024

Now that I think about it: as you said, changing DEVICE to DEVICE2 was enough to make it break, so it cannot be any of the reasons I mentioned, because in both cases there is a single controller and a single disk.

So the only difference between using DEVICE and DEVICE2 is the PCI address (0xf vs 0x5). It's possible that vDSM does not expect these lower numbers. Currently:

Boot drive = 0xA
System drive = 0xB
Data disk = 0xC
Data disk 2 = 0xD
etc...

So normally the addresses are in increasing order. Maybe vDSM does not expect to have a drive that has a lower number than the previous drives?
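If that hypothesis is right, the fix amounts to handing out PCI addresses in strictly increasing order. A minimal illustration of the scheme listed above (the real assignment is in the linked disk.sh; this just prints the increasing sequence starting at 0xa):

```shell
# Illustrative: assign strictly increasing PCI addresses starting at
# 0xa, so no later drive ever gets a lower address than an earlier one.
ADDR=10                                  # decimal 10 == 0xa
for NAME in boot system data data2 data3; do
  printf '%s drive -> 0x%x\n' "$NAME" "$ADDR"
  ADDR=$((ADDR + 1))
done
```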

Can you try with this disk.sh file? It assigns addresses differently (and also fixes your dummy drive issue):

https://raw.githubusercontent.com/vdsm/virtual-dsm/scsi/src/disk.sh

@lars18th
Author

> Can you try with this disk.sh file? It assigns addresses differently (and also fixes your dummy drive issue):
>
> https://raw.githubusercontent.com/vdsm/virtual-dsm/scsi/src/disk.sh

I'll check ASAP. Please stand by.

@lars18th
Author

Hi @kroese ,

You are the master! It works like a charm! Tested with this config:

[...]
    environment:
      DISK_SIZE: "6G"
      ALLOCATE: "N"
      DISK_FMT: "qcow2"
      DEVICE: "/dev/sda"
      DEVICE2: "/dev/sdb"
      DEVICE3: "/dev/sdc"
    devices:
      - /dev/kvm
      - /dev/sda
      - /dev/sdb
      - /dev/sdc
    volumes:
      - /opt/virtual-dsm/storage:/storage
      - /opt/virtual-dsm/disk.sh:/run/disk.sh
[...]

And now the container starts without trouble and only the three devices are used. Congratulations!

I hope you'll merge it soon. I'll also test a fresh install without DISK_SIZE, ALLOCATE and DISK_FMT. The idea is to not create the initial disk image at all, or at most create an empty one with minimal data and then never use it.

😉

@kroese
Collaborator

kroese commented Jan 12, 2024

Fixed by #552

@kroese kroese closed this as completed Jan 12, 2024