VMD hotplug init from config file fails for multiple domains #3370
Debug build info including DPDK parameters on init:
Hi @tanabarr, this is a pretty old SPDK; can you reproduce this on v24.01 or latest master?
SPDK JSON config file contents:
@jimharris not easily, but I will try. I think we will run into various other errors upgrading to v24, but it's definitely the logical thing to do.
This is something in DAOS that is calling
But my understanding was that the hotplug_busid_range (beginning of your config file) takes care of deciding which process takes control of a newly found SSD. Normally this would be used when an SSD is hot inserted at some later time, but it should work just as well at initial start up.
I think we need to understand this call to
@tanabarr can you turn on the vmd debug log flag, re-run your test, and post the log here?
Yes, after attaching, when loading the config, bdevs get enumerated and blobstores loaded/created.
The range in daos_data is decoded and applied in
In terms of device selection in VMD mode, DAOS calls
The registered filter function only gets called when the bdev/nvme hotplug poller tries to look for new (or, more precisely, non-attached) devices. If you explicitly attach using bdev_nvme_attach_controller, that filter function will not fire.
For each VMD PCI function that is bound to vfio-pci, you will only be able to bind it to one process, so you will need to do filtering through spdk_env_init_opts. This is probably the problem. Without VMD, any new devices are globally accessible through the kernel's PCI subsystem, so you need that filter function to make sure only one of your hotplug-enabled processes tries to enable it. But I'm not sure how that filter function would work with VMD; it would need to pick whichever bus addresses VMD would choose for that VMD endpoint. I don't think you even need the filter function with VMD: if the VMD device can only be attached to one SPDK process, then by default all SSDs behind that VMD endpoint need to be attached to that process, and no other process will be able to access them.
daos_engine.0.log
Attached are the logs from a run with SPDK v22.01 (my attempts to build with v24 failed due to time constraints, specifically linking issues in our scons build; I will try again later). The run used the original DAOS source as described in the description, with none of the experimental workarounds I have been looking at. The only source changes are to enable a debug build and the "vmd" debug flag.
@jimharris @ksztyber @NiuYawei please let me know what you think and whether any logs from comparison runs would be useful. Note that I have tried some experiments with not setting hotplug poller bus-ID ranges, and with expanding the range to allow all. Those experiments yielded the same failure.
From the logs, it looks like the VMD driver has enumerated the device that bdev_nvme complains about. I suspect there is a race between the hotplug poller and the
If I'm right, moving the
@tanabarr, could you check it?
@ksztyber apologies for the delay in response, I will try the suggestion this week. |
Yes, changing the RPC order as suggested works around the issue. See the attached good/bad config files.
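For illustration, here is a minimal sketch of the kind of reordering involved, assuming the fix is to issue bdev_nvme_set_hotplug only after the bdev_nvme_attach_controller entries so the explicit attaches win the race with the hotplug poller. The attached good/bad config files are authoritative; the bdev names and traddr values below are placeholders.

```json
{
  "subsystems": [
    {
      "subsystem": "bdev",
      "config": [
        {
          "method": "bdev_nvme_attach_controller",
          "params": { "name": "Nvme_0", "trtype": "PCIe", "traddr": "5d0505:01:00.0" }
        },
        {
          "method": "bdev_nvme_attach_controller",
          "params": { "name": "Nvme_1", "trtype": "PCIe", "traddr": "d70505:01:00.0" }
        },
        {
          "method": "bdev_nvme_set_hotplug",
          "params": { "enable": true, "period_us": 5000000 }
        }
      ]
    }
  ]
}
```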
Sighting report
Failure when parsing the config file when using two VMD domains with hotplug enabled. With a single domain, or if hotplug is not enabled in the config, the application works as expected.
Expected Behavior
The application is expected to parse the JSON config file successfully and recognise the VMD-backing-device addresses specified in attach directives.
Current Behavior
The error is seen after calling `spdk_subsystem_init_from_json_config()`, when `cp_arg.cca_rc` gets set to -1003 in `xs_poll_completion()`. The following is the application calling code:
The following errors are reported by SPDK:
It seems that, for some reason, when hotplug is enabled the second VMD domain can't be discovered.
If I hack the application to run the same sequence with hotplug enabled but without bdev_nvme_attach_controller calls in the config, the app proceeds past the `spdk_subsystem_init_from_json_config()` call (but then fails shortly after because `spdk_bdev_first()` returns `NULL`).
Konrad explained that in the VMD-hotplug case it shouldn't be necessary to specify the devices explicitly in config attach entries, as any devices attached to a VMD domain bound to the vfio driver should be discoverable by the SPDK process and therefore automatically detected and attached.
Not explicitly specifying the devices in attach entries in the config doesn't really work with the application's current implementation:
To clarify the problem space, experiments verify the following behaviour:
spdk_bdev_first()
Note that the VMD domains are on different sockets (just for extra context), but I doubt that makes any difference.
As Konrad has already observed, it might be to do with a race between the app and the hotplug poller.
Possible Solution
When attach config entries are removed, the app fails on `spdk_bdev_first()` after SPDK has been initialised from config. In order to discover the devices in the VMD-hotplug case, it would be necessary to wait for the hotplug poller to detect, attach, and init bdevs before continuing. This wouldn't be an ideal workaround, as the application is currently dependent on information encoded in the bdev name. Ideally the current method of explicit device specification in config would work for multiple VMD domains per process.
Steps to Reproduce
As above.
Context (Environment including OS version, SPDK version, etc.)
Rocky Linux 8.7
SPDK 22.01
DAOS 2.6 (master as of 07-May-2024)