Dragonball: add pci vfio passthrough, hot(un)plug support #8740
studychao commented on Dec 27, 2023 (edited)
src/dragonball/Cargo.toml
Outdated
vfio-bindings = { version = "0.3.0", optional = true}
vfio-ioctls = { version = "0.1.0", optional = true}
Missing a space after true.
- vfio-bindings = { version = "0.3.0", optional = true}
- vfio-ioctls = { version = "0.1.0", optional = true}
+ vfio-bindings = { version = "0.3.0", optional = true }
+ vfio-ioctls = { version = "0.1.0", optional = true }
fixed
7d731ca to 4511761
ae6a7d0 to 82c02fa
Thanks @studychao, a few comments here.
#[cfg(feature = "host-device")]
fn add_vfio_device(&self, vmm: &mut Vmm, config: HostDeviceConfig) -> VmmRequestResult {
    let vm = vmm.get_vm_mut().ok_or(VmmActionError::HostDeviceConfig(
        VfioDeviceError::InvalidVMID,
Nit: InvalidVMID -> InvalidVmId
I'll raise a separate PR to fix this.
#8747 is raised for this problem.
let mut ctx = vm.create_device_op_context(None).map_err(|e| {
    info!("create device op context error: {:?}", e);
    if let StartMicroVmError::MicroVMAlreadyRunning = e {
Nit: MicroVMAlreadyRunning -> MicroVmAlreadyRunning
ditto
let (sender, receiver) = unbounded();

let vfio_manager = vm.device_manager.vfio_manager.lock().unwrap();
Could you add a comment describing why this unwrap() is safe?
done
    }
})?;

let mut vfio_manager = vm.device_manager.vfio_manager.lock().unwrap();
Ditto.
done
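For context on the two lock().unwrap() threads above: the PR resolved them by adding comments. An alternative is to map a poisoned mutex to an error rather than panicking. The following is only a minimal sketch of that alternative; DeviceError, the simplified VfioManager struct, lock_manager, and add_device are hypothetical names invented for illustration, not Dragonball's actual types.

```rust
use std::sync::{Arc, Mutex, MutexGuard};

/// Hypothetical error type, for illustration only.
#[derive(Debug, PartialEq)]
pub enum DeviceError {
    /// Another thread panicked while holding the manager lock.
    LockPoisoned,
}

/// Hypothetical, heavily simplified stand-in for a VFIO device manager.
#[derive(Default)]
pub struct VfioManager {
    pub devices: Vec<String>,
}

/// Lock the manager, turning a poisoned mutex into an error instead of a panic.
pub fn lock_manager(
    mgr: &Mutex<VfioManager>,
) -> Result<MutexGuard<'_, VfioManager>, DeviceError> {
    mgr.lock().map_err(|_| DeviceError::LockPoisoned)
}

/// Record a device id and return the number of devices now tracked.
pub fn add_device(mgr: &Arc<Mutex<VfioManager>>, id: &str) -> Result<usize, DeviceError> {
    let mut guard = lock_manager(mgr)?;
    guard.devices.push(id.to_string());
    Ok(guard.devices.len())
}
```

Whether to propagate the error or keep unwrap() with a justifying comment depends on whether a poisoned lock is considered recoverable in the caller.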
self.bus
    .upgrade()
    .unwrap()
Ditto.
Fixed; this will return an error here.
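Returning an error when a Weak reference can no longer be upgraded might look roughly like the sketch below. BusError, the simplified PciBus, and Device are hypothetical stand-ins for illustration; the actual change in the PR is not shown on this page.

```rust
use std::sync::{Arc, Weak};

/// Hypothetical error type, for illustration only.
#[derive(Debug, PartialEq)]
pub enum BusError {
    /// The bus was dropped before the device tried to use it.
    BusDropped,
}

/// Hypothetical, simplified PCI bus.
pub struct PciBus {
    pub id: u8,
}

/// Hypothetical device that holds a weak back-reference to its bus.
pub struct Device {
    bus: Weak<PciBus>,
}

impl Device {
    pub fn new(bus: &Arc<PciBus>) -> Self {
        Device {
            bus: Arc::downgrade(bus),
        }
    }

    /// Return the bus id, or an error if the bus no longer exists.
    pub fn bus_id(&self) -> Result<u8, BusError> {
        let bus = self.bus.upgrade().ok_or(BusError::BusDropped)?;
        Ok(bus.id)
    }
}
```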
use crate::device_manager::DeviceManagerContext;
use crate::resource_manager::ResourceManager;

///we only support one pci bus
- ///we only support one pci bus
+ /// we only support one pci bus
done
base: 0xcf8,
size: 0x8,
Give some comments?
comment added
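As background for the kind of comment requested here: on x86, I/O ports 0xcf8 to 0xcff are the legacy PCI configuration access window (CONFIG_ADDRESS at 0xcf8, CONFIG_DATA at 0xcfc), which presumably is what the base and size above describe. The sketch below encodes a CONFIG_ADDRESS value; the constant and function names are illustrative assumptions, not taken from the PR.

```rust
/// Legacy x86 PCI configuration ports (configuration mechanism #1).
pub const PCI_CONFIG_ADDRESS: u16 = 0xcf8; // 32-bit address register
pub const PCI_CONFIG_DATA: u16 = 0xcfc; // 32-bit data register
/// Both registers together span 8 bytes of I/O port space.
pub const PCI_CONFIG_IO_SIZE: u16 = 0x8;

/// Build the value written to CONFIG_ADDRESS to select a config register.
/// Layout: bit 31 enable, bits 23..16 bus, 15..11 device, 10..8 function,
/// bits 7..2 register offset (dword aligned).
pub fn config_address(bus: u8, device: u8, function: u8, offset: u8) -> u32 {
    (1u32 << 31)
        | ((bus as u32) << 16)
        | (((device & 0x1f) as u32) << 11)
        | (((function & 0x07) as u32) << 8)
        | ((offset & 0xfc) as u32)
}

/// Check whether an I/O port falls inside the 8-byte config window.
pub fn in_config_window(port: u16) -> bool {
    (PCI_CONFIG_ADDRESS..PCI_CONFIG_ADDRESS + PCI_CONFIG_IO_SIZE).contains(&port)
}
```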
src/dragonball/src/error.rs
Outdated
@@ -205,6 +207,14 @@ pub enum StartMicroVmError {
    VhostUserNetDeviceError(
        #[source] device_manager::vhost_user_net_dev_mgr::VhostUserNetDeviceError,
    ),
    #[cfg(feature = "host-device")]
    /// Failed to create VFIO device
    #[error("cannot create VFIO device")]
Print error stack?
- #[error("cannot create VFIO device")]
+ #[error("cannot create VFIO device: {0:?}")]
done
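The difference between {0} and {0:?} in these messages is Display versus Debug formatting of the wrapped error. The sketch below shows the effect using only the standard library; the InvalidConfig variant, the CreateVfioDevice wrapper struct, and the two helper functions are hypothetical and exist only to make the contrast visible.

```rust
use std::fmt;

/// Hypothetical inner error with one illustrative variant.
#[derive(Debug)]
pub enum VfioDeviceError {
    InvalidConfig { reason: String },
}

impl fmt::Display for VfioDeviceError {
    fn fmt(&self, f: &mut fmt::Formatter<'_>) -> fmt::Result {
        match self {
            // Display deliberately hides the structured detail.
            VfioDeviceError::InvalidConfig { .. } => write!(f, "invalid VFIO device config"),
        }
    }
}

/// Hypothetical wrapper mirroring a one-field error variant.
#[derive(Debug)]
pub struct CreateVfioDevice(pub VfioDeviceError);

/// What a `{0}` message would contain: the inner error's Display output.
pub fn display_message(e: &CreateVfioDevice) -> String {
    format!("cannot create VFIO device: {}", e.0)
}

/// What a `{0:?}` message would contain: the inner error's Debug output.
pub fn debug_message(e: &CreateVfioDevice) -> String {
    format!("cannot create VFIO device: {:?}", e.0)
}
```

Debug output exposes variant names and fields, which is usually what a reviewer wants in logs; Display is better suited to messages shown to end users.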
src/dragonball/src/error.rs
Outdated
CreateVfioDevice(#[source] VfioDeviceError),
#[cfg(feature = "host-device")]
/// Failed to register DMA memory address range.
#[error("failure while registering DMA address range: {0}")]
- #[error("failure while registering DMA address range: {0}")]
+ #[error("failure while registering DMA address range: {0:?}")]
done
#[cfg(feature = "host-device")]
/// Failed to register DMA memory address range.
#[error("failure while registering DMA address range: {0}")]
RegisterDMAAddress(#[source] VfioDeviceError),
Nit: RegisterDMAAddress -> RegisterDmaAddress
#[cfg(feature = "host-device")]
/// The action `InsertHostDevice` failed either because of bad user input or an internal error.
#[error("failed to add VFIO passthrough device: {0}")]
Use {0:?} here.
same below
done
@@ -17,6 +17,8 @@ use std::cmp::{Ord, PartialOrd};
use std::convert::TryFrom;
use std::sync::Mutex;

use downcast_rs::{impl_downcast, Downcast};
use std::any instead of downcast_rs
done
@@ -73,11 +73,7 @@ impl PciDevice for PciHostBridge {
    }
}

impl DeviceIo for PciHostBridge {
    fn as_any(&self) -> &dyn std::any::Any {
we can use as_any, not downcast
done
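The as_any approach suggested in the two threads above relies only on std::any::Any, so no downcast_rs dependency is needed. A minimal sketch follows; the trait and struct here are simplified stand-ins that reuse the names DeviceIo and PciHostBridge for readability, while OtherDevice and host_bridge_id are hypothetical.

```rust
use std::any::Any;

/// Simplified stand-in for a device trait that exposes itself as `Any`.
pub trait DeviceIo {
    fn as_any(&self) -> &dyn Any;
}

/// Simplified stand-in for the host bridge device.
pub struct PciHostBridge {
    pub id: u8,
}

/// Hypothetical second device type, used to show a failed downcast.
pub struct OtherDevice;

impl DeviceIo for PciHostBridge {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

impl DeviceIo for OtherDevice {
    fn as_any(&self) -> &dyn Any {
        self
    }
}

/// Recover the concrete type from a trait object; None if it is another type.
pub fn host_bridge_id(dev: &dyn DeviceIo) -> Option<u8> {
    dev.as_any().downcast_ref::<PciHostBridge>().map(|b| b.id)
}
```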
@@ -884,7 +884,7 @@ impl Region {
    }
}

- struct VfioPciDeviceState<C: PciSystemContext> {
+ pub struct VfioPciDeviceState<C: PciSystemContext> {
Why make state a pub struct?
75cdc46 to c3dc571
A few additional comments.
}

impl VfioPciDeviceConfig {
    ///default pci domain is 0
- ///default pci domain is 0
+ /// default pci domain is 0
added
pub fn host_pci_domain(&self) -> u32 {
    0
}
pub fn valid_vendor_device(&self) -> bool {
Add a blank line here.
added
            error!("send upcall result failed, due to {:?}!", e);
        }
    }
    _ => {}
Do something? Like print some logs or return an error?
add the match arm for test case here.
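One way to address a silent wildcard arm is to report the unexpected value instead of discarding it. The sketch below returns an error from the catch-all arm; the UpcallResponse enum and handle_response function are hypothetical and only illustrate the reviewer's suggestion, not the change that was actually made in the PR.

```rust
/// Hypothetical response type, for illustration only.
#[derive(Debug, PartialEq)]
pub enum UpcallResponse {
    /// The guest finished the request with a result code.
    Done(i32),
    /// A response kind this caller does not expect.
    Other,
}

/// Handle a response, surfacing unexpected variants instead of ignoring them.
pub fn handle_response(resp: UpcallResponse) -> Result<i32, String> {
    match resp {
        UpcallResponse::Done(code) => Ok(code),
        // Report the unexpected variant so the caller can log or propagate it.
        other => Err(format!("unexpected upcall response: {:?}", other)),
    }
}
```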
        .iter()
        .position(|info| info.config.id().eq(id))
}
/// Register guest memory to the VFIO container.
Add a blank line.
added
    }
    Ok(())
}
pub(crate) fn register_memory_region(&mut self, region: &GuestRegionMmap) -> Result<()> {
Add a blank line.
added
    let readonly = region.prot() & libc::PROT_WRITE == 0;
    self.register_region(gpa, size, user_addr, readonly)
}
pub(crate) fn register_region(
Add a blank line.
added
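A side note on the readonly check in the hunk above: in Rust, & binds more tightly than ==, so prot & PROT_WRITE == 0 parses as (prot & PROT_WRITE) == 0, unlike in C. The sketch below demonstrates this with local constants that mirror the Linux values of libc::PROT_READ and libc::PROT_WRITE; is_readonly is a hypothetical helper, not a function from the PR.

```rust
/// Local copies of the Linux protection flags, so the sketch needs no `libc` crate.
pub const PROT_READ: i32 = 0x1;
pub const PROT_WRITE: i32 = 0x2;

/// A mapping is treated as read-only when the write bit is not set.
/// Rust parses this as `(prot & PROT_WRITE) == 0`.
pub fn is_readonly(prot: i32) -> bool {
    prot & PROT_WRITE == 0
}
```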
        Ok(cfg.sysfs_path.clone())
    }
}
/// Get all PCI devices' legacy irqs
Add a blank line.
added
    }
    Ok(self.pci_vfio_manager.as_mut().unwrap())
}
/// Get the PCI manager to support PCI device passthrough
Add a blank line here.
added.
c3dc571 to 81dd0d4
Introduce a new VMM action, InsertHostDevice, to pass host PCI devices such as NICs or GPUs through to the guest so that users can use those devices with high performance.
fixes: kata-containers#8741
Signed-off-by: Gerry Liu <gerry@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Introduce two new VMM actions to implement PCI hotplug and hot-unplug: PrepareRemoveHostDevice and RemoveHostDevice. PrepareRemoveHostDevice issues an upcall to unregister the PCI device in the guest kernel. RemoveHostDevice must be called after PrepareRemoveHostDevice; it cleans up the PCI resources on the Dragonball side.
fixes: kata-containers#8741
Signed-off-by: Gerry Liu <gerry@linux.alibaba.com>
Signed-off-by: Zizheng Bian <zizheng.bian@linux.alibaba.com>
Signed-off-by: Zha Bin <zhabin@linux.alibaba.com>
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
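The ordering requirement in this commit message (prepare first, then remove) can be pictured as a small state machine. The sketch below is an assumption-laden illustration only: DeviceState, UnplugError, HostDevice, and the method names are invented here, and the real actions involve an upcall to the guest kernel and resource cleanup that this sketch replaces with comments.

```rust
/// Hypothetical lifecycle states of a passed-through device.
#[derive(Debug, Clone, Copy, PartialEq)]
pub enum DeviceState {
    Attached,
    PreparedForRemoval,
    Removed,
}

/// Hypothetical errors for out-of-order requests.
#[derive(Debug, PartialEq)]
pub enum UnplugError {
    NotAttached,
    NotPrepared,
}

/// Hypothetical device record tracking only the unplug state.
pub struct HostDevice {
    state: DeviceState,
}

impl HostDevice {
    pub fn new() -> Self {
        HostDevice { state: DeviceState::Attached }
    }

    pub fn state(&self) -> DeviceState {
        self.state
    }

    /// Phase 1: a real implementation would ask the guest to unregister the device.
    pub fn prepare_remove(&mut self) -> Result<(), UnplugError> {
        if self.state != DeviceState::Attached {
            return Err(UnplugError::NotAttached);
        }
        self.state = DeviceState::PreparedForRemoval;
        Ok(())
    }

    /// Phase 2: a real implementation would release the host-side PCI resources.
    pub fn remove(&mut self) -> Result<(), UnplugError> {
        if self.state != DeviceState::PreparedForRemoval {
            return Err(UnplugError::NotPrepared);
        }
        self.state = DeviceState::Removed;
        Ok(())
    }
}
```

Calling remove before prepare_remove is rejected, which mirrors the requirement that RemoveHostDevice follow PrepareRemoveHostDevice.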
81dd0d4 to b35c6c5
Add the PCI add and del guest kernel patch as an extension on the upcall device manager server side. Also, bump the config version to 120, since we need to add config for Dragonball PCI in upcall.
fixes: kata-containers#8741
Signed-off-by: Gerry Liu <gerry@linux.alibaba.com>
Signed-off-by: Helin Guo <helinguo@linux.alibaba.com>
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
b35c6c5 to f9e0a4b
Lgtm, thanks!
LGTM
Looks like unit tests were failing. Please fix this first.
31cde2b to d901a4e
Yep, I'm trying to fix all the CI complaints.
d901a4e to a261450
a261450 to 6b4653f
/test
6b4653f to e69c9eb
/test
The vfio commits introduce quite a lot of changes in runtime-rs; this commit covers all the CI-related changes, including compilation errors and so on.
Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
e69c9eb to 71c322c
/test