-
Notifications
You must be signed in to change notification settings - Fork 1k
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
runtime-rs: built-in Dragonball sandbox part I - resource and device managers #4265
runtime-rs: built-in Dragonball sandbox part I - resource and device managers #4265
Conversation
Related issue: #4257 |
f8d4c9f
to
ea613b2
Compare
c859a25
to
57c7151
Compare
7caa412
to
5338bfb
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
A few comments
atomic-guest-memory = [] | ||
virtio-vsock = ["dbs-virtio-devices/virtio-vsock", "virtio-queue"] | ||
|
||
[patch.'crates-io'] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's unnecessary to use patch for lib.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be removed at the final version of Dragonball:)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will be removed at the final version of Dragonball:)
const MSI_IRQ_BASE: u32 = LEGACY_IRQ_MAX + 1; | ||
|
||
// Linux: KVM_MAX_IRQ_ROUTES is defined as 4096 in linux/include/linux/kvm_host.h | ||
const MSI_IRQ_MAX: u32 = 1023; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The comment does not match for MSI_IRQ_MAX
.
// Linux: KVM_MAX_IRQ_ROUTES is defined as 4096 in linux/include/linux/kvm_host.h | ||
const MSI_IRQ_MAX: u32 = 1023; | ||
// Linux: KVM_USER_MEM_SLOTS is 509 for x86_64 and 512 for aarch64. | ||
const KVM_USER_MEM_SLOTS: u32 = 509; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it need to use #[cfg(target_arch = "x86_64")]
and #[cfg(target_arch = "aarch64")]
to distinguish KVM_USER_MEM_SLOTS.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Where does this value come from? Could you add a reference to the kernel source line that specifies this value?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yeah, I'm curious about it too. I check it in aarch64 and KVM_USER_MEM_SLOTS is 32767. If I'm wrong, pls correct me.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thans for your comments!
As Dragonball only supports kernel-4.19 and kernel-5.10 currently, the macro "KVM_USER_MEM_SLOTS" is defined in file linux/arch/arm64/include/asm/kvm_host.h, in which KVM_USER_MEM_SLOTS = 512.
In kernel-5.12 and above, this macro is defined in linux/include/linux/kvm_host.h and KVM_USER_MEM_SLOTS = 32767.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for explanation.
Does that mean we must run kata3.0 on top of specific host kernel version?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@jongwu :)
For now, we only test Arm Dragonball on these two kernel version(4.19 & 5.10). In the future, we may support higher versions of kernel.
|
||
/// Type of the dragonball virtio mmio devices. | ||
#[cfg(feature = "dbs-virtio-devices")] | ||
pub type DbsMmioV2Device = dbs_virtio_devices::mmio::MmioV2Device< |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Unnecessary qualification dbs_virtio_devices::mmio::
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed. Thanks.
stdin_handle, | ||
sock_listener, | ||
sock_conn: None, | ||
logger: logger.new(slog::o!()), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it better to add a subsystem for the logger?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Added.
} | ||
|
||
if should_drop { | ||
let events = Events::with_data_raw(libc::STDIN_FILENO, EPOLL_EVENT_STDIN, EventSet::IN); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Is it need to set self.stdin_handle
to None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If self.stdin_handle is set to None, stdin related connection will be lost. We only want to drop the data in stdin channel here.
38fc116
to
f3b4a41
Compare
8cea7db
to
8681a42
Compare
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @studychao - A few initial comments...
src/dragonball/Makefile
Outdated
default: build | ||
|
||
build: | ||
cargo update -p vm-memory:0.8.0 --precise 0.7.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Buglet: This line is prefixed by spaces, not tabs as required for
make
. - Why is this line required / why can't version 0.8.0 be specified in
Cargo.toml
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is because the crates in dbs-xxx uses version 0.7.0 of vm-memory, but rust-vmm uses 0.8.0. This will be resolved by upgrading dbs-xxx in later.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
space is changed to tab
src/dragonball/Makefile
Outdated
|
||
test: | ||
@echo "INFO: testing dragonball for development build" | ||
cargo test --all-features -- --test-threads 1 --nocapture |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this limited to a single thread? Do the unit tests need to be fixed (or annotated with #[serial]
) maybe?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This problem has been solved, and the --test-threads 1
will be deleted in the following patch.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Already deleted.
src/dragonball/README.md
Outdated
@@ -0,0 +1,32 @@ | |||
# Introduction | |||
`Dragonball Sandbox` is a light-weight virtual machine manager(VMM) based on Linux Kernel-based Virtual Machine(KVM), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please add a single space before brackets:
`Dragonball Sandbox` is a light-weight virtual machine manager(VMM) based on Linux Kernel-based Virtual Machine(KVM), | |
`Dragonball Sandbox` is a light-weight virtual machine manager (VMM) based on Linux Kernel-based Virtual Machine (KVM), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed:) Thanks.
src/dragonball/README.md
Outdated
which is optimized for container workloads with: | ||
- container image management and acceleration service | ||
- flexible and high-performance virtual device drivers | ||
- low CPU & memory overhead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- low CPU & memory overhead | |
- low CPU and memory overhead |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed
src/dragonball/README.md
Outdated
- container image management and acceleration service | ||
- flexible and high-performance virtual device drivers | ||
- low CPU & memory overhead | ||
- industry-leading startup time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd consider rewording this to something like:
- industry-leading startup time | |
- Minimal startup time |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Changed~
|| irq.checked_add(count).is_none() | ||
|| irq + count - 1 > MSI_IRQ_MAX | ||
{ | ||
panic!("invalid irq number when freeing legacy irq"); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Again, please add a unit test using #[should_panic]
for this and any other functions that panic.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The panic has been removed.
Resource::PioAddressRange { base, size } => self.free_pio_address(*base, *size), | ||
Resource::MmioAddressRange { base, size } => self.free_mmio_address(*base, *size), | ||
Resource::MemAddressRange { base, size } => self.free_mem_address(*base, *size), | ||
Resource::LegacyIrq(base) => self.free_legacy_irq(*base), | ||
Resource::MsiIrq { ty: _, base, size } => self.free_msi_irq(*base, *size), | ||
Resource::KvmMemSlot(slot) => self.free_kvm_mem_slot(*slot), | ||
Resource::MacAddresss(_) => {} | ||
} | ||
} | ||
} | ||
|
||
fn free_allocated_resources( | ||
&self, | ||
resources: DeviceResources, | ||
) -> Result<DeviceResources, ResourceError> { | ||
self.free_device_resources(&resources); | ||
Err(ResourceError::NoAvailResource) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is slightly odd code: All the free_*()
handlers panic
on error and the function that calls the handlers then returns an error "on success". Also, if the resources are freed, saying there are no resources available is "correct", but it's potentially misleading as it really means, "there (probably) are system resources to create the requested resources, but those requested resources haven't been (re-)created yet".
Why can we just use the conventional methodology here and make the handlers return a Result
to allow free_device_resources()
to return that Result
. This function could then return that Result
on error, or Ok
on success, as is normal.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for your suggestion.
We have refactored this part.
/// The descriptor to the initrd file, if there is one | ||
initrd_file: Option<File>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does this imply that normal rootfs images cannot be used? Also, what does it mean if initrd_file == None
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We support normal rootfs images and initrd is for providing the possibility for some cases if users need to use initrd
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, but I'm still not clear on this point. If initrd_file = None
, where is the rootfs image specified?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
rootfs image related setting will be introduced in the next PR related to block device manager
/// Share guest kernel text/ro sections. | ||
share_ro_sections: String, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why is this a String
and not just a bool
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This is a performance feature and we decide to not include this in the stage one for Dragonball:)
pub host_numa_node_id: Option<u32>, | ||
/// numa node id on guest for this region | ||
pub guest_numa_node_id: Option<u32>, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- Are these values always different?
- Can multiple
NumaRegionInfo
object contain the same values for these elements?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
- host_numa_node_id and guest_numa_node_id could be the same or different, it's related to users' actual needs.
- Technically, it won't be multiple
NumaRegionInfo
that contain the same value. We will introduce the validation check in vm configuration part PR.
Hi @jodh-intel thanks a lot for your comments. Your advice are very important to us. |
9e729ba
to
697a163
Compare
0a519e9
to
c756ea3
Compare
@jodh-intel Thanks for your review, I have addressed the latest comments. |
For the tracing question, I am not sure what you mean by If you mean GDB, we have achieved the GDB support for Dragonball but it will not be included in the first stage of Dragonball. It could be added in the future stages. For Hope that helps and please let me know if I misunderstood anything:) |
/test |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @studychao. One query regarding that default issue, but other than that...
lgtm
/// Invalid address space operation. | ||
#[error("invalid address space operation")] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ack. That seems a bit pointless if there is an #[error]
, but I guess we have to keep the compiler happy ;)
src/dragonball/src/config_manager.rs
Outdated
ConfigInfos { | ||
configs: Vec::new(), |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
? I'm not seeing that...?
c756ea3
to
5be2304
Compare
@jodh-intel Thanks for your review:) |
The dragonball crate initial commit that includes dragonball README and basic code structure. Fixes: kata-containers#4257 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Resource manager manages all resources of a virtual machine instance. Fixes: kata-containers#4257 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Address space abstraction to manage virtual machine's physical address space. The AddressSpaceMgr Struct to manage address space. Fixes: kata-containers#4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Console manager to manage frontend and backend console devcies. A virtual console are composed up of two parts: frontend in virtual machine and backend in host OS. A frontend may be serial port, virtio-console etc, a backend may be stdio or Unix domain socket. The manager connects the frontend with the backend. Fixes: kata-containers#4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
The legacy devices manager is used for managing legacy devices. Fixes: kata-containers#4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
It is used for managing a group of configuration information. Fixes: kata-containers#4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
It is used for holding guest kernel configuration information. Fixes: kata-containers#4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Device manager to manage IO devices for a virtual machine. And added DeviceManagerTx to provide operation transaction for device management, added DeviceManagerContext to operation context for device management. Fixes: kata-containers#4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Added VsockDeviceMgr struct to manage all vsock devices. Fixes: kata-containers#4257 Signed-off-by: Liu Jiang <gerry@linux.alibaba.com> Signed-off-by: wllenyj <wllenyj@linux.alibaba.com> Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Update Dragonball Readme to fix style problem and add github issue for TODOs. Add document for devices in dragonball. This is the document for the current dragonball device status and we'll keep updating it when we introduce more devices in later pull requets. Fixes: kata-containers#4257 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
Currently supported: build, clippy, check, format, test, clean Fixes: kata-containers#4257 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
Revert this patch, after dragonball-sandbox is ready. And all subsequent implementations are submitted. Fixes: kata-containers#4257 Signed-off-by: wllenyj <wllenyj@linux.alibaba.com>
add dragonball description into kata README to help introduce dragonball sandbox. Fixes: kata-containers#4257 Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
add THIRD-PARTY file to add license for crosvm. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
5be2304
to
93c10df
Compare
ebbd3ed
to
9199e35
Compare
fix clippy warnings in safe-path lib to make clippy happy. Signed-off-by: Chao Wu <chaowu@linux.alibaba.com>
9199e35
to
d5ee3fc
Compare
This pull request initializes the basic framework for Dragonball Sandbox and introduces the resource and device managers part.
In this PR, we introduce the abstractions for several management for virtual machine managers such as LegacyDeviceManager, AddressSpaceManager, ResourceManager and etc. They are responsible for maintaining different kinds of resources for virtual machines.
This is the first pull request for Dragonball Sandbox and there are approximately 6 PRs covering different parts of VMM in total.
Fixes: #4257