runtime-rs: support Memory hotplug #6876

Merged: Tim-0731-Hzt merged 5 commits into kata-containers:main from memory_hotlug on Dec 15, 2023

Member

@Tim-0731-Hzt Tim-0731-Hzt commented May 18, 2023

Support memory hotplug for runtime-rs.

@katacontainersbot katacontainersbot added the size/tiny Smallest and simplest task label May 18, 2023
@Tim-0731-Hzt Tim-0731-Hzt marked this pull request as draft May 18, 2023 06:56
@Tim-0731-Hzt Tim-0731-Hzt changed the title Memory hotlug runtime-rs: support Memory hotlug May 18, 2023
@Tim-0731-Hzt Tim-0731-Hzt changed the title runtime-rs: support Memory hotlug WIP: support Memory hotlug May 18, 2023
@Tim-0731-Hzt
Member Author

Tim-0731-Hzt commented May 18, 2023

This PR depends on the framework introduced in #6289 and on the memory hotplug functionality of Dragonball.

@Tim-0731-Hzt Tim-0731-Hzt changed the title WIP: support Memory hotlug WIP: support Memory hotplug May 18, 2023
@fidencio
Member

@Tim-0731-Hzt, please, when this PR becomes ready to be merged, re-enable this test here: https://github.com/kata-containers/kata-containers/blob/main/tests/integration/kubernetes/k8s-cpu-ns.bats

We'd like to have it tested as part of our CI.

@katacontainersbot katacontainersbot added size/huge Largest and most complex task (probably needs breaking into small pieces) and removed size/tiny Smallest and simplest task labels Jun 13, 2023
@Tim-0731-Hzt Tim-0731-Hzt marked this pull request as ready for review June 13, 2023 08:14
@Tim-0731-Hzt Tim-0731-Hzt changed the title WIP: support Memory hotplug runtime-rs: support Memory hotplug Jun 13, 2023
Contributor

@Apokleos Apokleos left a comment

Thanks @Tim-0731-Hzt, good job.
Some comments below.

src/runtime-rs/crates/hypervisor/src/dragonball/inner.rs (3 outdated comments)
src/runtime-rs/crates/resource/src/cpu_mem/mem.rs (4 outdated comments)
Contributor

@YushuoEdge YushuoEdge left a comment

Thanks @Tim-0731-Hzt! Some initial comments:

src/libs/kata-types/src/capabilities.rs (1 outdated comment)
src/runtime-rs/crates/hypervisor/src/dragonball/inner.rs (2 outdated comments)
src/runtime-rs/crates/resource/src/cpu_mem/mem.rs (3 outdated comments)
@Tim-0731-Hzt Tim-0731-Hzt force-pushed the memory_hotlug branch 2 times, most recently from 1c3c24b to ec053cb Compare June 15, 2023 06:58
@katacontainersbot katacontainersbot added size/tiny Smallest and simplest task and removed size/huge Largest and most complex task (probably needs breaking into small pieces) labels Jun 15, 2023
Contributor

@YushuoEdge YushuoEdge left a comment

Thanks @Tim-0731-Hzt, I still have some questions:

src/libs/kata-types/src/capabilities.rs (1 outdated comment)
src/runtime-rs/crates/hypervisor/src/dragonball/inner.rs (1 outdated comment)
src/runtime-rs/crates/resource/src/cpu_mem/initial_size.rs (1 outdated comment)
src/runtime-rs/crates/resource/src/cpu_mem/mem.rs (1 outdated comment)
@katacontainersbot katacontainersbot added size/huge Largest and most complex task (probably needs breaking into small pieces) and removed size/tiny Smallest and simplest task labels Jun 20, 2023
@Tim-0731-Hzt Tim-0731-Hzt force-pushed the memory_hotlug branch 2 times, most recently from 53f825c to 4ddb4f7 Compare June 21, 2023 06:10
@Tim-0731-Hzt Tim-0731-Hzt force-pushed the memory_hotlug branch 2 times, most recently from d10e8f0 to 275842a Compare November 20, 2023 09:24
@Tim-0731-Hzt Tim-0731-Hzt force-pushed the memory_hotlug branch 2 times, most recently from 2078804 to 2b63b58 Compare November 27, 2023 08:36
@studychao
Member

/test

#[derive(Default, Debug, Clone)]
pub struct MemResource {
    /// Current memory
    pub(crate) current_mem: Arc<RwLock<u32>>,
Contributor

Do we really need this Arc<RwLock<u32>> field, which is only used to decide whether to clear the balloon size?

Member Author

The runtime is a multi-threaded application, so one thread may need to update current_mem while other threads read it to make decisions based on the current state.

Contributor

My concern is not the Arc<RwLock>. I mean that this field is never used again after you refactored the balloon logic.

Member Author

fixed
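
For readers following this thread, the author's earlier rationale (one thread updating current_mem while others read it) can be illustrated with a minimal sketch. It is not the PR's code: it assumes std::sync::RwLock, whereas the runtime itself may use an async lock, and the update, current, and update_from_other_thread helpers are hypothetical names introduced only for illustration.

```rust
use std::sync::{Arc, RwLock};
use std::thread;

/// Hypothetical stand-in for the struct in the diff above; the value is in MB.
#[derive(Default, Debug, Clone)]
pub struct MemResource {
    pub current_mem: Arc<RwLock<u32>>,
}

impl MemResource {
    /// Record a new memory size. Holds the write lock only for the assignment.
    pub fn update(&self, new_mem_mb: u32) {
        *self.current_mem.write().unwrap() = new_mem_mb;
    }

    /// Read the current size. Several readers may hold the read lock at once.
    pub fn current(&self) -> u32 {
        *self.current_mem.read().unwrap()
    }
}

/// Update the value from a second thread, then read it back on the caller's thread.
pub fn update_from_other_thread(res: &MemResource, new_mem_mb: u32) -> u32 {
    // Cloning MemResource clones the Arc, so both handles share one u32.
    let writer = res.clone();
    thread::spawn(move || writer.update(new_mem_mb))
        .join()
        .expect("writer thread panicked");
    res.current()
}
```

As the reviewer points out, this sharing only pays off if something actually reads the field; once the balloon logic no longer consulted it, dropping the field was the simpler choice.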

@@ -338,13 +348,115 @@ impl DragonballInner {
        Ok((old_vcpus, new_vcpus))
    }

    pub(crate) fn resize_memory(
        &mut self,
        old_mem_mb: u32,
Contributor

We could simply save the balloon size in self, or just force-clear the balloon size. It seems resize_memory() only needs a target memory size.

Member Author

fixed
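
A minimal sketch of the reviewer's suggestion, assuming the balloon size is tracked in the struct so that a resize needs only the target size. GuestMemory, its fields, and resize are hypothetical names, and the sketch does bookkeeping only; it does not talk to a hypervisor and is not the merged implementation.

```rust
/// Hypothetical bookkeeping for guest memory, all sizes in MB.
#[derive(Debug, Clone, PartialEq, Eq)]
pub struct GuestMemory {
    boot_mb: u32,       // memory the VM was started with
    hotplugged_mb: u32, // memory added after boot
    balloon_mb: u32,    // memory currently reclaimed by the balloon
}

impl GuestMemory {
    pub fn new(boot_mb: u32) -> Self {
        Self { boot_mb, hotplugged_mb: 0, balloon_mb: 0 }
    }

    /// Memory the guest can actually use right now.
    pub fn effective_mb(&self) -> u32 {
        self.boot_mb + self.hotplugged_mb - self.balloon_mb
    }

    pub fn hotplugged_mb(&self) -> u32 {
        self.hotplugged_mb
    }

    pub fn balloon_mb(&self) -> u32 {
        self.balloon_mb
    }

    /// Resize to `target_mb` using only the target and the saved state.
    /// Returns the effective size after the resize.
    pub fn resize(&mut self, target_mb: u32) -> u32 {
        let plugged = self.boot_mb + self.hotplugged_mb;
        if target_mb > plugged {
            // Growing past what is plugged: clear the balloon, hotplug the rest.
            self.hotplugged_mb += target_mb - plugged;
            self.balloon_mb = 0;
        } else {
            // Otherwise the balloon covers the gap (zero when sizes match).
            self.balloon_mb = plugged - target_mb;
        }
        self.effective_mb()
    }
}
```

Because the old size is derived from saved state, the caller no longer has to pass old_mem_mb, which is the point the reviewer raises.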

use std::{collections::HashSet, fs::create_dir_all};

const DRAGONBALL_KERNEL: &str = "vmlinux";
const DRAGONBALL_ROOT_FS: &str = "rootfs";

const BALLOON0: &str = "balloon0";
Contributor

What about BALLOON_DEVICE_ID? And why doesn't the mem device use a const name?

Member Author

fixed

@@ -17,6 +17,8 @@ pub enum CapabilityBits {
    MultiQueueSupport,
    /// hypervisor supports filesystem share
    FsSharingSupport,
    /// hypervisor supports memory hotplug probe interface
    GuestMemoryHotplugProbe,
Contributor

GuestMemoryProbe may be better?

Member Author

This capability indicates whether the hypervisor supports hotplug, so it may be better to keep the keyword Hotplug here.

Contributor

The path the agent reads is /sys/devices/system/memory/probe, so keeping only Probe here is more precise. Alternatively, you could keep only Hotplug, which also makes sense.

Member Author

fixed
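
For context on the naming discussion, the probe interface mentioned above is a sysfs file, so detecting it amounts to a path check. The sketch below is an assumption about how such a check could look, not the agent's actual code; supports_memory_probe and MEMORY_PROBE_RELATIVE are hypothetical names, and the sysfs root is a parameter so the check can be exercised outside a real guest.

```rust
use std::path::Path;

/// Location of the probe file relative to the sysfs mount point (normally /sys).
pub const MEMORY_PROBE_RELATIVE: &str = "devices/system/memory/probe";

/// Returns true when the memory probe file exists under `sysfs_root`.
/// In a guest this would be called with Path::new("/sys").
pub fn supports_memory_probe(sysfs_root: &Path) -> bool {
    sysfs_root.join(MEMORY_PROBE_RELATIVE).exists()
}
```

Since the check is about the probe file specifically, a capability name built around Probe describes what is detected, which is the reviewer's argument.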

// set memory hotplug probe
if guest_details.support_mem_hotplug_probe {
    self.hypervisor
        .set_capabilities(CapabilityBits::GuestMemoryHotplugProbe)
Contributor

Why do we need this check? Do we set the capability and then check it?

Member Author

This is because we need to know whether the hypervisor supports memory hotplug. We get support_mem_hotplug_probe from the agent response, which tells us whether memory hotplug is available. If it is, we set the capability on the hypervisor.

.await;

// set memory hotplug probe
if guest_details.support_mem_hotplug_probe {
Contributor

Is the probe interface in the guest kernel a necessary condition for the hypervisor?

Member Author

It might be unnecessary for Dragonball, but it is used for QEMU in Kata 2.0.

Contributor

So when the kernel does not support this feature, will Dragonball still work? If so, you may need to set this capability by default, and consider whether it still needs to be updated from the agent response.

Member Author

Fixed: this capability is now set when we initialize the Dragonball instance.
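
The outcome of this thread, setting the capability when the hypervisor instance is created rather than from the agent response, can be sketched as follows. This is a simplified illustration under stated assumptions: the variant names come from the diff quoted earlier, while the bit values, the Capabilities struct, and SketchHypervisor are hypothetical and do not reproduce the kata-types API.

```rust
/// Variant names follow the quoted diff; the bit positions are assumptions.
#[derive(Debug, Clone, Copy, PartialEq, Eq)]
#[repr(u32)]
pub enum CapabilityBits {
    MultiQueueSupport = 1 << 0,
    FsSharingSupport = 1 << 1,
    GuestMemoryHotplugProbe = 1 << 2,
}

/// Hypothetical capability set stored as a bit mask.
#[derive(Debug, Default, Clone, Copy, PartialEq, Eq)]
pub struct Capabilities {
    flags: u32,
}

impl Capabilities {
    pub fn set(&mut self, cap: CapabilityBits) {
        self.flags |= cap as u32;
    }

    pub fn is_set(&self, cap: CapabilityBits) -> bool {
        self.flags & (cap as u32) != 0
    }
}

/// Hypothetical hypervisor wrapper that declares its capabilities up front.
pub struct SketchHypervisor {
    capabilities: Capabilities,
}

impl SketchHypervisor {
    /// The capability is set at construction, independent of any agent response.
    pub fn new() -> Self {
        let mut capabilities = Capabilities::default();
        capabilities.set(CapabilityBits::GuestMemoryHotplugProbe);
        Self { capabilities }
    }

    pub fn capabilities(&self) -> Capabilities {
        self.capabilities
    }
}
```

Callers that need to branch on hotplug support can then query is_set without first waiting for guest details.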

) -> Result<(u32, MemoryConfig)> {
    // check the invalid request memory
    if new_mem_mb > self.hypervisor_config().memory_info.default_maxmemory
        && self.hypervisor_config().memory_info.default_maxmemory > 0
Contributor

What about setting it to the integer maximum or the host's total memory when it is 0?

Member Author

The new_mem_mb variable cannot be 0, because it is calculated by adding to the hypervisor's default memory.

Contributor

Can default_maxmemory be 0?

Member Author

fixed
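
The condition in the quoted diff only rejects a request when default_maxmemory is non-zero. A minimal sketch of that check follows, assuming that 0 means "no maximum configured" and that an oversized request is reported as an error; both are assumptions, since the thread does not show the body of the branch or what the final fix did. check_requested_memory is a hypothetical name.

```rust
/// Validate a requested memory size (MB) against the configured maximum (MB).
/// Assumption: a maximum of 0 means no limit is configured, so any size passes.
pub fn check_requested_memory(new_mem_mb: u32, default_maxmemory_mb: u32) -> Result<u32, String> {
    if default_maxmemory_mb > 0 && new_mem_mb > default_maxmemory_mb {
        return Err(format!(
            "requested memory {} MB exceeds default_maxmemory {} MB",
            new_mem_mb, default_maxmemory_mb
        ));
    }
    Ok(new_mem_mb)
}
```

The reviewer's alternative, replacing a zero maximum with the host's total memory at configuration time, would let the check drop the special case entirely.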

@Tim-0731-Hzt Tim-0731-Hzt force-pushed the memory_hotlug branch 5 times, most recently from 327ec15 to abbf70e Compare December 12, 2023 07:23
@HerlinCoder
Contributor

LGTM!

@Tim-0731-Hzt
Member Author

/test

@Tim-0731-Hzt
Member Author

/retest

get memory block size and guest mem hotplug probe

Fixes:kata-containers#6356
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Fixes: kata-containers#6875
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
Fixes:kata-containers#6875
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
add default_maxmemory in config file

Fixes:kata-containers#6875
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
@Tim-0731-Hzt
Member Author

/test

check the update memory size greater than default max memory size

Fixes:kata-containers#6875
Signed-off-by: Zhongtao Hu <zhongtaohu.tim@linux.alibaba.com>
@Tim-0731-Hzt
Member Author

/test

@Tim-0731-Hzt Tim-0731-Hzt merged commit 0f80dc6 into kata-containers:main Dec 15, 2023
167 of 176 checks passed
@Tim-0731-Hzt Tim-0731-Hzt deleted the memory_hotlug branch December 15, 2023 06:28
Labels
ok-to-test runtime-rs size/huge Largest and most complex task (probably needs breaking into small pieces)

9 participants