Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Implement the container_clone using CLONE_PARENT #1610

Merged
merged 1 commit into from Mar 6, 2023

Conversation

yihuaf
Copy link
Collaborator

@yihuaf yihuaf commented Mar 1, 2023

This is the first step into fixing #1601 which implements a container_clone call that uses clone and CLONE_PARENT. Since this function uses CLONE_PARENT which creates sibling process instead of child process, we need to re-examine the overall logic to make sure all the waitpid calls are still correct, since a process can only wait on its immediate child process.

@codecov-commenter
Copy link

Codecov Report

Merging #1610 (eb2ffbf) into main (7b7a782) will increase coverage by 0.13%.
The diff coverage is 86.86%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #1610      +/-   ##
==========================================
+ Coverage   68.84%   68.98%   +0.13%     
==========================================
  Files         120      120              
  Lines       13072    13171      +99     
==========================================
+ Hits         9000     9086      +86     
- Misses       4072     4085      +13     

@utam0k
Copy link
Member

utam0k commented Mar 1, 2023

@yihuaf
Copy link
Collaborator Author

yihuaf commented Mar 1, 2023

@yihuaf
Copy link
Collaborator Author

yihuaf commented Mar 2, 2023

@utam0k I looked into clone3 and I am a little hesitant to use it for this specific usecase. First, clone3 doesn't add anything extra compared to clone syscall for this usecase. It is a nicer interface, but using both clone or clone3 can get the job done. Second, the crate used for the clone3 is not merged into the libc crate, which raises some concern. If we decide to use clone3, we may want to vendor the crate into our code base to be maintained. Lastly, should we fall back to clone in the case where clone3 fails or not supported? If so, then I would argue that it will be easier to just use clone for all cases. clone being older, is definitely more mature and more well documented.

I don't have a strong preference. If you don't have a strong preference, I would recommend us to stay with clone. Given the abstraction we have built around fork.rs, we can easily make the switch at a later point. However, if you feel strongly about clone3, then I am happy to update this PR. Your call :)

@yihuaf
Copy link
Collaborator Author

yihuaf commented Mar 2, 2023

Plus, I am hitting EINVAL with specifying both CLONE_PARENT and exit signal (SIGCHLD) with clone3. Reference email thread here: https://www.spinics.net/lists/linux-man/msg23564.html

@yihuaf
Copy link
Collaborator Author

yihuaf commented Mar 2, 2023

Aha, following this thread: checkpoint-restore/criu#926

It looks like we should not pass in exit signal when CLONE_PARENT is used. There is more context in the referenced thread.

@yihuaf
Copy link
Collaborator Author

yihuaf commented Mar 2, 2023

Updated 68511cf with clone3. Let me know what you think.

@Furisto
Copy link
Member

Furisto commented Mar 3, 2023

I am in favor of using clone3. We have decided that we will support kernels >= 5.4 only and clone3 has been available since kernel 5.3 so we should not need a fallback.

@yihuaf
Copy link
Collaborator Author

yihuaf commented Mar 3, 2023

I am in favor of using clone3. We have decided that we will support kernels >= 5.4 only and clone3 has been available since kernel 5.3 so we should not need a fallback.

Sounds good. I updated the comments with what has been discussed above. PTAL. Thanks.

@yihuaf yihuaf force-pushed the yihuaf/1601 branch 2 times, most recently from 18e69d0 to fd9ebc5 Compare March 3, 2023 16:17
@utam0k
Copy link
Member

utam0k commented Mar 4, 2023

I am in favor of using clone3. We have decided that we will support kernels >= 5.4 only and clone3 has been available since kernel 5.3 so we should not need a fallback.

💯 Since they are too new, there are some options we have to be careful to use.

Comment on lines 61 to 62
let mut pidfd = -1;
clone.flag_pidfd(&mut pidfd).exit_signal(SIGCHLD as u64);
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you have the plan to use pidfd?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see a need for this, so I can remove them for now, until we need it.

@Furisto
Copy link
Member

Furisto commented Mar 4, 2023

Can you test what the performance impact of this change is?
hyperfine --prepare 'sudo sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' --warmup 10 --min-runs 100 'sudo ./youki create -b tutorial a && sudo ./youki start a && sudo ./youki delete -f a'

@yihuaf
Copy link
Collaborator Author

yihuaf commented Mar 6, 2023

hyperfine --prepare 'sudo sync; echo 3 | sudo tee /proc/sys/vm/drop_caches' --warmup 10 --min-runs 100 'sudo ./youki create -b tutorial a && sudo ./youki start a && sudo ./youki delete -f a'

With this PR.

Benchmark 1: sudo ./youki create -b tutorial a && sudo ./youki start a && sudo ./youki delete -f a
  Time (mean ± σ):     233.5 ms ±  13.9 ms    [User: 7.5 ms, System: 8.0 ms]
  Range (min … max):   197.7 ms … 265.5 ms    100 runs

Without the PR:

Benchmark 1: sudo ./youki create -b tutorial a && sudo ./youki start a && sudo ./youki delete -f a
  Time (mean ± σ):     231.5 ms ±  14.5 ms    [User: 7.4 ms, System: 7.9 ms]
  Range (min … max):   198.0 ms … 261.5 ms    100 runs

I believe the result matches the theoretical assumption that the performance different should be minimal since the underlying syscall is the same. The average with clone3 seems to be a little slower (about 1 ms) difference, but this should be within the standard deviation.

Signed-off-by: Eric Fang <yihuaf@unkies.org>
@utam0k
Copy link
Member

utam0k commented Mar 6, 2023

Thanks, @yihuaf

@utam0k utam0k merged commit 8744614 into containers:main Mar 6, 2023
@yihuaf yihuaf deleted the yihuaf/1601 branch March 6, 2023 15:59
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants