Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix cgroups determination in exec implementation #2720

Merged
merged 9 commits into from
Apr 27, 2024

Conversation

YJDoc2
Copy link
Collaborator

@YJDoc2 YJDoc2 commented Mar 4, 2024

This fixes the way we determine cgroup name as well as the way we set the cgroup in case of exec. Right now on the main, following would fail even though they should succeed

podman run -d --runtime /path/to/youki busybox sleep 10m
podman exec -it --runtime /path/to/youki SHA-HERE /bin/sh

This is because

  1. we copy over the namespaces from "main" container to tenant, but not the cgroups
  2. In case of rootless, we directly use container id as cgroup name, but that fails when trying to decode the cgroup path

Thus, we fix it here by also copying over the cgroup config from init to tenant, and if cgroup path is not present using :youki:id instead .

I have also added a test for exec in the rootless tests we have.

NOT READY FOR MERGE YET.

@YJDoc2 YJDoc2 added the kind/bug label Mar 4, 2024
@YJDoc2
Copy link
Collaborator Author

YJDoc2 commented Mar 4, 2024

With this PR, the number of failed podman tests goes down from 157 to 105 : https://github.com/YJDoc2/youki/actions/runs/8138706191 🎉

Signed-off-by: Yashodhan Joshi <yjdoc2@gmail.com>
Signed-off-by: Yashodhan Joshi <yjdoc2@gmail.com>
Signed-off-by: Yashodhan Joshi <yjdoc2@gmail.com>
Signed-off-by: Yashodhan Joshi <yjdoc2@gmail.com>
@codecov-commenter
Copy link

codecov-commenter commented Mar 6, 2024

Codecov Report

Merging #2720 (0ce379a) into main (496d955) will decrease coverage by 0.05%.
Report is 18 commits behind head on main.
The diff coverage is 40.62%.

Additional details and impacted files
@@            Coverage Diff             @@
##             main    #2720      +/-   ##
==========================================
- Coverage   65.21%   65.17%   -0.05%     
==========================================
  Files         133      133              
  Lines       16981    16984       +3     
==========================================
- Hits        11075    11070       -5     
- Misses       5906     5914       +8     

@YJDoc2 YJDoc2 requested review from lengrongfu and a team March 6, 2024 06:37
tests/rootless-tests/run.sh Outdated Show resolved Hide resolved
@lengrongfu
Copy link
Collaborator

LGTM

Signed-off-by: Yashodhan Joshi <yjdoc2@gmail.com>
@YJDoc2
Copy link
Collaborator Author

YJDoc2 commented Mar 26, 2024

@lengrongfu @utam0k May I ask one of you to take a look at the newer changes I have pushed?

@@ -353,6 +353,11 @@ impl CgroupManager for Manager {
if pid.as_raw() == -1 {
return Ok(());
}
if self.client.transient_unit_exists(&self.unit_name) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Can we add a unit tet for this change?

Copy link
Collaborator Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ok, willl do 👍

Copy link
Member

@utam0k utam0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I noticed this leads to breaking changes for existing users due to changing a default cgroup path. In my opinion, affecting rootless users can be allowed, but breaking the change for regular users(non-rootless) is more important. Can we avoid affecting regular users?

@YJDoc2
Copy link
Collaborator Author

YJDoc2 commented Mar 31, 2024

I noticed this leads to breaking changes for existing users due to changing a default cgroup path. In my opinion, affecting rootless users can be allowed, but breaking the change for regular users(non-rootless) is more important. Can we avoid affecting regular users?

Hey, while this is breaking , I'm not sure which part specifically are you talking about with respect to default cgroup path. If it is this : https://github.com/containers/youki/pull/2720/files#diff-36cd5e5a1222b5c819958e4045a242cbff00ae22a8a1699229f5c6220dafdb6bR150-R153

Then, this only changes the when the cgroup path was not set, and I had observed this behavior when creating a tenant container via exec. For the normal container, the path is correctly set and would not change in this. Also the previous behavior of using just the container_id as the path is somewhat wrong, as in a valid cgroup path check, we expect it to have two : indicating the parent and prefix here in the systemd manager, which using just the container_id is wrong (I have observed that docker and podman with root still use the systemd manager) .

If you meant something else, please correct me. Thanks!

@utam0k
Copy link
Member

utam0k commented Mar 31, 2024

Hey, while this is breaking , I'm not sure which part specifically are you talking about with respect to default cgroup path. If it is this : https://github.com/containers/youki/pull/2720/files#diff-36cd5e5a1222b5c819958e4045a242cbff00ae22a8a1699229f5c6220dafdb6bR150-R153

Yes 👍

Then, this only changes the when the cgroup path was not set, and I had observed this behavior when creating a tenant container via exec. For the normal container, the path is correctly set and would not change in this.

I wonder what happens in the following case:

  1. a user runs a container(Container A) with no-set cgroup path with the latest youki version(v0.3.2)
  2. a user updates the release version, including this PR
  3. a user tries exec to Container A

In this above case, Every 1 and 3 processes find out different cgroup paths, right?

@YJDoc2
Copy link
Collaborator Author

YJDoc2 commented Mar 31, 2024

Hey,

I wonder what happens in the following case:
a user runs a container(Container A) with no-set cgroup path with the latest youki version(v0.3.2)
a user updates the release version, including this PR
a user tries exec to Container A
In this above case, Every 1 and 3 processes find out different cgroup paths, right?

Note : below youki 3.2 = youki on tag v0.3.2 ; branch youki = youki compiled from this branch

So I tried this and when I launch container A (with sudo pdoman) with youki 3.2 , then try to exec into it with youki 3.2, it gives error of

Error: OCI runtime error: thread 'main' panicked at crates/libcontainer/src/process/container_intermediate_process.rs:49:82:
called `Result::unwrap()` on an `Err` value: Systemd(CgroupsPath(MalformedPath("348136f0a314677e10ac6b3d09b2e7430a70ee6b2b01a23239f0bb4f18cfd150")))
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
exec failed : failed to receive. "waiting for intermediate process". BrokenChannel

which is what I mentioned about the default cgroup path being incorrect in my previous comment.

However, if I try to launch container A (with sudo podman) youki 3.2 and exec into it using branch youki , it works due to the cgroups path being set correctly in this branch. Also, I verified that there were two processes in the cgroup of container A when I tried to exec into it using branch youki.

Similarly using branch youki for container A and exec works too. Trying to exec into container launched with branch youki using youki 3.2 also fails due to malformed path.

In short, the exec was broken for both rootful and rootless on v0.3.2 , so I think changing the cgroup path determination wouldn't cause any issues when using containers launched with youki 0.3.2 with branch youki.

Does this address your concern? Do you want me to try any other permutations?

@YJDoc2
Copy link
Collaborator Author

YJDoc2 commented Mar 31, 2024

Also, I need to check this in detail, but I suspect that changes in the cgroup path determination don't get applied at all due to the changes in setting the cgroups path here for exec containers.

Previously we would not set that path, thus the path determination finding None and using container_id as path, which lead to the failure. However, after we set the cgroup path, it will always take that as the cgroup path for exec/tenant container. I need to see in which case does the cgroup path can be None (I think it can be case if someone is using youki directly withou docker/podman or they are doing some manual cgroup setup) .

@utam0k
Copy link
Member

utam0k commented Mar 31, 2024

I think it can be case if someone is using youki directly withou docker/podman or they are doing some manual cgroup setup

Yes, I imagined this situation(Sorry for my lack of words 🙏)
Is there any case at all where podman, contained, and CRI-O is not setting the cgroup path? I think it is possible to write a Migration Guide and get away with it.

Signed-off-by: Yashodhan Joshi <yjdoc2@gmail.com>
@YJDoc2 YJDoc2 force-pushed the fix/exec branch 2 times, most recently from 1b8ffdf to bbaf9d6 Compare April 12, 2024 11:43
@YJDoc2
Copy link
Collaborator Author

YJDoc2 commented Apr 12, 2024

@utam0k I have added a test for systemd cgroup manager regarding the changes here. The CI is failing as the dbus pipe is not set up correctly, so I'll fix that up ; but can you take a look at the test I have added, and check if that approach is ok, or any changes are needed - commit : 4a95211 ? Thanks :)

EDIT : This commit id is no longer relevant due to rebase.

@utam0k
Copy link
Member

utam0k commented Apr 17, 2024

The commit in itself looks acceptable to me. I wonder whether this PR should be written as a breaking change or not.

@YJDoc2
Copy link
Collaborator Author

YJDoc2 commented Apr 17, 2024

Thanks, I'll work on fixing the CI then.

The commit in itself looks acceptable to me.

I'm not sure, while some of the internals have changed, the overall effect is making exec work which was not working previously, thus I added the bug label. I'll think more on this 👍

I wonder whether this PR should be written as a breaking change or not.

@utam0k
Copy link
Member

utam0k commented Apr 18, 2024

LGTM except for the CI failed

Signed-off-by: Yashodhan Joshi <yjdoc2@gmail.com>
Signed-off-by: Yashodhan Joshi <yjdoc2@gmail.com>
Signed-off-by: Yashodhan Joshi <yjdoc2@gmail.com>
@YJDoc2
Copy link
Collaborator Author

YJDoc2 commented Apr 27, 2024

@utam0k please take a look. I have changed the approach of adding the task - before I wrote directly to the cgroup proc file, which could be incorrect / have permission errors ; Now I use the systemd dbus api to add the process via dbus call. I have also fixed the tests, although for the task addition test, I needed to add some more mounts and one hack script so that it can run properly in the cross container.

See these three commits for these new changes : 75c851b 3fe7d04 df5ebb2

Thanks :)

@YJDoc2 YJDoc2 requested a review from utam0k April 27, 2024 09:59
Copy link
Member

@utam0k utam0k left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Depending on DBus is better than the previous approach, +1

@utam0k utam0k merged commit 601df9e into containers:main Apr 27, 2024
28 checks passed
@utam0k
Copy link
Member

utam0k commented Apr 27, 2024

Thanks a lot, @YJDoc2

@github-actions github-actions bot mentioned this pull request Apr 27, 2024
@YJDoc2 YJDoc2 deleted the fix/exec branch April 27, 2024 14:49
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

4 participants