external workload: Run cilium-agent in host cgroup namespace #572
Changes to the install script seem sane to me, but I have no context on the various issues we've had with cgroup (v2) mounts, so I will have to defer to @aditighag for that. For example, is it possible or likely that the added cgroupv2 filesystem would not be available in the VM? Is there a minimum kernel version needed, etc.? If yes, then maybe the script should allow the added mount to fail and not add the new Docker mount option if that happens.
Are there existing dependencies on the `mount` command in the script? If not, can we check if the host has utilities like `mount` installed, and log a warning if they are not? See cilium/cilium#16815 for more details.
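A minimal sketch of the kind of check asked for here, not taken from the actual install script. The helper name `missing_utils` is hypothetical; it only relies on the POSIX `command -v` builtin to see whether each named utility is on `PATH`.

```shell
# Hypothetical helper (not from the real script): print the names of any
# utilities passed as arguments that cannot be found on PATH.
missing_utils() {
    missing=""
    for util in "$@"; do
        if ! command -v "$util" >/dev/null 2>&1; then
            # Append with a separating space only when the list is non-empty.
            missing="${missing:+$missing }$util"
        fi
    done
    printf '%s\n' "$missing"
}
```

A caller could then log a warning (or log and exit) when the result is non-empty, for example `m="$(missing_utils mount)"; [ -z "$m" ] || echo "WARNING: missing utilities: $m" >&2`.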
@aditighag Currently, there is no existing dependency on the Is it okay if I log and exit when |
Ok, let's do that then. |
@aditighag I have modified the script to log and exit when |
External workloads workflow has successfully completed so marking it ready for merge. Edit : Checking if a review from sig-clustermesh is required. |
if I'm understanding this change correctly, you are mounting the host's cgroup tree into the agent container, so the agent container can see the entire tree of cgroup namespaces created by docker? sanity check me: what occurs when this script runs on a host without cgroup_v2 enabled? |
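One way a script could tell which cgroup hierarchy a host exposes, sketched under assumptions: the helper name `cgroup_mode_from_fstype` is hypothetical, and the mapping relies on `stat -fc %T /sys/fs/cgroup` reporting `cgroup2fs` on a unified (v2) host and `tmpfs` on a v1 or hybrid host.

```shell
# Hypothetical helper: map the filesystem type of /sys/fs/cgroup
# (as printed by `stat -fc %T`) to a cgroup hierarchy label.
cgroup_mode_from_fstype() {
    case "$1" in
        cgroup2fs) echo "v2" ;;            # unified hierarchy mounted
        tmpfs)     echo "v1-or-hybrid" ;;  # legacy or hybrid layout
        *)         echo "unknown" ;;       # not mounted or unexpected type
    esac
}
```

Usage would look like `cgroup_mode_from_fstype "$(stat -fc %T /sys/fs/cgroup 2>/dev/null)"`, with the script deciding whether to warn or exit on anything other than `v2`.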
@wazir-ahmed - this all sounds good. The other approach would be to start the container with the host's "mount" namespace, like we do with the net namespace. I think your approach is fine, and I will approve just after checking that init.sh either fails or falls back to something safe. However, did we consider sharing the host's mount namespace as well?
@ldelossa - We can use the docker option Initially, I mentioned both the approaches in #569. But then, I followed the |
@wazir-ahmed okay, I'm cool with the mount approach. If we don't need any other facilities of the host's filesystem, it's probably best to just bind mount what we need.
@ldelossa Sharing host's mount namespace (or bind mount) isn't going to address the issue. The namespace in question here is cgroup.
On second thoughts, let's go ahead with this approach so that we don't have to run the additional mount commands. This option will obviously only work with Docker container runtime. But I just checked, and we already pass a bunch of other Docker options while creating the cilium container in the script. The end result for both the options is same - cilium container is started in the underlying host's cgroup namespace. Sorry, I should've checked this earlier. |
Per last comment, let's go ahead with the Docker CLI option.
Edit : Please add a note in the code explaining why we are passing that option.
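A rough sketch of what the requested in-code note might look like. Only `--cgroupns=host` comes from this discussion; the function name `agent_run_cmd`, the `-d` flag, and the image argument are illustrative assumptions, and the real script passes other options that are not shown here.

```shell
# Hypothetical helper: print (not execute) a `docker run` command line
# for the agent container. Everything except --cgroupns=host is illustrative.
agent_run_cmd() {
    image="$1"
    # Recent versions of Docker create a new cgroup namespace for each
    # container. Passing --cgroupns=host keeps the agent in the host cgroup
    # namespace so that service load-balancing works in external workloads.
    printf 'docker run -d --cgroupns=host %s\n' "$image"
}
```

Printing the command instead of running it keeps the sketch free of side effects; as noted above, the option only applies to the Docker container runtime.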
Thanks @aditighag, I had it in my head that the root cgroup fs on the host can view all child cgroups created by Docker, so giving the agent access to the host's cgroup fs would accomplish the goal. |
That would require cgroup2 fs be already mounted on the host. |
Ah okay, and in this case it is not? |
@aditighag I have added the Docker CLI option |
Thanks!
Recent versions of Docker create a new cgroup namespace for each container. Add Docker CLI option `--cgroupns=host` to the external workload installation script (generated by `cilium clustermesh vm install`) so that the cilium agent will run in host cgroup namespace and service load-balancing will work as expected in external workloads. Fixes: cilium#569 Signed-off-by: Wazir Ahmed <wazir@accuknox.com>
Multicluster test failed for the new commit. AFAICS, the test shouldn't have failed even though the changes are related. Will run it locally to see if it's a flake. |
The multicluster test failure looks like the known issue #399. I've re-triggered the test to validate. |
Re-run also failed. /cc @tklauser @wazir-ahmed Although the test failure is not related, can you check the minimum Docker CLI version that supports the |
Marking the PR as ready for merge since the test failure is a known flake. |
@aditighag - The CLI reference document mentions that |
@wazir-ahmed Can you please send a PR to document this requirement - https://docs.cilium.io/en/v1.10/gettingstarted/external-workloads/#enable-support-for-external-workloads? |
@aditighag Raised a PR for documenting the Docker requirement - cilium/cilium#17726 |
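A sketch of how a script could guard on the Docker requirement before passing the option. It assumes that `--cgroupns` needs Docker API version 1.41 or newer (Docker 20.10), which is how Docker's CLI reference lists the flag as far as I know; treat that threshold as an assumption. The helper name `version_at_least` is hypothetical.

```shell
# Hypothetical helper: succeed when version $1 is at least version $2.
# Only the major and minor components are compared (e.g. "1.41").
version_at_least() {
    have_major=${1%%.*}; have_rest=${1#*.}; have_minor=${have_rest%%.*}
    want_major=${2%%.*}; want_rest=${2#*.}; want_minor=${want_rest%%.*}
    if [ "$have_major" -gt "$want_major" ]; then return 0; fi
    if [ "$have_major" -lt "$want_major" ]; then return 1; fi
    [ "$have_minor" -ge "$want_minor" ]
}
```

For example, `version_at_least "$(docker version --format '{{.Server.APIVersion}}')" 1.41 || echo "Docker is too old for --cgroupns" >&2` would log a message on older daemons.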
Related to cilium/cilium-cli#572 Signed-off-by: Wazir Ahmed <wazir@accuknox.com>
Recent versions of Docker create a new cgroup namespace for each container. Add Docker CLI option `--cgroupns=host` to the external workload installation script (generated by `cilium clustermesh vm install`) so that the cilium agent will run in host cgroup namespace and service load-balancing will work as expected in external workloads.
Fixes: #569
Signed-off-by: Wazir Ahmed <wazir@accuknox.com>