Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support unprivileged container #172

Closed
wants to merge 1 commit into from
Closed

Support unprivileged container #172

wants to merge 1 commit into from

Conversation

pjbgf
Copy link

@pjbgf pjbgf commented Aug 7, 2020

Add support to run kured as a non-privileged container also restraining it from using any capability apart from SYS_BOOT.

Relates to #60 and SUSE/skuba#1237

@evrardjp
Copy link
Collaborator

evrardjp commented Aug 13, 2020

hello @pjbgf! Awesome! I will review this.
Some folks intended to make that reboot command configurable, so we might have to communicate on a few issues about it.

Copy link
Collaborator

@evrardjp evrardjp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

While I like the concepts, there are a few things that bother me:

  • We use now two daemonsets, the new one is only to a have shim to sync files. Correct? And that's because there is no other way to expose with kubernetes if a file is either absent or present on the host OS (because kubernetes exposes only existing folder/files). Still correct? In that case, I believe that the commit message of this patch should clarify why we introduce this (necessary to not expose the whole host folder matching the dirname of the sentinel file)
  • The reboot-sentinel file becomes now a liability: we now have static paths in the manifest file. Of course this is not a problem with a helm chart that could template that location. However, it will be annoying to deal with kustomize, and easy to miss for some users.
  • Need to maintain more things in the manifest file: Similarly to the previous point: what do I do if I don't need that daemonset shim? I will have to write and maintain my own manifest. In our use case, we already have a translation layer, which checks our maintenance updates and output a file for kured, and we rely on the basic default kured manifest. With this, we would need to deploy this new daemonset, doing about the same as we do, and/or modify the manifest. We would prefer be close to upstream as much as possible.
    For our operational teams, it also means they now have multiple things to follow to know what will trigger a reboot, which is tad more complex operationally. And maintain the bash image of course.

The TL;DR: is that I like the change, but it's more complex for our operational folks. I am also not sure if kured manifest should be have that extra ds in charge of this file touch/rm, and having to maintain yet another image.

I would welcome the change on a different branch, where we can take further assumptions, without being too impactful on the current model of working. For example, we could required the sentinel file to be in a certain location, simplifying the tooling/code/manifests: "Want to use kured? When you want to coordinate a reboot, just drop a file in the folder /opt/kured/trigger-reboot-syscall, and we'll take care of the rest. Want to run a command instead? Just drop something in /opt/kured/trigger-command, and we'll take care of the rest. If your command needs extra privileges, please adapt the manifest accordingly by ."
The extra daemonset manifest can then be "smarter", it could be the compatibility layers for ubuntu, centos, SUSE, etc., which creates the necessary daemonsets to watch for the hosts, issue the OS specific command to know if it's necessary to reboot, and drop the file into the known location. It makes the interface by far simpler, and cleaner, IMO.

The reason I am saying this, is that people are looking for the granularity of actions to do on the reboot, and I have the impression that this patch is not the direction some people are looking towards to, see also #122 . That include a poweroff, which could also leverage that kind of code and security, just a different syscall :)

What do you think?

@pjbgf
Copy link
Author

pjbgf commented Aug 13, 2020

Hey @evrardjp,

Thank you for taking the time to review the changes, you gave me a lot of food for thought.
I do agree that flexibility is key - specially when we can keep the attack surface small.

Since your last review I made a few changes:

  1. The changes to main.go are now backwards compatible. Meaning that users can continue to use kured just as they did before. However, users wanting to run as unprivileged containers would have that option by using kured-ds-unprivileged.yaml instead.

  2. To run without privileged an environment variable PRIVILEGED with value true will need to set.

  3. I renamed the sentinel being used on the Daemonset to explicitly show the flexibility with this approach, however, they could both have the same name/location.

As for the reboot and other commands, I think those are great additions and can see the value they would add. I wonder whether they need a full branch of its own or if they could just be added as follow-up PRs? But I am happy either way. 👍

PTAL

Copy link
Collaborator

@evrardjp evrardjp left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think I am fine with this approach, but a few details remain to iron out, IMO.

rebootCmd := newCommand("/usr/bin/nsenter", "-m/proc/1/ns/mnt", "/bin/systemctl", "reboot")
if err := rebootCmd.Run(); err != nil {
// Relies on hostPID:true and SYS_BOOT to restart machine
err := syscall.Reboot(syscall.LINUX_REBOOT_CMD_RESTART)
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I would also wrap with the unprivileged condition, and use the systemctl reboot for privileged ones.

AFAIK, the syscall will only do the reboot, and therefore we might be at least missing a sync in the reboot process, to not corrupt data... WDYT?

Copy link
Author

@pjbgf pjbgf Aug 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good spot on the sync. We actually need it in both cases, so I have amended my PR to include that.

@@ -0,0 +1,140 @@
---
Copy link

@saschagrunert saschagrunert Aug 14, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is there any drawback so that we do not make this deployment the new default?

(Would prefer having it the default)

Copy link
Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't see any major problem on pushing this to be the default.
The new approach won't break existing workloads and we could update the yaml and helm chart to be a seamless replacement.

The only down side of the new approach is having two daemonsets, which in my opinion is a tiny downside when considering we will finally be able to run kured unprivileged and with a single Linux capability.

Shall we get this merged and then a follow up PR to just make it the default so we can hear more feedback just around that possibility?

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah sounds reasonable, I tend to make it the default. 👍

Copy link

@saschagrunert saschagrunert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

@pjbgf
Copy link
Author

pjbgf commented Aug 14, 2020

@saschagrunert @evrardjp Just added a call to syscall.Sync() pre-reboot to avoid data loss - which is the same process as systemctl does internally, PTAL.

@evrardjp
Copy link
Collaborator

It's kinda the same process as systemd's shutdown target, but it's not exactly the same, which leaves us with a gap IMO.

the shutdown service will do quite more things 0, if I am not mistaken (the sync can hang for example, so systemd has implemented code 1 to be more reliable). It means we'll have to do the same, as we are effectively reimplementing a reboot procedure here.

@pjbgf
Copy link
Author

pjbgf commented Aug 20, 2020

It's kinda the same process as systemd's shutdown target, but it's not exactly the same, which leaves us with a gap IMO.
the shutdown service will do quite more things 0, if I am not mistaken (the sync can hang for example, so systemd has implemented code 1 to be more reliable). It means we'll have to do the same, as we are effectively reimplementing a reboot procedure here.

@evrardjp Thank you for the feedback. I made a few changes to keep the existing behaviour and also add an async call to sync() with a timeout, which should be analogous to the systemd's implementation. This should get us the best of both worlds.

PTAL

@evrardjp
Copy link
Collaborator

evrardjp commented Aug 20, 2020

While there are still differences, I think this should be "good enough" as a first pass. If someone else would like to step up by testing it (I didn't, I just read the code) and review it, that would be awesome!

Thanks @pjbgf : )

@pjbgf
Copy link
Author

pjbgf commented Aug 30, 2020

@evrardjp @saschagrunert any ideas how can we move this forward?

Copy link

@saschagrunert saschagrunert left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Code LGTM, @dholbach please take a look :)

@evrardjp
Copy link
Collaborator

@dholbach if we are worried that this code might be too disruptive, maybe we can start another branch.

Copy link
Member

@ckotzbauer ckotzbauer left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Don't tested it, but the code looks good!

@evrardjp
Copy link
Collaborator

evrardjp commented Sep 1, 2020

My biggest concern is that we reimplement the wheel of a restart process, which is what systemd is already tackling. However, as said above, I think we should merge it and iterate, maybe in a different branch. I feel like the work could quite reduce the surface :)

@dholbach we should probably talk about this patch in our next meeting.

@pjbgf
Copy link
Author

pjbgf commented Sep 24, 2020

Hey @evrardjp @dholbach, any news on this one?

@evrardjp
Copy link
Collaborator

We've discussed it in our meeting, but the tl;dr: is that we need to evaluate the different approaches for reboot in more detail.
We might want to make this bit of code an optional path instead of being the unique way.

For that we, might want to add a flag at kured to define the method of reboot (syscall/command), to keep the current behaviour unchanged.

@github-actions
Copy link

This PR was automatically considered stale due to lack of activity. Please refresh it and/or join our slack channels to highlight it, before it automatically closes (in 7 days).

@pjbgf
Copy link
Author

pjbgf commented Feb 4, 2021

Is it worth reopening it and implementing it based on the new flags then or is running privileged the preferred approach?

@evrardjp
Copy link
Collaborator

evrardjp commented Feb 5, 2021

AFAIK, there is still no official stance on the way forward. I rewrote bits of the reboot logic to aggregate a few PR, and thought about this one while doing so. I would say it's not worth rebasing this PR as of now, but at the same time, I would say that running unprivileged was not discarded at all.

@evrardjp evrardjp reopened this Feb 5, 2021
@pjbgf
Copy link
Author

pjbgf commented Mar 29, 2021

@evrardjp thank you for the update. Giving the lack of traction I will close this PR, if things change in the future we can always re-open it.

@pjbgf pjbgf closed this Mar 29, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants