Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add rootless DinD runner #1644

Merged
merged 3 commits into from
Aug 3, 2022
Merged

Add rootless DinD runner #1644

merged 3 commits into from
Aug 3, 2022

Conversation

some-natalie
Copy link
Contributor

@some-natalie some-natalie commented Jul 13, 2022

Addresses, possibly closes #1288

Docker-in-Docker is great! While I was an enterprise admin, I just wanted GitHub Actions to work the way it does in .com for my users in GHES. However, rootful DinD raises some eyebrows to say the least. This PR adds a rootless DinD runner that does not allow sudo access.

This mitigates some security concerns around nested virtualization, especially in a co-tenanted or enterprise-wide environment. However, it also means the administrator who builds these images on-premises needs to be responsible for every configuration that'd require sudo, such as installing new software with apt-get. On the other hand, this is much more "it just works" experience for users to build Docker images and otherwise use GitHub Actions on-premises. ❤️


I wasn't sure how to add additional runner tests for this, but ran some tests manually on my own demo cluster and everything seems to work as anticipated so far. 😅

Running a docker build job, rootlessly:
Screen Shot 2022-07-13 at 9 05 39 AM

Here's an example test workflow run in my own demo repo. I tested the following

  1. Docker daemon is up and running rootlessly.
  2. Container Actions work as the runner user.
  3. Docker building containers works as the runner user.
  4. sudo fails

There's also a decent amount of debug info in that test as well.

@mumoshu
Copy link
Collaborator

mumoshu commented Jul 19, 2022

@some-natalie Hey! Thanks for your contribution. This looks awesome!

Since you submited this pull request, I had been relearning how rootless dind is supposed to work.

This does seem to remove the ability to run any privileged operations from within workflow jobs scheduled to be run within the runner pod's runner container, which is great.

But privileged: true is still needed, right? (I'm referring to docker-library/docker#291)
If that's correct, can I take this rootless-dind-powered-runner as a way to add another layer of security within the privileged container, although it doesn't allow us to avoid using privileged: true like #1546?

Copy link
Collaborator

@toast-gear toast-gear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Missing the index entry too

@some-natalie
Copy link
Contributor Author

some-natalie commented Jul 25, 2022

Hey @mumoshu, super sorry for the delay here! I was at a conference last week and am still digging out of my notifications.

But privileged: true is still needed, right?

Correct. --privileged allows the nested container access to the main pod's resources, specifically for mounting procfs and sysfs (source). Rootlesskit does provide some additional isolation by not running in a privileged namespace, though.

However, if you want to run this with privileged: false, you should get something to the effect of what's below in the pod logs when trying to launch rootless Docker. The runner will still launch and attach to your repo/org/enterprise, but Docker is unavailable in this situation.

[rootlesskit:parent] error: failed to start the child: fork/exec /proc/self/exe: operation not permitted

If that's correct, can I take this rootless-dind-powered-runner as a way to add another layer of security within the privileged container, although it doesn't allow us to avoid using privileged: true like #1546?

Also correct! That's what this PR is hoping to do - add another layer of security within a privileged container. For some of the teams I work with (and my old team when I was running this in production), the limitations of the runner container approach wouldn't have been workable - many teams rely on the ability to have container Actions built from a dockerfile, for instance. For enterprise admins, balancing security, time investment in running CI, and user expectations that things "just work" in GHES like they did in GitHub.com is a really really hard thing to ask for, but this gives users one more option to consider. ❤️


Missing the index entry too

Forgive the dumb question, @toast-gear , but what are you referring to here? Happy to revise, but not sure I understand what you're asking for. 😄

edit - 🤦🏻‍♀️ yeah, I see that now ... lemme know if this latest commit isn't what you were looking for.

Copy link
Contributor

@isarkis isarkis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do we also want to update runner/Makefile to include actions-runner-dind-rootless build?

ca-certificates \
dnsutils \
ftp \
fuse-overlayfs \
Copy link
Contributor

@isarkis isarkis Jul 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I had this PR tested on our self hosted runners infra in GKE. We found fuse-overlayfs not working very well on Ubuntu (super slow, deadlocking, etc.). Everything is working as expected when using default overlay filesystem.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@isarkis, I think it's a bit out of scope for this PR - I superimposed my work on this topic from my demo repository onto this one for the PR. I don't run fuse-overlayfs on my own images either, for much the same reason.

Copy link
Contributor

@isarkis isarkis Jul 25, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Then, let's remove fuse-overlayfs from the runner/actions-runner-dind-rootless.dockerfile. Per https://docs.docker.com/engine/security/rootless/, overlay2, which is the default, is the recommended file system for Ubuntu.

@isarkis
Copy link
Contributor

isarkis commented Jul 25, 2022

@some-natalie, thank you so much for this PR, it will go long way to make self hosted runners workloads more secure. Does it also make sense to give ARC ability to use docker:dind-rootless when dockerdWithinRunnerContainer is set to false?

@some-natalie
Copy link
Contributor Author

Does it also make sense to give ARC ability to use docker:dind-rootless when dockerdWithinRunnerContainer is set to false?

I would say no, but that's not my call to make. First, some enterprises do not wish to allow container Actions at all for supply chain concerns and it's not my place to tell you how to run your business. No containers means no containers. Secondly, I tend to prefer explicit allows/denies over implicit "false, but true-ish" settings in software. Using dind-rootless when dind is explicitly disallowed by dockerdWithinRunnerContainer seems to fall into that unfriendly-to-users vagueness to me.

Copy link
Collaborator

@mumoshu mumoshu left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I managed to make it work with my test environment!

The only gotcha for me was that I had to use the docker driver for docker-buildx when the buildx builder container is managed by the rootless docker. So, the relevant part of my Actions workflow definition now looks like:

    - run: docker context create mycontext
    - run: docker context use mycontext
    - name: Set up Docker Buildx
      uses: docker/setup-buildx-action@v1
      with:
        buildkitd-flags: --debug
        driver: docker
        endpoint: mycontext

where in "rootful" it worked without driver: docker.

The docker buildx driver doesn't support build cache export --cache-from/to in docker-buildx-build args isn't valid anymore.

To be clear, all these gotchas are unrelated to this awesome rootless DinD runner feature!
So, this change LGTM.

Thank you so much for your contribution!

@mumoshu
Copy link
Collaborator

mumoshu commented Jul 31, 2022

Using dind-rootless when dind is explicitly disallowed by dockerdWithinRunnerContainer seems to fall into that unfriendly-to-users vagueness to me.

This is just an attempt for a friendly correction, but dockerdWithinRunnerContainer is for toggling whether to use a sidecar container for dind, not for toggling dind.

With dockerdWithinRunnerContainer: true ARC doesn't create the sidecar. In combination with runner.spec.image set to one of our dind runner images(we now have two, one for runner w/ rootful dind, another for runner w/ rootless dind), it will allow you to colocate dockerd in where the same container that actions/runner is running.

With dockerdWithinRunnerContainer: false, which is the default setting, ARC does create the sidecar container for dockerd. This allows you to manage the container image for actions/runner and dockerd independently.

Honestly speaking, I don't know what might be the rationale if a person prefers a dockerdWithinRunnerContainer setting over another.

Although it's not the default setting, probably dockerdWithinRunnerContainer: true is more easier to understand and configure as it can naturally share all the volumes and linux users/groups and the filesystem between the runner and dockerd.

dockerdWithinRunnerContainer: false was the default behavior since the inception of the ARC project. So it's proven to work a bit more than true. One of the benefits of it being false is that you can differentiate the dockerd version per runner using spec.dockerImage (in contrast to spec.runnerImage which is used for configuring the actions/runner image) as mentioned above. But I don't know if it's really used by anyone 🤔

That said,

@isarkis Do you have any need to use this feature with dockerdWithinRunnerContainer: false? Or are you just wondering whether it literally make sense or not?

@isarkis
Copy link
Contributor

isarkis commented Aug 1, 2022

@mumoshu, there is no need to use this feature with dockerdWithinRunnerContainer: false. I think that, eventually, it would be a nice to have.

I do have a problem with fuse-overlayfs being installed and used in the rootless runner container running Ubuntu. The default overlay2 is the recommended storage drive and it's way faster than fuse-overlayfs which is a single process for the entire container and quickly becomes the bottleneck for any IO intensive operation. We've confirmed this in our IO intensive workloads, and did notice degraded performance with occasional deadlocks. Things went back to normal after rebuilding rootless runner image without fuse-overlayfs.

@some-natalie
Copy link
Contributor Author

some-natalie commented Aug 1, 2022

@mumoshu thank you for the clarification on dockerdWithinRunnerContainer! That's really helpful.

@mumoshu and @isarkis - I'm happy to open a separate PR for removing fuse-overlayfs from all runner images too, if that'd help. Also happy to update it in this one. Would only take a couple minutes 🤷🏻‍♀️

@toast-gear - friendly bump on your review? ❤️

Copy link
Collaborator

@toast-gear toast-gear left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

@some-natalie apologies for the delay, extremely busy at the moment.

@mumoshu
Copy link
Collaborator

mumoshu commented Aug 3, 2022

@some-natalie Thanks for your confirmation!

I'd prefer to merge this now so that we can state that this already works with a caveat(fuse is slow), and potentially involve more testers asap.

I'd greatly appreciate it if you could submit a follow-up PR for the removal of fuse-overlayfs!

Thank you again for your contribution 😄

@mumoshu mumoshu merged commit 37aa1a0 into actions:master Aug 3, 2022
mumoshu added a commit that referenced this pull request Aug 3, 2022
mumoshu added a commit that referenced this pull request Aug 4, 2022
mumoshu added a commit that referenced this pull request Aug 23, 2022
mumoshu added a commit that referenced this pull request Aug 23, 2022
Related to #1644

Signed-off-by: Yusuke Kuoka <ykuoka@gmail.com>
mumoshu added a commit that referenced this pull request Aug 24, 2022
Related to #1644

Signed-off-by: Yusuke Kuoka <ykuoka@gmail.com>

Signed-off-by: Yusuke Kuoka <ykuoka@gmail.com>
mumoshu added a commit that referenced this pull request Sep 25, 2022
@mumoshu
Copy link
Collaborator

mumoshu commented Sep 25, 2022

@some-natalie Hey! JFYI, I'm fixing an issue in the rootless startup script in #1856.

@some-natalie
Copy link
Contributor Author

@mumoshu thank you SO MUCH for the heads up and reminder here!!!

mumoshu added a commit that referenced this pull request Oct 13, 2022
neggles pushed a commit to neggles/actions-runners that referenced this pull request Nov 10, 2022
neggles pushed a commit to neggles/actions-runners that referenced this pull request Nov 10, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
4 participants