Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Support recursively read-only (RRO) mounts #45278

Merged
merged 1 commit into from
May 26, 2023
Merged

Conversation

AkihiroSuda
Copy link
Member

@AkihiroSuda AkihiroSuda commented Apr 5, 2023

- What I did

Added the support for recursively read-only (RRO) mounts.

docker run -v /foo:/foo:ro is now recursively read-only on kernel >= 5.12.
Automatically falls back to the legacy non-recursively read-only mount mode on kernel < 5.12.

Use ro-non-recursive to disable RRO.
Use ro-force-recursive or rro to explicitly enable RRO. (Fails on kernel < 5.12)

(EDIT: the CLI options are being changed in #46037 and docker/cli#4316 )

It is highly recommended to use RRO in conjunction with rprivate propagation.

- How I did it

By using OCI mount option "rro", which is implemented by calling mount_setattr(2) with AT_RECURSIVE and MOUNT_ATTR_RDONLY.

- How to verify it

Run docker run -v /mnt:/mnt:ro,rprivate on kernel >= 5.12, and make sure /mnt/recursive-mount-dir is read-only.

- Description for the changelog

Support recursively read-only (RRO) mounts

- A picture of a cute animal (not mandatory but encouraged)
🐧

@@ -118,6 +120,11 @@ func (daemon *Daemon) openContainerFS(container *container.Container) (_ *contai
if err := mount.Mount(m.Source, dest, "", opts); err != nil {
return err
}
if m.RecursivelyReadOnly {
if err := makeMountRRO(dest); err != nil {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is this something that should be provided by github.com/moby/sys/mount ? (either through a mount.MakeRecursiveReadOnly() or mount.Mount() accepting this as option?

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not necessary IMHO, as the function just wraps a single MountSetattr call with the EINTR loop.

api/swagger.yaml Outdated
Comment on lines 394 to 396
enum:
- "if-possible"
- "required"
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hm.. bit on the fence here how granular we want to be here (assuming users either know what their system supports, or to get an error that it's not supported).

We could still decide to fall back (and warn) on unsupported platforms, or be explicit and error. (pros/cons to both, I can imagine "error" could mean "works on my machine") OTOH; if you're targeting a specific environment, then perhaps that's the way to go.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

FYI: this tri-state granularity was proposed in kubernetes/enhancements#3858 (comment)

Comment on lines 117 to 119
{"/tmp:/tmp5:rro", "", mount.TypeBind, "/tmp5", "/tmp", "", "", false, mount.RecursivelyReadOnlyRequired, false},
{"/tmp:/tmp6:rro-if-possible", "", mount.TypeBind, "/tmp6", "/tmp", "", "", false, mount.RecursivelyReadOnlyIfPossible, false},
{"/tmp:/tmp7:rro-if-possible,rprivate", "", mount.TypeBind, "/tmp7", "/tmp", "", "", false, mount.RecursivelyReadOnlyIfPossible, false},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I wish we could fully freeze (if we didn't not already) the shorthand syntax. To some extend, I can see rro being "reasonable", but (also see my other comment) a bit on the fence on also having rro-if-possible

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The --mount CSV form is too long for CLI users, and a huge portion of users still have to stick to older kernels, so I'd prefer to retain rro-if-possible as the "best effort" mode, but I can move it to a separate follow-up PR if you prefer.

Comment on lines 570 to 577
for _, m := range hostConfig.Mounts {
if bo := m.BindOptions; bo != nil {
bo.RecursivelyReadOnly = ""
}
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This is a tricky one; is this hit both when using the shorthand (-v <foo>:<bar>:<options>) and long-form (--mount type=....,opt=...,opt=....) formats?

Because at a glance this may actually result in the "reverse" of the current behaviour. Current APIs will return an error for this case;

docker run -it --rm -v (pwd):/docker:rro alpine
docker: Error response from daemon: invalid mode: rro.
See 'docker run --help'.

I guess this is different than other options, because "options" itself are supported, but not this specific one.

I also think we should either

  • fully error (unsupported)
  • or fallback to ro

Because we should not silently ignore to not mounting read-only if the user expects it to be read-only.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Updated to return an error

@cpuguy83
Copy link
Member

cpuguy83 commented Apr 6, 2023

We discussed this on the maintainers call today (@thaJeztah @tianon @corhere @neersighted)
There was much debate on how to approach this.

I believe we landed on this (feel free to correct me if I got it wrong):

  • Change ro to opportunistically try to do a recursive read-only mount (this would be the if-possible variant describe in the PR body). We expect that most, if not all, people are surprised by the current behavior.
  • We could add warnings when recursive is not possible on docker info and/or on the container create/start endpoints (if ro was requested), but this may be needlessly noisy for cases that don't actually have nested mounts.
  • Have rro require recursive and error out if not possible.
  • We may need a daemon option to disable the opportunistic upgrade for ro.

This means users will automatically get the enhanced security where they can and those same specs will continue to work on older daemons or machines that don't support it.

@AkihiroSuda
Copy link
Member Author

Change ro to opportunistically try to do a recursive read-only mount (this would be the if-possible variant describe in the PR body). We expect that most, if not all, people are surprised by the current behavior.

We can't do this. Kubernetes needs ro to retain the historical behavior: kubernetes/enhancements#3858 (comment)

@AkihiroSuda
Copy link
Member Author

AkihiroSuda commented Apr 7, 2023

Other usecases of the historical ro:

  • Mount /src as RO, while leaving /src/_output writable
  • Mount /lecture_materials as RO, while leaving /lecture_materials/student_assignment_submission writable

@thaJeztah
Copy link
Member

Wouldn't that case still be supported if the nested mounts are defined as separate mounts?

@AkihiroSuda
Copy link
Member Author

Wouldn't that case still be supported if the nested mounts are defined as separate mounts?

Yes, but it's a breaking change.

@thaJeztah
Copy link
Member

separate mount as in;

- v /src:ro -v /src/_output:rw

@thaJeztah
Copy link
Member

I expect 99% of those situations to be unintentional (and unexpected; the user asked for the mount to be read only, but another mount ended up in there, and now it's not).

We could consider an advanced (for the --mount extended format) option that explicitly opts-out of recursive.

@AkihiroSuda
Copy link
Member Author

AkihiroSuda commented Apr 7, 2023

I expect 99% of those situations to be unintentional

Yes, but Kubernetes will retain the classic behavior to follow their compatibility policy.

Can we just deprecate ro, and recommend people to switch to rro?

A blocker is that AT_RECURSIVE needs kernel >= 5.12, but I think we can fallback to compose classic read-only mounts to emulate AT_RECURSIVE on an older kernel.
(cc @kolyshkin @cyphar ; Is this safe?)

@corhere
Copy link
Contributor

corhere commented Apr 7, 2023

Opportunistic upgrade could be disabled per bind rather than as a daemon option. Then cri-dockerd would be able to map the Kubernetes flags to their semantic equivalents in the engine API.

K8s Engine
ro ro,ro-nonrecursive
rro rro
rro-if-possible ro

@tianon
Copy link
Member

tianon commented Apr 18, 2023

When we have ro or readonly in our CSV-style usage, it's technically shorthand for readonly=true -- could we abuse that in a similar way to what util-linux ended up doing to implement something like readonly=nonrecursive? Does our CSV-style flag parsing support that already? (true/false flags that don't require a value but can have an optional string value?)

One thing we discussed about emulating recursive read-only is that mount propagation makes it complex to do safely (for example, if you have a shared mount it'd have to be converted to private and then monitored and sync'd unless we do something clever with fuse which has its own drawbacks).

@AkihiroSuda
Copy link
Member Author

AkihiroSuda commented May 10, 2023

Updated PR.
RO mounts are now RRO by default on kernel >= 5.12.

Copy link
Contributor

@corhere corhere left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

docs/api/version-history.md Outdated Show resolved Hide resolved
@AkihiroSuda
Copy link
Member Author

rebased

`docker run -v /foo:/foo:ro` is now recursively read-only on kernel >= 5.12.

Automatically falls back to the legacy non-recursively read-only mount mode on kernel < 5.12.

Use `ro-non-recursive` to disable RRO.
Use `ro-force-recursive` or `rro` to explicitly enable RRO. (Fails on kernel < 5.12)

Fix issue 44978
Fix docker/for-linux issue 788

Signed-off-by: Akihiro Suda <akihiro.suda.cz@hco.ntt.co.jp>
Copy link
Member

@thaJeztah thaJeztah left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

"LGTM" - discussed with @corhere and others during the maintainers call, and all concerns should be addressed, so let's get this in.

@thaJeztah
Copy link
Member

Looks like github sees a conflict, but doesn't show "what" 🤔

If someone can do a quick rebase, then we can get this one in ❤️

@AkihiroSuda
Copy link
Member Author

Rebased

@thaJeztah
Copy link
Member

Thanks!

@thaJeztah thaJeztah merged commit 0db4174 into moby:master May 26, 2023
100 checks passed
@AkihiroSuda
Copy link
Member Author

AkihiroSuda commented May 29, 2023

Comment on lines +16 to +20
"rw": true,
"ro": true, // attempts recursive read-only if possible
"ro-non-recursive": true, // makes the mount non-recursively read-only, but still leaves the mount recursive
"ro-force-recursive": true, // raises an error if the mount cannot be made recursively read-only
"rro": true, // alias for ro-force-recursive
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
area/api area/volumes impact/api impact/changelog kind/enhancement Enhancements are not bugs or new features but can improve usability or performance. status/4-merge
Projects
None yet
7 participants