Add the ability to map containers to a user host-side #22415

Open
atrauzzi opened this issue Apr 29, 2016 · 13 comments

Comments

@atrauzzi

commented Apr 29, 2016

Before getting into the details, I want to point out that I suspect a feature like what I'm about to suggest could solve many permission related snags when using Docker.

So as to prevent this suggestion from getting mired in philosophical objections, this ticket is not advocating permanently storing credentials inside a container. I don't do that and I don't suggest anyone else do it either!


tl;dr

The current method of UID spoofing inside of a container via --user has limitations that end up mandating entire ecosystems of one-offs and very problematic workarounds to massage container state into the right frame of mind. This is worsened by the fact that the need to perform these workarounds is influenced by what platform you are calling Docker from.

The feature I'd like to request is to have a flag added to docker run (and equivalents) that tells Docker to map all filesystem operations that a container performs to a specific user host-side. This is very different to --user which simply sets the user of the process that runs inside the container.

If you're interested in an extended explanation, read on!


Rationale

We've all heard about people having issues with temporarily binding SSH credentials and similar $UID & $USER related difficulties.
Something that I've noticed when using Docker on OSX and Windows however is that I'm actually freed from the challenges of ensuring my ephemeral development containers (my primary use case) aren't writing files to the filesystem as root! This is owed to the fact that they both rely on a variety of bridges back to the native platform's filesystem.

One non-exclusive example is forwarding either an SSH socket or the SSH directory on a Linux host. Because SSH insists on proper home directories, I've been unable to get secure connections working without making my container aware of the environment it is running in! The best I've been able to dream up is bind mounting /etc/passwd, /etc/group and /home into the container, which is exactly as horrible as it sounds, but also my only choice given that I don't know what uids will be right for the host filesystem. This clearly runs against Docker's attitude towards portability.

Another, more common example is files created by the container: if the bound filesystem is ext or otherwise Linux-compatible, the UID of those files is set to root (or whoever I run my container as). On Linux, in order to not blast my filesystem with files having UID/GID 0, I have to tell my containers to run their processes as my current user via --user.
This might work in trivial scenarios, but again ends up falling apart in situations where the user system inside of the container is different to the host or the binaries being run enforce /home directory requirements like SSH above. No host is likely to ever have a reliable way to influence the user system inside the container. This is especially true when using images produced by third parties.
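
For reference, the --user workaround described above might be scripted roughly as follows. This is a minimal sketch: the helper name `run_as_current_user_argv`, the image, and the paths are illustrative assumptions; only the `--user` and `-v` flags come from the Docker CLI. It only builds the command line, which makes the limitation visible: the numeric uid/gid is passed through, but nothing guarantees that such a user exists in the image's own /etc/passwd or has a home directory there.

```python
import os
import shlex


def run_as_current_user_argv(image, host_dir, container_dir, command):
    """Build a `docker run` argv that runs the container process as the caller's uid/gid.

    Hypothetical helper for illustration; it does not invoke Docker.
    """
    uid_gid = f"{os.getuid()}:{os.getgid()}"  # numeric ids of the calling host user
    return [
        "docker", "run", "--rm",
        "--user", uid_gid,                    # process inside the container runs with these ids
        "-v", f"{host_dir}:{container_dir}",  # bind mount; new files get the process's ids
        image,
        *command,
    ]


if __name__ == "__main__":
    argv = run_as_current_user_argv("alpine", os.getcwd(), "/work", ["touch", "/work/example"])
    print(shlex.join(argv))
```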

So, I currently am stuck avoiding the complexity by staying as root. Everything works because all containers have a root user fully configured. But now without any better option, all Docker containers that want to avoid being polluted with hacks are forced to write files as UID 0!

As I mentioned above, switch over to Windows or OSX and this problem goes away because they don't share the same users as the container and the filesystems aren't compatible. Instead, vboxsf, samba and other drivers are actually emulating the feature I'm requesting here!

So, this clearly identifies the fact that while it's nice to control who the container runs its process as, it would actually be more desirable to optionally map the uid (and gid?) flag for any changes to the filesystem from the container, host-side. All the while, still allowing the container to function as whoever it needs to be internally.
This jibes with the philosophy that Docker containers should be portable and require zero awareness of the environment that is running them. Indeed, if you examine the nature of this suggestion, it's conceptually parallel to binding ports and bind mounting filesystems.

We need a way to bind users as well.

@MichaelAquilina

commented Apr 23, 2019

@atrauzzi are you aware of any progress or discussion around this since your post?

@atrauzzi

Author

commented Apr 25, 2019

Nope, and I was even thinking about this issue today. IIRC, @thaJeztah agreed with the value behind such a feature, might have been advocating for it internally (correct me if I'm wrong 😅).

This is probably one of the more glaring oversights in the docker platform and in a strange twist, keeps me preferring to work with it on macOS/Windows instead of natively on linux!

Feel free to retweet, this issue is as old as my daughter!

@thaJeztah

Member

commented Apr 26, 2019

If the main issue is to have the ability to bind-mount files from your host into a container, then this is largely the same issue as;

  • #28593 User namespaces - Phase 2
  • #2259 Add ability to mount volume as user other than root

And comes down to the ability to remap user-ids inside the container. This might be possible with a kernel that supports shiftfs or a FUSE filesystem.

@cpuguy83 did some work on something in this area in https://github.com/cpuguy83/idmapfs (implementing a FUSE driver to do uid/gid remapping), and in a WIP pull request to use it for userns; #38795
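
As a rough illustration of what "remapping user-ids" means for user namespaces: the kernel describes each mapping as a range of three numbers (first id inside the namespace, first id outside it, length), as exposed in /proc/<pid>/uid_map. The sketch below translates a container uid to a host uid through such ranges. The function name and the sample range are assumptions chosen for the example, not Docker's implementation.

```python
from typing import List, Optional, Tuple

# One mapping range: (first id inside the namespace, first id on the host, length)
IdRange = Tuple[int, int, int]


def to_host_id(container_id: int, ranges: List[IdRange]) -> Optional[int]:
    """Translate an id seen inside a user namespace to the host id, or None if unmapped."""
    for inside_start, outside_start, length in ranges:
        if inside_start <= container_id < inside_start + length:
            return outside_start + (container_id - inside_start)
    return None


if __name__ == "__main__":
    # Illustrative range: container ids 0..65535 shifted to host ids 100000..165535.
    sample = [(0, 100000, 65536)]
    print(to_host_id(0, sample))      # container root -> 100000
    print(to_host_id(33, sample))     # -> 100033
    print(to_host_id(70000, sample))  # outside the range -> None
```

Note that this shifts whole ranges of ids. A bind-mounted file owned by host uid 1000 still does not line up with anything inside the container unless the filesystem layer translates ids as well, which appears to be the gap that shiftfs or a FUSE driver would have to fill.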

@atrauzzi

Author

commented Apr 26, 2019

Doesn't that end up requiring the container to have special setup?

The idea here is that regardless of what the container chooses to do to the filesystem it sees, any writes coming out of it are abstracted to a specific uid and/or gid.

@thaJeztah

Member

commented Apr 26, 2019

It's not something that should be (or could be) enabled by default, as;

  • docker won't know what mapping to make (i.e. wouldn't know what local uid/gid you want to map to which uid/gid inside the container)
  • by default ignoring/mapping uid/gid means that a non-privileged user inside the container would get full access to bind-mounted files from the host (files which could be accessible to certain users only).

For Docker Desktop (Docker for Mac / Docker for Windows), the "ignore ownership" feature was implemented because Docker Desktop is targeted at developer use-cases, not for production (in production situations, giving a container access to files on the host is especially not desirable)

@atrauzzi

Author

commented Apr 26, 2019

I'm not sure I understand the distinction ~"people dev on Windows/macOS" as I'm sure plenty of people develop on linux as well. If that's really an explanation for the difference in behaviour, they surely need some equivalent that is easy to use.

The mappings should be configured per volume. So if I bind a volume to a container, I should be able to set a uid and/or gid that all write operations to that volume map out to on the host. Internally the container will retain its own behaviour, but I can't have containers that write to uids and gids that might not even be configured on my host system.
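
To make the request concrete, here is a toy model of the semantics being asked for. It is not an existing Docker feature as far as this thread establishes; `VolumeBinding`, `host_uid`, and `host_gid` are hypothetical names. When a binding sets them, every write through that volume lands on the host under those ids, regardless of which uid/gid the container process used. When they are unset, ids pass through unchanged, as they do today.

```python
from dataclasses import dataclass
from typing import Optional, Tuple


@dataclass(frozen=True)
class VolumeBinding:
    """Hypothetical per-volume configuration, used only to model the proposal."""
    host_path: str
    container_path: str
    host_uid: Optional[int] = None  # None means: keep the container's uid
    host_gid: Optional[int] = None  # None means: keep the container's gid


def host_owner(binding: VolumeBinding, container_uid: int, container_gid: int) -> Tuple[int, int]:
    """Return the (uid, gid) a file written through `binding` would get on the host."""
    uid = container_uid if binding.host_uid is None else binding.host_uid
    gid = container_gid if binding.host_gid is None else binding.host_gid
    return uid, gid


if __name__ == "__main__":
    mapped = VolumeBinding("/home/me/src", "/app", host_uid=1000, host_gid=1000)
    plain = VolumeBinding("/home/me/src", "/app")
    print(host_owner(mapped, 0, 0))    # container root writes as host 1000:1000
    print(host_owner(mapped, 33, 33))  # so does any other container user
    print(host_owner(plain, 0, 0))     # no mapping configured: ids pass through
```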

There is clearly a missing abstraction here.

@cpuguy83

Contributor

commented Apr 26, 2019

The missing abstraction is that this is simply not possible on Linux without FUSE.
Docker for Mac uses FUSE to do this, and it's horribly slow for most use cases.

@atrauzzi

Author

commented May 1, 2019

@cpuguy83 - Sorry, can you elaborate? I might not be following on that one, why would FUSE be necessary?

Really, what this ticket represents applies regardless of what the current state of things is. I don't think it's at all a stretch to say that my /etc/passwd and /etc/group can't possibly be expected to harmonize with every container and every permutation of them I can end up running. Obviously the same applies to the containers' own internal assumptions as well.

As a practical but still contrived example: If I'm running on a system and one container writes as www-data, another uses httpd and another is using webapp, they will all write as whatever the corresponding uid maps out to internally. To the host though, those uids could end up being anything. Worse still, between containers, you have a whole other set of permutations to deal with.
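
A small sketch of that mismatch: the snippet parses passwd(5)-style text for three imaginary images and a host, then shows who the host thinks owns each container user's files. All file contents and uid values here are invented for illustration.

```python
from typing import Dict


def parse_passwd(text: str) -> Dict[str, int]:
    """Map user name -> uid from passwd(5)-formatted text (name:password:uid:gid:...)."""
    users = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#"):
            continue
        fields = line.split(":")
        users[fields[0]] = int(fields[2])
    return users


def host_name_for(uid: int, host_users: Dict[str, int]) -> str:
    """Return the host user name owning `uid`, or a marker if the host has no such user."""
    for name, host_uid in host_users.items():
        if host_uid == uid:
            return name
    return "<no such user>"


# Invented sample data; real images and hosts will differ.
HOST = "root:x:0:0::/root:/bin/sh\nme:x:1000:1000::/home/me:/bin/sh\nbackup:x:33:33::/var/backups:/bin/sh\n"
IMAGES = {
    "image-a": "www-data:x:33:33::/var/www:/bin/false\n",
    "image-b": "httpd:x:1000:1000::/var/www:/bin/false\n",
    "image-c": "webapp:x:4242:4242::/srv:/bin/false\n",
}

if __name__ == "__main__":
    host_users = parse_passwd(HOST)
    for image, passwd_text in IMAGES.items():
        for name, uid in parse_passwd(passwd_text).items():
            print(f"{image}: {name} (uid {uid}) appears on the host as {host_name_for(uid, host_users)}")
```

In this made-up setup, one container's files show up as belonging to an unrelated host account, another's as belonging to the developer, and the third's to nobody the host knows about.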

To me, it seems like the only possible answer here is that we need some way during instantiation to abstract the filesystem operations that originate from within the container to the local system. More importantly, this has to be possible without having to put that information in the containers themselves.

Ideally: Each container is responsible for its own security abstractions and, when configured, has all operations performed on bound filesystems mapped to a specific uid/gid as defined by the host.

@cpuguy83

Contributor

commented May 1, 2019

FUSE is required because there is no way to do these mappings at the filesystem level.

See "shiftfs" for an attempt to bring this into the kernel generically.

@atrauzzi

Author

commented May 1, 2019

Would be super nice if Docker abstracted something like that.

@Lexicality

commented May 8, 2019

Could we have an option to use FUSE on Linux then? This issue causes huge amounts of problems for devs who use Linux in our organisation, and maintaining workarounds to let them do their jobs is irritating.
I accept that there will be performance issues with that - but since we mostly use Docker for Mac that's really not a concern for us.
I'd rather have "slow and working" over "fast but broken" any day.

@JacobBrownAustin

commented May 8, 2019

FUSE support would be awesome! A lot better than setting up rsync scripts which I'm doing now which is a pain. I've also considered adding ssh service to my containers just to use sshfs to work around this issue.

@cpuguy83

Contributor

commented May 8, 2019

I've been experimenting specifically with a filesystem for mapping UIDs/GIDs for user namespaces... #38795

How something like this might work for mapping a name in the container to a name on the host... I'd leave that to people who are interested in this.
I guess we already process /etc/passwd in the container for --user=<name> (but it's horrible and doesn't work for some cases).
