
update otomi with workaround for docker run in linux #353

Closed
Morriz opened this issue Mar 10, 2021 · 16 comments · Fixed by #473
Labels
Task Scrum task

Comments

@Morriz
Contributor

Morriz commented Mar 10, 2021

See last comments

@Morriz Morriz added the Task Scrum task label Mar 10, 2021
@Morriz Morriz unassigned 0-sv Mar 10, 2021
@Dunky13 Dunky13 added the Spike Time boxed research label Mar 30, 2021
@Dunky13
Contributor

Dunky13 commented Mar 30, 2021

Will research this on my Linux machine to see if this is an issue or a non-issue.

@Dunky13 Dunky13 removed the Task Scrum task label Mar 30, 2021
@Dunky13 Dunky13 self-assigned this Mar 30, 2021
@0-sv
Contributor

0-sv commented Apr 7, 2021

I bake my own container image and don't use the app user.

@Morriz
Contributor Author

Morriz commented Apr 9, 2021

Can you provide more details here to help others?

@0-sv
Contributor

0-sv commented Apr 9, 2021

For development purposes:

  1. Remove the following from the tools Dockerfile:
RUN chown -R app:app /home/app
USER app
  2. Build the tools image, passing the npm token as a build argument (the tools Dockerfile is assumed to read it via ARG; see the sketch after these steps), e.g.
docker build -t otomi/tools:my-tools --build-arg NPM_TOKEN=$NPM_TOKEN .
  3. Point the image referenced in the otomi-core Dockerfile at the tools image you just built:
FROM otomi/tools:my-tools as test
FROM otomi/tools:my-tools as prod
  4. Replace every occurrence of --chown=app in the root Dockerfile:
COPY --chown=app . .

with

COPY . .
  5. Build the otomi-core image, e.g.
docker build -t otomi/core:my-core .
  6. Set otomiVersion in the otomi values to the tag of the core image, e.g.:
clouds:
    aws:
        domain: eks.otomi.cloud
        clusters:
            dev:
                enabled: true
                apiName: eks_otomi-cloud_eu-central-1_otomi-eks-dev
                apiServer: BFE1BC8655F2C73189DCE38F5E40D0AC.gr7.eu-central-1.eks.amazonaws.com
                k8sVersion: '1.19'
                otomiVersion: my-core
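
(Step 2 assumes the tools Dockerfile reads the token as a build argument; a hypothetical sketch, not the actual Dockerfile:)

# in the tools Dockerfile (hypothetical): accept the token as a build arg
ARG NPM_TOKEN
RUN echo "//registry.npmjs.org/:_authToken=${NPM_TOKEN}" > ~/.npmrc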

@Morriz
Contributor Author

Morriz commented Apr 11, 2021

This effectively makes it use the base image user, which is root. Not wanted.

@Dunky13 Dunky13 removed their assignment Apr 13, 2021
@Morriz
Contributor Author

Morriz commented Apr 23, 2021

Assigning to Sebastiaan as he is into this. Don't reassign!

@0-sv
Contributor

0-sv commented Apr 29, 2021

tl;dr: There is a tool that, combined with an entrypoint that fixes ownership on the fly (basically what OSXFS does for Docker for Mac), lets the container drop back to the app user: https://github.com/tianon/gosu.

For reference, this post: https://stackoverflow.com/a/54787364/8357826.

Summary

For others that see this issue with containers running as a different user, you need to ensure the uid/gid of the user inside the container has permissions to the file on the host. On production servers, this is often done by controlling the uid/gid in the image build process to match a uid/gid on the host that has access to the files (or even better, do not use host mounts in production).

A named volume is often preferred to host mounts because it will initialize the volume directory from the image directory, including any file ownership and permissions. This happens when the volume is empty and the container is created with the named volume.
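
(For illustration, a minimal sketch of that initialization using the stock nginx image:)

# "webdata" is empty, so on first use docker seeds it from the image's
# /usr/share/nginx/html, preserving ownership and permissions
docker volume create webdata
docker run --rm -d -v webdata:/usr/share/nginx/html nginx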

MacOS users now have OSXFS, which handles uid/gid mapping automatically between the Mac host and containers. One place it doesn't help is with files from inside the embedded VM that get mounted into the container, like /var/run/docker.sock.

For development environments where the host uid/gid may change per developer, my preferred solution is to start the container with an entrypoint running as root, fix the uid/gid of the user inside the container to match the host volume uid/gid, and then use gosu to drop from root to the container user to run the application inside the container. The important script for this is fix-perms in my base image scripts, which can be found at: https://github.com/sudo-bmitch/docker-base

The important bit from the fix-perms script is:

# update the uid
if [ -n "$opt_u" ]; then
  OLD_UID=$(getent passwd "${opt_u}" | cut -f3 -d:)
  NEW_UID=$(stat -c "%u" "$1")
  if [ "$OLD_UID" != "$NEW_UID" ]; then
    echo "Changing UID of $opt_u from $OLD_UID to $NEW_UID"
    usermod -u "$NEW_UID" -o "$opt_u"
    if [ -n "$opt_r" ]; then
      find / -xdev -user "$OLD_UID" -exec chown -h "$opt_u" {} \;
    fi
  fi
fi

That gets the uid of the user inside the container and the uid of the mounted file, and if they do not match, calls usermod to adjust the uid. Lastly it does a recursive find to fix any files still owned by the old uid. I like this better than running a container with a -u $(id -u):$(id -g) flag because the above entrypoint code doesn't require each developer to run a script to start the container, and any files outside of the volume that are owned by the user will have their permissions corrected.
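
For illustration, a minimal entrypoint sketch in that style (the app user name, the path, and the exact fix-perms flags are assumptions here, not the verbatim base-image script):

#!/bin/sh
# run as root: align the app user's uid with the mounted path, then drop privileges
set -e
if [ "$(id -u)" = "0" ]; then
  fix-perms -r -u app /home/app
  exec gosu app "$@"
fi
# already running as a non-root user (e.g. started with --user): just run the command
exec "$@"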

You can also have docker initialize a host directory from an image by using a named volume that performs a bind mount. This directory must exist in advance, and you need to provide an absolute path to the host directory, unlike host volumes in a compose file which can be relative paths. The directory must also be empty for docker to initialize it. Three different options for defining a named volume that performs a bind mount look like:

  # create the volume in advance
  $ docker volume create --driver local \
      --opt type=none \
      --opt device=/home/user/test \
      --opt o=bind \
      test_vol

  # create on the fly with --mount
  $ docker run -it --rm \
    --mount type=volume,dst=/container/path,volume-driver=local,volume-opt=type=none,volume-opt=o=bind,volume-opt=device=/home/user/test \
    foo

  # inside a docker-compose file
  ...
  volumes:
    bind-test:
      driver: local
      driver_opts:
        type: none
        o: bind
        device: /home/user/test
  ...

Lastly, if you try using user namespaces, you'll find that host volumes have permission issues because uid/gid's of the containers are shifted. In that scenario, it's probably easiest to avoid host volumes and only use named volumes.
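
(For context, user namespaces are enabled daemon-wide by setting userns-remap in /etc/docker/daemon.json; the shifted uid/gid ranges come from /etc/subuid and /etc/subgid, which is exactly why plain host volumes then run into permission errors:)

{
  "userns-remap": "default"
}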

@0-sv
Contributor

0-sv commented Apr 29, 2021

This effectively makes it use the base image user, which is root. Not wanted.

I agree, it isn't a permanent solution.

@Morriz
Contributor Author

Morriz commented May 6, 2021

Can you test and document that for the outside world then?

@0-sv
Contributor

0-sv commented May 7, 2021

There are actually two options he suggests:

  • Changing ownership on the fly when mounting host volumes into the container
  • Using named volumes

Which do we prefer? Personally, I think named volumes would be more flexible for our stack.

@Morriz
Contributor Author

Morriz commented May 9, 2021

You choose ;) You don't have to involve me in all of these decisions. Just explain your choices if that is informative. If I need to be involved I hope you can explain why and what is expected of me.

@0-sv
Contributor

0-sv commented May 27, 2021

This is a proposal for named volumes:

function named_volume() {
  # use a default volume name unless NAMED_VOLUME is set
  volume_name="${NAMED_VOLUME:-otomi-volume}"
  helper_container_name="named-volume-tmp-helper-container"

  source_files=$1
  destination_path=$2

  # remove a leftover helper container from a previous run, if any
  docker container inspect $helper_container_name >/dev/null 2>&1 && docker rm $helper_container_name
  # create a throwaway container with the named volume mounted, copy the source
  # files into it (which also sets ownership/permissions in the volume), then remove it
  docker container create --name $helper_container_name -v $volume_name:$destination_path hello-world >/dev/null 2>&1 &&
    docker cp $source_files $helper_container_name:$destination_path >/dev/null 2>&1 &&
    docker rm $helper_container_name >/dev/null 2>&1

  retval="$volume_name:$destination_path"
}

I tested it, and it sets the correct permissions on the named volume. It would enable us to mount everything separately (or everything with xargs), like:

named_volume /tmp /tmp

E.g.

docker run $docker_terminal_params --network host --rm \
      -v $(named_volume /tmp /tmp && echo $retval) \
...

It's just an example and not conclusive yet, but I'm curious about your thoughts.

@0-sv
Contributor

0-sv commented May 27, 2021

Follow-up from sprint planning: implement the aforementioned proposal IF not on "Darwin" (or something similar).
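
Roughly, a sketch of that check:

# only apply the named-volume workaround when not on macOS
if [ "$(uname -s)" != "Darwin" ]; then
  named_volume /tmp /tmp
fi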

@Morriz Morriz added Task Scrum task and removed Spike Time boxed research labels May 27, 2021
@Morriz Morriz changed the title update the docs to say you can only run otomi cli in dev mode on OSX update otomi with workaround for docker run in linux May 27, 2021
@0-sv 0-sv closed this as completed Jul 1, 2021
@0-sv 0-sv reopened this Jul 2, 2021
@0-sv
Contributor

0-sv commented Jul 2, 2021

Sorry guys, my hack didn't work as expected.

It seems rootless containers on a Linux host are still an open issue; see moby/moby#2259. Please note the distinction between NEW volumes and run-time/bind volumes.

And to me this is the expected behaviour. Why would the app user be able to edit my files? I found a workaround by mounting /etc/passwd, adding an app user on my host, and adding it to my (the developer's) group. This solves some problems, but the app user is still not allowed to create new files in, for example, the /tmp directory.

This is by design, right? If the container is compromised, it should not be possible to rwx files on the host. So the only scenario in which a run-time container can rwx files on the host is if it's run as root. So that's what you can use imo: --user root:root. I don't know if we want to add this to our code though.

To recap what happens on MacOS: OSXFS changes ownership on the fly. I've tried this approach as well, but I guess this only makes sense in the context of MacOS, where Docker-for-Mac runs a VM just for Docker, so there is no opening for compromise. If you change ownership on-the-fly in Linux you can access files in the container, but then they become unavailable on the host.
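
(For reference, the /etc/passwd workaround mentioned above looked roughly like this; the uid 1000 and the exact commands are assumptions, not the verified steps:)

# on the host: create an "app" user matching the container's uid (assumed 1000)
# and add it to the developer's primary group
sudo useradd -u 1000 app
sudo usermod -aG "$(id -gn)" app
# then mount the host's passwd file read-only into the container so the uid resolves
docker run -v /etc/passwd:/etc/passwd:ro ...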

@0-sv
Contributor

0-sv commented Jul 20, 2021

tl;dr docker run (...) --user root:root (...)

Makes sense to me if the user running it is on their own host and knows what they are doing?

@Morriz
Contributor Author

Morriz commented Jul 20, 2021

This is by design, right? If the container is compromised, it should not be possible to rwx files on the host. So the only scenario in which a run-time container can rwx files on the host is if it's run as root. So that's what you can use imo: --user root:root. I don't know if we want to add this to our code though.

Seems like a valid hack to me for non-OSX systems. Please go ahead.
