
Docker socket is not accessible from inside the Docker container #2056

Open

gitfool opened this issue Jan 15, 2019 · 21 comments

gitfool commented Jan 15, 2019

Agent Version and Platform

Version of your agent? 2.144.0

OS of the machine running the agent? Linux; Ubuntu 16.04 with Docker

Azure DevOps Type and Version

Azure Pipelines

What's not working?

The container job in BoardGameGeek.Dungeon/azure-pipelines.yml, which uses cake-build/cake via a Docker container (cake-docker/Dockerfile) and Cake.Dungeon scripts to build and push a Docker image to Docker Hub, is not working as expected because the Docker socket is not accessible from inside the Docker container.

See BoardGameGeek.Dungeon/#20190115.6/Docker job for all the build step environment variables, which have been logged as seen from inside the container. Note the error at the bottom:

Got permission denied while trying to connect to the Docker daemon socket at unix:///var/run/docker.sock

Is there a way to mount or otherwise make the Docker socket accessible to Azure Pipelines container jobs?
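
For context, the generic Docker way to expose the host daemon to a container is to bind-mount the socket when the container is created; something like the following, where the image name is only a placeholder for an image that contains the docker CLI:

docker run -v /var/run/docker.sock:/var/run/docker.sock my-builder-image docker info   # talks to the host daemon via the mounted socket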


gitfool commented Jan 15, 2019

@TingluoHuang that's great; so I just need to work out why I'm getting the permission issue.


gitfool commented Jan 15, 2019

@TingluoHuang I've tried everything I can think of but I haven't managed to fix the permission issue:

I think I'm on the right track but I can't seem to solve it from my end. What am I missing?


TingluoHuang commented Jan 15, 2019

@gitfool have you tried sudo groupadd/usermod?


TingluoHuang commented Jan 15, 2019

@gitfool
the following YAML seems to work fine for me:

resources:
  containers:
  - container: ubuntu
    #image: ubuntu:16.04
    image: microsoft/vsts-agent:ubuntu-16.04-docker-18.06.1-ce-standard  # this image has docker installed
phases:
- phase: test1
  queue:
    name: 'Hosted Ubuntu 1604'
    container: ubuntu
  steps:
  - script: printenv|sort
  - script: |
      sudo docker run hello-world

gitfool commented Jan 16, 2019

@TingluoHuang There's a fundamental difference though; the Cake build does not use sudo to run docker, so I need to run docker as a non-root user.

The Azure Pipelines agent creates a user in the container but does not create a docker group or use the group-add option of docker create. Others have grappled with this issue as the docker group id is not fixed or well-known and depends on the host operating system.

Eventually I found the best way to work around this issue with Azure Pipelines was to do the following:

  • Install sudo into my builder container
    • not installed by default in the base Ubuntu Bionic container and su -c requires a terminal
  • Add a separate build step to BoardGameGeek.Dungeon/azure-pipelines.yml to:
    • Use stat to read the group id from the mounted /var/run/docker.sock
    • Use sudo groupadd to create the docker group with the read group id
    • Use sudo usermod to add the executing user to this group

The subsequent Cake build step then successfully runs docker as a non-root user! 😅
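
A minimal sketch of that extra build step, assuming the socket is mounted at /var/run/docker.sock inside the container and sudo is already installed (the actual step in azure-pipelines.yml may differ in detail):

DOCKER_GID=$(stat -c %g /var/run/docker.sock)        # group id that owns the mounted socket
sudo groupadd --gid "$DOCKER_GID" docker             # recreate the docker group with that id inside the container
sudo usermod --append --groups docker "$(whoami)"    # add the executing (non-root) user to it

The new group membership only applies to processes started afterwards, which is why this has to run as a separate step before the Cake build.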


gitfool commented Jan 16, 2019

@TingluoHuang the above workaround is not pretty. I might be missing something but wouldn’t this “work out of the box” if the Azure Pipelines agent was changed to use docker create --group-add docker ...?


TingluoHuang commented Jan 16, 2019

@gitfool when you define the container in your yaml file, you can provide additional options for docker create.

- container: ubuntu
  image: ubuntu:16.04
  options: '--group-add docker'

gitfool commented Jan 16, 2019

Okay, that’s great news and good to know! I’ll try that soon. However, it still feels like this should be specified by default by the agent, to fall into the pit of success.


TingluoHuang commented Jan 16, 2019

@gitfool we try not to modify the original container the customer provided when we run docker create, plus --group-add docker is not a requirement for customers who don't need to run docker commands inside the container.


gitfool commented Jan 16, 2019

In that case it should be documented how to set up a "docker in docker" scenario for a non-root user.

It's not quite that easy ... yet; I get the error unable to find group docker after adding the options. So the agent VM image should be modified to create a group named docker, and should probably add the vsts user to this group to set the stage. Actually, if it adds the vsts user to the group (on the host), it may not be necessary to use the container options, as the uid is the same as the one set up for the vsts_azpcontainer user inside the container.
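
The host-side change being suggested, sketched as shell commands on the agent VM (vsts is the agent account; whether the group already exists there is exactly the open question):

getent group docker || sudo groupadd docker    # make sure a docker group exists on the host
sudo usermod --append --groups docker vsts     # let the agent account talk to the daemon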


vtbassmatt commented Jan 16, 2019

Hey @gitfool thanks for engaging with us. I'm the product manager for the agent (and other platform parts of Azure Pipelines). Running Docker from within Docker isn't something I want to push people towards by default, which explains why it isn't a native capability. Once you have this working successfully (with our help), I'd like to update our docs so that advanced users don't have to reinvent the wheel.

StackOverflow seems to think that the group must exist in the image already, which we don't control. I don't have any objections to adding the host vsts user to docker, though I'm not sure when we can get to that work.


gitfool commented Jan 16, 2019

Hi @vtbassmatt! Okay, I have a question and a suggestion then, since I'm not sure whether the error above is because the docker group is missing from the host agent, the container image, or both; that should be investigated next.

The question: do the Ubuntu hosted agents (vmImage: ubuntu-16.04) which run the build steps for container jobs already have the docker group? If not, we need a test with it added to just the host; add the vsts user to this group while you're there.

If it still fails, most likely because the docker group also needs to exist inside the container, then consider this: you already create the vsts_azpcontainer user inside the container (for docker exec), so it's reasonable and only one more step to also create the docker group inside the container, using the group id from the agent host, similarly to what's done in Agent.Worker/ContainerOperationProvider.cs#L404; and while you're there you should also add the vsts_azpcontainer user to this docker group, so two more steps. 😛

Given the above, I would expect "docker in docker" as a non-root user to "just work". 😃
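
Expressed as the docker exec calls the agent could issue during container setup (purely illustrative; the container id is a placeholder and this is not what ContainerOperationProvider.cs does today):

DOCKER_GID=$(stat -c %g /var/run/docker.sock)                                     # read the group id on the agent host
docker exec <container-id> groupadd --gid "$DOCKER_GID" docker                    # create a matching docker group inside the container
docker exec <container-id> usermod --append --groups docker vsts_azpcontainer    # add the exec user to that group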


vtbassmatt commented Jan 17, 2019

vsts on the host is in the docker group, so it must not be enough.

2019-01-17T13:24:00.5793655Z Script contents:
2019-01-17T13:24:00.5795014Z groups $(whoami)
2019-01-17T13:24:00.5829318Z [command]/bin/bash --noprofile --norc /home/vsts/work/_temp/8c47ee6c-deef-4889-88d6-b3a6545f499d.sh
2019-01-17T13:24:00.6556108Z vsts : docker

I'll talk with the team. I want to make sure we don't break some other scenario, or slow down mainline scenarios, by doing this. (Or we could make it an opt-in, maybe something like a user: keyword that accepts a name:group pair. I prefer automating it, though.)
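
A purely hypothetical sketch of that opt-in, in the container-resource syntax used earlier (the user: keyword does not exist; the keyword and its name:group value are invented here for illustration):

- container: ubuntu
  image: ubuntu:16.04
  user: 'vsts_azpcontainer:docker'   # hypothetical name:group pair the agent would set up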

Also, we take pull requests if you've got the urge to solve this 😉


chrisrpatterson commented Jan 25, 2019

If you have sudo in the container image you could try to run a script that adds the vsts_azpcontainer user to a docker group inside that container and then run your docker commands in a subsequent step. I think with a new process starting you would get your groups re-evaluated.
Never mind, I see you have already done that: https://github.com/gitfool/BoardGameGeek.Dungeon/blob/master/azure-pipelines.yml#L21. It seems it would make sense for us to do this as part of our user initialization.


gitfool commented Jan 25, 2019

@chrisrpatterson yes, that's my current workaround.


artparks commented Mar 7, 2019

@gitfool feels like my last 48 hours have pretty much replicated yours in trying to resolve this issue! This thread was a lifesaver to finally stumble across.

I think this should be documented more thoroughly, as DinD is a pretty valid use case here. Chances are, if you're looking to run your builds inside containers, you're quite likely to be trying to ship (and therefore build) containers, meaning you need access to docker build/push type commands to be able to run these builds.

Glad it works though!


vtbassmatt commented Mar 7, 2019

@bryanmacfarlane this one feels like it's standing in the way of a fully sane Docker experience. When someone frees up from a bigger rock, we can take this one tactically.


artparks commented Mar 7, 2019

Now that we have this working, we're on the brink of a really great CI/CD setup. At the moment we're going through all the hassle of running our own agents in Aurora/Mesos, which is a lot of extra setup and a maintenance overhead for our ops team.

This will streamline things so much, as each project can just maintain a separate Dockerfile (or pull from a separate repo of common generic environments) that defines its build env and we're good to go.


gitfool commented Mar 7, 2019

Once you "dockerize" your build there's no going back; it's so empowering! The only trade-off is that caching sucks, so the builds take much longer than they should.


TingluoHuang commented Mar 7, 2019

I created PR #2142, which might help this scenario.
