Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Patch plumbing of docker-compose UID/GID build args #4527

Merged
merged 32 commits into from
Jun 30, 2022

Conversation

stoooops
Copy link
Contributor

@stoooops stoooops commented Jun 24, 2022

Issue

The UID/GID docker build args are only partially plumbed. For docker-compose build flow, they were hard-coded to 1000/1000 in multiple places.

Expected Behavior

I can use a dedicated user with UID != 1000 and/or GID != 1000 without permissions errors in any service defined in docker-compose.yml

Symptoms

At least 3 docker services fail to come up if you use a user with UID != 1000 or GID != 1000. The services that fail are rpcdaemon, prometheus, and grafana (file permissions).

Fix

  • Add services > erigon > build > args in the docker-compose.yml, so the args are passed in to the docker process from docker-compose at build time
  • Add .env.example with ${DOCKER_UID} and ${DOCKER_GID} specified
  • Replace hard-coded 1000/1000 with ${DOCKER_UID} and ${DOCKER_GID} for prometheus, grafana services

These fixes allow me to use a user on my host OS with UID != 1000 and/or GID != 1000 without the docker-compose bringup failing.

Nice-to-have

Improved debugging for future users:

  • Detailed README including description around permissions pitfalls
  • More clearly renamed envvars PUID/PGID to DOCKER_UID/DOCKER_GID (on host OS) and UID/GID (in Dockerfile)
  • Echo user, uid, gid in docker-compose Makefile target
  • Add a fail-fast Makefile step if the host OS does not have an "erigon*" user with the correct UID/GID

Author

Patched written by cory.eth (Cory Gabrielsen) (@cory_eth).

@AskAlexSharov
Copy link
Collaborator

Great description

docker-compose.yml Outdated Show resolved Hide resolved
@AskAlexSharov
Copy link
Collaborator

AskAlexSharov commented Jun 25, 2022

shell usually has UID and GID vars. means we can do:
UID=${UID} GID=${GID} docker compose up or UID=$(id -u) GID=$(id -g) docker-compose up
and UID=${UID} GID=${GID} docker compose build erigon
and user: "${UID:-1000}:${GID:-1000}"

let's add user: "${UID:-1000}:${GID:-1000}" to x-erigon-service

  1. maybe also rename PUID to UID? Even that it's not backward compatible

How do you think?

@stoooops
Copy link
Contributor Author

stoooops commented Jun 25, 2022

Thanks for the quick review. I will try your suggestions tomorrow. Hiking today. ⛰️

@AskAlexSharov
Copy link
Collaborator

Looks great.
@stoooops , DOCKER_UID also can rename to UID?

@stoooops
Copy link
Contributor Author

Ok, so here's what I ended up doing since first commit:

  • Added a small .env
  • Default uid/gid to 3473 in examples (random number) since 1000/1000 will almost always be reserved by non-erigon user on host OS
  • Rename the PUID/GUID -> UID, GID
  • In the README, I went with ERIGON_* environment variable names.
  • More robust fail-fast build checks
  • Adjust Makefile logic to not print error if Go not installed on host OS (relying on docker)

I went heavy on comments throughout everywhere because this stuff is so easy to get wrong and the file permissions are usually pretty opaque error messages if you're not experienced. Have dealt with this exact issue multiple times across several professional/personal projects now that I favor docker.

@stoooops
Copy link
Contributor Author

stoooops commented Jun 28, 2022

Looks great. @stoooops , DOCKER_UID also can rename to UID?

I am actually not sure this is the correct decision. What if the caller is not the erigon user and $UID resolves to the wrong/unexpected/surprising value? What if it's a power user running some docker-compose manually? I think it makes it easier to mess up and not be sure why. I prefer the explicitness.

There can be three user/hosts pairs involved:

  • "dev" user on host OS
  • "erigon" user on host OS
  • "erigon" user in docker

both the erigon users must be in sync. But the dev user (or any other user on the host OS) should be able to invoke make/docker-compose easily without needing to think critically about what UID environment variable will resolve to. Having a separate DOCKER_UID allows this "dev" user to manage the concepts independently.

I do prefer it this way. I think I would need a bit more convincing to change it based on above reasoning.

@stoooops
Copy link
Contributor Author

Can you trigger the CI again when you get a chance? I think I patched it. Thank you.

Copy link
Collaborator

@AskAlexSharov AskAlexSharov left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

  1. make erigon doesn't work on MacOS. It doesn't store user in /etc/passwd https://superuser.com/questions/191330/users-dont-appear-in-etc-passwd-on-mac-os-x
  2. CI also red by some reason: https://github.com/ledgerwatch/erigon/runs/7104458854?check_suite_focus=true#step:3:13
  3. Maybe build on hub.docker.com will also fail, maybe need define DOCKER_UID in ./hooks/build

@stoooops
Copy link
Contributor Author

Ok thanks, I will fix MacOS. I know we can add MacOS as a github runner too.

Will try to get it working via my own repo so I don't need to keep triggering failed runs here.

@stoooops
Copy link
Contributor Author

Ok, I have added fixes and then updated the README again to reflect the changes:

  • patched the Makefile validation to support Mac OS.
  • include DOCKER_UID, DOCKER_GID in hooks/build
  • updated README
  • created Makefile targets for creating "erigon" user on either linux/macos

I tested the CI on my own fork repo, so it should work if we try running it.

@AskAlexSharov
Copy link
Collaborator

looks great. I will run test on hub.docker.com then merge

@stoooops
Copy link
Contributor Author

stoooops commented Jun 29, 2022

I see that it failed. If you can share the logs with me, I can try to patch it. (there was a login prompt so I'm assuming I don't have permissions)

I have also pushed one more change needed to make some README changes I made play nice together even when the env vars aren't setup yet. This doesn't affect the existing CI, just the new targets I made for helping create the "erigon" user.

@AskAlexSharov
Copy link
Collaborator

from hub.docker.com

Executing build hook...
Docker build args:
DOCKER_UID: 0
DOCKER_GID: 0
Ensuring host OS user exists with specified UID/GID...
✔️ host OS user exists: root
Updating git submodules
...
#17 [stage-1 3/7] COPY --from=builder /app/build/bin/* /usr/local/bin/
#17 sha256:7fd450444c8760819b499bec1a58998f494a8ec1bf2dfd2c29776bac0f075b68
#17 DONE 4.2s
#18 [stage-1 4/7] RUN adduser -H -u 0 -g 0 -D erigon
#18 sha256:30125699a05a0d9ae56d5500f7525ece800845a9f91b874ac644c3f488a1d774
#18 0.378 adduser: uid '0' in use
#18 ERROR: executor failed running [/bin/sh -c adduser -H -u ${UID} -g ${GID} -D erigon]: exit code: 1
------
> [stage-1 4/7] RUN adduser -H -u 0 -g 0 -D erigon:
------
executor failed running [/bin/sh -c adduser -H -u ${UID} -g ${GID} -D erigon]: exit code: 1
Makefile:55: recipe for target 'docker' failed
make: *** [docker] Error 1
build hook failed! (2)

@stoooops
Copy link
Contributor Author

stoooops commented Jun 29, 2022

Ok thank you, I need to handle the root user case. I will work on a fix.

@stoooops
Copy link
Contributor Author

Updated:

  • Set DOCKER_UID=3473 in hooks/build
  • Set DOCKER_GID=3473 in hooks/build
  • Add CI test that make docker succeeds when ran as root (after non-root builds succeeds and is cached)

Notably, on my local workspace, when calling sudo DOCKER_UID=$(id -u) DOCKER_GID=$(id -g) make docker, the existing git-submodules target would fail as git doesn't let root user stomp over a non-root users git repository. This is fixed by allowing the git submodule sync to fail (albeit noisely).

-    @git submodule sync --quiet --recursive
+    @git submodule sync --quiet --recursive || true

This case was not hit on Docker Hub since I believe the repo gets checked out with root file permissions.

This case was not hit in CI because the docker builds were ran as non-root user.

With these changes, root user can build the image in an existing, checked out repo, as long as they pass the DOCKER_UID and DOCKER_GID correctly. It will not fail fast from a git file permission check.

@stoooops
Copy link
Contributor Author

stoooops commented Jun 30, 2022

I think the CI that failed is nondeterministic / inconsistent and unrelated to these changes. If you re-run the one job, I'd expect it to work. I notice that a couple times on my own fork.

Thanks for your help.

@AskAlexSharov AskAlexSharov merged commit 74cf984 into erigontech:devel Jun 30, 2022
@0xakihiko
Copy link

0xakihiko commented Jul 7, 2022

Ok thank you, I need to handle the root user case. I will work on a fix.

Also getting this on my server while on root.

 => ERROR [stage-1 4/5] RUN adduser -D -u 0 -g 0 erigon                                                                                                                   0.2s
------
 > [stage-1 4/5] RUN adduser -D -u 0 -g 0 erigon:
#16 0.206 adduser: uid '0' in use
------
executor failed running [/bin/sh -c adduser -D -u $UID -g $GID erigon]: exit code: 1
make: *** [Makefile:55: docker] Error 1```

@stoooops
Copy link
Contributor Author

stoooops commented Jul 7, 2022

@0xakihiko Looks like you're running your build as root. Specify DOCKER_UID and DOCKER_GID to be non-zero so that the adduser command succeeds in the dockerfile.

You can see an example of this in the .github/workflows/ci.yaml where it tests that it can build with sudo.

That said, we could patch the Dockerfile "RUN adduser ..." layer so it gracefully ignore creating the user if both are passed in as 0, but usually it's better practice to use a dedicated user, especially as it will try to volume-mount a directory from your host OS and the written files are permissioned by the docker user uid/gid.

@0xakihiko
Copy link

May

@0xakihiko Looks like you're running your build as root. Specify DOCKER_UID and DOCKER_GID to be non-zero so that the adduser command succeeds in the dockerfile.

You can see an example of this in the .github/workflows/ci.yaml where it tests that it can build with sudo.

That said, we could patch the Dockerfile "RUN adduser ..." layer so it gracefully ignore creating the user if both are passed in as 0, but usually it's better practice to use a dedicated user, especially as it will try to volume-mount a directory from your host OS and the written files are permissioned by the docker user uid/gid.

Yeah this makes sense.
Maybe another alternative is to print a message if 0 is passed through to guide the user in another direction that they may have missed a step if it wasn't intentional?

@easeev
Copy link

easeev commented Jul 13, 2022

@stoooops I suggest bringing back default values for UID and GID as now Docker Hub builds produces images with value 3473 which breaks many things since before whey were set to 1000

https://hub.docker.com/layers/erigon/thorax/erigon/v2022.07.02/images/sha256-34ae0723539b09f6df51a7c0349acdc8744ed307ee36631be910339345019e15?context=explore

@AskAlexSharov
Copy link
Collaborator

I think it make sense to return 1000 as default for backward compatibility. done by: #4702

@JanRuettinger
Copy link

I think there is something missing in the readme.

My setup:
user 1: dev
user 2: erigon (created with password disabled using make user_linux)

When I run:
make user_linux
ERIGON_USER=erigon
sudo -u ${ERIGON_USER} DOCKER_UID=$(id -u ${ERIGON_USER}) DOCKER_GID=$(id -g ${ERIGON_USER}) XDG_DATA_HOME=~${ERIGON_USER}/.ethereum DOCKER_BUILDKIT=1 COMPOSE_DOCKER_CLI_BUILD=1 make docker-compose

I get the following error/issue. I am supposed to type in a password for erigon. But the password is disabled.

Docker build args:
    DOCKER_UID: 3473
    DOCKER_GID: 3473

Ensuring host OS user exists with specified UID/GID...
✔️ host OS user exists: erigon
mkdir -p ~erigon/.ethereum/erigon ~erigon/.ethereum/erigon-grafana ~erigon/.ethereum/erigon-prometheus
ls -aln ~erigon/.ethereum | grep -E "472.*0.*erigon-grafana" || sudo chown -R 472:0 ~erigon/.ethereum/erigon-grafana
[sudo] password for erigon:

Any ideas?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

5 participants