Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Dockerfile: run as user: linuxbrew #11815

Merged
merged 1 commit into from Aug 10, 2021
Merged

Dockerfile: run as user: linuxbrew #11815

merged 1 commit into from Aug 10, 2021

Conversation

sudoforge
Copy link

@sudoforge sudoforge commented Aug 3, 2021

Dockerfile: run as user: linuxbrew

This commit refactors the Dockerfile in order to resolve build errors
caused by attempting to execute `brew` commands as the root user. We
need to create the `/home/linuxbrew/.linuxbrew` folder prior to copying
the local directory into `/home/linuxbrew/.linuxbrew/Homebrew` (and
ensure the appropriate user owns it), as failing to do so will create
`/home/linuxbrew/.linuxbrew` with root user and group ownership, causing
the subsequent `mkdir` command called in the second `RUN` instruction to
fail.

closes #11802
  • Have you followed the guidelines in our Contributing document?
  • Have you checked to ensure there aren't other open Pull Requests for the same change?
  • Have you added an explanation of what your changes do and why you'd like us to include them?
  • Have you written new tests for your changes? Here's an example.
  • Have you successfully run brew style with your changes locally?
  • Have you successfully run brew typecheck with your changes locally?
  • Have you successfully run brew tests with your changes locally?

There are additional improvements that the container image could still benefit from, namely:

  • Avoiding cloning homebrew/homebrew-core via git. This causes the build to hang for some time (on the order of several minutes; I haven't timed it precisely). We could instead bring these source files in via an ADD instruction, pulling an archive from GitHub directly.
  • Reducing the image size. The current image exceeds 1GiB in size. This is... huge. Drawbacks to an image as large as this include slower network transport (of the image, to/from registries), higher disk consumption, and higher CPU and memory requirements to run. This can be done by utilizing a smaller base image, and/or multi-stage builds, as well as performing optimization of the final image contents, ensuring that any unecessary artifacts are removed.
  • Improve the user experience. Currently, users are required to type a command similar to docker run --rm -v "$(pwd):/home/linuxbrew/.linuxbrew" homebrew/brew brew <command>. This is a bit verbose, and also isn't ideal, as paths in PWD that are mounted into the container can overwrite preexisting paths within the container, and as with this repository (when running brew {style,typecheck,tests}), can potentially lead to lengthy runtimes due to re-fetching of the dependencies.

I will plan to submit PRs for the above recommendations over the next few weeks. I'd like to get it done sooner, but time constraints and my unfamiliarity with the internal architecture of brew will likely lead to me not being able to iterate on this right away.

@sudoforge
Copy link
Author

cc @iMichka @MikeMcQuaid

@sudoforge
Copy link
Author

First rebase: updating the commit message for clarity.
Second rebase: rebasing on top of 1a55b749c953c530087f31254e1ff9aa6fd4330a.

@sudoforge
Copy link
Author

Improve the user experience. Currently, users are required to type a command similar to docker run --rm -v "$(pwd):/home/linuxbrew/.linuxbrew" homebrew/brew brew . This is a bit verbose, and also isn't ideal, as paths in PWD that are mounted into the container can overwrite preexisting paths within the container, and as with this repository (when running brew {style,typecheck,tests}), can potentially lead to lengthy runtimes due to re-fetching of the dependencies.

I neglected to mention that yes, it is possible for the user to manually override this by mapping the files into a different path, and updating the working directory when running the image, however, this is cumbersome, and we can (and should) do better.

@MikeMcQuaid MikeMcQuaid requested a review from a team August 3, 2021 18:25
@MikeMcQuaid
Copy link
Member

  • Avoiding cloning homebrew/homebrew-core via git. This causes the build to hang for some time (on the order of several minutes; I haven't timed it precisely). We could instead bring these source files in via an ADD instruction, pulling an archive from GitHub directly.

Would like to leave this as-is. We need this in our CI jobs and various bits of Homebrew functionality assume it is cloned (for now, at least).

  • Reducing the image size. The current image exceeds 1GiB in size. This is... huge. Drawbacks to an image as large as this include slower network transport (of the image, to/from registries), higher disk consumption, and higher CPU and memory requirements to run. This can be done by utilizing a smaller base image, and/or multi-stage builds, as well as performing optimization of the final image contents, ensuring that any unecessary artifacts are removed.

Optimisation at the end seems best to us. We'd rather avoid a smaller base image or omitting anything from git. We could do a more aggressive git gc perhaps, though.

  • Improve the user experience. Currently, users are required to type a command similar to docker run --rm -v "$(pwd):/home/linuxbrew/.linuxbrew" homebrew/brew brew <command>. This is a bit verbose, and also isn't ideal, as paths in PWD that are mounted into the container can overwrite preexisting paths within the container, and as with this repository (when running brew {style,typecheck,tests}), can potentially lead to lengthy runtimes due to re-fetching of the dependencies.

This makes sense to me but interested what @iMichka and @Homebrew/linux think.

Copy link
Member

@MikeMcQuaid MikeMcQuaid left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

😍 great work so far @sudoforge. Would love @Homebrew/linux @iMichka to take a look too.

Dockerfile Outdated Show resolved Hide resolved
Dockerfile Outdated Show resolved Hide resolved
@sudoforge
Copy link
Author

Would like to leave this as-is. We need this in our CI jobs and various bits of Homebrew functionality assume it is cloned (for now, at least).

To be clear -- Homebrew needs the git history of the homebrew-core project? What I'm suggesting would grab the source code just fine, however, GitHub doesn't include the .git directory in archive downloads by design.

Optimisation at the end seems best to us. We'd rather avoid a smaller base image or omitting anything from git. We could do a more aggressive git gc perhaps, though.

I'm not sure I'm following you. Why would Homebrew's maintainers want to avoid a smaller base image? As an example, there is nothing in the dependency list (as far as I can tell) that is not available on Alpine Linux.

@sjackman
Copy link
Member

sjackman commented Aug 3, 2021

@sudoforge Hi, Ben. Thanks for this work. I'm camping this week, so I'll do my best with a quick reply now, but I'll be able to reply in more detail next week.

Please submit one PR for each isolated change that you would like to make. It will help to focus the discussion, and permit closing PRs with changes that we don't wish to make, while keeping merging PRs with changes that we do wish to make, or discussing changes in PRs that need changes.

To resolve an error encountered when running brew, a USER
instruction was added to run subsequent processes as the linuxbrew
user.

That's how it used to be, and it was intentionally switch from user linuxbrew by default to user root by @MikeMcQuaid in PR #5733. Please revert this change, or better, close this PR and open new PRs that isolate each change in their own PR.

@sjackman
Copy link
Member

sjackman commented Aug 3, 2021

Optimisation at the end seems best to us. We'd rather avoid a smaller base image or omitting anything from git. We could do a more aggressive git gc perhaps, though.

I'm not sure I'm following you. Why would Homebrew's maintainers want to avoid a smaller base image? As an example, there is nothing in the dependency list (as far as I can tell) that is not available on Alpine Linux.

There's no such thing as a free lunch. 😁 A smaller base image has less functionality. The ubuntu:20.04 base image is 73 MB, or something like 6% of the 1.2 GB Homebrew/brew image. It's not worth losing that functionality for reducing the size of the image by something less than 6%.

@sjackman
Copy link
Member

sjackman commented Aug 3, 2021

Would like to leave this as-is. We need this in our CI jobs and various bits of Homebrew functionality assume it is cloned (for now, at least).

To be clear -- Homebrew needs the git history of the homebrew-core project? What I'm suggesting would grab the source code just fine, however, GitHub doesn't include the .git directory in archive downloads by design.

Correct. The brew update command in particular needs the history of the project, due to performance issues on the GitHub server side with sparse checkouts, which we used previously.

A separate Docker image could be maintained by a motivated third party that has the formulae, but not the Git history, has HOMEBREW_NO_AUTO_UPDATE=1 set by default, and does not support brew update (at least not without downloading the Git history first).

We have discussed possible improvements to Homebrew that do not require "tapping" (cloning) the Git history to be able to brew install. The primary motivation is improving the install time of Homebrew, but it would also improve the size of the Docker image substantially. This issue is on our radar.

@sjackman
Copy link
Member

sjackman commented Aug 3, 2021

Various command lines were moved from one RUN invocation to the other
in order to improve image caching behavior (although note that this
gain is negligible in terms of performance).

Please separate functional changes from performance improvements.

This commit refactors the Dockerfile in order to resolve build errors
caused by attempting to execute `brew` commands as the root user. We
need to create the `/home/linuxbrew/.linuxbrew` folder prior to copying
the local directory into `/home/linuxbrew/.linuxbrew/Homebrew` (and
ensure the appropriate user owns it), as failing to do so will create
`/home/linuxbrew/.linuxbrew` with root user and group ownership, causing
the subsequent `mkdir` command called in the second `RUN` instruction to
fail.

closes #11802
@sudoforge
Copy link
Author

Please separate functional changes from performance improvements.

This PR has been updated to only include the absolute necessary work to get the image working again. The description and commit message have been updated as well.

@sudoforge sudoforge changed the title Dockerfile: fix the container image build Dockerfile: run as user: linuxbrew Aug 4, 2021
Dockerfile Show resolved Hide resolved
@MikeMcQuaid
Copy link
Member

To be clear -- Homebrew needs the git history of the homebrew-core project?

Yes.

There's no such thing as a free lunch. 😁 A smaller base image has less functionality. The ubuntu:20.04 base image is 73 MB, or something like 6% of the 1.2 GB Homebrew/brew image. It's not worth losing that functionality for reducing the size of the image by something less than 6%.

👍🏻. Also: we use this image to build our binary packages. Perhaps we could consider eventually having a "Homebrew maintainers/CI image" and a "end-users consumption image" (the latter being as small as possible).

We have discussed possible improvements to Homebrew that do not require "tapping" (cloning) the Git history to be able to brew install. The primary motivation is improving the install time of Homebrew, but it would also improve the size of the Docker image substantially. This issue is on our radar.

Even in this case: we'll likely still need that data in this Docker image for our CI usage.

@dawidd6 dawidd6 merged commit 4231efe into Homebrew:master Aug 10, 2021
@sudoforge sudoforge deleted the 11802/docker-run-as-linuxbrew branch August 10, 2021 19:36
@sjackman
Copy link
Member

Today I learned about HOMEBREW_JSON_CORE from @Rylan12, which makes it unnecessary to tap (clone) Homebrew/core!
https://github.com/Homebrew/brew/search?q=HOMEBREW_JSON_CORE

@danielnachun
Copy link
Member

danielnachun commented Aug 11, 2021

Today I learned about HOMEBREW_JSON_CORE from @Rylan12, which makes it unnecessary to tap (clone) Homebrew/core!
https://github.com/Homebrew/brew/search?q=HOMEBREW_JSON_CORE

I think using this for the initial "bootstrap" for user installs would also make things easier to setup because we wouldn't need to have a new enough git. The git and curl on CentOS 7 is now too old to install Homebrew, and I have to use Anaconda to install newer versions of git and curl first before Homebrew will even install. We can probably use the Debian Wheezy version checks instead during the initial install, but we could avoid having to deal with git by using HOMEBREW_JSON_CORE during the initial setup to install a newer git.

@Rylan12
Copy link
Member

Rylan12 commented Aug 11, 2021

Just be warned: this is intentionally undocumented and I'm pretty sure I'm the only one who's done any sort of testing on it so please don't use it in any user-facing ways at the moment until we've had maintainers test it and have documented it officially in EnvConfig.

You'd still need some version of git to clone the Homebrew/brew repo and you'd still need some version of curl to download the necessary JSON files and bottles, but it may be that only an older version is needed for these

@MikeMcQuaid
Copy link
Member

Yes, please don't encourage any non-maintainer use yet, thanks.

@github-actions github-actions bot added the outdated PR was locked due to age label Oct 5, 2021
@github-actions github-actions bot locked as resolved and limited conversation to collaborators Oct 5, 2021
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
outdated PR was locked due to age
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

6 participants