Earthfile suggestions #24

Closed
TravisCardwell opened this issue Mar 28, 2022 · 5 comments

Comments

@TravisCardwell
Collaborator

I wrote my own version of the Earthfile, to learn Earthly as well as try implementing some new features. I am creating this issue to see which (if any) of the changes you are interested in incorporating into the project.

Suggestions

Separate External Targets

The motivation is to be able to easily build specific images, without building all images. Users can then test changes against a single GHC version before building all versions, providing faster iteration. There is a target for each GHC version, a readme target for updating README.md, as well as the all target that builds everything.

For example, the following command can be used to build and test the GHC 9.2.2 image.

earthly --allow-privileged +ghc922

Note that these targets do not need to run in separate containers. A top-level FROM command is used, and none of these "external" targets have a FROM command.

Upgrade Alpine Packages

The Alpine package index is updated and installed packages are upgraded (see #23).

Add latest Tags

The ghc-musl images use tags that specify a version number. (I use GHC_MUSL_VERSION as the variable name for clarity.) This is great for making repeatable builds.

I think it would be helpful to also create tags that point to the latest image for each GHC version. The tag names could simply leave out the version (ghc922) or use the string latest (latest-ghc922). Doing this allows downstream projects to use new versions (security updates) without having to bump the version number in their configuration.

Note that this is not implemented in the Earthfile below.
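
One possible way to add such a tag is sketched here. It relies on SAVE IMAGE accepting more than one image name. LATEST_TAG is a hypothetical argument name chosen for illustration, and the test flags are left out for brevity, so treat this as a sketch rather than a proposal for the final file.

```earthfile
# Hypothetical variant of the image target.
# LATEST_TAG is an assumed argument name; it is not part of the posted Earthfile.
image:
  FROM +ghc
  ARG --required TAG
  ARG --required LATEST_TAG
  # The same image is saved under the versioned tag and the unversioned tag.
  SAVE IMAGE --push "$TAG" "$LATEST_TAG"

ghc922:
  BUILD +image --GHC=9.2.2 --TAG=$BASE_TAG-ghc922 --LATEST_TAG=utdemir/ghc-musl:ghc922
```

With this layout the versioned tag stays fixed for repeatable builds, while the unversioned tag moves each time a new image is pushed.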

Use Latest Cabal Version

Each release of GHC specifies the minimum version of Cabal that it works with, but new versions of Cabal can be used with old versions of GHC. Installing the minimum version is useful for testing compatibility, but ghc-musl is for building static executables, not testing. I think that it is preferable to use the latest version of Cabal for all GHC versions. There are multiple benefits:

  • Users who make use of new Cabal features will not have failures due to building with an older Cabal version.
  • The cabal tests relying on cabal list-bin can be performed. I added similar tests for Stack, btw.
  • The Earthfile is somewhat simplified.

Minimize Layers

I try to minimize the number of layers in output images. All consecutive RUN commands are combined.

Note that the test targets are not output, so they may use separate RUN commands for clarity.

Minimize Internal Targets

I combined some internal targets (such as ghcup, ghc-deps, and result) because there is no practical reason for them to be separate. IMHO, it makes the Earthfile a bit easier to read and understand.

Add TEST_CABAL And TEST_STACK Flags

I added TEST_CABAL and TEST_STACK flags so that these tests can be turned off using --build-arg options. For example, the following command can be used to disable Stack tests in CI.

earthly --no-output --build-arg TEST_STACK=0 +all

Note that the ghc target contains everything for an image, while the image target outputs the image only after the tests have passed.

Use Common Alpine Version

Unless there is a good reason to use different Alpine versions for different images, the version can be specified once.

I specify it at the top level. Changing this version invalidates the cache for all targets.

Use ARG To Configure BASE_TAG

The BASE_TAG should be an ARG, as it is not an environment variable.

Users who test local builds (adding dependencies necessary for their project, for example) are able to specify a custom BASE_TAG using a --build-arg option. This ensures that they do not overwrite official images.

Sort Packages

I prefer to sort the packages installed in the base-system target, so that they are easy to maintain. The first installation command, which installs general system software requirements, lists packages sorted by package name. The second installation command, which installs requirements for building static executables with various dependencies, lists packages in groups so that the program, library, -dev, and -static packages are kept together on the same line, and the lines are sorted.

Use GHCUp --set Option

I use the --set option instead of separate set commands, just to reduce the lines of code.

Reduce Whitespace

I wrote the Earthfile using a different style, which has less whitespace. I think that it makes the file easier to read, but I of course do not mind keeping the current style if it is preferred.

Note that the example Earthfiles use two-space indentation.

Specify Earthly Version

The VERSION command will be required in the future. I think it would be a good idea to start specifying it now.

Earthfile

I am pasting my Earthfile instead of linking to a branch or gist because a branch or gist may be deleted in the future. I am not including comments so that it is easier to read, but I think adding some documentation in comments (or a separate file in the repo) could help people who are not familiar with Earthly.

VERSION 0.6

ARG ALPINE_VERSION=3.15.2
FROM alpine:$ALPINE_VERSION

ARG GHC_MUSL_VERSION=24
ARG BASE_TAG=utdemir/ghc-musl:v$GHC_MUSL_VERSION

base-system:
  FROM alpine:$ALPINE_VERSION
  RUN apk update \
   && apk upgrade \
   && apk add \
        autoconf automake bash binutils-gold curl dpkg fakeroot file \
        findutils g++ gcc git make perl shadow tar xz \
   && apk add \
        brotli brotli-static \
        bzip2 bzip2-dev bzip2-static \
        curl libcurl curl-static \
        freetype freetype-dev freetype-static \
        gmp-dev \
        libffi libffi-dev \
        libpng libpng-static \
        ncurses-dev ncurses-static \
        openssl-dev openssl-libs-static \
        pcre pcre-dev \
        pcre2 pcre2-dev \
        sdl sdl-dev sdl-static \
        sdl2 sdl2-dev \
        sdl2_image sdl2_image-dev \
        sdl2_mixer sdl2_mixer-dev \
        sdl2_ttf sdl2_ttf-dev \
        sdl_image sdl_image-dev \
        sdl_mixer sdl_mixer-dev \
        xz xz-dev \
        zlib zlib-dev zlib-static \
   && ln -s /usr/lib/libncursesw.so.6 /usr/lib/libtinfo.so.6

ghc:
  FROM +base-system
  ARG --required GHC
  ENV GHCUP_INSTALL_BASE_PREFIX=/usr/local
  RUN curl --fail --output /bin/ghcup \
        'https://downloads.haskell.org/ghcup/x86_64-linux-ghcup' \
   && chmod 0755 /bin/ghcup \
   && ghcup upgrade --target /bin/ghcup \
   && ghcup install ghc "$GHC" --set \
   && ghcup install cabal --set \
   && /usr/local/.ghcup/bin/cabal update
  ENV PATH="/usr/local/.ghcup/bin:$PATH"

test-cabal:
  FROM +ghc
  COPY example /example
  WORKDIR /example/
  RUN cabal new-build example --enable-executable-static
  RUN file $(cabal list-bin example) | grep 'statically linked'
  RUN echo test | $(cabal list-bin example) | grep 'Hello World!'

test-stack:
  FROM earthly/dind:alpine
  RUN apk add curl file \
   && curl -sSL https://get.haskellstack.org/ | sh
  COPY example /example
  WORKDIR /example/
  WITH DOCKER --load ghc-musl=+ghc
    RUN stack build \
          --ghc-options '-static -optl-static -optl-pthread -fPIC' \
          --docker --docker-image ghc-musl
  END
  RUN file $(find /example/.stack-work/install/ -type f -name example) \
    | grep 'statically linked'
  RUN echo test \
    | $(find /example/.stack-work/install/ -type f -name example) \
    | grep 'Hello World!'

image:
  FROM +ghc
  ARG TEST_CABAL=1
  ARG TEST_STACK=1
  ARG --required TAG
  IF [ "$TEST_CABAL" = "1" ]
    BUILD +test-cabal
  END
  IF [ "$TEST_STACK" = "1" ]
    BUILD +test-stack
  END
  SAVE IMAGE --push "$TAG"

ghc922:
  BUILD +image --GHC=9.2.2 --TAG=$BASE_TAG-ghc922

ghc902:
  BUILD +image --GHC=9.0.2 --TAG=$BASE_TAG-ghc902

ghc8107:
  BUILD +image --GHC=8.10.7 --TAG=$BASE_TAG-ghc8107

ghc884:
  BUILD +image --GHC=8.8.4 --TAG=$BASE_TAG-ghc884

readme:
  RUN apk add bash gettext
  COPY ./update-readme.sh .
  RUN ./update-readme.sh \
        "$BASE_TAG-ghc922" \
        "$BASE_TAG-ghc902" \
        "$BASE_TAG-ghc8107" \
        "$BASE_TAG-ghc884"
  SAVE ARTIFACT README.md

all:
  BUILD +ghc922
  BUILD +ghc902
  BUILD +ghc8107
  BUILD +ghc884
  BUILD +readme
@utdemir
Owner

utdemir commented Apr 3, 2022

> I wrote my own version of the Earthfile, to learn Earthly as well as try implementing some new features. I am creating this issue to see which (if any) of the changes you are interested in incorporating into the project.

Thank you @TravisCardwell ! Almost all of your changes make a lot of sense to me, so I'd be more than happy to incorporate it into the project.

I'm adding my comments on a few sections; that means that I completely agree with the changes I haven't replied to :).

> Add latest Tags
>
> The ghc-musl images use tags that specify a version number. (I use GHC_MUSL_VERSION as the variable name for clarity.) This is great for making repeatable builds.
>
> I think it would be helpful to also create tags that point to the latest image for each GHC version. The tag names could simply leave out the version (ghc922) or use the string latest (latest-ghc922). Doing this allows downstream projects to use new versions (security updates) without having to bump the version number in their configuration.

This is a good point. I agree with an extra "latest" tag. However, that doesn't correspond to them automatically getting security updates, as we only update this repository once every GHC version or so, so the "latest" tag won't be updated as often as we'd like for security purposes.

I feel like there are a few things moving here:

  1. GHC version
  2. Our ghc-musl version, which includes the alpine Linux version we're depending on
  3. Exact time we built the Docker image

Currently, the items 1. and 2. are reflected in our versioning schema, but the number 3. is missing. But I'd argue that number 3. is the most important one in terms of security updates. So, I think ideally we'd also include a timestamp on our tags, have a periodic (maybe weekly) CI process which builds and pushes fresh images with latest packages, and the latest tag would correspond to the latest (ghc-musl-version, build-date) tuple.

But let's discuss this on the issue #23, as it seems like a bigger change.

> Minimize Internal Targets
>
> I combined some internal targets (such as ghcup, ghc-deps, and result) because there is no practical reason for them to be separate. IMHO, it makes the Earthfile a bit easier to read and understand.

I agree, your version of the Earthfile does indeed look cleaner. I think the slight advantage of the previous version was that it'd reuse the ghcup installation for each GHC version, but the new one would download and install it three times. It's not a major issue for me, so I'm happy to leave it to your judgment.

> but I think adding some documentation in comments (or a separate file in the repo) could help people who are not familiar with Earthly.

I completely agree with this. I also read your writeups, so my apologies that use of Earthly was not very well documented here.

I think we ought to add both some code comments on our Earthfile, and at least mention how to use Earthly on our README. But, we do not need to wait for documentation before getting these changes in, as they are already a good improvement to the existing codebase.


So, I'd be happy to get a PR in with the Earthfile you posted here as-is. We can discuss & apply other improvements you mentioned on other issues (especially #23).

As you can see, I usually only have a look at the PRs over the weekend, so in case this becomes a bottleneck for you I added you as a collaborator, so feel free to get these changes straight in.

Going forward, if you intend to keep maintaining this project we should spend some time to move this away from the DockerHub repository under my name so you can push the images, and add a CI process. Let me know if you're interested in tackling those together :).

@utdemir
Owner

utdemir commented Apr 3, 2022

Oh also, do you mind if we put a link in our README to your blog articles on how you got lsupg built statically using this project? I think there's a great deal of learnings there.

@TravisCardwell
Collaborator Author

Thank you very much for the feedback!

> Add latest Tags
>
> However, that doesn't correspond to them automatically getting security updates, as we only update this repository once every GHC version or so, so the "latest" tag won't be updated as often as we'd like for security purposes.

That is a very good point. Companies/people who care about security (updates) generally build their own images and do not use Docker Hub at all. Providing an easy way to build specific images using custom tags helps such people use the project.

Having some sort of latest tag would ease the maintenance burden in projects like lsupg. I would not need to update tags and make releases in order for the main branch to use the latest images with the most recent updates.

> I feel like there are a few things moving here:
>
>   1. GHC version
>   2. Our ghc-musl version, which includes the alpine Linux version we're depending on
>   3. Exact time we built the Docker image
>
> Currently, the items 1. and 2. are reflected in our versioning schema, but the number 3. is missing. But I'd argue that number 3. is the most important one in terms of security updates. So, I think ideally we'd also include a timestamp on our tags, have a periodic (maybe weekly) CI process which builds and pushes fresh images with latest packages, and the latest tag would correspond to the latest (ghc-musl-version, build-date) tuple.

Number 3 is indeed the most important when using alpine:latest or running apk upgrade as part of the build.

Creating automated releases that include the latest updates using a timestamped tag would indeed be ideal, as updated images would not replace previous versions (providing users with reproducibility) while the latest tag could include recent updates (providing users an easy way to have improved security). One possible concern is the increased usage of resources, both for the CI as well as Docker Hub. I guess this is not currently an issue?

Perhaps we could use lsupg in the CI process. It could be used to check for updates and only create a new build when updates are available. If updates are not too frequent, then the CI process could run frequently in order to provide timely updates. For example, a daily check might work well. Note that timestamp %Y%m%d would be sufficient in this case, but a more granular timestamp would be preferred to allow for users who want more frequent checks in internal environments. I wonder how frequently updates are made available for the packages used in these images...
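
As a rough illustration of the timestamped-tag idea, the image target could take the build date as an argument and append it to the tag. BUILD_DATE is a hypothetical argument name, assumed to be supplied by the caller (for example from `date -u +%Y%m%d` in CI). The test flags are left out for brevity.

```earthfile
# Hypothetical variant of the image target with a timestamped tag.
# BUILD_DATE is an assumed argument name, expected in %Y%m%d form.
image:
  FROM +ghc
  ARG --required TAG
  ARG --required BUILD_DATE
  # Each build gets its own tag, so earlier images are never overwritten.
  SAVE IMAGE --push "$TAG-$BUILD_DATE"
```

Passing the date in from outside keeps the value under the caller's control, so an internal environment could use a more granular timestamp without changing the target.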

Currently, the Docker Hub images provide reproducibility without including recent updates. Users need to do builds themselves in order to include the most recent updates. Perhaps it is worth considering the opposite strategy: provide images via Docker Hub that include recent updates but do not provide reproducibility. If only images with the latest build are provided, tagged with just the GHC version such as utdemir/ghc-musl:ghc922, then lsupg could be used to update the images whenever alpine:latest has updates (and all tests pass, of course). In this case, users must build images themselves for reproducibility, and the Earthfile could still include the code for easy timestamping. Please note that I am not necessarily voting for this; I am just mentioning it as another strategy, as this is the common strategy for the official images.

> But let's discuss this on the issue #23, as it seems like a bigger change.

Indeed; such changes are only possible/meaningful if alpine:latest is used or apk upgrade is run.

> Minimize Internal Targets
>
> I think the slight advantage of the previous version was that it'd reuse the ghcup installation for each GHC version, but the new one would download and install it three times. It's not a major issue for me, so I'm happy to leave it to your judgment.

That is a mistake on my part. I overlooked the fact that ghcup and cabal are downloaded and installed multiple times when building multiple images. Thank you very much for pointing it out! My opinion is that it is worth having a separate ghcup target in order to avoid this.
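
A separate ghcup target could look roughly like the following sketch. It only rearranges the commands from the ghc target posted above, so that ghcup and cabal are installed once and each GHC version adds a single step on top. It is an illustration of the idea, not necessarily the exact form that gets committed.

```earthfile
# Shared by every GHC version: ghcup and cabal are installed only once.
ghcup:
  FROM +base-system
  ENV GHCUP_INSTALL_BASE_PREFIX=/usr/local
  RUN curl --fail --output /bin/ghcup \
        'https://downloads.haskell.org/ghcup/x86_64-linux-ghcup' \
   && chmod 0755 /bin/ghcup \
   && ghcup upgrade --target /bin/ghcup \
   && ghcup install cabal --set \
   && /usr/local/.ghcup/bin/cabal update
  ENV PATH="/usr/local/.ghcup/bin:$PATH"

# Per-version target: only the GHC installation differs between images.
ghc:
  FROM +ghcup
  ARG --required GHC
  RUN ghcup install ghc "$GHC" --set
```

The trade-off is one extra layer in each output image (the GHC installation becomes its own RUN) in exchange for reusing the cached ghcup target across all versions.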

> I completely agree with this. I also read your writeups, so my apologies that use of Earthly was not very well documented here.
>
> I think we ought to add both some code comments on our Earthfile, and at least mention how to use Earthly on our README. But, we do not need to wait for documentation before getting these changes in, as they are already a good improvement to the existing codebase.

No problem!

Sounds good! I agree.


> So, I'd be happy to get a PR in with the Earthfile you posted here as-is. We can discuss & apply other improvements you mentioned on other issues (especially #23).

I would like to separate the ghcup target, and I will remove the apk upgrade command since that is part of #23.

> As you can see, I usually only have a look at the PRs over the weekend, so in case this becomes a bottleneck for you I added you as a collaborator, so feel free to get these changes straight in.

No problem at all. I hope that my submissions do not cause any pressure or stress. I am not in a rush.

My availability to work on the project is dependent on my job/projects. I have been able to work on it on weekdays, but this will likely soon change to weekends for me as well.

Thank you for adding me as a collaborator. I will push changes when I feel that it is appropriate, and I will continue to create PRs when you might have a different opinion or the code could use a review.

> Going forward, if you intend to keep maintaining this project we should spend some time to move this away from the DockerHub repository under my name so you can push the images, and add a CI process. Let me know if you're interested in tackling those together :).

Sure, I am interested! 😄

I am fine with keeping it under your name, though moving it to a different name is fine as well, if it helps.

Perhaps we can discuss this after the current batch of changes.

> Oh also, do you mind if we put a link in our README to your blog articles on how you got lsupg built statically using this project? I think there's a great deal of learnings there.

I do not mind, though it might be preferable to instead organize the useful parts into documentation. My blog rambles quite a bit... Keeping it separate from curated articles provides me the freedom to write without worrying too much about quality. This allows me to post regularly, while my articles are few and far between.

TravisCardwell added a commit that referenced this issue Apr 8, 2022
This change implements the changes discussed in issue #24.  Minimal
changes to the documentation are made so that it is consistent with the
new `Earthfile`.  More significant documentation changes will be made in
a separate commit.
@TravisCardwell
Collaborator Author

I pushed a commit with the updated Earthfile to the main branch. I made some minimal changes to update-readme.sh in the commit, to match the new Earthfile.

I am going to work on the documentation next.

@utdemir
Owner

utdemir commented Apr 8, 2022

Thanks @TravisCardwell ! That all sounds great. As the changes shouldn't cause an interface change, I say there's no need to release a new version; but do let me know if you want me to push a new release.
