Docker caching #31

Open
tuler opened this issue Nov 1, 2019 · 48 comments
Labels
area:docker-cache documentation Improvements or additions to documentation enhancement New feature or request feature request

Comments

@tuler

tuler commented Nov 1, 2019

Should we try to use this cache action to cache docker layers, doing trickery with docker save and docker load, or are you working on a different path for Docker caching?

@joshmgross
Member

You're welcome to use this for caching docker layers! If you get something figured out we can add it as an example usage of this action.

> are you working on a different path for Docker caching?

Nothing is in scope right now for v1 of this action that's specific to the Docker scenario, but it might already work.

Related: https://github.com/actions/toolkit/issues/197

@joshmgross joshmgross added documentation Improvements or additions to documentation enhancement New feature or request labels Nov 1, 2019
@crazy-max
Contributor

I think I can do something with ghaction-docker-buildx action and use the local --output.

@tuler
Author

tuler commented Nov 1, 2019

I tried but I hit the file limit, discussed at #6

@steebchen

steebchen commented Nov 2, 2019

Since Docker is now natively integrated in GitHub Actions, I think it would be nice to have an option to enable Docker caching globally, like CircleCI does.

@peaceiris

peaceiris commented Nov 2, 2019

@tuler also tried to cache with a tarball on #33. It seems to be a good solution.

#37 also was opened.

@peter-evans

Opened PR #37 with an example for Docker layer caching.

@tuler
Author

tuler commented Nov 4, 2019

@steebchen I agree. Docker caching should be natively supported, not through this action.

@hamelsmu

hamelsmu commented Nov 4, 2019

cc: @neovintage

@joshmgross
Member

The cache file limit has been increased from 400 MB to 2 GB, so this may be more viable than previously due to docker layers being larger than the previous limit.

@DannyBen

> @steebchen I agree. Docker caching should be natively supported, not through this action.

It seems a lot of people agree with that statement. Where is the proper place to post such a request? In the runner repo? Perhaps someone in the inner circle could post it as a feature request in the appropriate place and post a link here?

@joshmgross
Member

@DannyBen See #81 where we're tracking Docker caching of actions

@Sytten

Sytten commented Apr 14, 2020

Any update on the progress?

@jimgreer

jimgreer commented May 3, 2020

+1

@joshmgross
Member

👋 Hi all, right now we are focusing on other priorities and there are no updates for docker caching. We appreciate your feedback and revisit priorities based on user feedback, so please continue sending us your input.

@steebchen

steebchen commented May 5, 2020

What's bothering me most is that all GitHub Actions (which are based on Docker) are always rebuilt from scratch, in every single action. I've never worked in projects as big as GitHub, but I can't imagine how caching is not the solution here, if only because it costs bandwidth to fetch the images each time. I'd also be happy to upgrade to a more expensive plan or pay for the cache storage costs.

(edit: I just realised I already commented in this thread... 😄)

schlessera added a commit to ampproject/amp-wp that referenced this issue May 22, 2020
No caching of Docker images yet, as that is not yet supported by GitHub Actions (see actions/cache#31).
schlessera added a commit to ampproject/amp-wp that referenced this issue May 23, 2020
No caching of Docker images yet, as that is not yet supported by GitHub Actions (see actions/cache#31).
@crazy-max
Contributor

Working example for me using buildx here: #260 (comment)
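
For readers who cannot follow the link, a commonly used shape for buildx layer caching with this action is sketched below. It is not necessarily the linked example: it assumes a Dockerfile at the repository root, and the /tmp/.buildx-cache paths, the cache key, the image tag, and the action versions are all illustrative choices.

```yaml
# Sketch only: paths, key, tag, and action versions are assumptions.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      # Restore the BuildKit layer cache saved by a previous run, if any.
      - name: Cache Docker layers
        uses: actions/cache@v4
        with:
          path: /tmp/.buildx-cache
          key: ${{ runner.os }}-buildx-${{ github.sha }}
          restore-keys: |
            ${{ runner.os }}-buildx-

      - name: Build
        uses: docker/build-push-action@v6
        with:
          context: .
          push: false
          tags: example/app:ci
          cache-from: type=local,src=/tmp/.buildx-cache
          cache-to: type=local,dest=/tmp/.buildx-cache-new,mode=max

      # Replace the old cache directory so it does not grow on every run.
      - name: Move cache
        run: |
          rm -rf /tmp/.buildx-cache
          mv /tmp/.buildx-cache-new /tmp/.buildx-cache
```

Writing to a second directory and swapping it in keeps stale layers from accumulating in the saved cache.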

schlessera added a commit to ampproject/amp-wp that referenced this issue Jul 20, 2020
No caching of Docker images yet, as that is not yet supported by GitHub Actions (see actions/cache#31).
tktcorporation added a commit to tktcorporation/typescript-docker-template that referenced this issue Jul 29, 2020
benmezger added a commit to benmezger/dotfiles that referenced this issue Aug 15, 2020
@jeacott1

My CI processes use several pre-built Docker images, some of them very large (3.5 GB!).
GitHub Actions pulls these images again and again on every run, burning a lot of time and CPU.
A built-in cache with a reasonable expiry behind each repo is really required here. So much of dealing with GitHub Actions seems to be spent working around paradigm/API failures or missing features.
This is a big one; please give us some better options.

@thisismydesign

thisismydesign commented Apr 13, 2022

The cache option on docker/build-push-action works great for building images.

env:
  REGISTRY: ghcr.io
  IMAGE_NAME: ${{ github.repository }}

jobs:
  build-and-upload-docker-image:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    permissions:
      contents: read
      packages: write
    outputs:
      docker-tag: ${{ steps.meta.outputs.tags }}

    steps:
      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v1

      - name: Log in to the Container registry
        uses: docker/login-action@v1
        with:
          registry: ${{ env.REGISTRY }}
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      - name: Extract metadata (tags, labels) for Docker
        id: meta
        uses: docker/metadata-action@v3
        with:
          images: ${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}

      - name: Build and push
        id: docker_build
        uses: docker/build-push-action@v2
        with:
          push: true
          file: Dockerfile.prod
          tags: ${{ steps.meta.outputs.tags }}
          labels: ${{ steps.meta.outputs.labels }}
          cache-from: |
            type=registry,ref=${{ steps.meta.outputs.tags }}
            type=registry,ref=${{ env.REGISTRY }}/${{ env.IMAGE_NAME }}:master
          cache-to: type=inline

However, I have a use case for running tests in a dockerized setup.

jobs:
  test:
    runs-on: ubuntu-latest
    timeout-minutes: 10
    needs: [build-and-upload-docker-image]

    steps:
    - uses: actions/checkout@v2

    - name: Log in to the Container registry
      uses: docker/login-action@v1
      with:
        registry: ${{ env.REGISTRY }}
        username: ${{ github.actor }}
        password: ${{ secrets.PACKAGE_ACCESS_TOKEN }}

    - name: Run tests
      run: docker-compose run web yarn test

Here every run pulls dependencies defined in docker-compose (e.g. postgres, redis, etc). Could caching of those be supported natively or by an action?
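
One workaround for pulled images is the docker save / docker load approach raised in the opening comment of this issue. The sketch below is an illustration under assumptions, not a tested recipe: the postgres:16 and redis:7 tags, the ~/docker-images path, and the cache key are placeholders, and the total archive must fit within the cache size limit.

```yaml
# Sketch only: image tags, path, and key are illustrative assumptions.
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Restore saved images
        id: image-cache
        uses: actions/cache@v4
        with:
          path: ~/docker-images
          key: docker-images-${{ runner.os }}-${{ hashFiles('docker-compose.yml') }}

      # On an exact key match, load the archive instead of pulling.
      - name: Load images on cache hit
        if: steps.image-cache.outputs.cache-hit == 'true'
        run: docker load --input ~/docker-images/images.tar

      # On a miss, pull once and write the archive that the cache
      # action's post step will upload.
      - name: Pull and save images on cache miss
        if: steps.image-cache.outputs.cache-hit != 'true'
        run: |
          mkdir -p ~/docker-images
          docker pull postgres:16
          docker pull redis:7
          docker save --output ~/docker-images/images.tar postgres:16 redis:7

      - name: Run tests
        run: docker-compose run web yarn test
```

Whether this beats a plain pull depends on image size and registry speed, so it is worth timing both variants.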

@DannyBen

> The cache option on docker/build-push-action works great for building images.

Well, that is not that great in the minimalist's opinion. The syntax is very convoluted, and as you noticed, works partially. I hope some day, $ docker build would work on the GitHub action runners in the same way it works on any other long-running machine, with docker layer cache built in and mounted automatically, with a simple or no config.

@thisismydesign

@DannyBen Yeah it's verbose but if it bothers you, create an action to wrap these into a one-liner.

@Kurt-von-Laven

> Should we try to use this cache action to cache docker layers, doing trickery with docker save and docker load, or are you working on a different path for Docker caching?

My company, ScribeMD, faced a similar issue and just released a dirt simple docker-cache action, which caches all Docker images whether built or pulled using said docker save/docker load trickery. Please file an issue if you have any feedback!

@l1553k

l1553k commented Apr 26, 2022

Just leaving an idea, haven't seen it discussed yet:

      - uses: actions/cache@v3
        with:
          path: ~/.cache/docker
          key: ${{ runner.os }}-docker
          restore-keys: |
            ${{ runner.os }}-docker
      - name: Mount docker inside ~/.cache/docker
        run: |
          sudo systemctl stop docker.service
          sudo systemctl stop docker.socket
          sudo sed -i "s!ExecStart=/usr/bin/dockerd!ExecStart=/usr/bin/dockerd -g $HOME/.cache/docker!" /lib/systemd/system/docker.service
          sudo systemctl daemon-reload
          sudo systemctl start docker

...

      - name: Make cache accessible to caching action (put this at the end of yml)
        run: |
          sudo chown $(whoami):$(whoami) -R ~/.cache

It has some problems which I hope someone smarter will know how to solve and share that knowledge:

Post job cleanup.
/usr/bin/tar --posix --use-compress-program zstd -T0 -cf cache.tzst -P -C /home/runner/work/my-repo/my-repo --files-from manifest.txt
/usr/bin/tar: ../../../.cache/docker/overlay2/7aca327c7588ed39f3f42dee1459c9ca3e42d699db7f03c0152212c9b8d347be/work/work: Cannot open: Permission denied
...
/usr/bin/tar: ../../../.cache/docker/overlay2/9873a6171ad3ecbd9d5541d49b8ca221e1d6e39871e356ff07f5ff22715f8959/work/work: Cannot open: Permission denied
/usr/bin/tar: Exiting with failure status due to previous errors
Warning: Tar failed with error: The process '/usr/bin/tar' failed with exit code 2

Soooo... it does not work (as I said, this is an idea, not a solution :) ).

@Kurt-von-Laven

Others and I have run into a similar issue. I am not currently aware of another way around the permission errors besides using docker save and docker load. Running Docker in rootless mode addresses some Docker permission issues, but not these. If someone knows of a better approach, please feel free to send a PR to ScribeMD/docker-cache.

@Kurt-von-Laven

I have since learned of a much faster approach provided you have control over where the Docker image is pulled from. You can simply upload the Docker images you are pulling to the GitHub Container Registry, which spares you the cost of docker save + cache upload on cache miss. For those, like us, who don't have such control, docker-cache is probably your best bet for now.
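
Mirroring an upstream image into the GitHub Container Registry could look roughly like the job below. This is a sketch under assumptions: OWNER is a placeholder for the account or organization name, postgres:16 stands in for whichever image is being pulled, and the action versions are illustrative.

```yaml
# Sketch only: OWNER and the postgres:16 tag are placeholders.
jobs:
  mirror-image:
    runs-on: ubuntu-latest
    permissions:
      contents: read
      packages: write
    steps:
      - name: Log in to GHCR
        uses: docker/login-action@v3
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}

      # Copy the upstream image into GHCR so later jobs can pull it from there.
      - name: Mirror the upstream image
        run: |
          docker pull postgres:16
          docker tag postgres:16 ghcr.io/OWNER/postgres:16
          docker push ghcr.io/OWNER/postgres:16
```

Test jobs would then reference ghcr.io/OWNER/postgres:16 in place of the upstream name, and the mirror job only needs to run when the upstream tag changes.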

@EpsilonAlpha

I ran into the same issue: I tried to cache /var/lib/docker and got this result after cleanup (which I guess is when the cache is saved):

Post job cleanup.
Warning: EACCES: permission denied, scandir '/var/lib/docker'

I don't know if I made a mistake in the cache step of the workflow file:

    - name: Cache Docker Layer
      id: cache-docker
      uses: actions/cache@v3
      env:
        cache-name: cache-docker-layer
      with:
        path: /var/lib/docker
        key: ${{ runner.os }}-build-${{ env.cache-name }}-${{ hashFiles('**/image/overlay2/repositories.json') }}
        restore-keys: |
          ${{ runner.os }}-build-${{ env.cache-name }}-
          ${{ runner.os }}-build-
          ${{ runner.os }}-

@Kurt-von-Laven

No, you didn't make a mistake. Please give docker-cache a try.

@EpsilonAlpha

> No, you didn't make a mistake. Please give docker-cache a try.

Thanks, I wanted to try it out before reporting back. It works well, but not the way I expected:

  • The first run with docker-cache took a whopping 8m6s; I guess it needed to build up the cache. Acceptable IMO for the initial build.
  • The second run with docker-cache took 1m50s.
  • For comparison, without any caching the workflow (which builds the image) runs in 26s to 35s.

So at this point caching is a disadvantage for me: it drastically increases the workflow's runtime, and reducing the runtime was my goal in caching the layers.
The description of docker-cache also states: "Note that this action does not perform Docker layer caching."
And that is exactly what I wanted.

So thanks for the alternative, but I will skip caching for now. I will give it another try if the solution evolves.

@Kurt-von-Laven

Apologies; I overlooked the fact that you were looking for Docker layer caching, which indeed is not the intended purpose of docker-cache. As your experiment demonstrated, docker-cache is geared towards those who are pulling images, not building them. I suggest using the official Docker build push action, which performs layer caching and/or hosting your images in the GitHub Container Registry (possibly in addition to anywhere else they might be hosted).

@EpsilonAlpha

No problem, that's why I tested whether it would work for my use case. And thanks for the pointers to the Build Push Action and the GitHub Container Registry. Maybe pushing from a GitHub runner to that registry is faster than pushing to resources outside.

I will also try out whether an image hosted on GHCR can be pulled faster than one on Docker Hub. Thanks for the idea 👍🏼 This may also speed up the process if it works.

@Kurt-von-Laven

Yes, credit to Thai Pangsakulyanont for his excellent blog post showing that pulling from GHCR is as fast as pulling from the cache. Since it's a pure docker pull, you don't even need the subsequent docker load step required when restoring from the GitHub Actions cache. Similarly, you can push to GHCR without the preceding docker save step required when saving to the cache, which I expect is the primary reason saving the cache took 8 minutes during your experiment.

If you are building images, you most likely have many Docker images with a lot of overlapping layers between them, which makes docker save extremely inefficient. When you are only pulling, you only have the precise images you pulled, so performance tends to be far better because you end up pushing far less data.

As with any performance-related work, I recommend that anyone else reading this measure the results in their specific case. To summarize, docker-cache works best for those who are only pulling images from a registry other than GHCR and don't control which registry the images are housed at.

@github-actions

This issue is stale because it has been open for 200 days with no activity. Leave a comment to avoid closing this issue in 5 days.

@github-actions github-actions bot added the stale label Jul 24, 2023
@torreytsui

Would love to see this feature in 2023.

@github-actions github-actions bot removed the stale label Jul 28, 2023
@lukasz-mitka

What would be the benefit of implementing this here if there's a GH caching available via docker/build-push-action?

Example:

      - name: Build and push
        uses: docker/build-push-action@v4
        with:
          context: .
          push: true
          tags: user/app:latest
          cache-from: type=gha
          cache-to: type=gha,mode=max

Links:
https://docs.docker.com/build/ci/github-actions/cache/#github-cache
https://github.com/docker/build-push-action

@dudicoco

dudicoco commented Aug 8, 2023

> What would be the benefit of implementing this here if there's a GH caching available via docker/build-push-action?
>
> Example:
>
>       - name: Build and push
>         uses: docker/build-push-action@v4
>         with:
>           context: .
>           push: true
>           tags: user/app:latest
>           cache-from: type=gha
>           cache-to: type=gha,mode=max
>
> Links: https://docs.docker.com/build/ci/github-actions/cache/#github-cache https://github.com/docker/build-push-action

This forces you to use the docker action rather than relying on a generic docker caching solution.
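
The gha cache backend is a BuildKit feature, so a direct docker buildx build call can use it as well, provided the runner's cache endpoint and token are exposed to the step. A minimal sketch, assuming crazy-max/ghaction-github-runtime is used to export those variables, with user/app:latest and the action versions as placeholders:

```yaml
# Sketch only: the tag and action versions are assumptions.
jobs:
  build:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Set up Docker Buildx
        uses: docker/setup-buildx-action@v3

      # Exports the cache URL and runtime token that type=gha needs.
      - name: Expose GitHub runtime
        uses: crazy-max/ghaction-github-runtime@v3

      - name: Build with the buildx CLI
        run: |
          docker buildx build \
            --cache-from type=gha \
            --cache-to type=gha,mode=max \
            --tag user/app:latest \
            --load .
```

This still depends on buildx and a helper action, so it narrows the objection rather than removing it.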

@Nefcanto

Nefcanto commented Feb 4, 2024

> What would be the benefit of implementing this here if there's a GH caching available via docker/build-push-action?
>
> Example:
>
>       - name: Build and push
>         uses: docker/build-push-action@v4
>         with:
>           context: .
>           push: true
>           tags: user/app:latest
>           cache-from: type=gha
>           cache-to: type=gha,mode=max
>
> Links: https://docs.docker.com/build/ci/github-actions/cache/#github-cache https://github.com/docker/build-push-action

It does not cache pulled images. For us, that is the most time spent in our actions.

@rickerp

rickerp commented Apr 29, 2024

> > What would be the benefit of implementing this here if there's a GH caching available via docker/build-push-action?
>
> It does not cache pulled images. For us, that is the most time spent in our actions.

This 👆
