[ECR] [request]: support cache manifest #876
Comments
|
BuildKit 0.8 will default to using an OCI media type for its caches (see moby/buildkit#1746) which I assume should make this work, but I haven't tested it myself. |
|
It still doesn't work with recently released buildkit 0.8.0 |
|
I am seeing the same error on buildkit 0.8.0 even when setting oci-mediatypes explicitly to true. Daemon logs: |
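For context, a sketch of how one would set `oci-mediatypes=true` explicitly on the registry cache export (the account ID, repo, and tag below are placeholders, not from this thread):

```shell
# Hypothetical example; replace the ECR registry/repo/tag with your own.
docker buildx build \
  --cache-to=type=registry,ref=123456789012.dkr.ecr.us-east-1.amazonaws.com/my-repo:cache,mode=max,oci-mediatypes=true \
  --cache-from=type=registry,ref=123456789012.dkr.ecr.us-east-1.amazonaws.com/my-repo:cache \
  -t 123456789012.dkr.ecr.us-east-1.amazonaws.com/my-repo:latest \
  --push .
```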
|
Also seeing this for private repos, although it doesn't seem to be an issue with public ECR repos. |
|
Is there a timeframe available for this feature request? It could tremendously speed up CI builds. |
|
We have been experimenting with this buildkit feature for some time now and it works wonders. |
|
Any indication as to if/when this will be available? Using buildkit would really improve our CI build times. |
|
One year has passed and still nothing. |
|
We'd really like to see support for this with ECR private repos. As of today, it still does not work: |
|
Yes please! |
|
Can we get any kind of communication on this? |
|
Is there any workaround available? |
|
For the teams using GitHub but wishing to keep images in ECR, it is possible to leverage the cache manifest support in GitHub Container Registry (GHCR) and push the image to ECR at the same time. When pushing to ECR, only new layers get pushed. GitHub Actions workflow example:

```yaml
jobs:
  docker_build:
    strategy:
      matrix:
        name:
          - my-image
        include:
          - name: my-image
            registry_ecr: my-aws-account-id.dkr.ecr.us-east-1.amazonaws.com
            registry_ghcr: ghcr.io/my-github-org-name
            dockerfile: ./path/to/Dockerfile
            context: .
            extra_args: ''
    steps:
      - uses: actions/checkout@v2
      - name: Install Buildkit
        uses: docker/setup-buildx-action@v1
        id: buildx
        with:
          install: true
      - name: Login to GitHub Container Registry
        uses: docker/login-action@v1
        with:
          registry: ghcr.io
          username: ${{ github.actor }}
          password: ${{ secrets.GITHUB_TOKEN }}
      - name: Configure AWS Credentials
        uses: aws-actions/configure-aws-credentials@v1
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1
          role-skip-session-tagging: true
          role-duration-seconds: 1800
          role-session-name: GithubActionsBuildDockerImages
      - name: Login to Amazon ECR
        uses: aws-actions/amazon-ecr-login@v1
      - name: Build & Push (ECR)
        # - https://docs.docker.com/engine/reference/commandline/buildx_build/
        # - https://github.com/moby/buildkit#export-cache
        run: |
          docker buildx build \
            --cache-from=type=registry,ref=${{ matrix.registry_ghcr }}/${{ matrix.name }}:cache \
            --cache-to=type=registry,ref=${{ matrix.registry_ghcr }}/${{ matrix.name }}:cache,mode=max \
            --push \
            ${{ matrix.extra_args }} \
            -f ${{ matrix.dockerfile }} \
            -t ${{ matrix.registry_ecr }}/${{ matrix.name }}:${{ github.sha }} \
            ${{ matrix.context }}
```
|
|
@ynouri Just be careful of your storage costs in GHCR. It's oddly expensive. I found https://github.com/snok/container-retention-policy to help solve that use case for me. |
|
ECR still not supporting this is unbelievably amateurish; it doesn't suit AWS...
Use another Docker registry: Docker Hub, or perhaps your own tiny EC2 with some fat storage. |
|
This seems to have started working unannounced, at least when using Docker 20.10.11 to build. |
I'm still seeing |
is this confirmed? |
I'm wondering the same thing. Could you share some more info @poldridge ? |
|
I've just faced the same issue with Docker version 20.10.12, build e91ed57. Would appreciate any hints or workarounds. |
Did not work for me using |
|
Can we get any kind of communication on this? Being able to use remote cache is a major benefit to all our build pipelines. |
|
FWIW, my team uses buildctl directly, instead of the buildx wrapper.

On Sat, Oct 22, 2022 at 10:45, Connor Bell wrote:

> Would also love to hear how people are working around this limitation today, as it provides additional context about the underlying customer need. In our case, before Buildx had support for S3 caches: since we host an s3-backed registry as a pull-through cache (https://docs.docker.com/registry/recipes/mirror/) for our ECS clusters/build workers anyway (to cache base images pulled from Docker Hub, as they implemented pull restrictions), it was pretty easy to configure that as a build cache store as well. I can understand why people are frustrated at having to host an additional service, though, as for some it can mean quite a lot of additional infrastructure to support, if they're not able to just tack it on to their existing setup. Thank you for the links you provided, by the way. It really helped my understanding of how ECR (and apparently most providers who follow the spec!) came to not support caches produced by buildx. |
Do you have an example of that? |
Sure. Please see the examples here. |
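For illustration, a minimal buildctl invocation exporting cache to a registry might look like this (a sketch only; the image and cache refs are placeholders, and it assumes a running buildkitd):

```shell
# Sketch: build with the dockerfile frontend, push the image, and
# export/import build cache as a separate registry artifact.
buildctl build \
  --frontend dockerfile.v0 \
  --local context=. \
  --local dockerfile=. \
  --output type=image,name=ghcr.io/my-org/my-image:latest,push=true \
  --export-cache type=registry,ref=ghcr.io/my-org/my-image:cache,mode=max \
  --import-cache type=registry,ref=ghcr.io/my-org/my-image:cache
```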
|
For those looking for guidance on the S3 build cache with buildx, this worked for me, though I will probably wait until the S3 cache is released before using it in production. |
|
I feel like we are going off-topic with all the workarounds being mentioned. All of this information is helpful, but in the end we still need ECR to support cache manifests so we can use it to store our cache layers. What's the current progress on this? |
It seems that the comments above are related to this post, which also contains a detailed explanation and questions for users.
From my observation, only the
I know customers have asked JFrog to support cache manifests in their Artifactory; it now works there, with a size limit of 2 GB per layer. |
Our current workaround is that we are still using Google GCR, since inline caching has been working there, but it would be nice to keep image layers from having to go cloud to cloud. |
|
Really wish this was supported by now, everything else seems like a workaround, and it's confusing to look at in the build pipelines. |
|
Hi all, Providing a quick update here. I can't comment on dates in a public roadmap, but note that we are still working on this :) I can provide an overview of our thinking but please note that this is a snapshot of our current plans, and not set in stone or a firm commitment. We are closely looking at the release of the OCI 1.1 specification, which includes OCI Artifacts and the Referrers API. We think that OCI Artifact manifests are perfect for this. The 1.1 specification is in a release candidate state, and the OCI has set a 1/31 date for the final release. Our intention is to see if we can work with Buildkit to have them move over to OCI Artifacts once the specification is official and ECR supports it, which would immediately make this work for everyone using it today without having to make changes. We do want to make sure that other registries also support this implementation, as it's not our intention to break anyone's existing workflows. I apologize for the wait - I know this has been open for a long time. It's very high on our priority list, but even then it is taking a bit to get over the finish line. |
|
Any updates please? This feature alone keeps me from promoting GHA build/deploy to my org. We currently use CodeBuild, which sucks. |
|
I am also looking forward to this feature! |
|
Note: This bit me and my team in the behind just now trying to use buildkit and cache things in ECR. It's 3 years later and this still isn't possible. Can this please get some serious attention? |
For now, you can leverage the newly announced BuildKit support for S3 caching to speed up your Docker builds on AWS (see "Docker Now Supports Native S3 Caching"). From the release notes: the build command supports new flags --attest, with shorthands --sbom and --provenance, for adding attestations to your build, and when creating OCI images a minimal provenance attestation is included by default. This requires BuildKit v0.11.0+ and Dockerfile 1.5.0+. |
|
@osterman We tried, it segfaults hard on the latest master version as of this morning. S3 support is in (very early) beta, at best on buildkit. Can't get it to work personally. |
|
We've been using S3 as the backend for a local instance of the registry, running together with the build. |
Lifecycle policies? |
Lifecycle policies alone will corrupt the registry store, as they will delete parts of the cache that are still referenced elsewhere. |
|
Hi all, I'm sorry that we are taking so long with this. I unfortunately don't have too many updates to report: as of right now, we are working on supporting the OCI 1.1 specification (which continues to be targeted for 1/31), and we have also reached out to BuildKit to drive a change in how they store cache manifests. I appreciate everyone's patience with this. |
|
@rchernobelskiy Ah, if I understand correctly, you're using S3 as the backend for the registry instead of local storage, and you weren't using buildkit's caching feature at all then, eh? I hadn't imagined trying that route; could you perhaps share the commands/settings/config/setup you're using to get this to work? I'd like to explore it more. Thanks! (Sorry folks for this comment not being directly related to this issue.) |
|
@AndrewFarley we're using a temporarily spun-up registry (https://github.com/distribution/distribution) as the cache for each build, and that registry uses S3 as its storage backend to persist the cache across builds. This is the only reliable and maintenance-free persistent cache solution we've found after a lot of experimentation about a year ago. |
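As a sketch of that setup (the bucket name and port are placeholders; the keys follow the distribution registry's documented S3 storage driver), the ephemeral registry's config could look like:

```yaml
# config.yml for the distribution registry -- sketch with placeholder values.
version: 0.1
storage:
  s3:
    region: us-east-1
    bucket: my-buildcache-bucket   # hypothetical bucket; credentials come from the environment/IAM role
http:
  addr: :5000
```

The registry would then be started alongside the build (e.g. `registry:2` with this config mounted), and buildx pointed at it with `--cache-to=type=registry,ref=localhost:5000/my-image:cache,mode=max` and the matching `--cache-from`.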
|
@rchernobelskiy We are on AWS, and the solution we landed on after some experimentation is using an FSx disk as our cache. It's a multi-mount-capable, super-fast shared-disk solution, so we just tell buildkit on all our runners to use a local disk cache pointed at this disk. Even at very low provisioned speeds it's lightning fast. |
Sounds nice! Do you have any comparisons of this vs. using S3 cache + the local disc? |
|
@marcesengel I didn't get the S3 cache working, so I can't say, and I don't have the time to experiment with the "distribution" registry proxy idea that @rchernobelskiy shared above. I can say that our builds are crazy fast. When I was previously doing builds with Kaniko and AWS ECR registry layer caching, I was seeing 1-3 minute build times on some of our test repos, whereas with BuildKit and FSx we regularly see 10-20 second build times. This may be more indicative of BuildKit than of the cache backend, but I would guess there's far more overhead in talking to a remote registry a bunch of times to grab cached layers (opening TCP connections, doing TLS handshakes, encrypting and decrypting traffic) than in reading from a local disk. Try it out, I bet you won't be disappointed. If you get both working and compare speeds, I'd be eager to hear what the numbers look like. |
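A sketch of pointing buildx at a shared-disk local cache (the mount path is a placeholder; it assumes the FSx/EFS volume is mounted at the same path on every runner):

```shell
# Sketch: local cache directory on a shared volume, reused across runners.
docker buildx build \
  --cache-from=type=local,src=/mnt/shared-cache/my-image \
  --cache-to=type=local,dest=/mnt/shared-cache/my-image,mode=max \
  -t my-image:latest \
  --load .
```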
|
Here is an example for the S3 cache. See also: https://github.com/moby/buildkit#s3-cache-experimental. Our settings are working successfully on a GitHub self-hosted runner. One problem is that the S3 cache does not remove old layer caches automatically, |
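For reference, a hedged sketch of the S3 cache flags from the buildkit README (the bucket, region, and image names are placeholders, and AWS credentials are assumed to be in the environment):

```shell
# Sketch: export/import build cache to an existing S3 bucket.
docker buildx build \
  --cache-from=type=s3,region=us-east-1,bucket=my-buildcache-bucket,name=my-image \
  --cache-to=type=s3,region=us-east-1,bucket=my-buildcache-bucket,name=my-image,mode=max \
  -t my-image:latest \
  --push .
```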
|
@roy-ht I'm aware of those examples. Those examples hard-crash for us with the latest release and on |
|
@AndrewFarley I got it working with S3 by using the Edit: Also thanks a lot for your in-depth answer, much appreciated. |
|
I reckon that even when ECR supports cache manifests, EFS will most likely stay the fastest option, because of better transfer rates than ECR, and one could even store the cache uncompressed on EFS. What do you think? |
|
Let's keep the pressure on getting ECR to support this instead of workarounds. I definitely do not look forward to manually cleaning up a Docker registry cache, should it, for example, go to S3. |
|
Using a local cache, which is what you are doing by mounting an EFS/FSx volume on the CI machine, will always be faster than using a remote one. Could you now stop spamming this issue about support for cache manifests in ECR, please? I'd say the workarounds have been discussed at length. |
|
When we tried EFS about a year ago it was a disaster, I really don't recommend it. FSX seems a different beast though. |
Would be great if ECR could support cache-manifest (see: https://medium.com/titansoft-engineering/docker-build-cache-sharing-on-multi-hosts-with-buildkit-and-buildx-eb8f7005918e)