Investigate checksum harvesting for public Amazon ECR registries #4330

Closed
kathy-t opened this issue Jun 29, 2021 · 2 comments
Labels
bug review merged but pending a third party look at whether it makes sense/is working web-service

Comments

@kathy-t
Contributor

kathy-t commented Jun 29, 2021

This ticket is split out from #4283 because it requires more brainstorming. Currently, there is no clear solution to get the digest for images from public Amazon ECR registries. The difficulty is that in order to use the Amazon ECR API to get image metadata (like the digest), the user must have permissions to that public registry.

Ideas so far:

  1. Delay the checksum harvesting until we implement a backup function. We will be downloading full containers and uploading them somewhere in the future, so we could grab the checksums then.
    • Downside: this means that snapshotting works differently for Amazon ECR
  2. Run something in the background that's asynchronous, like a lambda that pulls docker images and uploads the digest back to the webservice
    • May have a bit of a lag because the image is pulled at snapshot time
    • Downside: it's another lambda to manage. Lambdas also have a disk size limit of about 500MB, and some images are larger than that.
  3. Might be unlikely, but we could wait for Amazon ECR to implement public registry tag listing, as requested in aws/containers-roadmap#1262 ("[ECR] [request]: public registry tag listing").

┆Issue is synchronized with this Jira Story
┆friendlyId: DOCK-1843
┆sprint: Sprint 67- Reef shark
┆taskType: Story

@kathy-t
Contributor Author

kathy-t commented Jun 30, 2021

Idea 4: We could use a command line tool created by Red Hat called skopeo.

  • Skopeo can inspect a remote image without pulling it. If the image is from a registry that allows you to list tags, like Docker Hub, you can get the image digest from the output of the command (sample output).
  • For Amazon ECR, the inspect command doesn't work completely since you can't list the tags, but we can still run inspect --raw, which returns the raw manifest for the image.
  • After that, we can apply skopeo's manifest-digest command, which computes a digest for the manifest file passed to it. The digest returned matches the image's digest. Alternatively, we could sha256 the manifest file ourselves.
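
A minimal Python sketch of Idea 4, assuming skopeo is installed and on the PATH and that the stdout of `skopeo inspect --raw` holds the manifest bytes unmodified. The function names `fetch_raw_manifest` and `manifest_digest` are hypothetical helpers for illustration, not part of skopeo or of the web service.

```python
import hashlib
import subprocess


def fetch_raw_manifest(image: str) -> bytes:
    """Return the raw manifest for an image reference such as
    'public.ecr.aws/amazonlinux/amazonlinux:1' by shelling out to skopeo.
    Assumption: skopeo is available on the PATH."""
    result = subprocess.run(
        ["skopeo", "inspect", "--raw", f"docker://{image}"],
        check=True,           # raise CalledProcessError if skopeo fails
        capture_output=True,  # keep stdout as bytes; do not decode or reformat
    )
    return result.stdout


def manifest_digest(raw_manifest: bytes) -> str:
    """Compute the sha256 digest of the manifest bytes exactly as received.
    This is the 'sha256 the manifest file ourselves' alternative."""
    return "sha256:" + hashlib.sha256(raw_manifest).hexdigest()
```

Calling `manifest_digest(fetch_raw_manifest("public.ecr.aws/amazonlinux/amazonlinux:1"))` should then give the same value as saving the manifest to a file and running `skopeo manifest-digest` on it, provided the bytes are not altered in between.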

Idea 5: If for some reason we can't use skopeo, we could try to do what skopeo does manually.

  • Pull the image manifest using Docker's API. This may require AWS authentication.
  • Calculate the sha256 digest from the image manifest obtained. We may need to format the image manifest returned from the API correctly before applying sha256 (for example, newlines at the end of the file will affect the digest).
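
A rough Python sketch of Idea 5, using only the standard library. It assumes the registry speaks the Docker Registry HTTP API v2 (`/v2/<repository>/manifests/<reference>`) and hands out a bearer token as JSON with a `token` field. The token URL is left as a parameter because the exact authentication flow for public Amazon ECR is an assumption here. All function names are hypothetical.

```python
import hashlib
import json
import urllib.request

# Manifest media types to accept, so the registry returns the manifest
# as stored instead of converting it to an older schema.
MANIFEST_TYPES = ", ".join([
    "application/vnd.docker.distribution.manifest.v2+json",
    "application/vnd.docker.distribution.manifest.list.v2+json",
    "application/vnd.oci.image.manifest.v1+json",
    "application/vnd.oci.image.index.v1+json",
])


def build_manifest_request(registry, repository, reference, token):
    """Build (but do not send) the manifest GET request."""
    url = f"https://{registry}/v2/{repository}/manifests/{reference}"
    headers = {"Accept": MANIFEST_TYPES, "Authorization": f"Bearer {token}"}
    return urllib.request.Request(url, headers=headers)


def fetch_token(token_url):
    """Assumption: the token endpoint returns JSON like {"token": "..."}."""
    with urllib.request.urlopen(token_url) as resp:
        return json.load(resp)["token"]


def sha256_digest(body: bytes) -> str:
    """Hash the response body as-is; any added newline changes the digest."""
    return "sha256:" + hashlib.sha256(body).hexdigest()


def fetch_manifest_digest(registry, repository, reference, token_url):
    """Fetch the manifest and return its digest (performs network calls)."""
    token = fetch_token(token_url)
    request = build_manifest_request(registry, repository, reference, token)
    with urllib.request.urlopen(request) as resp:
        return sha256_digest(resp.read())
```

Hashing the raw response bytes, rather than a parsed and re-serialized copy of the JSON, avoids the formatting problem mentioned above.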

kathy-t added a commit that referenced this issue Aug 5, 2021
#4330 
* Add checksum harvesting for Amazon ECR images in workflows

* Add Amazon ECR image tests
@unito-bot unito-bot added the review merged but pending a third party look at whether it makes sense/is working label Aug 6, 2021
@unito-bot

➤ Natalie Perez commented:

Snapshotted a workflow that included this docker image public.ecr.aws/amazonlinux/amazonlinux:1, and saw the checksum in the image table
