How can I know the content digest of an image I built? #18133

Closed
bitglue opened this issue Nov 20, 2015 · 11 comments

@bitglue

bitglue commented Nov 20, 2015

Use case: I check out my project's source code, and I run "make deploy". That triggers the build of a bunch of docker images that compose my project. Then it goes on to trigger automation that makes a bunch of compute nodes that run my project.

I want to be sure that the cluster ends up running the images I built, not:

  • some other image that someone else decided to tag with the same name
  • some other image that is a corruption (gamma rays, faulty hardware, whatever) of the images I built
  • some other image that a MITM is trying to pass off as my image

No problem: when I'm creating my compute cluster, I'll just write the configuration there using references by content hash. This hash might not bear any authentication, but ostensibly it does confer integrity. As long as the configuration files are reasonably protected (like, only root can write them), then it should be pretty tough to trick my compute cluster into running anything other than the docker image I intended. Unless the root account is compromised of course, in which case it isn't necessary to trick docker into running arbitrary code.
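As a rough illustration of what "references by content hash" could look like when checked in deployment tooling, here is a minimal Python sketch. The function name `split_digest_reference` and the simplified `name@sha256:<hex>` pattern are assumptions made for this example; this is not Docker's full reference grammar, only a way to reject references that are not pinned by digest.

```python
import re
from typing import Tuple

# Assumed, simplified shape: <name>@sha256:<64 lowercase hex characters>.
# Illustrative only; Docker's real reference grammar is more permissive.
_DIGEST_REF = re.compile(r"^(?P<name>[^@\s]+)@sha256:(?P<hex>[0-9a-f]{64})$")


def split_digest_reference(ref: str) -> Tuple[str, str]:
    """Return (name, hex_digest); raise ValueError if ref is not digest-pinned."""
    match = _DIGEST_REF.match(ref)
    if match is None:
        raise ValueError(f"not pinned by sha256 digest: {ref!r}")
    return match.group("name"), match.group("hex")
```

A tag-only reference such as `example.com/app:latest` raises `ValueError`, so a configuration file that relies on a mutable tag can be caught before the cluster is created.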

The trouble is there seems to be no way to get the content digest of an image, even one that was just built. This doesn't really make any kind of sense: what kind of content digest is it if I can't calculate the digest if I have the content? Or is the image not actually the content? 😕

Furthermore, even with content trust enabled, pulls succeed if referenced by content hash. Ostensibly this is because the content hash should be a strong integrity mechanism, and if you are referencing something by that hash you've already decided (out of band) that it's authentic.

But unless I'm mistaken, there is no way to get the content hash except to push and then pull an image. I guess that's because:

> You don't see a digest when doing 'docker build' because the registry is currently responsible for calculating the digest of the uploaded image manifest and returning that value to the client. (source)

😮

What's to say the registry isn't lying about the hash? How can I give someone else a reference to a specific image that I know to be the right one if I'm reliant on some registry, which I don't trust, to tell me what the "content hash" is?

And if I have to send the image to a registry to get the hash, does that mean it's because the client isn't actually capable of calculating it? And if that's the case, when the client requests an image by a particular content hash, how does it verify the integrity of that image? Does it not actually verify anything? Then how is it justified to allow content-hash pulls to succeed even when content trust is enabled?

@GordonTheTurtle

Hi!

Please read this important information about creating issues.

If you are reporting a new issue, make sure that we do not have any duplicates already open. You can ensure this by searching the issue list for this repository. If there is a duplicate, please close your issue and add a comment to the existing issue instead.

If you suspect your issue is a bug, please edit your issue description to include the BUG REPORT INFORMATION shown below. If you fail to provide this information within 7 days, we cannot debug your issue and will close it. We will, however, reopen it if you later provide the information.

This is an automated, informational response.

Thank you.

For more information about reporting issues, see https://github.com/docker/docker/blob/master/CONTRIBUTING.md#reporting-other-issues


BUG REPORT INFORMATION

Use the commands below to provide key information from your environment:

docker version:
docker info:
uname -a:

Provide additional environment details (AWS, VirtualBox, physical, etc.):

List the steps to reproduce the issue:
1.
2.
3.

Describe the results you received:

Describe the results you expected:

Provide additional info you think is important:

----------END REPORT ---------

#ENEEDMOREINFO

@cpuguy83
Member

ping @diogomonica @stevvooe

@MHBauer
Contributor

MHBauer commented Nov 20, 2015

#17670 possibly related? And does #17924 solve this when it gets in?

@bitglue
Author

bitglue commented Nov 21, 2015

Related yes -- I probably should have been more specific in my description. As I read #17670 it was mostly about usability issues, and I wanted to raise some questions about security issues. Seemed different enough I thought it could benefit from a separate discussion.

@stevvooe
Contributor

@bitglue The output of the digest during the pull command is not calculated by the registry (although it is verified against the registry's digest). It is calculated by the local pulling process. In other words, you can trust the output.
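The local check described here can be sketched in a few lines of Python. This is only an illustration of the general idea (hash the bytes you received and compare against the digest you asked for), not the Docker client's actual code; the function names `sha256_digest` and `verify_digest` are hypothetical.

```python
import hashlib


def sha256_digest(content: bytes) -> str:
    """Return the digest of content in the 'sha256:<hex>' form."""
    return "sha256:" + hashlib.sha256(content).hexdigest()


def verify_digest(content: bytes, expected: str) -> bool:
    """True only if the locally computed digest equals the expected one."""
    return sha256_digest(content) == expected.strip().lower()
```

Because the comparison uses a value computed on the pulling side, a registry that returned different bytes for the requested digest would fail this check, regardless of what digest the registry itself reports.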

#17924 works to better integrate content digests into the docker daemon. It does a lot of work to ensure that layer digests are consistent, ensuring that manifest digests are consistent across multiple pulls and pushes.

@bitglue
Author

bitglue commented Nov 25, 2015

> It is calculated by the local pulling process. In other words, you can trust the output.

That's only half the problem. So you are saying if Alice wants to send an image to Bob, then Bob can verify the digest when he pulls the image. But how does Alice obtain the digest in the first place, so she can communicate it (securely, out of band) to Bob?

The current UI would strongly suggest that Alice does not calculate the digest: the registry does. That means Alice is really just relaying information from the registry to Bob. Bob thinks he's trusting Alice, but actually he's trusting the registry.

@stevvooe
Contributor

@bitglue You're making incorrect inferences from the UI. That is simply not how it works. Even if Alice gets the digest from the registry, she can verify it before sending it to Bob. She calculates it independently in the process of doing so. That is what is output in the UI. Communicating that hash to Bob, through a name, is part of the content trust system, provided by notary.

> You don't see a digest when doing 'docker build' because the registry is currently responsible for calculating the digest of the uploaded image manifest and returning that value to the client. (source)

I think I missed this earlier, but this quote is just wrong and is way out of context (sorry @ncdc ;) ). The registry calculates the canonical hash, from the perspective of the registry, and is responsible for maintaining that canonical hash. The client must verify this hash. The hash output to the UI is the verified hash.

@bitglue
Author

bitglue commented Dec 2, 2015

OK, thank you for clarifying.

bitglue closed this as completed Dec 2, 2015

@stevvooe
Contributor

stevvooe commented Dec 2, 2015

@bitglue No problem. I hope I've clarified everything appropriately.

@mschwager

Is there a simple way to just calculate the RepoDigests value from existing information? If I, for example, wanted to calculate said information before pushing to a remote source. If it's calculated client side, then why do I need to attempt to push it to a remote source before being able to view the information? It makes sense to be able to view this information about a newly created local image, with, say, docker inspect.

Is there some way to go from the existing local image information to the RepoDigests value easily? Something like RepoDigests = name + "@" + "sha256" + sha256(this_information + that_information) would be great!

@mic4ael

mic4ael commented Jan 25, 2022

> Is there a simple way to just calculate the RepoDigests value from existing information? If I, for example, wanted to calculate said information before pushing to a remote source. If it's calculated client side, then why do I need to attempt to push it to a remote source before being able to view the information? It makes sense to be able to view this information about a newly created local image, with, say, docker inspect.
>
> Is there some way to go from the existing local image information to the RepoDigests value easily? Something like RepoDigests = name + "@" + "sha256" + sha256(this_information + that_information) would be great!

That is actually what I have been trying to understand as well. I found in several places that RepoDigests is calculated on docker image push and should be equal to the sha256 of the image manifest, but I guess it is not as simple as that and there is something more to it, since I can't easily reproduce what docker image ls --digests returns.
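For what it's worth, the formula being asked about can be written down as a short Python sketch. It assumes, as reported above, that the digest is the SHA-256 of the image manifest, and additionally assumes you have the exact manifest bytes the registry stores; `repo_digest` is a hypothetical helper, not a Docker API. One possible reason a local attempt does not match is that any byte-level difference in the manifest (whitespace, key order, media type variant) yields a different hash, which the second half of the sketch demonstrates.

```python
import hashlib
import json


def repo_digest(name: str, manifest_bytes: bytes) -> str:
    """Build a 'name@sha256:<hex>' string from the exact manifest bytes (assumed input)."""
    return f"{name}@sha256:{hashlib.sha256(manifest_bytes).hexdigest()}"


# The same JSON document, re-serialized with different whitespace,
# is a different byte string and therefore has a different digest.
original = b'{"schemaVersion":2}'
reformatted = json.dumps(json.loads(original), indent=2).encode("utf-8")
print(repo_digest("example/app", original) == repo_digest("example/app", reformatted))  # False
```

The repository name `example/app` and the one-field manifest are placeholders; a real manifest has more fields, and this sketch does not show how to obtain it from a local image.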

7 participants