Tarsum insecurity #9719

Closed
titanous opened this Issue Dec 18, 2014 · 37 comments

Projects

None yet
@titanous
Contributor

The current tarsum system is fatally flawed.

The goal of a cryptographic checksum is to help build a secure system that prevents messages from being tampered with. In order to build such a system, the message must not be processed in any way before the checksum is verified. Every line of code that does processing before a checksum is verified is dealing with entirely untrusted bits and is a huge target for exploitation.

Currently a massive amount of processing is done before verification of the checksum, namely image unpacking.

Recent Docker vulnerabilities demonstrate this point as CVE-2014-9356, CVE-2014-9357, and CVE-2014-6407 all occur during extraction and could have been triggered using a tampered image that had an invalid tarsum.

Even if unpacking did not happen before the tarsum is checked, the tarsum spec requires that the tar stream be decoded in order to verify the checksum, which is an unacceptable amount of processing to perform before verifying a cryptographic checksum.

I propose that instead, a simple checksum of the tarball is taken after it is packed up. That way image layers can be downloaded to a temporary location, the checksum (and hopefully signature) verified, and then after successful verification they can be unpacked.

On top of this, layer tarsums do not appear to be chained, possibly allowing layer swap attacks. Any checksum of a layer should include the checksum of its parent and so on. One option is to put the checksum in the layer manifest and then after downloading and verifying the signature of the leaf layer, use the parent checksums embedded in each layer manifest. I have not evaluated the current layer signing code as it appears to be not fully deployed.

From a usability perspective, tarsums are only calculated on push and are not exposed in any way locally via the API. This makes it really hard to build other programs that work with Docker images.

@vbatts vbatts added the Trust label Dec 18, 2014
@vbatts
Contributor
vbatts commented Dec 18, 2014

@titanous I do not disagree that it is a lot of processing, though signing the tar artifact means that artifact itself must be preserved. This would also be argued for being an unacceptable amount of added storage.
In the meantime, just taking the checksum of the in flight tar archive would only be a temporary sum, and not something that ISVs or partners could distribute to users/customers for long term trusted image story.

Next, we can both agree that the CVEs are in some way separate of the TarSum checksum, because they exist due to the nature of tar archive extraction. Your point is that presently the tar archive is extracted. I want these steps separated, so the validation of the image archive can be out-of-band for the graph function of the archive. Not a shameless plug, but that is why I have docker-utils that provide a command for calculating the TarSum of a stream or local archive file (go get github.com/vbatts/docker-utils/cmd/dockertarsum). Not extracting, just processing. Last week began on a PoC for fetching images from a registry, for use on docker load (go get github.com/vbatts/docker-utils/cmd/docker-fetch). This way, still not extracting the tar archive, but allowing to save the archive to disk, and validate a detached signature, before loading the image. (docker-fetch is not readied code, but for the sake of illustrating an example).

As for the chaining, I'll focus on the v1 image format. The tarsum of each layer includes the tar archive and the json for that image layer. That json file includes the reference to the parent image ID. If you attempted to side load an impersonator image with the same image ID, that impersonator ID would fail its checksum. If you attempted to change the layer json to reference an alternate parent, then this layer would fail its checksum. Despite checksum validation not being a first class citizen in the v1 image docker daemon, there is attestable assurance against this swap attack.
For v2 that ancestry information is bubbled up to a single manifest, which is verifiable and signed. And the tar layers are addressed by their checksum, so again safe from this swap attack.

I'm glad you open this issue and discussion. It confirms that I am not the only one concerned for this story in the docker v1 image format. I feel that this is an incredibly strong story in making docker daemon more composable and a better security position.

Thoughts?

@titanous
Contributor

@vbatts

signing the tar artifact means that artifact itself must be preserved.

Is this actually true? If tarsum code provides a deterministic way of encoding a tar, can the checksum not be taken of the final tar? Wouldn't this be reproducible if the same algorithm is used to regenerate the tar?

Next, we can both agree that the CVEs are in some way separate of the TarSum checksum

Indeed, the mentioned CVEs only happened in the physical unpack step as opposed to the tarsum calculation step. I used this to demonstrate the problem with doing processing before a checksum is verified. In the case of the tarsum code, there is the possibility of vulnerabilities in the logic due to the amount of processing that happens.

I want these steps separated, so the validation of the image archive can be out-of-band for the graph function of the archive.

Yes, this is very important, the daemon should not unpack at all before the checksum is verified.

The tarsum of each layer includes the tar archive and the json for that image layer. That json file includes the reference to the parent image ID.

Is the image ID the same as a tarsum? If it is not, then an attacker could provide an image with the same ID but a different tarsum, this is the vector I'm pointing out.

For v2 that ancestry information is bubbled up to a single manifest, which is verifiable and signed. And the tar layers are addressed by their checksum, so again safe from this swap attack.

Yes, this would mitigate the issue.

@jlhawn
Contributor
jlhawn commented Dec 18, 2014

The tarsum of each layer includes the tar archive and the json for that image layer.

I don't believe this is the case. The layer checksum field of the current image JSON objects that docker produces is only a checksum of the layer tar archive.

edit: I think it's very valuable to keep around the tarsum of only the layer archive. I would prefer to add a separate field in the image JSON - perhaps a contentAddress field which is a sum of all layer checksums.

@vbatts
Contributor
vbatts commented Dec 18, 2014

@jlhawn that is the reason for github.com/docker/docker/pkg/tarsum.(*TarSum).Sum([]byte). The []byte body of the layer's json manifest is passed to the current sum for the final hexdigest. It is possible to only pass nil, but that is its own use case.

@ewindisch
Contributor

@titanous The fact that IDs are not deterministic and content-based is known and something that I agree needs addressing. There are solutions here for solving this for layer chains such as adding a parent-sum field to the JSON, but this wouldn't eliminate all the risks.

The effort to build a v2 image format exists precisely due to the problems with the existing format.

@ewindisch
Contributor

@titanous @vbatts @jlhawn - I also remind everyone that if they feel there may be possible attacks against the current format, not to publicly discuss this on GitHub. If you do feel this way, I'll happily entertain a private discussion. While I generally prefer and encourage transparency in open source projects, we should be careful to practice responsible disclosure.

Of course, simply discussing improvements to the format in the open is fine.

@vbatts
Contributor
vbatts commented Dec 18, 2014

@titanous

Is this actually true? If tarsum code provides a deterministic way of encoding a tar, can the checksum not be taken of the final tar? Wouldn't this be reproducible if the same algorithm is used to regenerate the tar?

You're right, the option would be having a very strict and deterministic manipulation of the data as the tar is packed. This would not be backwards compatible. Also, it would include stripping/reseting data like atime, ctime and mtime. Not sure that is a route many folks would be comfortable with either.

In the case of the tarsum code, there is the possibility of vulnerabilities in the logic due to the amount of processing that happens.

More moving parts does increase surface area for vulnerabilities. I welcome improvement, and making a best effort in not taking the stance that all of the docker v1 image format is hopeless. This is why we've built out versioning of the algorithm, and now getting a spec for it so we can debate on it and improve on it.

Yes, this is very important, the daemon should not unpack at all before the checksum is verified.

We're agreement there. It is an important point.

Is the image ID the same as a tarsum? If it is not, then an attacker could provide an image with the same ID but a different tarsum, this is the vector I'm pointing out.

I see you're point, but as I illustrate on https://github.com/vbatts/docker-utils, having a detached signed sums file, you can ensure you trust the signer, and validate the id --> sum, before extracting it. Your very example.

@titanous
Contributor

@ewindisch

I also remind everyone that if they feel there may be possible attacks against the current format, not to publicly discuss this on GitHub. If you do feel this way, I'll happily entertain a private discussion. While I generally prefer and encourage transparency in open source projects, we should be careful to practice responsible disclosure.

Disclosing security vulnerabilities is a responsible thing to do. As a security researcher, it is entirely my choice how/when/if I disclose security issues. In this case, I'm not dropping any 0-days, just pointing out fundamental flaws in the current system. Fixing these flaws should be an open discussion, not a private one.

As far as "responsible disclosure" goes, it is only one vulnerability disclosure approach (the alternative is not "irresponsible disclosure"), and there is zero consensus about it.


@vbatts

I welcome improvement, and making a best effort in not taking the stance that all of the docker v1 image format is hopeless.

The image format in its current form is not secure. There are some things that could be done to improve the security of the implementation, but this will not entirely solve the issues I have raised.

This is why we've built out versioning of the algorithm, and now getting a spec for it so we can debate on it and improve on it.

Would you point me at the relevant issues so that I can take a look?

@vbatts
Contributor
vbatts commented Dec 18, 2014

@titanous

The image format in it's current form is not secure. There are some things that could be done to improve the security of the implementation, but this will not entirely solve the issues I have raised.

Let's talk about specifics, because your strongest points had more to do with aspects inherit to tar archives, and the unpacking of them.
Goals being assurance for docker v1 and forward image formats. Granted some process may still feel bolted on, but I don't see what's the use in taking a position that it's hopelessly flawed.

Would you point me at the relevant issues so that I can take a look?

@titanous
Contributor

@vbatts

Let's talk about specifics

Sure, here are my concrete suggestions:

  1. Don't unpack images before verifying the checksum. This can be done without modifying the image format.
  2. For the v1 format, add a field so that layer tarsums are chained.
  3. Keep the image checksum and signature information around, expose it via the API, and allow it to be pinned for a 'pull' request so that external systems can pass around secure image references.
  4. For the v2 format, drop tarsum entirely and hash the tar directly instead of decoding it during verification.
  5. For the v2 format, ensure that image identifiers are content-addressable so that passing an image reference cryptographically pins the contents of the image.
@titanous
Contributor

@vbatts Those issues all appear to be about tarsum, where are the issues about the v2 image format that has been mentioned?

@jlhawn
Contributor
jlhawn commented Dec 18, 2014
  1. Don't unpack images before verifying the checksum. This can be done without modifying the image format.

Currently the tarsum of a layer archive is calculated while it's being unpacked into the storage driver. This is done so that we can make pulling an efficient streaming process. The extraction of the archive is now done with a chroot to the destination to make unpacking safer. However, I do believe that @dmcgowan has made the v2 pull process save the layer archives to temporary files so that some can be downloaded in parallel and then unpacked in order when ready. He can probably better answer this and we can move tarsum calculation to this step of the process.

  1. For the v1 format, add a field so that layer tarsums are chained.

Yes! I agree and I can submit a PR for this when I have some time.

  1. Keep the image checksum and signature information around, expose it via the API, and allow it to be pinned for a 'pull' request so that external systems can pass around secure image references.

We do keep the image checksum around in the latest releases of Docker and the field should be in the result of docker inspect <image-id-or-name>. Signature should definitely stick around with the new format too, it just needs to be implemented, but first we need to agree on the new storage format.

  1. For the v2 format, drop tarsum entirely and hash the tar directly instead of decoding it during verification.

Many people wont like this because it requires significantly more storage space to keep around both the archives and the unpacked layers to be used by containers. I'm agreeable to keeping around the original (optionally compressed) layer archives though because storage is relatively cheap and available, and this way the SHA sum would always stay the same and we don't have to worry about maintaining TarSum at all.

  1. For the v2 format, ensure that image identifiers are content-addressable so that passing an image reference cryptographically pins the contents of the image.

The plan is for the v2 image manifests to be signed (verifying the name of the image and the integrity of the manifest) and for each of the layers to be addressable by checksum.

... where are the issues about the v2 image format that has been mentioned?

While I don't know of any specific discussion about the v2 format, we do have the provenance proposal issue which gives an example v2 image manifest, but all it does is wrap the v1 image JSON blobs so it really needs to have it's own issue where we can fully spec it out.

edit: @GitHub why does the rendered markdown show 1. for each quote section instead of the actual numbers? :(

@kofalt
kofalt commented Dec 18, 2014

@jlhawn

Many people wont like this because it requires significantly more storage space to keep around both the archives and the unpacked layers to be used by containers.

I think @titanous had a good response to this upthread:

Is this actually true? If tarsum code provides a deterministic way of encoding a tar, can the checksum not be taken of the final tar? Wouldn't this be reproducible if the same algorithm is used to regenerate the tar?

@dmcgowan
Member

v2 image related issues

#8093 - original proposal
#9015 - final v2 registry spec

Will respond to V2 related points
4. This would only be necessary if there was no way to securely compute the tarsum when doing image layer verification, however it is possible to compute a tarsum without unpacking. Its a fair argument that generating a deterministic tar and directly taking the hash of the archive has less potential for vulnerabilities. The spec leaves open the possibility of supporting both types.
5. We believe this is achieved in the v2 format

@titanous
Contributor

@dmcgowan #9015 appears to be about a version of the registry API, is the v2 image format intermingled with that proposal?

@dmcgowan
Member

@titanous sorry it does only mention the V2 image format, but with no changes from the previous proposal. Linked because it describes the API for getting the layers by content address.

@jlhawn
Contributor
jlhawn commented Dec 18, 2014

@kofalt wouldn't that mean that we'd have to spec, implement, and maintain our own custom Tar archive library that orders entries and headers in a certain way and even drops certain header values?

We may already be pretty close to this with the github.com/docker/docker/pkg/archive package. I believe it already walks the source directory in lexicographic order, it would just need to ensure that xattrs are in a certain order and that it zeros out access and modified times (but then these values are completely lost when unpacking this archive! tarsum just ignores them)?

@mmdriley
Contributor

Happy to see this broken out into its own issue. I've been asking about this for a while.

Gzip and Tar are nontrivial formats that weren't designed to be parsed securely. Assuming the parser doesn't choke on the structure of the file itself (buffer overflows, arithmetic overflow) then there's still a chance for the data to be interpreted differently by different components. We've already seen an example of the latter.

Memory-safety concerns may be less pressing for Go implementations, but the specification should be safely implementable in other languages.

You're hoping to build a data-authentication scheme that first requires parsing the data to be authenticated. That doesn't end well.

As @jlhawn points out, you're most of the way to establishing a canonical format for TAR files. Presumably a similar set of parameters for Gzip is practical too. Please, please, please just sign a SHA256 of the .tar.gz.

@ewindisch
Contributor

@mmdriley I agree the far-best solution here is to just hash the tar.gz. The problem is that the way Docker works today assumes the content of the layer on disk is the same as the tar.gz. Tarsum exists to provide a fuzzy matching between filesystems and tar algorithms, so the same tarsum may be generated from various systems that might tar the same directory contents, possibly with different algorithms. I think there are alternatives here, but they're non-trivial... (see below)

@jlhawn It seemed at least implied that @titanous was looking to create Docker layers outside of Docker itself. I know I have personally done this from time to time as well. I'm hesitant to demand layer archives be created by pkg/archive.

@titanous I at least agree we should not be parsing archives until after we have performed a tarsum. I admit even that it's potentially dangerous that we even use archive/tar to compute the tarsum.

The only real solution I see is to use something entirely alternative to tar. In theory, one solution would be to create a tarball on 'docker commit'. All such committed layers would be read directly from the tarball via a FUSE driver for tar archives. Only the last layer, the uncommitted layer, would use our storage driver(s). Note that in doing so, we open a can of worms in the stability and security of FUSE.

@ewindisch
Contributor

@inthecloud247 The blog post by @titanous is somewhat orthogonal to this conversation, I think. Everything in that post seems to be being addressed. The libtrust and v2 work is being audited and is getting further review from the community. That's great. That work has been put out there precisely to gather feedback. Limited privilege separation has been introduced with docker 1.3.2 around archive extraction and there are discussions of using this throughout docker and containerizing it more generally. That conversation also included containerization of the 'xz' process.

Right now, it seems we have two immediately actionable items:

  1. If broken, fix the pipeline. We should fully validate an image before we extract it
  2. Discuss possible alternatives to tar for a WORM filesystem that we can use across platforms, and how to integrate into the graph filesystem concept (i.e. layers).
@mmdriley
Contributor

I agree the far-best solution here is to just hash the tar.gz.

More importantly:

Hashing the downloaded file before parsing it is the only way to implement this security feature securely.

If you expose the attack surface of an unhardened parser -- especially one written in C like xz -- then the conversation moves to mitigations because we've already lost.

A discussion of how difficult it would be to do something else seems out-of-place. Docker has to do something else before finalizing this specification. Can we start talking in earnest about what that will be?

For example, are you asserting it's not possible for two systems to create the same .tar file from the same files deterministically? What's the biggest obstacle there?

@kofalt
kofalt commented Dec 23, 2014

Good points all around, though I'm not so sure about choosing a tar alternative.

Maybe a reasonable course of action here would be to yank the The image you are pulling has been verified message until these concerns are resolved, as both this thread and the blog post linked by @inthecloud247 have pointed out serious instances where that message can be misleading.

It would probably be best to be very conservative about the assertions made to users who may not have the full context of this security discussion and the current issues. Alternatively, adding the phrase An attempt has been made to verify... or similar, to be transparent about its current operation.

@pwaller
Contributor
pwaller commented Jan 17, 2015

I've raised @kofalt's suggestion of suppressing the message for now as a new issue #10158.

@pwaller pwaller added a commit to pwaller/docker that referenced this issue Jan 17, 2015
@pwaller pwaller Disable 'verified' message
Fix #10158. See also #9719.

It is known that the current security mechanisms don't provide
strong guarantees. To avoid misleading users,  docker should for
now not print that the image has been verified.

Signed-off-by: Peter Waller <p@pwaller.net>
1f94c31
@pwaller pwaller added a commit to pwaller/docker that referenced this issue Jan 17, 2015
@pwaller pwaller Disable 'verified' message
Fix #10158. See also #9719.

It is known that the current security mechanisms don't provide
strong guarantees. To avoid misleading users,  docker should for
now not print that the image has been verified.

Signed-off-by: Peter Waller <p@pwaller.net>
126e502
@pwaller pwaller added a commit to pwaller/docker that referenced this issue Jan 18, 2015
@pwaller pwaller Disable 'verified' message
Fix #10158. See also #9719.

It is known that the current security mechanisms don't provide
strong guarantees. To avoid misleading users,  docker should for
now not print that the image has been verified.

Signed-off-by: Peter Waller <p@pwaller.net>
92b070f
@pwaller
Contributor
pwaller commented Jan 20, 2015

#9784 was merged today making my pull request to remove the verified message obsolete. Is there a write-up somewhere of how the concerns were addressed? I'm still concerned that docker is doing a tarsum-like thing. Does that still mean that will happily invoke xz on the input before verification for example?

@mmdriley
Contributor

Haven't seen any movement on this in a while. Any path toward a better solution here?

@dmcgowan
Member

@mmdriley if you are interested in this topic much of the work is being done on the distribution end. See docker/distribution#62. We have had some other recent difficulties with tarsum unrelated to security that is pushing us away from it as a distribution content address. We have also made incremental improvements in 1.5 for inclusion of tarsum verification as well as limiting the messaging to users for when we have fully verified the manifest, signing keys, and tarsums. The future likely won't be without tarsum since it is a very valuable tar comparison tool, but using the raw sha on blobs is just easier.

@tphyahoo

Could this have label project/security added?

@gomex
gomex commented Apr 3, 2015

Any news about these problems?

@ewindisch
Contributor

@gomex see the docker/distribution work. Progress is being made on image management which will no longer utilize tarsum.

@jlhawn
Contributor
jlhawn commented Apr 3, 2015

Registry now uses SHA2 digests: docker/distribution#238
Docker now verifies images using SHA2 digests before unpacking: #11271

I think it might be okay to close this issue. @titanous what do you think?

@titanous
Contributor
titanous commented Apr 3, 2015

@jlhawn Is the digest of the raw bytes with no parsing of the tar?

@dmcgowan
Member
dmcgowan commented Apr 3, 2015

@jlhawn we are not doing a hard enforce of the digest yet for 1.6 while we give early adopters of the v2 registry a chance to generate images not using tarsum. Images pushed with 1.6 will not be using tarsum.

@titanous verifying the sha256 digest does not involve parsing the tar file. The digest is computed on the raw bytes as it is being streamed into a temp file. The tar is extracted after the digest computation is complete.

I would be ok with defining this issue as resolved once we switch to hard enforce of the digest and no longer support tarsum as an allowed digest. As long as tarsum is allowed, registries can cause a tarsum computation.

@titanous
Contributor
titanous commented Apr 3, 2015

I would be ok with defining this issue as resolved once we switch to hard enforce of the digest and no longer support tarsum as an allowed digest. As long as tarsum is allowed, registries can cause a tarsum computation.

I agree, this issue should stay open until tarsum is removed or prints a warning that asks for user confirmation when used.

@HuKeping HuKeping referenced this issue in HuKeping/Hello Jun 9, 2015
Open

Track #2

@icecrime icecrime removed the dist/trust label Jul 17, 2015
@alexanderkjeldaas

This was a long discussion indeed.

Integrity checking of the downloaded image doesn't have to involve temporary storage. The technique you want to utilize is called tree hashing, and is used in bittorrent and git for example.

Keccak (SHA-3) is particularly well suited to this. It is completely safe, as you can read in http://sponge.noekeon.org/TreeHashing.pdf. Most modern cryptographic hash functions are designed to support this mode.

Tree hashing simply means that the image hash is (recursively) a hash of other hashes, with hashes of chunks at the leaves. Some padding around the hashes is required to make this safe (read the paper).

With tree hashing you only need to transfer a few extra hashes as meta-data in the image, and using these hashes you can always know that what you have received is a valid part of the final image.

@thaJeztah
Member

I think this can be close now with the content addressable store and tarsum being removed from distribution as well docker/distribution#1271, but feel free to comment if I overlooked something

@thaJeztah thaJeztah closed this Apr 7, 2016
@vbatts
Contributor
vbatts commented Apr 7, 2016

oh what a gem of a thread. I had completely forgotten about. Glad to have
been part of facilitating the content addressibility support. Thanks
Sebastian for following up on this.

On Thu, Apr 7, 2016 at 2:32 AM Sebastiaan van Stijn <
notifications@github.com> wrote:

I think this can be close now with the content addressable store and
tarsum being removed from distribution as well docker/distribution#1271
docker/distribution#1271, but feel free to
comment if I overlooked something


You are receiving this because you were mentioned.
Reply to this email directly or view it on GitHub
#9719 (comment)

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment