Proposal: push performance improvements #14018

Closed
3 tasks done
ncdc opened this issue Jun 18, 2015 · 18 comments

ncdc (Contributor) commented Jun 18, 2015

Proposal: push performance improvements

Background

Cluster operators often run Docker on multiple hosts. Clusters that provide image building functionality will commonly pull down an image, perform a series of build steps using it, generate a new image, and push the resulting image to a registry.

Problems

Inconsistent layer checksums across hosts

In a given cluster of hosts running Docker, if they all download the same image (e.g. centos:centos7), they will all receive the same set of layers, and they will all resolve to the same v1 layer IDs (because these IDs are stored in the history.v1Compatibility section of the image manifest). For example:

docker history centos:centos7
IMAGE               CREATED             CREATED BY                                      SIZE
fd44297e2ddb        8 weeks ago         /bin/sh -c #(nop) CMD ["/bin/bash"]             0 B
41459f052977        8 weeks ago         /bin/sh -c #(nop) ADD file:be2a22bb15fbbbf24b   215.7 MB
6941bfcbbfca        8 weeks ago         /bin/sh -c #(nop) MAINTAINER The CentOS Proje   0 B

These v1 layer IDs unfortunately don't correspond to the checksums the registry uses to identify them:

"fsLayers": [
   {
      "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
   },
   {
      "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
   },
   {
      "blobSum": "sha256:f9587eba3ab8840fb621e02a7f6c53439f1e2a651fa95bc7185ad3081a2eb795"
   },
   {
      "blobSum": "sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
   }
],

The Docker daemon does not store the blobs it downloaded from the registry. These are the compressed tar archives that are associated with the blobSum entries above. Because the daemon doesn't store the blobs, it has to recreate them when uploading them to a new repository. And when they're recreated, it's possible (even likely) that each host will generate a different checksum for the same layer. This is because Docker has to regenerate the compressed tar archive for the layer, and it's possible that each host has different file attributes for the layer's files (ctime, mtime, etc). Thus, when each host pushes what should be an identical layer, because the checksums differ, the registry is forced to accept and store a new copy of the same data.
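
To make the failure mode concrete, here is a minimal, self-contained Go sketch (not Docker's code; the file name and contents are invented) showing how archiving identical content with a different mtime yields a different compressed-blob digest:

// checksumdrift.go: illustrative only -- shows that a gzipped tar of identical
// file content gets a different SHA-256 digest when file metadata (mtime)
// differs between hosts. Error handling is elided for brevity.
package main

import (
    "archive/tar"
    "bytes"
    "compress/gzip"
    "crypto/sha256"
    "fmt"
    "time"
)

// blobDigest builds a gzipped tar containing a single file and returns the
// digest of the compressed bytes (the equivalent of a registry blobSum).
func blobDigest(mtime time.Time) string {
    var buf bytes.Buffer
    gz := gzip.NewWriter(&buf)
    tw := tar.NewWriter(gz)
    content := []byte("identical file content on every host\n")
    tw.WriteHeader(&tar.Header{
        Name:    "etc/example.conf",
        Mode:    0644,
        Size:    int64(len(content)),
        ModTime: mtime, // the only thing that differs between "hosts"
    })
    tw.Write(content)
    tw.Close()
    gz.Close()
    return fmt.Sprintf("sha256:%x", sha256.Sum256(buf.Bytes()))
}

func main() {
    // Same logical layer, different file timestamps -> different blob digests.
    fmt.Println(blobDigest(time.Unix(1434600000, 0)))
    fmt.Println(blobDigest(time.Unix(1434600001, 0)))
}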

The Docker daemon does cache the checksum of each layer it has pushed, and with #13945 it now caches the checksum of each layer on pull, which helps alleviate this issue slightly. However, because only a single checksum is stored per layer, if you push the layer to a repository that doesn't currently contain it, a new checksum is calculated, and if it differs from the cached value, the cached value is replaced.

Pushing a layer to a new repository requires the entire layer be uploaded

When Docker is pushing an image's layers to a repository, it first queries the registry to see whether each layer has already been linked into the repository. If not, the registry requires the complete layer to be uploaded, even if a blob with the same checksum already exists in the registry's storage. This is a security measure, to prevent someone from gaining access to a layer just by knowing its checksum.
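
For reference, that existence check is a HEAD request against the repository's blob endpoint in the v2 API. A rough sketch of the round trip (registry host and repository below are placeholders, and authentication is omitted):

// blobexists.go: a rough sketch of the per-repository existence check that
// happens before a layer is pushed. Registry URL, repository, and digest are
// placeholders; token auth is omitted.
package main

import (
    "fmt"
    "net/http"
)

// blobExists asks the registry whether a blob is already linked into repo.
// The v2 API answers HEAD /v2/<name>/blobs/<digest> with 200 if it is and
// 404 if it is not -- even if the same bytes already exist under another
// repository in the registry's storage.
func blobExists(registry, repo, digest string) (bool, error) {
    resp, err := http.Head(fmt.Sprintf("%s/v2/%s/blobs/%s", registry, repo, digest))
    if err != nil {
        return false, err
    }
    defer resp.Body.Close()
    return resp.StatusCode == http.StatusOK, nil
}

func main() {
    ok, err := blobExists("https://registry.example.com", "ncdc/centos",
        "sha256:f9587eba3ab8840fb621e02a7f6c53439f1e2a651fa95bc7185ad3081a2eb795")
    fmt.Println(ok, err)
}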

In build clusters, a common scenario is to download a base image, add 1 or more layers, and upload the resulting image to the registry. Let's say you have multiple users, each with his/her own image repository. Using the centos:centos7 example above, that means that the layer with v1 ID 41459f052977 must be uploaded in its (compressed, approximately 74MB) entirety to each repository, even though the registry already contains that layer. This is inefficient and slows down the build process.

A full summary of this problem and a proposed solution is in distribution/distribution#634.

Proposed solutions

Inconsistent layer checksums across hosts

The daemon should be able to reconstruct a blob that always has the exact same checksum as the blob it originally downloaded from the registry.

One approach is to store the blobs, unmodified, after they've been downloaded, and upload them when pushing. This has obvious storage impacts, as it means you're now storing a compressed copy of each layer along with the layer itself.

For another approach, @vbatts is working on code to incorporate his https://github.com/vbatts/tar-split library to store enough information about the original blob to be able to reconstruct it, without having to store the entire blob itself. I will link to his PR once he creates it.
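
As a rough illustration of the tar-split idea (this is not the Docker integration; function names are taken from the github.com/vbatts/tar-split packages as I understand them and may have changed):

// tarsplitdemo.go: record enough metadata while reading a tar stream to
// reassemble a byte-identical stream later. Package/function names are from
// github.com/vbatts/tar-split as of this writing and may have changed.
package main

import (
    "archive/tar"
    "bytes"
    "crypto/sha256"
    "fmt"
    "io"
    "io/ioutil"

    "github.com/vbatts/tar-split/tar/asm"
    "github.com/vbatts/tar-split/tar/storage"
)

// buildTar creates a tiny tar stream to stand in for an image layer.
func buildTar() []byte {
    var buf bytes.Buffer
    tw := tar.NewWriter(&buf)
    content := []byte("hello from a layer\n")
    tw.WriteHeader(&tar.Header{Name: "hello.txt", Mode: 0644, Size: int64(len(content))})
    tw.Write(content)
    tw.Close()
    return buf.Bytes()
}

func main() {
    original := buildTar()

    // Disassemble: headers, ordering, and padding go into JSON metadata. In the
    // daemon's case file payloads would come from the unpacked layer on disk;
    // this sketch keeps them in an in-memory FileGetPutter for simplicity.
    var metadata bytes.Buffer
    files := storage.NewBufferFileGetPutter()
    rdr, err := asm.NewInputTarStream(bytes.NewReader(original),
        storage.NewJSONPacker(&metadata), files)
    if err != nil {
        panic(err)
    }
    io.Copy(ioutil.Discard, rdr) // drain the stream so everything is recorded

    // Reassemble: the output stream is byte-identical to the original tar.
    // (Reproducing the exact compressed blob additionally requires the
    // recompression step to be deterministic, or its parameters to be
    // recorded as well.)
    out := asm.NewOutputTarStream(files, storage.NewJSONUnpacker(&metadata))
    rebuilt, _ := ioutil.ReadAll(out)
    fmt.Println(sha256.Sum256(original) == sha256.Sum256(rebuilt))
}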

With whichever approach is selected, we should probably:

  1. keep the current default behavior; i.e., only store the layers, not the blobs
  2. make it configurable to store the blobs and/or tar-split info
  3. provide some means for administrators to view and manage the stored blobs/tar-split info

Pushing a layer to a new repository requires the entire layer be uploaded

To summarize the proposed solution from distribution/distribution#634: add a new route to the registry to link a blob from 1 repository into another, provided the client can supply valid credentials with "pull" access to the source and "push" access to the target.

For example, if I:

  1. docker pull centos:centos7
  2. docker tag centos:centos7 ncdc/centos:centos7
  3. docker push ncdc/centos:centos7

When Docker attempts to push each layer, it would first ask the registry to link the layer from centos into ncdc/centos. If I have permission to pull from centos, this will succeed, saving me from having to upload layers that already exist in the registry's storage.
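
For illustration, here is a sketch of what such a cross-repository link request can look like over the v2 API, using the mount/from query-parameter form that the registry project ultimately adopted (the registry host is a placeholder, and the pull/push authorization exchange is omitted):

// crossrepomount.go: illustrative sketch of asking the registry to link an
// existing blob from a source repository into a target repository instead of
// uploading it again. Registry host is a placeholder and auth is omitted; a
// real client must present credentials with pull access on the source and
// push access on the target.
package main

import (
    "fmt"
    "net/http"
    "net/url"
)

// mountBlob returns true when the registry links the blob into targetRepo
// (201 Created). A 202 Accepted means the mount was not performed and the
// client should fall back to a normal full upload.
func mountBlob(registry, targetRepo, sourceRepo, digest string) (bool, error) {
    u := fmt.Sprintf("%s/v2/%s/blobs/uploads/?mount=%s&from=%s",
        registry, targetRepo, url.QueryEscape(digest), url.QueryEscape(sourceRepo))
    resp, err := http.Post(u, "", nil)
    if err != nil {
        return false, err
    }
    defer resp.Body.Close()
    return resp.StatusCode == http.StatusCreated, nil
}

func main() {
    linked, err := mountBlob("https://registry.example.com", "ncdc/centos", "centos",
        "sha256:f9587eba3ab8840fb621e02a7f6c53439f1e2a651fa95bc7185ad3081a2eb795")
    fmt.Println(linked, err)
}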

For full details, please see the linked issue.

Status

  • Store checksum on pull
  • Store blobs and/or tar-split metadata
  • Link layer from 1 repo to another

ncdc (Contributor, Author) commented Jun 18, 2015

@stevvooe @dmcgowan @vbatts

ncdc (Contributor, Author) commented Jun 18, 2015

@tiborvass PTAL

thaJeztah (Member) commented:

Thanks for describing this in such detail, @ncdc

ejholmes commented Jul 8, 2015

Just wanted to say thanks for attempting to address this. The overall slowness of docker build + docker push is a serious limitation for people trying to use Docker in production build systems.

ejholmes commented Jul 8, 2015

Might want to add #12489 to the list of problems.

jberkus commented Aug 11, 2015

Adding Issue #13309 to this tracker.

I've been able to reproduce that issue multiple times; it really seems like docker push checksums each MB as it's uploaded, resulting in N^2 upload times for large layers.

stevvooe (Contributor) commented:

@jberkus I commented on your observation in #13309. I don't think it is N^2 but there is a performance knee caused by disk buffering.

Fei-Guang commented:

Highly anticipated feature improvement.

LK4D4 (Contributor) commented Nov 24, 2015

@tonistiigi I wonder if this will be fixed with #17924

tonistiigi (Member) commented:

@LK4D4 #17924 stores multiple compressed blobs per layer and uses that storage for both pulling and pushing. It also makes tar-split mandatory, so tar files (read: layers) can never change with any docker command. Cross-repository push is mainly a registry feature and is not affected.

stevvooe (Contributor) commented:

@tonistiigi Some client-side improvements are still needed to let a layer be linked between two repositories. There was some work along these lines, but it has stalled out.

@ncdc What is the state of cross-repository push? At this point, the rest of these problems will be worked on for 1.10.

Soulou (Contributor) commented Nov 30, 2015

Also expecting this feature a lot. A lot of workflows are based on a common 'base image' onto which layers are added; without that, the registry:v2 is almost unusable.

jwhonce (Contributor) commented Nov 30, 2015

@stevvooe Yes, we're allocating people and planning on resuming this work.

stevvooe (Contributor) commented, quoting @Soulou:

Also expecting this feature a lot. A lot of workflows are based on a common 'base image' onto which layers are added; without that, the registry:v2 is almost unusable.

Please keep your commentary constructive.

stevvooe (Contributor) commented Dec 7, 2015

Store blobs and/or tar-split metadata

@ncdc Can we check this off? I am not sure that we are at a stable hash yet, but I know we have tar-split in place.

Link layer from 1 repo to another

What is the status of this item?

ncdc (Contributor, Author) commented Dec 7, 2015

@stevvooe box checked.

@jwhonce's team is now in charge of cross-repo pushing, so I will defer to him.

dmcgowan (Member) commented:

Status update: the distribution-side PR for cross-repository push is now in review (distribution/distribution#1269), with an engine-side PR to follow.

icecrime (Contributor) commented:

Cross repository push was shipped in Docker Engine 1.10 / Registry 2.3.

I think it's safe to close this issue now, thanks!
