Uploading image with large layers fails after push #15719

Closed
AdolfVonKleist opened this issue Sep 30, 2021 · 5 comments · Fixed by #16322
@AdolfVonKleist

AdolfVonKleist commented Sep 30, 2021

How can we help you?

We are having trouble pushing a 15GB+ container image whose largest single layer is 13.5GB.

We are running an instance of Harbor on EC2, on a t3a.small instance:

  • Harbor: Version v2.1.2-fcc6751d
  • Storage backend is AWS S3

The system has worked fine with all images to date, including several that are larger than 15GB in total size. Today we tried to push an image with a maximum single-layer size of 13.5GB, and it failed repeatedly. On the client side we only see:

Sep 29 22:29:19 backend-ci dockerd[934]: time="2021-09-29T22:29:19.628869277+02:00" level=error msg="Upload failed, retrying: blob upload unknown"
Sep 29 22:38:28 backend-ci dockerd[934]: time="2021-09-29T22:38:28.463305244+02:00" level=error msg="Upload failed, retrying: blob upload unknown"
Sep 29 22:47:49 backend-ci dockerd[934]: time="2021-09-29T22:47:49.120969007+02:00" level=error msg="Upload failed, retrying: blob upload unknown"
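
To make the retry pattern easier to read, here is a minimal sketch that computes the time between the three client-side failures. The timestamps are copied from the dockerd lines above, with the fractional seconds truncated to microseconds so that Python's `datetime.fromisoformat` accepts them; the variable names are illustrative only.

```python
from datetime import datetime

# Timestamps copied from the three dockerd log lines above.
# Fractional seconds are truncated from 9 digits to 6 (microseconds).
stamps = [
    "2021-09-29T22:29:19.628869+02:00",
    "2021-09-29T22:38:28.463305+02:00",
    "2021-09-29T22:47:49.120969+02:00",
]

times = [datetime.fromisoformat(s) for s in stamps]

# Seconds elapsed between each failure and the next one.
gaps = [(later - earlier).total_seconds() for earlier, later in zip(times, times[1:])]

for gap in gaps:
    print(f"{gap:.0f} s between failures ({gap / 60:.1f} min)")
```

Each attempt runs for roughly nine minutes before the error is logged, which fits a failure that occurs late in the upload rather than at the start. This is only an observation from three log lines, not a diagnosis.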

We are not running any additional proxy, and the Harbor nginx config is set as recommended:

client_max_body_size 0;
proxy_send_timeout 900;
proxy_read_timeout 900;

We are also running an instance of Clair, which I know you are dropping support for. It seems to quit intermittently, but that has not been an issue for any other push/pull activity.

NOTE: the push itself appears to complete; the client-side error only appears afterwards. I do see the following error in registry.log:

registry[885]: time="2021-09-30T08:47:25.330317861Z" level=error msg="upload resumed at wrong offest: 10485760000 != 12341008872"

Any thoughts or advice would be very welcome; perhaps others are experiencing similar issues with very large layers.

I think it is possible this is related to:

Possibly unrelated, but the spelling error in the error message led me to the following; our current Harbor release is using an older version of distribution:

perhaps it has been fixed in more recent versions of distribution.

@wy65701436 wy65701436 self-assigned this Oct 4, 2021
@wy65701436
Contributor

Can you try the same scenario with the native distribution? Alternatively, we could replace the binary with the latest distribution and rebuild the harbor-registry image. I know there were some S3 fixes in distribution main; I'll go through them.

@AdolfVonKleist
Author

@wy65701436 thanks for your reply. Can you clarify what you mean by 'native distribution'?

@karamba-brgs

Hi,
We have the same issue with Harbor version 2.3.3.

To verify that it was an S3 issue, I changed the Harbor storage values from

type: s3
filesystem:
  rootdirectory: /storage
s3:
  region: eu-west-1
  bucket: name-of-bucket

to

type: filesystem
filesystem:
  rootdirectory: /var/lib/registry
s3:
  # region: eu-west-1
  # bucket: name-of-bucket

This test works fine: we can upload large images with large layers. However, our storage target is S3.

I then switched the storage values back to S3 and tested the latest Harbor release, 2.4.0-rc1, and hit the same issue (level=error msg="response completed with error" auth.user.name="harbor_registry_user" err.code="blob upload invalid" and level=error msg="upload resumed at wrong offest: 10485760000 != 12912793630").

What do you suggest?
Thank you for your help.

Sylvain

@AdolfVonKleist
Author

Any chance there is an update here, or further information, @wy65701436?

@tiezhuoyu

Actually, it's a bug in the Docker registry. More details here: "Fixes max layer size of 10GB bug".

A temporary workaround is to build a registry image with the patch.
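
The offsets in the two registry log lines quoted in this thread can be checked against that 10GB figure with some quick arithmetic. The sketch below is a sanity check, not a confirmed root cause: the 10 MiB chunk size (the registry S3 driver's documented default) and the 1000-part page size (the S3 ListParts maximum per response) are assumptions that nobody in the thread states, and the constant names are made up for illustration.

```python
MIB = 1024 * 1024

# (resumed_at, expected) offsets from the two
# "upload resumed at wrong offest" log lines quoted in this thread.
reported = [
    (10485760000, 12341008872),
    (10485760000, 12912793630),
]

# ASSUMPTIONS, not taken from the thread:
ASSUMED_CHUNK_SIZE = 10 * MIB    # assumed multipart chunk size
ASSUMED_PARTS_PER_PAGE = 1000    # assumed parts returned by one ListParts call

# Bytes covered by a single page of parts under those assumptions.
first_page_bytes = ASSUMED_CHUNK_SIZE * ASSUMED_PARTS_PER_PAGE

# Bytes the registry expected but did not see at resume time.
missing = [expected - resumed_at for resumed_at, expected in reported]

for (resumed_at, expected), gap in zip(reported, missing):
    matches = resumed_at == first_page_bytes
    print(f"resumed at {resumed_at}, expected {expected}, "
          f"missing {gap} bytes, equals one page of parts: {matches}")
```

Under those assumptions, the offset the registry resumed at in both reports is exactly one page of parts, which fits an upload whose parts beyond the first page were not counted. That would also explain why layers below roughly 10GB push without trouble.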
