'filesystem layer verification failed for digest' with latest 2016.03.a AMI #385

Closed
rodlogic opened this issue Apr 23, 2016 · 12 comments

Comments

@rodlogic

The task is failing, and /var/log/docker shows the following:

time="2016-04-23T13:34:19.137232584Z" level=info msg="POST /v1.17/images/create?fromImage=295240448163.dkr.ecr.us-east-1.amazonaws.com%2Fdocker-test%3A6.3.4"
time="2016-04-23T13:34:19.465342946Z" level=error msg="filesystem layer verification failed for digest sha256:af3cc4b92fa13ac06710e7bf75c5b7e57a069bebeb63df92b66648344c228f6b"
time="2016-04-23T13:34:19.497702490Z" level=error msg="filesystem layer verification failed for digest sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
time="2016-04-23T13:34:19.501345632Z" level=error msg="filesystem layer verification failed for digest sha256:658bc4dc70694b4f8005f9fc334bcf044001473ce3dc7c96dbfa5758590cd2c6"
time="2016-04-23T13:34:19.507415858Z" level=error msg="filesystem layer verification failed for digest sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
time="2016-04-23T13:34:19.510059285Z" level=error msg="filesystem layer verification failed for digest sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
time="2016-04-23T13:34:19.515344794Z" level=error msg="filesystem layer verification failed for digest sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d4"
time="2016-04-23T13:34:19.541650416Z" level=error msg="filesystem layer verification failed for digest sha256:d0034177ece9b3ca0d50856d5b28a5cd0454049f337bbd0d94436a0ff34de349"

The images were pushed from an Ubuntu instance using Docker 1.9.1.
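
For readers unfamiliar with the log message: the check it refers to compares the SHA-256 hash of the downloaded layer bytes with the digest listed for that layer. The following is a minimal sketch of that comparison, not Docker's actual code; the function names `layer_digest` and `verify_layer` are made up for illustration.

```python
import hashlib


def layer_digest(blob: bytes) -> str:
    """Return a registry-style digest string ("sha256:<hex>") for a layer blob."""
    return "sha256:" + hashlib.sha256(blob).hexdigest()


def verify_layer(blob: bytes, expected_digest: str) -> bool:
    """True only if the downloaded bytes hash to the expected digest."""
    return layer_digest(blob) == expected_digest


if __name__ == "__main__":
    # Hypothetical stand-in for a real gzipped layer tarball.
    complete = b"pretend this is a gzipped layer tarball"
    expected = layer_digest(complete)

    print(verify_layer(complete, expected))  # complete download
    print(verify_layer(b"", expected))       # empty or truncated download
```

Under this model, a download that silently comes back empty or truncated fails the comparison even though the image stored in the registry is intact.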

@rodlogic
Author

Digest verification fails when pulling from ECR but not when pulling from Docker Hub.

@rodlogic
Author

With 2015.09.g, I still can't pull but I see a different message:

[ec2-user@ip-10-11-23-254 ~]$ docker -D pull 295240447163.dkr.ecr.us-east-1.amazonaws.com/docker-ubuntu:16.04
16.04: Pulling from docker-ubuntu
a129d595b99b: Verifying Checksum
6b05dd74adf5: Pulling fs layer
ab6a00dd1821: Pulling fs layer
8af37f0d4b22: Pulling fs layer
4d7d3a0a4125: Pulling fs layer
a474e23bc922: Pulling fs layer
0b6fed317e16: Pulling fs layer
Pulling repository 295240447163.dkr.ecr.us-east-1.amazonaws.com/docker-ubuntu
Error: image docker-ubuntu:16.04 not found

@rodlogic
Author

And found the following in the AWS Forums: https://forums.aws.amazon.com/message.jspa?messageID=711336#711336

It seems that this is happening with Docker v1.9.x and below. Docker 1.9.1 is the default version that ships with the 2016.03.a AMI.

Inlining for your convenience:

I apologize for the issues you are encountering. 
We have investigated the reports of "Error: image <image_name:tag> not found" being returned from the Docker Client when attempting to pull images from ECR. 
In nearly all cases there is a corresponding error message of "filesystem layer verification failed for digest sha256:xxxxx" in the docker logs.
We have been able to reproduce this issue internally, and have confirmed that in all cases images are available in the ECR repository and the integrity of the layer is intact. Our findings conclude that these errors are present in Docker versions prior to 1.10. 
The failure is caused by an error earlier in the pull sequence that goes unchecked until layer verification occurs. In nearly all cases this error is caused by a network timeout in 'io.Copy' (https://github.com/docker/docker/blob/a34a1d598c6096ed8b5ce5219e77d68e5cd85462/graph/pull_v2.go#L213), which results in an empty layer getting passed to the verifier. A failed verification causes the image pull to fail. The Docker Remote API then returns a "404 - Not Found" to the Docker Client.
From our testing this occurred in ~0.2% of layer downloads resulting in negative effects on ~1.5% of our image downloads (using a 10 layer image). To mitigate effects on Docker 1.9 and lower, we recommend reducing the number of layers in your image if possible and retrying failed image downloads. We will update this thread with additional details and resolution as we have them.
If you are seeing this issue at a higher rate or with a different signature, we would love to get more information so that we can further assist.
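
The mitigation suggested in the quoted post (retry failed downloads, use fewer layers) can be sketched as follows. This is an illustration under stated assumptions, not an official AWS or Docker tool: the helper names, the three-attempt default, and the exponential backoff are all choices made here. The probability helper assumes layers fail independently; under that assumption a 0.2% per-layer failure rate over 10 layers gives roughly 2% failed pulls, the same order of magnitude as the ~1.5% quoted above.

```python
import subprocess
import time
from typing import Callable


def pull_failure_probability(per_layer_failure: float, layers: int) -> float:
    """Chance that at least one layer fails, assuming independent layer failures."""
    return 1.0 - (1.0 - per_layer_failure) ** layers


def retry(action: Callable[[], bool], attempts: int = 3, base_delay: float = 1.0,
          sleep: Callable[[float], None] = time.sleep) -> bool:
    """Run `action` until it returns True, doubling the delay between attempts."""
    for attempt in range(attempts):
        if action():
            return True
        if attempt < attempts - 1:  # no need to wait after the final attempt
            sleep(base_delay * (2 ** attempt))
    return False


def docker_pull(image: str) -> bool:
    """Return True if `docker pull IMAGE` exits with status 0."""
    return subprocess.run(["docker", "pull", image]).returncode == 0
```

A call such as `retry(lambda: docker_pull("<account-id>.dkr.ecr.us-east-1.amazonaws.com/<repo>:<tag>"))` would then attempt the pull up to three times. Because each retry is a fresh download, a transient timeout on one layer is unlikely to recur on every attempt.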

@rodlogic
Author

How can I install Docker 1.10.x or 1.11.x on the Amazon Linux ECS 2016.03.a AMI?

@aaithal
Contributor

aaithal commented May 2, 2016

@rodlogic We do not yet support Docker versions beyond 1.9.1 on the Amazon Linux ECS Optimized AMI. We will update this thread when higher versions are available.

@fxp0

fxp0 commented May 23, 2016

I had the same issue. It was fixed after I added my route table to the VPC S3 endpoint. Yes, it sounds odd, but after dozens of failed attempts it's finally working!
I found that I could pull images in some subnets but not in others. The logs showed the same errors:

level=error msg="Handler for POST /v1.21/containers/create returned error: No such image: 123456789000.dkr.ecr.us-east-1.amazonaws.com/example:latest"
level=error msg="HTTP Error" err="No such image: 123456789000.dkr.ecr.us-east-1.amazonaws.com/example:latest" statusCode=404
level=info msg="POST /v1.21/images/create?fromImage=123456789000.dkr.ecr.us-east-1.amazonaws.com%2Fexample&tag=latest"
level=error msg="filesystem layer verification failed for digest sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d5"
level=error msg="filesystem layer verification failed for digest sha256:a3ed95caeb02ffe68cdd9fd84406680ae93d633cb16422d00e8a7c22955b46d3"
level=error msg="filesystem layer verification failed for digest sha256:e3ef90ca74c5a427235319fa05d3c88b304e2acefb7460fc996e932ae2d479c1"
level=error msg="filesystem layer verification failed for digest sha256:d5474cbf707817254be9439378dd425a426eab1d89da74b0fe44891fe3ad6420"
level=error msg="filesystem layer verification failed for digest sha256:16f640cb193194f5d2add670c889485813afa52796d634bb744fbaf1a4797e9b"
level=error msg="filesystem layer verification failed for digest sha256:d195fadf623090c364199ee2c5b6cf8c725a7bcd2eba866e71c35c74d0327a10"
level=error msg="filesystem layer verification failed for digest sha256:6b43a32a04fc047a7ab6bbda5cdc8c7742ab77f7e3aaeda23997722905f6541e"
level=error msg="filesystem layer verification failed for digest sha256:a10cf901b4e3d70d3fbf6db3d7b7ffa8baf6545027c688d34015e45f23081d73"

Then I found just one difference between the subnets: the VPC S3 endpoint. After I added the route table of the subnet where "docker pull" failed to the VPC endpoint's list of route tables, the issue was fixed.
It's strange that the ECR service was affected, since VPC endpoints currently work only for S3 (according to the documentation).

Weird...
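
The fix described in this comment can be sketched as a small helper that finds route tables not yet associated with the S3 endpoint and builds the matching AWS CLI call. This is a hedged illustration: the helper names and all resource IDs are hypothetical, and the sketch only constructs the `aws ec2 modify-vpc-endpoint` command rather than running it. A plausible explanation for the behaviour is that ECR serves image layers from S3, so layer downloads follow the subnet's route to S3.

```python
from typing import Iterable, List


def missing_route_tables(subnet_route_tables: Iterable[str],
                         endpoint_route_tables: Iterable[str]) -> List[str]:
    """Route tables used by the subnets but not associated with the endpoint."""
    attached = set(endpoint_route_tables)
    return sorted(rt for rt in set(subnet_route_tables) if rt not in attached)


def add_route_tables_command(endpoint_id: str, route_table_ids: List[str]) -> List[str]:
    """Build the AWS CLI call that associates route tables with a VPC endpoint."""
    return ["aws", "ec2", "modify-vpc-endpoint",
            "--vpc-endpoint-id", endpoint_id,
            "--add-route-table-ids", *route_table_ids]


if __name__ == "__main__":
    # Hypothetical IDs for illustration only.
    todo = missing_route_tables(["rtb-aaa", "rtb-bbb"], ["rtb-aaa"])
    print(" ".join(add_route_tables_command("vpce-123", todo)))
```

Printing the command instead of executing it keeps the sketch safe to run; passing the list to `subprocess.run` would apply the change.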

@marcogallotta

We ran into this issue using Docker 1.10.1. Pulling the same image with Docker 1.11.2 worked fine. Deleting the local cache on the machine where we build images solved the problem for us.

@samuelkarp
Contributor

We released Docker 1.11.1 last week in the 2016.03.c version of the ECS-optimized AMI. @rodlogic, can you try the new AMI and let us know whether it works better?

@samuelkarp
Contributor

@rodlogic We haven't heard from you in a while, so I'm going to close this issue. Please let us know if you're still running into problems with Docker 1.11.1 and ECR.

@kenny-house

I just updated our cluster to the newest ECS-optimized AMI (agent version 1.10.0) after having trouble pulling images from ECR. After updating, I am unable to pull any images from ECR. This is immediately after boot, so it is unlikely to be a disk space issue. I'm going to try to gather more information.

I'm getting "filesystem layer verification failed for digest" consistently.

@msurdi

msurdi commented Jul 13, 2016

I'm running into this same issue too, on the latest official ECS-optimized AMI:

# docker pull *****.dkr.ecr.us-east-1.amazonaws.com/****/***:dev
dev: Pulling from ***/****
420890c9e918: Already exists 
0b0515b16c32: Already exists 
7b49d8cc8a09: Already exists 
4cb4eb2f12c4: Extracting [=======>                                           ] 7.373 MB/47.43 MB
6d1ae0be5225: Download complete 
0d34606bdea6: Download complete 
12be13d307f5: Download complete 
86de4355c1c9: Download complete 
03a66d1903c1: Download complete 
d302d8833f2f: Verifying Checksum 
bd2ea3d03f8b: Download complete 
8279f9c9cfd8: Download complete 
filesystem layer verification failed for digest sha256:d302d8833f2ff37136dcbcb986ac3cf028ea80a16e3083ab997007a866e11369
# docker --version
Docker version 1.11.1, build 5604cbe/

@jhovell

jhovell commented Jul 26, 2016

I'm also running into this issue occasionally (about 1% of the time) with the latest ECS-optimized AMI (2016.03.e).

From /var/log/docker:

time="2016-07-26T23:08:12.311521673Z" level=error msg="filesystem layer verification failed for digest sha256:864a98a84dd2bba52cf57d13161517ee01e2966e72c3ac842c6a3d49c07dcb37" 
time="2016-07-26T23:08:12.311561285Z" level=error msg="Download failed: filesystem layer verification failed for digest sha256:864a98a84dd2bba52cf57d13161517ee01e2966e72c3ac842c6a3d49c07dcb37" 
time="2016-07-26T23:08:25.678109684Z" level=error msg="Error trying v2 registry: filesystem layer verification failed for digest sha256:864a98a84dd2bba52cf57d13161517ee01e2966e72c3ac842c6a3d49c07dcb37" 
time="2016-07-26T23:08:25.678155231Z" level=error msg="Attempting next endpoint for pull after error: filesystem layer verification failed for digest sha256:864a98a84dd2bba52cf57d13161517ee01e2966e72c3ac842c6a3d49c07dcb37" 
time="2016-07-26T23:08:25.681163015Z" level=error msg="Handler for POST /v1.17/containers/create returned error: No such image: account-id.dkr.ecr.us-east-1.amazonaws.com/my-project:my-tag" 

Possibly related to this:

https://forums.aws.amazon.com/thread.jspa?threadID=227929

In my case it is a Java 8-based Docker image, but this issue was supposedly fixed in Docker 1.10.

More relevant to ECS: how do I health-check this state so I can terminate instances that get stuck here? The task shows as pending and gets restarted periodically, but if it lands on the same host again, it gets stuck in an endless loop.
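
One possible approach to the health-check question, sketched here as an assumption rather than a supported ECS feature, is to scan the Docker daemon log for repeated verification failures on the same digest and treat the host as stuck once a threshold is crossed. The function names and the threshold of 3 are arbitrary choices for illustration; what to do with an unhealthy host (for example, terminating the instance) is left to the operator.

```python
import re
from collections import Counter
from typing import Iterable

# Matches the primary error line seen in /var/log/docker in this thread.
FAILURE_RE = re.compile(
    r'msg="filesystem layer verification failed for digest (sha256:[0-9a-f]{64})"'
)


def verification_failures(log_lines: Iterable[str]) -> Counter:
    """Count verification failures per layer digest."""
    counts: Counter = Counter()
    for line in log_lines:
        match = FAILURE_RE.search(line)
        if match:
            counts[match.group(1)] += 1
    return counts


def looks_stuck(log_lines: Iterable[str], threshold: int = 3) -> bool:
    """True if any single digest failed verification at least `threshold` times."""
    counts = verification_failures(log_lines)
    return any(n >= threshold for n in counts.values())


if __name__ == "__main__":
    # Log path taken from the thread; adjust for other distributions.
    with open("/var/log/docker", errors="replace") as log:
        print("unhealthy" if looks_stuck(log) else "ok")
```

Counting per digest rather than in total is a deliberate choice: a single pull can log several different digests, while the endless loop described above would log the same digest again on every restart.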

jy19 added a commit to jy19/amazon-ecs-agent that referenced this issue Feb 14, 2020
for roadmap issue aws#385
aws/containers-roadmap#385
this commit adds the ability for customers to add parameters
to the secretsmanager ARN specified in containers. agent will be
able to retrieve secret by version or retrieve part of a secret
by json key.
this commit also fixes a minor issue breaking go vet in an unrelated test.
8 participants