Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

fix cancellation error not being detected and erroneously cached #2926

Merged
merged 1 commit into from
Jul 21, 2022

Conversation

tonistiigi
Copy link
Member

This should fix the EOF errors happening more frequently in the CI.

The problem is that when context is canceled in the middle of a fetch stream(grpc stream pushing to containerd), in certain conditions it ends with EOF errors instead of cancel error what actually triggered the case. This is deep in containerd libraries so can't be fixed here so instead we need to have more exception cases on checking the errors.

The errors are cached both in solver and in the image resolver so I combined them both to use same helper function.

Signed-off-by: Tonis Tiigi tonistiigi@gmail.com

Signed-off-by: Tonis Tiigi <tonistiigi@gmail.com>
// grpc does not set cancel correctly when stream gets cancelled and then Recv is called
if err != nil && ctx.Err() == context.Canceled {
// when this error comes from containerd it is not typed at all, just concatenated string
if strings.Contains(err.Error(), "EOF") {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Doesn’t seem robust.
Where is the source of this EOF

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

At least one of the places should be https://github.com/containerd/containerd/blob/main/content/proxy/content_writer.go#L44 but there might be more.

Should we ever match an error that is not a real cancellation error here it isn't a big issue. We have already verified that our own context was canceled so at least it was the user's intent. Matching non-cancel error in here means that the error does not get cached as we don't consider it a real error and another parallel process requesting the same result would need to try to recompute the result again.

@AkihiroSuda AkihiroSuda merged commit 23ab2d0 into moby:master Jul 21, 2022
charles-dyfis-net added a commit to charles-dyfis-net/oci-build-task that referenced this pull request Sep 27, 2022
Motivating change is moby/buildkit#2926, whereby errors were not correctly propagated.
charles-dyfis-net added a commit to charles-dyfis-net/oci-build-task that referenced this pull request Sep 27, 2022
Motivating change is moby/buildkit#2926, whereby errors were not correctly propagated.

Signed-off-by: Charles Duffy <charles@dyfis.net>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

2 participants