New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
fix(distribution): digest cache should not be moved if it was an auth #36509
Conversation
ping @dmcgowan @tonistiigi PTAL |
ping @dmcgowan |
distribution/push_v2.go
Outdated
@@ -580,14 +600,14 @@ func getMaxMountAndExistenceCheckAttempts(layer PushLayer) (maxMountAttempts, ma | |||
// then fallback to upload | |||
return 4, 3, true | |||
|
|||
// middle sized blobs; if we could not get the size, assume we deal with middle sized blob | |||
// middle sized blobs; if we could not get the size, assume we deal with middle sized blob |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should not have the extra indentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
sorry for my ide auto format the code
distribution/push_v2.go
Outdated
case size > smallLayerMaximumSize, err != nil: | ||
// 1st attempt to mount blobs of average size few times | ||
// 2nd try at most 1 existence check if there's an existing mapping to the target repository | ||
// then fallback to upload | ||
return 3, 1, false | ||
|
||
// small blobs, do a minimum number of checks | ||
// small blobs, do a minimum number of checks |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This comment should not have the extra indentation.
distribution/push_v2_test.go
Outdated
"github.com/docker/docker/distribution/metadata" | ||
"github.com/docker/docker/layer" | ||
"github.com/docker/docker/pkg/progress" | ||
"github.com/opencontainers/go-digest" | ||
"github.com/docker/docker/api/types" | ||
"github.com/docker/docker/registry" | ||
"net/url" |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Put this above with "net/http".
distribution/push_v2.go
Outdated
@@ -397,8 +416,9 @@ func (pd *v2PushDescriptor) Upload(ctx context.Context, progressOutput progress. | |||
return distribution.Descriptor{}, retryOnError(err) | |||
} | |||
} | |||
defer layerUpload.Close() | |||
|
|||
if layerUpload != nil { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It doesn't seem right to allow layerUpload
to be nil and then immediately afterwards call pd.uploadUsingSession
with layerUpload
. Why would layerUpload
be nil when bs.Create
returned no error a few lines above?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yes, this is paranoid, when I test the code with mock method I thought that may be the case when error is empty while layerUpload is nil too, and that should not happened
distribution/push_v2.go
Outdated
switch e := e.(type) { | ||
case errcode.Error: | ||
if e.Code == errcode.ErrorCodeUnauthorized { | ||
logrus.Debugf("error code is unauthoriedError") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
unauthoriedError
is misspelled.
distribution/push_v2.go
Outdated
if isUnauthorizedError && !pd.pushState.hasAuthInfo { | ||
// break the attempt loop as user has not login | ||
logrus.Infoln("failed to push image to registry because unauthorized error and user not login") | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this should continue instead of break. There might be other repos which do not require credentials to mount the blob. The continue can go inside the error checking code, instead of setting an isUnauthorizedError
variable (it would mean having a label on the candidate loop and using continue candidateLoop
).
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is modified as you suggested
Codecov Report
@@ Coverage Diff @@
## master #36509 +/- ##
=========================================
Coverage ? 34.82%
=========================================
Files ? 612
Lines ? 45412
Branches ? 0
=========================================
Hits ? 15814
Misses ? 27524
Partials ? 2074 |
PowerPC failing is due to a flaky test; https://jenkins.dockerproject.org/job/Docker-PRs-powerpc/8973/console
I see that one is being fixed through #36569 |
distribution/push_v2.go
Outdated
if e.Code == errcode.ErrorCodeUnauthorized && !pd.pushState.hasAuthInfo { | ||
// continue to try other candidate as other repos may not require credentials to mount the blob | ||
logrus.Debugf("failed to push layer to registry because unauthorized error and user not login") | ||
continue candidateLoop |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Does the layer upload need to be cancelled here? Maybe instead of a continue statement just ensure that the remove doesn't happen so there are fewer early exit/continue conditions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
as @aaronlehmann mentioned
I think this should continue instead of break. There might be other repos which do not require credentials to mount the blob. The continue can go inside the error checking code, instead of setting an isUnauthorizedError variable (it would mean having a label on the candidate loop and using continue candidateLoop).
is that the reason when authorizedError happened, it should continue to check other candidates?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the main difference is do we break here when first authorizedError met or continue to try other candidates
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do we have some conclusion on this so that I can have another commit before this weekend? @dmcgowan @aaronlehmann : )
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmcgowan is right that this needs to cancel the layer upload.
How about adding && !isUnauthorizedError
to if len(mountCandidate.SourceRepository) > 0...
below, instead of the continue?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
How about adding && !isUnauthorizedError to if len(mountCandidate.SourceRepository) > 0... below, instead of the continue?
it looks like it will still continue to try next candidate which is not necessary?
it looks like it will still continue to try next candidate which is not necessary?
I think it's a good idea to try the next candidate, but don't feel strongly.
|
error local digest cache will be removed when error occured on push image but it should not be removed if it is an auth error while on auth was provided moby#36309 Signed-off-by: 慕陶 <jihui.xjh@alibaba-inc.com>
if len(mountCandidate.SourceRepository) > 0 && | ||
!(isUnauthorizedError && !pd.pushState.hasAuthInfo) && |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmcgowan @aaronlehmann please have a look at this commit, hope that makes you happy
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@dmcgowan let me know if you have any concern with this commit.
can we proceed this PR...? 🤓 |
LGTM |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM (thanks for reviewing @aaronlehmann @dmcgowan)
janky status seems not updated, actually it has finished |
|
fixes #36309
- How I did it
when push mount candidate to registry, first check if error is autherror, then check if user has authConfig
if both autherror while user has not login, do not remove the local cache.
- How to verify it
push the image without login, and login again, you still can mount the shared layer
- Description for the changelog
- A picture of a cute animal (not mandatory but encouraged)