Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor AWS credential provider #75587

Merged
merged 5 commits into from Mar 29, 2019

Conversation

@tiffanyfay
Copy link
Contributor

commented Mar 22, 2019

What type of PR is this?
/kind cleanup

What this PR does / why we need it:

  • Adds init to aws cred provider for a bug fix
  • Removes excess ecrLazyProvide and moves main functionality into Provide. The next PR will remove LazyProvide completely as no provider is using it.
  • Uses ECR API's expiration instead of setting one globally
  • Doesn't cache if credentials for 12 hours if they aren't found which fixes a bug below
  • Breaks credential provider's dependency on cloud provider so cloud provider can be moved out of tree.
  • Creates only one provider. Also adds repoToPull to Provide interface.
  • AWS cred provider can also now get ECR credentials even without being on an EC2 instance -- it only needs AWS credentials to be set

Which issue(s) this PR fixes:

Fixes #37962
Fixes google/go-containerregistry#355
Needed for #69585

Special notes for your reviewer:
The Provide interface change to add repoToPull is needed for ECR to be able to get the registryID and region without using Cloud Provider. There are then used with the ECR API to get credentials. LazyProvide was added for AWS and no other provider uses it, nor are we using it anymore, so it will be removed in a future PR. This PR requires #75585 to be merged.

Does this PR introduce a user-facing change?:

The AWS credential provider can now obtain ECR credentials even without the AWS cloud provider or being on an EC2 instance. Additionally, AWS credential provider caching has been improved to honor the ECR credential timeout.

/sig aws
/sig cloud-provider
/cc @yastij @andrewsykim @mcrute @nckturner @cheftako @jonjohnsonjr
/assign @liggitt @justinsb
/milestone v1.15
/priority important-soon

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2019

@tiffanyfay: You must be a member of the kubernetes/kubernetes-milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your and have them propose you as an additional delegate for this responsibility.

In response to this:

What type of PR is this?
/kind cleanup

What this PR does / why we need it:

  • Adds init to aws cred provider for a bug fix
  • Removes excess ecrLazyProvide and moves main functionality into Provide. The next PR will remove LazyProvide completely as no provider is using it.
  • Uses ECR API's expiration instead of setting one globally
  • Doesn't cache if credentials for 12 hours if they aren't found which fixes a bug below
  • Breaks credential provider's dependency on cloud provider so cloud provider can be moved out of tree.
  • Creates only one provider. Also adds repoToPull to Provide interface.
  • AWS cred provider can also now get ECR credentials even without being on an EC2 instance -- it only needs AWS credentials to be set

Which issue(s) this PR fixes:

Fixes #37962
Fixes google/go-containerregistry#355

Special notes for your reviewer:
The Provide interface change to add repoToPull is needed for ECR to be able to get the registryID and region without using Cloud Provider. There are then used with the ECR API to get credentials. LazyProvide was added for AWS and no other provider uses it, nor are we using it anymore, so it will be removed in a future PR. This PR requires #75585 to be merged.

Does this PR introduce a user-facing change?:

The AWS credential provider can now obtain ECR credentials even without the AWS cloud provider or being on an EC2 instance. Additionally, AWS credential provider caching has been improved to honor the ECR credential timeout.

/sig aws
/sig cloud-provider
/cc @yastij @andrewsykim @mcrute @nckturner @cheftako @jonjohnsonjr
/assign @liggitt @justinsb
/milestone v1.15
/priority important-soon

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2019

@tiffanyfay: GitHub didn't allow me to request PR reviews from the following users: jonjohnsonjr.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

What type of PR is this?
/kind cleanup

What this PR does / why we need it:

  • Adds init to aws cred provider for a bug fix
  • Removes excess ecrLazyProvide and moves main functionality into Provide. The next PR will remove LazyProvide completely as no provider is using it.
  • Uses ECR API's expiration instead of setting one globally
  • Doesn't cache if credentials for 12 hours if they aren't found which fixes a bug below
  • Breaks credential provider's dependency on cloud provider so cloud provider can be moved out of tree.
  • Creates only one provider. Also adds repoToPull to Provide interface.
  • AWS cred provider can also now get ECR credentials even without being on an EC2 instance -- it only needs AWS credentials to be set

Which issue(s) this PR fixes:

Fixes #37962
Fixes google/go-containerregistry#355

Special notes for your reviewer:
The Provide interface change to add repoToPull is needed for ECR to be able to get the registryID and region without using Cloud Provider. There are then used with the ECR API to get credentials. LazyProvide was added for AWS and no other provider uses it, nor are we using it anymore, so it will be removed in a future PR. This PR requires #75585 to be merged.

Does this PR introduce a user-facing change?:

The AWS credential provider can now obtain ECR credentials even without the AWS cloud provider or being on an EC2 instance. Additionally, AWS credential provider caching has been improved to honor the ECR credential timeout.

/sig aws
/sig cloud-provider
/cc @yastij @andrewsykim @mcrute @nckturner @cheftako @jonjohnsonjr
/assign @liggitt @justinsb
/milestone v1.15
/priority important-soon

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Mar 22, 2019

Hi @tiffanyfay. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@tiffanyfay

This comment has been minimized.

Copy link
Contributor Author

commented Mar 22, 2019

/assign random-liu

@dims

This comment has been minimized.

Copy link
Member

commented Mar 22, 2019

/ok-to-test

@andrewsykim

This comment has been minimized.

Copy link
Member

commented Mar 22, 2019

Thanks for the PR @tiffanyfay! This is also needed as part of #69585

return nil, err
}
splitURL := strings.Split(parsed.Host, ".")
if len(splitURL) < 4 || splitURL[2] != "ecr" || splitURL[4] != "amazonaws" {

This comment has been minimized.

Copy link
@samuelkarp

samuelkarp Mar 22, 2019

splitURL[2] can also (validly) be ecr-fips. If it's ecr-fips, you should also override the ECR API endpoint. You can construct the correct endpoint with something like this:

resolver := endpoints.DefaultResolver()
endpoint, err := resolver.EndpointFor("ecr-fips", region, func(opts *endpoints.Options) {
		opts.ResolveUnknownService = true
})

and then you can pass the endpoint in when creating the API client.

This comment has been minimized.

Copy link
@tiffanyfay

tiffanyfay Mar 22, 2019

Author Contributor

Will make this as a separate PR, thanks!

@tiffanyfay

This comment has been minimized.

Copy link
Contributor Author

commented Mar 22, 2019

/retest

tiffanyfay added some commits Mar 22, 2019

Refactors and fixes bugs in AWS credentialprovider
Adds caching per registry. Fixes caching of invalid ECR tokens.

@tiffanyfay tiffanyfay force-pushed the tiffanyfay:cred-provider branch from 568556d to 0d63fa4 Mar 28, 2019

@k8s-ci-robot k8s-ci-robot removed the lgtm label Mar 28, 2019

@tiffanyfay

This comment has been minimized.

Copy link
Contributor Author

commented Mar 28, 2019

Thanks @liggitt ! Addressed them all.

@liggitt

This comment has been minimized.

Copy link
Member

commented Mar 28, 2019

structure lgtm

will defer to sig-aws reviews on AWS specifics and sufficient test coverage (does this actually get exercised in CI?)

@andrewsykim

This comment has been minimized.

Copy link
Member

commented Mar 28, 2019

/approve
/lgtm

I don't think there's e2e coverage for this. Some follow-up items before v1.15 goes out:

  • remove LazyProvide() in all providers - can this be removed in the keyring as well?
  • lots of testing - ideally add e2e tests for this, though this will be a bit more involved since we'll likely need CNCF to provision ECR repo for us

@justinsb PTAL :)

@mcrute

This comment has been minimized.

Copy link
Member

commented Mar 28, 2019

/lgtm
/approve

Also approved as a member of @kubernetes/sig-aws-misc

@liggitt

This comment has been minimized.

Copy link
Member

commented Mar 29, 2019

Given the parsed repoName is already being passed to keyring.Lookup(repoName) today, I think this PR is fine, but I'd like an ack from sig-node that the parsed repoName being passed to credentialprovider.Provide() is coherent with the idea of the kubelet being neutral about image name meaning (xref #58955).

/cc @derekwaynecarr @smarterclayton @mrunalp @Random-Liu @feiskyer

From my reading of this, the credential provider interface is oriented around docker image specs (DockerConfigProvider, DockerKeyring), so it's appropriate to assume docker image spec format here.

@smarterclayton

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

I am ok with passing repoName to credential provider, because credential provider assumes docker pull specs as supported by all standard CRI implementations.

One question - does this has performance implications since we're now having to load more providers?

@liggitt

This comment has been minimized.

Copy link
Member

commented Mar 29, 2019

One question - does this has performance implications since we're now having to load more providers?

It shouldn't, it's one additional provider, which does fast detection (in the Enabled() check), equivalent to the existing providers.

@liggitt

This comment has been minimized.

Copy link
Member

commented Mar 29, 2019

/lgtm
/approve

@k8s-ci-robot

This comment has been minimized.

Copy link
Contributor

commented Mar 29, 2019

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewsykim, liggitt, mcrute, tiffanyfay

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot merged commit dabeb20 into kubernetes:master Mar 29, 2019

17 checks passed

cla/linuxfoundation tiffanyfay authorized
Details
pull-kubernetes-bazel-build Job succeeded.
Details
pull-kubernetes-bazel-test Job succeeded.
Details
pull-kubernetes-conformance-image-test Skipped.
pull-kubernetes-cross Skipped.
pull-kubernetes-e2e-gce Job succeeded.
Details
pull-kubernetes-e2e-gce-100-performance Job succeeded.
Details
pull-kubernetes-e2e-gce-device-plugin-gpu Job succeeded.
Details
pull-kubernetes-godeps Skipped.
pull-kubernetes-integration Job succeeded.
Details
pull-kubernetes-kubemark-e2e-gce-big Job succeeded.
Details
pull-kubernetes-local-e2e Skipped.
pull-kubernetes-node-e2e Job succeeded.
Details
pull-kubernetes-typecheck Job succeeded.
Details
pull-kubernetes-verify Job succeeded.
Details
pull-publishing-bot-validate Skipped.
tide In merge pool.
Details
if len(splitURL) == 0 {
return nil, fmt.Errorf("%s is not a valid ECR repository URL", parsed.Hostname())
}
return &parsedURL{

This comment has been minimized.

Copy link
@tedyu

tedyu Mar 30, 2019

Contributor

Should there be a check on the number of parts in splitURL before accessing ?

f.mutex.Lock()
defer f.mutex.Unlock()

if getter, ok := f.cache[region]; ok {

This comment has been minimized.

Copy link
@tedyu

tedyu Mar 30, 2019

Contributor

It seems we can introduce RWLock and use read lock here, write lock for line 253.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
You can’t perform that action at this time.