Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Refactor AWS credential provider #75587

Merged
merged 5 commits into from Mar 29, 2019

Conversation

tiffanyfay
Copy link
Contributor

@tiffanyfay tiffanyfay commented Mar 22, 2019

What type of PR is this?
/kind cleanup

What this PR does / why we need it:

  • Adds init to aws cred provider for a bug fix
  • Removes excess ecrLazyProvide and moves main functionality into Provide. The next PR will remove LazyProvide completely as no provider is using it.
  • Uses ECR API's expiration instead of setting one globally
  • Doesn't cache if credentials for 12 hours if they aren't found which fixes a bug below
  • Breaks credential provider's dependency on cloud provider so cloud provider can be moved out of tree.
  • Creates only one provider. Also adds repoToPull to Provide interface.
  • AWS cred provider can also now get ECR credentials even without being on an EC2 instance -- it only needs AWS credentials to be set

Which issue(s) this PR fixes:

Fixes #37962
Fixes google/go-containerregistry#355
Needed for #69585

Special notes for your reviewer:
The Provide interface change to add repoToPull is needed for ECR to be able to get the registryID and region without using Cloud Provider. There are then used with the ECR API to get credentials. LazyProvide was added for AWS and no other provider uses it, nor are we using it anymore, so it will be removed in a future PR. This PR requires #75585 to be merged.

Does this PR introduce a user-facing change?:

The AWS credential provider can now obtain ECR credentials even without the AWS cloud provider or being on an EC2 instance. Additionally, AWS credential provider caching has been improved to honor the ECR credential timeout.

/sig aws
/sig cloud-provider
/cc @yastij @andrewsykim @mcrute @nckturner @cheftako @jonjohnsonjr
/assign @liggitt @justinsb
/milestone v1.15
/priority important-soon

@k8s-ci-robot k8s-ci-robot added the release-note Denotes a PR that will be considered when it comes time to generate release notes. label Mar 22, 2019
@k8s-ci-robot
Copy link
Contributor

@tiffanyfay: You must be a member of the kubernetes/kubernetes-milestone-maintainers GitHub team to set the milestone. If you believe you should be able to issue the /milestone command, please contact your and have them propose you as an additional delegate for this responsibility.

In response to this:

What type of PR is this?
/kind cleanup

What this PR does / why we need it:

  • Adds init to aws cred provider for a bug fix
  • Removes excess ecrLazyProvide and moves main functionality into Provide. The next PR will remove LazyProvide completely as no provider is using it.
  • Uses ECR API's expiration instead of setting one globally
  • Doesn't cache if credentials for 12 hours if they aren't found which fixes a bug below
  • Breaks credential provider's dependency on cloud provider so cloud provider can be moved out of tree.
  • Creates only one provider. Also adds repoToPull to Provide interface.
  • AWS cred provider can also now get ECR credentials even without being on an EC2 instance -- it only needs AWS credentials to be set

Which issue(s) this PR fixes:

Fixes #37962
Fixes google/go-containerregistry#355

Special notes for your reviewer:
The Provide interface change to add repoToPull is needed for ECR to be able to get the registryID and region without using Cloud Provider. There are then used with the ECR API to get credentials. LazyProvide was added for AWS and no other provider uses it, nor are we using it anymore, so it will be removed in a future PR. This PR requires #75585 to be merged.

Does this PR introduce a user-facing change?:

The AWS credential provider can now obtain ECR credentials even without the AWS cloud provider or being on an EC2 instance. Additionally, AWS credential provider caching has been improved to honor the ECR credential timeout.

/sig aws
/sig cloud-provider
/cc @yastij @andrewsykim @mcrute @nckturner @cheftako @jonjohnsonjr
/assign @liggitt @justinsb
/milestone v1.15
/priority important-soon

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files. sig/aws sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. labels Mar 22, 2019
@k8s-ci-robot k8s-ci-robot added the priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. label Mar 22, 2019
@k8s-ci-robot k8s-ci-robot added the cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. label Mar 22, 2019
@k8s-ci-robot
Copy link
Contributor

@tiffanyfay: GitHub didn't allow me to request PR reviews from the following users: jonjohnsonjr.

Note that only kubernetes members and repo collaborators can review this PR, and authors cannot review their own PRs.

In response to this:

What type of PR is this?
/kind cleanup

What this PR does / why we need it:

  • Adds init to aws cred provider for a bug fix
  • Removes excess ecrLazyProvide and moves main functionality into Provide. The next PR will remove LazyProvide completely as no provider is using it.
  • Uses ECR API's expiration instead of setting one globally
  • Doesn't cache if credentials for 12 hours if they aren't found which fixes a bug below
  • Breaks credential provider's dependency on cloud provider so cloud provider can be moved out of tree.
  • Creates only one provider. Also adds repoToPull to Provide interface.
  • AWS cred provider can also now get ECR credentials even without being on an EC2 instance -- it only needs AWS credentials to be set

Which issue(s) this PR fixes:

Fixes #37962
Fixes google/go-containerregistry#355

Special notes for your reviewer:
The Provide interface change to add repoToPull is needed for ECR to be able to get the registryID and region without using Cloud Provider. There are then used with the ECR API to get credentials. LazyProvide was added for AWS and no other provider uses it, nor are we using it anymore, so it will be removed in a future PR. This PR requires #75585 to be merged.

Does this PR introduce a user-facing change?:

The AWS credential provider can now obtain ECR credentials even without the AWS cloud provider or being on an EC2 instance. Additionally, AWS credential provider caching has been improved to honor the ECR credential timeout.

/sig aws
/sig cloud-provider
/cc @yastij @andrewsykim @mcrute @nckturner @cheftako @jonjohnsonjr
/assign @liggitt @justinsb
/milestone v1.15
/priority important-soon

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added the needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. label Mar 22, 2019
@k8s-ci-robot
Copy link
Contributor

Hi @tiffanyfay. Thanks for your PR.

I'm waiting for a kubernetes member to verify that this patch is reasonable to test. If it is, they should reply with /ok-to-test on its own line. Until that is done, I will not automatically test new commits in this PR, but the usual testing commands by org members will still work. Regular contributors should join the org to skip this step.

Once the patch is verified, the new status will be reflected by the ok-to-test label.

I understand the commands that are listed here.

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

@k8s-ci-robot k8s-ci-robot added area/kubelet sig/node Categorizes an issue or PR as relevant to SIG Node. labels Mar 22, 2019
@tiffanyfay
Copy link
Contributor Author

/assign random-liu

@dims
Copy link
Member

dims commented Mar 22, 2019

/ok-to-test

@k8s-ci-robot k8s-ci-robot added ok-to-test Indicates a non-member PR verified by an org member that is safe to test. and removed needs-ok-to-test Indicates a PR that requires an org member to verify it is safe to test. labels Mar 22, 2019
@andrewsykim
Copy link
Member

Thanks for the PR @tiffanyfay! This is also needed as part of #69585

return nil, err
}
splitURL := strings.Split(parsed.Host, ".")
if len(splitURL) < 4 || splitURL[2] != "ecr" || splitURL[4] != "amazonaws" {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

splitURL[2] can also (validly) be ecr-fips. If it's ecr-fips, you should also override the ECR API endpoint. You can construct the correct endpoint with something like this:

resolver := endpoints.DefaultResolver()
endpoint, err := resolver.EndpointFor("ecr-fips", region, func(opts *endpoints.Options) {
		opts.ResolveUnknownService = true
})

and then you can pass the endpoint in when creating the API client.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Will make this as a separate PR, thanks!

@tiffanyfay
Copy link
Contributor Author

/retest

@liggitt
Copy link
Member

liggitt commented Mar 28, 2019

structure lgtm

will defer to sig-aws reviews on AWS specifics and sufficient test coverage (does this actually get exercised in CI?)

@andrewsykim
Copy link
Member

andrewsykim commented Mar 28, 2019

/approve
/lgtm

I don't think there's e2e coverage for this. Some follow-up items before v1.15 goes out:

  • remove LazyProvide() in all providers - can this be removed in the keyring as well?
  • lots of testing - ideally add e2e tests for this, though this will be a bit more involved since we'll likely need CNCF to provision ECR repo for us

@justinsb PTAL :)

@k8s-ci-robot k8s-ci-robot added the lgtm "Looks good to me", indicates that a PR is ready to be merged. label Mar 28, 2019
@mcrute
Copy link
Contributor

mcrute commented Mar 28, 2019

/lgtm
/approve

Also approved as a member of @kubernetes/sig-aws-misc

@liggitt
Copy link
Member

liggitt commented Mar 29, 2019

Given the parsed repoName is already being passed to keyring.Lookup(repoName) today, I think this PR is fine, but I'd like an ack from sig-node that the parsed repoName being passed to credentialprovider.Provide() is coherent with the idea of the kubelet being neutral about image name meaning (xref #58955).

/cc @derekwaynecarr @smarterclayton @mrunalp @Random-Liu @feiskyer

From my reading of this, the credential provider interface is oriented around docker image specs (DockerConfigProvider, DockerKeyring), so it's appropriate to assume docker image spec format here.

@smarterclayton
Copy link
Contributor

I am ok with passing repoName to credential provider, because credential provider assumes docker pull specs as supported by all standard CRI implementations.

One question - does this has performance implications since we're now having to load more providers?

@liggitt
Copy link
Member

liggitt commented Mar 29, 2019

One question - does this has performance implications since we're now having to load more providers?

It shouldn't, it's one additional provider, which does fast detection (in the Enabled() check), equivalent to the existing providers.

@liggitt
Copy link
Member

liggitt commented Mar 29, 2019

/lgtm
/approve

@k8s-ci-robot
Copy link
Contributor

[APPROVALNOTIFIER] This PR is APPROVED

This pull-request has been approved by: andrewsykim, liggitt, mcrute, tiffanyfay

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@k8s-ci-robot k8s-ci-robot added the approved Indicates a PR has been approved by an approver from all required OWNERS files. label Mar 29, 2019
@k8s-ci-robot k8s-ci-robot merged commit dabeb20 into kubernetes:master Mar 29, 2019
if len(splitURL) == 0 {
return nil, fmt.Errorf("%s is not a valid ECR repository URL", parsed.Hostname())
}
return &parsedURL{
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Should there be a check on the number of parts in splitURL before accessing ?

f.mutex.Lock()
defer f.mutex.Unlock()

if getter, ok := f.cache[region]; ok {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It seems we can introduce RWLock and use read lock here, write lock for line 253.

k8s-ci-robot added a commit that referenced this pull request Aug 7, 2019
…5585-#75587-upstream-release-1.14

Automated cherry pick of #75585: Updated client-go expiration cache to take in expiration #75587: Remove aws cred provider dep on cloud provider
@matthewdfuller
Copy link

@tiffanyfay - thanks for this change, it's super helpful to us to have this! Are there any docs or example use cases on where/how to pass the AWS credentials (assuming access key/secret) into a non-AWS-based Kubelet? Should they be environment variables?

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
approved Indicates a PR has been approved by an approver from all required OWNERS files. area/cloudprovider area/kubelet cncf-cla: yes Indicates the PR's author has signed the CNCF CLA. kind/cleanup Categorizes issue or PR as related to cleaning up code, process, or technical debt. lgtm "Looks good to me", indicates that a PR is ready to be merged. ok-to-test Indicates a non-member PR verified by an org member that is safe to test. priority/important-soon Must be staffed and worked on either currently, or very soon, ideally in time for the next release. release-note Denotes a PR that will be considered when it comes time to generate release notes. sig/cloud-provider Categorizes an issue or PR as relevant to SIG Cloud Provider. sig/node Categorizes an issue or PR as relevant to SIG Node. size/XL Denotes a PR that changes 500-999 lines, ignoring generated files.
Projects
None yet
Development

Successfully merging this pull request may close these issues.

k8schain doesn't work with ECR Kubelet doesn't retry if failed to get ECR auth token