Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Add cache functionality for tokens #140

Closed
wants to merge 1 commit into from

Conversation

@ArdaXi
Copy link

@ArdaXi ArdaXi commented Aug 30, 2018

As we have been running into the same issue as #99 and I hadn't seen any progress on that issue, I took a stab at it myself. This adds optional cache functionality that keeps the token valid for an hour and stores it in the cache directory.

This means we no longer have to provide a new MFA token for every single kubectl invocation.

@k8s-ci-robot
Copy link
Contributor

@k8s-ci-robot k8s-ci-robot commented Aug 30, 2018

[APPROVALNOTIFIER] This PR is NOT APPROVED

This pull-request has been approved by: ArdaXi
To fully approve this pull request, please assign additional approvers.
We suggest the following additional approver: nckturner

If they are not already assigned, you can assign the PR to them by writing /assign @nckturner in a comment when ready.

The full list of commands accepted by this bot can be found here.

The pull request process is described here

Needs approval from an approver in each of these files:

Approvers can indicate their approval by writing /approve in a comment
Approvers can cancel approval by writing /approve cancel in a comment

@ArdaXi ArdaXi force-pushed the ArdaXi:add-cache branch 2 times, most recently from a71768f to fdcfc00 Sep 6, 2018
@chrisanthropic
Copy link

@chrisanthropic chrisanthropic commented Sep 24, 2018

This PR is super exciting, my team really wants to stop being prompted for their MFA token for every helm and kubectl command.

@daraacca
Copy link

@daraacca daraacca commented Sep 26, 2018

Really looking forward to this, will solve some headaches!

Copy link
Contributor

@nckturner nckturner left a comment

I think having a cache for tokens generated by assumed-roles is reasonable. Do you have any measurements of how much time we would be saving with this cache? The goal is to save the time it takes to assume the role, is that correct? EDIT: Of course, not getting prompted to enter MFA credentials for every kubectl call is a great use case.

// generate an sts:GetCallerIdentity request and add our custom cluster ID header
request, _ := stsAPI.GetCallerIdentityRequest(&sts.GetCallerIdentityInput{})
request.HTTPRequest.Header.Add(clusterIDHeader, clusterID)

expiryTime := time.Minute

// if we're using a cache, keep the token valid for an hour (like STS does by default)

This comment has been minimized.

@nckturner

nckturner Oct 16, 2018
Contributor

STS by default generates temporary credentials valid for an hour, but according to some brief testing I did pre-signed STS urls (which is what our token is) are always valid for 15 minutes.

@@ -210,23 +266,56 @@ func (g generator) GetWithRoleForSession(clusterID string, roleARN string, sess
stsAPI = sts.New(sess, &aws.Config{Credentials: creds})
}

return g.GetWithSTS(clusterID, stsAPI)
return g.GetWithSTS(clusterID, roleARN, stsAPI)

This comment has been minimized.

@nckturner

nckturner Oct 16, 2018
Contributor

What about the case where roleARN is empty?

This comment has been minimized.

@ArdaXi

ArdaXi Oct 23, 2018
Author

At that stage, the roleARN is only used to determine the filename for the cache file. If it's empty, the corresponding filename segment will be empty as well, which will produce expected behaviour. That is, if a roleARN is specified, only a cache with that roleARN will be used, if not, only a cache without a roleARN will be used.

This comment has been minimized.

@nckturner

nckturner Dec 9, 2018
Contributor

Makes sense. Although, someone could use multiple IAM users without a role, on the same cluster, and then get the same token, correct?

This comment has been minimized.

@ArdaXi

ArdaXi Dec 9, 2018
Author

If someone changes their credentials without assuming a role with the same clusterID, they'll get the cache for the former, yes. I'm not entirely sure how this could be fixed easily, since you'd need to know before making any calls to AWS what your calling identity is if you want to avoid that.

}

func getCacheFile(clusterID string, roleARN string) (string, error) {
cacheDir := os.Getenv("XDG_CACHE_HOME")

This comment has been minimized.

@nckturner

nckturner Oct 16, 2018
Contributor

What about caching defaults for macOS and windows?

This comment has been minimized.

@ArdaXi

ArdaXi Oct 23, 2018
Author

Ideally this would be handled by an external dependency like https://github.com/casimir/xdg-go or https://github.com/OpenPeeDeeP/xdg or https://github.com/zchee/go-xdgbasedir but I'd like to get some feedback before introducing a new dependency.

This comment has been minimized.

@nckturner

nckturner Dec 9, 2018
Contributor

I think its reasonable to use a dependency to help us use customary default locations for placing the cache file (or just look at the locations they use and avoid the dependency).

This comment has been minimized.

@ArdaXi

ArdaXi Dec 9, 2018
Author

The main reason this would require a dependency (or at least one would be preferred) is because this is usually done by choosing the paths to include at compile time. If there's no preference I'll just pick one and update accordingly.

@@ -122,6 +122,26 @@ type getCallerIdentityWrapper struct {
} `json:"GetCallerIdentityResponse"`
}

// this structure is saved to cache for future invocations
type cachedToken struct {

This comment has been minimized.

@nckturner

nckturner Oct 16, 2018
Contributor

Just use the ExecCredential struct, it has an ExpirationTimestamp and Token.


// if we're using a cache, keep the token valid for an hour (like STS does by default)
if g.useCache {
expiryTime = time.Hour

This comment has been minimized.

@nckturner

nckturner Oct 16, 2018
Contributor

This shouldn't change because STS ignores it, unfortunately. See my PR which adds an expiration to the token response (#160). You can rebase once that gets merged?

return v1Prefix + base64.RawURLEncoding.EncodeToString([]byte(presignedURLString)), nil
tok := v1Prefix + base64.RawURLEncoding.EncodeToString([]byte(presignedURLString))

if g.useCache {

This comment has been minimized.

@nckturner

nckturner Oct 16, 2018
Contributor

Since I don't think it makes sense to cache tokens generated from users, only assumed roles, there should be a check that useCache is only set when a role is passed.

This comment has been minimized.

@ArdaXi

ArdaXi Oct 23, 2018
Author

Personally I believe that caching tokens generated from users does make sense in the MFA case, because even if you don't assume a role you're going to be prompted for credentials, and we'd like to avoid having to do that for every kubectl call. Either way having it be a command-line argument should make it plenty flexible and I'd like to avoid magic behaviour based on something other than that option.

This comment has been minimized.

@nckturner

nckturner Dec 9, 2018
Contributor

Makes sense.

@ArdaXi ArdaXi force-pushed the ArdaXi:add-cache branch from fdcfc00 to 9d745ec Oct 23, 2018
@esselius
Copy link

@esselius esselius commented Dec 3, 2018

Any progress with this?

@ArdaXi
Copy link
Author

@ArdaXi ArdaXi commented Dec 8, 2018

I believe I've addressed everything I can, I'm just waiting for feedback.

Copy link
Contributor

@nckturner nckturner left a comment

Thanks for your patience, a couple more comments. I'll get back to it more quickly for the next review, (though maybe after Kubecon).

return v1Prefix + base64.RawURLEncoding.EncodeToString([]byte(presignedURLString)), nil
tok := v1Prefix + base64.RawURLEncoding.EncodeToString([]byte(presignedURLString))

if g.useCache {

This comment has been minimized.

@nckturner

nckturner Dec 9, 2018
Contributor

Makes sense.

@@ -210,23 +266,56 @@ func (g generator) GetWithRoleForSession(clusterID string, roleARN string, sess
stsAPI = sts.New(sess, &aws.Config{Credentials: creds})
}

return g.GetWithSTS(clusterID, stsAPI)
return g.GetWithSTS(clusterID, roleARN, stsAPI)

This comment has been minimized.

@nckturner

nckturner Dec 9, 2018
Contributor

Makes sense. Although, someone could use multiple IAM users without a role, on the same cluster, and then get the same token, correct?

@llamahunter
Copy link
Contributor

@llamahunter llamahunter commented Jan 3, 2019

I have a version of this, coming soon, that uses the actual credential expiration time from aws. Waiting for aws-sdk-go people to take the necessary code change there to make it happen. Ah, they took the pull. aws/aws-sdk-go#2375

@llamahunter
Copy link
Contributor

@llamahunter llamahunter commented Jan 3, 2019

Looking at this implementation, it seems to me the caching is being done at the wrong spot? The expensive part, at least in my experience, is doing the sts.GetCallerIdentity() when you have to go through an external credential_process to do SSO stuff. Also, the SSO credentials have configurable expiration times of much longer than 15 minutes (default is 1 hour, but can be configured up to 12 hours). My caching solution (coming as soon as #182 is accepted) caches the underlying credential provider value for as long as AWS says it is valid.

@therealgambo
Copy link

@therealgambo therealgambo commented Feb 3, 2019

Any chance of merging this, #193 or both any time soon?

@genebean
Copy link

@genebean genebean commented Mar 7, 2019

Looks like #193 just got some updated comments so maybe there is a fix coming soon :)

@solacens
Copy link

@solacens solacens commented Apr 18, 2019

Any chance of merging this, #193 or both any time soon?

Looks like #193 just got some updated comments so maybe there is a fix coming soon :)

Looks like #193 is caching AWS credential instead of the STS token.

Also, I suggest that the flag --cache shouldn't be used as it's a completely different feature.

I'm wondering if this feature will continue or not.

Temporary solution here.

@fejta-bot
Copy link

@fejta-bot fejta-bot commented Jul 17, 2019

Issues go stale after 90d of inactivity.
Mark the issue as fresh with /remove-lifecycle stale.
Stale issues rot after an additional 30d of inactivity and eventually close.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle stale

@fejta-bot
Copy link

@fejta-bot fejta-bot commented Aug 16, 2019

Stale issues rot after 30d of inactivity.
Mark the issue as fresh with /remove-lifecycle rotten.
Rotten issues close after an additional 30d of inactivity.

If this issue is safe to close now please do so with /close.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/lifecycle rotten

@fejta-bot
Copy link

@fejta-bot fejta-bot commented Sep 15, 2019

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

@k8s-ci-robot
Copy link
Contributor

@k8s-ci-robot k8s-ci-robot commented Sep 15, 2019

@fejta-bot: Closed this PR.

In response to this:

Rotten issues close after 30d of inactivity.
Reopen the issue with /reopen.
Mark the issue as fresh with /remove-lifecycle rotten.

Send feedback to sig-testing, kubernetes/test-infra and/or fejta.
/close

Instructions for interacting with me using PR comments are available here. If you have questions or suggestions related to my behavior, please file an issue against the kubernetes/test-infra repository.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Linked issues

Successfully merging this pull request may close these issues.

None yet