Add handling for mfa_serial in ~/.aws/config #1359
- The default provider will simply log/return an error if an MFA token is requested, but users can plug their own ProvideMfaToken implementations (e.g. to read from stdin).
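A minimal sketch of the pluggable-provider idea in that commit message, assuming the trait hands back the token as a plain `String`. The names `MfaTokenFetchError`, `NoMfaTokenProvider`, `StdinMfaTokenProvider` and `read_token` are hypothetical illustrations, not the PR's actual code. Unlike `getpass` or `rpassword`, this std-only version does not suppress terminal echo.

```rust
use std::fmt::Debug;
use std::io::{self, BufRead, Write};

/// Hypothetical error type for this sketch; the PR's real error type may differ.
#[derive(Debug)]
pub enum MfaTokenFetchError {
    /// No provider was plugged in, so nothing can supply a token.
    NoProviderConfigured,
    /// Reading the token or writing the prompt failed.
    Io(io::Error),
    /// The user entered an empty line.
    Empty,
}

/// Assumed shape of the trait: given the `mfa_serial` from the profile,
/// return the current one-time code.
pub trait ProvideMfaToken: Send + Sync + Debug {
    fn mfa_token(&self, mfa_serial: &str) -> Result<String, MfaTokenFetchError>;
}

/// Default behaviour described above: log and return an error.
#[derive(Debug, Default)]
pub struct NoMfaTokenProvider;

impl ProvideMfaToken for NoMfaTokenProvider {
    fn mfa_token(&self, mfa_serial: &str) -> Result<String, MfaTokenFetchError> {
        eprintln!("MFA token requested for {mfa_serial}, but no provider is configured");
        Err(MfaTokenFetchError::NoProviderConfigured)
    }
}

/// Prompts on `prompt_out` and reads one line from `input`.
/// Generic over reader and writer so it can be exercised without a terminal.
pub fn read_token<R: BufRead, W: Write>(
    mfa_serial: &str,
    mut input: R,
    mut prompt_out: W,
) -> Result<String, MfaTokenFetchError> {
    write!(prompt_out, "Enter MFA code for {mfa_serial}: ").map_err(MfaTokenFetchError::Io)?;
    prompt_out.flush().map_err(MfaTokenFetchError::Io)?;
    let mut line = String::new();
    input.read_line(&mut line).map_err(MfaTokenFetchError::Io)?;
    let token = line.trim();
    if token.is_empty() {
        Err(MfaTokenFetchError::Empty)
    } else {
        Ok(token.to_string())
    }
}

/// User-supplied provider that reads the code from stdin (the input is echoed).
#[derive(Debug, Default)]
pub struct StdinMfaTokenProvider;

impl ProvideMfaToken for StdinMfaTokenProvider {
    fn mfa_token(&self, mfa_serial: &str) -> Result<String, MfaTokenFetchError> {
        // Prompt on stderr so stdout stays clean for piped output.
        read_token(mfa_serial, io::stdin().lock(), io::stderr())
    }
}
```

Calling `StdinMfaTokenProvider.mfa_token("<serial>")` blocks until the user types a line, which is the behaviour that interacts badly with the cache timeout discussed in the review.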
…er and ClientConfiguration The structs already have non-public fields so non_exhaustive is redundant
96134d5 to e1f14cb
Thanks a ton for contributing this! ❤️
The overall structure looks good to me. You raise a very good point about the lazy cache timeout potentially timing out the MFA user input use-case. We definitely need to think more about that, and maybe there is some structural change that needs to happen to ProvideCredentials to support this use-case (something the cache can be aware of and exclude from its timeout). I don't have any good suggestions there yet and need to mull on it for a bit.
is aws_config the right place for the new mfa_provider module?
This is a great question! The aws-types crate is meant to have as stable of an API as possible, while aws-config provides implementations for aws-types and is intended to be less stable. I think I would answer this question with: is the AssumeRoleProvider exposed in aws-types? If not, then the MFA provider should probably live in aws-config.
is aws_config::profile the right place?
Naively, it seems like it should go in the same place organizationally as the AssumeRoleProvider, but I'm not certain if there are other credential providers that have similar MFA features.
general naming stuff
This is always tough. In general, if it's related to the AssumeRoleProvider, we should take the names from STS's AssumeRole operation. However, those names are less meaningful in the context of the larger profile credentials provider, so we may want to name them differently there.
Thinking about the cache timeout issue more, I'm wondering if It seems like an application requiring MFA is going to need to be more aware of the overall MFA input process. For example, what is the expected behavior if the MFA token expires for the TTY input use-case? Would the user be prompted for another token? Would that interrupt something else in the TTY? Or thinking about a GUI application that wants to run an AWS command on the user's behalf. If a dialog prompt came up asking for MFA, and the user clicked cancel, would there need to be some way to communicate that cancellation action back so that the application can handle it accordingly rather than assuming some unrecoverable error occurred? Anyway, that's my brain dump for now. Will keep thinking about this.
I'm not sure I understand this comment:
I'm not wedded to it being async by any means, but I think maybe it's orthogonal to the rest of the issue? (I mean that the next part of the sentence, asking the user ahead of time, can work just as well if it stays async, I think?)
GetSessionToken also supports MFA.
I think we have a potential solution for the timeout issue. What we can do is modify the
    pub trait ProvideMfaToken: Send + Sync + Debug {
        fn mfa_token(&self, mfa_serial: &str) -> Result<MaybeMfaToken, MfaTokenFetchError>;
    }
With an enum that allows for later, timeout free, resolution of the MFA token:
    #[non_exhaustive]
    pub enum MaybeMfaToken {
        // The names definitely need some thinking...
        // This variant states that we believe the token will become available before the timeout
        NearTerm(future::ProvideMfaToken),
        // And this variant says we know the token likely won't be available before the timeout
        NotReady(future::WhenReady),
    }
When the The nice part about this approach is that it:
Downsides are:
I'm definitely open to more ideas, but I think overall we need an RFC to examine this deeper (if for no other reason than to answer what went into our decision making). If that's something you're interested in tackling, I'm happy to help out!
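To make the two-variant idea concrete, here is a rough synchronous stand-in, not the proposed design itself. It swaps the `future::` types from the comment above for `std::sync::mpsc` receivers so it runs without an async runtime; `resolve` and `ResolveError` are hypothetical names. The point it illustrates is only that the cache's load timeout applies to one variant and not the other.

```rust
use std::sync::mpsc::{Receiver, RecvTimeoutError};
use std::time::Duration;

/// Stand-in for the proposed enum; channel receivers replace the futures.
pub enum MaybeMfaToken {
    /// The provider expects the token to arrive within the load timeout.
    NearTerm(Receiver<String>),
    /// The provider expects to need longer (e.g. waiting on a human).
    NotReady(Receiver<String>),
}

/// Hypothetical error type for this sketch.
#[derive(Debug, PartialEq, Eq)]
pub enum ResolveError {
    /// The load timeout elapsed before a token arrived.
    TimedOut,
    /// The sending side went away without supplying a token,
    /// e.g. the user dismissed a prompt.
    Cancelled,
}

/// Resolves a token, applying `load_timeout` only to the `NearTerm` variant.
pub fn resolve(token: MaybeMfaToken, load_timeout: Duration) -> Result<String, ResolveError> {
    match token {
        MaybeMfaToken::NearTerm(rx) => rx.recv_timeout(load_timeout).map_err(|e| match e {
            RecvTimeoutError::Timeout => ResolveError::TimedOut,
            RecvTimeoutError::Disconnected => ResolveError::Cancelled,
        }),
        // No timeout here: wait until the token arrives or the sender is dropped.
        MaybeMfaToken::NotReady(rx) => rx.recv().map_err(|_| ResolveError::Cancelled),
    }
}
```

Mapping a dropped sender to a distinct `Cancelled` value is one possible answer to the earlier question about reporting a user's cancellation separately from an unrecoverable error.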
Hi @jdisanti , first of all thanks for the thoughtful response on this caching / timeouts issue. I had hoped to have a bit more time on this during the week but it doesn't look like that's panned out. I'm happy in theory to look at an RFC, just with the caveat that my latency on discussions is going to be quite high over the next couple of weeks - I don't think that's a huge deal since this wasn't being worked on before (?), but if folks are keen to make progress here then I don't want to be a bottleneck. I have two general thoughts on this topic before thinking harder about solutions though (I'm reaching for ways for it not to be a problem in the first place - I wonder whether we're over-complicating things?):
So, from the looks of it, the timeout in Overall, I do think the One missing piece in the "just widen the timeout on the cache" model is: the cache timeout is what's being configured with a call to |
So, setting aside the "just widen the timeout" idea in my previous for the time being, and engaging with the idea of having the I think the basic idea of using a boxed trait to back-channel information back up to the I wonder whether there's a way we can take your idea a step further; instead of:
Can we say directly that "some providers are special", and skip the timeout dance entirely, for certain providers?
This would involve a new trait, something like this:
    /// `ProvideCredentialsTimeoutCustomizer` implementations are not subject to the normal timeout
    /// rules imposed by the `LazyCachingCredentialsProvider`.
    pub(crate) trait ProvideCredentialsTimeoutCustomizer {
        /// If the provider returns its own timeout, this is what will be used in `LazyCachingCredentialsProvider`
        fn provider_timeout_override(&self) -> Option<Duration>;
    }
And then in
    let load_timeout = if let Some(timeout_customizer) = (loader.as_ref() as &dyn Any).downcast_ref::<dyn ProvideCredentialsTimeoutCustomizer>() {
        // Defer to the provider
        timeout_customizer.provider_timeout_override().unwrap_or(self.load_timeout)
    } else {
        // As before
        self.load_timeout
    };
The tricky part is that there's some plumbing required to make I guess one final possibility that doesn't require any downcasting hijinks is to add something like the If you let me know your thoughts, I'd be happy to take a stab at drafting an RFC some time over the next 2 weeks to bring all these three paths together into a coherent proposal. I find the angle that there's no timeout around this in boto quite persuasive, though presumably timeout was added to the cache layer for a good reason(!).
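One reading of the downcast-free alternative is a defaulted method on the credentials trait itself, which the cache consults before choosing its timeout. The sketch below assumes that reading. The trait here is a simplified synchronous stand-in rather than the SDK's `ProvideCredentials`, and `load_timeout_override`, `SketchCache`, `StaticProvider` and `InteractiveMfaProvider` are all hypothetical names.

```rust
use std::time::Duration;

/// Simplified, synchronous stand-in for a credentials trait.
pub trait ProvideCredentials {
    /// Returns credentials (a plain string in this sketch) or an error message.
    fn provide_credentials(&self) -> Result<String, String>;

    /// Hypothetical hook: providers that need a different timeout return
    /// `Some(..)`; everyone else inherits the cache's configured timeout.
    fn load_timeout_override(&self) -> Option<Duration> {
        None
    }
}

/// Ordinary provider: relies on the default (no override).
pub struct StaticProvider;

impl ProvideCredentials for StaticProvider {
    fn provide_credentials(&self) -> Result<String, String> {
        Ok("static-credentials".to_string())
    }
}

/// Provider that may wait on a human, so it asks for a longer timeout.
pub struct InteractiveMfaProvider;

impl ProvideCredentials for InteractiveMfaProvider {
    fn provide_credentials(&self) -> Result<String, String> {
        Ok("credentials-after-mfa".to_string())
    }

    fn load_timeout_override(&self) -> Option<Duration> {
        Some(Duration::from_secs(300))
    }
}

/// Minimal stand-in for a caching layer that wraps loads in a timeout.
pub struct SketchCache<P> {
    provider: P,
    load_timeout: Duration,
}

impl<P: ProvideCredentials> SketchCache<P> {
    pub fn new(provider: P, load_timeout: Duration) -> Self {
        Self { provider, load_timeout }
    }

    /// Defers to the provider's override, falling back to the cache setting.
    pub fn effective_timeout(&self) -> Duration {
        self.provider
            .load_timeout_override()
            .unwrap_or(self.load_timeout)
    }

    /// A real cache would wrap this call in `effective_timeout()`.
    pub fn load(&self) -> Result<String, String> {
        self.provider.provide_credentials()
    }
}
```

Because the hook has a default body, existing implementors would not need to change, but it does widen the public trait, which is the API cost weighed elsewhere in this thread.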
…token_handling
This path doesn't use the ProvideMfaToken trait, since parameters are expected to be provided to AssumeRoleProviderBuilder up-front (c.f. region is supplied directly, rather than as a ProvideRegion).
@tschuett thanks for the note. I'm not sure whether you're suggesting a change here, or just pointing out that one can use this API as-is, since it accepts |
There was a discussion whether AssumeRole is the only provider using MFA. The Rust people use it:
If every implementation of |
Ah, got it, thanks. As it happens, what I was trying to implement against the Rust SDK when I stumbled into this was extremely similar to that tool, so good to know!
If you put an MFA-protected bucket policy on your S3 bucket, then AssumeRole is insufficient.
@jelford - You raise some very good points, and the example of boto separating the concerns of caching and timeout is particularly intriguing. I'm definitely open to that approach so long as we figure out how that plays into the default credential provider chain and its configuration.
That was something I considered before arriving at the boxed back-channel error approach, but the API change to the
Yeah, I see what you mean. I spent a bit of time playing around with it in the Rust Playground and wasn't able to get from an
I don't recall there being a good reason, but @rcoh might know of one.
This sounds great! I think an RFC exploring the different solutions and their pros and cons will help a lot with deciding which we want to go with. I think the biggest snag with the separation of cache and timeout will be figuring out exactly how it works in the default chain. There are lots of questions that come to mind, such as: what are the defaults? Is the timeout configurable for each individual provider in the chain? Or does the timeout become an implementation detail of each individual provider? Are there API changes? And so on... Thanks for the great discussion so far!
Closing this as I'm not currently actively working on it. I still want to pursue MFA token support, but as discussed, there should probably be an RFC for any changes around how timeouts are handled, and I notice that RFC-0014 has landed since I was first looking at this a few months ago, so I think it's a good idea to pause and digest what's gone on there before proceeding. I would like to come back to this, but spare time to work on things outside of "work" is at a premium right now. I'd be happy to chat through if anyone is eager to make progress here and wants to pick it up.
Motivation and Context
Fixes: awslabs/aws-sdk-rust#527
Description
Adds trait and configuration machinery for supplying MFA tokens. The goal is to fit into the AssumeRole workflow in a similar way to what happens in boto3. The default provider will just log/return an error if an MFA token is requested, but users can plug their own ProvideMfaToken implementations (e.g. to read from stdin).
There are a few outstanding things I know about for the PR to be ready:
But I'd also like to get some feedback before addressing those on the general shape of the PR (I think it's about in line with what was discussed in awslabs/aws-sdk-rust#527), and I wasn't sure on a few code structure things - in particular:
- Is profile::credentials::exec::AssumeRoleProvider::credentials() the right place to be calling out to users? One thing that's a little awkward about this is that by the time we get to this point in the implementation, we're already being wrapped in a (rather short) timeout for the cache... which might be a bit of a smell, if we're then pausing for user input?
- Is aws_config the right place for the new mfa_provider module? I notice that ProvideCredentials and friends come from aws_types - but I'm not sure that the MFA code is really "shared" between services in a way that would qualify it for the aws_types crate?
- Is aws_config::profile the right place? Maybe it should go beneath profile::credentials? I'm not totally clear on the structure here.
- ProvideCredentials, is that about right?
One thing that I proposed in awslabs/aws-sdk-rust#527 was to add a default ProvideMfaToken implementation that would read the code from stdin, analogously to how boto uses getpass. @Velfi suggested over there that it would probably be worth an RFC for something like this, and having done a rough implementation for testing, I think they were right - there are a few choices to be made for implementation (I don't think Rust has a convenient getpass that we can reach for), so a more conservative approach seems appropriate. I would think it would be convenient at some point to have parity with other SDKs here, but this initial implementation at least gives us the ability to bring our own (and maybe gives the ecosystem space to play around and find the right / minimal way of doing it).
Here's a sample implementation of ProvideMfaToken that grabs the token from stdin using the atty and rpassword crates:
Testing
Cargo.toml overrides (assuming smithy-rs is checked out in a neighbouring repository): https://github.com/jelford/aws_mfa_token_provider. That's what I've been using for exploratory testing, and have verified that:
ProvideMfaToken implementation, a reasonably friendly error comes back to the user.
Checklist
CHANGELOG.next.toml if I made changes to the smithy-rs codegen or runtime crates
CHANGELOG.next.toml if I made changes to the AWS SDK, generated SDK code, or SDK runtime crates
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.