-
Notifications
You must be signed in to change notification settings - Fork 472
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
JWT workload API extensions #616
Conversation
- adds CLI to fetch and validate JWT SVIDs - adds an "issued at" claim to server minted JWTs - extends node API to convey "issued at" so the agent can cache appropriately w/o having to parse the JWT. - JWT lifetime is currently hard-coded to five minutes w/ a renewal time of 2.5 minutes. Signed-off-by: Andrew Harding <azdagron@gmail.com>
cmd/spire-agent/cli/api/common.go
Outdated
return net.DialTimeout("unix", addr, timeout) | ||
} | ||
// Workload API is unauthenticated | ||
conn, err := grpc.Dial(socketPath, grpc.WithInsecure(), grpc.WithDialer(dialer)) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We should probably put this in /api
directory so folks consuming the workload api client library can benefit from these updates. In general, it's good practice to build the CLI tooling on top of our own client libraries
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
you're talking about the dialing details? sure, i can put them in some helper in /api/workload.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
hmm, turns out there is already an /api/workload/dial package with Dial functionality. I'll switch to it.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or are you talking about making a high level JWTClient or something?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Yes and yes :)
The code in CLI package should be relatively minimal, a thin interaction layer that sits on top of the client library. IMO, having a lot of logic that lives in the CLI code indicates that the client library is lacking in some way
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Are you ok with this work being in a follow up PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sure, let's make an issue though so we don't forget to do it
|
||
"github.com/mitchellh/cli" | ||
"github.com/spiffe/spire/proto/api/workload" | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
spacing?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
|
||
func (c *validateJWTCommand) appendFlags(fs *flag.FlagSet) { | ||
fs.Var(&c.audience, "audience", "comma separated list of audience values") | ||
fs.StringVar(&c.svid, "svid", "", "JWT SVID") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
should the token be modeled as a flag like this, or as the last argument?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
for consistency sake, i'd say a flag. we have a lot of required flags across our various CLI's that are modeled in this way...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SGTM
|
||
resp, err := c.validateJWTSVID(ctx, client) | ||
if err != nil { | ||
return fmt.Errorf("unable to validate JWT SVID: %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
we should distinguish between "we ran into an error" and "this jwt is not valid"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
so i can see that happening in two ways: 1) check for some explicit gRPC error which makes the coupling tight between client and server, or 2) ValidateJWTSVIDResponse gains a bool valid
field (or we could treat an empty spiffe_id field as "not valid", but i prefer the boolean).
I'm leaning towards number #2, I don't like relying on the gRPC transport to deliver "failure".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
SIG-Spec discussed this at one point, and the consensus was that number one was the way to go for safety reasons (it is easy to get a false positive if a client forgets to check the value of the valid
bool).
We are missing text in the current Workload API extensions document that describes this, as well as what error code should be used. Looking back through SIG-Spec notes, this was discussed on the August 23rd call, and the result was "use InvalidArgument with an appropriate description attached"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
ok, i'll go with InvalidArgument
@@ -187,9 +187,9 @@ plugins { | |||
|
|||
14. Simulate the workload API interaction and retrieve the workload SVID bundle by running the `api` subcommand in the agent. Run the command as user **_workload_** created in step #4 with uid 1000 | |||
|
|||
su -c "./cmd/spire-agent/spire-agent api fetch" workload | |||
su -c "./cmd/spire-agent/spire-agent api fetch x509 " workload |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if there are other places that we need to update this... Should we make x509
the default subcommand if one isn't given?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
that seems like a safe default, i'll do that.
@@ -131,13 +131,13 @@ func (s *HandlerTestSuite) TestFetchX509SVID() { | |||
func (s *HandlerTestSuite) TestSendResponse() { | |||
emptyUpdate := new(cache.WorkloadUpdate) | |||
s.stream.EXPECT().Send(gomock.Any()).Times(0) | |||
err := s.h.sendResponse(emptyUpdate, s.stream) | |||
err := s.h.sendX509SVIDResponse(emptyUpdate, s.stream) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Tests for JWT handler functionality?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
oops. good catch.
pkg/agent/manager/cache/cache.go
Outdated
// GetJWTSVID retrieves a cached JWT SVID based on the subject and | ||
// intended audience. | ||
GetJWTSVID(spiffeID string, audience []string) *client.JWTSVID | ||
// SetJWTSVID caches a JWT SVID based on the subject and intended audience. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: errant tab
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
pkg/agent/manager/cache/cache.go
Outdated
|
||
// GetJWTSVID retrieves a cached JWT SVID based on the subject and | ||
// intended audience. | ||
GetJWTSVID(spiffeID string, audience []string) *client.JWTSVID |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nit: personal preference for ok
pattern here since it gives explicit error checking over implicit
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
proto/api/workload/workload.proto
Outdated
@@ -42,7 +44,51 @@ message X509SVID { | |||
repeated string federates_with = 5; | |||
} | |||
|
|||
|
|||
message JWTASVID { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
JWTSVID
... sorry for the last minute switch
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
proto/api/workload/workload.proto
Outdated
} | ||
|
||
message ValidateJWTASVIDRequest { | ||
repeated string audience = 1; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I wonder if this should be repeated or not? Feels like the validator should expect only a single audience, but maybe I'm overlooking something
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
One scenario I can think of is migration of audience value. You'd want a window where the old audience value is still acceptable until all of the JWTs minted against the old audience expired.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hmm yea... It can be done on the other side too, i.e. the subject sends a token with both old and new audiences, and the migrating validator identifies with one or the other.
I'm having a hard time forming an opinion here, but it kind of feels like it's more natural to have the validator identify with a single subject... I'm curious to hear other thoughts on the matter though
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess my point is that the subject would have to be updated to know about the new audience value. I can imagine a partial rollout where old subjects are still requesting the old value, updated subjects are requesting the new value, and the target would be receiving JWT's from both.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
After some offline discussion, we've decided to move to a non-repeated audience field for now. It is much easier to transition to a repeated field later if we determine we need it than moving in the opposite direction.
Signed-off-by: Andrew Harding <azdagron@gmail.com>
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
🤴 ⏯
return nil, errs.New("audience must be specified") | ||
} | ||
if req.Svid == "" { | ||
return nil, errs.New("svid must be specified") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
seems like these should also be InvalidArgument gRPC code?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
👍
key := keyFromJWTSpiffeIDAndAudience(spiffeID, audience) | ||
c.m.Lock() | ||
defer c.m.Unlock() | ||
return c.jwtSVIDS[key] | ||
svid, ok := c.jwtSVIDS[key] | ||
return svid, ok |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
interesting...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
yep. gotta coerce a comma-ok here.
Signed-off-by: Andrew Harding <azdagron@gmail.com>
agent can cache appropriately w/o having to parse the JWT.
of 2.5 minutes.