-
Notifications
You must be signed in to change notification settings - Fork 458
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We鈥檒l occasionally send you account related emails.
Already on GitHub? Sign in to your account
API Refactor: Make API error reporting consistent #1734
API Refactor: Make API error reporting consistent #1734
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks for this @marcosy! It's looking great.
return nil, status.Error(codes.InvalidArgument, "only the trust domain of the server can be appended") | ||
} | ||
|
||
dsResp, err := s.ds.FetchBundle(ctx, &datastore.FetchBundleRequest{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks like this was not really used, good catch!
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
the intention in that code was to validate that bundle exist before trying to append a new certificate, can you add a new test to validate that ds prevents that scenary?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Oh I see. The AppendBundle
datastore RPC checks for the existence of the bundle and creates it if it does not exists.
The current implementation (the one that makes the Fetch call) fails if the bundle does not exist.
I'm not sure which is the desired behaviour for this RPC. If we want it to fail when the bundle does not exist we should update the AppendBundle
datastore RPC to handle this and avoid possible races.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this preemptive fetch is necessary. The datastore will do the right thing here.
pkg/server/api/status.go
Outdated
} | ||
|
||
// MakeStatus logs and returns a status composed of: msg, err and code. | ||
// Errors are treated differentially according to its gRPC code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Errors are treated differentially according to its gRPC code. | |
// Errors are treated differently according to its gRPC code. |
pkg/server/api/status.go
Outdated
return CreateStatus(code, status.Convert(e).Message()) | ||
} | ||
|
||
// MakeErr logs and returns an error comoposed of: msg, err and code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// MakeErr logs and returns an error comoposed of: msg, err and code. | |
// MakeErr logs and returns an error composed of: msg, err and code. |
pkg/server/api/status.go
Outdated
} | ||
|
||
// MakeErr logs and returns an error comoposed of: msg, err and code. | ||
// Errors are treated differentially according to its gRPC code. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
// Errors are treated differentially according to its gRPC code. | |
// Errors are treated differently according to its gRPC code. |
pkg/server/api/agent/v1/service.go
Outdated
@@ -250,21 +230,18 @@ func (s *Service) RenewAgent(ctx context.Context, req *agent.RenewAgentRequest) | |||
log := rpccontext.Logger(ctx) | |||
|
|||
if err := rpccontext.RateLimit(ctx, 1); err != nil { | |||
log.WithError(err).Error("Rejecting request due to renew agent rate limiting") | |||
return nil, err | |||
return nil, api.MakeErr(log, codes.Unavailable, "rejecting request due to renew agent rate limiting", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Maybe codes.ResourceExhausted
is a better fit for rate limiting errors?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
do you verify what error is returned by rate limit? i can recall it returned a grpc status with the correct code
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I was wondering the same... I am not sure which one to choose. I found precedence for both of them:
-
The HTTP to gRPC mapping table suggests mapping code
429
Too Many Requests
toUNAVAILABLE
. -
The Google cloud API design guide suggests using
RESOURCE_EXAUSTED
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I did not see your previous comment @MarcosDY. Now that you mention it, it seems to me that the right thing to do is to extract and use the error code that comes from the rate limiter. Which could be: Internal, ResourceExausted or InvalidArgument.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
or we can just use error from limiter as it was
Data: logrus.Fields{ | ||
telemetry.TrustDomainID: serverTrustDomain.String(), | ||
logrus.ErrorKey: "unable to parse JWT authority: missing key ID", | ||
}, | ||
}, | ||
}, | ||
}, | ||
{ | ||
name: "datasource fails", |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We still call the datastore to append the bundle, so I think that we should keep the datastore failure case. It needs to be updated with the "failed to append bundle" error instead of "failed to fetch server bundle".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right! good catch!
pkg/server/api/agent/v1/service.go
Outdated
} | ||
|
||
_, err = s.ds.DeleteAttestedNode(ctx, &datastore.DeleteAttestedNodeRequest{ | ||
SpiffeId: id.String(), | ||
}) | ||
switch status.Code(err) { | ||
case codes.OK: | ||
log.Info("Agent deleted") | ||
return &empty.Empty{}, nil | ||
return &empty.Empty{}, api.MakeErr(log, codes.OK, "agent deleted", nil) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it does not feel right to use MakeErr for creating an OK message, we really need to have a message for OK? it looks to me like a Succes is enough in all case
pkg/server/api/status.go
Outdated
// Add the prefix 'Invalid argument' for InvalidArgument errors | ||
if err != nil { | ||
log = log.WithError(err) | ||
errMsg = fmt.Sprintf(msg+": %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it is a little hard to read, why not make: fmt.Sprintf("%s: %v", msg, err)
?
pkg/server/api/status.go
Outdated
errMsg = fmt.Sprintf(msg+": %v", err) | ||
} | ||
|
||
log.Error("Invalid argument: " + msg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
log.Errorf("Invalid argument: %s", msg)
// MakeErr logs and returns an error comoposed of: msg, err and code. | ||
// Errors are treated differentially according to its gRPC code. | ||
func MakeErr(log logrus.FieldLogger, code codes.Code, msg string, err error) error { | ||
errMsg := msg |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errMsg is not really used, it is replaced after initialized
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
errMsg
is used in the default
and InvalidArgument
cases. In those cases the returned error contains the description of the inner error (if not nil).
case codes.NotFound: | ||
// Do not log nor return the inner error for NotFound errors | ||
log.Error(capitalize(msg)) | ||
return status.Error(code, errMsg) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
no changes on errMsg you can use msg here
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Right, both can be used. I'd rather use errMsg
here for consistency:
msg
-> for all the logged messages
errMsg
-> for the returned error.
pkg/server/api/status.go
Outdated
default: | ||
if err != nil { | ||
log = log.WithError(err) | ||
errMsg = fmt.Sprintf(msg+": %v", err) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
same comments that for InvalidArgument
func MakeErr(log logrus.FieldLogger, code codes.Code, msg string, err error) error { | ||
errMsg := msg | ||
switch code { | ||
case codes.OK: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
it feels like OK
can have its own methods, it does not feel right to call MakeErr on case of success
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good! I'll change it
return nil, status.Error(codes.InvalidArgument, "only the trust domain of the server can be appended") | ||
} | ||
|
||
dsResp, err := s.ds.FetchBundle(ctx, &datastore.FetchBundleRequest{ |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I don't think this preemptive fetch is necessary. The datastore will do the right thing here.
pkg/server/api/bundle/v1/service.go
Outdated
} | ||
|
||
resp, err := s.ds.AppendBundle(ctx, &datastore.AppendBundleRequest{ | ||
Bundle: &common.Bundle{ | ||
TrustDomainId: td.IDString(), | ||
JwtSigningKeys: dsBundle.JwtSigningKeys, | ||
RootCas: dsBundle.RootCas, | ||
RefreshHint: dsBundle.RefreshHint, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Append only operates on the JWT signing key and Root CAs.
I'm wondering if taking a bundle
as input is even appropriate considering we'll ignore the sequence number and refresh hint...
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Good catch! I thought it was missing. Right, perhaps the AppendBundleRequest
could take only the Root CAs and the JWT keys.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks, @marcosy !
This commit improves the consistency for the returned and logged errors in the new API. In addition, the AppendBundle RPC is slightly modified to not require a full bundle but only the Root CAs and JWT keys instead. Signed-off-by: Marcos Yedro <marcosyedro@gmail.com>
8e05078
to
84923e6
Compare
Pull Request check list
Affected functionality
Error reporting and logging for the new API
Description of change
This PR aims to increase the consistency in the error reporting and logging for the new API.
The following consistency criteria were adopted based on offline conversations and the current API style (feel free to propose changes!):
InvalidArgument
code must prefix the logged message with an "Invalid argument:" string.NotFound
code, must ignore the inner error and just log/return the message.It was hard to find a pattern for the
OK
status code. Sometimes it is desired to log a message, sometimes not, sometimes the log level is different.It would be great to hear ideas/proposals about it 馃檪
Signed-off-by: Marcos Yedro marcosyedro@gmail.com