Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Fix phony bundle updates #3102

Merged
merged 7 commits into from
Jun 1, 2022
Merged

Conversation

maxlambrecht
Copy link
Member

Pull Request check list

  • Commit conforms to CONTRIBUTING.md?
  • Proper tests/regressions included?
  • Documentation updated?

Affected functionality

Description of change
Change the behavior of the Fetch bundle methods in the workload handler so they only send the updates when they are new bundles, preventing spurious updates.

Which issue this PR fixes
fixes #3101

Signed-off-by: Max Lambrecht max.lambrecht@hpe.com

Signed-off-by: Max Lambrecht <max.lambrecht@hpe.com>
Copy link
Member

@azdagron azdagron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks for this, @maxlambrecht!

Can we add a test case for this?

return nil, status.Errorf(codes.Unavailable, "could not serialize response: %v", err)
}

if reflect.DeepEqual(resp.GetBundles(), previousResponse.GetBundles()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeepEqual won't do the right thing here. We should use proto.Equal instead. We should also compare the full response instead of just the bundle fields (more robust in the face of additional fields).

Suggested change
if reflect.DeepEqual(resp.GetBundles(), previousResponse.GetBundles()) {
if proto.Equal(resp, previousResponse) {

return nil, status.Errorf(codes.Unavailable, "could not serialize response: %v", err)
}

if reflect.DeepEqual(resp.GetBundles(), previousResponse.GetBundles()) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

DeepEqual won't do the right thing here. We should use proto.Equal instead. We should also compare the full response instead of just the bundle fields (more robust in the face of additional fields).

Suggested change
if reflect.DeepEqual(resp.GetBundles(), previousResponse.GetBundles()) {
if proto.Equal(resp, previousResponse) {

Max Lambrecht added 2 commits May 25, 2022 09:29
Signed-off-by: Max Lambrecht <max.lambrecht@hpe.com>
Signed-off-by: Max Lambrecht <max.lambrecht@hpe.com>
@maxlambrecht
Copy link
Member Author

maxlambrecht commented May 25, 2022

Can we add a test case for this?

I added a couple of tests for the FetchX509Bundles method, to cover multiple different updates and spurious updates. Let me know if they are ok and I'll add similar tests for the FetchJWTBundles.

Signed-off-by: Max Lambrecht <max.lambrecht@hpe.com>
@azdagron azdagron self-assigned this May 25, 2022

runTest(t, params,
func(ctx context.Context, client workloadPB.SpiffeWorkloadAPIClient) {
timeoutCtx, cancel := context.WithTimeout(ctx, 5*time.Second)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hmm, this will add 5 seconds to the unit test, correct? I wonder if there is another way we can test this deterministically without introducing this time. I have two reasons:

  1. This kind of change seems small but adds up in the aggregate, increasing our already terrible CI/CD pipeline times
  2. When running in CI/CD, due to the often resource constrained running environment, waiting for small time values like this doesn't always mean that the code we're waiting on had the chance to do everything. For example, in this case, it could be that another response was going to be sent down the stream, but the code was executing slowly enough that it would have taken longer than five seconds to do so. In which case we'd have a false positive indication of success.

One way we might be able to test this deterministically is by creating a test hook in the stream response loop so we can determine that we've received an update from the cache, we've sent the response (or not if there was a change) and are waiting for the next update. I think we could probably implement that hook without impacting the production code by adding some code to the fakeSubscriber, which gets a call to Update in the stream update loop.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm looking into how to test it with a hook in the stream update loop.

As an alternative, what do you think about adding a debug log?

if proto.Equal(resp, previousResponse) {
    log.Debug("spurious update")
    return previousResponse, nil
}

and add it to expectedLogs in the test.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

That will have the same problem with determinism. You need to have a way to know when the handler code has completed evaluating the cache change and is waiting for the next one to deterministically assert that it handled the change correctly.

@@ -140,10 +141,11 @@ func (h *Handler) FetchJWTBundles(req *workload.JWTBundlesRequest, stream worklo
subscriber := h.c.Manager.SubscribeToCacheChanges(selectors)
defer subscriber.Finish()

var resp *workload.JWTBundlesResponse
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nit: can we rename this previousResp or something?

Signed-off-by: Andrew Harding <aharding@vmware.com>
@azdagron azdagron added this to the 1.3.1 milestone Jun 1, 2022
Signed-off-by: Max Lambrecht <max.lambrecht@hpe.com>
Copy link
Member

@azdagron azdagron left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thanks so much for this, @maxlambrecht !

@azdagron azdagron merged commit 6e0f759 into spiffe:main Jun 1, 2022
stevend-uber pushed a commit to stevend-uber/spire that referenced this pull request Oct 16, 2023
Signed-off-by: Max Lambrecht <max.lambrecht@hpe.com>
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants