Fix race in "spire" UpstreamAuthority plugin #1917

Merged

azdagron
Member

The "spire" UpstreamAuthority plugin maintains a local copy of the upstream bundle. When the upstream bundle changes, updates are sent back to SPIRE core. The local copy is updated synchronously from the responses to the NewDownstreamX509CA and PublishJWTAuthority RPCs. However, an asynchronous goroutine also polls for updates to the bundle.

To prevent the polling goroutine from overwriting the results of those RPCs, the plugin maintains a simple "version" number that is bumped whenever the bundle is updated. The polling goroutine captures the version number before polling for the bundle and is supposed to update the local copy only if it hasn't been updated in the meantime (by NewDownstreamX509CA or PublishJWTAuthority responses).

However, the version check and the update (i.e. the "replace if version matches" operation) do not take place under the same lock. This opens up a race: the version check succeeds, other goroutines then update the local copy, and those updates are overwritten by the now-stale bundle retrieved by the polling goroutine.

This PR fixes that race condition by performing the version check and replacement while under the lock.
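
For readers unfamiliar with the pattern, here is a minimal sketch of a "replace if version matches" operation performed under a single lock acquisition. It is not the plugin's actual code: the `bundleCache` type, its fields, and its method names are hypothetical stand-ins, and the bundle is reduced to a slice of strings.

```go
package main

import (
	"fmt"
	"sync"
)

// bundleCache is a hypothetical stand-in for the plugin's local bundle copy.
type bundleCache struct {
	mu      sync.Mutex
	bundle  []string // stand-in for the upstream bundle contents
	version uint64   // bumped on every update
}

// getVersion returns the current version. The poller calls this before it
// starts fetching the bundle.
func (c *bundleCache) getVersion() uint64 {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.version
}

// get returns a copy of the current bundle.
func (c *bundleCache) get() []string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return append([]string(nil), c.bundle...)
}

// set unconditionally replaces the bundle (the path taken by RPC responses).
func (c *bundleCache) set(bundle []string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.setLocked(bundle)
}

// setIfVersionMatches replaces the bundle only if nothing has updated it
// since the caller captured version. The check and the replacement share one
// lock acquisition, so no other update can slip in between them.
func (c *bundleCache) setIfVersionMatches(bundle []string, version uint64) bool {
	c.mu.Lock()
	defer c.mu.Unlock()
	if c.version != version {
		return false
	}
	c.setLocked(bundle)
	return true
}

// setLocked must be called with c.mu held.
func (c *bundleCache) setLocked(bundle []string) {
	c.bundle = bundle
	c.version++
}

func main() {
	cache := &bundleCache{}

	// The poller captures the version, then starts a slow fetch.
	captured := cache.getVersion()
	stale := []string{"root-A"}

	// Meanwhile an RPC response updates the local copy.
	cache.set([]string{"root-A", "root-B"})

	// The poller's fetch returns; its stale result is rejected.
	applied := cache.setIfVersionMatches(stale, captured)
	fmt.Println(applied, cache.get()) // prints "false [root-A root-B]"
}
```

If the version were instead compared under one lock acquisition and the bundle replaced under another, the `cache.set` call in `main` could land between the two steps and then be overwritten by the stale bundle.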

It also fixes races in the unit tests, which were caused by the wrong mock clock implementation being used and by timers not being triggered often enough to ensure updates had been picked up before the response was asserted.
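
The general idea behind a deterministic timer-driven test can be sketched as follows. This does not use SPIRE's mock clock; `fakeTicker`, `cache`, and `pollLoop` are hypothetical names invented for illustration. The test fires each tick by hand and waits for the poll to complete before asserting.

```go
package main

import (
	"fmt"
	"sync"
	"time"
)

// cache is a hypothetical mutex-guarded value standing in for a bundle.
type cache struct {
	mu    sync.Mutex
	value string
}

func (c *cache) set(v string) {
	c.mu.Lock()
	defer c.mu.Unlock()
	c.value = v
}

func (c *cache) get() string {
	c.mu.Lock()
	defer c.mu.Unlock()
	return c.value
}

// fakeTicker is a hand-driven replacement for a real ticker (an assumption
// of this sketch, not the mock clock used by the SPIRE tests).
type fakeTicker struct {
	ticks  chan time.Time
	polled chan struct{}
}

func newFakeTicker() *fakeTicker {
	return &fakeTicker{ticks: make(chan time.Time), polled: make(chan struct{})}
}

// tickAndWait fires one tick and blocks until the poll loop reports that the
// resulting poll has been applied.
func (t *fakeTicker) tickAndWait() {
	t.ticks <- time.Time{}
	<-t.polled
}

// stop ends the poll loop.
func (t *fakeTicker) stop() { close(t.ticks) }

// pollLoop copies the upstream value into the local cache on every tick.
func pollLoop(t *fakeTicker, upstream, local *cache) {
	for range t.ticks {
		local.set(upstream.get())
		t.polled <- struct{}{}
	}
}

func main() {
	upstream, local := &cache{value: "bundle-1"}, &cache{}
	ticker := newFakeTicker()
	go pollLoop(ticker, upstream, local)

	upstream.set("bundle-2")
	ticker.tickAndWait()     // ensure the poll has finished before reading
	fmt.Println(local.get()) // prints "bundle-2"
	ticker.stop()
}
```

Asserting right after firing the tick, without waiting for the poll to finish, would make the result depend on goroutine scheduling.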

Signed-off-by: Andrew Harding <aharding@vmware.com>
if m.bundleVersion == preFetchCallVersion && resp != nil {
m.setBundle(resp)
} else {
m.setBundleIfVersionMatches(resp, preFetchCallVersion)
Member Author

This is the fix to the race in the production code. The rest applies to the unit-tests.

Member

evan2645 left a comment

awesome, thank you for this @azdagron! nice catch.

we will ship this patch in the next point release

evan2645 merged commit c38b05a into spiffe:master Oct 19, 2020
azdagron deleted the fix-spire-upstreamauthority-bundle-race branch October 19, 2020 18:27
APTy added this to To be cherry-picked in 0.11.2 Release Oct 20, 2020
This was referenced Oct 27, 2020
evan2645 moved this from To be cherry-picked to Merged in 0.11.2 Release Oct 30, 2020