Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ca: fix a masked bug in leaf cert generation that would not be notified of root cert rotation after the first one #15005

Merged
merged 2 commits into from Oct 17, 2022

Conversation

rboyer
Copy link
Member

@rboyer rboyer commented Oct 17, 2022

Description

In practice this was masked by #14956 and was only uncovered fixing the other bug.

go test ./agent -run TestAgentConnectCALeafCert_goodNotLocal

would fail when only #14956 was fixed.

…ed of root cert rotation after the first one

In practice this was masked by #14956 and was only uncovered fixing the
other bug.

  go test ./agent -run TestAgentConnectCALeafCert_goodNotLocal

would fail when only #14956 was fixed.
@rboyer rboyer requested a review from a team October 17, 2022 15:41
@rboyer rboyer self-assigned this Oct 17, 2022
@rboyer rboyer requested review from erichaberkorn and removed request for a team October 17, 2022 15:41
@@ -165,6 +165,7 @@ func (c *ConnectCALeaf) fetchDone(rootUpdateCh chan struct{}) {
if len(c.rootWatchSubscribers) == 0 && c.rootWatchCancel != nil {
// This was the last request. Stop the root watcher.
c.rootWatchCancel()
c.rootWatchCancel = nil
Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

fetchStart uses the nil-ness of rootWatchCancel to decide if it needs to launch the goroutine again, so when we invoke the cancel and do not nil it out it means that the goroutine will never run again.

The fix for #14956 mostly caused the cache requests for things like ca roots to act more like the responses were cached or at least singleflighted better, and apparently that rugged "just try again" spirit kept this bug from manifesting in practice in leaf cert generation.

Copy link
Contributor

@boxofrad boxofrad left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yikes! Great catch 👏🏻

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
backport/1.12 Changes are backported to 1.12 backport/1.13 Changes are backported to 1.13 pr/no-metrics-test
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants