Skip to content

Conversation

@ramiro-gamarra
Copy link
Contributor

Reason for Change:

This PR adds a CertRefresher construct, which allows background pull of new certificates from KeyVault. We use this construct to provide a tls.Certificate in a dynamic manner to an HTTP server that uses a tls.Config.

Issue Fixed:

Requirements:

Notes:

@ramiro-gamarra ramiro-gamarra requested a review from a team as a code owner June 9, 2022 02:23
@ramiro-gamarra ramiro-gamarra requested review from rbtr and removed request for a team June 9, 2022 02:23
@ramiro-gamarra ramiro-gamarra requested a review from matmerr as a code owner June 9, 2022 17:19
rbtr
rbtr previously approved these changes Jun 9, 2022
Copy link
Collaborator

@rbtr rbtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some suggestions but looks good already 🚀

}

go func() {
errChan <- cr.Refresh(ctx, tlsSettings.KeyVaultCertificateRefreshInterval)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't think this will ever return cleanly unless the certificate's expiration is before the next refresh interval. The context has no cancellation. That might be fine if it's intended to share the lifetime of the application, but I figured I'd mention it.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, ideally, context would be propagated all the way from main to handle signals cleanly but there are a few layers of indirection unfortunately, so I opted for a context.TODO() in the meantime. This goroutine is indeed tied to the lifetime of the application so a temporary leak might be ok for now. Since I have some changes to push, I may try to propagate context, but will depend on size of changes.

aegal
aegal previously approved these changes Jun 10, 2022
Copy link
Contributor

@aegal aegal left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this looks good

@ramiro-gamarra ramiro-gamarra dismissed stale reviews from aegal and rbtr via 16d26cb June 14, 2022 05:26
Copy link
Contributor Author

@ramiro-gamarra ramiro-gamarra left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Added some context to a few of the changes

},
"MSISettings": {
"ResourceID": ""
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

adding these to keep this documentation up to date

}
m sync.RWMutex
Orchestrator string
NodeID string
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

motivation for changes in this file are making this cns logger easier to stub out in tests, as well as addressing what I'm pretty sure is a race condition when setting these fields, since the logging methods here read them all the time but they are set, possibly multiple times, from an http handler.

return nil
}

if err := retry.Do(retryFn, retry.Context(ctx), retry.Delay(time.Second), retry.DelayType(retry.FixedDelay)); err != nil {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

changed behavior a bit from latest revision. the intent is now to retry to fetch from keyvault until a deadline set to the current cert expiration date is met. basically, we try to keep cns alive for as long as the cert is valid.

return &listener, nil
}

func GetTlsConfig(tlsSettings localtls.TlsSettings) (*tls.Config, error) {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

moved the below functions out of this package because injecting a cns logger here creates an import cycle. we should really do away with common

GetLatestTLSCertificate(ctx context.Context, certName string) (tls.Certificate, error)
}

type logger interface {
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I switched from zap to existing logging mechanism to keep any possible side effects from this PR to a minimum. I enjoyed adding logs with zap so when given a chance I plan to revisit this package (or CNS in general) to contribute to its adoption.


// Initialize CNS Logger
func InitLogger(fileName string, logLevel, logTarget int, logDir string) {
Log = &CNSLogger{
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

❤️

Copy link
Collaborator

@rbtr rbtr Jun 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

ditto, but maybe fixing the logger should be a separate set of changes (that go first, if necessary)? scope is creeping 🙁

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I don't mind extracting this out if it makes it easier to review

Copy link
Member

@timraymond timraymond left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this looks good, but I'll defer to someone with more familiarity in CNS for approval. I've got just a couple of questions / suggestions.

Comment on lines 162 to 167
func (c *CNSLogger) getOrchestratorAndNodeID() (orch, nodeID string) {
c.m.RLock()
orch, nodeID = c.Orchestrator, c.NodeID
c.m.RUnlock()
return
}
Copy link
Member

@timraymond timraymond Jun 15, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't seem like the named returns are super necessary here unless I'm missing something...

Suggested change
func (c *CNSLogger) getOrchestratorAndNodeID() (orch, nodeID string) {
c.m.RLock()
orch, nodeID = c.Orchestrator, c.NodeID
c.m.RUnlock()
return
}
func (c *CNSLogger) getOrchestratorAndNodeID() (orch, nodeID string) {
c.m.RLock()
defer c.m.RUnlock()
return c.Orchestrator, c.NodeID
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

the linter actually suggested this. I kind of agree, only when there are multiple returns of the same type.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yeah, I can see the names would be useful in the signature. I think in this case I would name them, but still explicitly return. Edited the suggestion to show what I mean.

Comment on lines 206 to 62
func Close() {
Log.Close()
}

func InitLogger(fileName string, logLevel, logTarget int, logDir string) {
Log, _ = NewCNSLogger(fileName, logLevel, logTarget, logDir)
}

func InitAI(aiConfig aitelemetry.AIConfig, disableTraceLogging, disableMetricLogging, disableEventLogging bool) {
Log.InitAI(aiConfig, disableTraceLogging, disableMetricLogging, disableEventLogging)
}

func SetContextDetails(orchestrator, nodeID string) {
Log.SetContextDetails(orchestrator, nodeID)
}

func Printf(format string, args ...interface{}) {
Log.Printf(format, args...)
}

func Debugf(format string, args ...interface{}) {
Log.Debugf(format, args...)
}

func LogEvent(event aitelemetry.Event) {
Log.LogEvent(event)
}

func Errorf(format string, args ...interface{}) {
Log.Errorf(format, args...)
}

func Request(tag string, request interface{}, err error) {
Log.Request(tag, request, err)
}

func Response(tag string, response interface{}, returnCode types.ResponseCode, err error) {
Log.Response(tag, response, returnCode, err)
}

func ResponseEx(tag string, request, response interface{}, returnCode types.ResponseCode, err error) {
Log.ResponseEx(tag, request, response, returnCode, err)
}

func SendMetric(metric aitelemetry.Metric) {
Log.SendMetric(metric)
}
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be nice to move these functions off to another file just to really call out that this is a compatibility layer.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good idea. Will address in the PR I'm extracting

Comment on lines 175 to 178
CustomDimensions: map[string]string{
OrchestratorTypeStr: orch,
NodeIDStr: nodeID,
},
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Is it worth storing a map as just a general bag of keys and values like many structured logging tools do? We can set "orchestrator" and "nodeID" by default in there, but have the flexibility to add more as necessary. Unsure what the limitations of AI are though, so it may not be feasible :)

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If I understand correctly, is the suggestion to add a map of default custom dimensions to CNSLogger? I would agree, and we do something similar in DNC. In this case, these properties are lazily added; there will be a time during CNS runtime when they haven't been set by DNC yet, so they are not quite "default". Ideally, once set they should never change, but I don't want to change behavior too much this time around.

Copy link
Collaborator

@rbtr rbtr left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lgtm

@ramiro-gamarra ramiro-gamarra enabled auto-merge (squash) June 16, 2022 21:10
@ramiro-gamarra ramiro-gamarra merged commit 69f851a into Azure:master Jun 16, 2022
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment

Labels

None yet

Projects

None yet

Development

Successfully merging this pull request may close these issues.

4 participants