Skip to content
This repository has been archived by the owner on Jan 11, 2023. It is now read-only.

rollout: Include health report in service object #51

Merged
merged 7 commits into from
Aug 7, 2020
Merged

rollout: Include health report in service object #51

merged 7 commits into from
Aug 7, 2020

Conversation

gvso
Copy link
Contributor

@gvso gvso commented Aug 4, 2020

This adds annotation about the last health report. It uses the key rollout.cloud.run/lastHealthReport and assigns it a string with the last diagnosis (healthy, unhealthy) and the check results.

@@ -6,7 +6,6 @@ require (
cloud.google.com/go v0.60.0 // indirect
github.com/TV4/logrus-stackdriver-formatter v0.1.0
github.com/go-stack/stack v1.8.0 // indirect
github.com/google/go-cmp v0.5.0
Copy link
Contributor Author

@gvso gvso Aug 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

testify is able to compare more than primitives types, so I really didn't need go-cmp.

Comment on lines 15 to 55
report := map[string]interface{}{
"metricsType": criteria.Type,
"threshold": result.Threshold,
"actualValue": result.ActualValue,
"isCriteriaMet": result.IsCriteriaMet,
}
if criteria.Type == config.LatencyMetricsCheck {
report["percentile"] = criteria.Percentile
}
Copy link
Contributor Author

@gvso gvso Aug 4, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I also considered using the health.Diagnosis struct directly, but there are two drawbacks

  1. We have previously decided not to include the metrics type in health.CheckResult since that would be redundant. Plus, the percentile is also not part of that struct.
  2. The OverallDiagnosis property is a DiagnosisResult (int), so the JSON representation would turn out to be a value 0..3, which is not very helpful for our users.

@gvso
Copy link
Contributor Author

gvso commented Aug 4, 2020

Btw, in the project document, we talk about storing a history of all decisions. This makes this annotation seem redundant, but it might be useful to find why the last decision was taken faster.

If not, we can remove the lastHealthReport annotation part and keep the logic to generate an element of the history of decisions.

@@ -22,6 +22,19 @@ const (
Unhealthy
)

func (d DiagnosisResult) String() string {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

+++ good way

Comment on lines 26 to 32
if criteria.Type == config.LatencyMetricsCheck {
report += fmt.Sprintf("\n- %s[p%.0f]: %.2f (threshold %.2f)", criteria.Type, criteria.Percentile, result.ActualValue, criteria.Threshold)
continue
}

format := "\n- %s: %.2f (threshold %.2f)"
if criteria.Type == config.RequestCountMetricsCheck {
// No decimals for request count.
format = "\n- %s: %.0f (threshold %.0f)"
}
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Hopefully, this is readable enough and the example in the comment make it clear. If not, any recommendations?

// JSONReport returns a JSON representation of the diagnosis.
//
// TODO: consider if this function is useful (e.g. for a dashboard) since it's
// not used anywhere yet.
Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I kept this function since we might need it eventually and the tests are a little bit "tedious" to write.

@gvso gvso requested a review from ahmetb August 5, 2020 19:28
//
// TODO: consider if this function is useful (e.g. for a dashboard) since it's
// not used anywhere yet.
func JSONReport(healthCriteria []config.Metric, diagnosis Diagnosis) (string, error) {
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'd delete json for now

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I deleted it. I am keeping this branch in my fork, so I have access to that work anyways.

Getulio Valentin Sánchez added 6 commits August 6, 2020 17:03
This adds annotation about the last health report. It uses the key
rollout.cloud.run/lastHealthReport and assigns it a JSON string with the last
diagnosis (healthy, unhealthy) and the check results.

Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
Comment on lines 11 to 15
// The returned string has the format:
// status: healthy
// metrics:
// - request-latency[p99]: 500.00 (needs 750.0)
// - request-count: 800 (needs 1000)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this will most likely go out of date, I'd refrain from listing it here, but it's ok

// - request-latency[p99]: 500.00 (needs 750.0)
// - request-count: 800 (needs 1000)
func StringReport(healthCriteria []config.Metric, diagnosis Diagnosis) string {
report := fmt.Sprintf("status: %s\n", diagnosis.OverallResult.String())
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

what if we also added the time we wrote this down?
does the user have any way of telling when was this status report for?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

We could, but I'd probably have a function setHealthReportAnnotation(report string) in rollout since there are cases in which this function is not called (e.g. new candidate). In that function, we can append the time at the end of any report.

I will defer this after #57 since we would need the mocked time introduced there. Doing it here would be double work and would make this PR much longer.

@@ -114,7 +122,7 @@ func (r *Rollout) UpdateService(svc *run.Service) (*run.Service, error) {
if isNewCandidate(svc, candidate) {
r.log.Debug("new candidate, assign some traffic")
svc = r.PrepareRollForward(svc, stable, candidate)
svc = r.updateAnnotations(svc, stable, candidate)
svc = r.updateAnnotations(svc, stable, candidate, "new candidate, no health report available yet")
Copy link
Contributor

@ahmetb ahmetb Aug 6, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this is not a great way. it keeps overloading the updateAnnotations. I'll offer alternatives.

  1. pass a statusMsgFunc func()string that returns the msg you wanna set.
  2. create a new method svc = r.setAnnotation(key, msg)

(2) is cleaner.

Copy link
Contributor

@ahmetb ahmetb left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

some nits

Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
delete(svc.Metadata.Annotations, CandidateRevisionAnnotation)
return svc
}

svc.Metadata.Annotations[StableRevisionAnnotation] = stable
svc.Metadata.Annotations[CandidateRevisionAnnotation] = candidate
setAnnotation(svc, StableRevisionAnnotation, stable)
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

nice, yeah you can just use it here :D

svc = r.updateAnnotations(svc, stable, candidate)

// TODO(gvso): create a setHealthReportAnnotation that appends the
// current time to the report before setting the annotation.
Copy link
Contributor

@ahmetb ahmetb Aug 7, 2020

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

this probably could be part of the health report string generation IMO.
doesn't have to be here.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

My initial thought was also this one, but the problem is that we don't generate a report when we have a new candidate. We have a string literal in that case

@gvso gvso merged commit 6f36920 into GoogleCloudPlatform:main Aug 7, 2020
@gvso gvso deleted the health-annotations branch August 7, 2020 17:55
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

2 participants