-
Notifications
You must be signed in to change notification settings - Fork 10
rollout: Include health report in service object #51
Conversation
@@ -6,7 +6,6 @@ require ( | |||
cloud.google.com/go v0.60.0 // indirect | |||
github.com/TV4/logrus-stackdriver-formatter v0.1.0 | |||
github.com/go-stack/stack v1.8.0 // indirect | |||
github.com/google/go-cmp v0.5.0 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
testify is able to compare more than primitives types, so I really didn't need go-cmp.
pkg/health/report.go
Outdated
report := map[string]interface{}{ | ||
"metricsType": criteria.Type, | ||
"threshold": result.Threshold, | ||
"actualValue": result.ActualValue, | ||
"isCriteriaMet": result.IsCriteriaMet, | ||
} | ||
if criteria.Type == config.LatencyMetricsCheck { | ||
report["percentile"] = criteria.Percentile | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I also considered using the health.Diagnosis
struct directly, but there are two drawbacks
- We have previously decided not to include the metrics type in
health.CheckResult
since that would be redundant. Plus, the percentile is also not part of that struct. - The
OverallDiagnosis
property is aDiagnosisResult
(int
), so the JSON representation would turn out to be a value 0..3, which is not very helpful for our users.
Btw, in the project document, we talk about storing a history of all decisions. This makes this annotation seem redundant, but it might be useful to find why the last decision was taken faster. If not, we can remove the |
@@ -22,6 +22,19 @@ const ( | |||
Unhealthy | |||
) | |||
|
|||
func (d DiagnosisResult) String() string { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
+++ good way
pkg/health/report.go
Outdated
if criteria.Type == config.LatencyMetricsCheck { | ||
report += fmt.Sprintf("\n- %s[p%.0f]: %.2f (threshold %.2f)", criteria.Type, criteria.Percentile, result.ActualValue, criteria.Threshold) | ||
continue | ||
} | ||
|
||
format := "\n- %s: %.2f (threshold %.2f)" | ||
if criteria.Type == config.RequestCountMetricsCheck { | ||
// No decimals for request count. | ||
format = "\n- %s: %.0f (threshold %.0f)" | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hopefully, this is readable enough and the example in the comment make it clear. If not, any recommendations?
pkg/health/report.go
Outdated
// JSONReport returns a JSON representation of the diagnosis. | ||
// | ||
// TODO: consider if this function is useful (e.g. for a dashboard) since it's | ||
// not used anywhere yet. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I kept this function since we might need it eventually and the tests are a little bit "tedious" to write.
pkg/health/report.go
Outdated
// | ||
// TODO: consider if this function is useful (e.g. for a dashboard) since it's | ||
// not used anywhere yet. | ||
func JSONReport(healthCriteria []config.Metric, diagnosis Diagnosis) (string, error) { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I'd delete json for now
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I deleted it. I am keeping this branch in my fork, so I have access to that work anyways.
This adds annotation about the last health report. It uses the key rollout.cloud.run/lastHealthReport and assigns it a JSON string with the last diagnosis (healthy, unhealthy) and the check results. Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
pkg/health/report.go
Outdated
// The returned string has the format: | ||
// status: healthy | ||
// metrics: | ||
// - request-latency[p99]: 500.00 (needs 750.0) | ||
// - request-count: 800 (needs 1000) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this will most likely go out of date, I'd refrain from listing it here, but it's ok
// - request-latency[p99]: 500.00 (needs 750.0) | ||
// - request-count: 800 (needs 1000) | ||
func StringReport(healthCriteria []config.Metric, diagnosis Diagnosis) string { | ||
report := fmt.Sprintf("status: %s\n", diagnosis.OverallResult.String()) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
what if we also added the time we wrote this down?
does the user have any way of telling when was this status report for?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could, but I'd probably have a function setHealthReportAnnotation(report string)
in rollout
since there are cases in which this function is not called (e.g. new candidate). In that function, we can append the time at the end of any report.
I will defer this after #57 since we would need the mocked time introduced there. Doing it here would be double work and would make this PR much longer.
pkg/rollout/rollout.go
Outdated
@@ -114,7 +122,7 @@ func (r *Rollout) UpdateService(svc *run.Service) (*run.Service, error) { | |||
if isNewCandidate(svc, candidate) { | |||
r.log.Debug("new candidate, assign some traffic") | |||
svc = r.PrepareRollForward(svc, stable, candidate) | |||
svc = r.updateAnnotations(svc, stable, candidate) | |||
svc = r.updateAnnotations(svc, stable, candidate, "new candidate, no health report available yet") |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this is not a great way. it keeps overloading the updateAnnotations. I'll offer alternatives.
- pass a
statusMsgFunc func()string
that returns the msg you wanna set. - create a new method
svc = r.setAnnotation(key, msg)
(2) is cleaner.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
some nits
Signed-off-by: Getulio Valentin Sánchez <gvso@google.com>
delete(svc.Metadata.Annotations, CandidateRevisionAnnotation) | ||
return svc | ||
} | ||
|
||
svc.Metadata.Annotations[StableRevisionAnnotation] = stable | ||
svc.Metadata.Annotations[CandidateRevisionAnnotation] = candidate | ||
setAnnotation(svc, StableRevisionAnnotation, stable) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
nice, yeah you can just use it here :D
svc = r.updateAnnotations(svc, stable, candidate) | ||
|
||
// TODO(gvso): create a setHealthReportAnnotation that appends the | ||
// current time to the report before setting the annotation. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
this probably could be part of the health report string generation IMO.
doesn't have to be here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
My initial thought was also this one, but the problem is that we don't generate a report when we have a new candidate. We have a string literal in that case
This adds annotation about the last health report. It uses the key rollout.cloud.run/lastHealthReport and assigns it a string with the last diagnosis (healthy, unhealthy) and the check results.