Skip to content
This repository has been archived by the owner on Nov 1, 2022. It is now read-only.

release: record release phases #334

Merged
merged 3 commits into from
Mar 31, 2020
Merged

release: record release phases #334

merged 3 commits into from
Mar 31, 2020

Conversation

hiddeco
Copy link
Member

@hiddeco hiddeco commented Mar 19, 2020

This PR adds a phase field to the status object of the HelmRelease resource, this field records the current or latest action phase performed for the resource.

It also introduces new Prometheus metrics for this field and restores the failure Kubernetes Event, addressing #281 and #62 (comment).

@hiddeco hiddeco changed the title Enhancement/status release: record release phases Mar 19, 2020
@hiddeco hiddeco changed the title release: record release phases release: record release phases in status Mar 19, 2020
@hiddeco hiddeco added the enhancement New feature or request label Mar 19, 2020
@hiddeco hiddeco force-pushed the enhancement/status branch 2 times, most recently from 4a89d12 to 3aa6b4d Compare March 19, 2020 17:04
@hiddeco
Copy link
Member Author

hiddeco commented Mar 19, 2020

@sa-spag can you take a look and see if the added histogram (flux_helm_operator_release_phase_duration_seconds) provides all the information necessary for you to do your dashboard and alerting?

@sa-spag
Copy link
Contributor

sa-spag commented Mar 20, 2020

@sa-spag can you take a look and see if the added histogram (flux_helm_operator_release_phase_duration_seconds) provides all the information necessary for you to do your dashboard and alerting?

Yes it does, and...

const (
InstallAction action = "install"
UpgradeAction action = "upgrade"
SkipAction action = "skip"
RollbackAction action = "rollback"
UninstallAction action = "uninstall"
DryRunCompareAction action = "dry-run-compare"
AnnotateAction action = "annotate"
)

... seems to cover all interesting cases. Great work, thanks!

@sa-spag
Copy link
Contributor

sa-spag commented Mar 27, 2020

I successfully deployed an image built from your work and applied the new CRD as you suggested on a staging cluster. I'll keep you posted.

@sa-spag
Copy link
Contributor

sa-spag commented Mar 27, 2020

Seems like there's a regression throughput-wise, perhaps workers no longer parallelize the releases:
grafana staging-1 blbl cr_d_Dc2iVU1Zk_flux-and-helm-operator_orgId=1 from=1585312414364 to=1585314597041
The green curve corresponds to flux_helm_operator_release_queue_length_count, the vertical blue line is a Grafana annotation which marks the new version deployment.

Turns out it's not related to this pull-request, as fluxcd/helm-operator-prerelease:master-997e8f23 also shows the same behavior.

pkg/status/conditions.go Outdated Show resolved Hide resolved
pkg/operator/operator.go Outdated Show resolved Hide resolved
pkg/operator/operator.go Outdated Show resolved Hide resolved
pkg/apis/helm.fluxcd.io/v1/types_helmrelease.go Outdated Show resolved Hide resolved
pkg/release/release.go Outdated Show resolved Hide resolved
pkg/release/release.go Outdated Show resolved Hide resolved
@stefanprodan stefanprodan mentioned this pull request Mar 30, 2020
@hiddeco hiddeco force-pushed the enhancement/status branch 2 times, most recently from 47f0880 to ca01cf0 Compare March 30, 2020 20:29
// is synced.
MessageChartSynced = "Chart managed by HelmRelease processed"
ReleaseSynced = "ReleaseSynced"
FailedReleaseSync = "FailedReleaseSync"
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Might be out of scope but it would be awesome if we could send a separate event for rollbacks (e.g. ReleaseRollBack).

@hiddeco hiddeco force-pushed the enhancement/status branch 3 times, most recently from 792f511 to aaa51db Compare March 31, 2020 12:30
@hiddeco hiddeco changed the title release: record release phases in status release: record release phases Mar 31, 2020
@hiddeco hiddeco marked this pull request as ready for review March 31, 2020 13:24
@hiddeco hiddeco force-pushed the enhancement/status branch 2 times, most recently from 90612df to e125d83 Compare March 31, 2020 14:36
Copy link
Member

@stefanprodan stefanprodan left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

LGTM

Great improvements! thanks @hiddeco 🥇

This commit introduces the recording of release phases in the
`HelmRelease` and as a Prometheus histogram metrics.

It also re-introduces the recording of release failures as a
Kubernetes Event with a proper failure reason. Limitizes the
amount of (obsolete) error wraps and restructures returned
error messages.
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
enhancement New feature or request
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants