Skip to content
This repository has been archived by the owner on Mar 13, 2021. It is now read-only.

Make the service name used by a core deployer equal to the deployer name #224

Closed
wants to merge 3 commits into from

Conversation

glyn
Copy link
Contributor

@glyn glyn commented Dec 3, 2019

Diagnose service name collisions by capturing any service creation error in the relevant condition.

See #141 for where dynamically generated service names were first introduced.

Fixes #223

Reverts "Make the service name used by the deployer configurable".

This reverts commit 09c4793.

Closes #200

@glyn
Copy link
Contributor Author

glyn commented Dec 3, 2019

Here's an example of the Status of a deployer whose service name already existed:

Status:
  Conditions:
    Last Transition Time:  2019-12-03T16:21:14Z
    Status:                True
    Type:                  DeploymentReady
    Last Transition Time:  2019-12-03T16:20:59Z
    Message:               services "test-deployer" already exists
    Reason:                reconcileError
    Status:                False
    Type:                  Ready
    Last Transition Time:  2019-12-03T16:20:59Z
    Message:               services "test-deployer" already exists
    Reason:                reconcileError
    Status:                False
    Type:                  ServiceReady
  Deployment Name:         test-deployer-deployer-7qbj4
  Latest Image:            glyn/kubernetes-bootcamp:v1
Events:                    <none>

@glyn glyn requested a review from scothis December 3, 2019 16:35
@codecov-io
Copy link

codecov-io commented Dec 3, 2019

Codecov Report

Merging #224 into master will not change coverage.
The diff coverage is 0%.

Impacted file tree graph

@@          Coverage Diff           @@
##           master    #224   +/-   ##
======================================
  Coverage    9.09%   9.09%           
======================================
  Files          61      61           
  Lines        4266    4266           
======================================
  Hits          388     388           
  Misses       3845    3845           
  Partials       33      33
Impacted Files Coverage Δ
pkg/apis/core/v1alpha1/deployer_types.go 33.33% <ø> (ø) ⬆️
pkg/apis/core/v1alpha1/deployer_lifecycle.go 0% <0%> (ø) ⬆️
pkg/controllers/core/deployer_controller.go 0% <0%> (ø) ⬆️

Continue to review full report at Codecov.

Legend - Click here to learn more
Δ = absolute <relative> (impact), ø = not affected, ? = missing data
Powered by Codecov. Last update d1295b3...ac6edf7. Read the comment docs.

Copy link
Member

@scothis scothis left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Good process. Comments inline

@@ -89,6 +89,10 @@ func (ds *DeployerStatus) PropagateServiceStatus(ss *corev1.ServiceStatus) {
deployerCondSet.Manage(ds).MarkTrue(DeployerConditionServiceReady)
}

func (ds *DeployerStatus) MarkServiceReconcileError(err error) {
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Suggested change
func (ds *DeployerStatus) MarkServiceReconcileError(err error) {
func (ds *DeployerStatus) MarkServiceNotOwned(name string) {

See an example from riff 0.4:

func (ds *DeployerStatus) MarkServiceNotOwned(name string) {
deployerCondSet.Manage(ds).MarkFalse(DeployerConditionServiceReady, "NotOwned",
"There is an existing Service %q that we do not own.", name)
}

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'll adopt this. I overlooked that method.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually, I see that riff 0.4 was checking services ahead of reconciliation, which is a design we've moved away from. Rather than trying to introspect the service creation error, which is fragile, I think it's sufficient to capture the service creation error in the service ready condition.

@@ -132,6 +132,7 @@ func (r *DeployerReconciler) reconcile(ctx context.Context, log logr.Logger, dep
childService, err := r.reconcileChildService(ctx, log, deployer)
if err != nil {
log.Error(err, "unable to reconcile child Service", "deployer", deployer)
deployer.Status.MarkServiceReconcileError(err)
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

we only want to update the condition if the service resource fails to be created because the name already exists.

By always setting the condition for any error, transient errors will causes the ready condition of the deployer to turn false. The riff cli with the --tail flag will watch the resource until the ready condition is no longer unknown. This change as proposed will cause the CLI to report failure much more frequently.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Makes sense. Also, that will make the condition reason createError instead of the rather bland reconcileError.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Actually made the ServiceReady condition reason serviceCreationFailed so it makes more sense when percolated up to the Ready condition reason.

Diagnose service name collisions by capturing any service creation error in
the relevant condition.

See projectriff#141 for where dynamically
generated service names were first introduced.

Fixes projectriff#223
@glyn glyn self-assigned this Dec 4, 2019
@glyn glyn requested a review from scothis December 4, 2019 11:55
All errors returned by `.Create()` are not equal. Most api errors are
transient and should requeue the task. However, create failures because
there is an existing resource are special. Only those should mark the
resource as not ready, and not requeue.
@glyn glyn closed this Dec 4, 2019
@glyn glyn deleted the 200-service-name branch December 4, 2019 16:05
Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
3 participants