
improvements for status-write-retry #88

Merged (5 commits, Oct 16, 2018)

Conversation

joel-bluedata (Member):
issue #40

  • Add backoff for retries.

  • Bail out if object is going away or has gone away.

@@ -57,16 +59,43 @@ func syncCluster(
if !reflect.DeepEqual(cr.Status, oldStatus) {
// Write back the status. Don't exit this handler until we
// succeed (will block other handlers for this resource).
Member (reviewer):
this could conceivably never exit. is that ok?

joel-bluedata (Member, Author), Oct 11, 2018:

Yeah, or at least in the sense that I'm not sure there's much else we can reasonably do here: if we don't update the status on the object, then there's no way we can correctly manage it in the future.

A couple of comments on that situation:

  • This function not-returning will only block future handling of this specific cluster object. Other objects can still be handled. A corollary of this: the number of these tied-up functions we can have is capped by the number of virtual cluster objects. Since they each block a "normal" handler from running, the worst-case simultaneous number of handler functions is unchanged.
  • It would be possible (with the change in this PR) to get out of that situation by deleting the object.

There are other ways to handle this situation without tying up a goroutine, and if these seem attractive/necessary we could create an issue to pursue them. For example:

  • We could set ourselves a reminder of "don't allow any more changes to this object", save the desired status object (in memory), and schedule a periodic timer to kick off a retry of the status update. But I'm not sure this is notably different from using time.Sleep.
  • Or we could (perhaps after some retries) set ourselves a reminder of "don't allow any more changes to this object" and then just mark the object as permanently corrupted.

I'm hoping this is quite an unusual error, BTW. I've certainly never seen it without manually forcing the error case.

return
}
}
if currentCluster.DeletionTimestamp != nil {
Member (reviewer):
currentCluster could be nil here - if currentClusterErr != nil and the if at line 78 fails

joel-bluedata (Member, Author):
👍

Commits:

- Add backoff for retries.
- Bail out if object is going away or has gone away.
- Left over from when I was multiplying a Duration by a variable.
@joel-bluedata joel-bluedata merged commit 4c27809 into bluek8s:master Oct 16, 2018
@joel-bluedata joel-bluedata deleted the joel-status branch October 22, 2018 22:48