Destroy machine when last principal unit removed. #52

cmars · 2014-06-10T00:53:51Z

This change causes a machine to be automatically removed when the last principal unit running on that machine is removed (remove-unit). If the machine is a parent to any containers, it is left alone.

Background:
https://bugs.launchpad.net/juju-core/+bug/1206532
https://bugs.launchpad.net/juju-core/+bug/1183309/comments/8

This could use more test coverage, but I'd like to get a general feel for how it's looking so far before I go much further with it. This is my first time mucking around with state transactions. Please let me know if I've got it right!

I haven't added an option to keep empty machines/containers around after removing the last unit. I'd like to see if we can avoid it. If we do need the option, I think it should be opt-in. Let me know.

jameinel · 2014-06-10T05:45:18Z

state/machine.go

-func (m *Machine) ForceDestroy() error {
+// forceDestroyOps returns the transaction operations necessary to completely
+// remove the machine along with its units and containers.
+func (m *Machine) forceDestroyOps() ([]txn.Op, error) {


I'm hesitant to use "forceDestroyOps" in something that should be a clean teardown. Maybe call this "destroyOps", and have a separate forceDestroyOps that includes the regular cleanup?
I guess my concern is that we should only be tearing down the machine automatically when it is already empty (but dirty), so we shouldn't need to run all the extra "make sure everything really is gone" steps. If everything isn't gone, we shouldn't be tearing down the machine.

jameinel · 2014-06-10T05:54:56Z

So NOT LGTM as is, mostly because it has 0 tests. The logic seems like it would do what we want, though my personal feeling is that it is overly forceful.
The one reason I can think of to do a more forceful destruction is to handle subordinates that should be cleaned up when the final principal is removed.
But definitely we need to be testing the overall behavior (create a machine, add a unit into it, destroy the unit, notice that the machine is destroyed as well). And then testing the edge cases "create a machine, add 2 units, destroy one and the machine stays around, destroy the second and it goes away", "create a machine, put a container on it, put a unit in the container, destroy the last unit in the container, watch the container go away, but not the machine", etc. I can come up with more if you need thoughts about what the edge cases should be.
I'd also like William to chime in about the model of this change.
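
The edge cases listed above can be sketched against a tiny in-memory model. This is only an illustration of the intended behavior, not the state package: the model, machine, addMachine, addUnit, removeUnit and has names are all hypothetical, and no transactions are involved.

```go
package main

import "fmt"

// machine is a hypothetical stand-in for a machine or container document.
type machine struct {
	parent     string          // "" for a top-level machine
	principals map[string]bool // principal units assigned to this host
	containers map[string]bool // ids of containers hosted here
}

// model is a hypothetical in-memory stand-in for the state database.
type model struct {
	machines map[string]*machine
}

func newModel() *model {
	return &model{machines: map[string]*machine{}}
}

// addMachine registers a machine, recording it on its parent if one exists.
func (md *model) addMachine(id, parent string) {
	md.machines[id] = &machine{
		parent:     parent,
		principals: map[string]bool{},
		containers: map[string]bool{},
	}
	if p, ok := md.machines[parent]; ok {
		p.containers[id] = true
	}
}

func (md *model) addUnit(machineID, unit string) {
	md.machines[machineID].principals[unit] = true
}

// removeUnit removes the unit, then removes its host only when the host has
// no remaining principals and is not a parent to any containers. The parent
// of a removed container is deliberately left alone.
func (md *model) removeUnit(machineID, unit string) {
	m, ok := md.machines[machineID]
	if !ok {
		return
	}
	delete(m.principals, unit)
	if len(m.principals) > 0 || len(m.containers) > 0 {
		return
	}
	delete(md.machines, machineID)
	if p, ok := md.machines[m.parent]; ok {
		delete(p.containers, machineID)
	}
}

func (md *model) has(id string) bool {
	_, ok := md.machines[id]
	return ok
}

func main() {
	md := newModel()
	md.addMachine("0", "")
	md.addUnit("0", "wordpress/0")
	md.addUnit("0", "mysql/0")
	md.removeUnit("0", "wordpress/0")
	fmt.Println("machine 0 present after first removal:", md.has("0"))
	md.removeUnit("0", "mysql/0")
	fmt.Println("machine 0 present after second removal:", md.has("0"))
}
```

The real tests would also need to cover what this sketch ignores: subordinates, voting machines, and state changing between the eligibility check and transaction execution.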

cmars · 2014-06-10T18:24:00Z

I've simplified this quite a bit: Unit.Destroy now defers a Destroy() on the unit's assigned machine, if that machine is eligible for destruction. The eligibility logic is refactored out of Machine.Destroy() into Machine.CheckDestroy().

PTAL, thanks!

fwereade · 2014-06-11T10:36:14Z

state/machine.go

+		// machine is not voting)
+		return fmt.Errorf("machine %s is required by the environment", m.doc.Id)
+	}
+	if m.doc.HasVote {


grr, we shouldn't have both these checks. not actionable in this PR though.

fwereade · 2014-06-11T11:27:26Z

NOT LGTM yet, we need to be obsessively careful about state changes. It might be helpful for you to refresh your memory of the "lifecycles", "death-and-destruction", and "hacking-state" documentation (in the doc/ dir); but you should also, definitely, come talk to me about anything that's not crystal clear.

fwereade · 2014-06-12T13:53:58Z

state/service.go

+	ops = append(ops, svcOp)
+
+	if removeMachine {
+		ops = append(ops, txn.Op{


I'm not keen on having two ops referring to the same document. Might be able to expand on this comment usefully when I've seen more.

fwereade · 2014-06-12T14:42:16Z

This is converging fast, but needs a couple of code tweaks (and much heavier testing than is currently in place).

cmars · 2014-06-12T17:26:25Z

Still WIP; I need to add test cases with transaction hooks.

cmars · 2014-06-13T19:03:04Z

PTAL

fwereade · 2014-06-16T21:25:46Z

state/unit.go

+			Update: bson.D{{"$set", bson.D{{"life", Dying}}}},
+		})
+	} else {
+		assert = keepAssert


I think that we'd want to $or together the various keepAssert possibilities, wouldn't we? i.e. if one of them no longer applied, but another did, we should be able to proceed as we did before.
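
A minimal sketch of that suggestion, assuming the individual "keep" conditions are collected in a slice. DocElem and D are local stand-ins that only mirror the shape of bson.D so the example is self-contained, orAsserts is a hypothetical helper, and the field names in main are illustrative rather than taken from the real machine document.

```go
package main

import "fmt"

// DocElem and D are local stand-ins with the same shape as an ordered
// BSON document; they are not the real bson types.
type DocElem struct {
	Name  string
	Value interface{}
}

type D []DocElem

// orAsserts combines the "keep the host" conditions into one assertion that
// holds as long as at least one of them still applies at execution time.
func orAsserts(keep []D) D {
	switch len(keep) {
	case 0:
		return nil // nothing to assert
	case 1:
		return keep[0] // a single condition needs no $or wrapper
	}
	return D{{"$or", keep}}
}

func main() {
	// Illustrative conditions only; the field names are assumptions.
	keep := []D{
		{{"hasvote", true}},
		{{"children.0", D{{"$exists", true}}}},
	}
	fmt.Printf("%v\n", orAsserts(keep))
}
```

With a single combined assertion, the transaction still applies if the reason for keeping the host changes between the check and execution, provided some reason remains.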

cmars · 2014-06-17T03:48:04Z

PTAL

fwereade · 2014-06-17T06:01:36Z

state/unit.go

+		ops = append(ops, txn.Op{
+			C:      u.st.containerRefs.Name,
+			Id:     m.doc.Id,
+			Assert: bson.D{{"children", containers}},


I think I'd be checking for not-empty, rather than exact-match, here.
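
To illustrate why not-empty is the more forgiving assertion, here is a small sketch that evaluates both checks in plain Go rather than in MongoDB. exactMatch and nonEmpty are hypothetical helpers; in query terms, one common way to express "array is not empty" is to require that element 0 exists.

```go
package main

import "fmt"

// exactMatch mimics asserting that the children array is identical to the
// snapshot taken when the ops were built (same elements, same order).
func exactMatch(current, snapshot []string) bool {
	if len(current) != len(snapshot) {
		return false
	}
	for i := range current {
		if current[i] != snapshot[i] {
			return false
		}
	}
	return true
}

// nonEmpty mimics asserting only that at least one container exists.
func nonEmpty(current []string) bool {
	return len(current) > 0
}

func main() {
	snapshot := []string{"1/lxc/0"}
	// Another container is added between building and running the ops.
	current := []string{"1/lxc/0", "1/lxc/1"}

	fmt.Println("exact match holds:", exactMatch(current, snapshot))
	fmt.Println("non-empty holds:  ", nonEmpty(current))
}
```

In both cases the machine should be kept because it still hosts containers, but only the non-empty check lets the transaction proceed without a retry.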

cmars · 2014-06-20T18:47:59Z

Didn't see replies to earlier comments until now.

fwereade · 2014-06-23T08:45:02Z

state/unit.go

+	var keepAsserts []bson.D
+	m, err := u.st.Machine(u.doc.MachineId)
+	if err != nil {
+		if errors.IsNotFound(err) {


This is certainly an indication that something is up -- the machine doc doesn't match the unit. I think this should return txn.ErrTransientFailure -- it should rebuild the ops and try again in the hope of seeing consistent db state.

I'd ask for an explicit test for this situation, but I'm not sure we can construct one.

I tried this, but it caused a failure in CleanupSuite.TestCleanupForceDestroyedMachineWithContainer, cleanup_test.go:232.

cmars · 2014-06-25T23:54:07Z

PTAL

fwereade · 2014-06-27T15:42:50Z

state/unit_test.go

+		result = append(result, tc)
+	}
+	{
+		tc := destroyMachineTestCase{desc: "principal has subordinate", destroyed: true}


Hmm, I think we should probably drop this test case. One reason is that unit destruction should take out the subordinates anyway...

fwereade · 2014-06-27T15:45:56Z

LGTM, thank you for your patience :).

cmars · 2014-06-27T15:58:52Z

$$merge$$

jujubot · 2014-06-27T15:59:46Z

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

jujubot · 2014-06-27T16:40:53Z

Build failed: Tests failed
build url: http://juju-ci.vapour.ws:8080/job/github-merge-juju/279

Either way, assert that the state assumptions for being eligible or not-eligible hold true for the transaction.
Improved state assertion in removeUnitOps
More test cases, renamed destroyHostOps
destroyHostOps fix, allow machine removal on retry
If a unit's host cannot be removed, make that condition the only assertion on the transaction, so that we can re-evaluate the state completely on retry. Examples:
- host container removed after check, before txn exec
- host gives up voting rights after check, before txn exec
Unit tests added for the above, and the converse (host qualifies on check, disqualified before txn exec).
Fix import
Add thrashing state testcase.
Consolidate destroyHostOps machine ops.
$or together all the "keep host alive" assertions, so that a transaction can succeed if the individual "keep" criteria change in a transaction but the outcome remains the same.
Improve destroyHostOps test cases.
Test coverage for removing an unassigned unit.
Use assertLife helper function.
More descriptive test names.
More assertion checks.
Improve non-empty containers assertion. MongoDB makes it kind of weird. https://stackoverflow.com/questions/14789684/find-mongodb-records-where-array-field-is-not-empty-using-mongoose
Clean up tests, log removal of unassigned unit.
Removed test cruft, .Refresh() from txn hooks.
Colocated unit set up outside of the transaction hook in TestRemoveUnitMachineRetryOrCond.
Remove TestDestroyCleanRelations.
Check unit destroyed in TestRemoveUnitMachineRetryOrCond
Consolidate & improve destroyHostOps assert conds.
Refactor out unit test suite method setMachineVote.
More comments.
Remove subordinate unit test case.
Fixed service removal test by calling State.Cleanup.

cmars · 2014-06-27T20:21:50Z

$$merge$$

jujubot · 2014-06-27T20:22:46Z

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

jujubot · 2014-06-27T21:18:33Z

Build failed: Tests failed
build url: http://juju-ci.vapour.ws:8080/job/github-merge-juju/281

cmars · 2014-06-27T21:26:40Z

Rolling again...

cmars · 2014-06-27T21:26:48Z

$$merge$$

jujubot · 2014-06-27T21:27:46Z

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

jujubot · 2014-06-27T22:24:18Z

Build failed: Tests failed
build url: http://juju-ci.vapour.ws:8080/job/github-merge-juju/282

cmars · 2014-06-27T22:51:40Z

https://i.imgur.com/2SuBHe3.jpg

cmars · 2014-06-27T22:51:46Z

$$merge$$

jujubot · 2014-06-27T22:52:46Z

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

Casey Marshall added 2 commits June 27, 2014 15:16

Updated wrt state/txn refactoring.

25fec83

Fixed service removal test by calling State.Cleanup.

jujubot merged commit 6bc78e7 into juju:master Jun 27, 2014
