Units which are unassigned to a machine due to error can be removed #6906
+ if isAssigned {
+ 	statusOp.Assert = bson.D{{"status", status.Allocating}}
+ } else {
+ 	statusOp.Assert = bson.D{{"$or", []bson.D{{{"status", status.Allocating}}, {{"status", status.Error}}}}}
axw
Feb 3, 2017
Member
I don't think this is good enough. A unit could be concurrently assigned to a machine when it's in error state (esp. once the assigner retries automatically).
We already do multi-collection assertions (statusOp, minUnitsOp, and isAliveDoc all assert on different collections). Please check machineid is empty when status is error.
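The condition requested in this review can be sketched as a small pure function. This is a minimal in-memory model, not Juju's transaction code: `unitDoc`, its fields, the status strings, and `canShortCircuitRemove` are hypothetical names chosen for illustration. In the real change the same condition would have to be expressed as assertions inside the transaction, so that it is evaluated against the stored documents at commit time rather than against an earlier in-memory snapshot.

```go
package main

import "fmt"

// Status values assumed for illustration; only the two named in the diff are modelled.
const (
	statusAllocating = "allocating"
	statusError      = "error"
)

// unitDoc is a hypothetical stand-in for the stored state of a unit.
type unitDoc struct {
	AgentStatus string
	MachineID   string // empty when the unit is not assigned to a machine
}

// canShortCircuitRemove reports whether the unit may be removed immediately.
// An allocating unit qualifies; a unit in error qualifies only while it has
// no machine, which is the extra check asked for in the review.
func canShortCircuitRemove(u unitDoc) bool {
	switch u.AgentStatus {
	case statusAllocating:
		return true
	case statusError:
		return u.MachineID == ""
	default:
		return false
	}
}

func main() {
	// Snapshot read before the removal is attempted: in error, unassigned.
	stale := unitDoc{AgentStatus: statusError}
	// State after a concurrent assignment succeeded: still in error, now on machine "0".
	fresh := unitDoc{AgentStatus: statusError, MachineID: "0"}

	fmt.Println(canShortCircuitRemove(stale)) // true
	fmt.Println(canShortCircuitRemove(fresh)) // false
}
```

Checking the status alone would accept both documents above; adding the machine id check rejects the second one, which is the concurrent-assignment case.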
@@ -774,11 +779,46 @@ func (s *UnitSuite) TestShortCircuitDestroyUnit(c *gc.C) {
 	assertRemoved(c, s.unit)
 }
+func (s *UnitSuite) TestShortCircuitDestroyUnitNotAssigned(c *gc.C) {
axw
Feb 3, 2017
Member
Please add a test that concurrently assigns the unit to a machine while the unit is in error state.
$$merge$$
Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju
Build failed: Tests failed
$$merge$$
Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju
Build failed: Tests failed
$$merge$$
Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju
wallyworld commented Feb 2, 2017
Description of change
Assigning a unit to a machine can fail. The error is recorded against the agent, but it then prevents juju remove-unit from actually removing the unit, and model cleanup fails for the same reason. This PR fixes both. Many tests incorrectly set up units that had a non-allocating status but were not assigned to machines. That state is now reported as an error, so those tests were changed.
We also tighten up SetAgentStatus(). The status of a principal unit cannot be set to idle, failed, etc. unless the unit is already assigned to a machine, and it cannot be set to Allocating if the unit is already assigned.
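The two SetAgentStatus() rules can be sketched as a standalone validation function. This is a minimal sketch under stated assumptions, not the code in this PR: `validateAgentStatus`, its parameters, the status strings, and the error messages are all hypothetical. The description names only "idle, failed etc", so the set of statuses that require a machine is limited here to those two, and the rest is left open.

```go
package main

import (
	"errors"
	"fmt"
)

// requiresMachine lists the statuses named in the description. The full set
// is not given there, so this map is an assumption for illustration only.
var requiresMachine = map[string]bool{
	"idle":   true,
	"failed": true,
}

// validateAgentStatus is a hypothetical helper that applies the two rules:
//  1. a principal unit must be assigned before it can take a status that
//     requires a machine;
//  2. an assigned unit cannot be set back to allocating.
func validateAgentStatus(newStatus string, isPrincipal, isAssigned bool) error {
	if newStatus == "allocating" {
		if isAssigned {
			return errors.New("cannot set status \"allocating\": unit is already assigned to a machine")
		}
		return nil
	}
	if requiresMachine[newStatus] && isPrincipal && !isAssigned {
		return fmt.Errorf("cannot set status %q: unit is not assigned to a machine", newStatus)
	}
	return nil
}

func main() {
	// Principal unit, not yet assigned: idle is rejected, allocating is accepted.
	fmt.Println(validateAgentStatus("idle", true, false) != nil)       // true
	fmt.Println(validateAgentStatus("allocating", true, false) == nil) // true
	// Principal unit, already assigned: allocating is rejected.
	fmt.Println(validateAgentStatus("allocating", true, true) != nil) // true
}
```

The error status is deliberately left unrestricted in this sketch, since the scenario this PR addresses is an unassigned unit whose agent records an assignment error.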
QA steps
I hacked the code which assigns a unit to return an error. Without this PR, juju remove-unit would not work. With the PR, juju remove-unit works as expected.
Bug reference
https://bugs.launchpad.net/juju/+bug/1643430