Join GitHub today
GitHub is home to over 20 million developers working together to host and review code, manage projects, and build software together.
Add new option to remove-machine to tell the provisioner not to stop the instance #7728
Conversation
wallyworld
requested a review
from
howbazaar
Aug 10, 2017
| +// is determined by the force and keep parameters. | ||
| +// TODO(wallyworld) - for Juju 3.0, this should be the preferred api to use. | ||
| +func (client *Client) DestroyMachinesWithParams(force, keep bool, machines ...string) ([]params.DestroyMachineResult, error) { | ||
| + if !keep { |
howbazaar
Aug 13, 2017
Owner
Why bother keeping this block? Since we need to handle keep and !keep in the call below, this block seems pointless.
| + Keep: keep, | ||
| + MachineTags: make([]string, 0, len(machines)), | ||
| + } | ||
| + allResults := make([]params.DestroyMachineResult, len(machines)) |
howbazaar
Aug 13, 2017
Owner
Given that the API server needs to do these checks anyway, why duplicate them and make the call site more complicated, on the client side?
wallyworld
Aug 14, 2017
Owner
Was being consistent with existing code in this facade. The original thinking would have been to fail early if possible, avoiding an unnecessary api call if we know up front the tags are invalid. In fact, these sorts of checks are done in many other facades also. It's not uncommon to duplicate checks like this in different layers. We'll often also perform repeated checks in state and apiserver layers for example. Or CLI and api etc.
| -func (mm *MachineManagerAPI) destroyMachine(args params.Entities, force bool) (params.DestroyMachineResults, error) { | ||
| +// DestroyMachineWithParams removes a set of machines from the model. | ||
| +func (mm *MachineManagerAPIV4) DestroyMachineWithParams(args params.DestroyMachinesParams) (params.DestroyMachineResults, error) { | ||
| + logger.Criticalf("%+v", args) |
| @@ -198,6 +229,12 @@ func (mm *MachineManagerAPI) destroyMachine(args params.Entities, force bool) (p | ||
| if err != nil { | ||
| return nil, err | ||
| } | ||
| + if keep { | ||
| + logger.Infof("destroy machine %v but keep instance", machineTag.Id()) | ||
| + if err := machine.SetKeepInstance(keep); err != nil { |
wallyworld
Aug 14, 2017
Owner
This is one potential call site, albeit the only one right now. However, why hard code an API in a lower layer just because right now we only set the value to true?
| + }, | ||
| + }}, | ||
| + }) | ||
| +} |
howbazaar
Aug 13, 2017
Owner
It would be good if you could ask the mock machine if keep instance was set to true.
| @@ -817,6 +817,37 @@ func (s *withoutControllerSuite) TestSeries(c *gc.C) { | ||
| }) | ||
| } | ||
| +func (s *withoutControllerSuite) TestKeepInstance(c *gc.C) { | ||
| + // Add a machine with keep-instance = true. | ||
| + foobarMachine, err := s.State.AddMachine("foobar", state.JobHostUnits) |
| if err != nil { | ||
| return nil, err | ||
| } | ||
| - if root.BestFacadeVersion("MachineManager") >= 3 { | ||
| + if root.BestFacadeVersion("MachineManager") < 4 && c.KeepInstance { | ||
| + return nil, errors.New("this version of Juju doesn't support --keep-instance") |
howbazaar
Aug 13, 2017
Owner
Why have this knowledge in two places? You already return an error from the client facade.
wallyworld
Aug 14, 2017
Owner
I want to fail early before making an api call. Also, there will be a generic "Not Implemented" return from the client facade, but that could be for other reasons as well. Here is where we have the knowledge of what's been asked for and what's supported. This pattern is used in several other CLI commands as well. See other comment below also.
| - if c.Force { | ||
| - destroy = client.ForceDestroyMachines | ||
| + var results []params.DestroyMachineResult | ||
| + if c.KeepInstance { |
howbazaar
Aug 13, 2017
Owner
Why encode all this here? You have already done this in the client and server code.
The whole point of encoding that in the api client is so you can just call DestroyMachinesWithParams on client for all.
wallyworld
Aug 14, 2017
Owner
The issue is that there are 2 facades that could be used - api/client and api/machinemanager. Only quite recent versions of Juju have the machinemanager api. So it's easier to use the knowledge that if keep is specified and we have the relevant machinemanager facade version available, call that directly. Otherwise we need to fall back to using the removeMachineAdapter which knows how to translate the differing return results from the api/client and api/machinemanager implementations.
| + | ||
| +// KeepInstance reports whether a machine, when removed from | ||
| +// Juju, will cause the corresponding cloud instance to be stopped. | ||
| +func (m *Machine) KeepInstance() (bool, error) { |
howbazaar
Aug 13, 2017
Owner
I don't see why this method should return an error?
Why does it not just return the value in the doc?
wallyworld
Aug 14, 2017
Owner
There isn't a value in the machine doc. It's in the machine instance doc (along with instance id, hardware info etc).
|
$$merge$$ |
|
Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju |
wallyworld commentedAug 10, 2017
Description of change
Sometimes MAAS will fail a machine deployment after already telling Juju it will succeed. This can result in Juju having 2 machines in the model with the same instance id (one running, one not). Removing the bad machine in Juju then kills the good machine in MAAS.
Without significant MAAS/Juju changes, this issue cannot be fully fixed. Instead this PR adds some tooling to make things easier to deal with:
When --keep-instance is used, the machine is removed from Juju but the provisioner will not call StopInstance() in the cloud, hence the cloud machine will be left running. This allows a failed machine which happens to have a duplicate instance id to be removed from Juju without stopping a cloud instance belonging to a different Juju machine.
QA steps
bootstrap
add some machines (with and without units)
check the machine tags to see they include the new machine id tag
on an empty machine:
juju remove-machine --keep-instance
(verify that the machine is removed from juju but not killed in the cloud)
on a machine with units:
juju remove-machine --keep-instance --force
(verify that the machine is removed from juju but not killed in the cloud)
bootstrap juju 2.2
add a machine
juju remove-machine --keep-instance
-> error that juju doesn't support --keep-instance
Documentation changes
The new --keep-instance argument to remove-machine needs to be documented.
Bug reference
https://bugs.launchpad.net/juju/+bug/1671588