Vagrant provider repeatedly errors on formation if node dir is deleted #346

mboersma · 2013-11-30T17:58:11Z

Needs to be more robust in some error cases such as this one:

1. Provision a controller but somehow forget to add deis-controller to the admins group, despite all the documentation and fuchsia-colored warnings at the command line.
2. Create a formation and scale it upward, e.g. deis nodes:scale form1 runtime=2
3. Try to scale down the formation and get an appropriate error about "couldn't remove chef node".
4. All subsequent formation commands (including destroy!) will fail when trying to access the local vagrant node dir, which apparently was removed in step 3.

This shouldn't happen often, but it can, and I think ignoring this error, at least in the case of deis formations:destroy, would provide a way out of this dead end.


tombh · 2013-12-01T09:54:35Z

Just to see if I understand this right: if scaling nodes downward fails, then deis aborts and retains the nodes' records in the DB, but the vagrant provider goes ahead and removes its (file-based) records. This leads to a discrepancy where deis thinks the nodes still exist (and so expects the node dirs to exist), but the vagrant provider doesn't think the nodes exist because it has deleted the node dirs.

Does that about sum it up?

I see the PR for catching the error, and I left a comment on it. If that PR works then I think it's good enough. Another approach I can think of is not running destroy_node() if the Chef client purge fails. Oh, but how did it succeed in creating a node in the first place? Are there different permissions for creating and deleting? Yet another approach could be running a simple check through _host_ssh() to see if the Vagrantfile in the specified node_dir exists before deleting it.
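To make the last idea concrete, here is a rough sketch of the "check for the Vagrantfile first" guard. It assumes the node dir is reachable on the local filesystem, whereas the real provider would have to run the equivalent check on the host through _host_ssh(). The function name remove_node_dir_if_present is hypothetical and not taken from the deis code.

```python
import os
import shutil


def remove_node_dir_if_present(node_dir):
    """Delete node_dir only if it still contains a Vagrantfile.

    Hypothetical helper: returns True if the directory was removed and
    False if there was nothing to remove, so the caller can log and carry on
    instead of failing on a missing directory.
    """
    vagrantfile = os.path.join(node_dir, "Vagrantfile")
    if not os.path.isfile(vagrantfile):
        # The node dir (or its Vagrantfile) is already gone; nothing to do.
        return False
    shutil.rmtree(node_dir)
    return True
```

With a guard like this, a second destroy attempt on an already-removed node dir becomes a no-op rather than an error.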

mboersma · 2013-12-01T17:03:09Z

Yes, your summary is correct. Since we keep raising a RuntimeError from destroy_node(), the Django db model is never deleted. This PR just logs the RuntimeError instead of raising it in that case, allowing the model to be deleted. We took a similar approach with Chef nodes (a 404 on destroy is just a warning), so I think this is consistent.
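Roughly, the log-instead-of-raise pattern looks like the sketch below. This is not the code from the PR: FakeNode stands in for the Django model, and destroy_node_record and missing_node_dir are hypothetical names used only for illustration.

```python
import logging

logger = logging.getLogger(__name__)


class FakeNode:
    """Hypothetical stand-in for the Django node model."""

    def __init__(self, node_id):
        self.id = node_id
        self.deleted = False

    def delete(self):
        # The real model would remove its database row here.
        self.deleted = True


def missing_node_dir(node):
    """Simulates a provider teardown that fails because the node dir is gone."""
    raise RuntimeError("node dir for {} no longer exists".format(node.id))


def destroy_node_record(node, provider_destroy):
    """Attempt the provider teardown, then delete the record regardless."""
    try:
        provider_destroy(node)
    except RuntimeError as err:
        # Downgrade the failure to a warning so the record can still be deleted.
        logger.warning("could not destroy node %s: %s", node.id, err)
    node.delete()
```

The trade-off is that a real teardown failure can leave a VM behind, but the database no longer points at a node dir that does not exist.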


ghost assigned mboersma Nov 30, 2013

gabrtv closed this as completed in 4e2083f Dec 2, 2013

tombh added a commit to tombh/deis that referenced this issue Dec 18, 2013

When checking the error type from a failed vagrant destruction, make comparison case insensitive - for BASH and ZSH support. deis#346

0d500e6

tombh added a commit to tombh/deis that referenced this issue Dec 18, 2013

When checking the error type from a failed vagrant destruction, make comparison case insensitive - for BASH and ZSH support. deis#346

f581767
