This repository has been archived by the owner on Nov 30, 2021. It is now read-only.

Vagrant provider repeatedly errors on formation if node dir is deleted #346

Closed
mboersma opened this issue Nov 30, 2013 · 2 comments

@mboersma
Member

Needs to be more robust in some error cases such as this one:

  1. Provision a controller but somehow forget to add deis-controller to the admins group, despite all documentation and fuchsia-colored warnings at the command line
  2. Create a formation and scale it upward, e.g. deis nodes:scale form1 runtime=2
  3. Try to scale down the formation and get an appropriate error about "couldn't remove chef node"
  4. All subsequent formation commands, including destroy, will fail when trying to access the local vagrant node dir, which apparently was removed in step 3.

This shouldn't happen often, but it can, and I think ignoring this error, at least in the case of deis formations:destroy, would provide a way out of this dead end.

@ghost ghost assigned mboersma Nov 30, 2013
@tombh
Contributor

tombh commented Dec 1, 2013

Just to see if I understand this right: if scaling nodes downward fails, then deis aborts and retains the nodes' records in the DB, but the vagrant provider goes ahead and removes its (file-based) records. This leads to a discrepancy where deis thinks the nodes still exist (and so expects the node dirs to exist), but the vagrant provider doesn't think the nodes exist because it has deleted the node dirs.

Does that about sum it up?

I see the PR for catching the error, and I left a comment on it. If that PR works then I think it's good enough. Another approach I can think of is not running destroy_node() if the Chef client purge fails. Oh, but how did it succeed in creating a node in the first place? Are there different permissions for creating and deleting? Yet another approach could be running a simple check through _host_ssh() to see if the Vagrantfile in the specified node_dir exists before deleting it.
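
As a rough illustration of that last idea, here is a minimal sketch. It assumes the check can be done on the local filesystem; in the real provider it would presumably go through _host_ssh(), whose signature isn't shown in this thread. The names vagrantfile_exists and destroy_if_present are hypothetical and not part of the deis codebase.

```python
from pathlib import Path


def vagrantfile_exists(node_dir):
    """Return True if node_dir contains a Vagrantfile (local check only)."""
    return (Path(node_dir) / "Vagrantfile").is_file()


def destroy_if_present(node_dir, destroy):
    """Call destroy(node_dir) only when the Vagrantfile is still there.

    `destroy` is a hypothetical callable standing in for whatever the
    provider uses to tear down the VM and remove the node dir.
    Returns True if destroy was called, False if it was skipped.
    """
    if not vagrantfile_exists(node_dir):
        # Nothing left to tear down, so skip instead of failing.
        return False
    destroy(node_dir)
    return True
```

With a guard like this, a node dir that has already been removed is treated as "nothing to do" rather than as an error.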

@mboersma
Member Author

mboersma commented Dec 1, 2013

Yes, your summary is correct. Since we keep raising a RuntimeError from destroy_node(), the Django db model is never deleted. This PR just logs the RuntimeError instead of raising it in that case, allowing the model to be deleted. We took a similar approach with Chef nodes, where a 404 on destroy is just a warning, so I think this is consistent.
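
For anyone following along, the pattern looks roughly like the sketch below. This is not the code from the PR. The function destroy_node_record and its provider_destroy and delete_model parameters are hypothetical stand-ins for the provider's destroy_node() and the Django model's delete step.

```python
import logging

logger = logging.getLogger(__name__)


def destroy_node_record(node, provider_destroy, delete_model):
    """Destroy a node at the provider, then delete its database record.

    Assumption: the provider signals a missing node dir by raising
    RuntimeError. That error is logged as a warning instead of being
    re-raised, so the record is still deleted.
    """
    try:
        provider_destroy(node)
    except RuntimeError as err:
        # Log and continue: the node is already gone on the provider side.
        logger.warning("provider destroy failed for %s: %s", node, err)
    # Reached on success or after a logged RuntimeError.
    delete_model(node)
```

Only RuntimeError is swallowed here; any other exception still propagates and leaves the record in place.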

@gabrtv gabrtv closed this as completed in 4e2083f Dec 2, 2013
tombh added a commit to tombh/deis that referenced this issue Dec 18, 2013
… comparison case insensitive - for BASH and ZSH support. deis#346