migrations: don't start model workers while a model is importing #6678

Merged
merged 3 commits into from Dec 9, 2016

Conversation

Projects
None yet
4 participants
Member

babbageclunk commented Dec 8, 2016

Fixes https://bugs.launchpad.net/juju/+bug/1646310

Starting the migration master while the model was being imported back
into a controller that it had previously migrated out of caused it to
uninstall itself from the engine (thinking that it was finished
processing the old migration), which blocked any workers that depended
on ifNotMigrating. To fix this we wait until the model MigrationMode
has changed to MigrationModeNone - this means the watcher needs to
signal on any changes to models, rather than just life.

Add a global flag to the collectionWatcher config - this lets the watcher return
ids for any model, rather than just the current one.

The apiserver code to remove state objects from the pool still needs to
watch life specifically. If the model becomes Dead and is then
removed within the event coalescence time then the collection watcher
won't report any event, but a lifecycle watcher will.

QA steps:

  • bootstrap two controllers A and B
  • create a model m in A with an application deployed
  • migrate it to B
  • migrate it back to A
  • check the debug log has no dependency engine messages saying "fortress operation aborted" - these show the workers that are prevented from running because the migration master has exited
  • migrate it back to B - this would have failed before this fix, because the previous migration left the model workers in a bad state

babbageclunk added some commits Dec 8, 2016

Make WatchModels a collection watcher
The model worker manager was starting the model workers too early when a
model was migrated into the controller. To fix this we want it to wait
until the model MigrationMode has changed to MigrationModeNone, so the
watcher needs to signal on any changes to models, rather than just
life. Making it a collection watcher does this.

The apiserver code to remove state objects from the pool still needs to
watch life specifically, because if the model becomes Dead and is then
removed within the event coalescence time then the collection watcher
won't report any event, but a lifecycle watcher will.
Defer starting model workers when importing
Starting the migration master while the model was being imported back
into a controller that it had previously migrated out of caused it to
uninstall itself from the engine (thinking that it was finished
processing the old migration), which blocked any workers that depended
on ifNotMigrating.

Fixes https://bugs.launchpad.net/juju/+bug/1646310
Member

babbageclunk commented Dec 8, 2016

Added a bug for including model dependency engines in the introspection report - this would have made finding the problem a bit easier. https://bugs.launchpad.net/juju/+bug/1648539

+1

Changes make sense to me.

@babbageclunk babbageclunk changed the base branch from develop to 2.1 Dec 8, 2016

state/watcher.go
+func (w *collectionWatcher) convertId(id string) (string, error) {
+ // Strip off the env UUID prefix.
+ if w.colWCfg.global {
+ id = w.st.localID(id)
@howbazaar

howbazaar Dec 9, 2016

Owner

Don't we just want to leave the id alone if the config is global?

@babbageclunk

babbageclunk Dec 9, 2016

Member

Yeah, after talking about it I think you're right. Changing that.

Don't strip off model uuid for global watchers
If the collection it's watching is global, then the ids won't be
prefixed with the model uuid anyway - it works for the models collection
because it doesn't do anything if the prefix isn't there. If it's not a
global collection, then the model uuid will be needed to make sense of
the id that comes back.
Member

babbageclunk commented Dec 9, 2016

$$merge$$

Contributor

jujubot commented Dec 9, 2016

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

Member

babbageclunk commented Dec 9, 2016

!!chittychitty!!

@jujubot jujubot merged commit 0aa21b7 into juju:2.1 Dec 9, 2016

1 check failed

github-check-merge-juju Built PR, ran unit tests, and tested LXD deploy. Use !!.*!! to request another build. IE, !!build!!, !!retry!!
Details

@babbageclunk babbageclunk deleted the babbageclunk:migration-bounce branch Dec 9, 2016

jujubot added a commit that referenced this pull request Dec 14, 2016

Merge pull request #6700 from babbageclunk/migration-bounce
migrations: don't start model workers while a model is importing

Fixes https://bugs.launchpad.net/juju/+bug/1646310

(Recreated for develop branch from #6678.)

Starting the migration master while the model was being imported back
into a controller that it had previously migrated out of caused it to
uninstall itself from the engine (thinking that it was finished
processing the old migration), which blocked any workers that depended
on `ifNotMigrating`. To fix this we wait until the model `MigrationMode`
has changed to `MigrationModeNone` - this means the watcher needs to
signal on any changes to models, rather than just life.

Add a global flag to the `collectionWatcher` config - this lets the watcher return
ids for any model, rather than just the current one.

The apiserver code to remove state objects from the pool still needs to
watch life specifically. If the model becomes `Dead` and is then
removed within the event coalescence time then the collection watcher
won't report any event, but a lifecycle watcher will.

QA steps:
* bootstrap two controllers A and B
* create a model m in A with an application deployed
* migrate it to B
* migrate it back to A
* check the debug log has no dependency engine messages saying "fortress operation aborted" - these show the workers that are prevented from running because the migration master has exited
* migrate it back to B - this would have failed before this fix, because the previous migration left the model workers in a bad state
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment