Honour KVM constraints from bundles #6723

Closed
wants to merge 86 commits into
from


voidspace commented Dec 15, 2016

Update revision of bundlechanges. Fixes bug #1626597.

To QA
Deploy a bundle, with an application with a constraint and a KVM placement directive.
The KVM container created should honour the constraint.

natefinch and others added some commits Dec 2, 2016

Add Ping method to providers
    In order to know if an endpoint is correct when adding a cloud,
    we need to be able to ping that provider to get an idea if it
    even exists.  This is long before bootstrap or even credentials,
    so we need a way to reasonably determine if a cloud of the
    correct type exists at the hostname/ip.  This is different per
    cloud, thus a per-provider endpoint.
Refactor the peergrouper to use clock.Clock, and publish an event over pubsub when the collection of apiservers changes.
Merge pull request #6676 from wallyworld/cmr-relation-publish
Implement the publish method on the remote relations facade

The remote relations facade has the publish method implemented. Also adds a required GetToken() API that the worker uses.

QA:
bootstrap
juju switch controller
juju deploy mysql
juju offer mysql:db local:/u/ian/mysql
juju switch default
juju deploy wordpress
juju expose wordpress
juju relate wordpress local:/u/ian/mysql

profit
Wait for machines on controller kill and destroy commands
`kill-controller` and `destroy-controller` now also wait on machines yet
to be removed as well as the previous behaviour of waiting on models yet
to be destroyed. This fixes a bug in which machines in the controller
model were not given time to complete their removal before the
controller itself was destroyed.

Fixes lp:1642295
Merge pull request #6677 from wallyworld/cmr-delete-code
Delete old cmr code

Delete old cmr code which is no longer used.
Merge pull request #6674 from macgreagoir/destroy-wait-on-machines
Wait for machines on controller kill and destroy commands

`kill-controller` and `destroy-controller` now also wait on machines yet
to be removed as well as the previous behaviour of waiting on models yet
to be destroyed. This fixes a bug in which machines in the controller
model were not given time to complete their removal before the
controller itself was destroyed.

Fixes lp:1642295

QA steps:
 * Best seen in a manual environment, as per the bug
 * `bootstrap`, then `add-machine` to the controller model
 * `destroy-controller` and observe the destroy procedure continue until the machine in the controller model is removed, which will happen after the default model is destroyed
```
$ juju destroy-controller -y manual
Destroying controller
Waiting for hosted model resources to be reclaimed
Waiting on 0 model, 1 machine
Waiting on 0 model, 1 machine
Waiting on 0 model, 1 machine
All hosted models reclaimed, cleaning up controller machines
```
 * Repeat testing `kill-controller`
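The wait behaviour described above can be sketched as a polling loop that only finishes once both counts reach zero. This is a minimal illustration, not the juju implementation: `remaining`, `waitForReclaim`, and the simulated `poll` function are hypothetical names, and a real command would sleep between polls.

```go
package main

import "fmt"

// remaining is a hypothetical summary of what the controller still hosts.
type remaining struct {
	Models, Machines int
}

// waitForReclaim polls until no models and no machines remain, reporting
// progress each time. It gives up after maxPolls attempts.
func waitForReclaim(poll func() remaining, report func(string), maxPolls int) bool {
	for i := 0; i < maxPolls; i++ {
		r := poll()
		if r.Models == 0 && r.Machines == 0 {
			return true
		}
		report(fmt.Sprintf("Waiting on %d model, %d machine", r.Models, r.Machines))
	}
	return false
}

func main() {
	// Simulated controller: the machine disappears on the third poll.
	states := []remaining{{0, 1}, {0, 1}, {0, 0}}
	next := 0
	poll := func() remaining {
		r := states[next]
		if next < len(states)-1 {
			next++
		}
		return r
	}
	done := waitForReclaim(poll, func(msg string) { fmt.Println(msg) }, 10)
	fmt.Println("reclaimed:", done)
}
```

Checking machines as well as models in the exit condition is the point of the change: with models alone, the loop above would have returned on the first poll.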
Merge pull request #6669 from howbazaar/centralhub-4-peergrouper
Have the peergrouper publish the apiservers over the central hub.

Also refactors the peergrouper to use clock.Clock interface rather than time directly.
Merge pull request #6670 from natefinch/no-pkg-errors
Stop using pkg/errors in juju code

goimports betrayed me

QA: run go test in cmd/juju/cloud
Merge pull request #6681 from macgreagoir/wait-controller-machines-race
Fix race in kill suite controller machines test

Fixes lp:1648749

QA steps:
 * Data race tests should pass
Merge pull request #6680 from anastasiamac/next-version
Change version to 2.2.0-alpha1.

What is source-tag in snapcraft.yaml? Should it be changed to 2.2... something as well?
Merge pull request #6621 from natefinch/fix-1641970
Add Ping method to providers

In order to know if an endpoint is correct when adding a cloud,
we need to be able to ping that provider to get an idea if it
even exists.  This is long before bootstrap or even credentials,
so we need a way to reasonably determine if a cloud of the
correct type exists at the hostname/ip.  This is different per
cloud, thus a per-provider endpoint.

This fixes https://bugs.launchpad.net/juju/+bug/1641970
Merge pull request #6688 from howbazaar/migration-restricted-apiroot
Restrict the api calls available to users during a model's migration.

Forward port of 2.1 branch.
Merge pull request #6691 from howbazaar/block-controller-upgrade-with-active-migration

Block controller upgrade with active migration.

Block the controller from upgrading if it is in the process of importing or exporting another model.

Forward port of 2.1 commit.
Merge pull request #6689 from howbazaar/enable-delete-importing-model
Add a test to show deletion of importing models.

It was incorrectly assumed that one would not be able to delete a model that is being imported.

Added a test to show that this can be done.
component/all: Machine downloads of unit resources
For model migrations, machine agents (specifically the migrationmaster
worker) need to be able to download unit resources.
api/migrationmaster: Client API for unit resources
Added OpenUnitResource to support downloads of unit-level resources.
apiserver/migrationmaster: Report unit resources
As well as returning application level resource revisions, Export now
returns resource revisions used by units.
api/migrationmaster: Report unit revisions
Export now reports unit resource revisions.
State functionality for setting unit resources
Expose SetUnitResource so that unit resources can be set
directly. Previously the only way was to use OpenResourceForUniter which
only clones the application's resource to the unit.
resource/resourceadapter: Use unit in URL
Instead of using the logged in entity to determine the unit, use the
actual unit in the URL. This is required now that we have the
migrationmaster worker in the machine agent downloading unit resources.
apiserver: Support unit resource uploads
Extended the model migration resource upload endpoint to also accept
resource uploads for individual units.
migration: Migrate unit resources
UploadBinaries now handles unit resources in addition to application
resources.
Remove machine access to unit resource endpoint
It turns out there's no need to pull unit resource binaries.
state: Remove unneeded unit resource setting code
... because it's not necessary to supply unit resource binaries.
api/migrationmaster: Remove OpenUnitResource
There's no need to download per-unit resource binaries.
apiserver: No content for unit resources
There's no need to upload the content for a unit resource. Only the
metadata is needed.
Merge pull request #6693 from mjs/MM-unit-resources-import-develop
Support migration of unit resources

(NOTE: this was already reviewed when it merged into the 2.1 branch)

Units may have different resource revisions to the application and as such need to be migrated separately.

This PR adds API support for unit resources and updates the binaries migration functionality to support them.

### QA

- Migrate a resource-using model and confirm the resources collection matches up such that the unit resource is recorded. Also checked gridfs and blobstore metadata.
- Migrate a resource-using model where the application resource is different to a unit resource and confirm the DB is correct post-migration. Run add-unit on the application.
Merge pull request #6694 from frankban/safe-rename
Ensure the GUI uncompressed dir can be safely renamed.

The src and target directories must be in the same file system.
This fixes juju/juju-gui#2288
Fixes lp#1597830: worker should not restart agent.
Xenial machines with units would hang when trying to convert to state servers under HA after new revisions of systemd and dbus were introduced.

It was discovered that the conv2state worker would explicitly restart an agent. This proposal changes the behavior to throw an error instead to ensure that proper infrastructure restarts the agent cleanly.
Merge pull request #6696 from anastasiamac/develop-con2state-lp1597830
Fixes lp#1597830 on develop: worker should not restart agent.

Xenial machines with units would hang when trying to convert to state servers under HA after new revisions of systemd and dbus were introduced.

It was discovered that the conv2state worker would explicitly restart an agent. This proposal changes the behavior to throw an error instead to ensure that proper infrastructure restarts the agent cleanly.
Make WatchModels a collection watcher
The model worker manager was starting the model workers too early when a
model was migrated into the controller. To fix this we want it to wait
until the model MigrationMode has changed to MigrationModeNone, so the
watcher needs to signal on any changes to models, rather than just
life. Making it a collection watcher does this.

The apiserver code to remove state objects from the pool still needs to
watch life specifically, because if the model becomes Dead and is then
removed within the event coalescence time then the collection watcher
won't report any event, but a lifecycle watcher will.
Defer starting model workers when importing
Starting the migration master while the model was being imported back
into a controller that it had previously migrated out of caused it to
uninstall itself from the engine (thinking that it was finished
processing the old migration), which blocked any workers that depended
on ifNotMigrating.

Fixes https://bugs.launchpad.net/juju/+bug/1646310
Don't strip off model uuid for global watchers
If the collection it's watching is global, then the ids won't be
prefixed with the model uuid anyway - it works for the models collection
because it doesn't do anything if the prefix isn't there. If it's not a
global collection, then the model uuid will be needed to make sense of
the id that comes back.
Merge pull request #6700 from babbageclunk/migration-bounce
migrations: don't start model workers while a model is importing

Fixes https://bugs.launchpad.net/juju/+bug/1646310

(Recreated for develop branch from #6678.)

Starting the migration master while the model was being imported back
into a controller that it had previously migrated out of caused it to
uninstall itself from the engine (thinking that it was finished
processing the old migration), which blocked any workers that depended
on `ifNotMigrating`. To fix this we wait until the model `MigrationMode`
has changed to `MigrationModeNone` - this means the watcher needs to
signal on any changes to models, rather than just life.

Add a global flag to the `collectionWatcher` config - this lets the watcher return
ids for any model, rather than just the current one.

The apiserver code to remove state objects from the pool still needs to
watch life specifically. If the model becomes `Dead` and is then
removed within the event coalescence time then the collection watcher
won't report any event, but a lifecycle watcher will.

QA steps:
* bootstrap two controllers A and B
* create a model m in A with an application deployed
* migrate it to B
* migrate it back to A
* check the debug log has no dependency engine messages saying "fortress operation aborted" - these show the workers that are prevented from running because the migration master has exited
* migrate it back to B - this would have failed before this fix, because the previous migration left the model workers in a bad state
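The deferral described above can be sketched as a handler for model change events that skips models which are still importing. All names here (`modelDoc`, `manager`, `handleChange`, and the mode constants) are hypothetical simplifications, not the model worker manager's real types.

```go
package main

import "fmt"

// migrationMode values here are illustrative stand-ins.
type migrationMode string

const (
	migrationModeNone      migrationMode = ""
	migrationModeImporting migrationMode = "importing"
)

// modelDoc is a hypothetical snapshot of a model delivered with each change.
type modelDoc struct {
	UUID  string
	Alive bool
	Mode  migrationMode
}

// manager tracks which models have had their workers started.
type manager struct {
	running map[string]bool
}

// handleChange starts workers for a model unless it is dead, already
// running, or still being imported. Because the watcher signals on any
// change, the model is seen again once its mode changes.
func (m *manager) handleChange(doc modelDoc) {
	if m.running[doc.UUID] || !doc.Alive {
		return
	}
	if doc.Mode == migrationModeImporting {
		return // wait for a later change event
	}
	m.running[doc.UUID] = true
}

func main() {
	m := &manager{running: make(map[string]bool)}
	m.handleChange(modelDoc{UUID: "m", Alive: true, Mode: migrationModeImporting})
	fmt.Println("started while importing:", m.running["m"])
	m.handleChange(modelDoc{UUID: "m", Alive: true, Mode: migrationModeNone})
	fmt.Println("started after import:", m.running["m"])
}
```

A watcher that fired only on life changes would never deliver the second event, which is why the description moves to a collection watcher.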
Merge pull request #6701 from howbazaar/centralhub-4.1-manifold
Add a manifold to provide the central hub as a worker resource.

This manifold can provide the hub to other workers.

When the apiserver and the peergrouper use the dependency engine, the centralhub worker should create the central hub, but for now, we need to create it in the machine agent and pass it in through the manifold config.

## QA

juju bootstrap lxd test
juju debug-log -m controller --replay | less

confirm that the "central-hub" worker has started and not stopped.
Ensure resource HTTP handlers release state
Thread the call to httpContext.release through so it can be called from
the right place in the resource HTTP handler.
Close state leaks from the NewResourceOpener code
This code path is a bit trickier to close than the other handler.
apiserver: Uploads of placeholder resources
Placeholder resources are those which haven't been downloaded by any
unit yet. The timestamp field is used to determine whether a resource is
a placeholder.
api/migrationtarget: Placeholder resource support
Added a new SetPlaceholderResource method which calls the
/migrate/resource endpoint with the parameters set for a placeholder
resource.

Also cleaned up tests for the resource upload endpoints.
migration: Support placeholder resources
Placeholder resources now use the specific client side API for handling
them.
worker/migrationmaster: Add SetPlaceholderResource
... to uploadWrapper. The ResourceUploader interface was extended to
include this.
Merge pull request #6703 from babbageclunk/resource-state-leak
Ensure resource HTTP handlers release state

The resource HTTP handler didn't release the state back into the state 
pool - this would mean that it would stick around in the pool even after 
the model was migrated away, preventing the same model from being 
migrated back. (It would also leak the state if the model was destroyed.)

Thread the call to httpContext.release through so it can be called from
the right place in the resource HTTP handler.

This is the last piece of fixing https://bugs.launchpad.net/juju/+bug/1641824

QA steps:
* bootstrap 2 controllers A and B
* create model m in A
* download the Mattermost binary from https://releases.mattermost.com/3.5.1/mattermost-team-3.5.1-linux-amd64.tar.gz
* deploy Mattermost using that binary distribution: `juju deploy -m A:m cs:~cmars/mattermost --resource bdist=<path to binary distribution>`
* migrate the model to B: `juju migrate -c A m B`
* then migrate it back: `juju migrate -c B m A`
* the migration should be successful
Merge pull request #6706 from mjs/MM-placeholder-resources-develop
migrations: Support placeholder resources

Placeholder resources are those which are defined by a charm but have not yet been requested by any unit and are therefore not stored in the controller yet. They need special handling because they have no content and can't be opened by OpenResource.

This is a forward port of #6692.
Merge pull request #6707 from wallyworld/cmr-remove-appid-hack
Fix application id hack in remote relations worker

Fixes a TODO in remote relations worker. The RegisterRelation call now returns the exported application id of the offered application, which is imported into the consuming model. The remote relations facade needed to gain the ImportRemoteEntity API.

The worker in the consuming model was also incorrectly exporting the remote application, not the local one; the worker has been restructured to properly register the remote relation at the right time in the lifecycle.

Finally, the export mutex is no longer needed so was deleted.

QA: bootstrap and set up a relation between wordpress and mysql in different models
state: Expose listing of pending resources
This is required for a migrations precheck.
migration: Block migration if resources pending
A resource is pending if it is in the process of being added or
downloaded (i.e. immediately after deploy or being downloaded by a
unit). Migrations shouldn't proceed if this is the case.
cmd/jujud/agent: Registers components in tests
Required to support migrationmaster worker.
Merge pull request #6714 from mjs/MM-pending-resources-develop
Migration precheck for pending resources

Don't allow a migration to proceed if a resource is pending. A resource is pending if it is in the process of being added or downloaded (i.e. immediately after deploy or is being downloaded). The pending state is typically short-lived.

This is a forward port of #6695.
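A precheck of this kind can be sketched as a scan that refuses to proceed while any resource is pending. The `resource` type and `checkNoPendingResources` function are hypothetical; the real precheck reads pending resources from state.

```go
package main

import (
	"fmt"
	"strings"
)

// resource is a hypothetical record carrying only what the check needs.
type resource struct {
	Name    string
	Pending bool
}

// checkNoPendingResources returns an error naming every pending resource,
// or nil when the migration may proceed.
func checkNoPendingResources(resources []resource) error {
	var pending []string
	for _, r := range resources {
		if r.Pending {
			pending = append(pending, r.Name)
		}
	}
	if len(pending) > 0 {
		return fmt.Errorf("resources pending: %s", strings.Join(pending, ", "))
	}
	return nil
}

func main() {
	err := checkNoPendingResources([]resource{{"bdist", true}, {"config", false}})
	fmt.Println(err) // resources pending: bdist
}
```

Since the pending state is typically short-lived, retrying the migration shortly afterwards would normally pass the check.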
state: Add missing tests for resource cleanup
There were no tests around resource blob removal. Also added some test
helpers around the NeedCleanup tests.
Ignore attempts to clean up a placeholder resource
These cleanups shouldn't get scheduled in the first place but this acts
as a backstop. See https://bugs.launchpad.net/juju/+bug/1649179
state: Don't schedule placeholder resource cleanup
Placeholder resources aren't stored in the blobstore and therefore don't
need cleaning up.

Fixes https://bugs.launchpad.net/juju/+bug/1649179
Merge pull request #6716 from mjs/1649179-placeholder-resource-cleanup-develop

Don't schedule placeholder resource cleanup

Placeholder resources aren't stored in the blobstore and therefore don't need cleaning up. The fix has been made in two ways:

1. Cleanups are no longer scheduled for placeholder resources.
2. The resource cleanup code now ignores cleanups without a storage path set. This is a useful backstop but also fixes stray cleanups in existing deployments.

Fixes https://bugs.launchpad.net/juju/+bug/1649179

There was no testing at all for resource cleanup so this has also been rectified.

This is a forward port of #6702.
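The backstop in point 2 can be sketched as a cleanup loop that skips entries with no storage path. The `cleanup` type, `runCleanups`, and the `remove` callback are hypothetical names, not the state package's cleanup code.

```go
package main

import "fmt"

// cleanup is a hypothetical record of a scheduled blob removal.
type cleanup struct {
	ResourceID  string
	StoragePath string
}

// runCleanups removes the blob for each cleanup that has a storage path and
// skips those without one. It returns how many were skipped.
func runCleanups(cleanups []cleanup, remove func(path string) error) (int, error) {
	skipped := 0
	for _, c := range cleanups {
		if c.StoragePath == "" {
			// Placeholder resource: nothing was ever stored.
			skipped++
			continue
		}
		if err := remove(c.StoragePath); err != nil {
			return skipped, fmt.Errorf("removing blob for %s: %v", c.ResourceID, err)
		}
	}
	return skipped, nil
}

func main() {
	var removed []string
	remove := func(path string) error {
		removed = append(removed, path)
		return nil
	}
	skipped, err := runCleanups([]cleanup{
		{ResourceID: "app/bdist", StoragePath: "blobs/abc"},
		{ResourceID: "app/placeholder"},
	}, remove)
	fmt.Println(removed, skipped, err) // [blobs/abc] 1 <nil>
}
```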
machine.AllSpaces() LinkLayerDevicesForSpaces()
These two functions give us the context for knowing what spaces
a container is going to need, and what host devices we will want
to associate with.
Merge pull request #6690 from jameinel/machine-all-spaces
Track Machine Spaces

This provides 2 key functions on Machine that I think will be necessary for our work on changing how containers are getting network devices on their host machine's bridges.

Machine.AllSpaces() combines the spaces that are specified in constraints with the endpoint bindings that are part of the active primary units on the machine.

Machine.LinkLayerDevicesForSpaces() filters the link layer devices we know about for the machine, and sorts them into what space they are associated with. Between the two functions we are able to map the desired spaces for a new container onto the host's devices. This both lets us know if we have a bridge to use, and lets us create network devices only for the right ones.
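To make the mapping concrete, here is a simplified sketch of the two steps using plain slices and maps. `device`, `allSpaces`, and `devicesBySpace` are hypothetical stand-ins; the real methods live on Machine and read from state.

```go
package main

import (
	"fmt"
	"sort"
)

// device is a hypothetical, simplified link layer device record.
type device struct {
	Name  string
	Space string
}

// allSpaces returns the sorted union of the spaces named in constraints and
// the spaces used by endpoint bindings.
func allSpaces(constraintSpaces, bindingSpaces []string) []string {
	seen := make(map[string]bool)
	for _, s := range constraintSpaces {
		seen[s] = true
	}
	for _, s := range bindingSpaces {
		seen[s] = true
	}
	spaces := make([]string, 0, len(seen))
	for s := range seen {
		spaces = append(spaces, s)
	}
	sort.Strings(spaces)
	return spaces
}

// devicesBySpace keeps only devices in the wanted spaces, grouped by space.
func devicesBySpace(devices []device, spaces []string) map[string][]device {
	wanted := make(map[string]bool, len(spaces))
	for _, s := range spaces {
		wanted[s] = true
	}
	grouped := make(map[string][]device)
	for _, d := range devices {
		if wanted[d.Space] {
			grouped[d.Space] = append(grouped[d.Space], d)
		}
	}
	return grouped
}

func main() {
	spaces := allSpaces([]string{"db"}, []string{"db", "web"})
	host := []device{{"br-eth0", "db"}, {"br-eth1", "web"}, {"eth2", "storage"}}
	fmt.Println(spaces) // [db web]
	fmt.Println(devicesBySpace(host, spaces))
}
```

A wanted space with no entry in the grouped result means the host has no device, and so no bridge, for that space.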
migration: Check source controller version
As well as checking the source model version against the target
controller version, the source controller version needs to be checked.

A migration is allowed as long as the major.minor of the target
controller is greater than or equal to the major.minor of the source
controller. This offers the possibility of migration back to the
original controller as long as the versions are close
together (i.e. have compatible migration APIs).
Source controller version migration prechecks
Make the source controller version available to migration prechecks
running on the target controller.
Merge pull request #6719 from mjs/MM-controller-version-precheck-develop
migration: Controller version precheck

As well as checking the source model version against the target controller version, the source controller version needs to be checked.

A migration is allowed as long as the major.minor of the target controller is greater than or equal to the major.minor of the source controller. This offers the possibility of migration back to the original controller as long as the versions are close together (i.e. have compatible migration APIs).

Forward port of #6705.
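The rule stated above reduces to a comparison on major.minor. A minimal sketch, assuming a hypothetical `version` struct and `canMigrate` function rather than juju's own version type:

```go
package main

import "fmt"

// version is a hypothetical major.minor pair; other version fields are
// deliberately ignored by this check.
type version struct {
	Major, Minor int
}

// canMigrate reports whether the target controller's major.minor is greater
// than or equal to the source controller's.
func canMigrate(source, target version) bool {
	if target.Major != source.Major {
		return target.Major > source.Major
	}
	return target.Minor >= source.Minor
}

func main() {
	fmt.Println(canMigrate(version{2, 1}, version{2, 2})) // true
	fmt.Println(canMigrate(version{2, 2}, version{2, 1})) // false
	fmt.Println(canMigrate(version{2, 1}, version{2, 1})) // true
}
```

The equal case is what allows a model to migrate back to its original controller when both controllers share the same major.minor.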

bz2 approved these changes Dec 15, 2016

Dep change is correct.

@voidspace voidspace closed this Dec 15, 2016
