apiserver/uniter: Don't track container-scoped relations to applications not relevant for subordinate unit #7369

Merged
merged 1 commit into from May 23, 2017

Conversation

Projects
None yet
3 participants
Member

babbageclunk commented May 22, 2017

Description of change

If a subordinate application was related to more than one other app, units of the subordinate app wouldn't be cleaned up unless relations to all other apps were removed. This was because the uniter would track life for relations to all of the apps, even though the only relation that should keep this unit alive is the relation to the app that it's deployed alongside. For example, if a log monitoring app is related to mysql and mediawiki so that the units are as follows:

mysql/0
    logging/0
mediawiki/0
    logging/1 

... the logging/1 unit shouldn't track the life of the mysql-logging relation. If the mediawiki-logging relation dies then it should destroy itself.

This problem would also prevent you from removing one of the principal applications - destroying the application would hang waiting for the subordinate units to die. (This is the cause of bug 1655486.)

Change the uniter API so that a unit watches relations for a unit rather than at the application level, and make the WatchUnitRelations method filter out events for container-scoped relations to applications other than the unit's principal.

This change is added to the new v5 of the UniterAPI. There was some confusion in the versioning - the current version was registered as v4 and v5 (new methods would show up in the old version which causes problems for libjuju), there was a v3 in the code not linked to anything. This has been sorted out.

QA steps

For the first bug:

  • Deploy mysql, wordpress and nrpe to a model, and relate mysql and wordpress to nrpe. nrpe units will be created on each of the mysql and wordpress unit machines.
  • Remove the relation between mysql and nrpe.
  • The mysql nrpe unit should be removed, but the wordpress one should remain.

To confirm the second bug is fixed:

  • Deploy mysql, wordpress and ntp to a model. Make sure to specify --series trusty for mysql and ntp to ensure that everything can be linked up.
  • Relate mysql to wordpress, ntp to mysql, ntp to wordpress.
  • Once everything has settled down, remove wordpress - the wordpress unit and its subordinate ntp unit should be cleaned up and the application should be removed successfully.

Bug reference

Fixes https://bugs.launchpad.net/juju/+bug/1686696 and https://bugs.launchpad.net/juju/+bug/1655486

@babbageclunk babbageclunk changed the title from apiserver/uniter: Don't track container-scoped relations to irrelevant applications for a subordinate unit to apiserver/uniter: Don't track container-scoped relations to applications not relevant for a subordinate unit May 22, 2017

@babbageclunk babbageclunk changed the title from apiserver/uniter: Don't track container-scoped relations to applications not relevant for a subordinate unit to apiserver/uniter: Don't track container-scoped relations to applications not relevant for subordinate unit May 22, 2017

+// principalName app. Global relations will be included, but only
+// container-scoped relations for the principal application will be
+// emitted - other container-scoped relations will be filtered out.
+func newSubordinateRelationsWatcher(backend *state.State, subordinateApp *state.Application, principalName string) (
@babbageclunk

babbageclunk May 22, 2017

Member

I implemented this before finding out that the LifecycleWatcher can take a filter function to do this kind of thing. Let me know if this needs to be rewritten using that.

As discussed, need to cater for api compatibility

@@ -239,6 +239,35 @@ func (s *unitSuite) TestWatch(c *gc.C) {
wc.AssertOneChange()
}
+func (s *unitSuite) TestWatchRelations(c *gc.C) {
@wallyworld

wallyworld May 22, 2017

Owner

Maybe this test needs updating - it was copied across when the watching was done at an app level. Maybe there should be a scenario where we ensure the new watcher ignores unwanted relation changes.

@babbageclunk

babbageclunk May 23, 2017

Member

Good call, adding.

api/uniter/unit_test.go
+ c.Assert(err, jc.ErrorIsNil)
+ wc.AssertNoChange()
+
+ // Add another service and relate it to wordpress,
@wallyworld

wallyworld May 22, 2017

Owner

s/service/application
where it makes sense, make any driveby changes service->application

@wallyworld

wallyworld May 22, 2017

Owner

There's one other case that I know of that uses non-trivial non-regexp type filter - WatchRemoteRelations()
It could be argued that that other case should have been done like this, to avoid blocking the watcher queue on db lookups etc. I think this is probably ok, and maybe we should look to port the WatchRemoteRelations() to use this style

+ return errors.Trace(err)
+ }
+ if shouldSend {
+ currentRelations.Add(relation)
@wallyworld

wallyworld May 22, 2017

Owner

we don't need both currentRelations and outputData do we?
we could use just a set, and use set.Values() to get the slice to send

@babbageclunk

babbageclunk May 22, 2017

Member

My concern with doing that was that in a pathological case when there's a problem with the downstream so the outgoing messages aren't getting picked up and the current relations set is getting bigger and bigger, then each time you get one more relation you loop across all of the accumulated set to recreate the output slice, so it becomes O(n^2).

@wallyworld

wallyworld May 23, 2017

Owner

i don't quite follow. currentRelations set and outputData slice are updated/reset in sync; we are just duplicating data.
If there's nothing reading from the other end of the channel, this sending side will block regardless of whether a set or slice is used. So I don't think worrying about accumulating several relations in a set is something that we should be concerned with.

@babbageclunk

babbageclunk May 23, 2017

Member

If there's nothing reading on the out channel then we can still receive more on the input side. Each time around the loop we'd need to loop through the set to collect it into a slice (with .Values()), so if you got 100 different relations in separate events with nothing reading from the output then you'd have collected slices with 5050 elements.
But that's a crazy number of relations, really, so I think you're right - it's not necessary here.

+ if err != nil {
+ return false, errors.Trace(err)
+ }
+ // Not really sure why RelatedEndpoints returns a slice -
@wallyworld

wallyworld May 22, 2017

Owner

The result is > 1 element long, hence they did it as a slice. The could have used a struct.
No need to comment here though.

@babbageclunk

babbageclunk May 22, 2017

Member

Ohh, I think I missed that it could be a peer role - in that case counterpartRole would return peer and both endpoints might be returned. But there'd never be more than 1 result in a provides/requires relation.

Comment removed, anyway.

@@ -208,7 +208,7 @@ func (w *RemoteStateWatcher) loop(unitTag names.UnitTag) (err error) {
requiredEvents++
var seenRelationsChange bool
- relationsw, err := w.service.WatchRelations()
@wallyworld

wallyworld May 22, 2017

Owner

There's updates needed to the unit tests.
Currently, there's a relationsWatcher on mockService but things will need to be reworked so that relationsWatcher is on mockUnit; and the tests will need to be enhanced to check that only the correct units get events etc

@babbageclunk

babbageclunk May 23, 2017

Member

Oops, updated.

I'm not updating the tests to check that the correct unit gets events - it's the server that determines that, so the mock will still behave the same. Or am I missing something?

@wallyworld

wallyworld May 23, 2017

Owner

I think it's ok just to rework the mocks to match the backend

Thank you. Please manually test the various cases raised by the bugs and add cards to the QA lane to get CI tests written for these scenarios.
I still think the watcher doesn't need a set and a slice. Happy to discuss.

+ wc.AssertChange(loggingRel.Tag().Id())
+ wc.AssertNoChange()
+
+ // Adding a subordinate relation to another application doesn't notify this unit.
@wallyworld

wallyworld May 23, 2017

Owner

Can we just also add a subordinate relation that does trigger the watcher? I know it's elsewhere too, but to see the watcher trigger right next to the watcher not triggering is useful

+ return errors.Trace(err)
+ }
+ if shouldSend {
+ currentRelations.Add(relation)
@wallyworld

wallyworld May 22, 2017

Owner

we don't need both currentRelations and outputData do we?
we could use just a set, and use set.Values() to get the slice to send

@babbageclunk

babbageclunk May 22, 2017

Member

My concern with doing that was that in a pathological case when there's a problem with the downstream so the outgoing messages aren't getting picked up and the current relations set is getting bigger and bigger, then each time you get one more relation you loop across all of the accumulated set to recreate the output slice, so it becomes O(n^2).

@wallyworld

wallyworld May 23, 2017

Owner

i don't quite follow. currentRelations set and outputData slice are updated/reset in sync; we are just duplicating data.
If there's nothing reading from the other end of the channel, this sending side will block regardless of whether a set or slice is used. So I don't think worrying about accumulating several relations in a set is something that we should be concerned with.

@babbageclunk

babbageclunk May 23, 2017

Member

If there's nothing reading on the out channel then we can still receive more on the input side. Each time around the loop we'd need to loop through the set to collect it into a slice (with .Values()), so if you got 100 different relations in separate events with nothing reading from the output then you'd have collected slices with 5050 elements.
But that's a crazy number of relations, really, so I think you're right - it's not necessary here.

@@ -208,7 +208,7 @@ func (w *RemoteStateWatcher) loop(unitTag names.UnitTag) (err error) {
requiredEvents++
var seenRelationsChange bool
- relationsw, err := w.service.WatchRelations()
@wallyworld

wallyworld May 22, 2017

Owner

There's updates needed to the unit tests.
Currently, there's a relationsWatcher on mockService but things will need to be reworked so that relationsWatcher is on mockUnit; and the tests will need to be enhanced to check that only the correct units get events etc

@babbageclunk

babbageclunk May 23, 2017

Member

Oops, updated.

I'm not updating the tests to check that the correct unit gets events - it's the server that determines that, so the mock will still behave the same. Or am I missing something?

@wallyworld

wallyworld May 23, 2017

Owner

I think it's ok just to rework the mocks to match the backend

Don't track relations irrelevant to subordinate units
Make WatchApplicationRelations take units. Renamed it to
WatchUnitRelations, and updated the client facade and the uniter worker
to use it there.

If the requesting unit is a subordinate, we filter out any
container-scoped relations that aren't to the principal's
application. This prevents these relations from keeping a subordinate
unit alive if the relation between the subordinate application and its
principal's application is removed.

The uniter should still see other global-scoped relations these - the
existence of a global-scoped relation won't keep a subordinate unit
alive.

Rationalised the different versions of the API - there had been a few
confusing changes that left us in a state that didn't make much
sense. The result of the change is that we still have v4 and v5, with
the following differences:

v4 has the now-obsolete methods NetworkConfig and
WatchApplicationRelations.

v5 has the new methods SLALevel, NetworkInfo and WatchUnitRelations.

@wupeka pointed out that the reflection machinery ignores 2-argument
methods, so we can prevent a method from being exposed in the API by
hiding it behind a 2-argument one. This enables a simpler structure for
different API versions.
Member

babbageclunk commented May 23, 2017

$$merge$$

Contributor

jujubot commented May 23, 2017

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

Contributor

jujubot commented May 23, 2017

Build failed: Tests failed
build url: http://juju-ci.vapour.ws:8080/job/github-merge-juju/10959

Member

babbageclunk commented May 23, 2017

It seems like something bad happened to the mongo instance while running the feature tests (not totally sure).

$$merge$$

Contributor

jujubot commented May 23, 2017

Status: merge request accepted. Url: http://juju-ci.vapour.ws:8080/job/github-merge-juju

@jujubot jujubot merged commit d6598db into juju:develop May 23, 2017

1 check passed

github-check-merge-juju Built PR, ran unit tests, and tested LXD deploy. Use !!.*!! to request another build. IE, !!build!!, !!retry!!
Details

@babbageclunk babbageclunk deleted the babbageclunk:subordinates branch May 23, 2017

Member

babbageclunk commented May 23, 2017

W00t!

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment