apiserver/uniter: Ignore EnterScope with invalid principals #7542

Closed
wants to merge 2 commits into
from

Conversation

Projects
None yet
2 participants
Member

babbageclunk commented Jun 22, 2017

Description of change

If an uniter for a subordinate unit requests to EnterScope for a
relation that isn't valid because its principal isn't a member of the
relation, we ignore it. This is needed for clean upgrading from 2.1 to
2.2.1 - there's an upgrade step that corrects unitcounts and invalid
relationscope records, but without this change the agents will re-enter
the scopes incorrectly when they reconnect to the controller (because we
don't get a chance to upgrade them first).

QA steps

  • Bootstrap a 2.1.1 controller.
  • Deploy the ubuntu charm twice as ubuntu and ubuntu2 apps, and ntp.
  • Relate ntp to ubuntu and ubuntu2.
  • Confirm in the DB that there are relationscopes records referring to ubuntu2 for the relation between ubuntu and ntp.
  • Confirm that the unitcount for the ubuntu-ntp relation is 3, which is consistent with other information in the DB but incorrect.
  • Upgrade the controller and model to the version including this PR (2.2.1)
  • Confirm that the "correct unit counts" upgrade step message appears in the logs.
  • Check in the DB that the unit count for the ubuntu-ntp relation is now 2, and that there aren't any ubuntu2 records in the relationscopes records for the ubuntu-ntp relation.
  • Remove the ubuntu application - it should finish successfully.

Bug reference

Actually fixes https://bugs.launchpad.net/juju/+bug/1699050

babbageclunk added some commits Jun 22, 2017

state: Prevent RelationUnits with bad scopes
It was possible to create a RU for a subordinate unit where the
principal isn't a member of the relation. For example, if you had
logging related to mysql and wordpress, you could create an RU for a
logging unit attached to a wordpress unit in the mysql-logging relation,
and then cause havoc.

Prevent this by checking the principal is in the endpoints and raise a
specific InvalidPrincipal error - we'll want to catch and ignore this
error in the apiserver to avoid crashing agents in an upgrade.
apiserver/uniter: Ignore EnterScope with invalid principals
If an uniter for a subordinate unit requests to EnterScope for a
relation that isn't valid because its principal isn't a member of the
relation, we ignore it. This is needed for clean upgrading from 2.1 to
2.2.1 - there's an upgrade step that corrects unitcounts and invalid
relationscope records, but without this change the agents will re-enter
the scopes incorrectly when they reconnect to the controller (because we
don't get a chance to upgrade them first).

A few niggles

+ Relation: mysqlRel.Tag().String(),
+ Unit: wpLoggingU.Tag().String(),
+ }}}
+ result, err := api.EnterScope(args)
@wallyworld

wallyworld Jun 22, 2017

Owner

We should check not just that the error is nil, but also that nothing has been done

@babbageclunk

babbageclunk Jun 22, 2017

Member

Well, I thought this was ok, because the way you check whether it's in scope is by creating a RelationUnit and then calling ru.InScope(). I'll change it to look in the DB.

@babbageclunk

babbageclunk Jun 22, 2017

Member

Ha ha, "are you in scope?".

+
+// IsInvalidPrincipalError returns whether the given error or its
+// cause is ErrInvalidPrincipal.
+func IsInvalidPrincipalError(err interface{}) bool {
@babbageclunk

babbageclunk Jun 22, 2017

Member

I was copying (maybe cargo-culting?) this from the other is-error functions in the file.

@babbageclunk

babbageclunk Jun 22, 2017

Member

As discussed in IRC, it doesn't look like there's any good reason for this - pretty much all the other Is*Error functions in the codebase take error. I'll change it.

+ return false
+ }
+ // In case of a wrapped error, check the cause first.
+ value := err
@wallyworld

wallyworld Jun 22, 2017

Owner

don't need all this

_, ok := errors.Cause(err).(*ErrInvalidPrincipal)
return ok

Member

babbageclunk commented Jun 22, 2017

!!build!!

jujubot added a commit that referenced this pull request Jun 23, 2017

Merge pull request #7547 from babbageclunk/relationunit-isvalid
apiserver/uniter: Ignore EnterScope from invalid units

## Description of change

If we see a request to EnterScope where the unit's principal isn't in
the relation specified, we log and ignore it. These requests can happen
when a controller is upgraded from 2.1 to 2.2, and the model has a 
subordinate application related to multiple principals - the uniter used to think
that those units should be in the relation, but they shouldn't, and
allowing them to enter scope will prevent the application from being
removed because its relations don't go away (since they have units in
scope that never leave).

Ignore the requests to prevent the old uniters from crashing with an
error, which would prevent upgrading them.
Alternative implementation of #7542, with the check for validity of the relation unit being
opt-in rather than raising an error on construction - there are at least 3 or 4 places in the
code that create potentially-invalid `RelationUnit`s to query `InScope` or `Joined`, and
this was a simpler fix to avoid breaking them.

## QA steps

* Bootstrap a 2.1.1 controller.
* Deploy the ubuntu charm twice as ubuntu and ubuntu2 apps, and ntp.
* Relate ntp to ubuntu and ubuntu2.
* Confirm in the DB that there are relationscopes records referring to ubuntu2 for the relation between ubuntu and ntp.
* Confirm that the unitcount for the ubuntu-ntp relation is 3, which is consistent with other information in the DB but incorrect.
* Upgrade the controller and model to the version including this PR (2.2.1)
* Confirm that the "correct unit counts" upgrade step message appears in the logs.
* Check in the DB that the unit count for the ubuntu-ntp relation is now 2, and that there aren't any ubuntu2 records in the relationscopes records for the ubuntu-ntp relation.
* Observe "ignoring <unit> EnterScope" messages in the controller log indicating that the uniter tried to re-enter the invalid unit.
* Remove the ubuntu application - it should finish successfully.

## Bug reference

Really fixes https://bugs.launchpad.net/juju/+bug/1699050
Member

babbageclunk commented Jun 23, 2017

Superseded by #7547.

@babbageclunk babbageclunk deleted the babbageclunk:bad-relationunits branch Jun 23, 2017

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment