Skip to content
This repository has been archived by the owner on Apr 2, 2024. It is now read-only.

@when handlers still fire the handler even if the associated endpoint_from_flag() is None #231

Closed
ajkavanagh opened this issue Feb 7, 2021 · 3 comments

Comments

@ajkavanagh
Copy link
Contributor

Essentially, a handler gets called despite not matching the conditions on the handler. An example:

From mysql-innodb-cluster (and resulting in bug https://bugs.launchpad.net/charm-mysql-innodb-cluster/+bug/1896809):

@reactive.when_not('local.cluster.all-users-created')
@reactive.when('cluster.available')
def create_remote_cluster_user():
    """Create remote cluster user.

    Create the remote cluster peer user and grant cluster permissions in the
    MySQL DB.

    :param cluster: Cluster interface
    :type cluster: MySQLInnoDBClusterPeers object
    """
    cluster = reactive.endpoint_from_flag("cluster.available")
    with charm.provide_charm_instance() as instance:
        ch_core.hookenv.log("Creating remote users.", "DEBUG")
        for unit in cluster.all_joined_units:
            if not instance.create_cluster_user(

Note that all_joined_units property is being accessed on the cluster obtained from the endpoint_from_flag() call.

The stack-trace:

2021-02-06 22:29:08 INFO juju-log Invoking reactive handler: reactive/mysql_innodb_cluster_handlers.py:291:scale_out
2021-02-06 22:29:08 ERROR juju-log Hook error:
Traceback (most recent call last):
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/charms/reactive/__init__.py", line 74, in main
    bus.dispatch(restricted=restricted_mode)
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 390, in dispatch
    _invoke(other_handlers)
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 359, in _invoke
    handler.invoke()
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/.venv/lib/python3.8/site-packages/charms/reactive/bus.py", line 181, in invoke
    self._action(*args)
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/charm/reactive/mysql_innodb_cluster_handlers.py", line 310, in scale_out
    create_remote_cluster_user()
  File "/var/lib/juju/agents/unit-mysql-innodb-cluster-2/charm/reactive/mysql_innodb_cluster_handlers.py", line 105, in create_remote_cluster_user
    for unit in cluster.all_joined_units:
AttributeError: 'NoneType' object has no attribute 'all_joined_units'

i.e. all joined units is None. I don't think that this should be possible as the handler shouldn't be able to be invoked unless the @when('cluster.available) is truth-y.

@johnsca
Copy link
Contributor

johnsca commented Feb 8, 2021

The interface layer is using @when conditions to set or clear the cluster.available flag. Because the order of invocation of @when handlers isn't defined, this can lead to a race condition where the charm's handler gets invoked before the interface layer has a chance to clear the flag. The safest way to fix this is to switch to Endpoint.manage_flags() which is guaranteed to allow the interface to set up all flags prior to any charm handlers running. It should also simplify the flag management logic in the layer.

@ajkavanagh
Copy link
Contributor Author

It's not really about the race. It's that the handler still gets called despite the condition not being honored as decorated on that handler. i.e. the framework isn't doing what it says it is doing. It is calling a handler when a condition isn't true. That's a bug, I think. Solving it may not necessarily be trivial as I expect that it needs to re-evaluate the conditions just before a handler is called to verify that it should still be called.

The alternative is to say: "actually, it's a reasonable endeavour, but the caller of endpoint_from_flag() must check the return value to ensure that it is still valid at that point in time". That's a slightly different contract, but needs to be made clear, I think.

@ajkavanagh
Copy link
Contributor Author

ajkavanagh commented Feb 8, 2021

It looks like the handler invoked scale_out which has the following sig/conditions:

@reactive.when('endpoint.cluster.changed.unit-configure-ready')
@reactive.when('leadership.set.cluster-instances-clustered')
@reactive.when('leadership.is_leader')
def scale_out():
    """Handle scale-out adding new nodes to an existing cluster."""
...

I.e. there is no check on the flag. So, moral of the story is probably that we always check endpoint_from_flag() as the handler may not have been checked. I think the behaviour has been seen, but this isn't it. Sorry for the incorrect bug report; we'll keep an eye out for this in the future.

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

2 participants