Skip to content
Merged
13 changes: 13 additions & 0 deletions src/charm.py
Original file line number Diff line number Diff line change
Expand Up @@ -46,6 +46,7 @@
InstallEvent,
LeaderElectedEvent,
RelationDepartedEvent,
SecretRemoveEvent,
StartEvent,
)
from ops.framework import EventBase
Expand Down Expand Up @@ -201,6 +202,7 @@ def __init__(self, *args):
self.framework.observe(self.on.set_password_action, self._on_set_password)
self.framework.observe(self.on.promote_to_primary_action, self._on_promote_to_primary)
self.framework.observe(self.on.update_status, self._on_update_status)
self.framework.observe(self.on.secret_remove, self._on_secret_remove)
self.cluster_name = self.app.name
self._member_name = self.unit.name.replace("/", "-")

Expand Down Expand Up @@ -435,6 +437,17 @@ def get_hostname_by_unit(self, _) -> str:
# the underlying provider (LXD, MAAS, etc.), the unit IP is returned.
return self._unit_ip

def _on_secret_remove(self, event: SecretRemoveEvent) -> None:
# A secret removal (entire removal, not just a revision removal) causes
# https://github.com/juju/juju/issues/20794. This check is to avoid the
# errors that would happen if we tried to remove the revision in that case
# (in the revision removal, the label is present).
if event.secret.label is None:
logger.debug("Secret with no label cannot be removed")
return
logger.debug(f"Removing secret with label {event.secret.label} revision {event.revision}")
event.remove_revision()
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

When I was testing an implementation like this with my own charms, I ran into an issue with juju 3.6.9. We have a bugfix prepared that should make this implementation ok, but we might want to be defensive in how we write our charm.
juju/juju#20796 is the bug fix and
juju/juju#20794 is the issue.

Specifically, the issue is that older versions of juju, when a secret is fully-removed, will trigger secret-remove with the old revisions of the secret. However, that secret no longer exists, so that will cause the hook to fail, because it tries to delete something that doesn't exist.

I do see that you have some code to try and handle some of that:

    def remove(self, label: str) -> None:
        """Remove a secret from the cache."""
        if secret := self.get(label):
            try:
                secret.remove()
                self._secrets.pop(label)
            except (SecretsUnavailableError, KeyError):
                pass
            else:
                return
        logging.debug("Non-existing Juju Secret was attempted to be removed %s", label)

However, in my testing, because Juju tries to make the removal of a secret revision transaction consistent, it doesn't actually do anything until the hook completes, at which point, the charm has no way of resolving it (hence the above bug).

The fix that I did in my test charm was:

val=`secret-get $JUJU_SECRET_ID`
if [ -n "$val" ] ; then
  secret-remove --revision $JUJU_SECRET_REVISION $JUJU_SECRET_ID;
else
  juju-log -l WARNING secret $JUJU_SECRET_ID already deleted;
fi

I'm not a huge fan of the workaround. (Not least of which, it means that your application now becomes an observer of the charm content, so it will also get secret-changed events. I suppose you might be able to use --peek? I'm not sure if that sets you as an observer.)

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Thank you so much, @jameinel, for your comments in this PR. I'm now handling the issue from your above comment through the check added on a6c36bb.


def _on_get_primary(self, event: ActionEvent) -> None:
"""Get primary instance."""
try:
Expand Down