operator-controller currently uses helm-operator-plugins, which automatically uninstalls failed installations and rolls back failed upgrades.
This is problematic because the installation or upgrade may have progressed to the point of no return. For example:
- a pre-install/pre-upgrade hook may have run a migration before a subsequent object failed to update.
- after any hooks have run, a deployment may have started and run a migration before subsequent objects failed to update.

In my opinion, a better behavior is to stop, inform the user of the problem, and require human intervention to resolve it before progressing.