Fail actions which are interrupted due to an agent restart in the middle #16922

Merged on Feb 13, 2024 (1 commit)

wallyworld
Member

If a unit agent is killed and restarted while an action is running, the action stays in the running state forever. This happens because the unit agent was not running the FailAction operation to mark the action as failed and so terminate it. In addition, the commit step at the end of the fail action operation was not wired up to update local state correctly and remove the action from the "todo" list.
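
As a rough illustration of the behaviour described above, here is a minimal, self-contained sketch. It is not the actual Juju code: LocalState, Controller, failAction, commitFail and resumeAfterRestart are all hypothetical names made up for this example, and the real unit agent's operation framework is considerably more involved. The sketch only shows the two pieces that need to happen on restart: mark the interrupted action as failed, then commit by removing it from the local "todo" list.

```go
package main

import "fmt"

// ActionStatus is a hypothetical status type used only in this sketch.
type ActionStatus string

const (
	StatusPending ActionStatus = "pending"
	StatusRunning ActionStatus = "running"
	StatusFailed  ActionStatus = "failed"
)

// LocalState is a hypothetical stand-in for the unit agent's persisted state.
type LocalState struct {
	// RunningAction is the ID of the action that was executing when the
	// state was last saved, or "" if none was.
	RunningAction string
	// Todo holds the IDs of actions the agent still has to deal with.
	Todo []string
}

// Controller is a hypothetical stand-in for the controller's record of
// action statuses.
type Controller struct {
	Status map[string]ActionStatus
}

// commitFail updates local state after an action has been failed: the
// action is dropped from the todo list and is no longer recorded as running.
func commitFail(st LocalState, id string) LocalState {
	remaining := make([]string, 0, len(st.Todo))
	for _, a := range st.Todo {
		if a != id {
			remaining = append(remaining, a)
		}
	}
	st.Todo = remaining
	if st.RunningAction == id {
		st.RunningAction = ""
	}
	return st
}

// failAction marks the action as failed on the controller side and then
// commits the change to local state.
func failAction(c *Controller, st LocalState, id string) LocalState {
	c.Status[id] = StatusFailed
	return commitFail(st, id)
}

// resumeAfterRestart is what the agent is assumed to do on startup: if an
// action was running when the agent was killed, fail it rather than leave
// it in the running state.
func resumeAfterRestart(c *Controller, st LocalState) LocalState {
	if st.RunningAction == "" {
		return st
	}
	return failAction(c, st, st.RunningAction)
}

func main() {
	c := &Controller{Status: map[string]ActionStatus{
		"2": StatusRunning,
		"3": StatusPending,
	}}
	st := LocalState{RunningAction: "2", Todo: []string{"2", "3"}}

	st = resumeAfterRestart(c, st)
	fmt.Println(c.Status["2"], st.Todo)
}
```

Skipping either step gives one of the two symptoms described in this PR: without the fail step the action stays in the running state, and without the commit step the action is never removed from the local "todo" list.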

QA steps

Run an action like so:

juju exec -u someunit/0 -- "echo hello; reboot"

After a few seconds, once the machine has rebooted, the CLI exits back to a shell prompt (it blocks in the meantime).
The action fails, and juju show-task <n> --format yaml shows that it was terminated and marked as failed.

Previously, the CLI would block forever and the action would stay in the running state.

Links

https://bugs.launchpad.net/juju/+bug/2012861

Jira card: JUJU-5438

Member

@hpidcock hpidcock left a comment

Excellent.

@wallyworld
Member Author

/merge

@jujubot jujubot merged commit 1f88619 into juju:3.3 Feb 13, 2024
20 of 23 checks passed
@Aflynn50 Aflynn50 mentioned this pull request Feb 20, 2024
jujubot added a commit that referenced this pull request Feb 20, 2024
#16939

There were no merge conflicts from this. The commits added are:

#16844 Support minikube env folder on juju snap.
#16876 Pass all the availability zones when discovering subnets in clustered lxd.
#16890 Revert dot-minikube on snapcraft until auto-connect is approved.
#16567 Add lease manager worker documentation.
#16898 Suppress klog log messages. 
#16905 Ensure we close client.
#16915 Merge branch '3.1' of https://github.com/juju/juju into 3.1-into-3.3.
#16917 Fix docker cache for github actions.
#16922 Fail actions which are interrupted due to an agent restart in the middle.
#16930 Fix the HasModelAdmin API to handle non admins properly.
#16936 Merge branch '2.9' into merge-2.9-3.1. 
#16937 Merge branch '3.1' into merge-3.1-3.3.
#16938 Add check for relation-get with app argument but no --app flag.
@Aflynn50 Aflynn50 mentioned this pull request Feb 20, 2024
jujubot added a commit that referenced this pull request Feb 20, 2024
#16951

The only merge conflicts were juju version numbers, all resolved to 3.5.



- #16939 
 - #16844 
 - #16876 
 - #16890 
 - #16567 
 - #16898 
 - #16905 
 - #16915 
 - #16917 
 - #16922 
 - #16930 
 - #16936 
 - #16937 
 - #16938 
- #16946
- #16934
- #16928
- #16889
- #16883
- #16880