win_system.reboot: wait_for_reboot limited by minion retry timer #39469

Closed
morganwillcock opened this issue Feb 16, 2017 · 4 comments
Labels
info-needed waiting for more info

@morganwillcock
Contributor

Description of Issue/Question

I've been seeing some inconsistency when wait_for_reboot: True is used with the reboot function of the win_system execution module. Initially I thought that the sleep wasn't long enough to cover Component Based Servicing on shutdown, but it looks like the function throws an exception because of the shutdown signal.

2017-02-16 21:56:53,621 [salt.state       ][INFO    ][2908] Running state [system.reboot] at time 21:56:53.621000
2017-02-16 21:56:53,635 [salt.state       ][INFO    ][2908] Executing state module.run for system.reboot
---
2017-02-16 21:56:57,721 [salt.minion      ][INFO    ][1500] Creating minion process manager
2017-02-16 21:56:57,753 [salt.minion      ][INFO    ][1500] Starting a new job with PID 1500
2017-02-16 21:56:57,769 [salt.utils.lazy  ][DEBUG   ][1500] LazyLoaded saltutil.find_job
2017-02-16 21:56:57,769 [salt.utils.lazy  ][DEBUG   ][1500] LazyLoaded direct_call.get
2017-02-16 21:56:57,769 [salt.minion      ][DEBUG   ][1500] Minion return retry timer set to 8 seconds (randomized)
---
2017-02-16 21:57:05,818 [salt.minion      ][INFO    ][1704] User root Executing command saltutil.find_job with jid 20170216215614216925
2017-02-16 21:57:05,818 [salt.minion      ][DEBUG   ][1704] Command details {'tgt_type': 'list', 'jid': '20170216215614216925', 'tgt': ['salt-test'], 'ret': '', 'user': 'root', 'arg': ['20170216215306875641'], 'fun': 'saltutil.find_job'}
2017-02-16 21:57:06,769 [salt.state       ][ERROR   ][2908] Module function system.reboot threw an exception. Exception: [Errno 4] Interrupted function
---

Replacing the sleep function with a long-running process exhibits the same issue. In my case (because updates had been applied), the 8 seconds of the retry timer was short enough that the system was still shutting down when the next state function started to run:

2017-02-16 21:57:06,769 [salt.state       ][INFO    ][2908] Completed state [system.reboot] at time 21:57:06.769000 duration_in_ms=13148.0
2017-02-16 21:57:06,769 [salt.utils.lazy  ][DEBUG   ][2908] LazyLoaded file.recurse
2017-02-16 21:57:06,769 [salt.state       ][INFO    ][2908] Running state [C:\Updates\IE-Win7] at time 21:57:06.769000
2017-02-16 21:57:06,769 [salt.state       ][INFO    ][2908] Executing state file.recurse for C:\Updates\IE-Win7
2017-02-16 21:57:07,142 [salt.utils.parsers][WARNING ][1704] Minion received a SIGINT. Exiting.
2017-02-16 21:57:07,142 [salt.cli.daemons ][INFO    ][1704] Shutting down the Salt Minion

Steps to Reproduce Issue

This is probably tricky to reproduce, but I imagine any time updates are installed or the disk is slow on shutdown, the state run may continue because the minion service is still running.

It looks like it may not affect Python 3.
https://www.python.org/dev/peps/pep-0475/
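
To illustrate what PEP 475 changes, here is a minimal sketch of a sleep that resumes after being interrupted instead of raising [Errno 4]. This is not the win_system code; sleep_full and its sleep/clock parameters are hypothetical names used only for illustration. It is written for Python 3 (on Python 2, time.time would have to stand in for time.monotonic), and on Python 3.5+ time.sleep already retries like this on its own.

```python
import errno
import time


def sleep_full(seconds, sleep=time.sleep, clock=time.monotonic):
    """Sleep for at least `seconds`, resuming if the sleep is interrupted.

    Hypothetical helper: `sleep` and `clock` are injectable so the
    behaviour can be exercised without really waiting.
    """
    deadline = clock() + seconds
    while True:
        remaining = deadline - clock()
        if remaining <= 0:
            return
        try:
            sleep(remaining)
        except (IOError, OSError) as exc:
            # EINTR is errno 4, the "Interrupted function" error in the log.
            # Anything else is a real failure and is re-raised.
            if exc.errno != errno.EINTR:
                raise
```

Tracking a deadline rather than restarting the full sleep means an interruption does not lengthen the total wait.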

Versions Report

Tested on 2016.11.2

@Ch3LL
Contributor

Ch3LL commented Feb 17, 2017

@morganwillcock it seems you are running a state module from an SLS file. Would you share a small snippet of it to help us replicate this?

@Ch3LL Ch3LL added the info-needed waiting for more info label Feb 17, 2017
@Ch3LL Ch3LL added this to the Blocked milestone Feb 17, 2017
@morganwillcock
Contributor Author

@Ch3LL thanks for looking.

install_security_rollup:
  dism.package_installed:
    - name: 'C:\path\to\file.cab'

# this should sleep for 35 seconds
reboot_after_update:
  module.run:
    - name: system.reboot
    - timeout: 5
    - in_seconds: True
    - wait_for_reboot: True
    - only_on_pending_reboot: True

# only do something else if not rebooting
do_something_else:
  ...

There are win_system state functions in develop that will work around this, but I imagine they aren't going to get backported to the stable branches as they are brand new. Basically, I'm only calling the execution module because the state functions aren't available yet, but I wasn't expecting to encounter the timing problems.

@morganwillcock
Contributor Author

Function was superseded by a state module, which doesn't encounter the same issue.

@ElVirtualJefe

I am having this issue in 3005.1, with both the state and the module. Neither waits for the reboot to complete, and both fail with a [Not connected] error.

How can I make this work, so that I can reboot and then continue with a state?

Here is my SLS that I am testing with:

pre_reboot:
  system.reboot:
    - message: "Pre-Reboot Starting..."
    - timeout: 5
    - in_seconds: True
    - only_on_pending_reboot: False

win_wua.list:
  module.run:
    - install: True
    - online: True
    - download: True

post_reboot:
  module.run:
    - name: system.reboot
    - timeout: 5
    - in_seconds: True
    - wait_for_reboot: True
    - only_on_pending_reboot: True

Here is my return:

root@salt-master [ ~ ]# salt salt-minion state.apply saltenv=test Deployment.wua
test-minion:
    Minion did not return. [Not connected]
ERROR: Minions returned with non-zero exit code

Am I missing something, or is this still an issue? I have looked for answers for several days and haven't been able to find anything.


3 participants