Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

win_reboot: Ansible 2.8.5 win_reboot never successfully validate reboot completion #27

Closed
jborean93 opened this issue Mar 11, 2020 · 6 comments · Fixed by #214
Closed

Comments

@jborean93
Copy link
Collaborator

From @vfuyong on Sep 16, 2019 08:35

SUMMARY

win_reboot is able to initiate reboot successfully. After 15 seconds, Windows is back with Windows Remote Management service running fine. Ansible debug log says it is still sending commands to get OS bootup time even Windows is ready to receive commands.

However, since reboot is initiated I validated that Ansible is NOT sending any more wsman protocol action (like shell creation, execute command, send input, receive, signal or shell deletion) over HTTPS despite logging repetitively says "ESTABLISH WINRM CONNECTION FOR USER: on PORT 5002 TO windows01" (this is totally unexpected behavior)

ISSUE TYPE
  • Bug Report
COMPONENT NAME

win_reboot

ANSIBLE VERSION

ansible 2.8.5
config file = /etc/ansible/ansible.cfg
configured module search path = [u'/home/young/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/lib/python2.7/dist-packages/ansible
executable location = /usr/bin/ansible
python version = 2.7.15+ (default, Jul 9 2019, 16:51:35) [GCC 7.4.0]

CONFIGURATION

DEFAULT_FORKS(/etc/ansible/ansible.cfg) = 100

OS / ENVIRONMENT

Ansible client runs on Ubuntu 18
Target: Windows 2016

STEPS TO REPRODUCE
- name: Reboot
  hosts: all
  gather_facts: no
  tasks:
    - name: Reboot immediately
      win_reboot:
        connect_timeout: 60
        post_reboot_delay: 0
        reboot_timeout: 30 # longer than reboot duration 15s
EXPECTED RESULTS

Expecting win_reboot to detect Windows reboot has completed

ACTUAL RESULTS
<windows01> ESTABLISH WINRM CONNECTION FOR USER: e1c_ma on PORT 5002 TO windows01
EXEC (via pipeline wrapper)
win_reboot: rebooting server...
EXEC (via pipeline wrapper)
win_reboot: validating reboot
win_reboot: attempting to get system boot time
<windows01> ESTABLISH WINRM CONNECTION FOR USER: <user> on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: <user> on PORT 5002 TO windows01
win_reboot: attempting to get system boot time
<windows01> ESTABLISH WINRM CONNECTION FOR USER: <user> on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: <user> on PORT 5002 TO windows01
win_reboot: attempting to get system boot time
<windows01> ESTABLISH WINRM CONNECTION FOR USER: <user> on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: <user> on PORT 5002 TO windows01
win_reboot: attempting to get system boot time
<windows01> ESTABLISH WINRM CONNECTION FOR USER: <user> on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: <user> on PORT 5002 TO windows01
win_reboot: attempting to get system boot time
<windows01> ESTABLISH WINRM CONNECTION FOR USER: <user> on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: <user> on PORT 5002 TO windows01
fatal: [windows01]: FAILED! => {
    "changed": false,
    "elapsed": 30,
    "msg": "Timed out waiting for last boot time check (timeout=30)",
    "rebooted": true
}

PLAY RECAP *************************************************************************************************************
windows01                  : ok=0    changed=0    unreachable=0    failed=1    skipped=0    rescued=0    ignored=0

Copied from original issue: ansible/ansible#62343

@jborean93
Copy link
Collaborator Author

From @samdoran on Sep 20, 2019 19:11

I am unable to reproduce this problem. I see the connection attempts while the system reboots, then the task complete successfully.

 76839 1569006375.37396: win_reboot: getting boot time with command: '(Get-WmiObject -ClassName Win32_OperatingSystem).LastBootUpTime'
<10.77.1.51> ESTABLISH WINRM CONNECTION FOR USER: vagrant on PORT 5985 TO 10.77.1.51
 76839 1569006375.78557: win_reboot: last boot time: 20190920190540.492950+000
 76839 1569006375.78560: win_reboot: saving original connect_timeout of 90
win_reboot: rebooting server...
 76839 1569006375.78579: win_reboot: distribution: {'name': 'windows', 'version': '', 'family': ''}
 76839 1569006375.78583: win_reboot: rebooting server with command 'PowerShell -NoProfile -NonInteractive -ExecutionPolicy Unrestricted -EncodedCommand UwBlAHQALQBTAHQAcgBpAGMAdABNAG8AZABlACAALQBWAGUAcgBzAGkAbwBuACAATABhAHQAZQBzAHQACgBzAGgAdQB0AGQAbwB3AG4AIAAvAHIAIAAvAHQAIAAyACAALwBjACAAIgBSAGUAYgBvAG8AdAAgAGkAbgBpAHQAaQBhAHQAZQBkACAAYgB5ACAAQQBuAHMAaQBiAGwAZQAiAAoASQBmACAAKAAtAG4AbwB0ACAAJAA/ACkAIAB7ACAASQBmACAAKABHAGUAdAAtAFYAYQByAGkAYQBiAGwAZQAgAEwAQQBTAFQARQBYAEkAVABDAE8ARABFACAALQBFAHIAcgBvAHIAQQBjAHQAaQBvAG4AIABTAGkAbABlAG4AdABsAHkAQwBvAG4AdABpAG4AdQBlACkAIAB7ACAAZQB4AGkAdAAgACQATABBAFMAVABFAFgASQBUAEMATwBEAEUAIAB9ACAARQBsAHMAZQAgAHsAIABlAHgAaQB0ACAAMQAgAH0AIAB9AA=='
win_reboot: validating reboot
win_reboot: attempting to get system boot time
 76839 1569006376.29580: win_reboot: setting connect_timeout to 60
<10.77.1.51> ESTABLISH WINRM CONNECTION FOR USER: vagrant on PORT 5985 TO 10.77.1.51
 76839 1569006376.35241: win_reboot: getting boot time with command: '(Get-WmiObject -ClassName Win32_OperatingSystem).LastBootUpTime'
 76839 1569006376.67783: win_reboot: last boot time: 20190920190540.492950+000
 76839 1569006376.67794: win_reboot: last boot time check fail 'boot time has not changed', retrying in 1.48 seconds...
win_reboot: attempting to get system boot time
 76839 1569006378.15874: win_reboot: setting connect_timeout to 60
<10.77.1.51> ESTABLISH WINRM CONNECTION FOR USER: vagrant on PORT 5985 TO 10.77.1.51
 76839 1569006378.20622: win_reboot: getting boot time with command: '(Get-WmiObject -ClassName Win32_OperatingSystem).LastBootUpTime'
 76839 1569006378.55017: win_reboot: last boot time: 20190920190540.492950+000
 76839 1569006378.55022: win_reboot: last boot time check fail 'boot time has not changed', retrying in 2.466 seconds...
win_reboot: attempting to get system boot time
 76839 1569006381.01682: win_reboot: setting connect_timeout to 60
<10.77.1.51> ESTABLISH WINRM CONNECTION FOR USER: vagrant on PORT 5985 TO 10.77.1.51
 76839 1569006381.07872: win_reboot: getting boot time with command: '(Get-WmiObject -ClassName Win32_OperatingSystem).LastBootUpTime'
 76839 1569006381.44018: win_reboot: last boot time: 20190920190540.492950+000
 76839 1569006381.44023: win_reboot: last boot time check fail 'boot time has not changed', retrying in 4.767 seconds...
win_reboot: attempting to get system boot time
 76839 1569006386.20831: win_reboot: setting connect_timeout to 60
<10.77.1.51> ESTABLISH WINRM CONNECTION FOR USER: vagrant on PORT 5985 TO 10.77.1.51
 76839 1569006393.40274: win_reboot: getting boot time with command: '(Get-WmiObject -ClassName Win32_OperatingSystem).LastBootUpTime'
 76839 1569006399.19239: win_reboot: last boot time: 20190920190626.488380+000
 76839 1569006399.19243: win_reboot: last boot time check success
 76839 1569006399.19246: win_reboot: setting connect_timeout back to original value of 90
<10.77.1.51> ESTABLISH WINRM CONNECTION FOR USER: vagrant on PORT 5985 TO 10.77.1.51
win_reboot: attempting post-reboot test command
 76839 1569006399.29016: win_reboot: attempting post-reboot test command 'whoami'
win_reboot: system successfully rebooted
 76839 1569006399.73707: win_reboot: post-reboot test command success

Can you please run env ANSIBLE_DEBUG=1 ansible-playbook [playbook].yml -vvv | grep -E 'reboot: |WINRM'

However, since reboot is initiated I validated that Ansible is NOT sending any more wsman protocol action...

Could you do a packet capture on the Ansible node to see if the packets are leaving the Ansible control node? If the Windows machine does indeed boot, could you perform a packet capture on the Windows hosts to see if it is receiving the packets?

needs_info

@jborean93
Copy link
Collaborator Author

From @vfuyong on Oct 10, 2019 11:26

@samdoran Attached logs here.

env ANSIBLE_DEBUG=1 ansible-playbook <host file> reboot.yml -vvv | grep -E 'reboot: |WINRM'
    81 1570706611.42800: win_reboot: getting boot time with command: '(Get-WmiObject -ClassName Win32_OperatingSystem).LastBootUpTime'
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
    81 1570706616.30026: win_reboot: last boot time: 20191010151721.488799+240
    81 1570706616.30029: win_reboot: saving original connect_timeout of None
win_reboot: rebooting server...
    81 1570706616.30039: win_reboot: distribution: {'version': '', 'name': 'windows', 'family': ''}
    81 1570706616.30041: win_reboot: rebooting server with command 'PowerShell -NoProfile -NonInteractive -ExecutionPolicy Unrestricted -EncodedCommand UwBlAHQALQBTAHQAcgBpAGMAdABNAG8AZABlACAALQBWAGUAcgBzAGkAbwBuACAATABhAHQAZQBzAHQACgBzAGgAdQB0AGQAbwB3AG4AIAAvAHIAIAAvAHQAIAAyACAALwBjACAAIgBSAGUAYgBvAG8AdAAgAGkAbgBpAHQAaQBhAHQAZQBkACAAYgB5ACAAQQBuAHMAaQBiAGwAZQAiAAoASQBmACAAKAAtAG4AbwB0ACAAJAA/ACkAIAB7ACAASQBmACAAKABHAGUAdAAtAFYAYQByAGkAYQBiAGwAZQAgAEwAQQBTAFQARQBYAEkAVABDAE8ARABFACAALQBFAHIAcgBvAHIAQQBjAHQAaQBvAG4AIABTAGkAbABlAG4AdABsAHkAQwBvAG4AdABpAG4AdQBlACkAIAB7ACAAZQB4AGkAdAAgACQATABBAFMAVABFAFgASQBUAEMATwBEAEUAIAB9ACAARQBsAHMAZQAgAHsAIABlAHgAaQB0ACAAMQAgAH0AIAB9AA=='
win_reboot: validating reboot
win_reboot: attempting to get system boot time
    81 1570706636.39898: win_reboot: setting connect_timeout to 60
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
    81 1570706636.40137: win_reboot: last boot time check fail 'ssl: coercing to Unicode: need string or buffer, int found', retrying in 1.484 seconds...
win_reboot: attempting to get system boot time
    81 1570706637.88585: win_reboot: setting connect_timeout to 60
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
    81 1570706637.88672: win_reboot: last boot time check fail 'ssl: coercing to Unicode: need string or buffer, int found', retrying in 2.994 seconds...
win_reboot: attempting to get system boot time
    81 1570706640.88134: win_reboot: setting connect_timeout to 60
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
    81 1570706640.88227: win_reboot: last boot time check fail 'ssl: coercing to Unicode: need string or buffer, int found', retrying in 4.182 seconds...
win_reboot: attempting to get system boot time
    81 1570706645.06524: win_reboot: setting connect_timeout to 60
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
    81 1570706645.06617: win_reboot: last boot time check fail 'ssl: coercing to Unicode: need string or buffer, int found', retrying in 8.276 seconds...
win_reboot: attempting to get system boot time
    81 1570706653.34254: win_reboot: setting connect_timeout to 60
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
    81 1570706653.34308: win_reboot: last boot time check fail 'ssl: coercing to Unicode: need string or buffer, int found', retrying in 12.63 seconds...
win_reboot: attempting to get system boot time
    81 1570706665.97546: win_reboot: setting connect_timeout to 60
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
<windows01> ESTABLISH WINRM CONNECTION FOR USER: Administrator on PORT 5002 TO windows01
    81 1570706665.97609: win_reboot: last boot time check fail 'ssl: coercing to Unicode: need string or buffer, int found', retrying in 12.39 seconds...

@jborean93
Copy link
Collaborator Author

From @samdoran on Oct 14, 2019 14:15

@vfuyong Thank for the detailed logs. That helps a lot. I can see the problem. The fix isn't clear, but I have enough to investigate further.

@jborean93
Copy link
Collaborator Author

From @vfuyong on Oct 17, 2019 12:03

@samdoran Do you know what can cause 'ssl: coercing to Unicode: need string or buffer, int found'? I am using WinRM over HTTPS.

@jborean93
Copy link
Collaborator Author

From @ShachafGoldstein on Nov 17, 2019 07:01

What version of pywinrm are you using?

@jborean93 jborean93 changed the title Ansible 2.8.5 win_reboot never successfully validate reboot completion win_reboot: Ansible 2.8.5 win_reboot never successfully validate reboot completion Mar 12, 2020
@msilmser
Copy link

msilmser commented Jul 7, 2020

@jborean93 i have a customer running into this issue with pywinrm 0.3.0, will ask them to upgrade to latest and verify if anything changes

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging a pull request may close this issue.

2 participants