Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Windows: winRM Read timed out. (read timeout=nnn). Again! #78797

Open
1 task done
oskarahner opened this issue Sep 16, 2022 · 4 comments
Open
1 task done

Windows: winRM Read timed out. (read timeout=nnn). Again! #78797

oskarahner opened this issue Sep 16, 2022 · 4 comments
Labels
affects_2.9 This issue/PR affects Ansible v2.9 affects_2.16 bug This issue/PR relates to a bug. needs_verified This issue needs to be verified/reproduced by maintainer support:core This issue/PR relates to code supported by the Ansible Engineering Team. waiting_on_contributor This would be accepted but there are no plans to actively work on it.

Comments

@oskarahner
Copy link
Contributor

oskarahner commented Sep 16, 2022

Summary

The winRM error

"winrm connection error: HTTPSConnectionPool(host='1.2.3.4', port=5986): Read timed out. (read timeout=nnn)"

is thrown and causes the ansible task to fail. It happens very unpredictable and seemingly randomly. It does not help (very little) to adjust the winrm setting 'ansible_winrm_connection_timeout' to different values, the exception is still thrown seemingly randomly and unpredictable.

For more details see pull request #78796

Issue Type

Bug Report

Component Name

Connection Plugin 'winrm'

Ansible Version

$ ansible --version
ansible 2.9.27                                                                                                                                                          
  config file = None                                                                                                                                                    
  configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']                                                            
  ansible python module location = /opt/nnn/venv/ansible2/lib/python2.7/site-packages/ansible                                                                       
  executable location = /opt/nnn/venv/ansible2/bin/ansible                                                                                                          
  python version = 2.7.9 (default, Sep 19 2021, 00:35:19) [GCC 4.3.4 [gcc-4_3-branch revision 152973]]

Configuration

# if using a version older than ansible-core 2.12 you should omit the '-t all'
$ ansible-config dump --only-changed -t all

OS / Environment

SLES 11.4

Steps to Reproduce

See pull request: #78796
The winRM error "Read timed out" is thrown unexpectedly and cannot be reproduced at will (without actively injecting code provoking the timers to expire)

Expected Results

If the two timeout parameters (operation timeout in the WS-Man session and the HTTP-request transport read_timeout in the client session) are set independently and with a difference with about 10 seconds, this winRM will not happen during normal network operation.

The expected result is that the winrm-code should be more tolerant against networks with long latency and roundtrips since the two involved timers are running on different computers, the ansible controller and the winRM-server which can be located far away from each other network wise. If the difference between these two timers are higher, this error can be almost eliminated.

Actual Results

Inserted debug prints in the code, too see what is happening (sorry for some Swedish)

Jun 20 14:44:54 dpgbmg02 gb0972s001: OSKAR _raw_get_command_output, innan send_message, Request: {u'env:Envelope': {u'@xmlns:xsd': u'http://www.w3.org/2001/XMLSchema', 
u'env:Header': {u'w:OptionSet': {u'w:Option': [{u'#text': u'TRUE', u'@Name': u'WSMAN_CMDSHELL_OPTION_KEEPALIVE"'}]}, u'w:MaxEnvelopeSize': {u'@mustUnderstand': u'true',
 u'#text': u'153600'}, u'a:Action': {u'@mustUnderstand': u'true', u'#text': u'http://schemas.microsoft.com/wbem/wsman/1/windows/shell/Receive'}, u'w:Locale': {u'@mustUn··
derstand': u'false', u'@xml:lang': u'en-US'}, u'a:MessageID': u'uuid:738e8862-db36-4434-a04e-66b2b8ef5b67', u'w:OperationTimeout': u'PT200S', u'a:ReplyTo': {u'a:Address····
': {u'@mustUnderstand': u'true', u'#text': u'http://schemas.xmlsoap.org/ws/2004/08/addressing/role/anonymous'}}, u'w:SelectorSet': {u'w:Selector': {u'#text': '442E6464-
9CE9-4730-91DE-B07FA20FED21', u'@Name': u'ShellId'}}, u'w:ResourceURI': {u'@mustUnderstand': u'true', u'#text': u'http://schemas.microsoft.com/wbem/wsman/1/windows/shel
l/cmd'}, u'a:To': u'http://windows-host:5985/wsman',                                                                                                                    
Jun 20 14:44:54 dpgbmg02 gb0972s001:  u'p:DataLocale': {u'@mustUnderstand': u'false', u'@xml:lang': u'en-US'}}, u'@xmlns:xsi': u'http://www.w3.org/2001/XMLSchema-instan
ce', u'@xmlns:rsp': u'http://schemas.microsoft.com/wbem/wsman/1/windows/shell', u'@xmlns:p': u'http://schemas.microsoft.com/wbem/wsman/1/wsman.xsd', u'env:Body': {u'rsp
:Receive': {u'rsp:DesiredStream': {u'#text': u'stdout stderr', u'@CommandId': 'AD0C5EB8-FE47-4089-8C49-AF2D64217488'}}}, u'@xmlns:w': u'http://schemas.dmtf.org/wbem/wsm·
an/1/wsman.xsd', u'@xmlns:x': u'http://schemas.xmlsoap.org/ws/2004/09/transfer', u'@xmlns:a': u'http://schemas.xmlsoap.org/ws/2004/08/addressing', u'@xmlns:b': u'http:/
/schemas.dmtf.org/wbem/wsman/1/cimbinding.xsd', u'@xmlns:env': u'http://www.w3.org/2003/05/soap-envelope', u'@xmlns:cfg': u'http://schemas.microsoft.com/wbem/wsman/1/co
nfig', u'@xmlns:n': u'http://schemas.xmlsoap.org/ws/2004/09/enumeration'}}                                                                                              
Jun 20 14:48:02 dpgbmg02 gb0971s001: OSKAR: Transport efter session.send (low level). Innan raise_for_status()                                                          
Jun 20 14:48:02 dpgbmg02 gb0971s001: OSKAR: Transport efter session.send (low level). Efter raise_for_status()                                                          
Jun 20 14:48:02 dpgbmg02 gb0971s001: failed: [gb0971s001] (item=nnn-ms-sql-server-standard-2017) => {                                                               
Jun 20 14:48:02 dpgbmg02 gb0971s001:     "ansible_loop_var": "fjpkglist_package_name",                                                                                  
Jun 20 14:48:02 dpgbmg02 gb0971s001:     "fjpkglist_package_name": "nnn-ms-sql-server-standard-2017",                                                               
Jun 20 14:48:02 dpgbmg02 gb0971s001:     "msg": "winrm connection error: HTTPSConnectionPool(host='172.16.76.240', port=5986): Read timed out. (read timeout=201)",     
Jun 20 14:48:02 dpgbmg02 gb0971s001:     "unreachable": true                                                                                                            
Jun 20 14:48:02 dpgbmg02 gb0971s001: }

Code of Conduct

  • I agree to follow the Ansible Code of Conduct
@ansibot
Copy link
Contributor

ansibot commented Sep 16, 2022

Files identified in the description:

  • lib/ansible/plugins/connection

If these files are incorrect, please update the component name section of the description or use the !component bot command.

click here for bot help

@ansibot ansibot added affects_2.9 This issue/PR affects Ansible v2.9 bug This issue/PR relates to a bug. needs_triage Needs a first human triage before being processed. support:core This issue/PR relates to code supported by the Ansible Engineering Team. labels Sep 16, 2022
@mkrizek mkrizek removed the needs_triage Needs a first human triage before being processed. label Sep 20, 2022
@oskarahner
Copy link
Contributor Author

oskarahner commented Sep 21, 2022

Referring to discussion in:
#78796 (review)

@jborean93 I agree, when I looked at the documentation page for ansible regarding timeouts (https://docs.ansible.com/ansible/2.9/user_guide/windows_winrm.html) it clearly documents two variables:

ansible_winrm_operation_timeout_sec
ansible_winrm_read_timeout_sec

However in the connection module winrm.py only this option is actually parsed:
self._winrm_connection_timeout = self.get_option('connection_timeout')

(

self._winrm_connection_timeout = self.get_option('connection_timeout')
)

And then later on in the same code, these assigments are done:

            if self._winrm_connection_timeout:
                winrm_kwargs['operation_timeout_sec'] = self._winrm_connection_timeout
                winrm_kwargs['read_timeout_sec'] = self._winrm_connection_timeout + 1

So here (suddenly) the phrase "connection_timeout" is replaced with 'operation_timeout' and 'read_timeout' and the concept connection_timeout is not conceptually used anymore. I think this is confusing and should be fixed.

I saw that You have exposed the Request-api to support the tuple (instead of a single value timeout):
(connect and read) from PSRP. I'm thinking, is this needed for winrm also or not ? I agree that sysadmin most likely will not have different values for these timeouts, but its good to have the opportunity.
(https://requests.readthedocs.io/en/latest/user/advanced/#timeouts)

I will open a new PR towards devel for this matter, lets continue the discussion there. I agree, that we need to keep it simple and maybe not make it complex with many variants of timeouts.

Also, I think just +10 instead of +1 actually solves most cases of this issue, and its a simple fix to do. Nevertheless, there maybe sysadmins who want to fine tune these settings also.

@bcoca bcoca added affects_2.16 needs_verified This issue needs to be verified/reproduced by maintainer labels Apr 5, 2023
@s-hertel s-hertel added the waiting_on_contributor This would be accepted but there are no plans to actively work on it. label Dec 6, 2023
@juvon19
Copy link

juvon19 commented Jan 29, 2024

Hey, this affects me as well.. anybody have a workaround for now?
Will it be fixed?

@oskarahner
Copy link
Contributor Author

I think this fix was merged into master, dont know which maitenance-branches it went out to, though.
If the fix is not in your branch, you need to add the fix manually.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
affects_2.9 This issue/PR affects Ansible v2.9 affects_2.16 bug This issue/PR relates to a bug. needs_verified This issue needs to be verified/reproduced by maintainer support:core This issue/PR relates to code supported by the Ansible Engineering Team. waiting_on_contributor This would be accepted but there are no plans to actively work on it.
Projects
None yet
Development

No branches or pull requests

6 participants