Mitogen intermittent hangs on "Connection timed out" target  #598

Closed
@antigenius0910

* Which version of Ansible are you running?
     ansible 2.4.6.0
     config file = /etc/ansible/ansible.cfg
     configured module search path = [u'/home/yen306/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
     ansible python module location = /usr/local/lib/python2.7/dist-packages/ansible
     executable location = /usr/local/bin/ansible
     python version = 2.7.15rc1 (default, Nov 12 2018, 14:31:15) [GCC 7.3.0]

* Is your version of Ansible patched in any way?
     No

* Are you running with any custom modules, or `module_utils` loaded?
     No, /etc/ansible/ansible.cfg:18:#module_utils   = /usr/share/my_module_utils/ 

* Have you tried the latest master version from Git?
     Yes

* Do you have some idea of what the underlying problem may be?
    No. I am capturing logs with "-vvv -e mitogen_ssh_debug_level=3".

* Mention your host and target OS and versions
    Linux  4.15.0-1014 #16-Ubuntu SMP Tue Dec 11 11:19:10 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux

* Mention your host and target Python versions
    Python 2.7.15rc1

* If reporting any kind of problem with Ansible, please include the Ansible
  version along with output of "ansible-config dump --only-changed".

ANSIBLE_SSH_ARGS(/etc/ansible/ansible.cfg) = -o UserKnownHostsFile=/dev/null
DEFAULT_DEBUG(env: ANSIBLE_DEBUG) = True
DEFAULT_FORKS(/etc/ansible/ansible.cfg) = 10
DEFAULT_STRATEGY(/etc/ansible/ansible.cfg) = mitogen_linear
DEFAULT_STRATEGY_PLUGIN_PATH(/etc/ansible/ansible.cfg) = [u'/home/mitogen-0.2.7/ansible_mitogen/plugins/strategy']
HOST_KEY_CHECKING(/etc/ansible/ansible.cfg) = False
INVENTORY_ENABLED(/etc/ansible/ansible.cfg) = ['host_list', 'virtualbox', 'yaml', 'constructed', 'ini', 'script']

First, thanks for the awesome project! I saw an 11x performance improvement after switching to Mitogen. However, I am running into the problem below and wonder if you could take a look.

I run the playbook 20 times in a for loop against 96 hosts:

export ANSIBLE_DEBUG=1; for ((n=0;n<20;n++)); do time ansible-playbook -i mzoneinis/mzoneXXX.ini ping.yml -vvv -e mitogen_ssh_debug_level=3; done

It intermittently hangs at the following point (observed output):

[mux  60775] 20:27:59.833484 D mitogen: mitogen.core.Stream('unix_client.61010').on_disconnect()
[task 61010] 20:27:59.833479 D mitogen: Waker(Broker(0x7fbb6fef01d0) rfd=41, wfd=42).on_disconnect()
[task 61010] 20:27:59.834034 D mitogen: Router(Broker(0x7fbb6fef01d0)): stats: 0 module requests in 0 ms, 0 sent (0 ms minify time), 0 negative responses. Sent 0.0 kb total, 0.0 kb avg.
 61010 1561840079.83510: done running TaskExecutor() for hostname/TASK: ping [506b4bf5-3974-a649-31ae-000000000056]
 61010 1561840079.83533: sending task result for task 506b4bf5-3974-a649-31ae-000000000056
 61010 1561840079.83567: done sending task result for task 506b4bf5-3974-a649-31ae-000000000056
 61010 1561840079.83575: WORKER PROCESS EXITING
fatal: [hostname]: UNREACHABLE! => {
    "changed": false, 
    "msg": "Connection timed out.", 
    "unreachable": true
}
[mux  60775] 20:28:03.169269 D mitogen: ssh.10.249.3.28: debug3: send packet: type 80
[mux  60775] 20:28:03.169646 D mitogen: ssh.10.249.3.28: debug3: receive packet: type 82
[mux  60775] 20:28:03.172083 D mitogen: ssh.10.249.3.33: debug3: send packet: type 80
[mux  60775] 20:28:03.172364 D mitogen: ssh.10.249.3.33: debug3: receive packet: type 82
[mux  60775] 20:28:03.172592 D mitogen: ssh.10.249.3.32: debug3: send packet: type 80
[mux  60775] 20:28:03.172764 D mitogen: ssh.10.249.3.32: debug3: receive packet: type 82

When it does not hang, the output looks like this (expected output):

[task 9426] 19:32:07.468708 D mitogen: mitogen.core.Stream('unix_listener.9171').on_disconnect()
[mux  9171] 19:32:07.469339 D mitogen: mitogen.core.Stream('unix_client.9426').on_disconnect()
[task 9426] 19:32:07.469341 D mitogen: Waker(Broker(0x7fca93433810) rfd=41, wfd=42).on_disconnect()
[task 9426] 19:32:07.469874 D mitogen: Router(Broker(0x7fca93433810)): stats: 0 module requests in 0 ms, 0 sent (0 ms minify time), 0 negative responses. Sent 0.0 kb total, 0.0 kb avg.
  9426 1561836727.47090: done running TaskExecutor() for hostname/TASK: ping [506b4bf5-3974-0ca7-6a9c-000000000056]
  9426 1561836727.47111: sending task result for task 506b4bf5-3974-0ca7-6a9c-000000000056
  9426 1561836727.47143: done sending task result for task 506b4bf5-3974-0ca7-6a9c-000000000056
  9426 1561836727.47149: WORKER PROCESS EXITING
fatal: [hostname]: UNREACHABLE! => {
    "changed": false, 
    "msg": "Connection timed out.", 
    "unreachable": true
}
  9158 1561836727.47340: no more pending results, returning what we have

If I remove the following two settings from the Ansible configuration, I can run the loop 50 times without a single hang:

    strategy_plugins = /home/zabbixserver/zabbixautomation/mitogen-0.2.7/ansible_mitogen/plugins/strategy
    strategy = mitogen_linear

Below is the content of the ping.yml playbook:

---
- hosts: all
  gather_facts: no
  tasks:
    - ping:
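
Because the hang is intermittent, the reproduction loop earlier in this report stalls indefinitely on the first bad run. A possible way to keep the loop going and count how often it hangs is to wrap each run in a timeout. The following is only a minimal sketch: the helper name `run_repeatedly`, the status labels, and the timeout value are assumptions for illustration and are not part of Ansible or Mitogen.

```python
import subprocess
import sys
import time


def run_repeatedly(cmd, runs, timeout_s, env=None):
    """Run cmd `runs` times and return a list of (index, status, seconds).

    status is "ok" (exit code 0), "failed" (non-zero exit code), or
    "hung" (the process was killed after timeout_s seconds).
    Hypothetical helper for illustration only.
    """
    results = []
    for i in range(runs):
        start = time.monotonic()
        try:
            # Output is discarded here; redirect to a file instead if the
            # debug logs of each run should be kept.
            proc = subprocess.run(
                cmd,
                stdout=subprocess.DEVNULL,
                stderr=subprocess.DEVNULL,
                timeout=timeout_s,
                env=env,
            )
            status = "ok" if proc.returncode == 0 else "failed"
        except subprocess.TimeoutExpired:
            # subprocess.run() kills the child before raising. Grandchildren
            # (for example ssh processes) may need separate cleanup.
            status = "hung"
        results.append((i, status, time.monotonic() - start))
    return results
```

It could be called with the command from this report, for example `run_repeatedly(["ansible-playbook", "-i", "mzoneinis/mzoneXXX.ini", "ping.yml", "-vvv", "-e", "mitogen_ssh_debug_level=3"], runs=20, timeout_s=600)`, where 600 seconds is an arbitrary assumed limit. Runs reported as "failed" are expected here, since the inventory includes an unreachable host; only "hung" entries indicate the problem described above.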
