Post-run error #333

Closed
knightsg opened this issue Aug 1, 2018 · 7 comments

Comments

@knightsg knightsg commented Aug 1, 2018

Background: I previously had an issue, documented in #319, where Ansible using Mitogen would fail immediately after starting a playbook run. That was resolved, and it appeared to be working fine. When I tested that fix, I was using a playbook that imports 4 roles of 4 to 15 tasks each.

Current issue: I tried to run ansible using mitogen against our site.yml playbook that imports a bunch (18) of other playbooks. When I run site.yml against the same server (using -l ), I get this error after the run completes:


ERROR! [pid 4766] 13:57:01.410645 E mitogen.ctx.local.4790: mitogen: <IoLogger stderr> crashed
Traceback (most recent call last):
  File "<stdin>", line 1778, in _call
  File "<stdin>", line 1556, in on_receive
  File "<stdin>", line 896, in read
  File "<stdin>", line 287, in io_op
OSError: [Errno 11] Resource temporarily unavailable
ERROR! [pid 4766] 13:57:01.445036 E mitogen.ctx.local.4790: mitogen: <IoLogger stdout> crashed
Traceback (most recent call last):
  File "<stdin>", line 1778, in _call
  File "<stdin>", line 1556, in on_receive
  File "<stdin>", line 896, in read
  File "<stdin>", line 287, in io_op
OSError: [Errno 11] Resource temporarily unavailable

It doesn't appear to cause any issues apart from the error, but I thought I should post it here nonetheless.
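
For readers unfamiliar with the error in the traceback: errno 11 on Linux is EAGAIN, which a non-blocking read raises when no data is ready yet. The following is a minimal sketch of how such a read is commonly handled. It is not Mitogen's code: `read_nonblocking` is a hypothetical helper, and the sketch assumes Python 3.5+ on a Unix-like system (the controller in this report runs Python 2.7).

```python
import errno
import os


def read_nonblocking(fd, size=4096):
    """Return the bytes read, b'' on EOF, or None if no data is ready yet."""
    try:
        return os.read(fd, size)
    except OSError as e:
        # EAGAIN (errno 11 on Linux) means "try again later", not a failure.
        if e.errno in (errno.EAGAIN, errno.EWOULDBLOCK):
            return None
        raise


r, w = os.pipe()
os.set_blocking(r, False)        # put the read end in non-blocking mode
empty = read_nonblocking(r)      # nothing has been written yet
os.write(w, b"hello")
data = read_nonblocking(r)       # the bytes written above
os.close(w)
eof = read_nonblocking(r)        # write end closed, so the read reports EOF
os.close(r)
```

An event loop that trusts a readiness notification and then reads without this kind of guard will surface EAGAIN as a crash whenever the notification turns out to be wrong.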

Setup
Controller
Uname: Linux XXXXXXXXXX 4.4.0-43-Microsoft #1-Microsoft Wed Dec 31 14:42:53 PST 2014 x86_64 x86_64 x86_64 GNU/Linux (Windows Subsystem for Linux - WSL, which is currently running Ubuntu 16.04.5)

Ansible:
ansible 2.4.6.0 (stable-2.4 9d08bc8242) last updated 2018/07/16 11:49:56 (GMT -700)
config file = /root/.ansible.cfg
configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/ansible/lib/ansible
executable location = /usr/local/ansible/bin/ansible
python version = 2.7.12 (default, Dec 4 2017, 14:50:18) [GCC 5.4.0 20160609]

Notes: Added 'strategy_plugins = /path/to/mitogen-0.2.1/ansible_mitogen/plugins/strategy' to defaults and ran ansible-playbook with 'ANSIBLE_STRATEGY=mitogen_linear' set.

Host(s)
Uname: Linux XXXXXXXXXXXXXXX 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Python 2.7.6

@knightsg knightsg commented Aug 7, 2018

Unfortunately it's still happening for me in the master branch (latest pulled commit is #898c06f1).

@dw dw commented Aug 8, 2018

Sorry for the delay -- been super busy :) I'm betting this is basically an epoll manifestation of https://github.com/dw/mitogen/issues/320

The fix will be similar, however the epoll interface is totally different to kqueue, so I need some free time to properly investigate it.

There is an underlying root cause to both of these bugs; the previous fix didn't find or fix it.

@knightsg knightsg commented Aug 8, 2018

No worries, and thanks for the reply. I'm not in any major rush to get this working.

dw added a commit that referenced this issue Oct 30, 2018
Now poller is smart enough to know a start_receive() during an iteration
does not cause events yielded by that iteration to associate with the
wrong descriptor.

These changes are tangentially related to the associated ticket, but
event versioning is still the underlying issue.
dw added a commit that referenced this issue Oct 30, 2018
dw added a commit that referenced this issue Oct 30, 2018

@dw dw commented Nov 2, 2018

Somehow made it 3 months without noticing you're running WSL. The root cause is that WSL produces conflicting socket readiness / error codes when a socket is half-shutdown.

Ultra-simple fix is to avoid shutting down sockets like that on WSL, but the mechanism is required to ensure timely cleanup of child contexts.
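
As an illustration of the half-shutdown mechanism mentioned above, here is a minimal sketch using a plain socket pair on a platform that behaves as expected. It is not taken from Mitogen; it only shows why shutting down one direction is useful for cleanup: the peer sees end-of-file promptly while the other direction stays open.

```python
import socket

# A connected pair of stream sockets standing in for parent and child.
a, b = socket.socketpair()

a.shutdown(socket.SHUT_WR)    # half-shutdown: a will send no more data
eof = b.recv(1024)            # the peer reads an empty result, i.e. EOF

b.sendall(b"still open")      # the opposite direction remains usable
reply = a.recv(1024)

a.close()
b.close()
```

According to the comment above, WSL reports conflicting readiness and error codes for a socket in this half-shutdown state, which is what the sketch would not reproduce on a conforming kernel.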

dw added a commit that referenced this issue Nov 2, 2018
@dw dw closed this in 1d32ed3 Nov 2, 2018
dw added a commit that referenced this issue Nov 2, 2018
- issue #323, #333 WSL workaround.

@dw dw commented Nov 2, 2018

I've simply avoided shutdown() use on WSL for now. If the WSL bug ever gets fixed, I'll look at removing the workaround.
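
A rough sketch of what such a workaround could look like follows. This is an assumption for illustration, not the code from the referenced commit: `looks_like_wsl` and `half_close` are hypothetical names, and the detection simply checks for "Microsoft" in the kernel release string, as seen in the controller's uname output in this report. Mitogen's actual detection may differ.

```python
import platform
import socket


def looks_like_wsl(release=None):
    """Guess whether we run under WSL from the kernel release string."""
    if release is None:
        release = platform.release()
    # Assumption: WSL kernels carry "Microsoft" in the release,
    # e.g. "4.4.0-43-Microsoft".
    return "microsoft" in release.lower()


def half_close(sock, release=None):
    """Shut down the write side unless on WSL.

    Returns True if shutdown() was called, False if it was skipped.
    """
    if looks_like_wsl(release):
        return False
    sock.shutdown(socket.SHUT_WR)
    return True
```

The `release` parameter exists only so the behaviour can be exercised on any machine; a caller would normally omit it and let the function consult `platform.release()`.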

This is now on the master branch and will make it into the next release. To be notified when a new release is made, subscribe to https://networkgenomics.com/mail/mitogen-announce/

Thanks for reporting this!

@dw dw added the os:wsl label Nov 2, 2018

@knightsg knightsg commented Jan 11, 2019

I tried this out today and it's now working without any errors. Just thought I'd confirm the fix - thanks!
