Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

ModuleNotFoundError with 'ansible_mitogen.target' #586

Closed
href opened this issue May 10, 2019 · 11 comments

Comments

@href
Copy link

commented May 10, 2019

I've encountered a bit of a problem with Suitable and Mitogen that I don't quite know how to solve. When trying to connect from a Python3.6/3.7 host to a Python3.6 Ubuntu 18.04 vanilla installation I get the following error:

Traceback (most recent call last):
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/ansible/executor/task_executor.py
line 140, in run
    res = self._execute()
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/ansible/executor/task_executor.py
line 612, in _execute
    result = self._handler.run(task_vars=variables)
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/ansible_mitogen/mixins.py
line 116, in run
    return super(ActionModuleMixin, self).run(tmp, task_vars)
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/ansible/plugins/action/command.py
line 24, in run
'    results = merge_hash(results, '
'self._execute_module(task_vars=task_vars, '
wrap_async=wrap_async))
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/ansible_mitogen/mixins.py
line 354, in _execute_module
    self._temp_file_gibberish(module_args, wrap_async)
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/ansible_mitogen/mixins.py
line 333, in _temp_file_gibberish
    self._connection.get_good_temp_dir()
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/ansible_mitogen/connection.py
line 702, in get_good_temp_dir
    self._connect()
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/ansible_mitogen/connection.py
line 721, in _connect
    self._connect_stack(stack)
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/ansible_mitogen/connection.py
line 675, in _connect_stack
    stack=mitogen.utils.cast(list(stack)),
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/mitogen/core.py
line 1859, in call_service
    return recv.get().unpickle()
File /Users/denis/.virtualenvs/operations/lib/python3.7/site-packages/mitogen/core.py
line 835, in unpickle
    raise obj
mitogen.core.CallError: builtins.ModuleNotFoundError: The
Mitogen master process was unable to serve
"'ansible_mitogen.target'. It may be a native Python extension, "
or it may be missing entirely. Check the importer debug logs on
the master for more information.
  File "<stdin>", line 3099, in _dispatch_one
  File "<stdin>", line 3086, in _parse_request
  File "<stdin>", line 596, in import_module
  File "<stdin>", line 1330, in load_module

They key is this:

mitogen.core.CallError: builtins.ModuleNotFoundError: The
Mitogen master process was unable to serve
"'ansible_mitogen.target'. It may be a native Python extension, "
or it may be missing entirely. Check the importer debug logs on
the master for more information.

I know this might be related to Suitable's on hackery ending up in something that is like Ansible, but not quite like Ansible. So I'm not sure this is a Mitogen issue. But I'd be happy to get any pointers.

Especially, how can I check the importer debug logs?

I'm running with MITOGEN_LOG_LEVEL=DEBUG and mitogen.log_to_file, but I do not have a proper test-setup yet that I could show off. I'm happy to provide one, but at this point I'm just as happy for any kind of information what might be going on here.

What could be the cause of this error?

@href

This comment has been minimized.

Copy link
Author

commented May 10, 2019

I'm setting the ansible_python_interpreter to /usr/bin/python3 by the way. If I just use python2 on the target, it works.

@dw dw added user-reported ansible and removed ansible labels May 11, 2019

@dw

This comment has been minimized.

Copy link
Owner

commented Jun 3, 2019

Sorry @href! You should ping me on Twitter if I've gone dead over here :) Will look at this ASAP tomorrow evening

@href

This comment has been minimized.

Copy link
Author

commented Jun 3, 2019

No worries, I kinda forgot about this too, so you know it's not a critical issue ;)

@dw

This comment has been minimized.

Copy link
Owner

commented Jun 3, 2019

When you have the logging framework configured with logging.basicConfig(level=logging.DEBUG) or when Ansible was started with -vvv, you should see lines like:

[mux  22568] 13:11:26.687390 D mitogen: _get_module_via_sys_modules(u'ansible.module_utils.distro._distro') -> <module 'ansible.module_utils.distro._distro' from '/home/dmw/src/mitogen/.venv/local/lib/python2.7/site-packages/ansible/module_utils/distro/_distro.pyc'>
[mux  22568] 13:11:26.691216 D mitogen: ssh.localhost: sending module ansible.module_utils._text (3.71 KiB)
[mux  22568] 13:11:26.691673 D mitogen: _build_tuple('/home/dmw/src/mitogen/.venv/lib/python2.7/site-packages/ansible/module_utils/common/__init__.py', u'ansible.module_utils.common') -> [u'_collections_compat', u'_json_compat', u'_utils', u'collections', u'dict_transformations', u'file', u'network', u'parameters', u'process', u'removed', u'sys_info', u'text', u'validation']
[mux  22568] 13:11:26.691833 D mitogen: ssh.localhost: sending module ansible.module_utils.common (0.43 KiB)
[mux  22568] 13:11:26.692558 D mitogen: ssh.localhost: sending module ansible.module_utils.common._utils (0.95 KiB)

Unfortunately everything goes to the same logger right now, so it's not clear what the importer parts are -- but it is referring to the lines above. Before build_tuple() completes and prints those lines, a lot of complex machinery runs that should print errors of its own, if dependency scanning fails or if it was unable to find the source code for a module.

If you can point me at a reproduction I'll be happy to run this locally.

@href

This comment has been minimized.

Copy link
Author

commented Jun 3, 2019

I'll get back to you as soon as I find the time, thanks!

@href

This comment has been minimized.

Copy link
Author

commented Jun 14, 2019

I was able to narrow this down somewhat in a separate test-case.

The weird thing is that it only happens if run inside a Python CLI, not when run as a script. Only in this test however, in the original case I encountered this, I had the same error in a script.

This is how you should be able to see the error. First, install dependencies:

pip install mitogen suitable ansible==2.7.11

Then, run the Python CLI:

python

Then paste the following code into the CLI:

import mitogen.utils
from suitable.mitogen import Api

mitogen.utils.log_to_file(level='DEBUG')
Api('127.0.0.1').command('whoami')

I already set up a container and everything to test this out, but it turns out that there's no connection needed either. So I'm not sure why exactly this occurs when Python 2 is not on the host in my original error. But maybe this error just masks something else.

@href

This comment has been minimized.

Copy link
Author

commented Aug 6, 2019

@dw Do you mind having a look at this? I've got another user who has found this problem and I think it is also the cause of Mitogen not working on the servers we have that only run Python 3.

@dw

This comment has been minimized.

Copy link
Owner

commented Aug 8, 2019

Hi @href, I hope to get to this one real soon. I'm so sorry for the wait, it's only the past week or so I've been able to spend serious time on the library again, and getting stuff merged to handle all of 2.8's features has been the priority. Naturally there has been a lot of fallout from doing a big merge

This issue is probably really simple, will try and get to it as soon as CI is reliably healthy

@dw

This comment has been minimized.

Copy link
Owner

commented Aug 9, 2019

What joyful luck (for me, at least!)

The reason it only manifests in the interactive interpreter is because sys.modules["__main__"] is different in that case.

Ansible 2.7 for inexplicable reasons changed modules_utils to import __main__, but in Mitogen there is no __main__ except the real module running on the master. This would mean pulling literally the Ansible main script over to the target machine if it worked, but the Ansible extension has always used an importer blacklist to prevent all kinds of random fun with third party modules, so this was never more than an ImportError.

Meanwhile to cope with that, ansible_mitogen.target synthesizes a 'real stub' __main__ in the child interpreter during import, before importing any of module_utils which will indirectly trigger the import of __main__.

Fast forward a little further, and sometime relatively recently in Python 3 land, pkgutil was broken due to this new 'importer specs' cleanup, that meant the old module loader could no longer rely on pkgutil to resolve __main__, and a custom DefectivePython3xMainMethod was added. This was actually done for OpsMop.

Unfortunately the new method was never written for __main__ coming from the interactive console, where the __file__ attribute is absent, and this is the error you are (indirectly) receiving.

Given

import mitogen.utils
from suitable.mitogen import Api

mitogen.utils.log_to_file(level='debug')
Api('127.0.0.2', verbosity='debug').command('whoami')

If you scroll up a good bit in the logs, you will see a WARNING generated by the importer (in bright pink on my terminal) about the failed import, and if you scroll back yet a little further, you'll see the exception being thrown by DefectivePython3xMainMethod. These exceptions are converted into silent warnings because of the sheer amount of abuse the import mechanism receives, noisy heuristics and lucky guesses are almost all we can hope for to systematically find a module's source in the general case, and so a failing method simply moves on to the next method.

And now for the lucky part.. @marc1006 sent #612 just this afternoon to fix your bug :) So it is already on master, mere hours before I finally sat down to figure out what was wrong.

Soooo sorry for the huge delay on getting to this -- hopefully I'll have at least the next week to burn through more backlog. Master is approaching releasable, but it is a 7kLOC diff compared to 0.2.7 and needs at least a few nights of soak tests (depending on outcome of the previous night's run!) to get out the door

Thanks again for reporting this!

@dw dw closed this Aug 9, 2019

dw added a commit that referenced this issue Aug 9, 2019

@dw

This comment has been minimized.

Copy link
Owner

commented Aug 9, 2019

@href as a final note, for verbosity="debug", you might consider still silencing the mitogen.io logger. It is a rare (but essential) occasion when it comes in useful, and becoming rarer all the time. Usually it just generates a ton of useless confusing spam

@href

This comment has been minimized.

Copy link
Author

commented Aug 9, 2019

Thanks @dw. Don't worry about this taking some time - I know too well how these things go 😏

Glad this got fixed and thanks for your input and detailed explanation. Looking forward to the release of 0.2.8!

dw added a commit that referenced this issue Aug 9, 2019

Merge remote-tracking branch 'origin/dmw'
* origin/dmw:
  issue #482: another Py3 fix
  ci: try removing exclude: to make Azure jobs work again
  compat: fix Py2.4 SyntaxError
  issue #482: remove 'ssh' from checked processes
  ci: Py3 fix
  issue #279: add one more test for max_message_size
  issue #482: ci: add stray process checks to all jobs
  tests: fix format string error
  core: MitogenProtocol.is_privileged was not set in children
  issue #482: tests: fail DockerMixin tests if stray processes exist
  docs: update Changelog.
  issue #586: update Changelog.
  docs: update Changelog.
  [security] core: undirectional routing wasn't respected in some cases
  docs: tidy up Select.all()
  issue #612: update Changelog.

dw added a commit that referenced this issue Aug 18, 2019

Merge remote-tracking branch 'origin/v028' into stable
* origin/v028: (383 commits)
  Bump version for release.
  docs: update Changelog for 0.2.8.
  issue #627: add test and tweak Reaper behaviour.
  docs: lots more changelog concision
  docs: changelog concision
  docs: more changelog tweaks
  docs: reorder chapters
  docs: versionless <title>
  docs: update supported Ansible version, mention unsupported features
  docs: changelog fixes/tweaks
  issue #590: update Changelog.
  issue #621: send ADD_ROUTE earlier and add test for early logging.
  issue #590: whoops, import missing test modules
  issue #590: rework ParentEnumerationMethod to recursively handle bad modules
  issue #627: reduce the default pool size in a child to 2.
  tests: add a few extra service tests.
  docs: some more hyperlink joy
  docs: more hyperlinks
  docs: add domainrefs plugin to make link aliases everywhere \o/
  docs: link IS_DEAD in changelog
  docs: tweaks to better explain changelog race
  issue #533: update routing to account for DEL_ROUTE propagation race
  tests: use defer_sync() Rather than defer() + ancient sync_with_broker()
  tests: one case from doas_test was invoking su
  tests: hide memory-mapped files from lsof output
  issue #615: remove meaningless test
  issue #625: ignore SIGINT within MuxProcess
  issue #625: use exec() instead of subprocess in mitogen_ansible_playbook
  issue #615: regression test
  issue #615: update Changelog.
  issue #615: ensure 4GB max_message_size is configured for task workers.
  issue #615: update Changelog.
  issue #615: route a dead message to recipients when no reply is expected
  issue #615: fetch_file() might be called with AnsibleUnicode.
  issue #615: redirect 'fetch' action to 'mitogen_fetch'.
  issue #615: extricate slurp brainwrong from mitogen_fetch
  issue #615: ansible: import Ansible fetch.py action plug-in
  issue #533: include object identity of Stream in repr()
  docs: lots more changelog
  issue #595: add buildah to docs and changelog.
  docs: a few more internals.rst additions
  ci: update to Ansible 2.8.3
  tests: another random string changed in 2.8.3
  tests: fix sudo_flags_failure for Ansible 2.8.3
  ci: fix procps command line format warning
  Whoops, merge together lgtm.yml and .lgtm.yml
  issue #440: log Python version during bootstrap.
  docs: update changelog
  issue #558: disable test on OSX to cope with boundless mediocrity
  issue #558, #582: preserve remote tmpdir if caller did not supply one
  issue #613: must await 'exit' and 'disconnect' in wait=False test
  Import LGTM config to disable some stuff
  Fix up another handful of LGTM errors.
  tests: work around AnsibleModule.run_command() race.
  docs: mention another __main__ safeguard
  docs: tweaks
  formatting error
  docs: make Sphinx install soft fail on Python 2.
  issue #598: allow disabling preempt in terraform
  issue #598: update Changelog.
  issue #605: update Changelog.
  issue #605: ansible: share a sem_t instead of a pthread_mutex_t
  issue #613: add tests for all the weird shutdown methods
  Add mitogen.core.now() and use it everywhere; closes #614.
  docs: move decorator docs into core.py and use autodecorator
  preamble_size: make it work on Python 3.
  docs: upgrade Sphinx to 2.1.2, require Python 3 to build docs.
  docs: fix Sphinx warnings, add LogHandler, more docstrings
  docs: tidy up some Changelog text
  issue #615: fix up FileService tests for new logic
  issue #615: another Py3x fix.
  issue #615: Py3x fix.
  issue #615: update Changelog.
  issue #615: use FileService for target->controll file transfers
  issue #482: another Py3 fix
  ci: try removing exclude: to make Azure jobs work again
  compat: fix Py2.4 SyntaxError
  issue #482: remove 'ssh' from checked processes
  ci: Py3 fix
  issue #279: add one more test for max_message_size
  issue #482: ci: add stray process checks to all jobs
  tests: fix format string error
  core: MitogenProtocol.is_privileged was not set in children
  issue #482: tests: fail DockerMixin tests if stray processes exist
  docs: update Changelog.
  issue #586: update Changelog.
  docs: update Changelog.
  [security] core: undirectional routing wasn't respected in some cases
  docs: tidy up Select.all()
  issue #612: update Changelog.
  master: fix TypeError
  pkgutil: fix Python3 compatibility
  parent: use protocol for getting remote_id
  docs: merge signals.rst into internals.rst
  os_fork: do not attempt to cork the active thread.
  parent: fix get_log_level() for split out loggers.
  issue #547: fix service_test failures.
  issue #547: update Changelog.
  issue #547: core/service: race/deadlock-free service pool init
  docs: update Changelog.
  ...
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
2 participants
You can’t perform that action at this time.