
Mitogen does not implement BECOME_ALLOW_SAME_USER #499

Closed
knightsg opened this issue Jan 29, 2019 · 16 comments

knightsg commented Jan 29, 2019

Controller info:
uname -a
Linux BB0000039 4.4.0-43-Microsoft #1-Microsoft Wed Dec 31 14:42:53 PST 2014 x86_64 x86_64 x86_64 GNU/Linux

lsb_release -a
No LSB modules are available.
Distributor ID: Ubuntu
Description: Ubuntu 16.04.5 LTS
Release: 16.04
Codename: xenial

ansible-playbook --version
ansible-playbook 2.6.12.post0 (detached HEAD ref: refs/) last updated 2018/10/31 12:12:33 (GMT -700)
config file = /root/.ansible.cfg
configured module search path = [u'/root/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /usr/local/ansible/lib/ansible
executable location = /usr/local/ansible/bin/ansible-playbook
python version = 2.7.12 (default, Dec 4 2017, 14:50:18) [GCC 5.4.0 20160609]

When we run our playbooks, we pass in the Boto profile name using the AWS_PROFILE environment variable to select the specific account we want. However, since enabling Mitogen in Ansible the other day, I've noticed one of our playbooks failing.

The playbook includes the following task to gather vpc information from the selected AWS account:

    - ec2_vpc_net_facts:
        region: "{{ data.region }}"
        filters:
          isDefault: "false"
      register: vpc_info

When I run the above with Mitogen enabled, it returns a list of the VPCs from the 'default' profile in the .aws/credentials file on my controller machine, even when I pass a different profile in using AWS_PROFILE. It took me a while to work out that it was returning the wrong VPCs, but eventually I tried disabling Mitogen and it started working correctly, returning the list of VPCs from the account/profile I specified.

After identifying mitogen as the cause of the issue, I tried explicitly supplying the aws access key and secret key via the appropriate parameters in the ec2_vpc_net_facts module and it started working as expected. So the issue is specifically that mitogen isn't recognising the AWS_PROFILE environment variable. For the record, I was supplying this in the call to the playbook as follows:

ANSIBLE_STRATEGY=mitogen_linear AWS_PROFILE=my_profile ansible-playbook playbooks/my_playbook.yml
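For context, Boto-style credential lookup picks the section of ~/.aws/credentials named by AWS_PROFILE, falling back to [default] when the variable is unset, which is exactly why a dropped environment variable silently selects the wrong account. A minimal stand-in sketch (hypothetical profile names and keys, using a temporary file rather than real boto):

```python
# Sketch of AWS_PROFILE resolution against a credentials file. The
# profile names and keys here are hypothetical; real boto consults
# environment credentials and ~/.aws/config as well.
import configparser
import os
import tempfile

CREDENTIALS = """\
[default]
aws_access_key_id = DEFAULTKEY

[my_profile]
aws_access_key_id = PROFILEKEY
"""

def resolve_profile(environ, path):
    """Return the access key for the profile named by AWS_PROFILE."""
    cfg = configparser.ConfigParser()
    cfg.read(path)
    profile = environ.get("AWS_PROFILE", "default")
    return cfg[profile]["aws_access_key_id"]

with tempfile.NamedTemporaryFile("w", suffix=".ini", delete=False) as f:
    f.write(CREDENTIALS)
    creds_path = f.name

default_key = resolve_profile({}, creds_path)
profile_key = resolve_profile({"AWS_PROFILE": "my_profile"}, creds_path)
print(default_key, profile_key)  # DEFAULTKEY PROFILEKEY
os.unlink(creds_path)
```

If the variable never reaches the process running the module, the lookup quietly lands on [default], matching the behaviour described above.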

dw (Member) commented Jan 29, 2019

If the module is executing on the local machine (i.e. local_action, connection: local, or delegate_to: localhost) things should just work already, and I just verified this is the case.

If it's executing on a remote machine, then I /think/ this is some property of the SSH client that is allowing the environment variables to migrate across machines -- often this is disabled.

Would it be possible for you to review '-vvv -e mitogen_ssh_debug_level=3' output and check to see what SSH is doing with AWS_PROFILE?

Also, do you have any other relevant flags (e.g. sudo_flags containing -E) in ansible.cfg that could explain how the mechanism is enabled? I'm a bit surprised; I haven't seen this turned on anywhere in quite a while :)

Either way this should be a straightforward fix. Thanks for the incredibly clear report.

dw (Member) commented Jan 29, 2019

It's possible that, if you set several of ssh_flags, ssh_common_flags, and ssh_extra_flags, Mitogen is not reproducing them on the command line the same way Ansible does. These command lines show up in ANSIBLE_DEBUG=1 output for vanilla Ansible, and in -vvv output for Mitogen.

dw (Member) commented Jan 29, 2019

Yet Another(tm) possibility is that somehow HOME is wrong; this has been a recurring simple-yet-subtle bug. Is become: true set at the playbook level for that task?

knightsg (Author) commented Jan 29, 2019

Yep, the playbook is using both hosts: localhost and connection: local. Also, I don't have sudo_flags explicitly set in ansible.cfg.

I actually can't find AWS_PROFILE mentioned at all in the output using -vvv -e mitogen_ssh_debug_level=3. I could supply the file to you if you'd like but I have to do a bunch of sanitising first and don't want to go through all that unless I have to :) But I'm happy to do it if you think you need it.

knightsg (Author) commented Jan 29, 2019

By the way, I don't see any of ssh_flags, ssh_common_flags, ssh_extra_flags in the -vvv output for mitogen.

become=True is set in my ansible.cfg but not in the playbook itself in this case.

dw (Member) commented Jan 29, 2019

How is the task actually running?

  • hosts line of the play
  • become (you just gave me it thanks)
  • any delegate_to?

I can see some difference in handling of 'localhost', might have spotted it. It appears to depend on the ansible.cfg in Mitogen's tests dir, hmm :)

knightsg (Author) commented Jan 29, 2019

The hosts line is hosts: localhost. There's no delegate_to for the task or anywhere in the playbook.

dw (Member) commented Jan 29, 2019

# No config, no inventory
ansible -m shell localhost -a 'echo $AWS_PROFILE'       -> (present)
ansible -m shell localhost -a 'echo $AWS_PROFILE' -b    -> (absent, but depends on sudoers config)

# No config, explicit 'localhost'
ansible -m shell -i localhost, localhost -a 'echo $AWS_PROFILE'    -> (absent, it's going via ssh localhost)
no point testing ^ with become

# Mitogen, no config, no inventory
ansible -m shell localhost -a 'echo $AWS_PROFILE'       -> (present)
ansible -m shell localhost -a 'echo $AWS_PROFILE' -b    -> (absent)

# sudo "Defaults !env_reset", vanilla:
ansible -m shell localhost -a 'echo $AWS_PROFILE' -b    -> (present)

# Mitogen
ansible -m shell localhost -a 'echo $AWS_PROFILE' -b    -> (present)

Hrm, still not seeing this. Will keep digging.

You mention become is active. Does your /etc/sudoers mention anything like "env_reset", "env_keep", or "SETENV"? I think your sudo may preserve the environment by default, whereas mine did not until I looked into this.

Also, does your inventory include an explicit "localhost" line?

If you run "ansible -m mitogen_get_stack localhost" in that playbook repository using recent Mitogen, does it show something like "connection: local" or "connection: ssh"?

Sorry for the million questions -- these env differences are entirely undocumented and annoyingly subtle! Thanks again
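The present/absent pattern in the matrix above can be reproduced locally without sudo by controlling the child environment directly. A sketch, with a hypothetical whitelist of kept variables standing in for sudo's env_reset behaviour:

```python
# Demonstrate how an env var survives (or not) into a child process,
# mimicking sudo's env_reset. The kept-variable whitelist is a
# simplification; real sudo's list is configured in sudoers.
import os
import subprocess

env = dict(os.environ, AWS_PROFILE="my_profile")

def echo_profile(child_env):
    """Run a child shell and report what it sees for AWS_PROFILE."""
    out = subprocess.run(["sh", "-c", "echo ${AWS_PROFILE:-ABSENT}"],
                         env=child_env, capture_output=True, text=True)
    return out.stdout.strip()

# Environment passed straight through -> the variable is "present".
present = echo_profile(env)

# A scrubbed environment, roughly what env_reset leaves -> "absent".
scrubbed = {k: v for k, v in env.items() if k in ("HOME", "PATH", "TERM")}
absent = echo_profile(scrubbed)

print(present, absent)  # my_profile ABSENT
```

Whether Ansible's sudo invocation behaves like the first or the second call is exactly what the sudoers questions above are probing.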

knightsg (Author) commented Jan 29, 2019

Yep, in /etc/sudoers it has this line:
Defaults env_reset

However, I'm not using sudo to run this; not sure if that matters or not, though.

The inventory doesn't include an explicit 'localhost' line. FYI, I'm not specifying an inventory in the command itself.

This is the output of ansible -m mitogen_get_stack localhost - no mention of 'connection':

localhost | SUCCESS => {
    "changed": true,
    "result": [
        {
            "kwargs": {
                "python_path": [
                    "/usr/bin/python"
                ]
            },
            "method": "local"
        },
        {
            "enable_lru": true,
            "kwargs": {
                "connect_timeout": 10,
                "password": null,
                "python_path": [
                    "/usr/bin/python"
                ],
                "sudo_args": [
                    "-H",
                    "-S",
                    "-n"
                ],
                "sudo_path": null,
                "username": "root"
            },
            "method": "sudo"
        }
    ]
}

No worries about the questions, I'm very happy to help :)

dw (Member) commented Jan 29, 2019

So just to triple check -

  • The task is targeting "localhost", and it's using the "implicit localhost" feature of Ansible, where it creates the equivalent of "localhost ansible_connection=local" if it were written in an inventory file
  • ansible.cfg has "[privilege_escalation] become = True", which has the effect of setting '-b' or 'become: true', even if they are not otherwise specified
  • The task has no other modifiers on the playbook or block level, such as delegate_to
  • The default become method is unchanged, and defaults to sudo

Now,

  • When running the task as defined above, Ansible manages to preserve the "AWS_PROFILE" environment variable across the sudo invocation, despite ansible.cfg apparently having no sudo_flags set, mitogen_get_stack indicating the Ansible config does not otherwise make "-E" appear on the sudo command line, sudoers being configured with "Defaults env_reset", and no "!env_reset" anywhere else in /etc/sudoers or /etc/sudoers.d/*

^ Does this sound right?

And just to be sure, does 'localhost' have any relevant group_vars defined? I don't think it can be a member of any group, since it's not explicitly in the inventory at all


So I've put the literal string "Defaults env_reset" in my sudoers like it is in yours, created an ansible.cfg like above, installed 2.6.12, no explicit localhost inventory, and a single task that looks like this:

- hosts: localhost
  tasks:
  - shell: echo $AWS_PROFILE
  - shell: echo $HOME

And I can't figure out how Ansible could possibly be preserving that AWS_PROFILE variable across the sudo invocation -- it's not working for me on this machine.

Does /root/.bashrc, /root/.profile, etc. maybe set it too? But even that would not explain it.

I have no doubt there is an environment variable difference lurking here, but given your sudo config I cannot understand how things are working for you at all just now :) This one may take a few days.

dw (Member) commented Jan 29, 2019

Another double-take.. if you replace that task with simply "shell: echo $AWS_PROFILE", in the same location in the playbook, and run with ansible-playbook -v, do you see your profile name printed?

Both 'shell' and 'ec2_vpc_net_facts' are new-style modules, so there should be no difference, but it's always worth testing your tests ;)

dw (Member) commented Jan 29, 2019

Does your machine have any of /etc/environment, ~/.pam_environment, or /root/.pam_environment?

dw (Member) commented Jan 29, 2019

Found something: Ansible ignores become when it's running as root.

dw changed the title from "AWS_PROFILE not recognised" to "Mitogen does not implement BECOME_ALLOW_SAME_USER" on Jan 29, 2019
dw (Member) commented Jan 29, 2019

It's because we don't implement this:

BECOME_ALLOW_SAME_USER:
  name: Allow becoming the same user
  default: False
  description: This setting controls if become is skipped when remote user and become user are the same. I.E root sudo to root.
  env: [{name: ANSIBLE_BECOME_ALLOW_SAME_USER}]
  ini:
  - {key: become_allow_same_user, section: privilege_escalation}
  type: boolean
  yaml: {key: privilege_escalation.become_allow_same_user}

Should be an easy fix! Thanks for your help
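A hedged sketch (not Mitogen's actual code, and the function name is invented for illustration) of the check the setting describes: skip the privilege-escalation step when the remote user already equals the become user, unless become_allow_same_user is enabled:

```python
def should_escalate(remote_user, become_user, allow_same_user=False):
    """Decide whether a sudo step belongs in the connection stack.

    Mirrors BECOME_ALLOW_SAME_USER: with the default False, a
    same-user become (e.g. root sudo'ing to root) is skipped.
    """
    if remote_user == become_user and not allow_same_user:
        return False
    return True

print(should_escalate("root", "root"))        # False: skipped, as vanilla does
print(should_escalate("deploy", "root"))      # True: normal escalation
print(should_escalate("root", "root", True))  # True: forced by the setting
```

Under this logic, the reporter's controller (running as root with become=True) would never enter sudo at all under vanilla Ansible, so env_reset never applied; Mitogen's stack shown earlier inserted the sudo step anyway, which is why AWS_PROFILE disappeared.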

dw added a commit that referenced this issue Jan 29, 2019
dw closed this in 60e9596 on Jan 29, 2019
dw added a commit that referenced this issue Jan 29, 2019
* origin/issue499:
  issue #499: another totally moronic implementation difference
  issue #499: disable new test on vanilla.
  docs: update Changelog; closes #499.
  issue #499: respect C.BECOME_ALLOW_SAME_USER.
dw (Member) commented Jan 29, 2019

This is now on the master branch and will make it into the next release. To be notified when a new release is made, subscribe to https://networkgenomics.com/mail/mitogen-announce/

Thanks for reporting this!

dw added a commit that referenced this issue Jan 29, 2019
* commit 'fe74577':
  Use develop mode in tox
  issue #429: fix sudo regression.
  misc: rename to scripts. tab completion!!
  core: Latch._wake improvements
  issue #498: prevent crash on double 'disconnect' signal.
  issue #413: don't double-propagate DEL_ROUTE to parent.
  issue #498: wrap Router dict mutations in a lock
  issue #429: enable en_US locale to unbreak debops test.
  issue #499: fix another mind-numbingly stupid vanilla inconsistency
  issue #497: do our best to cope with crap upstream code
  ssh: fix test to match updated log format.
  issue #429: update Changelog.
  issue #429: update Changelog.
  issue #429: teach sudo about every know i18n password string.
  issue #429: install i18n-related bits in test images.
  ssh: tidy up logs and stream names.
  tests: ensure file is closed in connection_test.
  gcloud: small updates
  tests: give ansible/gcloud/ its own requirements file.
knightsg (Author) commented Jan 29, 2019

Great, it works! Thanks for the extremely quick fix.

dw added a commit that referenced this issue Feb 10, 2019
* origin/master: (661 commits)
  Bump version for release.
  docs: update Changelog; closes #481
  issue #481: core: preserve stderr TTY FD if one is present.
  issue #481: avoid crash if disconnect occurs during forward_modules()
  Add a few more important modules to preamble_size.py.
  .ci: add verbiage for run_batches() too.
  .ci: add README.md.
  docs: update thanks
  docs: lose "approaching stability" language, we're pretty good now
  docs: fix changelog syntax/order/"20KB"
  tests: add new compression parameter to mitogen_get_stack results
  tests: disable affinity_test on Travis :/
  issue #508: fix responder stats test due to new smaller parent.py.
  issue #508: tests: skip minify_test Py2.4/2.5 for profiler.py.
  tests: fix fallout from 36fb318
  issue #520: add AIX auth failure string to su.
  tests: move affinity_test to Ansible tests.
  core: cProfile is not available in 2.4.
  issue #505: docs: add new detail graph for one scenario.
  docs: update and re-record profile graphs in docs; closes #505
  service: fix PushFileService exception
  tests: pad out localhost-*
  service: start pool shutdown on broker shutdown.
  master: .encode() needed for Py3.
  ansible: stash PID files in CWD if requested for debugging.
  issue #508: master: minify_safe_re must be bytes for Py3.
  bench: tidy up and cpu-pin some more files.
  tests: add localhost-x100
  ansible: double the default pool size.
  ansible: raise error with correct exception type.
  issue #508: master: minify all Mitogen/ansible_mitogen sources.
  parent: PartialZlib docstrings.
  ansible: hacky parser to alow bools to be specified on command line
  parent: pre-cache bootstrap if possible.
  docs: update Changelog.
  ansible: add mitogen_ssh_compression variable.
  service: PushFileService never recorded a file as sent.
  parent: synchronize get_core_source()
  service: use correct profile aggregation name.
  SyntaxError.
  ansible: don't pin controller if <4 cores.
  tests: make soak testing work reliably on vanilla.
  docs: changelog tidyups.
  ansible: document and make affinity stuff portable to non-Linux
  ansible: fix affinity.py test failure on 2 cores.
  ansible: preheat PluginLoader caches before fork.
  tests: make mitogen_shutdown_all be run_once by default.
  docs: update Changelog.
  ansible: use Poller for WorkerProcess; closes #491.
  ansible: new multiplexer/workers configuration
  docs: update Changelog.
  docs: update Changelog.
  ansible: pin connection multiplexer to a single core
  utils: pad out reset_affinity() and integrate with detach_popen()
  utils: import reset_affinity() function.
  master: set Router.profiling if MITOGEN_PROFILING variable present.
  parent: don't kill children when profiling is active.
  ansible: hook strategy and worker processes into profiler
  profiler: import from linear2 branch
  core: tidy up existing profiling code and support MITOGEN_PROFILE_FMT
  issue #260: redundant if statement.
  ansible: ensure MuxProcess MITOGEN_PROFILING results reach disk.
  ansible/bench: make end= configurable.
  master: cache sent/forwarded module names
  Aggregate code coverage data across tox all runs
  Allow independant control of coverage erase and reporting
  Fix incorrect attempt to use coverage
  docs: update Changelog; closes #527.
  issue #527: catch new-style module tracebacks like vanilla.
  Fix DeprecationWarning in mitogen.utils.run_with_router()
  Generate coverage report even if some tests fail
  ci: fix incorrect partition/rpartition from 8a4caea
  issue #260: hide force-disconnect messages.
  issue #498: fix shutdown crash
  issue #260: avoid start_transmit()/on_transmit()/stop_transmit()
  core: ensure broker profiling output reaches disk
  master: keep is_stdlib_path() result as negative cache entry
  ci: Allow DISTROS="debian*32" variable, and KEEP=1
  Use develop mode in tox
  issue #429: fix sudo regression.
  misc: rename to scripts. tab completion!!
  core: Latch._wake improvements
  issue #498: prevent crash on double 'disconnect' signal.
  issue #413: don't double-propagate DEL_ROUTE to parent.
  issue #498: wrap Router dict mutations in a lock
  issue #429: enable en_US locale to unbreak debops test.
  issue #499: fix another mind-numbingly stupid vanilla inconsistency
  issue #497: do our best to cope with crap upstream code
  ssh: fix test to match updated log format.
  issue #429: update Changelog.
  issue #429: update Changelog.
  issue #429: teach sudo about every know i18n password string.
  issue #429: install i18n-related bits in test images.
  ssh: tidy up logs and stream names.
  tests: ensure file is closed in connection_test.
  gcloud: small updates
  tests: give ansible/gcloud/ its own requirements file.
  issue #499: another totally moronic implementation difference
  issue #499: disable new test on vanilla.
  docs: update Changelog; closes #499.
  ...
dw added a commit that referenced this issue Feb 10, 2019
* origin/v024: (662 commits)
  docs: update Changelog release date.
  Bump version for release.
  ...