Unable to find identity file in home directory of user running Ansible #334

Closed
knightsg opened this issue Aug 1, 2018 · 7 comments

Comments

@knightsg knightsg commented Aug 1, 2018

I'm running the latest master (commit #232aaf5c) and getting the following error when a playbook is run from our Jenkins agent server, as the Jenkins user:

[pid 3613] 21:59:06.117798 D mitogen: mitogen.ssh.Stream(u'local.3681').connect(): child process stdin/stdout=64
[pid 3613] 21:59:06.120488 D mitogen: mitogen.ssh.Stream(u'local.3681'): received 'Warning: Identity file ~/.ssh/XXXXXXXX.pem not accessible: No such file or directory.\n'
[pid 3613] 21:59:06.215677 D mitogen: mitogen.ssh.Stream(u'local.3681'): received 'Permission denied (publickey).\r\n'
[pid 3613] 21:59:06.216505 D mitogen: mitogen.ssh.Stream(u'local.3681'): child process still alive, sending SIGTERM
[pid 3678] 21:59:06.218670 D mitogen: Broker(0x7fb58094add0).shutdown()
[pid 3678] 21:59:06.219146 D mitogen: mitogen.core.Stream(u'unix_listener.3613').on_disconnect()
[pid 3678] 21:59:06.219847 D mitogen: Context(0, None).on_disconnect()
[pid 3613] 21:59:06.220063 D mitogen: mitogen.core.Stream(u'unix_client.3678').on_disconnect()
[pid 3613] 21:59:06.220703 D mitogen: Context(1003, None).on_disconnect()
[pid 3678] 21:59:06.221500 D mitogen: Waker(Broker(0x7fb58094add0) rfd=14, wfd=15).on_shutdown()
[pid 3678] 21:59:06.222075 D mitogen: Waker(Broker(0x7fb58094add0) rfd=14, wfd=15).on_disconnect()

fatal: [172.25.0.36]: UNREACHABLE! => {
    "changed": false,
    "msg": "SSH authentication is incorrect",
    "unreachable": true
}

When this is run without Mitogen it works fine. With Mitogen, for some reason it can't find the SSH key file in the jenkins user's home folder (or it can't find the home folder at all).

Controller
Uname: Linux XXXXXXXXXX 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty

Ansible
ansible 2.4.6.0 (detached HEAD ref: refs/) last updated 2018/08/01 22:18:08 (GMT +000)
config file = /var/jenkins/.ansible.cfg
configured module search path = [u'/var/jenkins/.ansible/plugins/modules', u'/usr/share/ansible/plugins/modules']
ansible python module location = /opt/ansible/ansible-2.4/lib/ansible
executable location = /opt/ansible/ansible-2.4/bin/ansible
python version = 2.7.6 (default, Nov 23 2017, 15:49:48) [GCC 4.8.4]

Notes: Added 'strategy_plugins = /path/to/mitogen-master/ansible_mitogen/plugins/strategy' to defaults and ran ansible-playbook with 'ANSIBLE_STRATEGY=mitogen_linear' set.

Host(s)
Uname: Linux XXXXXXXXXXXX 3.13.0-153-generic #203-Ubuntu SMP Thu Jun 14 08:52:28 UTC 2018 x86_64 x86_64 x86_64 GNU/Linux
Distributor ID: Ubuntu
Description: Ubuntu 14.04.5 LTS
Release: 14.04
Codename: trusty
Python 2.7.6

Member

@dw dw commented Aug 10, 2018

This looks strange! OpenSSH handles tilde expansion internally (i.e. it's not a side-effect of Ansible passing "~/your.key" through the shell). Would it be possible for you to run this again, and search for a command like the one below in your "-vvv" output:

[pid 35825] 11:21:27.074615 D mitogen: hybrid_tty_create_child() pid=35827 stdio=95, tty=49, cmd: ssh -o "LogLevel ERROR" -o "Compression yes" -o "ServerAliveInterval 15" -o "ServerAliveCountMax 3" -o "StrictHostKeyChecking no" -o "UserKnownHostsFile /dev/null" -o "GlobalKnownHostsFile /dev/null" -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s localhost /usr/bin/python -c "'import codecs,os,sys;_=codecs.decode;exec(_(_(\"eNqFkFFLwzAUhZ/XX9G3m7CwpZ1DKQSUIeKDCEXcgw5pl1SDXRLSbnH+eu86Ye188O1+nHPvuZycLYVtJk47RWjkWeiRrmKEyvpPQrNohLPcupRwlnBOT5yzPnlUkyOva9sokvfB92HZh4CAgc0e4+uixdRNLEQMsvBBG4gLIztRfan1ti3KWnXydNv4aanN1O3bD2sA/xyd2caiW9wp32hrXrLZqotVZqc9Mtzkd88cVmK4dvQg1mQosCGOgWx0a9+VyeQmXN/WUtfZbH6VzinQCG8Er1tFEgYP90+PnPNXA5i9thIrptFCvJFDydI6ZbBa8CXQiVeFJEl6kV5SBt/a4aXKiZNvySCUcOi9cr8Bi24+dnnmDv+5/36ZDL78ARZxq8o=\".encode(),\"base64\"),\"zip\"))'"

I'm only interested in the options passed to SSH, the hostname and base64'd blob don't matter.

Can you also please run the same with -vvv under regular Ansible, and return one of the lines like:

<localhost> ESTABLISH SSH CONNECTION FOR USER: None
<localhost> SSH: EXEC ssh -o ForwardAgent=yes -o ControlMaster=auto -o ControlPersist=60s -o StrictHostKeyChecking=no -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o ConnectTimeout=10 -o ControlPath=/Users/dmw/.ansible/cp/8a5a4c6a60 localhost '/bin/sh -c '"'"'/usr/bin/python && sleep 0'"'"''

Finally can you confirm the current working directory when Jenkins runs Ansible (is it the same as the home directory?) and also whether the value of $HOME looks like it makes sense.

I'm thinking the underlying problem here might be that the current working directory has changed, that the SSH client is somehow being executed in the wrong account, or that an environment difference means HOME is incorrect when running under Jenkins (which would not surprise me).
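For anyone wanting to gather the values asked about above in one go, here is a minimal sketch. It is not part of Mitogen or Ansible; the function name `environment_snapshot` is hypothetical, and it uses Python's `pwd` module as a stand-in for whatever password database lookup the SSH client performs. Run it as the same user and from the same job that launches Ansible.

```python
import os
import pwd


def environment_snapshot():
    """Collect cwd, $HOME and the passwd-database home for the current user."""
    uid = os.getuid()
    try:
        entry = pwd.getpwuid(uid)
        passwd_home, user = entry.pw_dir, entry.pw_name
    except KeyError:
        # The UID has no passwd entry (possible in some containers).
        passwd_home, user = None, None
    return {
        "cwd": os.getcwd(),
        "env_home": os.environ.get("HOME"),
        "passwd_home": passwd_home,
        "user": user,
        "uid": uid,
    }


if __name__ == "__main__":
    snap = environment_snapshot()
    for key in sorted(snap):
        print("%s = %s" % (key, snap[key]))
    if snap["env_home"] != snap["passwd_home"]:
        print("warning: $HOME differs from the passwd database entry")
```

If the warning line appears, tilde expansion based on $HOME and tilde expansion based on the password database would produce different paths.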

@dw dw added the target:v0.2 label Aug 11, 2018
Author

@knightsg knightsg commented Aug 22, 2018

Sure:

With ANSIBLE_STRATEGY=mitogen_linear (unneeded info removed):

[pid 595] 18:38:10.528436 D mitogen: hybrid_tty_create_child() pid=854 stdio=72, tty=25, cmd: ssh -o "LogLevel ERROR" -l ubuntu -p 22 -i ~/.ssh/XXXXXXXX.pem -o "Compression yes" -o "ServerAliveInterval 15" -o "ServerAliveCountMax 3" -o "StrictHostKeyChecking no" -o "UserKnownHostsFile /dev/null" -o "GlobalKnownHostsFile /dev/null" -o ControlMaster=auto -o ControlPersist=30m 
[pid 595] 18:38:10.530594 D mitogen: mitogen.ssh.Stream(u'local.854').connect(): child process stdin/stdout=72
[pid 595] 18:38:10.532948 D mitogen: mitogen.ssh.Stream(u'local.854'): received 'Warning: Identity file ~/.ssh/XXXXXXXX.pem not accessible: No such file or directory.\n'

Without mitogen enabled (removed -o ControlPath from output):
SSH: EXEC ssh -o ControlMaster=auto -o ControlPersist=30m -o StrictHostKeyChecking=no -o Port=22 -o 'IdentityFile="/var/jenkins/.ssh/XXXXXXXX.pem"' -o KbdInteractiveAuthentication=no -o PreferredAuthentications=gssapi-with-mic,gssapi-keyex,hostbased,publickey -o PasswordAuthentication=no -o User=XXXXXXXXX -o ConnectTimeout=30

Yes, the current working directory is the home directory, and the value of $HOME is /var/jenkins. I notice that the non-mitogen run uses the full path to the home directory.

@dw dw added the user-reported label Sep 12, 2018
Author

@knightsg knightsg commented Sep 13, 2018

Just to update, I manually changed the path to the key file in the inventory (it's set via the ansible_ssh_private_key_file variable) from using a tilde to the full path to the user's home directory, and it works while using mitogen_linear. So it's definitely that expansion that it's bugging out on.

Member

@dw dw commented Oct 30, 2018

Okay:

  • When specified as --private-key=, Ansible tilde-expands using the $HOME environment variable (via os.path.expanduser()) and stores the result in PlayContext.private_key_file

  • When specified as ansible_ssh_private_key_file, Ansible does no tilde-expansion, and stores the literal string in PlayContext.private_key_file

  • When the Ansible SSH plugin receives a private key file, it expands it using os.path.expanduser() and passes it to ssh via "-o IdentityFile=...".

  • When the Mitogen SSH plugin receives a private key file, it does not expand it, but passes it literally to ssh via -i.

  • The SSH tilde-expansion logic does not use the $HOME environment variable; it uses getpwent() to look up the user's home directory.

The problem arises because vanilla Ansible expands the path using $HOME from the environment, whereas Mitogen leaves the expansion to SSH, which uses getpwent().

If sudo is used without the appropriate flags, $HOME will be incorrect for the account actually running the command. Ansible therefore expands to the path the user expected, while SSH, which looks up the home directory afresh, uses a path that is technically correct for that account but is not where the key lives.
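The difference between the two expansion strategies can be sketched in a few lines of Python. This is an illustration only, not the code from the fix: `expand_with_env` and `expand_with_passwd` are hypothetical helper names, the key path and the alternate home directory are made up, and `pwd.getpwuid()` is assumed here as a stand-in for the password database lookup that SSH performs in C.

```python
import os
import pwd


def expand_with_env(path, env_home):
    """Expand "~" via os.path.expanduser() with $HOME temporarily set to env_home."""
    saved = os.environ.get("HOME")
    os.environ["HOME"] = env_home
    try:
        return os.path.expanduser(path)
    finally:
        # Restore the caller's environment exactly as it was.
        if saved is None:
            del os.environ["HOME"]
        else:
            os.environ["HOME"] = saved


def expand_with_passwd(path):
    """Expand "~" from the passwd database, ignoring $HOME entirely."""
    if path == "~" or path.startswith("~/"):
        home = pwd.getpwuid(os.getuid()).pw_dir
        return (home.rstrip("/") + path[1:]) or "/"
    return path


if __name__ == "__main__":
    key = "~/.ssh/example.pem"  # hypothetical key path
    # Simulate a process whose $HOME does not match its passwd entry.
    print("env-based:    " + expand_with_env(key, "/tmp/other-home"))
    print("passwd-based: " + expand_with_passwd(key))
```

Whenever the two printed paths differ, a key that exists under one of them will be reported as missing by whichever component uses the other strategy, which matches the "Identity file ... not accessible" warning in the logs above.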

@dw dw closed this in 96f000c Oct 30, 2018
dw added a commit that referenced this issue Oct 30, 2018
- issue #334
Member

@dw dw commented Oct 30, 2018

So sorry for the huge wait on this one! Debugging these expansions is a real pain -- sometimes there are multiple nested layers of /bin/sh involved in upstream, and it's impossible without actually running things under strace to figure out what is happening where.

This is now on the master branch and will make it into the next release. To be notified when a new release is made, subscribe to https://networkgenomics.com/mail/mitogen-announce/

Thanks for reporting this!

Author

@knightsg knightsg commented Oct 30, 2018

No worries at all, I appreciate the effort! I track the master branch so I can already go ahead and try this out on our Jenkins agent.

Author

@knightsg knightsg commented Jan 11, 2019

I tried this out today and it's now working for me. Thanks!
