Could not make dir ~/.ansible/cp after upgrade to 1.2.3 #3943

Closed
jshprentz opened this issue Aug 25, 2013 · 9 comments
Labels
bug This issue/PR relates to a bug.

Comments

@jshprentz

I upgraded Ansible from release 1.2 to 1.2.3 and ran a playbook that had been working well.

joel@Dimension-8200:~/ansible$ git pull --rebase origin  release1.2.3
joel@Dimension-8200:~/ansible$ cd ../projects/f4d/
joel@Dimension-8200:~/projects/f4d$ ansible-playbook wordpress.yml -i hosts  -v

Ansible displayed a Python stack trace while gathering facts and then continued processing only one of the two hosts.

PLAY [Apply common configuration to all nodes] ********************************

GATHERING FACTS ***************************************************************
Could not make dir /home/joel/.ansible/cp: [Errno 17] File exists: '/home/joel/.ansible/cp'
Traceback (most recent call last):
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 70, in _executor_hook
    return_data = multiprocessing_runner._executor(host, new_stdin)
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 364, in _executor
    exec_rc = self._executor_internal(host, new_stdin)
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 443, in _executor_internal
    return self._executor_internal_inner(host, self.module_name, self.module_args, inject, port, complex_args=complex_args)
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 579, in _executor_internal_inner
    conn = self.connector.connect(actual_host, actual_port, actual_user, actual_pass, actual_transport, actual_private_key_file)
  File "/home/joel/ansible/lib/ansible/runner/connection.py", line 36, in connect
    conn = utils.plugins.connection_loader.get(transport, self.runner, host, port, user=user, password=password, private_key_file=private_key_file)
  File "/home/joel/ansible/lib/ansible/utils/plugins.py", line 170, in get
    return getattr(self._module_cache[path], self.class_name)(*args, **kwargs)
  File "/home/joel/ansible/lib/ansible/runner/connection_plugins/ssh.py", line 44, in __init__
    self.cp_dir = utils.prepare_writeable_dir('$HOME/.ansible/cp',mode=0700)
  File "/home/joel/ansible/lib/ansible/utils/__init__.py", line 203, in prepare_writeable_dir
    exit("Could not make dir %s: %s" % (tree, e))
  File "/home/joel/ansible/lib/ansible/utils/__init__.py", line 122, in exit
    sys.exit(rc)
SystemExit: 1
ok: [f4d.shprentz.com]

TASK: [Refresh the package cache if old] **************************************
ok: [f4d.shprentz.com] => {"changed": false, "item": ""}

At that point, I terminated Ansible with a control-c. Then I examined the suspect directory.

joel@Dimension-8200:~/projects/f4d$ ls -l /home/joel/.ansible/cp
total 0
srw------- 1 joel joel 0 Aug 24 20:46 ansible-ssh-f4d.shprentz.com-22-ansible
joel@Dimension-8200:~/projects/f4d$ ls -ld /home/joel/.ansible/cp
drwx------ 2 joel joel 4096 Aug 24 20:47 /home/joel/.ansible/cp

Seeing no obvious problems, I tried running Ansible again. This time it processed both hosts and reported no stack trace.

joel@Dimension-8200:~/projects/f4d$ ansible-playbook wordpress.yml -i hosts  -v

PLAY [Apply common configuration to all nodes] ********************************

GATHERING FACTS ***************************************************************
ok: [f4d.shprentz.com]
ok: [wp.shprentz.com]

TASK: [Refresh the package cache if old] **************************************
ok: [f4d.shprentz.com] => {"changed": false, "item": ""}
ok: [wp.shprentz.com] => {"changed": false, "item": ""}

[Dozens more tasks omitted here.]

A search of Ansible issues found no similar problems, so I am reporting it.

@jimi-c
Member

jimi-c commented Aug 25, 2013

Thanks, I'll take a look into this.

@mpdehaan
Contributor

Thanks for the report.

What OS are you using BTW?

Is there anything about the server setup that we should know (NFS home directory, permissions, etc.)?

@jshprentz
Author

OS is Ubuntu 12.04.2 LTS (GNU/Linux 3.2.0-49-generic-pae i686)

Server has only local disks. Permissions appear okay:

joel@Dimension-8200:~/projects/f4d$ ls -ld ~ ~/.ansible ~/.ansible/cp
drwxr-xr-x 27 joel joel 4096 Aug 24 20:46 /home/joel
drwx------  3 joel joel 4096 Aug 24 20:46 /home/joel/.ansible
drwx------  2 joel joel 4096 Aug 24 20:50 /home/joel/.ansible/cp

Environment was augmented with the supplied script: source ~/ansible/hacking/env-setup. The relevant environment properties seem okay:

ANSIBLE_LIBRARY=/home/joel/ansible/library
PATH=/home/joel/ansible/bin:/usr/local/sbin:/usr/local/bin:/usr/sbin:/usr/bin:/sbin:/bin:/usr/games
PYTHONPATH=/home/joel/ansible/lib:

I could set up another user with a clean Ansible 1.2.3 installation, if that would help.

@jimi-c
Member

jimi-c commented Aug 25, 2013

Are you running the ansible command via sudo?

@jshprentz
Author

No, I am running the ansible command without sudo: ansible-playbook wordpress.yml -i hosts -v

Ansible is installed (with git by me) in my personal directory /home/joel/ansible, not in any system directory such as /opt or /usr/local.

@jimi-c
Member

jimi-c commented Aug 25, 2013

This appears to be a race condition where the multiple forked connections are all trying to create the directory at the same time. I wrote this patch, which appears to correct the issue, if you'd like to give it a test:

https://gist.github.com/jimi1283/a827f580ef2475c05254
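
The "[Errno 17] File exists" message in the original report is what a check-then-create sequence produces when another process creates the directory between the check and the create. Below is a minimal sketch of that interleaving, replayed inside a single process. It assumes the helper tests for existence before calling os.makedirs; replay_lost_race is a hypothetical name, and this is neither Ansible's code nor the patch in the gist.

```python
import errno
import os
import shutil
import tempfile


def replay_lost_race(path, mode=0o700):
    """Replay, in one process, the steps of the worker that loses the race.

    Hypothetical helper for illustration only. Returns the errno raised by
    the late create attempt, or None if that attempt succeeded.
    """
    missing = not os.path.exists(path)  # our check: directory not there yet
    os.makedirs(path, mode)             # stand-in for a peer creating it first
    try:
        if missing:
            os.makedirs(path, mode)     # our create, now too late
    except OSError as e:
        return e.errno
    return None


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    try:
        result = replay_lost_race(os.path.join(base, "cp"))
        print(errno.errorcode.get(result))
    finally:
        shutil.rmtree(base)
```

The late create fails with EEXIST even though the existence check passed a moment earlier, which matches the symptom of one forked worker failing while the other proceeds.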

@jshprentz
Author

I applied the patch and ran ansible as before: ansible-playbook wordpress.yml -i hosts -v. It ran okay.

Next, I deleted /home/joel/.ansible/cp and its contents. I ran ansible again. Ansible ran fine and created /home/joel/.ansible/cp.

I tried a few more cycles of deleting the cp directory and running ansible. No problems were encountered. Of course, my limited testing does not prove that the race condition is fixed.

As I reviewed the code changes in the patch, I wondered what would happen if the call to utils.prepare_writeable_dir failed. Would the still-locked process lockfile cause some problem? To test this, I deleted /home/joel/.ansible/cp again and removed write permission from /home/joel/.ansible. Then I reran ansible.

joel@Dimension-8200:~/.ansible$ rm -rf cp
joel@Dimension-8200:~/.ansible$ cd ..
joel@Dimension-8200:~$ chmod u-w .ansible/
joel@Dimension-8200:~$ ls -ld .ansible/
dr-x------ 2 joel joel 4096 Aug 25 20:59 .ansible/
joel@Dimension-8200:~$ pushd
~/projects/f4d ~
joel@Dimension-8200:~/projects/f4d$ ansible-playbook wordpress.yml -i hosts  -v

PLAY [Apply common configuration to all nodes] ********************************

GATHERING FACTS ***************************************************************
Could not make dir /home/joel/.ansible/cp: [Errno 13] Permission denied: '/home/joel/.ansible/cp'
Traceback (most recent call last):
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 70, in _executor_hook
    return_data = multiprocessing_runner._executor(host, new_stdin)
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 364, in _executor
    exec_rc = self._executor_internal(host, new_stdin)
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 443, in _executor_internal
    return self._executor_internal_inner(host, self.module_name, self.module_args, inject, port, complex_args=complex_args)
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 579, in _executor_internal_inner
    conn = self.connector.connect(actual_host, actual_port, actual_user, actual_pass, actual_transport, actual_private_key_file)
  File "/home/joel/ansible/lib/ansible/runner/connection.py", line 36, in connect
    conn = utils.plugins.connection_loader.get(transport, self.runner, host, port, user=user, password=password, private_key_file=private_key_file)
  File "/home/joel/ansible/lib/ansible/utils/plugins.py", line 170, in get
    return getattr(self._module_cache[path], self.class_name)(*args, **kwargs)
  File "/home/joel/ansible/lib/ansible/runner/connection_plugins/ssh.py", line 47, in __init__
    self.cp_dir = utils.prepare_writeable_dir('$HOME/.ansible/cp',mode=0700)
  File "/home/joel/ansible/lib/ansible/utils/__init__.py", line 203, in prepare_writeable_dir
    exit("Could not make dir %s: %s" % (tree, e))
  File "/home/joel/ansible/lib/ansible/utils/__init__.py", line 122, in exit
    sys.exit(rc)
SystemExit: 1
Could not make dir /home/joel/.ansible/cp: [Errno 13] Permission denied: '/home/joel/.ansible/cp'
Traceback (most recent call last):
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 70, in _executor_hook
    return_data = multiprocessing_runner._executor(host, new_stdin)
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 364, in _executor
    exec_rc = self._executor_internal(host, new_stdin)
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 443, in _executor_internal
    return self._executor_internal_inner(host, self.module_name, self.module_args, inject, port, complex_args=complex_args)
  File "/home/joel/ansible/lib/ansible/runner/__init__.py", line 579, in _executor_internal_inner
    conn = self.connector.connect(actual_host, actual_port, actual_user, actual_pass, actual_transport, actual_private_key_file)
  File "/home/joel/ansible/lib/ansible/runner/connection.py", line 36, in connect
    conn = utils.plugins.connection_loader.get(transport, self.runner, host, port, user=user, password=password, private_key_file=private_key_file)
  File "/home/joel/ansible/lib/ansible/utils/plugins.py", line 170, in get
    return getattr(self._module_cache[path], self.class_name)(*args, **kwargs)
  File "/home/joel/ansible/lib/ansible/runner/connection_plugins/ssh.py", line 47, in __init__
    self.cp_dir = utils.prepare_writeable_dir('$HOME/.ansible/cp',mode=0700)
  File "/home/joel/ansible/lib/ansible/utils/__init__.py", line 203, in prepare_writeable_dir
    exit("Could not make dir %s: %s" % (tree, e))
  File "/home/joel/ansible/lib/ansible/utils/__init__.py", line 122, in exit
    sys.exit(rc)
SystemExit: 1

TASK: [Refresh the package cache if old] **************************************
FATAL: no hosts matched or all hosts have already failed -- aborting


PLAY RECAP ********************************************************************
           to retry, use: --limit @/home/joel/wordpress.retry

f4d.shprentz.com           : ok=0    changed=0    unreachable=1    failed=0
wp.shprentz.com            : ok=0    changed=0    unreachable=1    failed=0

Ansible attempted to create /home/joel/.ansible/cp twice. Looking at the stack traces, I suspect that the process lockfile was released each time sys.exit was called. The lockfile caused the two processes to take turns attempting to create /home/joel/.ansible/cp. That is the desired behavior.
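
One way a lock ends up released on that exit path is a try/finally around the locked section, because finally blocks still run while SystemExit propagates; the operating system also drops fcntl locks when the process exits. The sketch below assumes an fcntl.lockf lock on an open file. Whether the patch releases its lock this way is not confirmed in this thread, and create_under_lock is a hypothetical name.

```python
import fcntl
import os
import shutil
import sys
import tempfile


def create_under_lock(lock_file, path, mode=0o700):
    """Create path while holding an exclusive lock on lock_file.

    Hypothetical helper; lock_file is any open, writable file object.
    """
    fcntl.lockf(lock_file, fcntl.LOCK_EX)
    try:
        try:
            os.makedirs(path, mode)
        except OSError as e:
            # A directory created by a peer is fine; anything else is fatal.
            if not os.path.isdir(path):
                sys.exit("Could not make dir %s: %s" % (path, e))
    finally:
        # Runs on normal return and while SystemExit propagates.
        fcntl.lockf(lock_file, fcntl.LOCK_UN)
    return path


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    lock = tempfile.TemporaryFile()
    try:
        print(create_under_lock(lock, os.path.join(base, "cp")))
    finally:
        lock.close()
        shutil.rmtree(base)
```

With this shape, each worker holds the lock only for its own attempt, so failing workers take turns rather than blocking each other.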

After my test, I corrected the write permission on /home/joel/.ansible and ran the ansible command again to confirm that no harm was done. It ran just fine.

@jimi-c
Member

jimi-c commented Aug 26, 2013

Excellent, thanks for confirming. @mpdehaan and I had also discussed adding more error handling to that function so that it ignores the case where the directory already exists. A directory that is not writeable should, I think, still be an error that halts things. (I also committed a patch earlier that handles that error cleanly rather than producing a stack dump.)
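
The behavior described here (tolerate a directory that already exists, still fail when it is not writeable) could look roughly like the following. This is a sketch under assumptions, not the committed change; ensure_writeable_dir is a hypothetical name and does not correspond to a known Ansible function.

```python
import errno
import os
import shutil
import tempfile


def ensure_writeable_dir(path, mode=0o700):
    """Create path if needed and confirm the current user can write to it.

    Hypothetical helper for illustration only.
    """
    try:
        os.makedirs(path, mode)
    except OSError as e:
        # Losing the creation race to another process is not an error,
        # but a non-directory sitting at that path still is.
        if e.errno != errno.EEXIST or not os.path.isdir(path):
            raise
    if not os.access(path, os.W_OK):
        raise OSError(errno.EACCES, "Cannot write to path", path)
    return path


if __name__ == "__main__":
    base = tempfile.mkdtemp()
    try:
        target = os.path.join(base, "cp")
        ensure_writeable_dir(target)
        ensure_writeable_dir(target)  # second call is a no-op
        print(os.path.isdir(target))
    finally:
        shutil.rmtree(base)
```

Attempting the create and inspecting the error avoids the gap between a separate existence check and the create call.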

@jshprentz
Author

Thanks for the quick solution.

@jimi-c closed this as completed in 53c2f4c on Sep 3, 2013
@ansibot added the bug label and removed the bug_report label on Mar 6, 2018
@ansible locked and limited conversation to collaborators on Apr 24, 2019