New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Failure to extract dependencies to ~/.pex/install if they already exist with some Python version #265

Closed
jcohen opened this Issue May 24, 2016 · 2 comments

Comments

Projects
None yet
3 participants
@jcohen
Copy link
Contributor

jcohen commented May 24, 2016

Specifically on a host running Python 2.7.5 we ran into the following stack trace:

Traceback (most recent call last):
  File ".bootstrap/_pex/pex.py", line 309, in execute
  File ".bootstrap/_pex/pex.py", line 78, in _activate
  File ".bootstrap/_pex/environment.py", line 132, in activate
  File ".bootstrap/_pex/environment.py", line 176, in _activate
  File ".bootstrap/_pex/environment.py", line 121, in update_candidate_distributions
  File ".bootstrap/_pex/environment.py", line 107, in load_internal_cache
  File ".bootstrap/_pex/environment.py", line 95, in write_zipped_internal_cache
  File ".bootstrap/_pex/util.py", line 177, in cache_distribution
OSError: [Errno 17] File exists

In this case we're on a slightly older version of pex, line 177 in util.py is:

pex/pex/util.py

Line 177 in ba58a85

os.rename(target_dir_tmp, target_dir)

So, the rename fails because the directory exists. In this case it's raising EEXIST, whereas pex only expects ENOTEMPTY to be a continuable error code.

Pex should handle EEXIST with the same behavior as ENOTEMPTY in this case.

@kwlzn kwlzn added the bug label May 24, 2016

@shatil

This comment has been minimized.

Copy link

shatil commented May 25, 2016

tl;dr if you're catching the exception, use the broader OSError and discover why it happened yourself. Error code returned in e.errno is filesystem-dependent and unreliable.

Python invokes system calls, in the case of os.rename(), that's rename(2) (see man 2 rename). This will trigger the behavior that os.rename() does:

mkdir from to
touch from/contents to/contents
strace mv -T from to

OSError will reflect in its errno attribute whatever the operating system (kernel) reports back when the C library invokes rename(2), but this is filesystem-dependent. In the case of NFS on CentOS 5, or XFS on CentOS 7 running a stock kernel, it can be EEXIST, which is error 17.

lstat("from", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("to", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
rename("from", "to")                    = -1 EEXIST (File exists)

ext3 on CentOS 5 running kernel 3.10, may raise ENOTEMPTY (error 39), like util.py catches:

pex/pex/util.py

Line 177 in ba58a85

os.rename(target_dir_tmp, target_dir)

lstat("from", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
lstat("to", {st_mode=S_IFDIR|0755, st_size=4096, ...}) = 0
rename("from", "to")                    = -1 ENOTEMPTY (Directory not empty)

Anyway, you'll want a more robust approach that is kernel and filesystem agnostic.

@kwlzn

This comment has been minimized.

Copy link
Member

kwlzn commented May 26, 2016

@jcohen it'd be great to get your repro details for this. afaict, this is actually a race condition vs a simple "run pex twice on centos7" bug:

$ cat /etc/redhat-release
Nest Enterprise Linux release 7.2.1511 (Core)
$ rm -rf ~/.pex
$ ./pants.pex
Could not find pants.ini!
$ ./pants.pex
Could not find pants.ini!
$ ./pants.pex
Could not find pants.ini!
$ ./pants.pex
Could not find pants.ini!

there is logic in:

pex/pex/util.py

Line 167 in ba58a85

if not os.path.exists(target_dir):

which checks for existence of target_dir before attempting the rename. so my guess would be that another pex could be simultaneously writing this dir between that check and the problematic os.rename, causing the issue you're seeing.

hence, would be great to confirm this theory with a repro.

@kwlzn kwlzn closed this in #271 May 27, 2016

kwlzn added a commit to pantsbuild/pants that referenced this issue May 28, 2016

Bump pex requirement to 1.1.10.
This pulls in a quick fix for pantsbuild/pex#265 to relieve an apparent race condition internal teams at Twitter are hitting with pex execution on centos7/xfs.

Testing Done:
https://travis-ci.org/pantsbuild/pants/builds/133540161

Bugs closed: 3518

Reviewed at https://rbcommons.com/s/twitter/r/3949/
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment