Apply timeouts to pex resolves #7659

Merged
stuhood merged 2 commits into pantsbuild:master from twitter:stuhood/pex-resolver-timeout May 4, 2019

@stuhood
Member

commented May 4, 2019

Problem

Either PyPI or Travis is currently extremely flaky when opening HTTPS connections to PyPI. As a result, the wheel-builder shards in master have a near-zero success rate, given the number of resolves they attempt during wheel testing.

This flakiness is made much worse by the fact that pex, when used with requests, does not set a connect or read timeout on HTTP connections. As a result, many connection attempts were hanging indefinitely at:

06:42:14 [DEBUG] urllib3.connectionpool:pid=3921: Starting new HTTPS connection (1): pypi.org:443
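
As a stdlib-only illustration of why a missing timeout leads to a hang (this is not pex's or requests' code), the sketch below connects to a local listener that never sends a byte. The helper name `read_with_timeout` and the 0.2 second value are assumptions chosen for the demo. Without the `timeout` argument, the `recv` call would block indefinitely.

```python
import socket


def read_with_timeout(host, port, timeout_secs):
    """Connect and attempt one read; return None if the read times out.

    Hypothetical helper for illustration only.
    """
    # The timeout given to create_connection applies to the connect and
    # stays set on the socket, so it also bounds the recv below.
    with socket.create_connection((host, port), timeout=timeout_secs) as conn:
        try:
            return conn.recv(1024)
        except socket.timeout:
            return None


# A local listener that completes the TCP handshake but never responds.
server = socket.socket()
server.bind(("127.0.0.1", 0))  # port 0: let the OS pick a free port
server.listen(1)
silent_port = server.getsockname()[1]

result = read_with_timeout("127.0.0.1", silent_port, 0.2)
server.close()
```

Here `result` is `None` because the read gave up after the timeout instead of waiting forever.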

Solution

  1. Monkey-patch the RequestsContext methods that should specify timeouts so that they do.
  2. Guarantee that we use RequestsContext, rather than whatever Context.get might choose.
  3. Memoize Context creation on PythonRepos, which allows the connection pool to be reused across multiple resolves.
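
Steps 1 and 3 can be sketched roughly as follows. This is a minimal illustration using stand-in classes, not the actual patch from this PR: `StandInContext`, `StandInRepos`, `patch_default_timeout`, and `DEFAULT_TIMEOUT_SECS` are all hypothetical names, and the value 15 is only assumed from the "read timeout=15" shown in the log in this description.

```python
import functools

# Assumption: taken from the "read timeout=15" in the PR's log output.
DEFAULT_TIMEOUT_SECS = 15


class StandInContext:
    """Hypothetical stand-in for a context whose methods take an optional timeout."""

    def fetch(self, url, timeout=None):
        # A real context would do network I/O here. This one just reports
        # the timeout it received so the effect of the patch is visible.
        return (url, timeout)


def patch_default_timeout(cls, method_name, timeout):
    """Wrap cls.method_name so that `timeout` is supplied when the caller omits it.

    Assumes callers pass `timeout` by keyword, if at all.
    """
    original = getattr(cls, method_name)

    @functools.wraps(original)
    def wrapper(self, *args, **kwargs):
        kwargs.setdefault("timeout", timeout)
        return original(self, *args, **kwargs)

    setattr(cls, method_name, wrapper)


class StandInRepos:
    """Hypothetical holder that memoizes context creation."""

    def __init__(self):
        self._context = None
        self.contexts_created = 0

    def get_network_context(self):
        # Create the context once and reuse it, so that any connection pool
        # it owns can be shared across resolves.
        if self._context is None:
            self._context = StandInContext()
            self.contexts_created += 1
        return self._context


patch_default_timeout(StandInContext, "fetch", DEFAULT_TIMEOUT_SECS)

repos = StandInRepos()
first = repos.get_network_context()
second = repos.get_network_context()
patched_call = first.fetch("https://pypi.org/simple/")
explicit_call = first.fetch("https://pypi.org/simple/", timeout=3)
```

After the patch, a call that omits `timeout` receives the default, while an explicit `timeout` is left alone, and both `get_network_context` calls return the same object.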

Result

Connections will time out and retry:

06:42:14 [DEBUG] urllib3.connectionpool:pid=3921: Starting new HTTPS connection (1): pypi.org:443
06:42:29 [DEBUG] urllib3.util.retry:pid=3921: Incremented Retry for (url='/simple/pantsbuild-pants-contrib-awslambda-python/'): Retry(total=4, connect=None, read=None, redirect=None, status=None)
06:42:29 [WARN] Retrying (Retry(total=4, connect=None, read=None, redirect=None, status=None)) after connection broken by 'ReadTimeoutError("HTTPSConnectionPool(host='pypi.org', port=443): Read timed out. (read timeout=15)",)': /simple/pantsbuild-pants-contrib-awslambda-python/
06:42:29 [DEBUG] urllib3.connectionpool:pid=3921: Starting new HTTPS connection (2): pypi.org:443
06:42:29 [DEBUG] urllib3.connectionpool:pid=3921: https://pypi.org:443 "HEAD /simple/pantsbuild-pants-contrib-awslambda-python/ HTTP/1.1" 200 0
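
The timeout-then-retry behavior in the log above can be approximated with a small loop. This is a hedged sketch rather than what urllib3 or pex does internally: `fetch_with_retries`, `flaky_fetch`, and the default of 4 retries are hypothetical, and the built-in `TimeoutError` stands in for the library-specific timeout exception.

```python
def fetch_with_retries(fetch_once, max_retries=4):
    """Call fetch_once() up to max_retries + 1 times, retrying only on TimeoutError.

    Hypothetical helper; assumes max_retries >= 0.
    """
    last_error = None
    for _attempt in range(max_retries + 1):
        try:
            return fetch_once()
        except TimeoutError as e:
            # Remember the failure and try again on the next iteration.
            last_error = e
    # Every attempt timed out: surface the last timeout to the caller.
    raise last_error


# Simulate the log: the first attempt times out, the second succeeds.
attempts = {"count": 0}


def flaky_fetch():
    attempts["count"] += 1
    if attempts["count"] == 1:
        raise TimeoutError("Read timed out. (read timeout=15)")
    return 200


status = fetch_with_retries(flaky_fetch)
```

Without a timeout, the first attempt would never raise, so the retry logic would never get a chance to run.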

I'll look into upstreaming this as a fix for http://github.com/pantsbuild/pex/issues/26 tomorrow, but assuming it goes green, it would be good to land it temporarily to fix the breakage in master.

@stuhood stuhood force-pushed the twitter:stuhood/pex-resolver-timeout branch from 4046c8d to d4d99b9 May 4, 2019

@ity

ity approved these changes May 4, 2019

Contributor

left a comment

thanks for looking into this!

# requests does not support file:// -- so we must short-circuit manually
if link.local:
  return open(link.local_path, 'rb')  # noqa: T802
for attempt in range(self._max_retries + 1):

@ity

ity May 4, 2019

Contributor

Can't see it here, but I'm assuming _max_retries is being set somewhere else.

@stuhood

stuhood May 4, 2019

Author Member

Yea, self in this case will be a RequestsContext instance.

@stuhood

Member Author

commented May 4, 2019

Because we run pex in many subprocesses, which the monkey-patching does not affect, this is not a complete fix. But because I have a clean CI run for it while master continues to flake, I'm going to add a TODO here and land it to improve the situation until pantsbuild/pex#26 can be addressed.
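
To illustrate why subprocesses are unaffected, here is a small stdlib-only sketch (not taken from this PR). It sets a marker attribute on a module in the parent process, then asks a fresh child interpreter whether it can see it. The attribute name `patched_by_parent` is hypothetical.

```python
import socket
import subprocess
import sys

# Monkey-patch a module in the current (parent) process.
socket.patched_by_parent = True  # hypothetical marker attribute

# A child interpreter imports its own fresh copy of the module, so the
# parent's in-memory patch is not visible there.
child = subprocess.run(
    [sys.executable, "-c",
     "import socket; print(hasattr(socket, 'patched_by_parent'))"],
    capture_output=True,
    text=True,
    check=True,
)

parent_sees_patch = hasattr(socket, "patched_by_parent")
child_sees_patch = child.stdout.strip()
```

The parent sees its own patch, while the child reports `False`, which is why any pex invocation run as a separate process would still lack the timeouts.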

@stuhood stuhood merged commit 92cfc9f into pantsbuild:master May 4, 2019

1 check was pending

continuous-integration/travis-ci/pr The Travis CI build is in progress

@stuhood stuhood deleted the twitter:stuhood/pex-resolver-timeout branch May 4, 2019

stuhood added a commit to pantsbuild/pex that referenced this pull request May 6, 2019

@jsirois jsirois referenced this pull request May 7, 2019

Closed

Release 1.6.7 #704

10 of 11 tasks complete