transitive dependency disrespecting python-repo #8739

Closed
a-teammate opened this issue Nov 29, 2019 · 1 comment

a-teammate commented Nov 29, 2019

Description

tl;dr: transitive dependencies resolved via pex -> setuptools are fetched from PyPI instead of the index configured under [python-repos].

Hey there,
I have problems using Pants for a Python codebase behind a proxy.
Direct dependencies seem to be resolved correctly via our company's pip mirror (after some slight hacks, see below), but when bundling pyspark, its transitive dependency pypandoc is looked up on pypi.org, which fails.

Disclaimer:
My Pants proxy setup may be suboptimal as well (obviously, since PyPI is not accessible). The certificates seem to be what prevents Pants from resolving against our company's PyPI mirror, so my workaround so far has been to patch the pex that Pants uses [1] so that it does not verify certificates on requests.

Because every Python repo other than our internal one is unreachable for me, I noticed that not all dependencies respect the python-repos set in pants.ini.

Setup

pants.ini:

[GLOBAL]
pants_version: 1.21.0

[python-setup]
interpreter_constraints: ['CPython>=3.6']

[python-repos]

indexes: [
    "https://a****.net/artifactory/api/pypi/pypi-remote/simple"
 ]
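As a quick sanity check on which indexes this pants.ini declares, the file can be read with Python's standard `configparser`. This is only a sketch: Pants has its own option parser, the helper name `configured_indexes` is hypothetical, and it assumes the `indexes` value is a Python-style list literal as in the snippet above.

```python
import ast
import configparser


def configured_indexes(ini_text):
    """Return the index URLs listed under [python-repos] in pants.ini text.

    Hypothetical helper: it only approximates Pants' own option parsing.
    """
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(ini_text)
    # Indented continuation lines of the multi-line list are joined with
    # newlines, which ast.literal_eval accepts inside the brackets.
    raw = parser.get("python-repos", "indexes", fallback="[]")
    return list(ast.literal_eval(raw))
```

Calling it on the contents of pants.ini, for example `configured_indexes(open("pants.ini").read())`, should return the single Artifactory URL if the file is read the way it is written here.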

3rdparty/python/BUILD:

python_requirement_library(
    name="pandas",
    requirements=[
        python_requirement("pandas")
    ]
)
python_requirement_library(
    name="pyspark",
    requirements=[
        python_requirement("pyspark==2.4.4")
    ]
)

Error

Note that $ ./pants bundle --level=debug 3rdparty/python:pandas works just fine.
It also resolves the transitive dependency numpy and bundles the .pex file correctly.
So not every transitive dependency goes to pypi.org; some are correctly resolved via our configured index.
Bundling pyspark, however, fails:

$ ./pants bundle --level=debug 3rdparty/python:pyspark

**** Failed to install pyspark-2.4.4 (caused by: NonZeroExit("received exit code 1 during execution of `['/usr/bin/python3.6', '-s', '-', 'bdist_wheel', '--dist-dir=/tmp/tmpgw2n2x7y']` while trying to execute `['/usr/bin/python3.6', '-s', '-', 'bdist_wheel', '--dist-dir=/tmp/tmpgw2n2x7y']`")
):
stdout:

stderr:
Could not import pypandoc - required to package PySpark
Download error on https://pypi.org/simple/pypandoc/: [Errno 101] Network is unreachable -- Some packages may not be found!
Couldn't find index page for 'pypandoc' (maybe misspelled?)
Download error on https://pypi.org/simple/: [Errno 101] Network is unreachable -- Some packages may not be found!
No local packages or working download links found for pypandoc
Traceback (most recent call last):
  File "<stdin>", line 14, in <module>
  File "<string>", line 224, in <module>
  File "/tmp/tmpnvmtgfsi/pex/vendor/_vendored/setuptools/setuptools/__init__.py", line 166, in setup
    _install_setup_requires(attrs)
  File "/tmp/tmpnvmtgfsi/pex/vendor/_vendored/setuptools/setuptools/__init__.py", line 161, in _install_setup_requires
    dist.fetch_build_eggs(dist.setup_requires)
  File "/tmp/tmpnvmtgfsi/pex/vendor/_vendored/setuptools/setuptools/dist.py", line 626, in fetch_build_eggs
    replace_conflicting=True,
  File "/tmp/tmpnvmtgfsi/pex/vendor/_vendored/setuptools/pkg_resources/__init__.py", line 812, in resolve
    replace_conflicting=replace_conflicting
  File "/tmp/tmpnvmtgfsi/pex/vendor/_vendored/setuptools/pkg_resources/__init__.py", line 1095, in best_match
    return self.obtain(req, installer)
  File "/tmp/tmpnvmtgfsi/pex/vendor/_vendored/setuptools/pkg_resources/__init__.py", line 1107, in obtain
    return installer(requirement)
  File "/tmp/tmpnvmtgfsi/pex/vendor/_vendored/setuptools/setuptools/dist.py", line 697, in fetch_build_egg
    return cmd.easy_install(req)
  File "/tmp/tmpnvmtgfsi/pex/vendor/_vendored/setuptools/setuptools/command/easy_install.py", line 732, in easy_install
    raise DistutilsError(msg)
distutils.errors.DistutilsError: Could not find suitable distribution for Requirement.parse('pypandoc')



               Waiting for background workers to finish.
18:24:25 18:27   [complete]
               FAILURE
timestamp: 2019-11-28T18:24:25.794626
Exception caught: (pex.resolver.Untranslateable) (backtrace omitted)
Exception message: Package SourcePackage('file:///home/******/.pants.d/python-setup/resolved_requirements/CPython-3.6.8/pyspark-2.4.4.tar.gz') is not translateable by ChainedTranslator(WheelTranslator, EggTranslator, SourceTranslator)
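The traceback above runs through `_install_setup_requires` and `easy_install` in the setuptools vendored by pex, which suggests pypandoc is fetched as a build-time (`setup_requires`) dependency of pyspark rather than through the Pex resolver that reads [python-repos]. Below is a small sketch for confirming which hosts that step tried to reach, by pulling the URLs out of the `Download error on ...` lines in stderr. The function name `failed_download_hosts` is hypothetical, and the regex assumes the message format shown in this log.

```python
import re
from urllib.parse import urlparse

# Matches lines such as:
#   Download error on https://pypi.org/simple/pypandoc/: [Errno 101] ...
_DOWNLOAD_ERROR = re.compile(r"^Download error on (\S+?): ", re.MULTILINE)


def failed_download_hosts(stderr_text):
    """Return the distinct hosts named in 'Download error on <url>:' lines."""
    hosts = []
    for url in _DOWNLOAD_ERROR.findall(stderr_text):
        host = urlparse(url).hostname
        if host and host not in hosts:
            hosts.append(host)
    return hosts
```

If the result contains a host other than the configured mirror, that download did not go through the [python-repos] index.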

What I would have expected

All dependencies, including transitive ones, are resolved via the python-repos I set in pants.ini.
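One possible workaround, offered as an assumption rather than something confirmed in this thread: setuptools' `easy_install` reads an `index_url` option from the `[easy_install]` section of distutils config files such as `~/.pydistutils.cfg`, so pointing that at the internal mirror may steer `setup_requires` downloads away from pypi.org. Whether the setuptools vendored in this pex version picks the file up has not been verified here. The sketch only renders the config text; `render_easy_install_cfg` is a hypothetical name.

```python
import configparser
import io


def render_easy_install_cfg(index_url):
    """Build distutils-style config text that sets easy_install's index_url."""
    # interpolation=None keeps any '%' in the URL from being interpreted.
    cfg = configparser.ConfigParser(interpolation=None)
    cfg["easy_install"] = {"index_url": index_url}
    buf = io.StringIO()
    cfg.write(buf)
    return buf.getvalue()
```

The returned text could be written to `~/.pydistutils.cfg` with the same index URL that pants.ini lists under [python-repos].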

Notes

[1]: my patch for pex to work around my proxy certificate issue:

# EDIT by @a_teammate: patch pex.http in order to workaround proxy issues for now..
function patch_pex_http {
  staging_dir="$1"
  export pex_http_folder="${staging_dir}/install/lib/python3.7/site-packages/pex"
  export pex_http_file_path="${pex_http_folder}/http.py"
  export pex_http_patch="""--- http.py	2019-11-13 11:08:17.109444981 +0100
+++ http_fixed.py	2019-11-13 10:26:10.376205000 +0100
@@ -224,7 +224,7 @@
     if requests is None:
       raise RuntimeError('requests is not available.  Cannot use a RequestsContext.')
 
-    self._verify = verify
+    self._verify = False
 
     max_retries = env.PEX_HTTP_RETRIES
     if max_retries < 0:
"""

  export pex_http_patch_path="${pex_http_folder}/http.patch"
  # Quote the paths so a staging dir containing spaces does not break the calls.
  echo "$pex_http_patch" > "${pex_http_patch_path}"
  patch "${pex_http_file_path}" "${pex_http_patch_path}"
}
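The same one-line change can also be expressed in Python, which avoids depending on the exact line numbers in the diff. This is a sketch only: `disable_verify` is a hypothetical helper, it assumes the text `self._verify = verify` appears exactly once in pex's http.py as in the diff above, and it disables TLS certificate verification just as the shell patch does.

```python
def disable_verify(source):
    """Return http.py source with `self._verify = verify` forced to False."""
    target = "self._verify = verify"
    replacement = "self._verify = False"
    count = source.count(target)
    if count != 1:
        # Refuse to guess if the file does not match the diff shown above.
        raise ValueError("expected exactly one %r, found %d" % (target, count))
    return source.replace(target, replacement)
```

It takes and returns the file contents as a string, so reading and writing `http.py` is left to the caller.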

I assume that my patch is no longer needed with pex 2.0.3 (see pex-tool/pex#812).

@Eric-Arellano
Contributor

Hey there, sorry we never replied to this! I see you edited this to mention Pex 2.0 - indeed, Pex 2.0 brought many fixes for things like this. Are you still having issues?

Also, you may be interested in the recent Pants 2.0 release, which greatly improves Python support https://blog.pantsbuild.org/introducing-pants-v2/

I'm going to close as stale, but please feel free to reopen and let us know how we can help. Sorry again for not replying!
