Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

new sanity_pip_check should ignore modules in user home #1877

Closed
mstud opened this issue Dec 6, 2019 · 10 comments · Fixed by #1891
Closed

new sanity_pip_check should ignore modules in user home #1877

mstud opened this issue Dec 6, 2019 · 10 comments · Fixed by #1891
Milestone

Comments

@mstud
Copy link

mstud commented Dec 6, 2019

Hi, I just tried to install Python-3.7.4-GCCcore-8.3.0.eb with EasyBuild 4.1.0, but it failed during the sanity check because I had some TensorFlow installation lying around in ~/.local which depended on an old Python module (that was not loaded ofc). So apparently, the "pip check" also tried to check this tensorflow installation and failed because it could not find numpy, protobuf etc.

== sanity checking...
== FAILED: Installation ended unsuccessfully (build directory: /dev/shm/easybuild-build/Python/3.7.4/GCCcore-8.3.0): build failed (first 300 chars): cmd "pip check" exited with exit code 1 and output:
tensorflow-metadata 0.15.0 requires protobuf, which is not installed.
tensorflow-datasets 1.3.0 requires absl-py, which is not installed.
tensorflow-datasets 1.3.0 requires numpy, which is not installed.
tensorflow-datasets 1.3.0 requires protobuf, (took 14 min 33 sec)

Deleting ~/.local solved this problem, but, wouldn't it be good if EasyBuild somehow ignored modules from the user home for the purpose of sanity checking? Not sure if this is even possible..

@boegel
Copy link
Member

boegel commented Dec 8, 2019

Hmm, I'm not sure that's possible either...

As far as I understand, pip check scans all installed Python packages it can see, and checks whether all dependencies they require are available too.

Excluding stuff installed in ~/.local would be equivalent to using python -s (see https://docs.python.org/3/using/cmdline.html#cmdoption-s).

So, perhaps running python -s -m pip check rather than pip check could do the trick?

@mstud
Copy link
Author

mstud commented Dec 10, 2019

I just tried it, and yes indeed, python -s -m pip check ignores the broken dependencies in my home whereas pip check fails. So this would definitely be an improvement. I wonder if this could be relevant at any other place where pip is used, i.e. when installing packages? But probably not.

@boegel boegel transferred this issue from easybuilders/easybuild-framework Dec 10, 2019
@boegel boegel added this to the release after 4.1.0 (4.1.1?) milestone Dec 10, 2019
@boegel
Copy link
Member

boegel commented Dec 10, 2019

Before we switch to using python -s -m pip check, we should definitely make sure that it still catches the issues it's supposed to catch.

Wouldn't be the first time we shoot ourselves in the foot by misinterpreting the python command line options...

@Flamefire
Copy link
Contributor

I think it should be used throughout EB. Having anything in the users home may lead to broken packages which goes undetected for a while. Example: User has installed a package in its home and tries to install a EB python package which might then be skipped because it is already found. Although the ignore-installed switch should avoid this.

Anyway python -s seems to be the perfect fit

@smoors
Copy link
Contributor

smoors commented Dec 16, 2019

Users can already do this throughout EB by simply setting PYTHONNOUSERSITE.

@Flamefire
Copy link
Contributor

Great. Why not set this when running any python command (or even EasyBlock)? Because e.g. when using EB with Github integration I need to install pip packages into my HOME (git, credential stuff) but I don't think there is a valid use case for using the users site-packages for anything EB installs. It could lead to not installing a python package to the system because the admin had it accidentally in his environment.

@smoors
Copy link
Contributor

smoors commented Dec 17, 2019

I agree that setting this env var by default would be a good idea, as long as (adventurous) users can still unset it.

@boegel
Copy link
Member

boegel commented Dec 17, 2019

Actually, we already set $PYTHONNOUSERSITE in PythonPackage, see https://github.com/easybuilders/easybuild-easyblocks/blob/develop/easybuild/easyblocks/generic/pythonpackage.py#L520.

But it gets unset when the sanity check is run, so we should make sure it's set again before running pip check (or any other python command during the sanity check, I guess).

The original environment is restored as a part of the sanity check (before loading the temporary module to set up the environment for running the sanity check), so it's not exactly trivial, you need to basically inject $PYTHONNOUSERSITE at the right time...

@Flamefire
Copy link
Contributor

so it's not exactly trivial, you need to basically inject $PYTHONNOUSERSITE at the right time...

Why not just set it at the start of the sanity check? Like here:

Or use python -s...

Not having it set for commands run by EB is a risk: Some module installations (e.g. Horovod) check for existence of other modules and use them to install different stuff. This defeats the reproducibility by EB... And you can't set it before running EB: EB itself might be installed via pip install --user

@boegel
Copy link
Member

boegel commented Dec 17, 2019

Setting it in PythonPackage.sanity_check_step only ensures it's set for bundles of Python packages, not for individual Python packages installed as stand-alone modules.

Complete (& verified) fix implemented in #1891

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
4 participants