Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Isolate a user's conda environment from python user packages in ~/.local #689

Open
englehardt opened this issue Jun 18, 2020 · 4 comments
Open
Labels
developer-experience Workflow improvements enhancement Not a bug or a feature request

Comments

@englehardt
Copy link
Collaborator

The problem

In #682 we discovered that conda doesn't provide a fully isolated execution environment, and instead allows python dependencies installed in ~/.local to be picked up. Unfortunately these seems to be preferred to conda's own packages. Here's a simple test script:

import crontab
from pprint import pprint as print
import sys

print(sys.path)
print(crontab.__file__)

When a conda environment isn't active, run pip install --user crontab. Then conda activate openwpm. The environment variable PYTHONNOUSERSITE=True will control whether python will pick up the users local packages.

(openwpm) steven@apollo:~/research/OpenWPM/test$ python test.py
['/home/steven/research/OpenWPM/test',
 '/home/steven/miniconda3/envs/openwpm/lib/python38.zip',
 '/home/steven/miniconda3/envs/openwpm/lib/python3.8',
 '/home/steven/miniconda3/envs/openwpm/lib/python3.8/lib-dynload',
 '/home/steven/.local/lib/python3.8/site-packages',
 '/home/steven/miniconda3/envs/openwpm/lib/python3.8/site-packages']
'/home/steven/.local/lib/python3.8/site-packages/crontab/__init__.py'
(openwpm) steven@apollo:~/research/OpenWPM/test$ PYTHONNOUSERSITE=True python test.py
['/home/steven/research/OpenWPM/test',
 '/home/steven/miniconda3/envs/openwpm/lib/python38.zip',
 '/home/steven/miniconda3/envs/openwpm/lib/python3.8',
 '/home/steven/miniconda3/envs/openwpm/lib/python3.8/lib-dynload',
 '/home/steven/miniconda3/envs/openwpm/lib/python3.8/site-packages']
'/home/steven/miniconda3/envs/openwpm/lib/python3.8/site-packages/crontab/__init__.py'

I don't know the exact internal logic python uses to order locations in sys.path. From PEP 370 it sounds like this is the intended order. This happens both with packages that are installed directly from conda's package repository and those installed via pip within conda.

Why fix

We shouldn't assume that users don't have local packages installed. It's the default flow for pip when global packages aren't writable (i.e., most linux users?). While virtual environments are best practice, I'm sure many users will pip install a few packages to run a simple script every once in a while (or are human and don't always follow best practices). In fact, OpenWPM's previous install script would intentionally install packages into ~/.local (which in retrospect was a mistake) so this may impact a lot of our long-term users.

These issues aren't easy to debug either. As @vringar experienced when he hit the issue in: #614 (comment), conda list shows the conda packages so it's not immediately clear to the user.

When python dependencies are listed in a requirements.txt file, users can choose to run in a virtualenv or not, and if they choose not to they're taking things into their own hands. But that's not possible with environment.yaml, so it's on us to make sure things work.

What can we do

This is less clear. This feels like something that should be fixed in conda, but reading through the discussions in conda-forge/python-feedstock#171, conda/conda#7173, and conda/conda#448 it seems like this is intended design despite the conda folks also billing conda as an alternate to pip and venv.

We could try to export PYTHONNOUSERSITE=True in a conda helper script, but as Sarah points out this may have some issues #682 (comment).

We could also try to use both virtualenv and conda, which is suggested by a conda developer here conda/conda#9788 (comment), but we'd have to refactor how we manage dependencies. Maybe this is the best long-term option.

@birdsarah
Copy link
Contributor

We could try to export PYTHONNOUSERSITE=True in a conda helper script, but as Sarah points out this may have some issues #682 (comment).

My only issue is with the helper script not actually working. If it does work, great.
If it doesn't work, that doesn't preclude us adding PYTHONNOUSERSITE=True to all places where we have:

# Make conda available to shell script
eval "$(conda shell.bash hook)"

Specifically what I mean by "not actually working" is that eval line is, to my understanding, necessary in scripts so that they see conda. So you can't split it out. And if you can't split it out, there's minimal code-drying out to be gained from a helper script.

That line is in 3 scripts currently.


As for the rest of this issue, I'm not sure it's useful for me to weigh in, because I don't think I'm seeing things with the same perspective as others.

@englehardt
Copy link
Collaborator Author

Thanks Sarah. I'm also curious what you think about

We could also try to use both virtualenv and conda, which is suggested by a conda developer here conda/conda#9788 (comment), but we'd have to refactor how we manage dependencies. Maybe this is the best long-term option.

This seems like a bunch of extra overhead.

@birdsarah
Copy link
Contributor

This seems like a bunch of extra overhead.

Yes

The problem is that we're talking about a tradeoff and as I have a hard rule that you never install things outside of a virtualenv or a conda env then I don't see this as a problem.

So I can't weigh in on the benefits.

On the cons, I'm assuming by overhead you mean "extra code and extra complexity". If so I really agree and I think there are lighter weight solutions we can do for now:

(1) we have the PYTHONUSERSITE=True - maybe that fixes it and we never have an issue again
(2) encourage people to open issues if they're having installation issues - make sure people post the output of conda list and pip freeze to help us debug the problem.
(3) I wonder if this will only really impact devs running repin.sh and so that's a much smaller surface area of people to educate and doesn't really require a tech solution. I know in theory it could affect more people than that, but perhaps in practice it won't.

@englehardt
Copy link
Collaborator Author

Yeah I agree. In #682 I added a troubleshooting comment to the readme to solicit feedback to this issue if folks run into the issue.

But I think it might actually be fairly easy to fix since conda environments support setting environment variables upon activation. See https://docs.conda.io/projects/conda/en/latest/user-guide/tasks/manage-environments.html#saving-environment-variables. This should work easily across macos and linux. Windows will require a separate file (that uses set instead of export), but that's also straightforward.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
developer-experience Workflow improvements enhancement Not a bug or a feature request
Projects
None yet
Development

Successfully merging a pull request may close this issue.

3 participants