Carreau
Contributor

@Carreau Carreau commented Nov 25, 2020

This adds the --inherit option when creating virtual environments.
With this, child environments inherit all installed packages from
all their parents, which can speed up installation as well as
reduce disk usage.

Example with two environments, parent and child.

This acts similarly to Python's ChainMap: the activated environment
is mutable, so installing and updating (by shadowing) work by modifying
the child env; uninstalling is not possible when the package is
installed in a parent env.

$ python -m venv parent
$ source parent/bin/activate
$ pip install requests
$ deactivate

$ python -m venv child --inherit parent
$ source child/bin/activate
$ python -c 'import requests'

$ pip uninstall requests # fails: the parent is read-only.
$ pip install flask # works

$ pip install requests --upgrade # would work and shadow the parent requests
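
The ChainMap analogy can be made concrete with a small sketch. The two dictionaries are hypothetical stand-ins for the parent and child environments, and the version strings are made up for illustration; this models only the lookup semantics, not what the patch does on disk.

```python
from collections import ChainMap

# Hypothetical package -> version tables standing in for two environments.
parent = {"requests": "2.24.0", "idna": "2.10"}
child = {}

# Lookups search child first, then parent; writes only ever touch child.
env = ChainMap(child, parent)

env["flask"] = "1.1.2"       # like "pip install flask": lands in the child
env["requests"] = "2.25.0"   # like "pip install requests --upgrade": shadows the parent

try:
    del env["idna"]          # like "pip uninstall idna": only present in the parent
    uninstall_failed = False
except KeyError:
    # ChainMap deletes only from its first mapping, so the parent is untouched.
    uninstall_failed = True

print(env["requests"], parent["requests"], uninstall_failed)
```

The parent dictionary is never modified, which mirrors the read-only parent environment in the shell session above.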

This patch was initially created by the D. E. Shaw group (https://www.deshaw.com/)
and, on their behalf, enhanced/contributed back via
QuanSight Labs (https://www.quansight.com/labs)

The following authors have contributed to the upstream patch:

Co-authored-by: Robert Byrnes <byrnes@deshaw.com>
Co-authored-by: Vitaly Shupak <shchupak@deshaw.com>

https://bugs.python.org/issue42101

@Carreau Carreau changed the title from "BPO-42101: allow inheritance of venv" to "bpo-42101: allow inheritance of venv" Nov 25, 2020
@Carreau Carreau force-pushed the inherit-venv branch 2 times, most recently from 0846c03 to dfe6c08 November 25, 2020 02:20
@Carreau
Contributor Author

Carreau commented Nov 25, 2020

The Windows failure is from unicodedata (test_normalization (test.test_unicodedata.NormalizationTest)). It seems unrelated, and a message in Azure CI tells me "This task has failed 37 times across 443 pipeline runs in the last 14 days.", so I think that's a flaky test.

Can someone restart the test suite, or should I rebase and force-push to restart it?

@Carreau Carreau marked this pull request as ready for review November 27, 2020 18:53
@Carreau Carreau requested a review from vsajip as a code owner November 27, 2020 18:53
@vsajip
Member

vsajip commented Nov 28, 2020

I'm not sure this is really needed in core, given the complexity it introduces - I know there has been discussion on python-ideas, but what about python-dev? The python-ideas thread didn't really indicate a strong use case (i.e. confirmation that the feature would be reasonably widely used because its benefits are compelling - I can't see any other than disk space).

@Carreau
Contributor Author

Carreau commented Nov 28, 2020

> I can't see any other than disk space

That also helps with creation speed, as in many circumstances many packages do not need to be (re)installed. It also helps with central library updates, where end users can inherit from central venvs that are managed by administrators.

@vsajip
Member

vsajip commented Dec 1, 2020

Hmmm - venvs are supposed to be disposable entities, created and recreated as needed. Recreation time should be minimal when the recommended practice of using wheels is followed. I'm worried that this change could introduce problems when incompatible versions of packages are installed in "parent" and "child" venvs - this should be battle-tested in real-world scenarios before adding to the stdlib. If problems arise, note that I will be the one that deals with the support fallout ... have you considered setting up a package on PyPI that does this, to enable doing that battle-testing?

@Carreau
Contributor Author

Carreau commented Dec 1, 2020

> I'm worried that this change could introduce problems when incompatible versions of packages are installed in "parent" and "child" venvs - this should be battle-tested in real-world scenarios before adding to the stdlib.

A similar patch (with Python 2.7 compatibility and more features, which I have trimmed down for now) has, as far as I understand, been in use at D. E. Shaw for over a year. Doing that with an external package is a bit of a chicken-and-egg problem, as you need a working venv in which to install this hypothetical venv-managing package. As pointed out in the python-ideas thread, this is not dissimilar to what virtualenvwrapper can do, and it means that you potentially have to update every script that invokes the python executable directly so that it uses this wrapper, which can be tricky.

Incompatible package versions are certainly an issue if you update the parent, yes. But at install/creation time pip will properly update and shadow packages in the parent that are incompatible. I also don't expect everybody to use this feature; it will most likely be reserved for advanced usage, so it should not affect the average user, but rather teams that manage a large number of venvs and are likely to know what they are doing.
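
As a rough aid for spotting such parent/child version overlaps, the sketch below lists distributions that appear in both a child and a parent site-packages directory. It relies only on `importlib.metadata` from the standard library; the helper names, the fake `.dist-info` directories, and the version numbers are all hypothetical and not part of this patch.

```python
import tempfile
from importlib import metadata
from pathlib import Path


def installed_versions(site_packages: Path) -> dict:
    """Map normalized distribution names to versions for one site-packages dir."""
    versions = {}
    for dist in metadata.distributions(path=[str(site_packages)]):
        name = dist.metadata["Name"]
        if name:
            versions[name.lower().replace("_", "-")] = dist.version
    return versions


def shadowed_packages(child_site: Path, parent_site: Path) -> dict:
    """Return {name: (parent_version, child_version)} for packages found in both."""
    child = installed_versions(child_site)
    parent = installed_versions(parent_site)
    return {name: (parent[name], child[name])
            for name in sorted(child.keys() & parent.keys())}


def fake_install(site_packages: Path, name: str, version: str) -> None:
    """Create just enough .dist-info metadata for the demo (no real package)."""
    info = site_packages / f"{name}-{version}.dist-info"
    info.mkdir(parents=True)
    (info / "METADATA").write_text(
        f"Metadata-Version: 2.1\nName: {name}\nVersion: {version}\n",
        encoding="utf-8",
    )


with tempfile.TemporaryDirectory() as tmp:
    parent_site = Path(tmp) / "parent"
    child_site = Path(tmp) / "child"
    fake_install(parent_site, "requests", "2.24.0")
    fake_install(parent_site, "idna", "2.10")
    fake_install(child_site, "requests", "2.25.0")
    fake_install(child_site, "flask", "1.1.2")
    shadowed = shadowed_packages(child_site, parent_site)

print(shadowed)
```

A report like this only shows which packages are shadowed; it does not check whether the shadowing versions are compatible with the rest of the parent.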

There is also the opposite use case: when the API of an upstream endpoint changes, you only have to update one venv to fix all the downstream ones that would otherwise be broken.

> Recreation time should be minimal when the recommended practice of using wheels is followed.

While wheels can be "relatively" fast, they can't be faster than doing nothing. I don't have performance numbers for this, but I also assume that on a distributed file system this greatly reduces the number of files to cache, since with wheels you do get duplicated files.

Note that in some contexts (HPC systems), venvs can be kept for months or years, so fast creation is one thing, but the "disposable" part does not apply; there, this would have a really large effect on space and cache usage. On PB-scale file systems, block-level deduplication is not always doable.

> If problems arise, note that I will be the one that deals with the support fallout

Yes, I am aware, and I will do my best to help fix issues if there are any. I'm also happy to put this feature behind a big "experimental" warning if that helps, or to reduce the current feature set and add more checks and documentation.

If you think that there is a better route to get this accepted, I'm all ears. I can revive the thread on python-ideas or move it to python-dev; both have a lot of messages and subscribers, so I want to be careful with everybody's time.

@vsajip
Member

vsajip commented Dec 2, 2020

Does virtualenv support this type of functionality?

@github-actions

github-actions bot commented Jan 2, 2021

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Jan 2, 2021
@Carreau
Contributor Author

Carreau commented Jan 24, 2021

Apologies for the delay in responding.
As far as I can tell, virtualenv does not directly support this, but virtualenvwrapper seems to have similar functionality; see https://virtualenvwrapper.readthedocs.io/en/latest/command_ref.html#add2virtualenv reproduced below for convenience:


add2virtualenv

Adds the specified directories to the Python path for the currently-active virtualenv.

Syntax:

add2virtualenv directory1 directory2 ...

Sometimes it is desirable to share installed packages that are not in the system site-packages directory and which should not be installed in each virtualenv. One possible solution is to symlink the source into the environment site-packages directory, but it is also easy to add extra directories to the PYTHONPATH by including them in a .pth file inside site-packages using add2virtualenv.

1. Check out the source for a big project, such as Django.
2. Run: add2virtualenv path_to_source.
3. Run: add2virtualenv.
4. A usage message and list of current “extra” paths is printed.
5. Use option -d to remove the added path.
The directory names are added to a path file named _virtualenv_path_extensions.pth inside the site-packages directory for the environment.
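
For illustration, here is a minimal sketch of the .pth mechanism described in the excerpt above. It uses only the standard library: `site.addsitedir` stands in for what the interpreter does at startup for a real site-packages directory. The helper, the file name `_demo_path_extensions.pth`, and the module name are hypothetical and are not virtualenvwrapper's own code.

```python
import importlib
import site
import sys
import tempfile
from pathlib import Path


def add_extra_path(site_packages: Path, extra_dir: Path,
                   pth_name: str = "_demo_path_extensions.pth") -> Path:
    """Append extra_dir to a .pth file inside site_packages (hypothetical helper)."""
    pth_file = site_packages / pth_name
    with pth_file.open("a", encoding="utf-8") as fh:
        fh.write(f"{extra_dir}\n")
    return pth_file


def demo() -> int:
    with tempfile.TemporaryDirectory() as tmp:
        shared = Path(tmp) / "shared"              # plays the shared source tree
        site_packages = Path(tmp) / "site-packages"
        shared.mkdir()
        site_packages.mkdir()
        (shared / "pth_demo_module.py").write_text("VALUE = 42\n", encoding="utf-8")

        add_extra_path(site_packages, shared)

        before = list(sys.path)
        # Treat the directory as a site dir: its .pth files are read and every
        # existing directory they list is appended to sys.path.
        site.addsitedir(str(site_packages))
        try:
            importlib.invalidate_caches()
            module = importlib.import_module("pth_demo_module")
            return module.VALUE
        finally:
            sys.path[:] = before                   # undo the sys.path changes
            sys.modules.pop("pth_demo_module", None)


value = demo()
print(value)
```

Directories listed in a .pth file are appended after the site directory itself, so anything installed directly in site-packages takes precedence over the shared paths.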

@github-actions github-actions bot removed the stale label Jan 25, 2021
@github-actions

This PR is stale because it has been open for 30 days with no activity.

@github-actions github-actions bot added the stale label Feb 25, 2021
@Carreau
Contributor Author

Carreau commented Feb 25, 2021

> This PR is stale because it has been open for 30 days with no activity.

I would still like to get that in, and would love to know what is needed for that.

@Carreau
Contributor Author

Carreau commented Jun 9, 2021

Hello everyone, this still has no conflicts with the base. Is there any chance of getting reviews and feedback?
Thanks.

@Carreau Carreau closed this Jun 30, 2021
@Carreau Carreau reopened this Jun 30, 2021
@vsajip
Member

vsajip commented Jun 30, 2021

> Hello everyone, this still has no conflicts with the base, is there any chances of getting reviews and feedback ?

Well, see my earlier comment:

> I'm not sure this is really needed in core, given the complexity it introduces - I know there has been discussion on python-ideas, but what about python-dev? The python-ideas thread didn't really indicate a strong use case (i.e. confirmation that the feature would be reasonably widely used because its benefits are compelling)

You might introduce the feature, but I (or another core team member) will have to support it from then on. As you can appreciate, the time available to core devs (almost all volunteers) is limited; so there is a reluctance to introduce changes which potentially have wide-ranging consequences and therefore entail some potentially non-trivial support burden.

The code in the venv module is designed to be built on via subclassing etc., and if a separate standalone PyPI package can be developed to provide this functionality, it would be better than adding it to the stdlib IMO. Adoption of that package would show how desirable the feature is, and if the demand is very high, this can be looked at again.
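
As a sketch of what such a standalone package might start from, the example below subclasses `venv.EnvBuilder` and overrides its `post_setup` hook to drop a .pth file into the child's site-packages. The class name, the `parents` argument, the .pth file name, and the site-packages lookup are all assumptions made for illustration; this is not the implementation in this PR, and it does not handle console scripts or pip's uninstall behaviour.

```python
import os
import tempfile
import venv
from pathlib import Path


def find_site_packages(env_dir: Path) -> Path:
    """Locate site-packages inside a venv (assumes the usual POSIX/Windows layouts)."""
    candidates = list(env_dir.glob("lib/python*/site-packages"))
    candidates += list(env_dir.glob("Lib/site-packages"))
    if not candidates:
        raise FileNotFoundError(f"no site-packages under {env_dir}")
    return candidates[0]


class InheritingEnvBuilder(venv.EnvBuilder):
    """Hypothetical builder: the child sees the parents' packages via a .pth file."""

    PTH_NAME = "_inherited_parents.pth"

    def __init__(self, parents=(), **kwargs):
        super().__init__(**kwargs)
        self.parents = [Path(p).resolve() for p in parents]

    def post_setup(self, context):
        # Called by EnvBuilder.create() once the environment directories exist.
        if not self.parents:
            return
        child_site = find_site_packages(Path(context.env_dir))
        lines = [str(find_site_packages(parent)) for parent in self.parents]
        (child_site / self.PTH_NAME).write_text("\n".join(lines) + "\n",
                                                encoding="utf-8")


use_symlinks = os.name != "nt"  # mirrors the default of "python -m venv"

with tempfile.TemporaryDirectory() as tmp:
    parent_dir = Path(tmp) / "parent"
    child_dir = Path(tmp) / "child"
    venv.EnvBuilder(with_pip=False, symlinks=use_symlinks).create(parent_dir)
    InheritingEnvBuilder(parents=[parent_dir], with_pip=False,
                         symlinks=use_symlinks).create(child_dir)

    pth_file = find_site_packages(child_dir) / InheritingEnvBuilder.PTH_NAME
    pth_lines = pth_file.read_text(encoding="utf-8").splitlines()
    parent_site = str(find_site_packages(parent_dir.resolve()))

print(pth_lines == [parent_site])
```

Because paths from a .pth file are appended after the child's own site-packages, packages installed in the child would shadow those of the parent, which matches the lookup order described in the PR description.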

@MaxwellDupre
Contributor

PyCharm allows this feature plus 'Make available to all projects'. I agree with core devs here; this should not be part of Python core.
[Image: Pycharm-inherit]

@github-actions github-actions bot removed the stale label Aug 1, 2022
@Carreau
Contributor Author

Carreau commented Nov 9, 2022

Let's close this for now, to keep the list of PRs small(er).

@Carreau Carreau closed this Nov 9, 2022
@Carreau Carreau mannequin mentioned this pull request Nov 26, 2023