Crazy Idea: Rewrite virtualenv #691

Closed
wants to merge 66 commits into from

dstufft
Member

@dstufft dstufft commented Dec 25, 2014

I don't think it's any secret that the current virtualenv code base is pretty terrible. It's not well factored, it's full of hacks (some of which I'm pretty sure no longer apply), there are practically no tests, and I don't think anybody actually fully understands it. On top of that, the -p argument works by essentially shelling out and executing the virtualenv.py script with the target interpreter, which causes a bunch of problems with sys.path; any attempt to fully resolve them seems to break another use case. In addition to all of that, since the creation of virtualenv the stdlib has grown its own venv module, which offers built-in isolation.

At first I debated whether or not virtualenv has a place at all in the future(tm) where Python has a venv module built into it. I think that it does. The built-in venv module suffers from a common problem with stdlib modules: while it provides some functionality, updates to it require an update to the entire Python interpreter, and in a world that often gets its Python from LTS releases like CentOS, Ubuntu, Debian, etc., people are often using an older version of Python. Another thing is that even if you have an updated venv, that doesn't help you if you want to create a venv of an older Python. In contrast, the virtualenv module can be updated independently, which means we can push out new features, bug fixes, and new versions of pip/setuptools quicker.

However, virtualenv currently relies on a giant pile of hacks in order to create its isolation. These hacks can require more hacks depending on the Python version, the interpreter it's running under, and whether the downstream distributor of Python (e.g. Debian) applies their own patches. In addition, the hacks involve writing our own site.py file and our own distutils.py, which means that the behavior of these files inside of a virtualenv is different than when not inside of a virtualenv, and this itself can cause a number of problems.

Due to all of this, I think it's a good idea to re-use the isolation support of the venv module if it's available in the target Python and only use the "legacy" hacks as a fallback. I looked at doing something like this in the current virtualenv; however, the code is so intertwined and spaghetti that doing that almost requires a massive rewrite anyway, so I've decided to just do a from-scratch rewrite of virtualenv and port over the isolation hacks from the old version. This work represents that.

In this rewrite I have the following goals:

  • Re-implement with a clean code base using modern best practices.
  • As complete test coverage as possible.
  • Use the venv module to handle isolation wherever possible, but handle the activation scripts and installing of projects into the virtual environment ourselves.
  • Remove the -p hack where we re-execute virtualenv with a different interpreter but still support -p.
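
One way to support -p without re-executing virtualenv.py under the target interpreter is to ask that interpreter a narrow question in a subprocess and make every decision in the parent process. The following is only a minimal sketch under that assumption; `target_has_venv` is a hypothetical helper name and is not taken from this pull request.

```python
import subprocess
import sys


def target_has_venv(python_executable):
    """Return True if the given interpreter can import the stdlib venv module.

    Hypothetical helper: it runs a one-line probe in the target interpreter
    instead of re-running the whole tool under it.
    """
    try:
        result = subprocess.run(
            [python_executable, "-c", "import venv"],
            stdout=subprocess.DEVNULL,
            stderr=subprocess.DEVNULL,
        )
    except OSError:
        # The executable does not exist or cannot be started.
        return False
    return result.returncode == 0


if __name__ == "__main__":
    print(target_has_venv(sys.executable))
```

Because the probe only imports a module, the target interpreter's sys.path is never mixed with the path of the interpreter running the tool.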

The actual work itself involves:

  • Create a command line interface that is similar to the existing interface.
  • Create a generic EnvironmentBuilder API that any isolation technique can implement.
  • Create a venv module style implementation of the EnvironmentBuilder API.
  • Create a legacy isolation style implementation of the EnvironmentBuilder API.
  • Support overriding the prompt for a virtual environment at creation time.
  • Support installing pip/setuptools (and independently enabling/disabling them) into the virtual environment.
  • Support searching extra directories for pip/setuptools packages.
  • Support controlling the verbosity of the output using the -v, --verbose, -q, and --quiet flags.
  • Support selecting the best EnvironmentBuilder implementation that is available for the target Python.
  • Write tests for everything.
  • Ensure that everything works on platforms other than OS X, particularly Windows.
  • Fix the packaging to handle the new code/layout/etc.
  • Create environments using the "real" Python if called while another environment is active.
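
As an illustration of the "generic EnvironmentBuilder API" and "select the best implementation" items above, here is a rough sketch. The class names match the ones used in this description, but the method names (`check_available`, `create`), the `features` argument, and `select_builder` are assumptions made for the example and may differ from the actual code in the branch.

```python
import abc


class EnvironmentBuilder(abc.ABC):
    """Illustrative base class for an isolation technique (assumed shape)."""

    def __init__(self, python, system_site_packages=False, clear=False):
        self.python = python
        self.system_site_packages = system_site_packages
        self.clear = clear

    @classmethod
    @abc.abstractmethod
    def check_available(cls, features):
        """Return True if this technique works for the target interpreter."""

    @abc.abstractmethod
    def create(self, destination):
        """Create the isolated environment at ``destination``."""


class VenvBuilder(EnvironmentBuilder):
    @classmethod
    def check_available(cls, features):
        # Usable only when the target interpreter ships the venv module.
        return "venv" in features

    def create(self, destination):
        return "venv:{}".format(destination)  # placeholder for real work


class LegacyBuilder(EnvironmentBuilder):
    @classmethod
    def check_available(cls, features):
        # The legacy hacks are the fallback, so they always apply.
        return True

    def create(self, destination):
        return "legacy:{}".format(destination)  # placeholder for real work


# Ordered from most to least preferred.
BUILDERS = [VenvBuilder, LegacyBuilder]


def select_builder(features):
    """Pick the first builder that reports itself available."""
    for builder in BUILDERS:
        if builder.check_available(features):
            return builder
    raise RuntimeError("no usable builder for the target interpreter")
```

Here `features` stands in for whatever the tool learns about the target Python, for example a set containing `"venv"` when that module is importable there.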

The work for the VenvBuilder involves:

  • Create the actual virtual environment.
  • Support creating a virtual environment with and without the global site-packages directory.
  • Support clearing an existing virtual environment.
  • Support virtual environments using symlinks, copying, or "pick the best".
  • Support making a virtual environment relocatable.
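
Most of the options in this list map onto arguments that the stdlib `venv.EnvBuilder` already accepts. The sketch below shows that mapping for the interpreter that is currently running; it does not cover targeting a different interpreter with -p, and `create_with_venv` is a hypothetical function name, not one from this branch. Treating `symlinks=None` as "pick the best" is likewise an assumption for the example.

```python
import os
import tempfile
import venv


def create_with_venv(destination, system_site_packages=False, clear=False,
                     symlinks=None):
    """Create an environment using the stdlib venv module.

    Returns the path to the pyvenv.cfg file written by venv.
    """
    if symlinks is None:
        # Assumed "pick the best" policy: symlink except on Windows.
        symlinks = os.name != "nt"
    builder = venv.EnvBuilder(
        system_site_packages=system_site_packages,
        clear=clear,
        symlinks=symlinks,
        with_pip=False,  # installing pip/setuptools is handled separately
    )
    builder.create(destination)
    return os.path.join(destination, "pyvenv.cfg")


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        cfg = create_with_venv(os.path.join(tmp, "env"))
        print(os.path.isfile(cfg))
```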

The work for the LegacyBuilder involves:

  • Create the actual virtual environment.
  • Support creating a virtual environment with and without the global site-packages directory.
  • Support clearing an existing virtual environment.
  • Support virtual environments using symlinks, copying, or "pick the best".
  • Support making a virtual environment relocatable.

Obviously the biggest danger involved here is that we're going to lose something that somebody somewhere was depending on and things will break. This is especially bad because virtualenv does not have anything resembling a reasonable test suite and it is a giant ball of hacks, layered upon more hacks, with an extra helping of spaghetti. However, I believe in the long run this will be a much more sustainable solution, especially since it allows us to delegate the isolation portion to the interpreter when possible. This will be very helpful to alternate interpreters such as PyPy or Jython too, since they'll be able to implement and control the isolation on their own.

/cc @alex @arigo for the impact on PyPy.
/cc @jimbaker for impact on Jython.

I'm probably going to complete the work regardless just to see if it's even possible, but what do the @pypa/virtualenv-developers think about this? Is this something we want to do?

@dstufft
Member Author

dstufft commented Dec 25, 2014

If anyone attempts to run this now, the setup.py is basically broken, so you currently have to check out the code and just run it with python -m virtualenv. It currently only supports creating virtual environments of Python 3.3+ (or any Python that has the venv module in it).

@dstufft
Member Author

dstufft commented Dec 25, 2014

Oh, and this currently uses click, but I'm not sure if click is worth it or not. If not we should probably just use argparse instead.
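
For comparison, an argparse version of a command line similar to the existing interface might look roughly like the following. This is a sketch only: the option set is a small subset chosen for illustration, and `build_parser` and `verbosity` are hypothetical helper names.

```python
import argparse


def build_parser():
    """Build a parser for a small subset of virtualenv-style options."""
    parser = argparse.ArgumentParser(prog="virtualenv")
    parser.add_argument("destination",
                        help="directory to create the environment in")
    parser.add_argument("-p", "--python", default=None,
                        help="target interpreter for the environment")
    parser.add_argument("--system-site-packages", action="store_true",
                        help="give access to the global site-packages")
    parser.add_argument("--clear", action="store_true",
                        help="clear an existing environment first")
    parser.add_argument("--prompt", default=None,
                        help="override the activation prompt")
    # Each repetition of -v or -q adjusts the verbosity by one step.
    parser.add_argument("-v", "--verbose", action="count", default=0)
    parser.add_argument("-q", "--quiet", action="count", default=0)
    return parser


def verbosity(args):
    """Collapse the -v/-q counts into a single signed level."""
    return args.verbose - args.quiet


if __name__ == "__main__":
    parsed = build_parser().parse_args(["-v", "-p", "python2.7", "env"])
    print(parsed.python, verbosity(parsed))
```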

@ncoghlan
Member

Very nice! One option we may want to consider is to publish this in parallel with the current virtualenv for a while, giving folks a chance to try it out relatively easily without conflicting with the existing virtualenv implementation. While I personally favour that approach, it is more work upfront, and would need to be communicated carefully to avoid a repeat of the setuptools/distribute confusion. The intended pay-off is reducing the risk of swapping out the existing implementation for a new one by making it easier for users to provide feedback.

@ncoghlan
Member

For testing, you may want to explore some of the utilities that build atop virtualenv, like vex, pew, virtualenvwrapper, etc.

If those have reasonable test suites, they may also serve as at least partial functional tests for virtualenv itself (similar to the way the stdlib venv tests provide functional testing of ensurepip)

@dstufft
Member Author

dstufft commented Dec 25, 2014

Well, it wouldn't really be much more work. Unlike setuptools/distribute, people generally don't import virtualenv in end-user code much, so we could just publish it as virtualenv2 for people to use.

@ncoghlan
Member

Yeah, the "more work" I was referring to was just setting up the separate project for it, and being clear that it's intended to become the implementation of virtualenv eventually, but we want to give folks a good chance to kick the tyres first.

@dstufft
Member Author

dstufft commented Dec 25, 2014

Maybe virtualenv-preview or something, to better communicate that it's a temporary situation.

@ncoghlan
Member

"virtualenv-refactor" perhaps? It's not a general purpose preview, it's specifically for testing a specific refactoring effort - while we likely won't retire the name, it would eventually become just an empty shell depending on the real virtualenv.

@sigmavirus24
Member

What about "virtualenv-ಠ_ಠ"?

In all seriousness though, can we just put off shaving the naming yak until it's ready to 🚢 ?

@dstufft
Member Author

dstufft commented Dec 25, 2014

The last commit gets a semi working isolation level for Pythons which do not have a venv module. It currently only supports Python 2.7 (needs some conditionals) and the only thing I've really tested is that the sys.path ends up with the values I expect. It needs more work still, but it's starting to take shape.
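
A check like "sys.path ends up with the values I expect" can be automated by asking the environment's interpreter to report its own sys.path. The helper below is a minimal sketch of that idea; `interpreter_sys_path` is a hypothetical name, and the sketch assumes the interpreter prints nothing else to stdout at startup.

```python
import json
import subprocess
import sys


def interpreter_sys_path(python_executable):
    """Return sys.path as seen by the given interpreter."""
    output = subprocess.check_output(
        [python_executable, "-c",
         "import json, sys; print(json.dumps(sys.path))"]
    )
    return json.loads(output.decode("utf-8"))


if __name__ == "__main__":
    for entry in interpreter_sys_path(sys.executable):
        print(entry)
```

A test could then assert that the environment's own site-packages directory is present in the returned list and that the global one is absent.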

@ionelmc

ionelmc commented Dec 29, 2014

If you're gonna rename it (pun alert), how about realenv? Or workingenv?

@dstufft
Member Author

dstufft commented Dec 29, 2014

Eh, we want to avoid a "real" name because it's only intended to be a temporary fork to enable easy testing.

@ncoghlan
Member

I believe workingenv was actually the name of Ian Bicking's virtualenv predecessor :)

A name like "virtualenv-unstable" could be a reasonable way to go - lots of folks are already familiar with the Debian unstable/testing/stable channel based release model.

It's also an approach that generalises - the "-unstable" suffix could be used for any project that would like to provide a clearer separation between their dev pre-releases and their stable project releases.

In that case, it wouldn't be a temporary situation - "virtualenv-unstable" would stick around indefinitely, it would just spend most of its time as an empty shell that depends on the stable version and only diverge when there was a significant change incoming that needed broader testing.

@ionelmc

ionelmc commented Dec 29, 2014

For anyone interested, some rough-ish code to make this work on Windows: dstufft#2

@Ivoz Ivoz self-assigned this Dec 30, 2014
@dstufft
Member Author

dstufft commented Jan 4, 2015

Status Update:

Things are mostly working. The biggest thing left to do functionality-wise is to figure out a solution for creating a virtual environment inside of a virtual environment. I also need to write tests for everything and possibly figure out a good cross-platform testing solution that actually works for us (this could be a good experiment for something to use for pip, setuptools, and such as well). I also need to go through all the documentation and make sure it's accurate with the rewrite, as well as document the hell out of anything that would have broken.
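
For the environment-inside-an-environment case, the first step is detecting that the running interpreter is itself inside one. Legacy virtualenv sets `sys.real_prefix`, and the stdlib venv module makes `sys.base_prefix` differ from `sys.prefix`. The sketch below relies only on those two attributes; the function names are hypothetical, and how the rewrite actually handles this case is not stated here.

```python
import sys


def running_in_virtual_environment():
    """Return True if the current interpreter is inside an environment."""
    if hasattr(sys, "real_prefix"):
        # Set by legacy virtualenv.
        return True
    # venv sets base_prefix to the real installation prefix.
    return getattr(sys, "base_prefix", sys.prefix) != sys.prefix


def real_prefix():
    """Return the prefix of the underlying "real" Python installation."""
    legacy = getattr(sys, "real_prefix", None)
    if legacy is not None:
        return legacy
    return getattr(sys, "base_prefix", sys.prefix)


if __name__ == "__main__":
    print(running_in_virtual_environment(), real_prefix())
```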

I think I may drop the --relocatable feature. It has never quite worked right and I think that any solution to relocating virtual environments is better solved by recreating the virtual environment. The biggest reason I can think of not to recreate the virtual environment is not wanting to pay the cost of compiling a second time, however I think the right answer to that problem is to use Wheels. In general I don't think making a relocatable environment is possible without sharp edges, especially with C extensions because they may or may not have hardcoded compile time paths that may or may not be able to be adjusted at runtime.

I'd like to throw a big thank you out to @ionelmc who implemented the "flavor" support in the rewrite which abstracts away differences between Windows/Posix (and possibly other things if we need them!) so that we don't need a bunch of if statements all throughout the code.
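
To give a feel for what a "flavor" abstraction can look like, here is a small sketch. It is not the implementation from the rewrite: the class names, attributes, and `get_flavor` are assumptions, and only the directory and executable names (`bin`/`python` on POSIX, `Scripts`/`python.exe` on Windows) reflect the usual environment layout.

```python
import os


class PosixFlavor:
    """Layout details for POSIX-style environments (illustrative)."""
    bin_dir = "bin"
    python_name = "python"


class WindowsFlavor:
    """Layout details for Windows environments (illustrative)."""
    bin_dir = "Scripts"
    python_name = "python.exe"


def get_flavor(os_name=os.name):
    """Choose the flavor once, so callers need no platform checks."""
    return WindowsFlavor if os_name == "nt" else PosixFlavor


def python_path(env_dir, flavor):
    """Path to the interpreter inside an environment for a given flavor."""
    return os.path.join(env_dir, flavor.bin_dir, flavor.python_name)


if __name__ == "__main__":
    print(python_path("env", get_flavor()))
```

The point of the design is that platform differences live in one place, and the rest of the code asks the flavor object instead of branching on the platform.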

@dstufft dstufft mentioned this pull request Jan 5, 2015
5 tasks
@dstufft
Member Author

dstufft commented Jan 5, 2015

Closing this in preference to #697.

@dstufft dstufft closed this Jan 5, 2015

5 participants