Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

16.4.0 causes breakage on system python when used with dh-virtualenv #1327

Closed
nailor opened this issue Feb 27, 2019 · 13 comments
Closed

16.4.0 causes breakage on system python when used with dh-virtualenv #1327

nailor opened this issue Feb 27, 2019 · 13 comments

Comments

@nailor
Copy link

nailor commented Feb 27, 2019

Fix in #1309 is now causing dh-virtualenv to accidentally destroy Python installation. We are still digging into what we can do on our side, or what combination things is the root cause, but currently the behaviour looks pretty bad: spotify/dh-virtualenv#272

Originally posted by @nailor in #1309 (comment)

@gaborbernat
Copy link
Contributor

@rofl0r surprising behaviour 🤔

@rofl0r
Copy link

rofl0r commented Feb 27, 2019

well it seems a few people were relying on the exact undocumented (and bogus) semantics of previous virtualenv version. now that it works correctly they're facing reality.. on the positive side, it seems the vast majority of users has no problems. the other people which commented on #1309 fortunately figured out what went wrong and that the bug was on their end, i hope @nailor and company will do so to.

@nailor
Copy link
Author

nailor commented Feb 27, 2019

hi,

so we are not actually relying on any exact undocumented semantics here of previous versions. The issue is pretty well outlined in this comment by @jhermann: spotify/dh-virtualenv#272 (comment).

To summarise the flow here:

  1. There exists a dh-virtualenv packaged Debian package on the system which was packaged using older virtualenv, i.e. having symlinks pointing to stdlib.
  2. A new version of the said package is built, using upgraded virtualenv (16.4.0), which builds a new Debian package that will actually contain the actual directories, but links for the files of stdlib.
  3. Upon installation of this package, due to the nature how Debian packages work, those existing directory symlinks are followed and the new file symlinks are installed, causing those newly installed symlinks to replace the files in the Python stdlib with symlinks to itself
  4. Everything is broken.

I'd like to stress that the change of moving from symlinking to the stdlib to symlinking to actual files inside a physical directory is what creates this breakage. Dh-virtualenv is not relying on "undocumented semantics", but this change combined with how Debian packages are delpoyed (in a nuthsell just tarball), will cause this to break in unexpected way (effectively replacing your whole stdlib with looping symlinks).

@rofl0r
Copy link

rofl0r commented Feb 27, 2019

hi,
do you have some more details on the mechanism causing this:

Upon installation of this package, due to the nature how Debian packages work, those existing directory symlinks are followed and the new file symlinks are installed, causing those newly installed symlinks to replace the files in the Python stdlib with symlinks to itself

?

@jhermann
Copy link

jhermann commented Feb 27, 2019

BTW, anyone using plain tarballs with overwriting (i.e. unversioned target dirs) for deployment will also be affected by this.

Due to the high impact of accidental overwrites, a small change in layout should be considered. Namely replacing the subdirs (encoding, importlib, …) with again symlinks to ../.stdlib/…, and create the new physical only-files-are-symlinked structure in …/lib/.stdlib.

Not the prettiest layout, but then who runs a beauty contest in those dirs…

@nailor Moving stuff according to this after venv creation could be a local fix in dh-venv.

@rofl0r
Copy link

rofl0r commented Feb 27, 2019

@jhermann you mean a tarball created out of a virtual-env created directory ? theoretically tar should preserve the information which files are mere symlinks.

@jhermann
Copy link

do you have some more details on the mechanism causing this:

Debian package upgrades work by overwriting the old tree with the new tree of files (in-place), and only then purging the diff (old / removed files). The reasoning is that this provides the biggest chance for uninterrupted service / seamless upgrades, because there is no race condition window of "no files".

@jhermann
Copy link

@jhermann you mean a tarball created out of a virtual-env created directory ? theoretically tar should preserve the information which files are mere symlinks.

tar might not have the problem, because it also stores information about directories, i.e. might handle the change of symlink → dir correctly. Well, dpkg / apt does not.

@nailor
Copy link
Author

nailor commented Feb 27, 2019

Given files inside Debian package are actually packaged in a tar package, I would be slightly surprised not to be able to reproduce this bug with plain tar

@nailor
Copy link
Author

nailor commented Feb 27, 2019

OK, did some digging around here. This is not replicated by just running tar. The reason why this backfires lies indeed in Debian packaging. Note that this is not a bug in a Debian packaging, nor in a way with virtualenv, but the change of moving from symlinked directories to symlinked files inside a regular directory is the one that backfires.

After going through some dpkg source code and documentation, here are the relevant parts:

A directory will never be replaced by a symbolic link to a directory or vice versa; instead, the existing state (symlink or not) will be left alone and dpkg will follow the symlink if there is one.

  • Same is echoed in dpkg-maintscript-helper man page, urging to add preinst step to clean / fix those directories
  • If I'm reading dpkg source code on tar extraction correctly (no expert here), you can actually see it in action: Debian, upon installing debian packages, replaces symlinks in that tar package, effectively achieving what is documented: Not changing symlinks to directories (or vice versa).

Last bit is why this is not happening on regular tar packages.

That said, the #1309 included 16.4.0 upgrade, while itself makes sense will cause system python to be overridden upon upgrading a Debian package containing virtualenv created with <16.4.0 with a package containing a virtualenv with 16.4.0.

This can be very problematic. The problem definitely is not contained to dh-virtualenv only, or to packages created only with it. This can potentially be affecting any Debian packages going over this upgrade bump. Worst case scenario for user is that they just upgrade a Debian package containing the unfortunate version bump of a virtualenv, ruining their system python completely.

@gaborbernat @rofl0r I'd suggest either rolling back the change or changing the behaviour slightly as proposed by @jhermann

@rofl0r
Copy link

rofl0r commented Feb 27, 2019

tbh i find the thought of using virtual-env environments for packages a bit odd.
do you have some step-by-step instructions that would allow us to reproduce a failing setup, @nailor ?

@nailor
Copy link
Author

nailor commented Feb 27, 2019

There are many valid cases to do what dh-virtualenv does and why people do this same thing with other tools, but that's besides the point.

I was able to reproduce this quite easily by following our tutorial. Before building the package, you can just pip install virtualenv==16.3.0, then you install the package, pip install virtualenv==16.4.0, re-build the package and then install the new package. System python is now officially toast.

I've attached the boilerplate package in a tar. Just extract it in a Debian based system and build it following the above steps: sample-package.tar.gz

EDIT: please run this in a disposable machine (or Docker container), this will ruin the system Python

gaborbernat added a commit that referenced this issue Feb 28, 2019
@nailor
Copy link
Author

nailor commented Mar 1, 2019

@gaborbernat Thanks!

Sign up for free to subscribe to this conversation on GitHub. Already have an account? Sign in.
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants