pip should reinstall . #536

Open
dholth opened this Issue May 15, 2012 · 45 comments

Member

dholth commented May 15, 2012

When asked to 'pip install .' it would be nice if pip actually reinstalled it, rather than checking the version number.

Contributor

ianb commented May 15, 2012

Should the general heuristic be to reinstall local files that are given on the command line? This would keep local files in requirement files from being reinstalled automatically, which may or may not be a good idea, I can't decide.

(It should be feasible to determine why something is being installed using comes_from and so treat command-line arguments differently from other indirect installations.)

Member

dholth commented May 15, 2012

The heuristic would be local directories passed on the command line. I was developing something without using pip install -e and "my edits don't show up until I edit setup.py with a new version number" was rather surprising.

Is there just a way to reinstall one thing without updating all the deps?

Contributor

ianb commented May 16, 2012

I'm not sure there is, --ignore-installed would force a reinstall, but it'd force a reinstall of all the dependencies too. pip install -e should work okay, since edits should show up, and I believe it does force installation. But pip install . copies files and doesn't force.

I'd personally prefer if pip install didn't do the work if none is required. It's "more idempotent" :-), and I think more useful when installing "released" packages. But I agree that an optional --reinstall that forces reinstall of the requested package (but not its dependencies) is useful. That would be my suggestion, if I can pipe in here. (Right now I'm using pip install --no-deps --ignore-installed to do that for my package in development, but it's a bit of a mouthful.)

Perhaps an --ignore-installed-this option that ignores installed except for dependencies?
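For reference, the workaround mentioned above (pip install --no-deps --ignore-installed) can be wrapped in a small script. This is only a sketch: the helper name reinstall_command and its parameters are made up for illustration, and it merely builds the argument list from flags already named in this thread.

```python
import sys


def reinstall_command(path=".", with_deps=False):
    """Build the argv for the workaround discussed above (hypothetical helper).

    --ignore-installed makes pip install regardless of what is present;
    --no-deps keeps the dependencies from being reinstalled as well.
    """
    cmd = [sys.executable, "-m", "pip", "install", "--ignore-installed"]
    if not with_deps:
        cmd.append("--no-deps")
    cmd.append(path)
    return cmd
```

To actually run it, pass the list to subprocess.run(reinstall_command(), check=True) from the project directory.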

Contributor

rbtcollins commented Oct 15, 2015

This might be fixed actually; needs someone to test it :)

Contributor

xavfernandez commented Oct 22, 2015

This might be fixed actually; needs someone to test it

Not sure what is fixed/needed fixing ^^

pip install . keeps installing only when the version changes, and pip install --no-deps --ignore-installed . does indeed work fine and reinstalls . every time.

Contributor

rbtcollins commented Oct 22, 2015

'pip install .' should reinstall everytime

'pip install .' should reinstall everytime

Could you explain why?

Owner

pfmoore commented Oct 23, 2015

I agree. I can see the logic that you're looking for a workflow of "do some edits / pip install .", but that's not the behaviour pip supports - you should be using pip install -e ..

If pip install . is special-cased to not check the version, then what about pip install .., or any of a number of other local directories that you might have been working in? What about pip install <full path to .>? You'd basically be saying that when you install any directory in the local filesystem, you should ignore the version, and I don't think that's a good idea at all.

IMO, you should either use pip install -e . or pip install --ignore-installed ..

Owner

pfmoore commented Oct 23, 2015

(To clarify, I agree with @piotr-dobrogost's question, not with @rbtcollins suggestion...)

Contributor

rbtcollins commented Oct 25, 2015

@pfmoore I don't think . needs any special casing.

pip supports two means to resolve distributions: by name (where version constraints can be applied) or by location (where they cannot). Location should always reinstall, in my opinion, because that's a lot easier to model for users.

'.' is just a commonly used location, but the logic applies equally to git+ urls, tarball urls (e.g. vcs exports), and working directories of dependencies.
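A rough sketch of the name-versus-location split described above. This is not pip's real argument parser; the helper names and the prefix list are assumptions chosen only to illustrate the proposed rule.

```python
import os

# Assumed prefixes for illustration; pip's real detection is more involved.
_URL_PREFIXES = ("git+", "hg+", "svn+", "bzr+", "http://", "https://", "file:")


def is_location(arg: str) -> bool:
    """Return True if arg looks like a path or URL rather than a name."""
    if arg.startswith(_URL_PREFIXES):
        return True
    # A leading dot or any path separator is treated as a filesystem path.
    return arg.startswith(".") or "/" in arg or os.sep in arg


def should_always_reinstall(arg: str) -> bool:
    """Proposed rule: locations always reinstall; names trust the version."""
    return is_location(arg)
```

Under this sketch, '.', a git+ URL and a tarball URL all count as locations, while something like foo==1.0 is a name and keeps the current version check.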

Owner

pfmoore commented Oct 25, 2015

@rbtcollins OK, I see what you're saying now. Not sure if I agree (I'll have to think about it) but I see what you're proposing.

rgommers commented Nov 3, 2015

+1 for always reinstall for all local paths and urls. The current behavior is very counterintuitive.

Contributor

xavfernandez commented Nov 3, 2015

I agree concerning local directory path or vcs link without specific commit.
But I'm torn concerning direct links to tarball if a version is specified in the name.

I'd find it strange to have pip install path/to/foo-1.0.tar.gz reinstall everytime while pip install --no-index --find-links=some_dir/containing/the/foo/tarball/ foo==1.0 would not.

@rbtcollins

pip supports two means to resolve distributions: by name (where version constraints can be applied) or by location (where they cannot). Location should always reinstall, in my opinion, because that's a lot easier to model for users.

Couldn't/shouldn't pip find out what's the version first (by running egg_info, using #egg=... and #version=... or by other means) in the location case thus eliminating difference in behavior?

Contributor

rbtcollins commented Nov 4, 2015

@piotr-dobrogost developers don't have accurate version information in working trees.

Member

njsmith commented Nov 4, 2015

+1 to @rbtcollins's suggestion too, FWIW.

Common situation: a VCS checkout with nominal version "1.2+dev". If I run pip install ., and then tweak some code (without changing the version number, because before it was an anonymous unreleased dev version, and now it's still an anonymous unreleased dev version), and then run pip install . again, then it should install my tweaked code. Likewise if I write pip install git+.../master on Saturday and then again on Wednesday -- there's an excellent chance that what is in master has changed between those two calls, even if the version number hasn't changed.

And then yeah we could try to get fancy and avoid the reinstall in some particular cases, like tarballs-with-embedded-version-numbers (but not other tarballs) or URLs-with-hashes (but not other URLs), but it seems like there are better things to spend our cleverness on -- plus the cleverer we try to be, the more likely we are to introduce subtle bugs and confuse users. "Always reinstall when given a location" is simple and works.

Owner

pfmoore commented Nov 4, 2015

And then yeah we could try to get fancy and avoid the reinstall in some
particular cases, like tarballs-with-embedded-version-numbers (but not
other tarballs) or URLs-with-hashes (but not other URLs), but it seems like
there are better things to spend our cleverness on -- plus the cleverer we
try to be, the more likely we are to introduce subtle bugs and confuse
users. "Always reinstall when given a location" is simple and works.

I don't think there's any cleverness needed. If you're installing from a
local directory then assume it's a working directory and always install.
If installing from a local file, it's a sdist and should follow normal
install rules, just as if it were downloaded from PyPI. If installing from
something remote (i.e. a URL) then it's not a working directory, so install
with the normal (current) rules.

(I understand from the distutils-sig discussions on new sdist formats that
you consider there to be a difference between a tarred up working directory
and a sdist, but that's a distinction pip doesn't make currently, so isn't
relevant here).

There may be corner cases with the above that you'd prefer to work
differently, but it's a simple understandable rule that improves over the
current behaviour in the cases people care about (basically just pip install .).
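The three-way rule in this comment (local directory, local file, remote URL) could be sketched as follows. The function name and return values are hypothetical, and the URL test is a deliberately crude assumption.

```python
import os


def install_policy(arg: str) -> str:
    """Hypothetical classifier for the rule sketched above."""
    if "://" in arg:
        # Remote URL: not a working directory, so keep the current rules.
        return "normal"
    if os.path.isdir(arg):
        # Local directory: assume a working tree and always install.
        return "always"
    # Local file (sdist or wheel) or a plain requirement name.
    return "normal"
```

Only the directory case changes behaviour here; everything else still goes through the usual version check.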

Owner

dstufft commented Nov 4, 2015

I agree with @pfmoore wrt always installing from a local directory, use typical rules for a tarball.

Member

njsmith commented Nov 4, 2015

I'm not overly fussed about the tarball case, because the working directory case is the one that people are stubbing their toe on every day and I don't want to hold up fixing it. If we at least have consensus to fix that then we should go ahead and do it.

I do think you're all wrong though :-)

First, people do actually do

pip install https://github.com/pypa/zip/archive/develop.zip

or

pip install git+ssh://git@github.com:pypa/pip.git

and I don't see what we gain by gratuitously breaking them.

So that's an obvious case at one end of the spectrum. But on further thought, I also disagree about even the most extreme example on the other end. Consider:

pip install ./lxml-2.3.1-whatever.whl

I think this should do an install even if lxml==2.3.1 is already installed. If I just went to the trouble of manually downloading and specifying a file by hand, then there are few things more infuriating than some piece of smug, overly confident software chiding me and refusing to follow my explicit instructions. And honestly pip doesn't know better than me here. If I asked to get an installation of lxml==2.3.1 and there's already one installed, then fair enough, pip has done what I asked for. But in this case, that's not what happened. I asked for an installation of lxml out of this file right here, and pip has no idea whether I already have that or not. Maybe my existing installation is broken and now I'm trying to switch to a known-good wheel that I got from cgohlke's site or something. It simply isn't the case that all tarballs and wheels with the same (name, number) are interchangeable. What do we gain by pretending they are, besides pissed off users?

Contributor

rbtcollins commented Nov 4, 2015

So, let me try again.

Axiomatic: Users have to learn about any hidden behaviours we have.

"Lookup by name == trust version; otherwise == always install." -> There is one condition and 3 concepts for users to learn.

"Lookup by anything other than local directories == trust version; everything else == always install." is one condition but 4 concepts - its harder.

But more importantly, that heuristic is less useful, as @njsmith points out, because tarball exports from VCS systems will also tend to have broken version metadata (assuming they work at all, which many won't). Local build artifacts generated from CI systems will have broken metadata.

"It is a file" is not a good indicator for "has good version metadata"

I'm seeing where both @njsmith and @rbtcollins come from but still treating the exact same file differently on the merit of its origin (downloaded from index or read from local system) seems wrong. Also I do not see the need to go out of the way to make installing something with broken metadata just work when all that is needed in this case is passing flag forcing installation.
Generally the whole idea of treating different formats differently and guessing what format and in what scenario has what chance of having valid metadata leads nowhere. In light of this I wouldn't special case any format, including local directory. For development there's editable mode and for cases one wants to ignore version metadata there's forced installation.

The original motivation for wanting this was:

I was developing something without using pip install -e and "my edits don't show up until I edit setup.py with a new version number" was rather surprising.

As developing without using editable mode is an odd thing to do (there's no information why this was taking place; were there any obstacles to using editable mode?) it's hard to treat this as valid motivation for whatever follows.

Member

njsmith commented Nov 7, 2015

treating the exact same file differently on the merit of its origin (downloaded from index or read from local system) seems wrong.

That's exactly not the argument. The argument is that we should respect the user's request: if they typed out the name of a specific file and asked for it to be installed then we should install that file, and the only way we can do that reliably is by, you know, installing it. (Even our metadata standards acknowledge that version comparisons aren't sufficient; they include the @ escape hatch to refer to a specific file/URL.) If the user types out a distribution name without specifying where to get it, then that's totally different. The difference isn't where we downloaded the file, the difference is about respecting the user's expressed desires.

(Seriously, when pip does its "sorry Dave, you already have that installed" routine, then the nasty combination of feeling helpless and patronized -- by a stupid piece of software, no less! -- is so frustrating that it makes it one of the very few cases where my initial impulse is to yell "no fuck YOU FIRST" and dropkick my laptop. It's a pretty visceral reaction. Maybe this kind of feeling is par for the course when dealing with python packaging right now, but I'd really like it if using pip became a pleasant thing that I looked forward to :-/.)

As developing without using editable mode is an odd thing to do

Well, this is somewhat tangential, and I'm in the middle of writing a longer reply to the mailing list thread about why I also dislike editable mode and recommend that no one use it, so... I will just say that the world contains multitudes who use a variety of different workflows, and pip should support them too. And in particular if we think "setup.py install must die" as Nick's bof session said, then pip install needs to be at least as capable as setup.py install. Right now setup.py handles non-editable install workflows just fine (or at least well enough to convince most users that it's working), so if pip doesn't then that's effectively a regression.

Owner

dstufft commented Nov 7, 2015

I've been thinking about this more. I think I am on board with the idea that if someone points to a direct location we should install that regardless of what is already installed and the version number should only be used in the resolution algorithm to detect conflicts with other people depending on that thing.

@dstufft: absolutely!

treating the exact same file differently on the merit of its origin (downloaded from index or read from
local system) seems wrong.

The issue however is that it is not the exact same file(s) -- it's a file with the same version number.

In the use case of an end user installing a package the current behaviour makes sense -- of course it does. And PyPI enforces that you can't upload a fixed/updated package with the same version number.

But the other use-case is developers of packages -- and then you really, really, don't want to have to increment your version number with every stinking change. And maybe you are testing/debugging the package installation / building, not just the code, so -e doesn't make sense either.

In my case, I build conda packages, but I like to use pip to do the actual building, so I get all the proper pip metadata -- and during testing/debugging the build process I really do want to re-build the same version, really I do.

And I think as people start to build more complex binary wheels, this will come up more.

Member

njsmith commented Jan 29, 2016

Conceptually, I think the idea is that right now, when the user passes a specific directory to 'pip install', and this directory has metadata saying

Name: foobar
Version: 1.2

then internally pip treats this as a place where foobar 1.2 can be obtained, and also as if the user requested foobar==1.2.

Instead, it should treat it AS IF the metadata instead said

Name: foobar
Version: <unique nonce>
Provides: foobar (1.2)

and as if the user requested foobar==<unique nonce>. So if there's another package that depends on foobar 1.2, then this distribution is fine for satisfying that requirement, but the user really does insist on getting exactly this distribution.
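The "unique nonce" idea above can be illustrated with a toy model. Nothing here is pip's real data model: the Candidate class, from_local_dir and satisfies are hypothetical names, and the nonce is simply a random UUID.

```python
import uuid
from dataclasses import dataclass


@dataclass(frozen=True)
class Candidate:
    name: str
    version: str
    # (name, version) pairs this candidate also satisfies for dependents.
    provides: tuple = ()


def from_local_dir(name: str, declared_version: str) -> Candidate:
    """Treat a directory as version=<unique nonce>, Provides: name (declared)."""
    nonce = uuid.uuid4().hex
    return Candidate(name, nonce, provides=((name, declared_version),))


def satisfies(candidate: Candidate, name: str, version: str) -> bool:
    """True if the candidate is, or provides, the requested (name, version)."""
    return (candidate.name, candidate.version) == (name, version) or (
        (name, version) in candidate.provides
    )
```

With this model a package depending on foobar 1.2 is satisfied by the local directory, but an already-installed foobar 1.2 never satisfies the user's request for the nonce version, so the directory is always installed.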

Contributor

xavfernandez commented Jan 31, 2016

I now think the clearest and simplest solution would be to go with @rbtcollins suggestion (always reinstall).
If someone does not want this behavior, they'll always have:

pip install --no-index --find-links=some/path/to/foo-1.0.tar.gz foo==1.0

as an escape hatch.

pradyunsg added a commit to pradyunsg/pip that referenced this issue Jun 25, 2016

Add tests for checking more new behaviour
The behaviour which gets tests added in this was discussed in #536 and
was implemented by accident in #3806.
111d0e6
Member

njsmith commented Apr 17, 2017

A slightly different version of this just bit me again in a very frustrating way: there's a bug in the latest release of the readthedocs sphinx theme that causes broken line numbering. This is fixed in current master, but the fix hasn't made it into any releases yet. This is messing up my docs, so I figured no problem, I'll manually pull in the latest version of the theme by adding a line to my RTD requirements file naming the specific git commit hash I want. That'll replace the default version that RTD installs with a newer fixed version and away we go.

Except haha, nope! That doesn't actually do anything, because that commit's setup.py has the same version number as the last release, and the last release is already installed, so pip looks at my request for d2a6253d9c782888c4468c637a5ee2a42ab3eb88 and says "oh yeah you already have that version" and refuses to install it. I do not have that version. And since the only way I can affect the environment here is by editing that requirements.txt file (I can't run arbitrary commands or change pip command-line options or anything), I think I'm just screwed.

I think my only possible workaround at this point is to fork the theme and bump the version number by hand, and then install from my fork? Ughh ☹

So I guess in addition to actually installing the specific file/path that I gave it, I would also appreciate it if pip would actually install the specific git commit that I gave it!
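A toy illustration of why the request above gets skipped, assuming the check compares only (name, version). The function and the dictionary layout are hypothetical, not pip internals.

```python
def already_satisfied(installed: dict, name: str, version: str) -> bool:
    """Version-only check: the source URL and commit are never consulted."""
    return installed.get(name) == version


def describe(installed: dict, request: dict) -> str:
    """Report what a version-only check would do with a pinned-commit request."""
    if already_satisfied(installed, request["name"], request["version"]):
        # The commit differs from what is installed, but nothing looks at it.
        return "skip"
    return "install"
```

Because the commit's setup.py reports the same version as the last release, the request is treated as satisfied even though the code differs.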

Member

pradyunsg commented Jun 28, 2017 edited

There seems to be consensus that pip should reinstall every time the user does pip install <local-dir>. I'd like to confirm this before I assign this to myself. :)


IIUC, the following would not behave differently from today even if this request is implemented.

  • pip install --no-deps --ignore-installed <local-dir> or pip install --no-deps --force-reinstall <local-dir> - reinstall the local directory package regardless.

    If the dependencies change, running pip install --upgrade --upgrade-strategy=only-if-needed . next should fix that (to the extent that pip fixes dependency requirements anyway).

  • pip install --no-index --find-links=<local-dir> <pkg-name> - reinstall the local package only when there's a version bump.

So, if you're eager for the always-reinstall behaviour (or want to pin to the current behaviour), you can use one of the above.

@dstufft @pfmoore Is this correct?

Owner

pfmoore commented Jun 28, 2017

Honestly, I've no idea. I'm OK in principle with the idea of "pip should reinstall every time the user does pip install <local-dir>" (although I still think that it's just as reasonable to direct people to be explicit and use --force-reinstall or one of the other options pip already has to get this behaviour) but I'd like to see a clearly-defined spec for what "should reinstall" means before I can comment any further. For instance, are we saying that:

  1. "should reinstall" means that we assume --force-reinstall was specified
  2. "should reinstall" means that we assume --ignore-installed was specified
  3. "should reinstall" means that we assume the version is infinitely large
  4. ... or something else entirely?

Once we're clear on what "should reinstall" actually means, we can debate how corner cases will be handled.

Also, to be clear, we're only considering local directories here, so this won't address @njsmith's case from #536 (comment) - is that correct? My personal view is that restricting the change to local directories is correct - as I said above, there are existing explicit options for people wanting this behaviour in other contexts.

Owner

dstufft commented Jun 28, 2017

I think it definitely makes sense for local directories, because local directories are likely going to be cases where someone has made modifications and the version number no longer matches reality. Other cases are... more grey. Starting with local directories seems to be a reasonable idea though, and considering other cases as time goes on.

and FTR, I think it should mean that we essentially treat it like --force-reinstall happened, uninstall the old version and install the version at <local dir>. I don't like --ignore-installed because it won't uninstall the old version.
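The distinction drawn here between the two flags can be pictured with a toy model in which an environment maps each package name to the set of files it owns. That representation, and both function names, are assumptions made for illustration only.

```python
def force_reinstall(env: dict, name: str, new_files) -> dict:
    """Uninstall the old copy first, then install the new files."""
    env = dict(env)
    env.pop(name, None)  # uninstall: every file of the old copy goes away
    env[name] = set(new_files)
    return env


def ignore_installed(env: dict, name: str, new_files) -> dict:
    """Install on top of whatever is there; nothing is uninstalled."""
    env = dict(env)
    env[name] = set(env.get(name, ())) | set(new_files)  # old files linger
    return env
```

In this model a module that was deleted from the working tree disappears after force_reinstall but is left behind by ignore_installed.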

Owner

pfmoore commented Jun 28, 2017

OK, so installing a local directory simply implies --force-reinstall (and hence if anyone wants the new behaviour for anything other than a local directory, all they have to do is explicitly use --force-reinstall). That sounds good to me.

Regarding @pradyunsg's other questions:

  • pip install --no-index --find-links=<local-dir> <pkg-name> - reinstall the local package only when there's a version bump.

That works, but why would anyone care? If it was an important use case, we wouldn't be having this discussion anyway. It's a way of getting the current behaviour after the change, I guess...

  • pip install --no-deps --ignore-installed <local-dir> - reinstall the local directory package regardless. If the dependencies change, running pip install --upgrade --upgrade-strategy=only-if-needed . next should fix that (to the extent that pip fixes dependency requirements anyway).

I have no idea what the point is here. It seems to be trying to get the new behaviour with the current pip, but pip install --force-reinstall <local dir> does that much more simply (and is by definition the same).

Member

njsmith commented Jun 28, 2017

@pfmoore: I think conceptually, pip install local-dir should be treated as if the user wrote pip install packagename==$UNIQUE_NONCE, where that's a unique version number that can only be satisfied by the local-dir, not by the existing version or any other source. The underlying idea here is that we want to respect the user's wishes, and the user named a directory, not a (name, version) tuple, so we can't assume that other packages that have same (name, version) tuple will fulfill their request.

We would also want to skip caching wheels in this case; probably we already do but mentioning for completeness.

More generally I think it would make sense to apply similar logic to any case where pip's input is not a package-name-plus-version-constraints, and for the same reason. So pip install path/to/some.whl should also always install that particular wheel path, etc. The one case where I think this might cause problems is where people are listing git urls in requirements files, and applied naively this would cause the package to be reinstalled every time the file is read. But here the current behavior isn't correct either; really we should be reinstalling iff the revision pointed to by the url has changed. This would require some extra metadata and stuff though so is a trickier problem.

I should also say I'm totally fine with @dstufft's suggestion to implement this incrementally.

Member

njsmith commented Jun 28, 2017

Here's an alternative conceptual model that I think is equivalent.

Right now, if you pip install packagename, maybe with some constraints, then pip first consults its pool of available package sources to resolve that into a (name, version) tuple. Then it tries to find a source for that (name, version) tuple, by looking in two places, in priority order: (1) the current environment (if it's already installed then installing it is really easy), (2) whatever indexes are configured.

If you add --force-reinstall, then that takes the current environment out of the list of package sources it considers.

Now, for something like pip install ./local-dir, it first consults that directory to resolve it into a (name, version) tuple, and then tries to install that (name, version) tuple by looking in 3 places, in priority order: (1) the current environment, (2) any explicitly specified wheels/files/urls that were processed in the previous step, (3) any configured indexes.

So far I'm just describing how pip currently works. You can see that there's some kind of prioritization going on because currently if you specify a local directory, then you might be told "name-version is already installed" (environment beats explicitly specified files), but you'll never see pip go off and install name-version from pypi instead (explicitly specified files beats indexes).

So the suggestion here is to swap the priority order, so it goes (1) explicitly specified files, (2) current environment, (3) configured indexes.
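The priority swap described above can be sketched as a lookup over ordered sources. The source labels and the pick_source helper are hypothetical; this mirrors the conceptual model in this comment, not pip's actual resolver.

```python
# Order in which sources are consulted for a resolved (name, version) tuple.
CURRENT_ORDER = ("environment", "explicit", "index")
PROPOSED_ORDER = ("explicit", "environment", "index")


def pick_source(wanted, sources, order):
    """Return the first source kind, in priority order, that offers wanted."""
    for kind in order:
        if wanted in sources.get(kind, ()):
            return kind
    return None
```

When foobar 1.2 is both installed and named explicitly on the command line, the current order answers "environment" (nothing gets installed) while the proposed order answers "explicit".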

Owner

pfmoore commented Jun 28, 2017

@njsmith OK, so that's an option I hadn't considered (although it's probably what I really meant by my option (3) but hadn't thought it through). @dstufft prefers (1), and I'm mildly in favour of (1). Basically, in my view (1) is simpler to explain, as it introduces no new behaviour, just alters which behaviour applies in this case.

Here's an alternative conceptual model that I think is equivalent.

Hmm, I have to say I find this explanation pretty baffling. I'm willing to take your word for it that it's the same as your previous statement, but I wouldn't like to try to explain the proposed behaviour this way to a new user.

However, regarding the wheel cache, you're absolutely right that letting any of this near the wheel cache is a big problem. Wheels are essentially required to be uniquely specified by (project name, version). And this new behaviour is explicitly acknowledging that (project name, version) is not sufficient to uniquely identify a wheel. I suspect that there are a number of other ways that people could break the wheel cache, but we'd probably respond with "different wheels with the same name/version aren't supported". This change makes the case of pip install <local dir> into an explicitly supported case which is precisely to allow reinstalling different code that has the same (name, version).

I'm not (quite) changing my mind to oppose this proposal, but I do think we need to carefully think through the implications for the principle "(name, version) uniquely identifies a wheel". (I think it might be possible to rescue the principle by distinguishing between "local" wheels and "published" wheels, but it'll be tricky to formulate the distinction clearly).

Owner

dstufft commented Jun 28, 2017

I'm not sure what the practical difference between (1) and (3) is TBH, I'm mostly just opposed to (2).

Member

pradyunsg commented Jun 28, 2017

It's a way of getting the current behaviour after the change, I guess...

Basically, I asked that to figure out how to get the behaviours even if there's a change in pip and the current pip - something that would behave the same regardless; kinda for reference if someone wants those behaviours - in an effort to understand better what's being asked for.

Member

njsmith commented Jun 28, 2017

The wheel cache thing is related, but I think it's separate. It's already true that pip install dir and pip install foo.whl either put things in the cache or they don't, and if they do then it's a bug, and that's all regardless of what we do here.

Owner

pfmoore commented Jun 29, 2017

Agreed it's somewhat separate, my concern is that saying "the problem is because you changed the code but left the version the same" may not be an acceptable answer after we make this change (whereas now, while it may not be liked, I think it's the reality). In other words, I'm not sure that at the moment "if they do, it's a bug", as we currently expect (project, version) to uniquely identify a wheel.

But either we don't worry about it for now (which may cause problems for users of the change - I'm not one of those so I can't really say) or we thrash out the correct behaviour as part of this change. No big deal either way.

Member

pradyunsg commented Jun 29, 2017

we thrash out the correct behaviour as part of this change

I'd prefer that but, also, I don't want to be a part of a (possibly long?) discussion right now.

Would it be possible to hold out further discussion until someone comes around willing to take this through till implementation? Or is there someone who is willing to right now?

Owner

pfmoore commented Jun 29, 2017

Agreed, it's an implementation issue. No need to get into details now.

Member

njsmith commented Jun 29, 2017

I just tried pip install ./directory with pip 9.0.1 and it seems to have gone via setup.py install rather than anything involving the wheel cache.

Member

pradyunsg commented Jul 7, 2017

I've labelled this issue as an "awaiting PR".

This label is essentially for indicating that further discussion related to this issue should be deferred until someone comes around to make a PR. This does not mean that the said PR would be accepted - it has not been determined whether this is a useful change to pip and that decision has been deferred until the PR is made.

JordanSlaman commented Jul 18, 2017 edited

I experienced this 'issue' also this week, and in IRC it was suggested I log my experience ITT.

We wanted to switch to a different service to prerender our pages for SEO reasons, and this service required a different custom backend for one of our dependencies: django-seo-js==0.3.1

I forked the package to a private repository and went off to get the implementation working, installing it manually into my venv. I figured changing our production requirements.txt to the git uri I installed from locally would be adequate and off it went to the live sites.

This caused a few minutes of downtime as the package did not get updated so django crashed trying to call the new custom backend, and it was resolved quickly by uninstalling the dependency on each application server and re-installing from requirements.txt, restarting uwsgi.

What would have been better is if I had known that pip decides whether to install a requirement by matching against the metadata in setup.py, so I could have changed it in my forked version.

Alternatively and as this issue suggests if pip did not keep/match metadata against local caches and instead installed from the URI provided on 'pip install' as I had assumed it would, my issue would have been prevented.

Took me a bit of googling and asking around to figure out what combination of --upgrade or --no-cache-dir might have what effects depending on what venv/system pip caches were in what state before I determined what I think is the 'proper' solution, which seems to be eloquently explained here: https://www.python.org/dev/peps/pep-0440/

Anyways just my $.02
