Newest version of package not installed unless specified #24199

Open
mikecormier opened this issue Sep 2, 2020 · 10 comments

Comments

@mikecormier
Member

When installing a package such as ggd, an older version is installed unless the newest version is requested explicitly. For example, conda install -c bioconda ggd installs version 0.1.2, which is a few versions behind the latest. This problem has been seen with multiple packages from bioconda.

Additionally, when a package is updated or removed, other packages are downgraded to older versions even though the package being updated or removed does not depend on them. For example, when a package such as vep is updated or removed, ggd is downgraded to version 0.1.2 even though vep does not depend on ggd or on any of ggd's dependencies.

This seems to be the same version-priority problem seen when installing a specific package like ggd. How do we set version priority for a package so that it doesn't revert to an older version unless told to do so?

@dpellow
Contributor

dpellow commented Sep 10, 2020

@mikecormier - have you found a solution to this problem?

jmarshall added a commit to jmarshall/pysam that referenced this issue Oct 19, 2020
The broken-openssl/libcrypto condapocalypse (see conda/conda#9905,
bioconda/bioconda-recipes#24199, et al) has now caught up with samtools
et al 1.9. Update pinnings to (at least) the current version, 1.11.

Remove the ncurses pinning, which has been unnecessary since
bioconda/bioconda-recipes#17273.
jmarshall added a commit to jmarshall/pysam that referenced this issue Oct 19, 2020
The broken-openssl/libcrypto condapocalypse (see conda/conda#9905,
bioconda/bioconda-recipes#24199, et al) has now caught up with samtools
et al 1.9. Update pinnings to (at least) the current version, 1.11.

Remove the ncurses pinning, which has been unnecessary since
bioconda/bioconda-recipes#17273. Instead pin the blas variant as
the openblas variant that bcftools (with GSL) was built against.
jmarshall added a commit to pysam-developers/pysam that referenced this issue Oct 19, 2020
The broken-openssl/libcrypto condapocalypse (see conda/conda#9905,
bioconda/bioconda-recipes#24199, et al) has now caught up with samtools
et al 1.9. Update pinnings to (at least) the current version, 1.11.

Remove the ncurses pinning, which has been unnecessary since
bioconda/bioconda-recipes#17273. Instead pin the blas variant as
the openblas variant that bcftools 1.11(+GSL) was built against.
@jmarshall
Member

This is a significant general problem impacting Conda's usefulness, also reported as conda/conda#9905 and having received some discussion there.

There are many other similar reports here, e.g., #24621, #24320, #24264, #24182, #24033, and #22824.

@jmarshall
Member

jmarshall commented Dec 14, 2020

@bgruening @dpryan79 @bioconda/core: This bug is causing confusion for users every day (most recently as reported in samtools/htslib#1192) and has been ongoing for months. Whether this is properly a conda problem (cf. conda/conda#9905) or something that could in principle be fixed with updates to the bioconda repo metadata files, please pin this issue and add an explanation of the workarounds (check the version you're offered and explicitly specify the latest version if necessary; or, apparently, use mamba) and ideally an overview of the roadmap towards fixing this (if there is one).

@dpryan79
Contributor

@jmarshall As you're aware, we have no control over this issue. I'm happy to pin this, but be aware that it will not help since regular users almost never read such things. As you indicated, the only solutions are (1) to specify the exact version you want or (2) to use mamba, which lacks this annoying quirk.

@dpryan79 dpryan79 pinned this issue Dec 15, 2020
@jmarshall
Member

jmarshall commented Jan 12, 2021

Conda create/install starts by downloading a trimmed-down current_repodata.json metadata file from each channel, and by default only downloads the full repodata.json files if it needs to. This issue occurs when the newest package version cannot be satisfied via current_repodata.json but an older version can, so it uses that instead of retrying with the full repodata.json. Hence it can be worked around in any of the following ways:

  1. Ask for the particular newest version (to prevent the older version from succeeding):
    conda create/install package==version

  2. Skip the current_repodata.json attempt entirely:
    CONDA_REPODATA_FNS=repodata.json conda create/install package

  3. Use mamba instead of conda
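
The first two workarounds can also be scripted. The following is a minimal Python sketch, assuming conda is on PATH; the helper name conda_install_command is hypothetical, and it only builds the command line and environment without running anything:

```python
import os


def conda_install_command(package, version=None, full_repodata=False, base_env=None):
    """Build (argv, env) for a conda install using the workarounds above.

    Hypothetical helper: it only constructs the command, it does not run it.
    """
    env = dict(os.environ if base_env is None else base_env)
    # Workaround 1: name the exact version so an older one cannot satisfy the request.
    spec = f"{package}=={version}" if version else package
    if full_repodata:
        # Workaround 2: skip current_repodata.json and read repodata.json directly.
        env["CONDA_REPODATA_FNS"] = "repodata.json"
    return ["conda", "install", "--yes", spec], env


argv, env = conda_install_command("htslib", version="1.11", full_repodata=True, base_env={})
print(argv)
print(env)
```

To actually perform the install, the pair could be passed to subprocess.run(argv, env=env, check=True), with base_env left as None so that the rest of the environment (PATH and so on) is inherited.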

@jmarshall
Member

jmarshall commented Jan 12, 2021

But we're here to fix this problem, not just provide awkward workarounds that users would need to use 😄

As described in conda/conda#9905 (comment) and its followups, the problem here is that these bioconda recipes depend on superseded conda-forge package versions that are no longer in conda-forge's current_repodata.json cache. When older versions of these bioconda recipes exist that (through reduced requirements or otherwise) do not depend on packages omitted from conda-forge's current_repodata.json, those older versions are selected for installation, leading to user disappointment. Installation does not fall back to retrieving conda-forge's complete repodata.json, which would allow it to select the desired up-to-date bioconda packages and their now-visible conda-forge dependencies.

For concreteness, consider htslib and libdeflate. The htslib-1.11 package was first built on 2020-09-23 against the then-current libdeflate-1.6. At that time, libdeflate-1.6 was the latest version and was in conda-forge's current_repodata.json, and conda create -n tmp htslib delivered htslib-1.11 as expected. However on 2020-11-12 conda-forge released libdeflate-1.7 and libdeflate-1.6 fell out of their current_repodata.json. From then onwards, conda create -n tmp htslib has installed an older htslib that happens to depend on an older libdeflate from the bioconda channel.

(If the dependent package were also in conda-forge, then the construction of current_repodata.json would ensure that the particular packages it requires were also listed in current_repodata.json. But this does not automatically work across separate channels.)
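
The htslib/libdeflate timeline can be reproduced with a toy model. This is only a sketch of the behaviour described above, not conda's solver: dependencies are exact (name, version) pairs rather than ranges, the older htslib 1.9 / libdeflate 1.0 pair is made up for illustration, and the older packages are simply assumed to be visible in bioconda's trimmed index.

```python
def version_key(version):
    """'1.11' -> (1, 11), so that 1.11 sorts above 1.9."""
    return tuple(int(part) for part in version.split("."))


def solve(name, index):
    """Newest version of `name` whose exact dependencies are all present in `index`."""
    versions = sorted((v for (n, v) in index if n == name), key=version_key, reverse=True)
    for version in versions:
        if all(dep in index for dep in index[(name, version)]):
            return version
    return None


def conda_like_install(name, current, full):
    """Use the trimmed index if it yields any answer; otherwise retry with the full one."""
    return solve(name, current) or solve(name, full)


# (name, version) -> exact dependencies. The htslib 1.9 / libdeflate 1.0 pair is made up.
bioconda = {
    ("htslib", "1.11"): [("libdeflate", "1.6")],
    ("htslib", "1.9"): [("libdeflate", "1.0")],
    ("libdeflate", "1.0"): [],
}
forge_full = {("libdeflate", "1.6"): [], ("libdeflate", "1.7"): []}
forge_current_september = {("libdeflate", "1.6"): []}  # libdeflate 1.6 is the latest
forge_current_november = {("libdeflate", "1.7"): []}   # 1.6 has been trimmed out

full = {**bioconda, **forge_full}
before = conda_like_install("htslib", {**bioconda, **forge_current_september}, full)
after = conda_like_install("htslib", {**bioconda, **forge_current_november}, full)
print(before, after)
```

In this model the first call picks htslib 1.11 and the second picks 1.9, because the trimmed index still produces an answer and the fallback to the full index is never reached.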

In lieu of improving the current_repodata.json cache mechanism, bioconda can avoid this problem by ensuring (via manually-enforced policy) that its current builds of these packages depend only on conda-forge packages that actually are listed in conda-forge's current_repodata.json. There are several possible ways to ensure this (again, for concreteness consider htslib and libdeflate):

  1. Whenever conda-forge release a new version of libdeflate, rebuild the current htslib against that version. This is what PR #26085 (Rebuild htslib with current conda-forge libdeflate) does, and may or may not be the right thing to do depending on bioconda's pinning policy for libdeflate.

    There are currently 16 bioconda packages requiring libdeflate and I think the current latest builds of all of them depend on libdeflate >=1.6,<1.7.0a0. In particular, wiggletools 1.2.7 and 1.2.8 have been built in the last week and depend on libdeflate 1.6 despite 1.7 being the current conda-forge version at the time they were built. Conversely, PR #26085 appears to have happily built against 1.7. So I'm not sure why this PR seems to have got libdeflate-1.7 successfully, and perhaps the final build would get 1.6 similarly to the wiggletools ones… [Edited to add: Wiggletools build-depends on htslib, so building it pulls in the pinned htslib build, which at present pulls in libdeflate-1.6. That is why building wiggletools gets 1.6 but building htslib itself gets the current conda-forge libdeflate, 1.7.]

  2. Ask conda-forge to add a package whose purpose is to list package versions that bioconda wants to have available in conda-forge's current_repodata.json, e.g. libdeflate-1.6. This would be similar to the existing _current_repodata_hack which (I assume) is not intended to be installed by anyone but merely exists to ensure other packages are listed in current_repodata.json. A draft of such a package is at jmarshall/staged-recipes/…/_current_repodata_bioconda_hack.

  3. [2023 addition] Now that repodata patching is reasonably practical, it provides another solution to this problem. As long as new libdeflate releases remain compatible, existing packages can be patched to allow the use of newer libdeflates. See #24199 (comment) below for details.
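
Whichever option is chosen, the policy of depending only on packages listed in conda-forge's current_repodata.json could be checked mechanically. Here is a rough Python sketch, assuming the usual repodata layout (a "packages" mapping whose records carry "name", "version" and "depends"). The function names and file names are hypothetical, and the matcher understands only >=, < and == clauses without build strings, dropping pre-release tags such as a0:

```python
import re


def vkey(version):
    """Numeric key for simple versions; '1.7.0a0' -> (1, 7, 0) (pre-release tag dropped)."""
    nums = [int(re.match(r"\d+", part).group()) for part in version.split(".")]
    return tuple(nums + [0] * (3 - len(nums)))


def matches(spec, version):
    """Check a version against a constraint such as '>=1.6,<1.7.0a0'."""
    for clause in spec.split(","):
        op, bound = re.match(r"(>=|<|==)(.+)", clause).groups()
        v, b = vkey(version), vkey(bound)
        if not {">=": v >= b, "<": v < b, "==": v == b}[op]:
            return False
    return True


def stale_dependencies(record, current_channel):
    """Dependencies of `record` that the channel lists by name but cannot satisfy."""
    stale = []
    for dep in record.get("depends", []):
        name, _, spec = dep.partition(" ")
        versions = [p["version"] for p in current_channel["packages"].values()
                    if p["name"] == name]
        # Names the channel does not list at all are skipped: they may live elsewhere.
        if versions and spec and not any(matches(spec, v) for v in versions):
            stale.append(dep)
    return stale


# Illustrative data only; the file name is a placeholder, not a real build.
conda_forge_current = {"packages": {
    "libdeflate-1.7-0.tar.bz2": {"name": "libdeflate", "version": "1.7"},
}}
htslib = {"name": "htslib", "version": "1.11", "depends": ["libdeflate >=1.6,<1.7.0a0"]}
print(stale_dependencies(htslib, conda_forge_current))
```

A non-empty result flags a build that would need a rebuild, a repodata patch, or an entry in the hack package.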

As noted in #17212 (comment), which of these two approaches is the appropriate one comes down to @bioconda/core's policy around libdeflate pinning (and in general pinning of other conda-forge dependencies). Is libdeflate currently pinned to 1.6? When would that be updated to 1.7?

[Edited to add: Libdeflate was pinned in the distant past, but this was removed in bioconda/bioconda-utils#610. That PR and commit don't explain why the pinning was removed — but I suspect it's because, of 16 bioconda packages that list libdeflate as a dependency, only a couple (htslib, staden_io_lib, possibly one or two others) actually ought to depend directly on libdeflate. So the answer is libdeflate is not pinned; and it's not pinned because it really doesn't need to be.]

I have a slight preference for (2) as it is more flexible and less timing-critical at the moments when packages such as libdeflate are updated. What are @bioconda/core's thoughts?

@jkbonfield

Maybe I'm missing something obvious, but looking at the Makefile it seems libdeflate 1.6 and 1.7 both have a library .so version of 0. Hence the package ABI hasn't changed and there should be no reason at all for a package to claim it depends precisely on 1.6. That's just a recipe for disaster.

Can conda not pin on library .so versions instead of package release numbers? If not, then maybe just start with the assumption that the .so version doesn't change, and pin if, and only if, an ABI-breaking change is discovered later on.

@jmarshall
Member

jmarshall commented Jan 12, 2021

@jkbonfield: Yes, conda is way too conservative here, but not having soversion-based dependency tracking infrastructure is a separate problem. (Adding such infrastructure would greatly reduce the times during which this issue appears, but it's a lot of work and a major change that would require major buy-in from the conda mothership — while this proposal is a simple policy that bioconda could implement today.)

@jmarshall
Member

jmarshall commented Jan 15, 2021

For the htslib/libdeflate case: it turns out that libdeflate is not pinned at all in bioconda. Hence the correct approach to fix it (at least, until libdeflate-1.8 comes out) really is simply to bump htslib so that there exists an htslib package built against conda-forge's current libdeflate-1.7. PR #26237 does that and has been merged, so conda create -n myenv htslib/samtools/bcftools now all do the right thing using just current_repodata.json. 🎉

(It remains to work through the dozen other overlinked bioconda packages that spuriously depend on both htslib and libdeflate. By bumping them to remove their explicit libdeflate requirement — in reality, they only depend on it via htslib — they will no longer be stuck on libdeflate-1.6.)

When in the future conda-forge releases libdeflate-1.8 and libdeflate-1.7 falls out of their current_repodata.json, conda create -n myenv htslib etc will start choosing old versions again, until we bump htslib correspondingly. There are bioconda people involved in maintaining libdeflate-feedstock so this shouldn't be a huge imposition, but nonetheless there will always be a period of failure between libdeflate updates and htslib package bumps — unless we take steps to ensure the older libdeflate remains available in the union of channels' current_repodata.json files, e.g. by either of

  1. Realising that bioconda may have shot itself in the foot by migrating libdeflate 😄 and re-adding an equivalent libdeflate-1.7 (currently) package to the bioconda channel. I'm sure having packages available in multiple channels is not ideal, but does it cause problems?

  2. [Same as (2) above] Ask conda-forge to add a package whose purpose is to list package versions that bioconda wants to have available in conda-forge's current_repodata.json, e.g. libdeflate-1.6 and/or libdeflate-1.7. This would be similar to the existing _current_repodata_hack which (I assume) is not intended to be installed by anyone but merely exists to ensure other packages are listed in current_repodata.json. A draft of such a package is at jmarshall/staged-recipes/…/_current_repodata_bioconda_hack.
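
To see why a hack package would keep an older libdeflate listed, consider a toy model of how a trimmed index might be assembled: keep the newest version of every package name, then add whatever those versions require from the same channel. This is an assumption-laden sketch rather than conda's actual index-building code, dependencies are exact (name, version) pairs, and the hack package's version number is made up:

```python
def version_key(version):
    """'1.7' -> (1, 7) for numeric comparison."""
    return tuple(int(part) for part in version.split("."))


def trim(channel):
    """Toy current_repodata: newest version of each name, plus what those versions require."""
    newest = {}
    for name, version in channel:
        if name not in newest or version_key(version) > version_key(newest[name]):
            newest[name] = version
    keep = set(newest.items())
    # Pull in same-channel dependencies of everything kept so far, transitively.
    todo = list(keep)
    while todo:
        for dep in channel[todo.pop()]:
            if dep in channel and dep not in keep:
                keep.add(dep)
                todo.append(dep)
    return keep


conda_forge = {("libdeflate", "1.6"): [], ("libdeflate", "1.7"): []}
print(sorted(trim(conda_forge)))

# Add a hack package (version number hypothetical) that requires libdeflate 1.6.
with_hack = dict(conda_forge)
with_hack[("_current_repodata_bioconda_hack", "1.0")] = [("libdeflate", "1.6")]
print(sorted(trim(with_hack)))
```

In the first case only libdeflate 1.7 survives trimming; with the hack package present, libdeflate 1.6 is retained alongside it.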


For the other packages for which older versions are still being installed, it remains to identify the particular critical conda-forge package that is causing the problem and bump the affected package (if there are no pinning considerations) and/or resolve the problem appropriately. I might investigate for blast.

@jmarshall
Member

jmarshall commented Apr 28, 2023

These days repodata patching provides another alternative for dealing with this problem, in cases like libdeflate where there is in fact forward compatibility and the library's soversion has not changed.

  • Pinning an exact dependency on a particular minor version of libdeflate, as conda's build system does, is artificial: really any version of libdeflate from the minor version against which a package was built through to the most recent libdeflate release will do (until they bump their soversion one day).

    Repodata patching solves the “predicting the future” problem as packages built months or years ago can have their repodata patched and the dependency widened as each new release of libdeflate appears. Hence packages will never have a tight dependency on a version of libdeflate that has fallen out of conda-forge's current_repodata.json.

PR #40675 applies this to htslib, staden_io_lib, and fastp, which are the main bioconda packages directly using libdeflate.
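
In outline, such a patch rewrites the depends entry of an existing record. The sketch below is an illustration in Python, not the actual patching code from the PR; the function name widen_dependency is hypothetical, and the widened bound <1.8.0a0 is an assumed example of what admitting libdeflate 1.7 might look like:

```python
def widen_dependency(record, name, new_spec):
    """Return a copy of a repodata record with the constraint on `name` replaced."""
    patched = dict(record)
    patched["depends"] = [
        f"{name} {new_spec}" if dep.split(" ")[0] == name else dep
        for dep in record["depends"]
    ]
    return patched


# Illustrative record; only the libdeflate constraint is taken from the discussion above.
htslib = {
    "name": "htslib",
    "version": "1.11",
    "depends": ["libdeflate >=1.6,<1.7.0a0", "zlib"],
}
patched = widen_dependency(htslib, "libdeflate", ">=1.6,<1.8.0a0")
print(patched["depends"])
```

The original record is left untouched, which mirrors the idea that the package itself is not rebuilt and only its metadata changes.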
