Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

more detailed explanation of Matrix/TMB compatibility requirements? #387

Closed
bbolker opened this issue Oct 10, 2023 · 17 comments
Closed

more detailed explanation of Matrix/TMB compatibility requirements? #387

bbolker opened this issue Oct 10, 2023 · 17 comments

Comments

@bbolker
Copy link
Collaborator

bbolker commented Oct 10, 2023

Hi,

I'm well aware of the need for Matrix/TMB compatibility, as added in this commit.

This is a continuing source of friction in packages/stacks that use TMB (e.g. wham, macpan2, glmmTMB. It's very easy to make the warning go away if you have development tools installed and you know what to do, but despite our attempts to document this in glmmTMB (e.g. here, here), it continues to be a source of concern and confusion for end-users. When this comes up on the r-devel list, we've tried to suggest that there be a way to ask CRAN to re-build binary packages when a new version of a lower-level package (Matrix > TMB > glmmTMB) is posted, but we haven't had much response. The other alternative is to make sure that we bump the versions ourselves and re-post to CRAN as necessary, but so far this also seems to be a source of friction (i.e. in order to do this we have to at least make sure that we are up to date on all CRAN checks, run tests on r-hub etc., check downstream packages for breakage ...)

We could try to reduce developer friction, e.g. by making it easier for ourselves to push new glmmTMB versions (e.g. we could maintain a tag/branch with the HEAD equal to the last CRAN submission, or even keep the last-submitted tarball around, so that we could resubmit it without doing any more checks).

However, it would be really useful to me at least to understand the exact reason why this incompatibility arises, to be able to do a better job explaining and resolving the issue ...

If this is more suited for the glmmTMB issues list, or the TMB mailing list, please let me know.

@kaskr
Copy link
Owner

kaskr commented Oct 10, 2023

I don't remember the exact details, but the motivation for making the Matrix package version test was definitely a binary incompatibility issue (perhaps in CHM_FR or AS_CHM_FR). What I do remember is, that the problem could be triggered by installing TMB using one Matrix version and then switching to another Matrix version. After that, everything would fail, even TMB::runExample("simple").

Good news is that the problem is hard to reproduce today. If I install TMB using Matrix 1.6-1, I can switch to any other Matrix version without breaking TMB! (I tested down to Matrix 1.2-3 from 2015-11-19, the earliest I can install).

Based on this, I think the version test isn't justified anymore and can be omitted by default.

kaskr added a commit that referenced this issue Oct 12, 2023
- Use options("TMB.check.Matrix"=TRUE) to re-enable
@kaskr
Copy link
Owner

kaskr commented Oct 13, 2023

@bbolker I was wrong - we really need to keep the version test. Here's an example:

I start out with consistent build from source of the 3 packages:

install.packages(c("Matrix","TMB","glmmTMB"))
## trying URL 'https://cloud.r-project.org/src/contrib/Matrix_1.6-1.1.tar.gz'
## trying URL 'https://cloud.r-project.org/src/contrib/TMB_1.9.6.tar.gz'
## trying URL 'https://cloud.r-project.org/src/contrib/glmmTMB_1.1.8.tar.gz'

I can run example(glmmTMB) without problems. But then I upgrade Matrix to the current development version (presumably next release):

install.packages("Matrix", repos="http://R-Forge.R-project.org")
## trying URL 'http://R-Forge.R-project.org/src/contrib/Matrix_1.6-2.tar.gz'

Now TMB is broken:

> library(glmmTMB)
Warning message:
In checkMatrixPackageVersion() : Package version inconsistency detected.
TMB was built with Matrix version 1.6.1.1
Current Matrix version is 1.6.2
Please re-install 'TMB' from source using install.packages('TMB', type = 'source') or ask CRAN for a binary version of 'TMB' matching CRAN's 'Matrix' package
> example(glmmTMB)

glmTMB> ## No test: 
glmTMB> (m1 <- glmmTMB(count ~ mined + (1|site),
glmTMB+   zi=~mined,
glmTMB+   family=poisson, data=Salamanders))
Error in fitTMB(TMBStruc) : 
  negative log-likelihood is NaN at starting parameter values

I follow the advice of the warning:

install.packages('TMB', type = 'source')

and it fixes the problem! (i.e. example(glmmTMB) works again).

kaskr added a commit that referenced this issue Oct 13, 2023
@bbolker
Copy link
Collaborator Author

bbolker commented Oct 13, 2023

Thanks. It would still be nice to know why it fails, as I don't ... although at least this nice reproducible example will make it a little easier to investigate? (There's nothing obvious in the Matrix devel-version NEWS file that indicates API breakage ...)

@kaskr
Copy link
Owner

kaskr commented Oct 13, 2023

There's no API change. That is demonstrated by the fact that I can recompile TMB and make things work again.

In case of the Matrix package, the 'API' contains a number of macros e.g. AS_CHM_SP (x) := as_cholmod_sparse(...).
It seems the new Matrix package no longer has as_cholmod_sparse. Perhaps it has been renamed... In any case, the API AS_CHM_SP(x) hasn't changed. However, the binary 'TMB.so' compiled with old Matrix is not compatible with new 'Matrix.so'.
This analysis might not be entirely accurate, but the important point is that a binary incompatibility can arise without there being an API change.

@jaganmn
Copy link
Contributor

jaganmn commented Oct 17, 2023

Yes, Matrix 1.6-2 fixes some bugs in its headers, resulting in ABI changes. Here are the NEWS entries concerning the ABI or API:

      \item The prototype of API function \code{M_cholmod_band_inplace}
      was wrongly copied from \code{cholmod_band},
      instead of from \code{cholmod_band_inplace}.

      \item Many API function prototypes wrongly used \code{const}
      qualifiers where the registered routines do not.

      \item Some API declarations and macros not used by \emph{any}
      reverse \code{LinkingTo} are removed or remapped.

      \item API headers are now nested under \file{inst/include/Matrix/}
      for better namespacing.  Where possible, packages should start to
      use \code{LinkingTo: Matrix (>= 1.6-2)} and include files from the
      new subdirectory, e.g., with \code{#include <Matrix/Matrix.h>}.

      \item Users including API headers can define macro
      \code{R_MATRIX_INLINE},
      typically with \code{#define R_MATRIX_INLINE inline},
      to allow the compiler to inline stubs for registered routines.

So, indeed, 1.6-2 breaks a long run of Matrix releases without ABI changes. But that does not imply that TMB should continue to warn about any version mismatch by default.

When we release new versions of Matrix, it is our responsibility to notify the CRAN team about ABI changes and to ask for rev. dep. binaries to be rebuilt. And we will do that when we submit 1.6-2.

But rebuilds do not affect package versions: currently, users must "know" to use install.packages and not update.packages to get the latest build. That is not a reasonable expectation, so it is also our responsibility to ask rev. dep. maintainers affected by ABI changes to submit an update coinciding with the next Matrix release.

The updates serve only to ensure that update.packages installs the latest builds, so they should be minimal, typically only changing the Date and Version fields in DESCRIPTION and ideally also changing to LinkingTo: Matrix (>= x.y-z), if x.y-z is the new version. Such updates do not require extensive rev. dep. checking.

This whole situation should be considered exceptional, happening only if we update SuiteSparse or discover bugs and only until CRAN starts tagging builds. For as long as I am/have been a co-author, we do not change the ABI without determining and consulting affected rev. dep. maintainers. (And had I not seen this thread I'd have notified you about the 1.6-2 changes. Well, you might still receive an e-mail.)

I am really not convinced that TMB and others should continue defending against problems that arise only exceptionally and which should be handled as much as possible by the maintainers introducing ABI changes (by asking CRAN for rebuilds, rev. dep. maintainers for version bumps) and as little as possible by end users.

@jaganmn
Copy link
Contributor

jaganmn commented Oct 18, 2023

Perhaps TMB could change to warning about version mismatch only when the mismatch involves Matrix versions known to introduce an ABI change. For now, the list would contain only 1.6-2, unless you recall the much older (and now probably irrelevant) instances. When we update SuiteSparse in 1.7-0, we may ask you to add 1.7-0, and so on ...

@kaskr
Copy link
Owner

kaskr commented Oct 18, 2023

@jaganmn Thanks for dealing so thoroughly with this issue! I'm going to follow your advice to only check the few versions that might cause trouble.

@jaganmn
Copy link
Contributor

jaganmn commented Oct 18, 2023

Eventually (actually TODO for 1.6-2), Matrix will associate with three "versions" detectable from C and R code: a package version, a SuiteSparse version, and an ABI version. The ABI version will obviate the need for a hard-coded list of ABI-breaking package versions.

@jaganmn
Copy link
Contributor

jaganmn commented Oct 18, 2023

Done now, with NEWS:

      \item New \R{} function \code{Matrix.Version}, taking no
      arguments and returning \code{list(Package, ABI, SuiteSparse)}
      giving the numeric versions of each.  ABI versioning is new:
      the ABI version is 1 in this release and will be incremented
      by 1 in each future release that changes the ABI.
      Versions and their components are defined for packages with
      \code{LinkingTo: Matrix}.  See \file{inst/include/Matrix/Matrix.h}.

So you can start comparing ABI versions defined backwards compatibly as:

abi <-
    if (packageVersion("Matrix") >= "1.6-2")
        Matrix.Version()[["ABI"]]
    else numeric_version("0")

Edit: I've just changed to list(package, abi, suitesparse), i.e., to using lower case, in order to match R.Version. So you'll want Matrix.Version()[["abi"]].

@kaskr
Copy link
Owner

kaskr commented Oct 20, 2023

As a reminder for myself, the new Matrix::Matrix.Version() can give the ABI for the current installation of Matrix, but it doesn't provide a way to compare ABIs across different Matrix versions.

> Matrix:::Matrix.Version()
$package
[1] ‘1.6.2’

$abi
[1] ‘1’

$suitesparse
[1] ‘5.10.1’

In the future we should store and test on Matrix 'abi' rather than 'package version' (there is no way to lookup ABI from package version).

@tillea
Copy link

tillea commented Nov 7, 2023

Just to add a further data point in how far this strict versioned dependency is unfortunate for consumers of this code. In Debian there is a bug report about this where we are discussing the consequences for the Debian distribution. It would be great if some well designed tests inside TMB the strict compatibility requirements could be dropped.
Kind regards, Andreas.

@jaganmn
Copy link
Contributor

jaganmn commented Nov 7, 2023

And ideally TMB would maintain its own ABI version for use by its reverse dependencies. Then they can stop depending so strictly on package versions, too, assuming that TMB is a good citizen and does not change its ABI with every release.

@tillea
Copy link

tillea commented Nov 8, 2023

And ideally TMB would maintain its own ABI version for use by its reverse dependencies. Then they can stop depending so strictly on package versions, too, assuming that TMB is a good citizen and does not change its ABI with every release.

This would be extremely helpful.

jaganmn referenced this issue in satijalab/seurat-object Nov 8, 2023
Add check on package attachment to see if dependencies have udpated.
These updates may cause runtime binary breaks in SeuratObject, but still allow
SeuratObject to be loaded, so this check happens at attachment
@kaskr
Copy link
Owner

kaskr commented Nov 9, 2023

Point taken - there should ideally be an identifier (let's call it ABI version) to signify when reverse LinkingTo: TMB must recompile to avoid unnecessary rebuilds.
It just bothers me that such version tags are prone to human errors. Learning from past mistakes, ABI breakage can be tricky to predict.
I would also have appreciated some official guidelines from the 'Writing R Extensions' manual on how to access the ABI version (Why not put it in the DESCRIPTION file? Why not use it in install.packages etc. to prevent incompatible updates?).
Anyway, I'll probably follow the approach of Matrix and add a TMB.Version() which can then be used by glmmTMB. However, @tillea note that it won't let the problem go away completely - just makes it happen less frequently.

@tillea
Copy link

tillea commented Nov 9, 2023

However, @tillea note that it won't let the problem go away completely - just makes it happen less frequently.

Perfectly understandable - thanks for working on making this less frequently. Andreas.

@bbolker
Copy link
Collaborator Author

bbolker commented Apr 2, 2024

Just wanted to bump this - would be nice to have a TMB.Version() modeled after Matrix::Matrix.Version() ...

kaskr added a commit that referenced this issue Apr 3, 2024
@kaskr
Copy link
Owner

kaskr commented Apr 4, 2024

@bbolker Just added TMB.Version() so closing this...

However, it seems to me like a lot of machinery to solve an almost non-existing problem: The list of ABI breaking versions TMB:::abi.break is tiny, and AFAICT only one of these is 'forward breaking' (1.8.0).

@kaskr kaskr closed this as completed Apr 4, 2024
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

No branches or pull requests

4 participants