save dependent data matrix only once in constrained ordination object #227

Closed
9 tasks done
jarioksa opened this issue Feb 16, 2017 · 6 comments

jarioksa commented Feb 16, 2017

I suggest simplifying the internal structure of the "cca" object by saving the analysed dependent data only once. Currently we have up to three versions of the same data in the ordination object x:

  • x$pCCA$Fit: fitted values for partial terms
  • x$CCA$Xbar: residual values after partial terms
  • x$CA$Xbar: residual values after partial terms and constraints

All these matrices can be recovered from the original matrix and the appropriate QR decomposition using qr.fitted() and qr.resid(). We already do this: very often we need the fitted values after constraints (and conditions), which are not saved and must be computed as qr.fitted(x$CCA$QR, x$CCA$Xbar). QR operations are cheap: anova.cca and permutest.cca already perform several of them in each permutation. So we could considerably reduce the size of result objects at little cost in time by saving only the original adjusted matrix and using QR operations when needed. Currently that transformed dependent matrix is returned by the initXXXX() functions in R/ordConstrained.R.

A more severe problem is that we sometimes need the original transformed matrix prior to any partial terms or constraints. In most cases we can get that as x$pCCA$Fit + x$CCA$Xbar, but that does not work for dbrda(), and therefore several functions must be disabled for partial db-RDA. Saving the original input data (after initDBRDA()) would solve all these problems.
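As a rough illustration of the QR relationships described above, here is a minimal base R sketch on simulated data. It does not use vegan; the matrices Y, Z and X and all variable names are assumptions standing in for the centred response, the conditions and the constraints of an RDA-like model. It is not the code used in ordConstrained.

```r
set.seed(1)
n <- 20

# Centre the columns of a matrix (plain matrix result, no extra attributes)
centre <- function(m) sweep(m, 2, colMeans(m))

Y <- centre(matrix(rnorm(n * 4), n, 4))  # stand-in for the initial working matrix
Z <- centre(matrix(rnorm(n * 1), n, 1))  # stand-in for the conditions (partial terms)
X <- centre(matrix(rnorm(n * 2), n, 2))  # stand-in for the constraints

qrZ  <- qr(Z)            # QR decomposition of the conditions
qrZX <- qr(cbind(Z, X))  # QR decomposition of conditions and constraints together

partial_fit   <- qr.fitted(qrZ, Y)   # analogue of x$pCCA$Fit
after_partial <- qr.resid(qrZ, Y)    # analogue of x$CCA$Xbar
residual      <- qr.resid(qrZX, Y)   # analogue of x$CA$Xbar

# Fitted values of the constraints after removing the conditions
constrained_fit <- qr.fitted(qrZX, after_partial)

# The initial matrix is the sum of the partial fit and its residual
all.equal(partial_fit + after_partial, Y)

# The three components add up to the initial matrix
all.equal(partial_fit + constrained_fit + residual, Y)
```

In this sketch only Y and the two QR decompositions need to be stored; every other matrix is recomputed on demand. The sketch covers only the simple additive case and says nothing about dbrda().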

The only problem that I see with this is that the "cca" class result object will change, which means that several support functions need to be changed. We also need to check that this does not break other packages using vegan, and alert their maintainers if necessary. The current "cca" structure is not the result of very thorough consideration: I just happened to write cca.default() this way in 2002, and since then other functions have been adapted to this structure.

In vegan the following functions must be adapted to this suggested change:

  • fitted methods
  • goodness.cca (now enabled for partial dbrda)
  • inertcomp (now enabled for dbrda)
  • mso (never implemented for dbrda)
  • permutest.cca (model = "direct" enabled for partial dbrda)
  • predict.rda
  • simulate methods (not implemented for dbrda)
  • stressplot methods (now enabled for partial dbrda)
  • tolerance.cca (only meaningful for cca)
jarioksa pushed a commit that referenced this issue Feb 22, 2017
Works with the RFC #227 configuration which saves only the initial
input matrix. Does not yet work with dbrda.

jarioksa commented Feb 22, 2017

I have opened a new branch ordination-Xbar-issue-#227 which implements this change. At the first stage, I have

  • Changed the ordConstrained function to add the internal working matrix of the initial input data at the main level of the result object, as suggested in this RFC. The old items are also saved for the moment, which makes the result objects even bigger but allows the existing commands to handle them until those commands are updated to the new style. After updating, only the main-level input data will be returned.
  • Added a function that extracts any internal structure from the new result object.
  • Added a function that extracts the same internal structures from old-style result objects and that is called if an old-style object is used as input.

The current function allows the updated functions to be used with both newly created and legacy result objects. Currently the legacy result objects are handled silently, and my plan is to handle them silently in the first release (2.5-0). In later releases, old-style objects will be handled with a warning that urges users to update them, and finally we will remove the compatibility function completely. In most cases the old object mod can be updated to the new style with mod <- update(mod) (the data must be available). This should provide a rather painless transition to the new-style functions.
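The idea of one accessor serving both object styles can be sketched in base R as below. This is only an illustration with mock list objects: the helper name initial_matrix() is hypothetical, the element name Ybar is the one used later in this thread, and the legacy branch simply applies the x$pCCA$Fit + x$CCA$Xbar rule from the opening comment (which, as noted there, does not hold for dbrda()).

```r
# Hypothetical helper for illustration; not the function added in the branch
initial_matrix <- function(x) {
    # New-style object: the initial working matrix is stored at the main level
    if (!is.null(x$Ybar))
        return(x$Ybar)
    # Legacy object: residuals after partial terms are in CCA$Xbar,
    # or in CA$Xbar when the model has no constraints
    after_partial <- if (!is.null(x$CCA)) x$CCA$Xbar else x$CA$Xbar
    # Add back the fitted values of partial terms, if there were any
    if (is.null(x$pCCA)) after_partial else x$pCCA$Fit + after_partial
}

# Mock objects: plain lists that mimic the two layouts
Y   <- matrix(c(1, -1, 0, 2, -2, 0), nrow = 3)
fit <- matrix(0.5, nrow = 3, ncol = 2)

new_style <- list(Ybar = Y)
old_style <- list(pCCA = list(Fit = fit), CCA = list(Xbar = Y - fit))

all.equal(initial_matrix(new_style), initial_matrix(old_style))
```

Callers written against such an accessor do not need to know which layout they received, which is what makes a silent transition possible.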

jarioksa commented Feb 28, 2017

The suggested change will be made in PR #228. This is the first step, and it still keeps the old structure, so that for now the working data may be saved in four versions. The removal of CCA$Xbar and CA$Xbar will break the CRAN package RVAideMemoire, which accesses these items in several functions. Although only one CRAN package will be broken, I think several user scripts at large will be broken as well.

PR #228 provides function ordiYbar() that is able to extract these components with the suggested new and with the old configuration. Switching to this code will allow smooth working also with legacy result objects. To allow smooth transition, we should release ordiYbar() already in the next vegan release (2.4-3) before vegan uses its code! (There is already one future-compatibility change in the cran-2.4 tree: constrained ordinations will add assign argument to terms so that ordination objects created with 2.4-3 can be analysed still in 2.5-0.)

My plan is to have the new structure in vegan 2.5-0 (with no scheduled release date). I think we need to keep the old superfluous (and large!) items until other functions can adapt to the change, although vegan won't touch those items any longer. There will be legacy result objects around for a long time (i.e., created with previous versions of vegan and saved in the workspace), but ordiYbar will work smoothly (and silently) with them. Later we can start issuing warnings, but this need not happen before the next major release (2.6-0). In general, a legacy object mod can be updated with mod <- update(mod) if the data sets and variables are still available in the workspace. So the deprecation steps would be:

  1. Release the compatibility function ordiYbar in the next minor release (2.4-3). If the future code seems to break other packages, contact their maintainers and tell them about ordiYbar.
  2. Release the bloated result version (new $Ybar plus legacy pCCA$Fit, CCA$Xbar and CA$Xbar) in the next major version (2.5-0). Dependent packages and scripts relying on old structure will still work smoothly.
  3. In some minor version, remove the old items and only keep $Ybar, but still work smoothly with legacy objects without $Ybar. This will break dependent packages and scripts that have not switched to ordiYbar or adapted to the change.
  4. In 2.6-0 work with legacy objects without $Ybar, but issue a warning that asks to update the result objects.
  5. In the unspecified future, remove support for the legacy objects.
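Step 4 of this plan could look roughly like the following base R sketch. It uses mock list objects rather than real vegan results; the function name initial_or_warn() and the wording of the warning are assumptions made for illustration only.

```r
# Hypothetical illustration of step 4: legacy objects still work, with a warning
initial_or_warn <- function(x) {
    # New-style object: nothing to complain about
    if (!is.null(x$Ybar))
        return(x$Ybar)
    warning("legacy result object: please refresh it with mod <- update(mod)",
            call. = FALSE)
    # Rebuild the initial matrix from the legacy items (not valid for dbrda)
    after_partial <- if (!is.null(x$CCA)) x$CCA$Xbar else x$CA$Xbar
    if (is.null(x$pCCA)) after_partial else x$pCCA$Fit + after_partial
}

Y <- matrix(c(1, -1, 0, 2, -2, 0), nrow = 3)
legacy  <- list(CCA = list(Xbar = Y))   # mock legacy object without $Ybar
current <- list(Ybar = Y)               # mock new-style object

# Capture the warning message instead of printing it
msg <- tryCatch(initial_or_warn(legacy),
                warning = function(w) conditionMessage(w))
msg
```

Step 5 would then amount to replacing the warning() call with stop(), or dropping the legacy branch altogether.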

Comments?

jarioksa pushed a commit that referenced this issue Mar 2, 2017
Change in CCA object structure (saving Xbar, issue #227)

jarioksa commented Mar 2, 2017

Merging of PR #228 closes the implementation part of this issue. The transition strategy part is still open.

jarioksa pushed a commit that referenced this issue Mar 20, 2017
Works with the RFC #227 configuration which saves only the initial
input matrix. Does not yet work with dbrda.

(cherry picked from commit ba860a5)
@jarioksa

Function ordiYbar() was cherry-picked to the cran-2.4 branch and will be published with the upcoming 2.4-3 CRAN release. This means that from 2.4-3 onwards users and developers can start the transition away from the Xbar objects before they are removed.

@jarioksa jarioksa added this to the 2.5-0 milestone May 17, 2017
@jarioksa

TODO: run package tests against dependent packages to find the maintainers of dependent packages that need to be pre-alerted for the change.

@jarioksa

Vegan 2.5-0 will still be compatible with the old object structure. The old Xbar will be available although we don't use it any longer, and will disappear only later during the development. Breakage is expected, but not yet.

jarioksa pushed a commit that referenced this issue Apr 23, 2018
We have warned on this move since vegan 2.4-3 and provided tools
(ordiYbar) to find these elements both in the old and current
versions of vegan. In initial tests back then RVAideMemoire package
accesses these items directly, but it was fixed soon after this change.

See github issue #227.