Consistent ways to euclidify dissimilarities #179

jarioksa · 2016-06-03T08:39:31Z

We have dissimilarity-based methods that can handle negative eigenvalues in a correct and consistent way: dbrda, capscale, wcmdscale, adonis, adonis2, betadisper and varpart. Some of these functions also provide options to euclidify dissimilarities so that there are no negative eigenvalues. However, not all functions do this consistently. Another aspect is that even when functions have consistent euclidification, support functions handle the results inconsistently.

Consistent Euclidification

We use two ways to euclidify dissimilarities: Lingoes or Cailliez adjustment of dissimilarities (argument add) and taking the square root of dissimilarities (argument sqrt.dist). Lingoes and Cailliez adjustments are guaranteed to produce euclidified dissimilarities, whereas sqrt.dist has no guarantee, but still seems to work for most commonly used ecological dissimilarities. Currently functions dbrda, capscale and wcmdscale have Lingoes and Cailliez adjustment, but adonis2 and betadisper do not. They should be made consistent. The internal sqrt.dist is available in dbrda, capscale and varpart, but not in others. It should be added at least to adonis2 and betadisper. I am not sure about wcmdscale. Moreover, I want to leave adonis as a legacy function, and only have these new features in adonis2.
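To make the difference between the two approaches concrete, here is a minimal sketch that computes the eigenvalues by hand instead of going through the vegan functions named above. The helper gower_eig is hypothetical and not part of vegan; the Lingoes step uses the textbook form (add twice the absolute value of the most negative eigenvalue to the squared off-diagonal dissimilarities), which I assume corresponds to add = "lingoes" but is not copied from vegan's internals.

```r
library(vegan)
data(dune)
d <- vegdist(dune)  # Bray-Curtis, same data as in the examples of this issue

## Hypothetical helper (not in vegan): eigenvalues of the Gower
## double-centred matrix of a dissimilarity object
gower_eig <- function(d) {
    m <- -0.5 * as.matrix(d)^2
    n <- nrow(m)
    J <- diag(n) - matrix(1/n, n, n)   # centring matrix
    eigen(J %*% m %*% J, symmetric = TRUE, only.values = TRUE)$values
}

ev <- gower_eig(d)
min(ev)                        # negative: d is not Euclidean

## Lingoes-type adjustment: constant is |most negative eigenvalue|
const <- abs(min(ev))
dlin <- sqrt(d^2 + 2 * const)
min(gower_eig(dlin))           # zero within numerical tolerance

## Square root: no general guarantee, so inspect the result
min(gower_eig(sqrt(d)))
```

The additive constant shifts every non-trivial eigenvalue up by the same amount, which is why it is guaranteed to work, whereas the square root has to be checked for each dissimilarity index.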

Consistent Handling of Euclidified Dissimilarities

The euclidified dissimilarities should be handled consistently when users access result objects. There are several inconsistencies. I have just started to search for these inconsistencies, and I do not know how much work there is ahead. We should first decide on a policy and then implement it. Here are some examples of inconsistencies:

library(vegan)
data(dune)
d <- vegdist(dune)
mlin <- capscale(d ~ 1, add = "lingoes")
msq <- capscale(d ~ 1, sqrt.dist = TRUE)
par(mar=c(4,4,1,1)+.1, mfrow=c(1,2))
stressplot(mlin, k=19, main = "Lingoes")
stressplot(msq, k=19, main = "sqrt.dist")

The Lingoes-adjusted plot uses the unadjusted input dissimilarities on the x-axis (and for the red line of perfect fit) and shows the adjusted values only for ordination distances (points above the 1:1 red line), whereas sqrt.dist shows the observed dissimilarities after adjustment. This is a policy issue, but I think both should regard the adjustment as an internal operation: the x-axis should show the observed dissimilarities as in the input, and the y-axis the ordination distances after adjustment.

The treatment is also inconsistent among methods:

## continue after previous: we need mlin
mdeb  <- dbrda(d ~ 1, add = "lingoes")
range(d - residuals(mlin))
## [1] -1.609823e-15  2.498002e-15
range(d - residuals(mdeb))
## [1] -0.26792848 -0.09250691

In capscale (object mlin) we remove the Lingoes adjustment and return estimates of observed dissimilarities, whereas with dbrda we return Lingoes-adjusted values. This is a policy issue and we can argue for either case. However, I argue for regarding these adjustments (additive constant, square root) as internal and always returning the same kind of data as was input. That is, with non-Euclidean dissimilarities, we should get back non-Euclidean dissimilarities instead of euclidified ones.
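What "internal" would mean in practice can be sketched as a round trip: apply the adjustment, then undo it before handing values back to the user. The helpers add_lingoes and remove_lingoes below are hypothetical names for illustration only, not vegan functions, and the constant is an arbitrary illustrative value, not the one vegan would estimate for these data.

```r
library(vegan)
data(dune)
d <- vegdist(dune)

## Hypothetical helpers (not in vegan): apply and undo a Lingoes-type
## additive constant on the dissimilarities
add_lingoes <- function(d, const) sqrt(d^2 + 2 * const)
remove_lingoes <- function(dadj, const) sqrt(pmax(dadj^2 - 2 * const, 0))

const <- 0.05                  # arbitrary constant, for illustration only
dadj <- add_lingoes(d, const)

## Undoing the adjustment recovers the input dissimilarities
range(as.vector(d) - as.vector(remove_lingoes(dadj, const)))

## The square root adjustment is undone by squaring
range(as.vector(d) - as.vector(sqrt(d)^2))
```

Both ranges should be zero up to rounding error, which is the behaviour I would expect from fitted and residuals under the proposed policy.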

now stressplot of sqrt.dist and Lingoes/Cailliez adjusted are consistent, and stressplots for dbrda and capscale are consistent. This was one of the concerns in issue #179.

solves problems in github issue #179: 1) sqrt.dist= and add= adjustments are regarded as internal to the method and fitted etc. remove these in dbrda & capscale and also when showing observed statistics in stressplot etc. 2) functions dbrda, capscale, varpart, adonis2 and betadisper all have similar sqrt.dist= and add= arguments.

jarioksa · 2016-06-05T05:23:35Z

These issues were solved with 5b7fb1e.

jarioksa added this to the 2.4-0 milestone Jun 3, 2016

jarioksa pushed a commit that referenced this issue Jun 4, 2016

remove euclidifying constant in fitted.dbrda (effects also residuals)

7123934

now fitted & residuals are consistent in capscale and dbrda which was one of the concerns in issue #179 in github.

jarioksa pushed a commit that referenced this issue Jun 4, 2016

adonis2 gained option to use sqrt.dist or euclidifying constants

b25e933

addresses issue #179 in github

jarioksa pushed a commit that referenced this issue Jun 4, 2016

betadisper gains standard options to avoid negative eigenvalues

177fbd3

addresses issue #179

jarioksa closed this as completed Jun 5, 2016

ChrisKeefe mentioned this issue Dec 23, 2020

Correct negative eigenvalues for PCoA on semi-metric distances qiime2/q2-diversity#302

Open
