Consistent ways to euclidify dissimilarities #179

Closed
jarioksa opened this issue Jun 3, 2016 · 1 comment

jarioksa commented Jun 3, 2016

We have dissimilarity-based methods that can handle negative eigenvalues in a correct and consistent way: dbrda, capscale, wcmdscale, adonis, adonis2, betadisper and varpart. Some of these functions also provide options to euclidify dissimilarities so that there are no negative eigenvalues. However, not all functions do this consistently. Moreover, even when functions euclidify consistently, their support functions handle the results inconsistently.

Consistent Euclidification

We use two ways to euclidify dissimilarities: Lingoes or Cailliez adjustment of dissimilarities (argument add) and taking the square root of dissimilarities (argument sqrt.dist). Lingoes and Cailliez adjustments are guaranteed to produce euclidified dissimilarities, whereas sqrt.dist has no guarantee, but still seems to work for most commonly used ecological dissimilarities. Currently functions dbrda, capscale and wcmdscale have Lingoes and Cailliez adjustment, but adonis2 and betadisper do not. They should be made consistent. The internal sqrt.dist is available in dbrda, capscale and varpart, but not in the others. It should be added at least to adonis2 and betadisper. I am not sure about wcmdscale. Moreover, I want to leave adonis as a legacy function, and only have these new features in adonis2.
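To make the difference between the two approaches concrete, here is a minimal base R sketch of the arithmetic behind them. It is not vegan's internal code: the helper names gower_eig and lingoes_adjust are hypothetical, and the 3 x 3 matrix is a made-up toy that deliberately violates the triangle inequality. The Lingoes constant is taken as minus the smallest eigenvalue of the double-centred matrix, and the adjusted values are sqrt(d^2 + 2 * constant).

```r
## Toy dissimilarities (made up): 5 > 1 + 1 breaks the triangle inequality
m <- matrix(c(0, 1, 5,
              1, 0, 1,
              5, 1, 0), nrow = 3, byrow = TRUE)
d <- as.dist(m)

## Eigenvalues of the double-centred matrix of -d^2/2 (hypothetical helper)
gower_eig <- function(d) {
    a <- -0.5 * as.matrix(d)^2
    n <- nrow(a)
    j <- diag(n) - matrix(1/n, n, n)    # centring matrix
    eigen(j %*% a %*% j, symmetric = TRUE, only.values = TRUE)$values
}

## Lingoes-type adjustment (hypothetical helper, not a vegan function)
lingoes_adjust <- function(d) {
    ac <- max(0, -min(gower_eig(d)))    # additive constant
    m <- sqrt(as.matrix(d)^2 + 2 * ac)
    diag(m) <- 0                        # self-dissimilarities stay zero
    as.dist(m)
}

## Square root transformation of the same dissimilarities
d_sqrt <- as.dist(sqrt(as.matrix(d)))

min(gower_eig(d))                  # negative: input is not Euclidean
min(gower_eig(lingoes_adjust(d)))  # zero up to rounding error
min(gower_eig(d_sqrt))             # still negative for this toy input
```

On this artificial input the square root does not remove the negative eigenvalue, which illustrates why it carries no guarantee. The toy says nothing about how sqrt.dist behaves with real ecological dissimilarities.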

Consistent Handling of Euclidified Dissimilarities

The euclidified dissimilarities should be handled consistently when users access result objects. There are several inconsistencies. I have just started to search for them, and I do not know how much work lies ahead. We should first decide on a policy and then implement it. Here are some examples of inconsistencies:

library(vegan)
data(dune)
d <- vegdist(dune)
mlin <- capscale(d ~ 1, add = "lingoes")
msq <- capscale(d ~ 1, sqrt.dist = TRUE)
par(mar=c(4,4,1,1)+.1, mfrow=c(1,2))
stressplot(mlin, k=19, main = "Lingoes")
stressplot(msq, k=19, main = "sqrt.dist")

[Figure: stressplot output for the two models, panels "Lingoes" and "sqrt.dist"]

The Lingoes-adjusted plot uses the unadjusted input dissimilarities as the x-axis (and for the red line of perfect fit) and shows the adjusted values only for ordination distances (points above the 1:1 red line), whereas sqrt.dist shows the observed dissimilarities after adjustment. This is a policy issue, but I think both should regard the adjustment as an internal operation: the x-axis should show the observed dissimilarities as they were input, and the y-axis the ordination distances after adjustment.

The treatment is also inconsistent among methods:

## continue after previous: we need mlin
mdeb  <- dbrda(d ~ 1, add = "lingoes")
range(d - residuals(mlin))
## [1] -1.609823e-15  2.498002e-15
range(d - residuals(mdeb))
## [1] -0.26792848 -0.09250691

In capscale (object mlin) we remove the Lingoes adjustment and return estimates of the observed dissimilarities, whereas with dbrda we return Lingoes-adjusted values. This is a policy issue and we can argue for either case. However, I argue for regarding these adjustments (additive constant, square root) as internal and always returning data on the same scale as the input. That is, with non-Euclidean dissimilarities, we should get back non-Euclidean dissimilarities instead of euclidified ones.
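As a rough illustration of what "internal" could mean here, the sketch below applies each adjustment and then undoes it, so the values handed back are on the input scale. It uses plain base R. The function names, the dissimilarity values and the additive constant are all made up for illustration, and this is not necessarily how fitted or residuals are implemented in vegan.

```r
## Made-up observed dissimilarities and additive constant (assumptions)
d_obs <- c(0.2, 0.7, 0.4)
ac <- 0.15

## Lingoes-type adjustment and its inverse (hypothetical helpers)
lingoes_forward <- function(d, ac) sqrt(d^2 + 2 * ac)
lingoes_back <- function(d_adj, ac) sqrt(pmax(d_adj^2 - 2 * ac, 0))

## Square root adjustment and its inverse (hypothetical helpers)
sqrt_forward <- function(d) sqrt(d)
sqrt_back <- function(d_adj) d_adj^2

## Round trips recover the input scale
all.equal(lingoes_back(lingoes_forward(d_obs, ac), ac), d_obs)
all.equal(sqrt_back(sqrt_forward(d_obs)), d_obs)
```

The pmax guard only protects against tiny negative values from rounding before the square root is taken.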

@jarioksa jarioksa added this to the 2.4-0 milestone Jun 3, 2016
jarioksa pushed a commit that referenced this issue Jun 4, 2016
now fitted & residuals are consistent in capscale and dbrda which
was one of the concerns in issue #179 in github.
jarioksa pushed a commit that referenced this issue Jun 4, 2016
now stressplot of sqrt.dist and Lingoes/Cailliez adjusted are
consistent, and stressplots for dbrda and capscale are consistent.
This was one of the concerns in issue #179.
jarioksa pushed a commit that referenced this issue Jun 4, 2016
jarioksa pushed a commit that referenced this issue Jun 4, 2016
jarioksa pushed a commit that referenced this issue Jun 5, 2016
solves problems in github issue #179:
1) sqrt.dist= and add= adjustments are regarded as internal to the method
and fitted etc. remove these in dbrda & capscale and also when showing
observed statistics in stressplot etc.
2) functions dbrda, capscale, varpart, adonis2 and betadisper all have
similar sqrt.dist= and add= arguments.

jarioksa commented Jun 5, 2016

These issues were solved with 5b7fb1e.
