sergio-gomez/mdendro


mdendro

Extended Agglomerative Hierarchical Clustering in R

Description

The R package mdendro performs agglomerative hierarchical clustering (AHC), extending the standard functionality in several ways:

  • Native handling of both similarity and dissimilarity (distance) matrices.

  • Calculation of pair-group dendrograms and variable-group multidendrograms [1].

  • Implementation of the most common AHC methods in both weighted and unweighted forms: single linkage, complete linkage, average linkage (UPGMA and WPGMA), centroid (UPGMC and WPGMC), and Ward.

  • Implementation of two additional parametric families of methods: versatile linkage [2], and beta flexible. Versatile linkage leads naturally to the definition of two additional methods: harmonic linkage, and geometric linkage.

  • Calculation of the cophenetic (or ultrametric) matrix.

  • Calculation of five descriptors of the final dendrogram: cophenetic correlation coefficient, space distortion ratio, agglomerative coefficient, chaining coefficient, and tree balance.

  • Calculation and plots of the descriptors for the parametric methods.
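
To make two of the items above concrete, the following is a minimal sketch of a cophenetic matrix and the cophenetic correlation coefficient. It deliberately uses only base R (`hclust` and `cophenetic` from package stats) rather than mdendro, and the random data set is purely illustrative.

```r
# Base R only (package stats); mdendro itself is not used in this sketch.
set.seed(1)
x <- matrix(rnorm(20), ncol = 2)      # 10 illustrative random points in the plane

d <- dist(x)                          # Euclidean distance matrix
hc <- hclust(d, method = "average")   # unweighted average linkage (UPGMA)

# Cophenetic (ultrametric) distances: for each pair of points, the height
# at which the two points are first joined in the dendrogram.
coph <- cophenetic(hc)

# Cophenetic correlation coefficient: Pearson correlation between the
# original distances and the cophenetic distances.
ccc <- cor(d, coph)
print(ccc)
```

A value close to 1 indicates that the dendrogram preserves the original distances well.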

All this functionality is provided by three functions: linkage, descval and descplot. The linkage function can be considered a replacement for hclust (in package stats) and agnes (in package cluster). To enhance usability and interoperability, the linkage class includes several methods for plotting, summarizing information, and class conversion.
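
As a rough illustration of this workflow, the sketch below calls linkage on a distance matrix and then uses the summary and plot methods. It assumes that mdendro is installed, that linkage accepts a `dist` object as its first argument, and that `method = "arithmetic"` selects average linkage; these details are not stated in this README and should be checked against the reference manual.

```r
# Assumption: mdendro is installed (for example from CRAN).
library(mdendro)

# Distance matrix built from a data set that ships with R.
d <- dist(scale(mtcars))

# Assumption: the first argument is the proximity matrix and
# method = "arithmetic" requests average linkage.
lnk <- linkage(d, method = "arithmetic")

summary(lnk)   # summarizing method of the linkage class
plot(lnk)      # plotting method of the linkage class
```

The same distance matrix could be passed to `hclust(d, method = "average")` from package stats for comparison.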

References

  1. A. Fernández, S. Gómez. Solving non-uniqueness in agglomerative hierarchical clustering using multidendrograms. Journal of Classification 25, 43-65 (2008). DOI:10.1007/s00357-008-9004-x.
  2. A. Fernández, S. Gómez. Versatile linkage: A family of space-conserving strategies for agglomerative hierarchical clustering. Journal of Classification 37, 584-597 (2020). DOI:10.1007/s00357-019-09339-z.

Authors

Documentation

The full documentation of mdendro, including description, installation, tutorial, rationale and reference manual, can be found here.