Clustering and classification inference for high-dimension, low-sample-size (HDLSS) data with U-statistics.

The package implements nonparametric statistical tests for sample homogeneity, group separation, clustering, and classification of multivariate data. The methods have high statistical power and are tailored for data in which the dimension L is much larger than the sample size n.

The package provides:
- Bn test for group separation with two predefined groups
- Overall group homogeneity testing
- Clustering of a sample into the best two significant subgroups
- Hierarchical clustering considering only significant subgroups
- Significant classification of a new observation into one of two predefined groups
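
To make the group separation test more concrete, the following Python sketch shows one way a Bn-type statistic and a permutation p-value could be computed. It is an illustration under stated assumptions, not the package's implementation or API: it assumes squared Euclidean distance as the kernel, assumes the statistic has the form Bn = n1·n2 / (n(n−1)) · (2·U12 − U1 − U2), where U1 and U2 are mean within-group distances and U12 is the mean between-group distance, and uses the hypothetical function names `sq_dist_matrix`, `bn_statistic`, and `permutation_pvalue`. Each group needs at least two observations.

```python
import numpy as np


def sq_dist_matrix(x):
    """Pairwise squared Euclidean distances between the rows of x (n x L)."""
    sq = (x ** 2).sum(axis=1)
    d = sq[:, None] + sq[None, :] - 2.0 * (x @ x.T)
    d = np.maximum(d, 0.0)      # clip tiny negative values from round-off
    np.fill_diagonal(d, 0.0)    # distance of a point to itself is zero
    return d


def bn_statistic(d, labels):
    """Bn-type statistic from a distance matrix and 0/1 group labels.

    Assumed form: n1*n2 / (n*(n-1)) * (2*U12 - U1 - U2).
    """
    idx1 = np.flatnonzero(labels == 0)
    idx2 = np.flatnonzero(labels == 1)
    n1, n2 = len(idx1), len(idx2)
    n = n1 + n2
    # The diagonal is zero, so summing the full block and dividing by
    # n_g * (n_g - 1) gives the mean over distinct within-group pairs.
    u1 = d[np.ix_(idx1, idx1)].sum() / (n1 * (n1 - 1))
    u2 = d[np.ix_(idx2, idx2)].sum() / (n2 * (n2 - 1))
    u12 = d[np.ix_(idx1, idx2)].mean()
    return n1 * n2 / (n * (n - 1)) * (2.0 * u12 - u1 - u2)


def permutation_pvalue(x, labels, n_perm=999, seed=0):
    """One-sided permutation p-value: large Bn suggests separated groups."""
    rng = np.random.default_rng(seed)
    d = sq_dist_matrix(x)       # computed once, reused for every permutation
    observed = bn_statistic(d, labels)
    exceed = 0
    for _ in range(n_perm):
        if bn_statistic(d, rng.permutation(labels)) >= observed:
            exceed += 1
    return observed, (exceed + 1) / (n_perm + 1)


# Usage on simulated HDLSS data: n = 20 observations, L = 200 dimensions,
# with the second group shifted by one unit in every coordinate.
rng = np.random.default_rng(1)
x = rng.normal(size=(20, 200))
x[10:] += 1.0
labels = np.repeat([0, 1], 10)
stat, p = permutation_pvalue(x, labels)
print(f"Bn = {stat:.3f}, permutation p-value = {p:.3f}")
```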

References:
- Cybis, Gabriela B., Marcio Valk, and Sílvia R. C. Lopes. "Clustering and classification problems in genetics through U-statistics." Journal of Statistical Computation and Simulation 88.10 (2018).
- Valk, Marcio, and Gabriela Bettella Cybis. "U-statistical inference for hierarchical clustering." arXiv preprint arXiv:1805.12179 (2018).