User guide

Build guide

The Mahalanobis-average hierarchical clustering project is built with CMake.

To build the executable, run the CMake configure and build commands in a build directory. Afterwards, the para subdirectory of the build directory contains the gmhclust executable.

The only dependency is the CUDA compiler (nvcc). The executable should be portable to all platforms supporting nvcc; it was successfully tested on Ubuntu 18.04 and Windows 10.

See the following steps:

cd gmhc
mkdir build && cd build
cmake ..
cmake --build .
ls para/gmhclust

Running the program

The gmhclust executable takes three command-line parameters:

1. Dataset file path – Mandatory. The path to a dataset file. The file is binary and is structured as follows:
   A. 4B unsigned integer D – point dimension
   B. 4B unsigned integer N – number of points
   C. N×D 4B floats – N single-precision D-dimensional points stored one after another
2. Mahalanobis threshold – Mandatory. An absolute positive number that sets the Mahalanobis threshold.
3. Apriori assignments file path – Optional. The path to an apriori assignments file: a file with space-separated 4B unsigned integers (assignment numbers). The number of integers equals the number of points in the dataset; the i-th integer is the assignment number of the i-th point in the dataset file. If the i-th and the j-th assignment numbers are equal, the i-th and the j-th points are assigned to the same apriori cluster.
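As an illustration of the two input files, here is a minimal Python sketch that writes a small dataset and a matching apriori assignments file. It is not part of the project. The helper names write_dataset and write_assignments are hypothetical, and two details are assumptions that the guide does not state: little-endian byte order for the binary file, and plain text for the assignments file.

```python
import struct


def write_dataset(path, points):
    """Write points (equal-length sequences of floats) as: D, N, then N*D floats."""
    n = len(points)
    if n == 0:
        raise ValueError("the dataset must contain at least one point")
    d = len(points[0])
    if any(len(p) != d for p in points):
        raise ValueError("all points must have the same dimension")
    with open(path, "wb") as f:
        # Header: D first, then N, both 4-byte unsigned integers.
        # ASSUMPTION: little-endian byte order (not stated in the guide).
        f.write(struct.pack("<II", d, n))
        # Points are stored one after another as single-precision floats.
        for p in points:
            f.write(struct.pack(f"<{d}f", *p))


def write_assignments(path, assignments):
    """Write one assignment number per point, separated by spaces.

    ASSUMPTION: the file is plain text; each value must fit in 32 bits.
    """
    if any(a < 0 or a > 0xFFFFFFFF for a in assignments):
        raise ValueError("assignment numbers must fit in a 4B unsigned integer")
    with open(path, "w") as f:
        f.write(" ".join(str(a) for a in assignments))


# Four 2-dimensional points; points 0 and 1 share one apriori cluster,
# points 2 and 3 share another.
points = [(0.0, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9)]
write_dataset("data", points)
write_assignments("asgns", [0, 0, 1, 1])
```

With these inputs, the resulting data file is 8 header bytes plus 4 × 2 × 4 bytes of point data.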

The following command runs gmhclust to cluster the dataset file data with the apriori assignments file asgns and the threshold 100:
./gmhclust data 100 asgns

Output

The executable writes the clustering process to the standard output in text format. Each line contains the IDs of the two merged clusters followed by their merge distance.
IDs are assigned as follows:

Initial dataset points are assigned nonnegative integers ([0, n-1]).
Each merged cluster is assigned the next unused ID ([n, 2n-2], because n points produce at most n-1 merges).

An example output for a dataset of 4 points looks like this:

0 2 0.65
1 4 1.2
3 5 0.1
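The ID of a newly merged cluster is not printed, so a consumer of the output has to reconstruct it. The Python sketch below shows one way to do that, assuming that new IDs are handed out in the order the lines appear and that the fields are separated by whitespace, as in the example. The function name label_merges is hypothetical.

```python
def label_merges(output_text, n_points):
    """Return (left_id, right_id, distance, new_id) for every merge line."""
    merges = []
    next_id = n_points  # the points themselves occupy IDs 0 .. n_points-1
    for line in output_text.splitlines():
        if not line.strip():
            continue  # skip blank lines
        left, right, distance = line.split()
        merges.append((int(left), int(right), float(distance), next_id))
        next_id += 1  # ASSUMPTION: each line creates the next unused ID
    return merges


example = """0 2 0.65
1 4 1.2
3 5 0.1"""

for left, right, distance, new_id in label_merges(example, 4):
    print(f"{left} + {right} -> {new_id} (distance {distance})")
```

For the example above, this labels the three merges with the IDs 4, 5 and 6, which matches the second line merging point 1 with cluster 4 and the third line merging point 3 with cluster 5.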

R package build guide

To build the R package, run the CMake configure and build commands in a build directory, building the gmhc_package target.

See the following steps (the last step requires root privileges):

cd gmhc
mkdir build && cd build
cmake -DCMAKE_BUILD_TYPE=Release ..
cmake --build . --target gmhc_package

Building this target installs the gmhc package into the default R package directory. In an R session, it can then be used as follows:

library('gmhc')
?gmhclust
