Hierarchical Clusteringfor JS implemented via priority-queue algorithm
$ npm install compute-hclust
For use in the browser, use browserify.
var hclust = require( 'compute-hclust' );
Given an input data
in the form of a two-dimensional array, this function carries out Hierarchical Clustering.
The function accepts the following options
:
- distance:
string
indicating the chosen distance metric. Default:euclidean
. - linkage:
string
indicating the linkage criterion to be used, eithersingle
orcomplete
.
The function returns an object which exports two function: getClusters(k)
and getTree
. The former takes one parameter, the number of clusters k
, and returns an array of clusters, where each cluster is represented as an array holding the indices of data points assigned to it. The function getTree
returns a binary tree representing the tree hierarchy generated by the clustering algorithm.
The algorithm makes use of a priority queue, making it more efficient than a naive implementation. It has been adapted from the book Introduction to Information Retrieval by Christopher D. Manning, Prabhakar Raghavan and Hinrich Schütze, Cambridge University Press. 2008 [p.386].
'use strict';
var hclust = require( 'compute-hclust' );
var util = require( 'util' );
var dat = [
[1, 2, 1],
[80 ,100, 98],
[1, 1.9, 1],
[80, 101, 99],
[1, 3, 2],
[2, 2, 1]
];
var clustering = hclust( dat );
console.log( clustering.getClusters( 2 ) );
// [ [ 0, 2, 5, 4 ], [ 1, 3 ] ]
console.log( clustering.getClusters( 3 ) );
// [ [ 0, 2, 5 ], [ 1, 3 ], [ 4 ] ]
console.log( util.inspect( clustering.getTree(), null, 5 ) );
/*
{ left:
{ left:
{ left:
{ left: { obs: [ 1, 2, 1 ], size: 1 },
right: { obs: [ 1, 1.9, 1 ], size: 1 },
size: 2 },
right: { obs: [ 2, 2, 1 ], size: 1 },
size: 3 },
right: { obs: [ 1, 3, 2 ], size: 1 },
size: 4 },
right:
{ left: { obs: [ 80, 100, 98 ], size: 1 },
right: { obs: [ 80, 101, 99 ], size: 1 },
size: 2 },
size: 6 }
*/
To run the example code from the top-level application directory,
$ node ./examples/index.js
Unit tests use the Mocha test framework with Chai assertions. To run the tests, execute the following command in the top-level application directory:
$ make test
All new feature development should have corresponding unit tests to validate correct functionality.
This repository uses Istanbul as its code coverage tool. To generate a test coverage report, execute the following command in the top-level application directory:
$ make test-cov
Istanbul creates a ./reports/coverage
directory. To access an HTML version of the report,
$ make view-cov
Copyright © 2015. Philipp Burckhardt.