document hclust() (#60)
ahwillia authored and alyst committed May 20, 2018
1 parent 4c2058d commit faedba5
Showing 3 changed files with 42 additions and 56 deletions.
1 change: 1 addition & 0 deletions doc/source/algorithms.rst
@@ -7,5 +7,6 @@ This package implements a variety of clustering algorithms:

kmeans.rst
kmedoids.rst
hclust.rst
affprop.rst
dbscan.rst
56 changes: 0 additions & 56 deletions doc/source/hclust.md

This file was deleted.

41 changes: 41 additions & 0 deletions doc/source/hclust.rst
@@ -0,0 +1,41 @@
Hierarchical Clustering
========================

`Hierarchical clustering <https://en.wikipedia.org/wiki/Hierarchical_clustering>`_ algorithms build a dendrogram of nested clusters by repeatedly merging or splitting clusters.

**Functions**

.. function:: hclust(D, method)

Perform hierarchical clustering on the distance matrix ``D`` using the specified ``method``.

:param D: The pairwise distance matrix. ``D[i,j]`` is the distance between points ``i`` and ``j``.
:param method: A Symbol specifying how distance is measured between clusters (which is used to determine which clusters to merge on each iteration). Valid methods are ``:single``, ``:average``, or ``:complete``.

- ``:single``: cluster distance is the minimum distance between any pair of members, one from each cluster.
- ``:average``: cluster distance is the mean distance over all pairs of members, one from each cluster.
- ``:complete``: cluster distance is the maximum distance between any pair of members, one from each cluster.
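
The three definitions above can be illustrated with a minimal sketch in plain Julia. The helper ``linkage_distance`` is hypothetical and not part of the package; it only computes the distance between two given clusters and does not reflect how ``hclust`` is implemented internally.

.. code-block:: julia

    # Hypothetical helper (not part of the package): distance between
    # clusters `a` and `b` (vectors of point indices) under each method.
    function linkage_distance(D, a, b, method)
        d = [D[i, j] for i in a, j in b]   # all cross-cluster pairwise distances
        if method == :single
            return minimum(d)
        elseif method == :average
            return sum(d) / length(d)
        elseif method == :complete
            return maximum(d)
        else
            error("unsupported method: $method")
        end
    end

    # Four points on a line at positions 0, 1, 4 and 6.
    x = [0.0, 1.0, 4.0, 6.0]
    D = [abs(xi - xj) for xi in x, xj in x]

    a, b = [1, 2], [3, 4]
    linkage_distance(D, a, b, :single)    # 3.0 (points 2 and 3)
    linkage_distance(D, a, b, :average)   # 4.5 (mean of 4, 6, 3, 5)
    linkage_distance(D, a, b, :complete)  # 6.0 (points 1 and 4)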

The function returns an object of type ``Hclust`` with the following fields:

- ``merge``: the clusters merged at each step, in order. Leaves are indicated by negative numbers.
- ``height``: the distance at which each merge takes place.
- ``order``: a preferred ordering of the data points for drawing a dendrogram.
- ``method``: the name of the clustering method.

Example:

.. code-block:: julia

    D = rand(1000, 1000)
    D += D'  # symmetric distance matrix (optional)
    result = hclust(D, :single)

.. function:: cutree(result; h, k)

Cuts the dendrogram to produce clusters at a specified level of granularity.

:param result: Object of type ``Hclust`` holding results of a call to ``hclust()``.
:param h: A number specifying the height (merge distance) at which to cut the tree.
:param k: Integer specifying the desired number of clusters.

The output is a vector giving the cluster index of each data point.
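
A short usage sketch follows, assuming the ``hclust(D, method)`` and ``cutree(result; h, k)`` signatures documented here and that the package is loaded with ``using Clustering``. The toy data and variable names are illustrative only, and other releases of the package may use different argument names.

.. code-block:: julia

    using Clustering

    # Six points on a line forming three well-separated pairs (toy data).
    x = [0.0, 0.1, 5.0, 5.1, 10.0, 10.1]
    D = [abs(xi - xj) for xi in x, xj in x]

    result = hclust(D, :single)

    # Cut the tree into a fixed number of clusters ...
    assignments = cutree(result; k=3)

    # ... or cut it at a fixed height, here a distance threshold of 1.0.
    assignments_h = cutree(result; h=1.0)

With data this well separated, both calls would be expected to place the two points of each pair in the same cluster. Passing ``k`` suits cases where the number of clusters is known in advance, while ``h`` suits cases where a meaningful distance threshold is known instead.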
