Flock

Ruby bindings to Cluster 3.0

Description

Provides bindings to clustering methods in Cluster 3.0.

* K-Means
* Kohonen Self-Organizing Maps
* Tree Cluster or Hierarchical Clustering

Synopsis

Specify vectors explicitly

require 'pp'
require 'flock'

data     = Array.new(13) {[]}
mask     = Array.new(13) {[]}
weights  = Array.new(13) {1.0}

data[ 0][ 0]=0.1; data[ 0][ 1]=0.0; data[ 0][ 2]=9.6; data[ 0][ 3] = 5.6;
data[ 1][ 0]=1.4; data[ 1][ 1]=1.3; data[ 1][ 2]=0.0; data[ 1][ 3] = 3.8;
data[ 2][ 0]=1.2; data[ 2][ 1]=2.5; data[ 2][ 2]=0.0; data[ 2][ 3] = 4.8;
data[ 3][ 0]=2.3; data[ 3][ 1]=1.5; data[ 3][ 2]=9.2; data[ 3][ 3] = 4.3;
data[ 4][ 0]=1.7; data[ 4][ 1]=0.7; data[ 4][ 2]=9.6; data[ 4][ 3] = 3.4;
data[ 5][ 0]=0.0; data[ 5][ 1]=3.9; data[ 5][ 2]=9.8; data[ 5][ 3] = 5.1;
data[ 6][ 0]=6.7; data[ 6][ 1]=3.9; data[ 6][ 2]=5.5; data[ 6][ 3] = 4.8;
data[ 7][ 0]=0.0; data[ 7][ 1]=6.3; data[ 7][ 2]=5.7; data[ 7][ 3] = 4.3;
data[ 8][ 0]=5.7; data[ 8][ 1]=6.9; data[ 8][ 2]=5.6; data[ 8][ 3] = 4.3;
data[ 9][ 0]=0.0; data[ 9][ 1]=2.2; data[ 9][ 2]=5.4; data[ 9][ 3] = 0.0;
data[10][ 0]=3.8; data[10][ 1]=3.5; data[10][ 2]=5.5; data[10][ 3] = 9.6;
data[11][ 0]=0.0; data[11][ 1]=2.3; data[11][ 2]=3.6; data[11][ 3] = 8.5;
data[12][ 0]=4.1; data[12][ 1]=4.5; data[12][ 2]=5.8; data[12][ 3] = 7.6;

mask[ 0][ 0]=1; mask[ 0][ 1]=1; mask[ 0][ 2]=1; mask[ 0][ 3] = 1;
mask[ 1][ 0]=1; mask[ 1][ 1]=1; mask[ 1][ 2]=0; mask[ 1][ 3] = 1;
mask[ 2][ 0]=1; mask[ 2][ 1]=1; mask[ 2][ 2]=0; mask[ 2][ 3] = 1;
mask[ 3][ 0]=1; mask[ 3][ 1]=1; mask[ 3][ 2]=1; mask[ 3][ 3] = 1;
mask[ 4][ 0]=1; mask[ 4][ 1]=1; mask[ 4][ 2]=1; mask[ 4][ 3] = 1;
mask[ 5][ 0]=0; mask[ 5][ 1]=1; mask[ 5][ 2]=1; mask[ 5][ 3] = 1;
mask[ 6][ 0]=1; mask[ 6][ 1]=1; mask[ 6][ 2]=1; mask[ 6][ 3] = 1;
mask[ 7][ 0]=0; mask[ 7][ 1]=1; mask[ 7][ 2]=1; mask[ 7][ 3] = 1;
mask[ 8][ 0]=1; mask[ 8][ 1]=1; mask[ 8][ 2]=1; mask[ 8][ 3] = 1;
mask[ 9][ 0]=1; mask[ 9][ 1]=1; mask[ 9][ 2]=1; mask[ 9][ 3] = 0;
mask[10][ 0]=1; mask[10][ 1]=1; mask[10][ 2]=1; mask[10][ 3] = 1;
mask[11][ 0]=0; mask[11][ 1]=1; mask[11][ 2]=1; mask[11][ 3] = 1;
mask[12][ 0]=1; mask[12][ 1]=1; mask[12][ 2]=1; mask[12][ 3] = 1;

pp Flock.kcluster(6, data, mask: mask)

# method: (kcluster)
#    - Flock::METHOD_AVERAGE (kmeans, this is the default)
#    - Flock::METHOD_MEDIAN  (kmedians)
# method: (treecluster)
#    - Flock::METHOD_AVERAGE_LINKAGE (default)
#    - Flock::METHOD_SINGLE_LINKAGE
#    - Flock::METHOD_MAXIMUM_LINKAGE
#    - Flock::METHOD_CENTROID_LINKAGE
# metric:
#    - Flock::METRIC_EUCLIDIAN (default)
#    - Flock::METRIC_CITY_BLOCK
#    - Flock::METRIC_CORRELATION
#    - Flock::METRIC_ABSOLUTE_CORRELATION
#    - Flock::METRIC_UNCENTERED_CORRELATION
#    - Flock::METRIC_ABSOLUTE_UNCENTERED_CORRELATION
#    - Flock::METRIC_SPEARMAN
#    - Flock::METRIC_KENDALL
# seed: (initial cluster assignment)
#    - Flock::SEED_RANDOM            (uniform random, this is the default)
#    - Flock::SEED_KMEANS_PLUSPLUS   (kmeans++ - initial cluster centers chosen weighted by distance from closest center)
#    - Flock::SEED_SPREADOUT         (similar to kmeans++ but deterministic, spreads out cluster centers)

pp Flock.kcluster(
  6,
  data,
  mask:      mask,
  method:    Flock::METHOD_AVERAGE,
  metric:    Flock::METRIC_EUCLIDIAN,
  transpose: 0,
  weights:   Array.new(13) {1.0},
  seed:      Flock::SEED_RANDOM
)

pp Flock.treecluster(
  6,
  data,
  mask:      mask,
  method:    Flock::METHOD_AVERAGE_LINKAGE,
  metric:    Flock::METRIC_EUCLIDIAN,
  transpose: 0,
  weights:   Array.new(13) {1.0},
)
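Filling data and mask cell by cell, as above, is easy to get out of sync. As a minimal sketch, the two matrices can be derived from a single literal in which missing values are written as nil. The helper split_missing is hypothetical and not part of Flock. It assumes the convention visible in the example above, where a mask entry of 1 marks a present value and 0 marks a missing one.

```ruby
# Hypothetical helper (not part of Flock): turn rows that use nil for
# missing values into a dense data matrix plus a matching 0/1 mask.
def split_missing(rows, placeholder = 0.0)
  # Missing cells get a placeholder value; the mask tells the clusterer to ignore them.
  data = rows.map { |row| row.map { |v| v.nil? ? placeholder : v.to_f } }
  # Assumed convention: 1 = value present, 0 = value missing.
  mask = rows.map { |row| row.map { |v| v.nil? ? 0 : 1 } }
  [data, mask]
end

rows = [
  [0.1, 0.0, 9.6, 5.6],
  [1.4, 1.3, nil, 3.8],
  [nil, 3.9, 9.8, 5.1],
]

data, mask = split_missing(rows)
```

The resulting data and mask have the same shape and could then be passed on as in the calls above, for example Flock.kcluster(2, data, mask: mask).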

Sparse data and clustering string labels

require 'pp'
require 'flock'

data = []

# keys don't need to be numeric
data << { 1 => 0.5, 2 => 0.5 }
data << { 3 => 1, 4 => 1 }
data << { 4 => 1, 5 => 0.3 }
data << { 2 => 0.75 }
data << { 1 => 0.60 }

pp Flock.kcluster(2, data, sparse: true)

data = []

# a much simpler way to cluster text labels.
data << %w(apple orange)
data << %w(black white)
data << %w(white cyan)
data << %w(orange)
data << %w(apple)

# additional options such as metric and iterations can be passed in the options hash.
pp Flock.kcluster(2, data, sparse: true)
pp Flock.treecluster(2, data, sparse: true)
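To make the sparse input format more concrete, here is a rough sketch of how sparse rows could be expanded into dense vectors over the union of all keys. The function densify is hypothetical and is not Flock's implementation. It assumes that a label array stands for a hash with weight 1.0 per label and that absent keys count as 0.0; Flock may treat absent keys differently, for example by masking them as missing.

```ruby
# Hypothetical illustration (not Flock's code): expand sparse rows into
# dense vectors with one column per distinct key.
def densify(rows)
  # Assumption: a label array such as %w(apple orange) means weight 1.0 per label.
  hashes = rows.map do |row|
    row.is_a?(Hash) ? row : row.each_with_object({}) { |label, h| h[label] = 1.0 }
  end

  # Column order is the order in which keys are first seen.
  keys = hashes.flat_map(&:keys).uniq

  # Assumption: a key that is absent from a row contributes 0.0.
  dense = hashes.map { |h| keys.map { |k| h.fetch(k, 0.0).to_f } }
  [keys, dense]
end

rows = [
  %w(apple orange),
  %w(black white),
  %w(white cyan),
  %w(orange),
  %w(apple),
]

keys, dense = densify(rows)
```

Each row of dense has one entry per key in keys, so rows that share labels end up close to each other under most of the metrics listed earlier.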

Self-Organizing Map

Self-Organizing Maps (SOM) require that you specify a 2D grid on which data points can cluster. Some grid points may end up empty while others have clusters mapped to them, so there is no need to provide a fixed cluster count; the grid size only sets an upper bound.

require 'pp'
require 'flock'

data = []

# a much simpler way to cluster text
data << %w(apple orange)
data << %w(black white)
data << %w(white cyan)
data << %w(orange)
data << %w(apple)

# nxgrid, nygrid, data are required.
# additional options such as metric and iterations can be passed in the options hash.

# cluster up to 4 groups in a 2x2 grid.
pp Flock.self_organizing_map(2, 2, data, sparse: true)

Note: SOM clustering returns a 2D grid coordinate for each vector, whereas kcluster and treecluster return an integer cluster value for each vector.
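Because each vector is assigned a grid coordinate rather than a cluster number, a small post-processing step is needed to see which vectors share a cell. The sketch below groups vector indices by coordinate. The coords array is made-up sample data, and its shape (one [x, y] pair per input vector) is an assumption; check the actual return value of Flock.self_organizing_map before relying on it.

```ruby
# Made-up coordinates, one [x, y] pair per input vector (assumed shape).
coords = [[0, 0], [1, 1], [1, 1], [0, 0], [0, 1]]

# Map each occupied grid cell to the indices of the vectors assigned to it.
groups = Hash.new { |hash, cell| hash[cell] = [] }
coords.each_with_index { |cell, index| groups[cell] << index }

# Cells of a 2x2 grid that received no vectors.
all_cells = [0, 1].product([0, 1])
empty_cells = all_cells - groups.keys
```

Here groups plays the role of the cluster assignment, and empty_cells lists the grid points that stayed unused.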

Changes from 0.4.x to 0.5.0

Deprecated methods

* kmeans: use kcluster instead
* sparse_kmeans: use kcluster with option sparse: true
* sparse_treecluster: use treecluster with option sparse: true
* sparse_self_organizing_map: use self_organizing_map with option sparse: true

Method signature

kmeans, treecluster and self_organizing_map no longer take mask as a separate parameter.
mask needs to be passed along with the other optional parameters in the options hash.

TODO

* K-Tree clustering.
* Use a sparse matrix instead of converting sparse data into dense matrices.
* BIRCH hierarchical clustering.
* EM clustering.
* kcluster auto-suggest cluster size.

License

Creative Commons Attribution - CC BY
