ConsensusCluster doesn't seem to use consensus matrix for clustering #25

@gageblack

Report

While reviewing the ConsensusCluster implementation (src/flowsom/models/consensus_cluster.py, v0.2.2), I noticed that fit_predict() does not appear to call fit(), the method that performs the consensus clustering. Instead, it runs AgglomerativeClustering once on the raw (optionally z-scored) data (lines 105-109):

    if self.z_score:
        data = self._z_score(data)
    return self.cluster(n_clusters=self.n_clusters, linkage=self.linkage).fit_predict(data)

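Based on the consensus clustering paper and the R FlowSOM implementation, it seems that fit_predict() should instead:

  1. Build the consensus matrix using fit().
  2. Convert it to a distance matrix: distance_matrix = 1 - self.Mk
  3. Cluster on the consensus distances: AgglomerativeClustering(n_clusters=k, metric='precomputed', linkage='average').fit_predict(distance_matrix). A non-Ward linkage is needed here, because scikit-learn's default Ward linkage only accepts Euclidean distances and rejects metric='precomputed'.

At a minimum, there should be an option to choose between clustering on the consensus matrix and clustering on the raw data.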

Versions

v0.2.2

Labels

bug