ValueError: Linkage 'Z' uses the same cluster more than once thrown when clustering #6785
I compute cophenet index on the Z matrix generated by the
Error thrown in cophenet index:
Error thrown in dendrogram:
When examining the matrix, I find that an internal node appears twice -- once merged w/ a leaf and once merged w/ an internal node. Since a node is only allowed one parent, this seems to be a bug.
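One way to locate such a node is to count how often each cluster id appears in the first two columns of Z. The following is only a minimal sketch, not the check scipy itself performs: `find_reused_clusters` and the tiny `bad_Z` matrix are hypothetical names and data made up for illustration. It assumes the usual linkage layout, where row i merges the clusters in columns 0 and 1 and creates cluster n + i.

```python
from collections import Counter


def find_reused_clusters(Z):
    """Return the cluster ids that appear more than once in columns 0 and 1 of Z.

    Hypothetical helper: in a valid linkage every cluster id is merged
    at most once, so this list should be empty.
    """
    counts = Counter()
    for row in Z:
        counts[int(row[0])] += 1
        counts[int(row[1])] += 1
    return sorted(cid for cid, n in counts.items() if n > 1)


# Made-up linkage over 4 leaves (ids 0..3); row i creates cluster 4 + i.
# Cluster 4 is merged twice: once with leaf 2 and once with internal cluster 5.
bad_Z = [
    [0, 1, 0.25, 2],
    [4, 2, 0.50, 3],
    [4, 5, 0.75, 4],
]

print(find_reused_clusters(bad_Z))  # -> [4]
```

Because the helper only iterates over rows, it should accept a NumPy array such as the Z returned by `linkage` as well as a plain list of rows.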
scipy version: 0.18.1
Matrix file is attached (25620 rows).
Hi, it looks like it was indeed a bug. I believe it was fixed in #6495.
Regarding my comment "the original version relies on the order of merges in nn_chain": I now think that is not true and was my mistake; I was misled by the paper. Sorry about that.
An example that now
import numpy as np
from scipy.cluster.hierarchy import linkage, is_valid_linkage

metric = 'hamming'
M = np.loadtxt("z_bug_matrix.txt")
Z = linkage(M, method='average', metric='hamming')
print(is_valid_linkage(Z))
So yes, it was incorrect before that commit. It could still produce a valid result when the distances are more or less unique, which is often the case.
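To illustrate why unique distances cannot be taken for granted with the Hamming metric: over d binary features, the normalized Hamming distance can take at most d + 1 distinct values, so ties are unavoidable once there are many rows. The sketch below uses made-up data (all 4-bit vectors) and hypothetical helper names, `hamming` and `tie_summary`; it says nothing about the attached matrix itself.

```python
from itertools import combinations, product


def hamming(u, v):
    """Fraction of positions where u and v differ."""
    return sum(a != b for a, b in zip(u, v)) / len(u)


def tie_summary(rows):
    """Return (number of pairs, number of distinct pairwise distances)."""
    dists = [hamming(u, v) for u, v in combinations(rows, 2)]
    return len(dists), len(set(dists))


# Made-up data: all 16 binary vectors of length 4.
rows = list(product([0, 1], repeat=4))
n_pairs, n_distinct = tie_summary(rows)
print(n_pairs, n_distinct)  # -> 120 4
```

Here 120 pairs share only 4 distinct distance values, so almost every merge decision involves a tie.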
Not sure about release plans, @ev-br ?
Can we consider this issue resolved? I believe so. I have some related doubts, but I will open another issue for those.
Milestone 0.19.0 is currently due in January: https://github.com/scipy/scipy/milestones