DBscan Adjacency Lists have repeated clusters #200

Closed
MRIO opened this issue Jul 23, 2020 · 1 comment

MRIO commented Jul 23, 2020

Dear all,

I have been trying to run dbscan on a 2D array (size: 2 x 1105) of point positions (the results of UMAP), but I get this strange result, where there are repeated DbscanClusters inside the output array:


13-element Array{DbscanCluster,1}:
 DbscanCluster(17, [4, 12, 84, 90, 94, 675, 676, 737, 873, 965], [27, 108, 177, 880, 954, 1050, 1067])
 DbscanCluster(10, Int64[], [46, 48, 51, 57, 188, 225, 226, 228, 270, 542])
 DbscanCluster(11, [48, 51, 228], [46, 49, 57, 188, 225, 226, 270, 542])
 DbscanCluster(14, [418, 759, 832, 988, 1046], [830, 831, 855, 865, 989, 991, 996, 1021, 1070])
 DbscanCluster(10, Int64[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1069])
 DbscanCluster(10, Int64[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1069])
 DbscanCluster(10, Int64[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1069])
 DbscanCluster(10, Int64[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1069])
 DbscanCluster(10, Int64[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1069])
 DbscanCluster(10, Int64[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1069])
 DbscanCluster(10, Int64[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1103])
 DbscanCluster(10, Int64[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1103])
 DbscanCluster(11, [1057], [624, 654, 664, 803, 805, 821, 859, 987, 1069, 1103])

The data on which I'm applying dbscan looks sane (and there are no repeats):

image

Any idea why this is happening?

alyst added the bug label Mar 21, 2023

alyst (Member) commented Mar 21, 2023

There was a bug in the coordinate-based dbscan() implementation: it created a cluster even when there were no seeds/core points.
I suspect this duplication bug is related: the output shows some clusters that have no core points (an empty first array), which is not correct.
This should be fixed by #248, so I will close this issue.
If the problem persists, the issue can be reopened, but it would be nice to have a reproducible example.
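
For anyone who wants to check their own output for the two symptoms discussed here (clusters with no core points, and repeated clusters), a minimal sketch in Julia might look like the following. It does not call the package at all: `ClusterRecord`, `coreless`, and `duplicate_groups` are hypothetical names introduced only for illustration, and the struct simply mirrors the three values printed per cluster in the report (size, core indices, boundary indices). The sample rows are copied from the output above.

```julia
# Hypothetical stand-in mirroring the three values printed per cluster in the
# report (size, core indices, boundary indices). Not the package's own type.
struct ClusterRecord
    size::Int
    core::Vector{Int}
    boundary::Vector{Int}
end

# Positions of clusters that have no core points at all.
coreless(clusters) = findall(c -> isempty(c.core), clusters)

# Groups of positions whose clusters share identical core and boundary indices.
function duplicate_groups(clusters)
    seen = Dict{Tuple{Vector{Int},Vector{Int}},Vector{Int}}()
    for (i, c) in enumerate(clusters)
        # Arrays hash and compare by content, so equal index lists collide here.
        push!(get!(seen, (c.core, c.boundary), Int[]), i)
    end
    return sort!([g for g in values(seen) if length(g) > 1])
end

# A few rows copied from the output in the report.
clusters = [
    ClusterRecord(11, [48, 51, 228], [46, 49, 57, 188, 225, 226, 270, 542]),
    ClusterRecord(10, Int[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1069]),
    ClusterRecord(10, Int[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1069]),
    ClusterRecord(10, Int[], [624, 654, 664, 803, 805, 821, 859, 987, 1057, 1103]),
]

println("clusters without core points: ", coreless(clusters))
println("groups of identical clusters: ", duplicate_groups(clusters))
```

If both lists come back empty on output from a version that includes the fix, that would suggest the problem is gone; a non-empty list together with the input data would make a good starting point for a reproducible example.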

alyst closed this as completed Mar 21, 2023
alyst added a commit that referenced this issue Mar 22, 2023
alyst added a commit that referenced this issue Mar 24, 2023