[BUG] DBSCAN results incorrect #80

cjnolet · 2019-01-11T17:38:41Z

@daxiongshu ran our DBSCAN & k-means implementations against [1] and found that our results do not match, even for datasets as small as size 2^10.

[1] https://scikit-learn.org/stable/auto_examples/cluster/plot_cluster_comparison.html

cjnolet · 2019-01-11T17:39:47Z

branch-0.5 should be compared against 0.4 release.

dantegd · 2019-01-11T18:38:01Z

what is [1]?

cjnolet · 2019-01-11T18:44:28Z

Updated original comment

cjnolet · 2019-01-11T18:47:50Z

I ran @daxiongshu's notebook against branches 0.5, 0.4, and 0.3. The mismatch shows up on all three, which means this has been broken since before the refactor.

I believe the sklearn toy datasets should be tested on the C++ side as well. That way, when results don't match, it's clear which layer introduced the bug.
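For tests like this, one thing to keep in mind is that two implementations can number their clusters differently while still grouping the points identically. A minimal sketch of a comparison that ignores cluster ids is below. The helper name `same_partition` is hypothetical, and it assumes the sklearn convention that noise points are labeled `-1`.

```python
def same_partition(labels_a, labels_b):
    """Return True if two labelings group the points identically,
    ignoring how the clusters are numbered.

    Assumes noise is labeled -1 in both inputs (sklearn convention).
    """
    if len(labels_a) != len(labels_b):
        return False
    a_to_b, b_to_a = {}, {}
    for a, b in zip(labels_a, labels_b):
        # A point must be noise in both labelings or in neither.
        if (a == -1) != (b == -1):
            return False
        # Each cluster id must map to exactly one id on the other side.
        if a_to_b.setdefault(a, b) != b or b_to_a.setdefault(b, a) != a:
            return False
    return True


# Same grouping, different cluster numbering.
print(same_partition([0, 0, 1, -1], [1, 1, 0, -1]))
# Second point moved to another cluster.
print(same_partition([0, 0, 1], [0, 1, 1]))
```

An exact check like this may be too strict for DBSCAN border points, whose cluster can depend on the order in which points are visited, so a score-based comparison might be preferable in practice.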

cjnolet · 2019-01-11T19:14:00Z

@teju85, have you gotten a chance to look at this or #63 yet? It looks like a fix for this is slated for 0.5. Referencing #83 to reproduce the problem.

teju85 · 2019-01-14T05:46:36Z

@cjnolet which of these issues against dbscan needs to be prioritized: #54, #63 or #80?

teju85 · 2019-01-14T05:48:08Z

Also, is there a standalone python script that could repro this mismatch? (Sorry, if you have had it somewhere already!)

cjnolet · 2019-02-09T23:07:46Z

I’m going to go ahead and close this for now since we have discussed how the subtle differences in eps affect the results.
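The thread does not record which eps difference was discussed, so the following is only an illustration of how sensitive DBSCAN can be to eps, not a reproduction of the original mismatch. It is a minimal pure-Python sketch with its own assumptions: the function `dbscan` here is a toy O(n^2) implementation, the neighborhood test is inclusive (`<=`), and a point counts toward its own `min_samples`.

```python
from math import dist


def dbscan(points, eps, min_samples):
    """Toy O(n^2) DBSCAN. Returns one label per point, -1 for noise.

    Assumptions of this sketch: distance <= eps counts as a neighbor,
    and each point is included in its own neighborhood.
    """
    n = len(points)
    neighbors = [
        [j for j in range(n) if dist(points[i], points[j]) <= eps]
        for i in range(n)
    ]
    core = [len(nb) >= min_samples for nb in neighbors]
    labels = [-1] * n
    cluster = 0
    for i in range(n):
        if labels[i] != -1 or not core[i]:
            continue
        # Grow a new cluster from this unlabeled core point.
        labels[i] = cluster
        stack = [i]
        while stack:
            p = stack.pop()
            if not core[p]:
                continue  # border points join a cluster but do not expand it
            for q in neighbors[p]:
                if labels[q] == -1:
                    labels[q] = cluster
                    stack.append(q)
        cluster += 1
    return labels


# Two groups of unit-spaced points on a line.
points = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (10.0, 0.0), (11.0, 0.0)]

labels_a = dbscan(points, eps=1.0, min_samples=2)
labels_b = dbscan(points, eps=0.99, min_samples=2)

print(labels_a)  # [0, 0, 0, 1, 1]
print(labels_b)  # [-1, -1, -1, -1, -1]
```

With neighbors sitting exactly at distance 1.0, lowering eps by 0.01 turns two clusters into all noise. A strict `<` test, or rounding eps to a different floating-point precision, could flip the same boundary cases.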

cjnolet added the bug and ? - Needs Triage labels Jan 11, 2019

cjnolet mentioned this issue Jan 11, 2019

Visual inspection of kmeans and dbscan #83

Closed

cjnolet added the 1 - On Deck label and removed the ? - Needs Triage label Jan 11, 2019

cjnolet added this to Issue-Needs prioritizing in v0.5 Release via automation Jan 11, 2019

dantegd moved this from Issue-Needs prioritizing to Issue-P0 in v0.5 Release Jan 11, 2019

cjnolet mentioned this issue Jan 12, 2019

cuml dbscan terminating on large datasets 'invalid configuration argument' #54

Closed

cjnolet self-assigned this Jan 13, 2019

cjnolet moved this from Issue-P0 to Done in v0.5 Release Jan 24, 2019

cjnolet closed this as completed Feb 9, 2019
