Updates to DBSCAN fitting #302
Improvements to DBSCAN fitting - moving from fork to main repo
Will take a look before end of today!
Thanks, but it's not urgent, I just wanted to fix my broken access issues and get this onto the right repo before my machine melts!
Sounds like a good thing to improve, I've typically been using refined boundaries as everything else seems to struggle. Assigning to DBSCAN models is slow -- apparently CUDA was going to add something to make this much faster?
No, sounds like a good change! I think that's caused frustrations before (and heaven knows we don't need more options)
Interesting, will look at this part
Would it be appropriate to make this the default or perhaps remove the former behaviour as an option altogether? Aware we have a very long CLI
Great, will have a quick look through the code now
Looks like a really helpful change
I've made a few comments from a UI and maintenance perspective
Looks like no issues with tests & mandrake here?
Thanks for the comments! Nope, no mandrake issues here, seems like it is a local installation issue. I will adjust the CLI and cascade the changes through the code.
Had a look through tests and docs and those look good to me too
One final thing – maybe want to bump the version if not already past current release
Glad someone was paying attention, done in 2920b00.
Motivated by trying to fit a DBSCAN model to a large dataset. Problems were:
--assign-subsample option) negates the speed-up of the initial fit
--no-assign flag, which skips the assignment, labels the model appropriately, and allows a refined model fit that then assigns all points

If you approve these changes conceptually, then I'll add tests and docs.
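For anyone reading along, here is a rough sketch of the general idea discussed above: fit DBSCAN on a subsample, optionally skip assigning the full dataset, and record on the model whether assignment happened so a later step can do it. This is not the code in this PR. The function names (`dbscan`, `fit_subsample`, `assign_points`), the dictionary-based model and the `assign` parameter are all hypothetical and only loosely mirror the `--no-assign` behaviour; the nearest-clustered-point assignment rule is a simplification chosen for brevity.

```python
import math
import random


def dbscan(points, eps, min_pts):
    """Plain O(n^2) DBSCAN. Returns one label per point; -1 means noise."""
    n = len(points)
    labels = [None] * n

    def neighbours(i):
        # Includes i itself, so min_pts counts the point in question.
        return [j for j in range(n) if math.dist(points[i], points[j]) <= eps]

    cluster = -1
    for i in range(n):
        if labels[i] is not None:
            continue
        nbrs = neighbours(i)
        if len(nbrs) < min_pts:
            labels[i] = -1  # provisional noise; may become a border point later
            continue
        cluster += 1
        labels[i] = cluster
        queue = [j for j in nbrs if j != i]
        while queue:
            j = queue.pop()
            if labels[j] == -1:
                labels[j] = cluster  # border point: joins cluster, not expanded
            if labels[j] is not None:
                continue
            labels[j] = cluster
            j_nbrs = neighbours(j)
            if len(j_nbrs) >= min_pts:
                queue.extend(j_nbrs)  # core point: keep expanding
    return labels


def assign_points(model, points):
    """Give each point the label of its nearest clustered sample point
    within eps, or -1 if there is none (a simplified assignment rule)."""
    out = []
    for p in points:
        best, best_d = -1, model["eps"]
        for q, lab in zip(model["sample"], model["labels"]):
            if lab == -1:
                continue
            d = math.dist(p, q)
            if d <= best_d:
                best, best_d = lab, d
        out.append(best)
    return out


def fit_subsample(points, eps, min_pts, n_sample, assign=True, seed=0):
    """Fit on a random subsample; with assign=False the slow full
    assignment is skipped and the model is flagged as unassigned."""
    rng = random.Random(seed)
    idx = sorted(rng.sample(range(len(points)), min(n_sample, len(points))))
    sample = [points[i] for i in idx]
    model = {
        "sample": sample,
        "labels": dbscan(sample, eps, min_pts),
        "eps": eps,
        "assigned": False,  # hypothetical flag marking an unassigned model
    }
    if assign:
        model["all_labels"] = assign_points(model, points)
        model["assigned"] = True
    return model
```

With `assign=False` the fit returns quickly and the `assigned` flag tells a later step that the full dataset still needs labels, which can be produced by calling `assign_points(model, points)` when needed.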