-
Notifications
You must be signed in to change notification settings - Fork 41
Add DBSCAN updates, SNN #78
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
base: master
Are you sure you want to change the base?
Conversation
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Looks good overall. A few minor naming and code simplification suggestions.
* The minimum number of neighbors with a similarity score greater than or | ||
* equal to eps to be considered a core point. | ||
*/ | ||
private int minPts; |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor naming: Can you spell out the name a bit more to make it clearer? minPoints? Or minNeighbors? (Here and elsewhere this is used)
return false; | ||
} | ||
// Copy data into a data structure that can be indexed | ||
this.points = this.getData() instanceof List |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor: CollectionUtil.asArrayList() basically does the same thing as this line.
*/ | ||
private List<DataType> getConnectedNeighbors(DataType point) | ||
{ | ||
List<DataType> result = new ArrayList<>(); |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For performance you may want to initialize this to the expected number of neighbors (you can trim it at the end)
* | ||
* @param k The number of nearest neighbors k. | ||
*/ | ||
public void setNumNeighbors(int k) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor naming: Normally these setters/getters have a one-to-one mapping to the fields. So maybe rename "k" to numNeighbors?
* | ||
* @param eps The minimum similarity score required. | ||
*/ | ||
public void setSimilarity(int eps) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Minor naming: Given this is a minimum threshold maybe call it setMinSimilarity? Or setSimilarityThreshold? (Also change the eps to match the name)
* each point also has the distance (a double given by the semimetric | ||
* evaluate method) between that point and a reference point. | ||
*/ | ||
private class SemimetricPointComparator |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
You can probably replace this class with just using a lambda expression above and use Double.compare to make it a one-liner.
@jeremy-wendt thanks for making the pull request for me and thanks @jbasilico for your comments on my code. What's the best course of action for making these fixes? If I was still at Sandia I would have been the one to make the pull request and then making the corrections would be trivial, but given that I didn't make the pull request I'm not sure if there's a good way for me to edit the code. My knowledge of github best practices is still very limited, so just let me know if there's anything I can do to help. |
Most of these changes created internally by Melissa Bain. Credit where due.