Improve configuration object for clustering API #382
Thanks @arovir01! @nutsiepully and @alanchiao, please review this change, introduced following the discussion at the clustering RFC design review.
It'd be good to first consider this in the context of something like the feature request template here and more clearly define the motivation. This change would ultimately expose four more symbols:
So, following the template, what is the motivation for this, relative to the original string-based configuration parameter? Has there been explicit interest from others in extending this list with initialization methods that perform better for their use cases? And do you foresee other types of configuration besides initialization that would be harder to support if we maintain these new symbols under backwards-compatibility requirements? If it's just for experimentation, for now we can suggest that people fork the code and modify one of the existing initializations to use their own algorithm, to see whether it does better, until there is wider interest. There can even be a GitHub issue tracking this, so people can express interest. Would an enum be better if the interest is just in avoiding strings?
* Added enum CentroidInitialization to be used in clustering params to specify centroid initialization method (instead of a string)
* Updated serialization and unit tests accordingly
Changed to enum as requested. Is there anything else I should change?
API change itself looks good.
I'm getting another person to also take a look at this, and will formalize how many people should generally review smaller public API changes, according to the ownership RFC, going forward.
We must keep the
Done.
Got it.
The change looks good to me on its own. I don't particularly see a benefit in moving from strings to enums, since Python doesn't have much type safety. But if you prefer enums, that works just as well. Like @alanchiao said, this changes the API surface, so just ensure any clients are aware of it.
Approving. @arovir01: please take one more look, and I will merge after you @ me.
@nutsiepully I agree, it is not a huge improvement, but I prefer enums over strings even in Python. I find it a cleaner practice. @alanchiao I think we can proceed with the merge; there's no further change we wish to make to the code at this stage.
This PR alters the configuration object used in weight clustering in accordance with the RFC proposal. It contains the following changes:
CentroidInitialization to capture possible centroid initialization methods
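
For readers following the strings-versus-enum discussion, here is a minimal, self-contained sketch of what an enum-based parameter with serialization could look like. It is not the code from this PR: the member names (`LINEAR`, `RANDOM`, `DENSITY_BASED`), the helper functions `serialize_params` and `deserialize_params`, and the config keys are all assumptions made for illustration.

```python
import enum


class CentroidInitialization(enum.Enum):
    # Member names are hypothetical; the thread does not list the real ones.
    LINEAR = "LINEAR"
    RANDOM = "RANDOM"
    DENSITY_BASED = "DENSITY_BASED"


def serialize_params(number_of_clusters, cluster_centroids_init):
    """Hypothetical helper: turn clustering params into a plain dict."""
    # Reject raw strings early so a typo cannot slip through silently.
    if not isinstance(cluster_centroids_init, CentroidInitialization):
        raise TypeError(
            "cluster_centroids_init must be a CentroidInitialization member, "
            f"got {type(cluster_centroids_init).__name__}"
        )
    return {
        "number_of_clusters": number_of_clusters,
        # Store the member name so the config stays JSON-friendly.
        "cluster_centroids_init": cluster_centroids_init.name,
    }


def deserialize_params(config):
    """Hypothetical helper: rebuild the params from a serialized dict."""
    # Lookup by name raises KeyError for an unknown initialization method.
    init = CentroidInitialization[config["cluster_centroids_init"]]
    return config["number_of_clusters"], init


config = serialize_params(16, CentroidInitialization.LINEAR)
restored = deserialize_params(config)
```

With this shape, a misspelled method fails at attribute access or at deserialization rather than deep inside the clustering code, which is the main practical gain over a bare string in Python.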