-
Notifications
You must be signed in to change notification settings - Fork 128
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
merge Autogmm and GaussianCluster #306
Comments
to clarify, AutoGMM does everything that GaussianCluster does and more. The difference is, GaussianCluster has an option to run kmeans initialization multiple times, which I have found empirically can lead to better results than any of the AutoGMM initializations. If AutoGMM had a way of running multiple kmeans inits and taking the best, we could deprecate GaussianCluster |
I'll get this done this semester |
@bdpedigo I will take this one, sounds fun! : ) DoD:
This is an initial very much qualitative assesment of what the final implementation should do. I am sure it will progressively become more and more quantitative. Please let me know if you have any suggestions/ideas. |
see |
@PerifanosPrometheus: @tliu68 is working on some changes to the internals of AutoGMM (see #371). I think this will just affect the initialization. You two might want to coordinate a bit so that the merge is easy at the end! |
@tliu68 this PR from @PerifanosPrometheus will more or less add 1 or more runs of kmeans as an initialization in AutoGMM, and maybe will rename the class itself. |
Cool! @PerifanosPrometheus I'll let you know what specific changes I'm going to make soon. Thanks! |
@tliu68 Currently, I am also on the planning stage so I haven't finalized any changes yet. Please let me know if there is anything that I should particularly be careful dealing with |
@PerifanosPrometheus I've opened a draft PR #589. Please see my description on all major changes I made. My plan for next step is to try to clean up/reorganize the for loops and see if I can incorporate parallel processing. But I believe the major updates we wanted to make have been addressed. Please let me know if you have any question and all suggestion/advice welcome! |
@tliu68 @PerifanosPrometheus please coordinate |
@tliu68 I would like to continue working on this issue, however I noticed that PR #589 has been merged. I would like to touch back with you to have an update on changes merged. Mainly, I believe that fixing this issue will involve modifying the param_grid dict in the previous implementation, has that been changed? Lastly, I heard that you were planning on merging the function with sklearn(not entirely sure this info is correct) in that case what would be the timeline to make my changes(if even desirable). |
Yes, I am working on merging this into sklearn (described in issue #601). The previous PR #589 basically changed the initialization process so that instead of running AgglomerativeClustering and GaussianMixture for all combinations in the old
Does that make sense? Also, Ben and I will probably make more changes (to be discussed) but it would be great if we can start a PR into sklearn soon. Do you mind sharing your plan? |
Seems like there is enough redundant code, they are both doing something similar, and we can still get both behaviors with different options
The text was updated successfully, but these errors were encountered: