Documentation restructuring, some fixes, clustering docs #57
- [capi] offer include directories
- [cluster] mini-batch k-means now does shuffled batching over the dataset when calling fit
awesome, will check this out today!!
Some questions but nothing major.
I looked at the clustering notebook here because GitHub didn't want to render it in the PR diffs
I'm not looking at this with a copy-editing mindset, just basic stuff:
- API docs link didn't work for me
- Why does a fixed random seed make it only "quasi" reproducible?
- Do you want the notebooks to have citations? Could cite something with MSMs
- I love the visualization in block 11
- I would appreciate more informative plot titles - e.g. in the plots in blocks 8, 11, 15, 17, and 24, to distinguish them
- Can a custom metric only be defined in cpp or is python fine too?
- I like this example a lot
awesome, good to merge pending plot titles!
New titles look great! I like how you did it
Thanks! Will merge once CI gives the green light
fit()
The previous behavior was to fall back to ordinary k-means. Now it takes shuffled samples of the dataset and performs clustering on these mini-batches.
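For readers who want to see what shuffled mini-batching looks like in practice, here is a rough NumPy sketch. It is not the code from this PR: the function name `minibatch_kmeans_fit`, its parameters, the random initialisation, and the running-mean center update are all assumptions made for illustration.

```python
import numpy as np


def minibatch_kmeans_fit(data, n_clusters, batch_size, n_epochs=5, seed=0):
    """Hypothetical mini-batch k-means; illustrative only, not the PR's implementation."""
    rng = np.random.default_rng(seed)
    data = np.asarray(data, dtype=float)

    # Assumption: initialise centers from distinct, randomly chosen samples.
    init_idx = rng.choice(len(data), size=n_clusters, replace=False)
    centers = data[init_idx].copy()
    counts = np.zeros(n_clusters)

    for _ in range(n_epochs):
        # Draw a fresh shuffled ordering of the dataset for every pass.
        order = rng.permutation(len(data))
        for start in range(0, len(data), batch_size):
            batch = data[order[start:start + batch_size]]

            # Assign each sample in the batch to its nearest center.
            sq_dist = ((batch[:, None, :] - centers[None, :, :]) ** 2).sum(axis=2)
            labels = sq_dist.argmin(axis=1)

            # Move each touched center towards the mean of everything
            # assigned to it so far (incremental mean update).
            for k in np.unique(labels):
                members = batch[labels == k]
                counts[k] += len(members)
                centers[k] += (members.sum(axis=0) - len(members) * centers[k]) / counts[k]

    return centers
```

Calling `minibatch_kmeans_fit(data, n_clusters=2, batch_size=32)` on an `(n_samples, n_features)` array returns an `(n_clusters, n_features)` array of centers. With a fixed `seed`, the shuffling and initialisation in this sketch are deterministic.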