cluster center initialization #22

Open
jbfuehrer opened this issue Mar 5, 2019 · 4 comments


@jbfuehrer

Hey,

Is there a reason why the method for choosing the initial cluster centers changed from the k-means++ (kmpp) algorithm used in DBoW2 to the version now used in fbow?

I noticed that, especially with smaller vocabularies, the exact same feature is sometimes chosen multiple times as an initial cluster center. One of the duplicate clusters then always stays empty, because the linear search assigns every feature to whichever duplicate it finds first. The result is unused, meaningless words.
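
To illustrate the effect, here is a minimal sketch. It is not fbow's actual code: features are simplified to 1-D `double` values, and `nearestCenter` and `clusterSizes` are hypothetical helper names made up for this example. It only shows how a first-match linear search leaves the second of two identical centers without any members.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>
#include <vector>

// Hypothetical helper: index of the nearest center, found by linear search.
// The comparison is strict, so on a tie the first center found wins.
std::size_t nearestCenter(double x, const std::vector<double>& centers) {
    assert(!centers.empty());
    std::size_t best = 0;
    double bestDist = std::numeric_limits<double>::max();
    for (std::size_t i = 0; i < centers.size(); ++i) {
        const double d = (x - centers[i]) * (x - centers[i]);
        if (d < bestDist) {
            bestDist = d;
            best = i;
        }
    }
    return best;
}

// Hypothetical helper: counts how many features are assigned to each center.
std::vector<std::size_t> clusterSizes(const std::vector<double>& features,
                                      const std::vector<double>& centers) {
    std::vector<std::size_t> sizes(centers.size(), 0);
    for (double f : features) {
        ++sizes[nearestCenter(f, centers)];
    }
    return sizes;
}
```

With centers `{2.0, 2.0, 9.0}`, the center at index 1 duplicates index 0, so no feature can ever be strictly closer to it and its cluster stays empty.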

I ported the DBoW2 KMPP implementation over to fbow and can open a PR. I just wanted to make sure I'm not missing any domain knowledge before doing so.

Greets
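
For reference, a rough sketch of the k-means++ seeding idea follows. This is neither the DBoW2 code nor the port mentioned above. It assumes dense `double` descriptors with squared Euclidean distance, and `Point`, `squaredDistance`, and `kmeansppSeeds` are names invented for this example. Each new seed is drawn with probability proportional to its squared distance to the nearest seed chosen so far, so a feature that coincides with an existing seed has weight zero and cannot be picked again.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <limits>
#include <random>
#include <vector>

using Point = std::vector<double>;  // assumed descriptor type for this sketch

double squaredDistance(const Point& a, const Point& b) {
    assert(a.size() == b.size());
    double sum = 0.0;
    for (std::size_t i = 0; i < a.size(); ++i) {
        const double diff = a[i] - b[i];
        sum += diff * diff;
    }
    return sum;
}

// Returns the indices of up to k seeds chosen by k-means++ style sampling.
// Fewer than k indices are returned if the data has fewer than k distinct points.
std::vector<std::size_t> kmeansppSeeds(const std::vector<Point>& features,
                                       std::size_t k, std::mt19937& rng) {
    assert(k >= 1 && k <= features.size());
    std::vector<std::size_t> seeds;

    // The first seed is drawn uniformly at random.
    std::uniform_int_distribution<std::size_t> pickFirst(0, features.size() - 1);
    seeds.push_back(pickFirst(rng));

    // minDist[i]: squared distance from feature i to its nearest seed so far.
    std::vector<double> minDist(features.size(),
                                std::numeric_limits<double>::max());
    while (seeds.size() < k) {
        const Point& last = features[seeds.back()];
        double total = 0.0;
        for (std::size_t i = 0; i < features.size(); ++i) {
            minDist[i] = std::min(minDist[i], squaredDistance(features[i], last));
            total += minDist[i];
        }
        if (total <= 0.0) {
            break;  // every feature coincides with an existing seed
        }

        // Draw the next seed with probability proportional to minDist[i].
        std::uniform_real_distribution<double> draw(0.0, total);
        const double r = draw(rng);
        double cumulative = 0.0;
        std::size_t chosen = 0;
        for (std::size_t i = 0; i < features.size(); ++i) {
            cumulative += minDist[i];
            if (minDist[i] > 0.0) {
                chosen = i;  // also serves as fallback against rounding
                if (r < cumulative) {
                    break;
                }
            }
        }
        seeds.push_back(chosen);
    }
    return seeds;
}
```

Because zero-weight features are skipped, two seeds never share the same descriptor values, which avoids the duplicate-center case described above.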

@dukeNashor

Same thoughts here. The new algorithm for choosing the initial cluster centers doesn't make sense to me either.

@S-o-T

S-o-T commented May 16, 2019

@rmsalinas Can you please comment on this? Any plans to fix the issue?
@jbfuehrer Could you please at least push your implementation to your fork? It would be much appreciated.

@jbfuehrer
Author

@S-o-T Done. I also created a PR now.

@S-o-T

S-o-T commented May 16, 2019

@jbfuehrer Thank you.
