
About the hierarchical softmax #10

Closed
quanpn90 opened this issue Jan 5, 2016 · 1 comment

Comments

quanpn90 commented Jan 5, 2016

Hi,

Thanks for the great model, and happy new year.

I would like to ask about your hierarchical softmax. Is it your intention to share the words equally across the clusters, or was that choice made to simplify the implementation? I find it hard to understand how you distribute the words to clusters: did you use a normal distribution? I tried grouping words by their unigram frequencies (as in Mikolov's model), but the results were very bad.
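For illustration, here is a small sketch (hypothetical code, not taken from this repo or from Mikolov's implementation) contrasting the two assignment strategies being discussed: splitting the vocabulary into equal-count clusters by index, versus a frequency-based split in which each cluster receives roughly equal total unigram mass.

```python
import math

def equal_count_clusters(vocab):
    """Assign words to clusters of (near-)equal size, purely by index."""
    n_clusters = int(math.ceil(math.sqrt(len(vocab))))
    size = int(math.ceil(len(vocab) / n_clusters))
    return {w: i // size for i, w in enumerate(vocab)}

def equal_mass_clusters(vocab, freqs):
    """Frequency-based split: fill clusters until each holds roughly an
    equal share of the total unigram count, so frequent words end up in
    small clusters near the front."""
    total = sum(freqs[w] for w in vocab)
    n_clusters = int(math.ceil(math.sqrt(len(vocab))))
    target = total / n_clusters
    assignment, cluster, mass = {}, 0, 0.0
    for w in sorted(vocab, key=lambda w: -freqs[w]):
        assignment[w] = cluster
        mass += freqs[w]
        if mass >= target and cluster < n_clusters - 1:
            cluster += 1
            mass = 0.0
    return assignment
```

Both produce a two-level hierarchy with about sqrt(V) clusters; they differ only in how word ids map to clusters.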

Also, I guess you have tried the fbnn HSM. I tried applying it on top of the network (after the final dropout), but it gives a very large loss. Would it be possible to improve your HSM so it works better with unbalanced clusters (some with only a few words, others with many)?

Thank you,

yoonkim (Owner) commented Jan 5, 2016

It's mostly to make the implementation easier, and I found it to work surprisingly well.
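To illustrate why equal-size clusters simplify the implementation (a hypothetical sketch, not the repo's actual Torch code): with a fixed cluster size, converting between a flat word id and its (cluster, within-cluster) pair is constant-time arithmetic with no lookup tables, and the loss factors as log p(w|h) = log p(cluster|h) + log p(within|cluster, h).

```python
CLUSTER_SIZE = 100  # hypothetical; with V = 10000 this gives 100 clusters

def to_hierarchical(word_id):
    # Flat id -> (cluster id, within-cluster id), pure index arithmetic.
    return word_id // CLUSTER_SIZE, word_id % CLUSTER_SIZE

def to_flat(cluster_id, within_id):
    # Inverse mapping back to the flat vocabulary index.
    return cluster_id * CLUSTER_SIZE + within_id
```

With unequal cluster sizes you instead need per-word lookup tables and variable-width within-cluster softmaxes, which is where most of the implementation complexity comes from.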

I did also try fbnn but couldn't get it to work. I am not 100% sure why, but I think there is an issue with precision: https://groups.google.com/forum/#!searchin/torch7/HSM/torch7/Hq_KL4k69dM/D3lf0r1OAQAJ
