
Vocabulary size bug #9

Open
jalused opened this issue Jul 18, 2017 · 0 comments
jalused commented Jul 18, 2017

In the Question-Answer task, using the provided data, the vocabulary size of the provided pretrained QACNN model is 3231, but the vocabulary size computed by the "build_vocab" function is 3449.
This does not cause any error with the GPU build of TensorFlow 0.12.0, but it raises an "index out of range" error with the CPU build of TensorFlow 0.12.0.
According to tensorflow/tensorflow#5847, this happens because the CPU kernel of "embedding_lookup" raises an error when any index is out of range, whereas the GPU kernel silently returns a zero vector for those indices. It is a minor bug: deep models are usually trained on GPU, and the out-of-range indices mostly correspond to infrequent words, so the impact on model performance is small. Still, it would be wonderful to have it fixed.
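For reference, here is a minimal sketch (not taken from this repo) of the mismatch: it builds an embedding matrix with 3231 rows and looks up an id that only exists in the larger build_vocab vocabulary. It is written against the TF 1.x-style graph API via tf.compat.v1, since the original code targets TensorFlow 0.12.0; the constants are purely illustrative.

```python
import numpy as np
import tensorflow.compat.v1 as tf

tf.disable_eager_execution()

pretrained_vocab_size = 3231   # rows in the provided pretrained embedding matrix
embedding_dim = 4              # small dimension, just for the demo

embeddings = tf.constant(
    np.random.rand(pretrained_vocab_size, embedding_dim).astype(np.float32))
# 3448 is a valid id under the 3449-word vocabulary produced by build_vocab,
# but it is out of range for the 3231-row pretrained matrix.
word_ids = tf.constant([0, 100, 3448], dtype=tf.int32)
looked_up = tf.nn.embedding_lookup(embeddings, word_ids)

with tf.Session() as sess:
    try:
        print(sess.run(looked_up))
    except tf.errors.InvalidArgumentError as e:
        # The CPU kernel rejects the out-of-range index; per
        # tensorflow/tensorflow#5847, the old GPU kernel returned a zero
        # vector for such rows instead of raising.
        print("out-of-range index rejected:", e.message)
```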
Could you please provide a new pretrained QACNN model with the correct vocabulary size? Alternatively, I could open a pull request that fixes the bug in "build_vocab".
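In case it helps, here is a purely hypothetical sketch of the kind of fix I have in mind; the function name matches "build_vocab", but the signature and the <unk> handling are my assumptions, not the repo's actual code. The idea is to cap the vocabulary at the pretrained size and map everything else to a single unknown token, so "embedding_lookup" never receives an out-of-range id.

```python
from collections import Counter

def build_vocab(sentences, max_vocab_size=3231):
    """Map the most frequent tokens to ids 0..max_vocab_size-1.

    Tokens that do not fit into the pretrained QACNN vocabulary are
    collapsed onto a single <unk> id, so every id stays inside the
    3231-row embedding matrix.
    """
    counts = Counter(tok for sent in sentences for tok in sent.split())
    vocab = {"<unk>": 0}  # reserve id 0 for unknown / truncated tokens
    for tok, _ in counts.most_common(max_vocab_size - 1):
        vocab[tok] = len(vocab)
    return vocab

def to_ids(sentence, vocab):
    """Convert a sentence to ids, falling back to <unk> for rare tokens."""
    return [vocab.get(tok, vocab["<unk>"]) for tok in sentence.split()]
```

Whether <unk> should get a fresh id or reuse an existing row of the pretrained matrix depends on how the checkpoint was built, so that detail would need to follow whatever convention the pretrained model uses.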
