How can I access words from a given index? #41

Closed
neshkatrapati opened this issue Aug 27, 2015 · 4 comments

Comments

@neshkatrapati

Is reverse lookup (from a word index back to the word) supported?

@kpu
Owner

kpu commented Aug 27, 2015

No. Most use cases want to be able to recover the original word even if it's not in the vocabulary, which won't work when all unknown words are mapped to index 0. There are two options: remember the words as they go in, or load all words at the beginning. For loading all words at the beginning, see lm/enumerate_vocab.hh.
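
The first option, remembering words as they go in, might look roughly like the sketch below. It is a self-contained illustration and does not use the real library: `ToyVocab`, `Token`, and `IndexAndRemember` are hypothetical names, and the only behavior taken from the comment above is that unknown words map to index 0.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <unordered_map>
#include <vector>

// Hypothetical stand-in for the library's word index type.
using WordIndex = std::uint32_t;

// Hypothetical toy vocabulary: known words get indices 1..N,
// and every other word maps to 0 (the unknown-word index).
class ToyVocab {
 public:
  explicit ToyVocab(const std::vector<std::string> &known) {
    for (std::size_t i = 0; i < known.size(); ++i) {
      ids_[known[i]] = static_cast<WordIndex>(i + 1);
    }
  }

  WordIndex Index(const std::string &word) const {
    auto it = ids_.find(word);
    return it == ids_.end() ? 0 : it->second;
  }

 private:
  std::unordered_map<std::string, WordIndex> ids_;
};

// Keeps the original string next to its index, so the surface form
// survives even when the index collapses to 0.
struct Token {
  WordIndex index;
  std::string surface;
};

std::vector<Token> IndexAndRemember(const ToyVocab &vocab,
                                    const std::vector<std::string> &words) {
  std::vector<Token> out;
  out.reserve(words.size());
  for (const std::string &word : words) {
    out.push_back(Token{vocab.Index(word), word});
  }
  assert(out.size() == words.size());  // one token per input word
  return out;
}
```

With this layout, an out-of-vocabulary word such as "zebra" is stored with index 0 but its `surface` field still holds "zebra", which is the property a plain index-to-word table cannot provide.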

kpu closed this as completed Aug 27, 2015
@neshkatrapati
Author

Okay. Is there a simple way for me to loop through all the n-grams of a specified order, sort of like an iterator?

@kpu
Owner

kpu commented Aug 27, 2015

Yes, but only at loading time, because the original strings are not stored in memory. See lm/enumerate_vocab.hh.
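
A rough sketch of collecting strings at loading time is shown below. It assumes the loader reports each vocabulary entry through a per-word callback. `VocabCallback`, `ReverseVocab`, and `SimulateLoad` are hypothetical stand-ins written for this example, so the actual class and method signatures should be checked against lm/enumerate_vocab.hh before adapting it.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <string>
#include <vector>

// Hypothetical stand-in for the library's word index type.
using WordIndex = std::uint32_t;

// Hypothetical callback interface: the loader calls Add once per word.
class VocabCallback {
 public:
  virtual ~VocabCallback() = default;
  virtual void Add(WordIndex index, const std::string &word) = 0;
};

// Stores every (index, word) pair seen during loading so that an index
// can be mapped back to its string afterwards.
class ReverseVocab : public VocabCallback {
 public:
  void Add(WordIndex index, const std::string &word) override {
    if (index >= words_.size()) words_.resize(index + 1);
    words_[index] = word;
  }

  // Returns the stored word, or "<unk>" for an index that was never added.
  const std::string &Lookup(WordIndex index) const {
    static const std::string kUnknown = "<unk>";
    return index < words_.size() && !words_[index].empty() ? words_[index]
                                                           : kUnknown;
  }

  std::size_t Size() const { return words_.size(); }

 private:
  std::vector<std::string> words_;
};

// Simulates a model loader walking its vocabulary. Index 0 is assumed to be
// the unknown word, matching the description earlier in this thread.
void SimulateLoad(const std::vector<std::string> &vocab,
                  VocabCallback &callback) {
  assert(!vocab.empty() && vocab[0] == "<unk>");
  for (std::size_t i = 0; i < vocab.size(); ++i) {
    callback.Add(static_cast<WordIndex>(i), vocab[i]);
  }
}
```

The table only exists because the callback ran while the model was being loaded. Once loading has finished there is nothing left to enumerate, which is the restriction described above.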

@kpu
Owner

kpu commented Aug 27, 2015

You can also find the vocabulary strings, null-delimited, at the end of a binarized model file.
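
Splitting such a null-delimited block into individual strings could be sketched as follows. The function `SplitNullDelimited` is a hypothetical helper. It assumes the relevant bytes have already been read into memory, and it does not attempt to locate where that block starts inside a binary file, since the thread does not specify the offset. Whether the position of a string in the list matches its word index is also not stated here, so treat any such mapping as an assumption.

```cpp
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// Splits a byte buffer on '\0' and returns the non-empty pieces in order.
std::vector<std::string> SplitNullDelimited(const std::string &bytes) {
  std::vector<std::string> words;
  std::size_t start = 0;
  while (start < bytes.size()) {
    std::size_t end = bytes.find('\0', start);
    if (end == std::string::npos) end = bytes.size();  // no trailing terminator
    assert(end >= start);
    if (end > start) words.push_back(bytes.substr(start, end - start));
    start = end + 1;  // skip past the delimiter
  }
  return words;
}
```

The buffer is passed as a `std::string` with an explicit length rather than as a C string, because a C string would stop at the first null byte.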
