Allow for local embedding files #4

Closed
joelkuiper opened this issue Jul 11, 2018 · 4 comments

Comments

@joelkuiper

Right now, the only code path downloads pretrained embeddings from a remote source. However, I have some domain-specific embeddings in the standard gensim format. Would it be possible to support loading word embeddings directly from disk?

@alanakbik
Collaborator

Hi Joel,

yes, that is possible. If your embeddings are stored as gensim KeyedVectors, you can simply pass the path of the head file to the WordEmbeddings constructor:

embeddings = WordEmbeddings('path/to/KeyedVectors/file')

If your file is not in KeyedVectors format, you can use gensim to create one from a plain-text embedding file in word2vec format. First, load the text file:
import gensim
vectors = gensim.models.KeyedVectors.load_word2vec_format('path/to/text/file', binary=False)

Second, save it as KeyedVectors:
vectors.save('path/to/keyed/vector', pickle_protocol=4)

Done! You can then initialize the WordEmbeddings object as above by pointing it to this file.

Does that work for you?
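
For reference, the steps in this reply can be put together into one short script. This is only a sketch under a few assumptions: the function names and any paths passed to them are hypothetical, the gensim and Flair imports are placed inside the functions so the file can be loaded without those packages installed, and the final check (embedding a sentence and reporting each token's embedding size) is an addition for sanity-checking, not something described in the thread.

```python
def convert_to_keyed_vectors(text_path, keyed_path):
    """Load a word2vec-format text file and save it as gensim KeyedVectors."""
    # Imported here so the sketch can be loaded even without gensim installed.
    from gensim.models import KeyedVectors

    vectors = KeyedVectors.load_word2vec_format(text_path, binary=False)
    vectors.save(keyed_path, pickle_protocol=4)
    return vectors


def embed_with_local_vectors(keyed_path, text):
    """Embed `text` with local vectors; return (token, embedding size) pairs."""
    # Imported here so the sketch can be loaded even without Flair installed.
    from flair.data import Sentence
    from flair.embeddings import WordEmbeddings

    embeddings = WordEmbeddings(keyed_path)
    sentence = Sentence(text)
    # Embeddings are attached to the tokens by this call.
    embeddings.embed(sentence)
    return [(token.text, token.embedding.numel()) for token in sentence]
```

Calling convert_to_keyed_vectors once and then embed_with_local_vectors on a short sentence should report a non-zero size for every token; a size of 0 would mean that token carries no embedding.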

@joelkuiper
Author

Oh I must've missed that somewhere. Cool 👍

@yuquanle

yuquanle commented Dec 8, 2018

I downloaded the GloVe vectors from https://nlp.stanford.edu/projects/glove/ and followed the steps above, but the final result is empty (like this: "tensor([])").

@alanakbik
Collaborator

Hi @yuquanle, could you give more information on how you are converting and loading the word embeddings? Could you paste your whole script? Also, which version of Flair are you using?
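
One thing that may be worth checking when converting GloVe files, although nothing in this thread confirms it as the cause of the empty tensor: the raw GloVe text files have no header line, whereas the word2vec text format starts with a line giving the vocabulary size and the vector dimension. The sketch below uses only the standard library to detect that header and to write a copy with one prepended. The function names are hypothetical, the header check is a heuristic, and the code assumes that no token in the file contains a space.

```python
def has_word2vec_header(path):
    """Return True if the first line looks like '<vocab_size> <dimension>'."""
    with open(path, encoding="utf-8") as handle:
        first = handle.readline().split()
    # Heuristic: a header consists of exactly two non-negative integers.
    return len(first) == 2 and all(part.isdigit() for part in first)


def add_word2vec_header(source_path, output_path):
    """Copy a headerless embedding file, prepending a word2vec header.

    Returns the (vocab_size, dimension) pair written to the header.
    """
    vocab_size = 0
    dimension = None
    # First pass: count the vectors and read the dimension from the first one.
    with open(source_path, encoding="utf-8") as source:
        for line in source:
            if not line.strip():
                continue  # skip blank lines
            if dimension is None:
                # Assumes the token itself contains no whitespace.
                dimension = len(line.split()) - 1
            vocab_size += 1
    if dimension is None:
        raise ValueError("no vectors found in " + str(source_path))

    # Second pass: write the header, then copy the vectors unchanged.
    with open(source_path, encoding="utf-8") as source, \
            open(output_path, "w", encoding="utf-8") as target:
        target.write(f"{vocab_size} {dimension}\n")
        for line in source:
            if line.strip():
                target.write(line)
    return vocab_size, dimension
```

The file written by add_word2vec_header is then in the layout that the load_word2vec_format call shown earlier in the thread reads with binary=False.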

3 participants