RuntimeError: Vector for token darang has 230 dimensions, but previously read vectors have 300 dimensions. All vectors must have the same number of dimensions. #57
Comments
Hi There! This code base works just fine:
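A minimal sketch of that usage, assuming the standard torchnlp word_to_vector API:

```python
from torchnlp.word_to_vector import FastText

# Downloads wiki.en.vec (~6 GB) on first use and caches it
# under .word_vectors_cache in the working directory.
vectors = FastText()

# Each token maps to a 300-dimensional embedding.
print(vectors['hello'].shape)  # torch.Size([300])
```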
You must have modified the downloaded vector file. A truncated or corrupted wiki.en.vec is the usual cause of this error (one line ends up with fewer than 300 values); delete the cached file and download it again.
Thanks for your reply! I am running into one more issue: after downloading the pre-trained embeddings, my RAM fills up while they are being loaded, and then the machine either dies or gives me a memory error. I am not an expert in NLP and have no prior experience with text; all I want to do is load pre-trained embeddings as features for the words in my dataset. I tried on two machines with the following configurations:
Machine 2: wiki.en.vec: 6.60GB [05:28, 21.4MB/s] <-- [this step finishes successfully.]
Your help is highly appreciated. Thanks.
Yup, this is a known problem. You are attempting to load all 6 gigabytes of embeddings into memory. I'd use the is_include parameter to keep only the vectors for words that actually appear in your vocabulary. There are other, more sophisticated options as well, like https://github.com/vzhong/embeddings.
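A sketch of that approach, using the is_include keyword accepted by torchnlp's pretrained vector classes (the vocabulary here is a made-up example):

```python
from torchnlp.word_to_vector import FastText

# Hypothetical vocabulary built from your own dataset.
vocab = {'the', 'quick', 'brown', 'fox'}

# is_include is called for each token while the .vec file is streamed
# from disk, so only the matching vectors are ever held in memory.
vectors = FastText(is_include=lambda token: token in vocab)
```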
Ah, I see. Thank you, I will try these solutions.
Expected Behavior
Load FastText vectors
Environment:
Ubuntu 16.04
Python 3.6.4
PyTorch 0.4.1
Actual Behavior
Throws the following error:
File "", line 1, in
File "/home/zxi/.local/lib/python3.6/site-packages/torchnlp/word_to_vector/fast_text.py", line 83, in init
super(FastText, self).init(name, url=url, **kwargs)
File "/home/zxi/.local/lib/python3.6/site-packages/torchnlp/word_to_vector/pretrained_word_vectors.py", line 72, in init
self.cache(name, cache, url=url)
File "/home/zxi/.local/lib/python3.6/site-packages/torchnlp/word_to_vector/pretrained_word_vectors.py", line 153, in cache
word, len(entries), dim))
RuntimeError: Vector for token darang has 230 dimensions, but previously read vectors have 300 dimensions. All vectors must have the same number of dimensions.
Steps to Reproduce the Problem
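A minimal reproduction consistent with the traceback above (a sketch; the exact invocation is inferred from the stack trace, which shows FastText being constructed from line 1 of an interactive session):

```python
from torchnlp.word_to_vector import FastText

# Raises the RuntimeError above while parsing the cached wiki.en.vec
# if the downloaded file is truncated or otherwise corrupted.
vectors = FastText()
```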