tokenize error #8
Comments
I am experiencing this problem as well, exactly the same.
This seems like a spacy error -- have y'all tried downloading the vocab files that accompany spacy?
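For spaCy releases of that era (pre-1.x, which fetched model data via sputnik), the English vocab files were downloaded with the module's download helper. A sketch of the likely command, assuming one of those old versions:

```shell
# Download the English model data for old (sputnik-based) spaCy releases.
# Newer spaCy versions use `python -m spacy download en` instead.
python -m spacy.en.download all
```

After the download, the `vocab/strings.json` file that the error below complains about should exist under spaCy's `en/data` directory.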
@cemoody Yes, I did, and I also upgraded the required modules (numpy, spacy...) to the newest versions, but this error still exists.
So it looks like it's an issue with SpaCy: ...I can't reproduce this, so it's tough for me to debug. All I can really do is what Honnibal is suggesting -- try adding the
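A quick way to check whether missing model data is the problem: the traceback below shows `English(data_dir=data_dir)` failing because `vocab/strings.json` cannot be opened. A minimal diagnostic sketch (the function name is mine, not part of lda2vec or spaCy):

```python
import os

def spacy_data_present(data_dir):
    # Return True if the vocab file whose absence raises the IOError
    # below (vocab/strings.json) actually exists under data_dir.
    return os.path.isfile(os.path.join(data_dir, "vocab", "strings.json"))
```

If this returns False for spaCy's `en/data` directory, constructing `English` will fail the same way, and the model data needs to be downloaded first.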
@cemoody That finally worked for me, thanks for the help.
@longma307 Glad it helped! :)
I have been following your instruction to test lda2vec, but I got an error when I tried to run this line:
tokens, vocab = preprocess.tokenize(texts, max_length, tag=False, parse=False, entity=False)
runfile('/Users/lm/Dropbox/Athena/Feature_Reduction/WordVectors/lda2vec_test.py', wdir='/Users/m/Dropbox/Athena/Feature_Reduction/WordVectors')

Traceback (most recent call last):
  File "", line 1, in <module>
    runfile('/Users/lm/Dropbox/Athena/Feature_Reduction/WordVectors/lda2vec_test.py', wdir='/Users/lm/Dropbox/Athena/Feature_Reduction/WordVectors')
  File "/Users/lm/Documents/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 699, in runfile
    execfile(filename, namespace)
  File "/Users/lm/Documents/anaconda/lib/python2.7/site-packages/spyderlib/widgets/externalshell/sitecustomize.py", line 81, in execfile
    builtins.execfile(filename, *where)
  File "/Users/lm/Dropbox/Athena/Feature_Reduction/WordVectors/lda2vec_test.py", line 29, in <module>
    tokens, vocab = preprocess.tokenize(texts, max_length, tag=False, parse=False, entity=False)
  File "build/bdist.macosx-10.5-x86_64/egg/lda2vec/preprocess.py", line 65, in tokenize
    nlp = English(data_dir=data_dir)
  File "/Users/lm/Documents/anaconda/lib/python2.7/site-packages/spacy/language.py", line 210, in __init__
    vocab = self.default_vocab(package)
  File "/Users/lm/Documents/anaconda/lib/python2.7/site-packages/spacy/language.py", line 144, in default_vocab
    return Vocab.from_package(package, get_lex_attr=get_lex_attr)
  File "spacy/vocab.pyx", line 65, in spacy.vocab.Vocab.from_package (spacy/vocab.cpp:3592)
    with package.open(('vocab', 'strings.json')) as file_:
  File "/Users/lm/Documents/anaconda/lib/python2.7/contextlib.py", line 17, in __enter__
    return self.gen.next()
  File "/Users/lm/Documents/anaconda/lib/python2.7/site-packages/sputnik/package_stub.py", line 68, in open
    raise default(self.file_path(*path_parts))
IOError: /Users/lm/Documents/anaconda/lib/python2.7/site-packages/spacy/en/data/vocab/strings.json
I have updated my related modules (numpy, spacy...) to the newest versions, but I still get this error.