
Error when running token.py #4

Open
AaronWhite95 opened this issue May 15, 2019 · 1 comment
@AaronWhite95

When I run the bag-of-words creation step in token.py, I get the error below. What is causing it? None of the fixes on Stack Overflow have worked.
    Traceback (most recent call last):
      File "feature_extract.py", line 51, in <module>
        tokens = token.get_tokens()
      File "/home/xfbai/Entity-Relation-SVM-master/new_token.py", line 65, in get_tokens
        X_train_counts = vectorizer.fit_transform(cut_docs)
      File "/home/xfbai/anaconda3/lib/python3.6/site-packages/sklearn/feature_extraction/text.py", line 1031, in fit_transform
        self.fixed_vocabulary_)
      File "/home/xfbai/anaconda3/lib/python3.6/site-packages/sklearn/feature_extraction/text.py", line 962, in _count_vocab
        raise ValueError("empty vocabulary; perhaps the documents only"
    ValueError: empty vocabulary; perhaps the documents only contain stop words

Thanks!

@Da-Capo
Owner

Da-Capo commented Jun 4, 2019

This code is ancient history at this point [facepalm], so a version mismatch can't be ruled out. I looked into it: this error means an empty vocabulary (bag of words) was produced. Try printing the contents of cut_docs to check whether word segmentation went wrong, or check whether the arguments to CountVectorizer() are the problem; if neither helps, keep tracing up the call chain.
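For reference, here is a minimal sketch of those two checks, assuming cut_docs is a list of whitespace-joined, pre-segmented Chinese strings (the repo's actual preprocessing may differ; the cut_docs below is a made-up stand-in). With Chinese text, a frequent culprit is CountVectorizer's default token_pattern, which silently discards single-character tokens:

    from sklearn.feature_extraction.text import CountVectorizer

    # Hypothetical stand-in for the cut_docs produced by new_token.py;
    # each document is a whitespace-joined string of segmented words.
    cut_docs = ["今天 天气 很 好", "我 爱 自然语言处理"]

    # Check 1: confirm the input is non-empty before vectorizing.
    print(len(cut_docs), cut_docs[:3])
    assert any(doc.strip() for doc in cut_docs), "cut_docs is empty or blank"

    # Check 2: the default token_pattern, r"(?u)\b\w\w+\b", drops
    # one-character tokens, which Chinese segmentation produces constantly;
    # relax it so single characters survive.
    vectorizer = CountVectorizer(token_pattern=r"(?u)\b\w+\b")
    X_train_counts = vectorizer.fit_transform(cut_docs)
    print(vectorizer.get_feature_names()[:20])  # sklearn < 1.0 API, matching the traceback

If cut_docs is instead a list of token lists, passing an analyzer callable such as analyzer=lambda doc: doc (or joining each list into a single string first) is another common fix.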
