KeyError #4

Open
VuceWillis opened this issue Jun 10, 2016 · 3 comments

@VuceWillis

VuceWillis commented Jun 10, 2016

I was trying out my own implementation, using the following label_dict:
label_dict = {"good": ["pos_word"],
              "awesome": ["pos_word"],
              "great": ["pos_word"],
              "bad": ["neg_word"],
              "horrible": ["neg_word"],
              "terrible": ["neg_word"]}

Running:

# Set values for various parameters
num_features = 300    # Word vector dimensionality
min_word_count = 50   # Minimum word count
num_workers = 4       # Number of threads to run in parallel
context = 8           # Context window size
downsampling = 1e-3   # Downsample setting for frequent words

# Initialize and train the model
model = Word2Vec_Supervised(sentences, hs=0, workers=num_workers, size=num_features,
                            min_count=min_word_count, window=context, sample=downsampling,
                            label_dict=label_dict)

This gave me the following KeyError:
Exception in thread Thread-4:
Traceback (most recent call last):
  File "/home/vuk/anaconda2/lib/python2.7/threading.py", line 801, in __bootstrap_inner
    self.run()
  File "/home/vuk/anaconda2/lib/python2.7/threading.py", line 754, in run
    self.__target(*self.__args, **self.__kwargs)
  File "/home/vuk/anaconda2/lib/python2.7/site-packages/gensim-0.12.4-py2.7-linux-x86_64.egg/gensim/models/word2vec_supervised.py", line 529, in worker_train
    job_words = self._get_job_words(alpha, work, job, neu1)
  File "/home/vuk/anaconda2/lib/python2.7/site-packages/gensim-0.12.4-py2.7-linux-x86_64.egg/gensim/models/word2vec_supervised.py", line 488, in _get_job_words
    a1 = sum(train_sentence_sg_categ_nogil(self, sentence, alpha, work) for sentence in job)
  File "/home/vuk/anaconda2/lib/python2.7/site-packages/gensim-0.12.4-py2.7-linux-x86_64.egg/gensim/models/word2vec_supervised.py", line 488, in <genexpr>
    a1 = sum(train_sentence_sg_categ_nogil(self, sentence, alpha, work) for sentence in job)
  File "word2vec_inner_supervised.pyx", line 903, in word2vec_inner_supervised.train_sentence_sg_categ_nogil (./gensim/models/word2vec_inner_supervised.c:8797)
KeyError: 'pos_word'

Any idea why this would happen?

@s4sarath
Owner

Set min_count=1 and try again:

model = Word2Vec_Supervised(sentences, hs=0, workers=num_workers,
                            size=num_features, min_count=1, window=context,
                            sample=downsampling,
                            label_dict=label_dict)
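
The thread does not say why lowering min_count helps. One possible reading, which is an assumption and not confirmed here, is that tokens from label_dict (the keyed words or the labels such as 'pos_word') occur fewer than min_count times in the corpus and are therefore missing from the vocabulary when training looks them up. A minimal sketch for checking that on your own data, using only the standard library; the helper name `tokens_below_min_count` and the toy corpus are hypothetical:

```python
from collections import Counter


def tokens_below_min_count(sentences, label_dict, min_count):
    """Return label_dict tokens (keys and labels) seen fewer than min_count times.

    `sentences` is an iterable of token lists, as passed to the model.
    """
    counts = Counter(token for sentence in sentences for token in sentence)
    tokens = set(label_dict)
    for labels in label_dict.values():
        tokens.update(labels)
    # Counter returns 0 for tokens that never occur in the corpus.
    return {t: counts[t] for t in sorted(tokens) if counts[t] < min_count}


# Hypothetical toy corpus, for illustration only.
toy_sentences = [["the", "movie", "was", "good"], ["a", "terrible", "plot"]]
toy_label_dict = {"good": ["pos_word"], "terrible": ["neg_word"]}
rare = tokens_below_min_count(toy_sentences, toy_label_dict, min_count=50)
print(rare)
```

If this reports tokens for your real `sentences` and `label_dict`, that is at least consistent with the assumption above; it does not prove how Word2Vec_Supervised builds its vocabulary.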

@VuceWillis
Author

That does solve the KeyError.
However, it now prints "Inside the Cython categ Function" so often that my notebook nearly crashes (I can hardly interrupt the kernel). Maybe that print statement could be removed?

@s4sarath
Owner

Sorry for the late reply. I will comment out that print statement.
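
For anyone on a build that still has the print statement, a possible stopgap is to silence standard output around the training call. This is a sketch under the assumption that the message is written through Python's `sys.stdout`; output written directly at the C level would not be affected. The helper name `suppress_stdout` is hypothetical and written by hand so that it does not depend on the Python version:

```python
import contextlib
import os
import sys


@contextlib.contextmanager
def suppress_stdout():
    """Temporarily send Python-level stdout to os.devnull."""
    saved = sys.stdout
    devnull = open(os.devnull, "w")
    sys.stdout = devnull
    try:
        yield
    finally:
        # Always restore the original stream, even if training raises.
        sys.stdout = saved
        devnull.close()


with suppress_stdout():
    # Stand-in for the noisy training call.
    print("Inside the Cython categ Function")
print("training finished")
```

In the notebook, the `Word2Vec_Supervised(...)` call from the original post would go inside the `with suppress_stdout():` block.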
