Improve word2vec #641

Merged
merged 15 commits into master from improve-word2vec on Jul 12, 2016

3 participants
@unnonouno
Member

unnonouno commented Nov 19, 2015

  • Use multi-dimensional EmbedID
  • Fix initial parameter
  • Fix default minibatch size
  • Merge kernels in negative sampling
  • Fix default negative sampling size (=5)

On my GTX 970, throughput is about 300K words/sec with minibatch-size=5000, n_units=400, window=5, using the CBOW model with hierarchical softmax (HSM). Negative sampling is slower than HSM because its implementation is not yet tuned.

@unnonouno unnonouno changed the title from [WIP] Improve word2vec to Improve word2vec Nov 20, 2015

examples/word2vec/train_word2vec.py
- h = h + e if h is not None else e
-
+ e = model.embed(context)
+ h = F.sum(e, axis=0) * (1. / context.data.shape[0])

@okuta

okuta Nov 24, 2015

Member

How about len(context.data)?
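
For reference, here is a minimal NumPy sketch of the CBOW averaging in the diff above (the sizes are hypothetical; Chainer's EmbedID performs the same table lookup, and len(a) equals a.shape[0] for a NumPy array, which is what this suggestion relies on):

import numpy as np

# Hypothetical sizes, for illustration only.
n_vocab, n_units, window, batch = 1000, 400, 5, 3

W = np.random.randn(n_vocab, n_units).astype(np.float32)        # embedding table
context = np.random.randint(n_vocab, size=(2 * window, batch))  # context word ids

e = W[context]                               # (2*window, batch, n_units) lookup
h = e.sum(axis=0) * (1. / context.shape[0])  # average over context positions

assert h.shape == (batch, n_units)
assert len(context) == context.shape[0]      # basis of the suggestion above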

examples/word2vec/train_word2vec.py
@@ -90,23 +87,20 @@ def __init__(self, n_in, n_out):
super(SoftmaxCrossEntropyLoss, self).__init__(
W=L.Linear(n_in, n_out),
)
+ self.W.W.data[...] = 0

@okuta

okuta Nov 24, 2015

Member

How about changing the name self.W?
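
The double W comes from the L.Linear link being registered under the attribute name W, so its weight matrix ends up as self.W.W. A hedged sketch of the suggested rename (the attribute name out is an assumption, not necessarily what the PR settles on):

import chainer
import chainer.functions as F
import chainer.links as L


class SoftmaxCrossEntropyLoss(chainer.Chain):

    def __init__(self, n_in, n_out):
        super(SoftmaxCrossEntropyLoss, self).__init__(
            out=L.Linear(n_in, n_out),  # was W=..., which produced self.W.W
        )
        self.out.W.data[...] = 0  # zero-initialize, as in the diff above

    def __call__(self, x, t):
        # Assumed forward pass: project and compute the softmax loss.
        return F.softmax_cross_entropy(self.out(x), t)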

@okuta okuta self-assigned this Jan 5, 2016

'''
T f = wx;
- if (i % m == 0) {
+ if (i % n_samples == 0) {
f = -f;
}
T loss;

@okuta

okuta Jan 11, 2016

Member

How about the code below?

T loss = 0;
if (f > 0) {
    loss = f;
    f = -f;
}
loss += log1pf(__expf(f));
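
This is the standard numerically stable softplus: log(1 + e^f) = max(f, 0) + log(1 + e^(-|f|)), so peeling the positive part off first keeps the exponent non-positive and __expf cannot overflow. A quick scalar check of the same idea in NumPy (illustrative only, not part of the kernel):

import numpy as np

def softplus_naive(f):
    return np.log1p(np.exp(f))   # overflows to inf for large positive f

def softplus_stable(f):
    loss = f if f > 0 else 0.0   # peel off max(f, 0), as in the suggestion
    if f > 0:
        f = -f                   # exponent is now <= 0, so exp cannot overflow
    return loss + np.log1p(np.exp(f))

print(softplus_naive(10.0), softplus_stable(10.0))  # both ~10.0000454
print(softplus_stable(1000.0))                      # 1000.0, no overflow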
examples/word2vec/train_word2vec.py
@@ -194,7 +195,7 @@ def calculate_loss(model, dataset, position):
with open('word2vec.model', 'w') as f:
f.write('%d %d\n' % (len(index2word), args.unit))
- w = model.embed.W.data
+ w = cuda.to_cpu(model.embed.W.data)
for i in range(w.shape[0]):

@okuta

okuta Jan 11, 2016

Member

How about for i, wi in enumerate(w):?
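
With that suggestion applied, the save loop might look like the sketch below; index2word and the random weight matrix here are stand-ins for the script's real objects, and the output follows the usual word2vec text format:

import numpy as np

# Stand-ins for the script's vocabulary and trained embedding matrix.
index2word = {0: 'the', 1: 'of', 2: 'and'}
n_units = 4
w = np.random.randn(len(index2word), n_units).astype(np.float32)

with open('word2vec.model', 'w') as f:
    f.write('%d %d\n' % (len(index2word), n_units))
    for i, wi in enumerate(w):  # wi is the embedding vector of word i
        v = ' '.join(map(str, wi))
        f.write('%s %s\n' % (index2word[i], v))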

@cemoody

cemoody commented Mar 31, 2016

Contributor

Hi @unnonouno, it might be possible to merge a lot of this PR with #1045. Both change the CUDA C code for NegativeSampling; one makes it faster and the other adds support for ignoring -1 labels.

@unnonouno

unnonouno commented Apr 10, 2016

Member

Thank you very much! I'll review your PR #1045 first, and after that I'll fix this PR.

@unnonouno

unnonouno commented Jun 27, 2016

Member

I removed the fixes related to negative sampling because they conflicted. I'll make another PR for NS.

@okuta okuta added this to the v1.11.1 milestone Jul 12, 2016

@okuta

okuta commented Jul 12, 2016

Member

LGTM!

@okuta okuta merged commit 684456d into master Jul 12, 2016

5 checks passed

continuous-integration/appveyor/branch: AppVeyor build succeeded
continuous-integration/appveyor/pr: AppVeyor build succeeded
continuous-integration/travis-ci/pr: The Travis CI build passed
continuous-integration/travis-ci/push: The Travis CI build passed
coverage/coveralls: Coverage increased (+0.01%) to 95.432%

@okuta okuta deleted the improve-word2vec branch Jul 12, 2016
