
Word2vec on GPU slower than CPU #13048

Closed
manneshiva opened this issue Sep 14, 2017 · 4 comments

Comments

@manneshiva

System information

  • OS Platform and Distribution: Linux Ubuntu 16.04
  • TensorFlow installed from (source or binary): source
  • TensorFlow version (use command below): 1.3.0
  • Python version: 2.7.12
  • Bazel version (if compiling from source): 0.5.0
  • CUDA/cuDNN version: 8.0/6.0
  • GPU model and memory: NVIDIA GTX 1060 / 3GB
  • Docker used: yes
  • I picked up the code from the word2vec example in your official repo and made a few changes. The core code that trains word2vec remains the same.

Describe the problem

I have been benchmarking commonly used frameworks/libraries for unsupervised learning of word embeddings (word2vec). I am currently comparing TensorFlow (CPU/GPU), Gensim, DeepLearning4j, and the original C code on standard metrics: training time, peak memory usage, and quality of the learned vectors. Link to my github repo (still a work in progress). I ran the benchmark on the text8 corpus (I plan to run it on a much larger corpus later for the true picture), which gave me strange results:

  • TensorFlow on GPU is much slower than on CPU
  • TensorFlow is much slower than the other frameworks

Is this behavior expected? Would appreciate any inputs.
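For context, the benchmark's two headline metrics (training time and peak memory) can be measured with just the standard library. This is a generic sketch, not code from the linked repo; `train_stub` is a hypothetical stand-in for any framework's training call:

```python
import time
import resource  # Unix-only

def train_stub():
    # Hypothetical stand-in for a framework's word2vec training call.
    return sum(i * 0.5 for i in range(200_000))

start = time.perf_counter()
train_stub()
elapsed = time.perf_counter() - start

# ru_maxrss is the peak resident set size: kilobytes on Linux, bytes on macOS.
peak = resource.getrusage(resource.RUSAGE_SELF).ru_maxrss

print(f"training time: {elapsed:.3f}s, peak RSS: {peak}")
```

Wall-clock time plus peak RSS is a coarse but framework-agnostic way to compare implementations on the same corpus.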

Source code / logs

Link to tensorflow code
Link to results of sample benchmark on text8 corpus

@manneshiva manneshiva changed the title Word2vec GPU slower than Word2vec CPU Word2vec on GPU slower than CPU Sep 14, 2017

cy89 commented Sep 15, 2017

A few thoughts:

  • Our tutorial code tends to be written to maximize clarity rather than performance. It's not surprising that tutorial code wouldn't necessarily run very efficiently, on either CPU or GPU.
  • I don't see anything in the word2vec code that suggests it has been optimized to work on a GPU.
  • Embeddings, by their nature, tend to emphasize fine-grained, random memory lookups. That plays much less to the strengths of the GPU.
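The random-lookup point can be sketched with NumPy (illustrative only, not the issue's code; sizes are arbitrary): a word2vec step gathers a handful of rows at arbitrary indices from a large embedding matrix, a latency-bound access pattern, rather than the large dense matrix multiplies GPUs are built for.

```python
import numpy as np

vocab_size, dim, batch = 50_000, 128, 256
embeddings = np.random.rand(vocab_size, dim).astype(np.float32)

# A training step gathers `batch` rows at random, fine-grained indices --
# many scattered small reads, with little arithmetic per byte fetched.
idx = np.random.randint(0, vocab_size, size=batch)
batch_vecs = embeddings[idx]  # shape (256, 128)

# Contrast: the dense, contiguous matmul shape GPUs are optimized for.
dense = embeddings.T @ embeddings  # shape (128, 128)
```

The gather does almost no math per memory access, so GPU kernel-launch and transfer overhead can easily dominate; the matmul amortizes memory traffic over far more arithmetic.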

This question is better asked on StackOverflow since it is not a bug or feature request. There is also a larger community that reads questions there. Thanks!

@cy89 cy89 closed this as completed Sep 15, 2017

piskvorky commented Sep 18, 2017

@cy89 do you know of a more optimized implementation of word2vec in TensorFlow (less tutorial-ish)?

@GuoleiSun

Same problem: word2vec on CPU is 10 times faster than on GPU. Yes, it is very surprising, but that is what I got. Both CPU and GPU are slow. I wasted a lot of time studying and modifying the code.

@ticlazau

Hello,

I have the same problem on TensorFlow 1.8 running word2vec_optimized.py on a system with Volta GPUs.

Rgds,
FM
