Word2Bits benchmark #1991

menshikh-iv · 2018-03-21T16:28:17Z · edited by piskvorky

Description

A pretty interesting paper: "Word2Bits - Quantized Word Vectors" by Maximilian Lam. It looks like it is possible to apply "quantization" to the current w2v algorithm and get a memory-compact representation without sacrificing quality.
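To make the memory argument concrete, here is a minimal sketch of 1-bit quantization applied to an already trained matrix. It is only an illustration under stated assumptions: the function names are hypothetical, the vectors are random rather than trained, and the level value 1/3 is my reading of the paper and should be checked against it. As far as I understand, the paper applies the quantization function during training, not as a post-processing step as done here.

```python
import numpy as np

def quantize_1bit(vectors, level=1.0 / 3.0):
    """Map each coordinate to +level or -level by sign (zero maps to +level).

    `level` = 1/3 is an assumption taken from my reading of the paper.
    """
    return np.where(vectors >= 0, level, -level).astype(np.float32)

def pack_1bit(vectors):
    """Keep only the sign bits, 8 coordinates per byte."""
    return np.packbits(vectors >= 0, axis=1)

def unpack_1bit(packed, dim, level=1.0 / 3.0):
    """Rebuild the quantized float matrix from the packed sign bits."""
    bits = np.unpackbits(packed, axis=1)[:, :dim]
    return np.where(bits == 1, level, -level).astype(np.float32)

# Random stand-in for a trained embedding matrix: 1000 words, 200 dimensions.
rng = np.random.default_rng(0)
full = rng.standard_normal((1000, 200)).astype(np.float32)

quantized = quantize_1bit(full)
packed = pack_1bit(full)
restored = unpack_1bit(packed, dim=full.shape[1])

# float32 storage vs. packed sign bits, in bytes
print(full.nbytes, packed.nbytes)
```

With float32 input, the packed form needs 1 bit per coordinate instead of 32, which is where the memory saving would come from. Whether quality holds up is exactly what the benchmark has to show.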

ToDo

Make the needed changes in the current w2v code (following the paper), for testing only
Compare this approach with the current w2v implementation on embedding quality and memory consumption (reproduce the evaluation from the paper)
- Training corpus: English Wikipedia
- Benchmark:
  - accuracy method (the classical approach)
  - SQuAD task (described in more detail in the paper)
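For reference, the comparison could be sketched roughly as follows: score both sets of vectors on analogy questions and report the storage cost next to each score. This is a toy illustration, not a benchmark result. The helper `analogy_accuracy`, the five hand-made 3-dimensional vectors, and the sign-based quantization are all assumptions for the example; a real run would use vectors trained on Wikipedia and the standard analogy question set.

```python
import numpy as np

def analogy_accuracy(vectors, vocab, questions):
    """Fraction of questions a:b :: c:? answered correctly by cosine similarity."""
    index = {word: i for i, word in enumerate(vocab)}
    normed = vectors / np.linalg.norm(vectors, axis=1, keepdims=True)
    correct = 0
    for a, b, c, expected in questions:
        query = normed[index[b]] - normed[index[a]] + normed[index[c]]
        scores = normed @ query
        # The three question words are not allowed as answers.
        for word in (a, b, c):
            scores[index[word]] = -np.inf
        if vocab[int(np.argmax(scores))] == expected:
            correct += 1
    return correct / len(questions)

# Hand-made toy vectors (NOT trained embeddings), chosen only for illustration.
vocab = ["man", "woman", "king", "queen", "apple"]
full = np.array([
    [0.9, -0.8, -0.7],   # man
    [0.8, 0.9, -0.6],    # woman
    [0.7, -0.9, 0.8],    # king
    [0.6, 0.8, 0.9],     # queen
    [-0.9, -0.7, 0.2],   # apple
], dtype=np.float32)

# Sign-based 1-bit stand-in for quantized vectors (post-hoc, an assumption).
quantized = np.where(full >= 0, 1.0, -1.0).astype(np.float32)

questions = [("man", "woman", "king", "queen")]
acc_full = analogy_accuracy(full, vocab, questions)
acc_quantized = analogy_accuracy(quantized, vocab, questions)

# Storage in bytes: float32 matrix vs. sign bits packed 8 per byte.
bytes_full = full.nbytes
bytes_packed = np.packbits(full >= 0, axis=1).nbytes

print(acc_full, bytes_full)
print(acc_quantized, bytes_packed)
```

In the actual benchmark the analogy scoring would presumably go through gensim's own evaluation on the trained model rather than a hand-written helper like this one.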

If the benchmark shows good enough results, this will become part of Gensim.


menshikh-iv · 2018-03-22T06:31:18Z

Looks like a very good task for you, @persiyanov :)

persiyanov · 2018-03-31T18:23:36Z

I'm posting the benchmark results in the related pull request, #2011.

menshikh-iv · 2019-01-10T10:55:06Z

Fixed by #2011 (benchmark)

menshikh-iv added the testing, difficulty medium, and performance labels Mar 21, 2018

persiyanov mentioned this issue Mar 31, 2018

[WIP #1991]: Word2Bits benchmarks #2011

Closed

menshikh-iv closed this as completed Jan 10, 2019
