PV-DBOW or PV-DM? #11

jwijffels · 2020-11-10T20:47:56Z

Is this implementation the distributed bag of words ('PV-DBOW') model or the distributed memory ('PV-DM') model?

jwijffels · 2020-11-21T07:59:38Z

@hiyijian it would be great to have an answer on this.

rekola · 2021-10-17T14:59:48Z

Both models are implemented.

jwijffels · 2021-10-18T06:28:36Z

@rekola Where in the C++ code can you choose between PV-DM and PV-DBOW?

rekola · 2021-10-18T11:12:19Z

It seems my information was based on your work in https://github.com/bnosac/doc2vec/blob/master/R/paragraph2vec.R, which says:

# cbow = 0 = skip-gram                                             = PV-DBOW
# cbow = 1 = continuous bag of words including vector of paragraph = PV-DM

Is this not true?

I've been working on a fork of doc2vec to remove the word and sentence length limits. The original version also crashes if you have more than 30 million documents in your dataset.

jwijffels · 2021-10-18T11:21:46Z

Yes, that is indeed my interpretation:

# cbow = 0 = skip-gram                                             = PV-DBOW
# cbow = 1 = continuous bag of words including vector of paragraph = PV-DM

I would prefer to have this validated by @hiyijian. In the R wrapper I call https://github.com/bnosac/doc2vec/blob/3e947562a0a69e11eb292283116a4fdc9cf5c0f4/src/rcpp_doc2vec.cpp#L14, which calls the train functionality from this repository at https://github.com/hiyijian/doc2vec/blob/master/cpp/Doc2Vec.cpp#L65, and the wrapper relies on the above assumption.
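
To make the distinction concrete, here is a minimal C++ sketch of which inputs predict which word under each model, assuming the `cbow` interpretation above holds. It follows the general description of PV-DM and PV-DBOW, not the code in this repository: `Example`, `model_name`, and `make_examples` are hypothetical names, and the symmetric context window is an assumption.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <string>
#include <vector>

// One training example: the inputs whose vectors are combined,
// and the word they are used to predict.
struct Example {
    std::vector<std::string> inputs;
    std::string target;
};

// Hypothetical helper: maps the cbow flag to a model name under the
// interpretation discussed in this thread (not confirmed upstream).
std::string model_name(int cbow) {
    return cbow ? "PV-DM" : "PV-DBOW";
}

// Builds the (inputs -> target) examples for one document.
// cbow = 1: paragraph id plus surrounding words predict the centre word (PV-DM).
// cbow = 0: the paragraph id alone predicts each word (PV-DBOW).
std::vector<Example> make_examples(const std::string& doc_id,
                                   const std::vector<std::string>& words,
                                   int cbow, std::size_t window) {
    assert(cbow == 0 || cbow == 1);
    std::vector<Example> out;
    for (std::size_t i = 0; i < words.size(); ++i) {
        Example ex;
        ex.inputs.push_back(doc_id);  // the paragraph vector is always an input
        if (cbow) {
            // Assumed symmetric window around position i, clipped to the document.
            std::size_t lo = i >= window ? i - window : 0;
            std::size_t hi = std::min(words.size(), i + window + 1);
            for (std::size_t j = lo; j < hi; ++j) {
                if (j != i) ex.inputs.push_back(words[j]);
            }
        }
        ex.target = words[i];
        out.push_back(ex);
    }
    return out;
}
```

For the document `{"the", "cat", "sat"}` with a window of 1, the `cbow = 1` example for "cat" has the inputs `doc_0`, "the", and "sat", while every `cbow = 0` example has `doc_0` as its only input.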
