README

This is a TensorFlow version of the Enhanced Paragraph Vector. The code is for the training of paragraph vector, so you still need to manually compute the P(w|d) and combine them with the language modeling approach if necessary. Please cite the following paper if you use this implementation for your research.

[1] Qingyao Ai, Liu Yang, Jiafeng Guo, and W. Bruce Croft. 2016. Analysis of the Paragraph Vector Model for Information Retrieval. In Proceedings of the 2016 ACM International Conference on the Theory of Information Retrieval (ICTIR '16). ACM, New York, NY, USA, 133-142. DOI: https://doi.org/10.1145/2970398.2970409
[2] Qingyao Ai, Liu Yang, Jiafeng Guo, and W. Bruce Croft. 2016. Improving Language Estimation with the Paragraph Vector Model for Ad-hoc Retrieval. In Proceedings of the 39th International ACM SIGIR conference on Research and Development in Information Retrieval (SIGIR '16). ACM, New York, NY, USA, 869-872. DOI: https://doi.org/10.1145/2911451.2914688

Input data:

vocab.txt.gz: A vocabulary file with one word per line.
train.txt.gz: The train data file with a document per line, and the document text is represented using indices of words as they appear in the vocabulary file.

Training and testing:

decode=False: train models from the training data.
decode=True:

  test_mode='output_embedding': output the model embeddings.

  test_mode='get_doc_softmax_norm': compute the softmax denominators for each document.

In the paper, The final $P_{pv}(w|d)$ is the $exp(w*d)$ divided by the $d$'s denominator.

Name		Name	Last commit message	Last commit date
Latest commit History 2 Commits
ParagraphVector.py		ParagraphVector.py
README.md		README.md
data_util.py		data_util.py
main.py		main.py

Provide feedback

Saved searches

Use saved searches to filter your results more quickly

Repository files navigation

README

Input data:

Training and testing:

About

Releases

Packages

Languages

QingyaoAi/Enhanced-Paragraph-Vector-Model-for-Ad-hoc-Retrieval

Folders and files

Latest commit

History

Repository files navigation

README

Input data:

Training and testing:

About

Resources

Stars

Watchers

Forks

Releases

Packages 0

Languages

Packages