
MDPRank

One of the central issues in learning to rank for information retrieval is to develop algorithms that construct ranking models by directly optimizing evaluation measures such as normalized discounted cumulative gain (NDCG). Existing methods usually focus on optimizing a specific evaluation measure calculated at a fixed position, e.g., NDCG at position K. In information retrieval, however, the evaluation measures, including the widely used NDCG and P@K, are designed to evaluate the document ranking at all ranking positions, which provides much richer information than measuring the ranking at a single position. Thus, it is interesting to ask whether we can devise an algorithm that leverages the measures calculated at all ranking positions to learn a better ranking model. In this paper, we propose a novel learning to rank model based on the Markov decision process (MDP), referred to as MDPRank. In the learning phase of MDPRank, the construction of a document ranking is formulated as sequential decision making, where each step corresponds to an action of selecting a document for the corresponding position. The REINFORCE policy gradient algorithm is adopted to train the model parameters. The evaluation measures calculated at every ranking position are used as immediate rewards for the corresponding actions, guiding the learning algorithm to adjust the model parameters so that the measures are optimized. Experimental results on LETOR benchmark datasets show that MDPRank can outperform the state-of-the-art baselines.

Reinforcement Learning to Rank with Markov Decision Process (SIGIR-2017)
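Below is a minimal, hypothetical sketch of the kind of training loop the abstract describes: a linear scoring function, a softmax policy over the still-unranked documents, per-position DCG gains as immediate rewards, and a REINFORCE policy-gradient update. Names such as `train_query` and `dcg_gain` are illustrative assumptions and do not refer to code in this repository.

```python
# Minimal sketch (not this repository's code) of an MDPRank-style training step:
# linear scores, softmax policy over remaining candidates, DCG gain per position
# as the immediate reward, REINFORCE update weighted by the return from each step.
import numpy as np

def dcg_gain(label, position):
    """DCG contribution of a document with relevance `label` placed at `position` (0-based)."""
    return (2.0 ** label - 1.0) / np.log2(position + 2.0)

def train_query(w, features, labels, lr=1e-3, rng=np.random):
    """Run one episode on a single query and apply a REINFORCE update to `w`.

    features: (n_docs, n_features) array; labels: (n_docs,) relevance grades.
    Each step samples a document for the next rank position from a softmax
    policy over the still-unranked documents and receives that document's
    DCG gain at the position as the immediate reward.
    """
    remaining = list(range(len(labels)))
    log_grads, rewards = [], []

    for pos in range(len(labels)):
        cand = np.array(remaining)
        scores = features[cand] @ w
        probs = np.exp(scores - scores.max())
        probs /= probs.sum()

        idx = rng.choice(len(cand), p=probs)          # action: pick a document
        doc = int(cand[idx])
        remaining.remove(doc)

        # gradient of log pi(action | state) for a softmax over linear scores
        log_grads.append(features[doc] - probs @ features[cand])
        # immediate reward: the selected document's contribution to DCG at `pos`
        rewards.append(dcg_gain(labels[doc], pos))

    # REINFORCE: weight each step's log-policy gradient by the return from that step onward
    returns = np.cumsum(rewards[::-1])[::-1]
    for g, G in zip(log_grads, returns):
        w = w + lr * G * g
    return w

# Toy usage: a query with 5 candidate documents and 10 features per document.
w = np.zeros(10)
X = np.random.randn(5, 10)
y = np.array([2, 0, 1, 0, 3])
for _ in range(200):
    w = train_query(w, X, y)
```

Because the immediate reward at each position is that position's DCG gain, the return from step t aggregates the measure over all positions from t onward, which is how the evaluation measure at every ranking position, not just a single cutoff, guides the parameter update.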

