Popularity Ranker

Popularity Ranker re-ranks the results obtained via the TF-IDF Ranker <tfidf_ranking> using information about the number of article views. Wikipedia article view counts are openly available through the Wikimedia REST API. Each article in our English Wikipedia database enwiki20180211 was assigned its mean number of views over the period from 2017/11/05 to 2018/11/05.
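
As a rough illustration of how such a mean could be computed, the sketch below builds a request URL for the per-article pageviews endpoint of the Wikimedia REST API and averages the `views` field of a decoded response. The helper names `pageviews_url` and `mean_views` are hypothetical, and the `all-access`, `all-agents`, and `daily` settings are assumptions: the text above does not say which access type, agent type, or granularity was used. The sample payload is hand-made and stands in for a live request.

```python
from urllib.parse import quote

# Per-article pageviews endpoint of the Wikimedia REST API.
BASE_URL = "https://wikimedia.org/api/rest_v1/metrics/pageviews/per-article"


def pageviews_url(title, start="20171105", end="20181105",
                  project="en.wikipedia", granularity="daily"):
    """Build the request URL for one article; dates are YYYYMMDD strings."""
    # Titles use underscores instead of spaces and must be URL-encoded.
    article = quote(title.replace(" ", "_"), safe="")
    # Assumption: all-access / all-agents; the source does not specify these.
    return (f"{BASE_URL}/{project}/all-access/all-agents/"
            f"{article}/{granularity}/{start}/{end}")


def mean_views(payload):
    """Average the `views` field over all items of a decoded JSON response."""
    views = [item["views"] for item in payload.get("items", [])]
    return sum(views) / len(views) if views else 0.0


# Hand-made payload in the response shape, used instead of a live request.
sample = {"items": [{"timestamp": "2017110500", "views": 120},
                    {"timestamp": "2017110600", "views": 80}]}
print(pageviews_url("Ivan Pavlov"))
print(mean_views(sample))
```

Keeping URL construction separate from averaging means the averaging step can be checked offline, without calling the API.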

Internally, Popularity Ranker uses a Logistic Regression classifier based on three features:

TF-IDF score of the article
popularity of the article
product of the two features above

The classifier is trained on the SQuAD-v1.1 training set.
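
The three-feature scoring described above can be sketched in plain Python. This is a minimal illustration and not the DeepPavlov implementation: the function names, the weights, the bias, and the candidate values are all hypothetical, and the sketch assumes the TF-IDF score and popularity are already scaled to comparable ranges, which the text does not specify.

```python
import math


def features(tfidf_score, popularity):
    """Three features: TF-IDF score, popularity, and their product."""
    return [tfidf_score, popularity, tfidf_score * popularity]


def relevance(x, weights, bias):
    """Logistic regression probability: sigmoid(w . x + b)."""
    z = sum(w * xi for w, xi in zip(weights, x)) + bias
    return 1.0 / (1.0 + math.exp(-z))


def rerank(candidates, weights, bias):
    """Sort (title, tfidf_score, popularity) tuples by predicted relevance."""
    return sorted(candidates,
                  key=lambda c: relevance(features(c[1], c[2]), weights, bias),
                  reverse=True)


# Hypothetical weights and candidate values, chosen only for illustration.
WEIGHTS, BIAS = [1.0, 0.5, 2.0], -1.0
candidates = [("A", 0.9, 0.1), ("B", 0.7, 0.8), ("C", 0.2, 0.9)]
print([title for title, _, _ in rerank(candidates, WEIGHTS, BIAS)])
```

With these made-up numbers, article "B" moves ahead of "A" even though "A" has the higher TF-IDF score, because the popularity and product terms outweigh the gap.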

Quick Start

Before using the model, make sure that all required packages are installed by running the command:

python -m deeppavlov install en_ranker_pop_enwiki20180211.json

Building the model

from deeppavlov import build_model, configs

ranker = build_model(configs.doc_retrieval.en_ranker_pop_enwiki20180211, download=True)

Inference

result = ranker(['Who is Ivan Pavlov?'])
print(result[:5])

Output

>> ['Ivan Pavlov', 'Vladimir Bekhterev', 'Classical conditioning', 'Valentin Pavlov', 'Psychology']

The text of the returned articles can then be extracted with the deeppavlov.vocabs.wiki_sqlite.WikiSQLiteVocab class.

Configuration

The default ranker config is doc_retrieval/en_ranker_pop_enwiki20180211.json

Running the Ranker

Note

About 17 GB of RAM is required.

Interacting

In interactive mode, the ranker returns the titles of the relevant documents.

Run the following to interact with the ranker:

python -m deeppavlov interact en_ranker_pop_enwiki20180211 -d

Available Data and Pretrained Models

By default, the Wikipedia article popularity data is downloaded to ~/.deeppavlov/downloads/odqa/popularities.json and the pre-trained logistic regression classifier is downloaded to ~/.deeppavlov/models/odqa/logreg_3features.joblib.
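
To confirm that both files are in place after a download, a small check along the following lines may help. It only expands the two default paths named above and reports whether each file exists; the helper name `check_downloads` is hypothetical, and the sketch makes no assumption about the contents of either file.

```python
from pathlib import Path

# Default locations named in this section.
DEFAULT_FILES = {
    "popularities": "~/.deeppavlov/downloads/odqa/popularities.json",
    "classifier": "~/.deeppavlov/models/odqa/logreg_3features.joblib",
}


def check_downloads(files=DEFAULT_FILES):
    """Map each name to (expanded path, whether the file exists)."""
    result = {}
    for name, raw in files.items():
        path = Path(raw).expanduser()  # resolve "~" to the home directory
        result[name] = (path, path.is_file())
    return result


for name, (path, present) in check_downloads().items():
    print(f"{name}: {path} ({'found' if present else 'missing'})")
```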
