Open Domain Question Answering (ODQA) is the task of finding an exact answer to any question in Wikipedia articles. Given only a question, the system outputs the best answer it can find. The default ODQA implementation takes a batch of queries as input and returns 5 answers sorted by their score.
Before using the model, make sure that all required packages are installed by running the command:
python -m deeppavlov install en_odqa_infer_wiki
Training (if you have your own data)
from deeppavlov import configs
from deeppavlov.core.commands.train import train_evaluate_model_from_config

# Fit the TF-IDF document ranker on the Wikipedia dump
train_evaluate_model_from_config(configs.doc_retrieval.en_ranker_tfidf_wiki, download=True)
# Train the reader on SQuAD data with no-answer support
train_evaluate_model_from_config(configs.squad.multi_squad_noans, download=True)
Building
from deeppavlov import configs
from deeppavlov.core.commands.infer import build_model
odqa = build_model(configs.odqa.en_odqa_infer_wiki, load_trained=True)
Inference
result = odqa(['What is the name of Darth Vader\'s son?'])
print(result)
Output:
>> ['Luke Skywalker']
There are pretrained ODQA models for English and Russian languages in :doc:`DeepPavlov </index/>`.
The architecture of ODQA skill is modular and consists of two models, a ranker and a reader. The ranker is based on DrQA [1] proposed by Facebook Research and the reader is based on R-NET [2] proposed by Microsoft Research Asia and its implementation [3] by Wenxuan Zhou.
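To give a feel for what the ranker stage does, here is a minimal TF-IDF ranker sketched in pure Python. This is illustrative only: the function `tfidf_rank` and the toy documents are hypothetical, and the actual DrQA-style ranker uses hashed unigram/bigram TF-IDF features computed over the full Wikipedia dump.

```python
import math
from collections import Counter

def tfidf_rank(query, docs, top_n=2):
    """Rank documents against a query with a simple TF-IDF score.

    Illustrative sketch only; the real DrQA-style ranker uses hashed
    bigram TF-IDF over an entire Wikipedia dump.
    """
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(docs)
    # Document frequency: in how many documents does each term occur?
    df = Counter()
    for tokens in tokenized:
        df.update(set(tokens))
    scores = []
    for i, tokens in enumerate(tokenized):
        tf = Counter(tokens)  # term frequency within this document
        score = 0.0
        for term in query.lower().split():
            if term in tf:
                idf = math.log(n_docs / df[term])
                score += tf[term] * idf
        scores.append((score, i))
    scores.sort(reverse=True)  # highest-scoring documents first
    return [docs[i] for _, i in scores[:top_n]]

docs = [
    "Luke Skywalker is the son of Darth Vader",
    "Paris is the capital of France",
    "Darth Vader served the Galactic Empire",
]
print(tfidf_rank("son of Darth Vader", docs, top_n=1))
```

The reader then runs only over the handful of documents the ranker returns, which is what makes answering against all of Wikipedia tractable.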
Note
About 24 GB of RAM is required. It is possible to run on a 16 GB machine, but then the swap size should be at least 8 GB.
ODQA ranker and ODQA reader should be trained separately. Read about training the ranker :ref:`here <ranker_training>`. Read about training the reader in our separate :ref:`reader tutorial <reader_training>`.
When interacting, the ODQA skill returns a plain answer to the user's question.
Run the following to interact with English ODQA:
python -m deeppavlov interact en_odqa_infer_wiki -d
Run the following to interact with Russian ODQA:
python -m deeppavlov interact ru_odqa_infer_wiki -d
The ODQA configs are suited for inference only. For training, use the :ref:`ranker configs <ranker_training>` and the :ref:`reader configs <reader_training>` accordingly.
Scores for the ODQA skill:

Model | Dataset | Ranker@5 F1 | Ranker@5 EM | Ranker@25 F1 | Ranker@25 EM
:config:`enwiki20180211 <odqa/en_odqa_infer_wiki.json>` | SQuAD (dev) | 35.89 | 29.21 | 39.96 | 32.64
:config:`enwiki20161221 <odqa/en_odqa_infer_enwiki20161221.json>` | SQuAD (dev) | 37.83 | 31.26 | 41.86 | 34.73
DrQA [1] enwiki20161221 | SQuAD (dev) | - | 27.1 | - | -
R3 [4] enwiki20161221 | SQuAD (dev) | 37.5 | 29.1 | - | -
EM stands for "exact-match accuracy". Metrics are computed over the top 5 and top 25 documents returned by the retrieval module.
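The two metrics can be sketched as follows. This is a simplified version: the official SQuAD evaluation script additionally strips punctuation and articles before comparing, and takes the maximum score over all ground-truth answers.

```python
from collections import Counter

def exact_match(prediction, ground_truth):
    """1.0 if the normalized prediction equals the normalized answer, else 0.0."""
    norm = lambda s: " ".join(s.lower().split())
    return float(norm(prediction) == norm(ground_truth))

def f1(prediction, ground_truth):
    """Token-level F1 between the prediction and the ground-truth answer."""
    pred_tokens = prediction.lower().split()
    gt_tokens = ground_truth.lower().split()
    # Multiset intersection counts tokens shared by both answers
    common = Counter(pred_tokens) & Counter(gt_tokens)
    num_same = sum(common.values())
    if num_same == 0:
        return 0.0
    precision = num_same / len(pred_tokens)
    recall = num_same / len(gt_tokens)
    return 2 * precision * recall / (precision + recall)

print(exact_match("Luke Skywalker", "luke skywalker"))  # 1.0
print(f1("Luke", "Luke Skywalker"))  # precision 1.0, recall 0.5 -> ~0.667
```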
[1] https://github.com/facebookresearch/DrQA/
[2] https://www.microsoft.com/en-us/research/publication/mrc/
[3] https://github.com/HKUST-KnowComp/R-Net/
[4] https://arxiv.org/abs/1709.00023