Question answering

This project implements a question answering system that uses Wikipedia as its knowledge source. Document retrieval is handled by Wikipedia's own Elasticsearch-based search engine. Reading comprehension is handled by BERT fine-tuned on SQuAD (a question answering dataset built from Wikipedia articles), courtesy of the transformers library 🥰

from answer_question import Answerer

question = "what is the population of France?"
answerer = Answerer(model_server_address="http://localhost:8080/v1/models/bert_qa_squad:predict")
ans = answerer.answer_question(question)["answer"]["answer"]  # 67 . 4 million

To learn a little more about this, check out the blog post.


Install the requirements and start Docker:

  • conda create --name question_answering python=3.8 -y && conda activate question_answering
  • pip install -r requirements.txt

Download the model as a Docker servable and boot it up:

  • docker pull camoverride/bert-squad-qa-large:v0.2
  • docker run -t --rm -p 8080:8080 camoverride/bert-squad-qa-large:v0.2

Or build it locally and then boot it up:

  • python models/ (creates the folder models/bert_qa_squad because the model is too large to store in git)
  • docker build -t camoverride/bert-squad-qa-large:v0.2 .
  • docker run -t --rm -p 8080:8080 camoverride/bert-squad-qa-large:v0.2
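
Before sending questions, it can help to confirm that the container has finished loading the model. The sketch below is not part of this repo. It assumes the REST port is the 8080 published by `docker run` and that the model name is `bert_qa_squad`, as in the predict URL used earlier, and it queries the model status endpoint of TensorFlow Serving's REST API. The helper names are made up for illustration.

```python
import json
import urllib.request

# Assumption: same host, port, and model name as the predict URL in this README.
STATUS_URL = "http://localhost:8080/v1/models/bert_qa_squad"


def is_model_available(status: dict) -> bool:
    """Return True if any served model version reports the AVAILABLE state."""
    versions = status.get("model_version_status", [])
    return any(version.get("state") == "AVAILABLE" for version in versions)


def fetch_model_status(url: str = STATUS_URL, timeout: float = 5.0) -> dict:
    """GET the model status endpoint and decode its JSON body."""
    with urllib.request.urlopen(url, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))
```

With the container running, `is_model_available(fetch_model_status())` should evaluate to True once the model has loaded; a connection error usually means the server is not up yet.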

Test it out:

  • python

If you want a closer look at how the functions all work, run the tests:

  • python -m unittest tests/

If you need to tweak the model server configuration, check out model_server_config.yaml.

Under the hood

Retrieval-based question answering systems like this one operate in two steps:

  • Find some relevant documents. This process is called document retrieval, and the software that does it is called a search engine. The function that implements this step requires an active internet connection.
  • Read through the documents and find the answer. This process is called reading comprehension and is performed by a model like BERT. The function that implements this step requires a running model server.
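
To make the retrieval step concrete, here is a minimal sketch that asks Wikipedia's search engine for candidate articles through the public MediaWiki Action API (`action=query`, `list=search`). Whether this repo calls that exact endpoint is an assumption, and the function names and the User-Agent string are hypothetical.

```python
import json
import urllib.parse
import urllib.request

API_URL = "https://en.wikipedia.org/w/api.php"


def build_search_url(question: str, limit: int = 3) -> str:
    """Build a MediaWiki full-text search URL for the given question."""
    params = {
        "action": "query",
        "list": "search",
        "srsearch": question,
        "srlimit": limit,
        "format": "json",
    }
    return API_URL + "?" + urllib.parse.urlencode(params)


def extract_titles(payload: dict) -> list:
    """Pull article titles out of a decoded search response."""
    hits = payload.get("query", {}).get("search", [])
    return [hit["title"] for hit in hits]


def search_wikipedia(question: str, limit: int = 3) -> list:
    """Run the search over the network (needs an internet connection)."""
    request = urllib.request.Request(
        build_search_url(question, limit),
        headers={"User-Agent": "qa-readme-example/0.1"},  # hypothetical client name
    )
    with urllib.request.urlopen(request, timeout=10) as response:
        return extract_titles(json.loads(response.read().decode("utf-8")))
```

The text of the returned articles would then be handed to the reading comprehension model as context.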

These two steps (plus some post-processing) are implemented in the Answerer class, which is imported from the answer_question module.
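
A BERT question answering model typically returns one start score and one end score per token, and the answer is the span whose combined score is highest. The sketch below shows that common selection step in plain Python. It is an illustration under that assumption, not the post-processing this repo actually performs, and the names `best_span` and `span_text` are hypothetical.

```python
def best_span(start_logits, end_logits, max_answer_len=30):
    """Return (start, end) token indices of the highest-scoring valid span.

    A span is valid when end >= start and it covers at most max_answer_len tokens.
    """
    best = (0, 0)
    best_score = float("-inf")
    for start, start_score in enumerate(start_logits):
        last = min(start + max_answer_len, len(end_logits))
        for end in range(start, last):
            score = start_score + end_logits[end]
            if score > best_score:
                best_score = score
                best = (start, end)
    return best


def span_text(tokens, span):
    """Join the tokens of an inclusive (start, end) span with single spaces."""
    start, end = span
    return " ".join(tokens[start:end + 1])
```

The exhaustive double loop is quadratic in the passage length, which is acceptable for passages of a few hundred tokens; capping the span length keeps implausibly long answers out.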

I built the model server from a SavedModel that I serve with TensorFlow Serving. However, the model was too big to store in this repo. A module that re-creates this model artifact is in models/
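
For readers who want to talk to the model server directly, TensorFlow Serving's REST API accepts a POST to the `:predict` URL with a JSON body in the row format `{"instances": [...]}` and answers with `{"predictions": [...]}`. The sketch below wraps that exchange. The feature names the SavedModel expects are not documented here, so the `features` dictionary is left to the caller, and the helper names are hypothetical.

```python
import json
import urllib.request

# Same address that is passed to Answerer earlier in this README.
PREDICT_URL = "http://localhost:8080/v1/models/bert_qa_squad:predict"


def build_predict_body(features: dict) -> bytes:
    """Encode one example in TensorFlow Serving's row ("instances") format."""
    return json.dumps({"instances": [features]}).encode("utf-8")


def predict(features: dict, url: str = PREDICT_URL, timeout: float = 30.0):
    """POST one example to the model server and return its first prediction."""
    request = urllib.request.Request(
        url,
        data=build_predict_body(features),
        headers={"Content-Type": "application/json"},
        method="POST",
    )
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.loads(response.read().decode("utf-8"))["predictions"][0]
```

The keys inside `features` must match the input names in the SavedModel's serving signature, which can be inspected with TensorFlow's `saved_model_cli show` command.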