# Interactive question answering with OpenVINO

This demo shows interactive question answering with OpenVINO. We use a [small BERT-large-like model](https://github.com/openvinotoolkit/open_model_zoo/tree/master/models/intel/bert-small-uncased-whole-word-masking-squad-int8-0002) that was distilled from a larger BERT-large model and quantized to INT8 on the SQuAD v1.1 training set. The model comes from [Open Model Zoo](https://github.com/openvinotoolkit/open_model_zoo/). At the bottom of this notebook, you will see live inference results from your inputs.

## Imports

In [None]:
from openvino import inference_engine as ie

## The model

### Download the model

We use `omz_downloader`, a command line tool from the `openvino-dev` package. `omz_downloader` automatically creates a directory structure and downloads the selected model. If the model is already downloaded, this step is skipped.

You can download and use any of the following models: `bert-large-uncased-whole-word-masking-squad-0001`, `bert-large-uncased-whole-word-masking-squad-int8-0001`, `bert-small-uncased-whole-word-masking-squad-0001`, `bert-small-uncased-whole-word-masking-squad-0002`, `bert-small-uncased-whole-word-masking-squad-int8-0002`. To switch, change the model name below. All of these models are already converted to OpenVINO Intermediate Representation (IR), so there is no need to use `omz_converter`.

In [None]:
# directory where model will be downloaded
base_model_dir = "model"

# desired precision
precision = "FP16-INT8"

# model name as named in Open Model Zoo
model_name = "bert-small-uncased-whole-word-masking-squad-int8-0002"

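# build the paths from base_model_dir so they stay valid if the download directory changes
model_path = f"{base_model_dir}/intel/{model_name}/{precision}/{model_name}.xml"
model_weights_path = f"{base_model_dir}/intel/{model_name}/{precision}/{model_name}.bin"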

download_command = f"omz_downloader " \
                   f"--name {model_name} " \
                   f"--precision {precision} " \
                   f"--output_dir {base_model_dir} " \
                   f"--cache_dir {base_model_dir}"
! $download_command
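
The directory layout used for the paths above can also be expressed with `pathlib`. This is a minimal sketch, assuming the `<base dir>/intel/<model name>/<precision>/` layout that this notebook uses; `ir_paths` is a hypothetical helper and not part of OpenVINO.

```python
from pathlib import PurePosixPath


def ir_paths(base_dir, model_name, precision, vendor="intel"):
    """Return the expected (.xml, .bin) paths of a downloaded IR model.

    Hypothetical helper: the layout is assumed from the paths used in this notebook.
    """
    model_dir = PurePosixPath(base_dir) / vendor / model_name / precision
    return model_dir / f"{model_name}.xml", model_dir / f"{model_name}.bin"


xml_path, bin_path = ir_paths(
    "model", "bert-small-uncased-whole-word-masking-squad-int8-0002", "FP16-INT8"
)
print(xml_path)
print(bin_path)
```

`PurePosixPath` is used here only so that the printed paths use forward slashes on every platform; `pathlib.Path` would be the usual choice when the files are actually opened.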

### Load the model

Downloaded models are stored in a fixed directory structure that indicates the vendor, model name, and precision. Only a few lines of code are required to run the model. First, we create an Inference Engine object. Then we read the network architecture and model weights from the .xml and .bin files. Finally, we load the network onto the desired device. For this model, you can choose `CPU` or `GPU`.

In [None]:
# initialize inference engine
ie_core = ie.IECore()
# read the network and corresponding weights from file
net = ie_core.read_network(model=model_path, weights=model_weights_path)
# load the model on the CPU (you can use GPU as well)
exec_net = ie_core.load_network(network=net, device_name="CPU")

# get input and output names of nodes
input_keys = list(exec_net.input_info)
output_keys = list(exec_net.outputs.keys())

Input keys are the names of the network's input nodes, and output keys are the names of its output nodes. This BERT-large-like model has four inputs and two outputs.

In [None]:
input_keys, output_keys
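
To illustrate what four inputs of a BERT-style network could look like, here is a minimal sketch that packs a question and a context into fixed-length lists. The names `input_ids`, `attention_mask`, `token_type_ids` and `position_ids` are assumptions based on the usual BERT convention, so compare them with the printed `input_keys`. `build_bert_inputs` is a hypothetical helper, and the token ids are made up.

```python
def build_bert_inputs(question_ids, context_ids, cls_id, sep_id, pad_id, max_length):
    """Pack token ids as [CLS] question [SEP] context [SEP] and pad to max_length."""
    input_ids = [cls_id] + question_ids + [sep_id] + context_ids + [sep_id]
    if len(input_ids) > max_length:
        raise ValueError("question and context do not fit into max_length")

    # segment 0 covers [CLS] + question + [SEP], segment 1 covers context + [SEP]
    token_type_ids = [0] * (len(question_ids) + 2) + [1] * (len(context_ids) + 1)
    # 1 marks real tokens, 0 marks padding
    attention_mask = [1] * len(input_ids)

    pad_length = max_length - len(input_ids)
    return {
        "input_ids": input_ids + [pad_id] * pad_length,
        "attention_mask": attention_mask + [0] * pad_length,
        "token_type_ids": token_type_ids + [0] * pad_length,
        "position_ids": list(range(max_length)),
    }


# made-up ids for illustration, not taken from the real vocabulary
inputs = build_bert_inputs([7, 8], [9, 10, 11], cls_id=1, sep_id=2, pad_id=0, max_length=10)
for name, values in inputs.items():
    print(name, values)
```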

## Processing

We need the vocabulary to convert input text into a list of tokens.

In [None]:
vocab_file_path = "data/vocab.txt"

with open(file=vocab_file_path, encoding="utf-8") as f:
    vocab = {token: idx for idx, token in enumerate(f.read().splitlines())}
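
With the vocabulary available as a token-to-index dictionary, text can be mapped to token ids. BERT vocabularies are typically used with WordPiece tokenization, which greedily matches the longest known piece and marks continuation pieces with a `##` prefix. The following is a minimal sketch under that assumption. `wordpiece_tokenize` and `toy_vocab` are hypothetical names, and this simplified version does not split punctuation.

```python
def wordpiece_tokenize(word, vocab, unk_token="[UNK]"):
    """Split one lowercase word into WordPiece tokens by greedy longest match."""
    tokens = []
    start = 0
    while start < len(word):
        end = len(word)
        piece = None
        # shrink the candidate from the right until it is found in the vocabulary
        while start < end:
            candidate = word[start:end]
            if start > 0:
                candidate = "##" + candidate  # continuation pieces carry a ## prefix
            if candidate in vocab:
                piece = candidate
                break
            end -= 1
        if piece is None:
            # no piece matches, so the whole word maps to the unknown token
            return [unk_token]
        tokens.append(piece)
        start = end
    return tokens


# toy vocabulary for illustration only; the real one comes from vocab.txt
toy_vocab = {"[UNK]": 0, "open": 1, "##vin": 2, "##o": 3, "is": 4, "fast": 5}

text = "OpenVINO is fast"
tokens = [t for word in text.lower().split() for t in wordpiece_tokenize(word, toy_vocab)]
token_ids = [toy_vocab[t] for t in tokens]
print(tokens)
print(token_ids)
```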

### Main processing function

Run question answering on a specific knowledge base (a list of websites).

In [None]:
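def run_question_answering(urls):
    """Interactively read questions about the given sources until an empty line is entered."""
    print(f"Context: {urls}", flush=True)
    while True:
        question = input()
        # an empty question ends the session
        if question == "":
            break

        # placeholder: no inference is run here yet, so the answer stays empty
        answer = ""

        print(f"Question: {question}")
        print(f"Answer: {answer}")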
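
In the function above, `answer` is left empty. For SQuAD-style models, the two outputs are commonly a start score and an end score per token, and the answer is the span that maximizes their sum. The following is a minimal sketch of that selection step, assuming this interpretation of the outputs. `best_answer_span` and the toy scores are hypothetical and not taken from the model.

```python
def best_answer_span(start_scores, end_scores, max_answer_length=5):
    """Return (start, end, score) of the highest-scoring span with start <= end."""
    best = (0, 0, float("-inf"))
    for start, start_score in enumerate(start_scores):
        # only consider ends at or after the start, within the length limit
        last_end = min(start + max_answer_length, len(end_scores))
        for end in range(start, last_end):
            score = start_score + end_scores[end]
            if score > best[2]:
                best = (start, end, score)
    return best


# toy tokens and scores for illustration only
tokens = ["openvino", "is", "a", "toolkit", "from", "intel"]
start_scores = [0.1, 0.2, 2.5, 1.0, 0.3, 0.2]
end_scores = [0.0, 0.1, 0.4, 3.0, 0.2, 1.5]

start, end, score = best_answer_span(start_scores, end_scores)
print(" ".join(tokens[start:end + 1]))
```

Limiting the span length keeps the search cheap and avoids answers that stretch across most of the context.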

## Run

Change the sources to your own to answer your questions. You can use as many sources as you want. Remember that the context (knowledge base) is built from paragraphs. If some information lies outside of paragraphs, the algorithm won't be able to find it.

In [None]:
sources = ["https://en.wikipedia.org/wiki/OpenVINO", "https://en.wikipedia.org/wiki/Intel"]

run_question_answering(sources)
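
Because the context is built from paragraphs, only text inside paragraph elements can be searched. The following is a minimal sketch of such an extraction with the standard library's `html.parser`, assuming paragraphs are marked with `<p>` tags. `ParagraphExtractor` is a hypothetical helper, and the loader actually used for the sources may differ.

```python
from html.parser import HTMLParser


class ParagraphExtractor(HTMLParser):
    """Collect the text content of every <p> element."""

    def __init__(self):
        super().__init__()
        self.paragraphs = []
        self._in_paragraph = False
        self._chunks = []

    def handle_starttag(self, tag, attrs):
        if tag == "p":
            self._in_paragraph = True
            self._chunks = []

    def handle_endtag(self, tag):
        if tag == "p" and self._in_paragraph:
            # join the collected pieces and normalize whitespace
            text = " ".join("".join(self._chunks).split())
            if text:
                self.paragraphs.append(text)
            self._in_paragraph = False

    def handle_data(self, data):
        # text outside of <p> elements is ignored
        if self._in_paragraph:
            self._chunks.append(data)


html = """
<html><body>
  <h1>OpenVINO</h1>
  <p>OpenVINO is a toolkit for <b>optimizing</b> models.</p>
  <div>This text is outside of any paragraph.</div>
  <p>It runs on CPU and GPU.</p>
</body></html>
"""

parser = ParagraphExtractor()
parser.feed(html)
print(parser.paragraphs)
```

In this sample, the heading and the `<div>` text never reach the context, which is why a question about them could not be answered.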