# Build Your First QA System

[![Open In Colab](https://colab.research.google.com/assets/colab-badge.svg)](https://colab.research.google.com/drive/11ulGMt1zZhWgjz_J2SYhHgfcM0EEtlo0?usp=sharing)


### Prepare environment

#### Colab: Enable the GPU runtime
Make sure you enable the GPU runtime to experience decent speed in this tutorial.
**Runtime -> Change Runtime type -> Hardware accelerator -> GPU**

<img src="https://raw.githubusercontent.com/deepset-ai/haystack/master/docs/img/colab_gpu_runtime.jpg">

In [None]:
# Make sure you have a GPU running
!nvidia-smi

Thu Jun  2 09:59:21 2022       
+-----------------------------------------------------------------------------+
| NVIDIA-SMI 460.32.03    Driver Version: 460.32.03    CUDA Version: 11.2     |
|-------------------------------+----------------------+----------------------+
| GPU  Name        Persistence-M| Bus-Id        Disp.A | Volatile Uncorr. ECC |
| Fan  Temp  Perf  Pwr:Usage/Cap|         Memory-Usage | GPU-Util  Compute M. |
|                               |                      |               MIG M. |
|   0  Tesla T4            Off  | 00000000:00:04.0 Off |                    0 |
| N/A   47C    P0    27W /  70W |   2858MiB / 15109MiB |      0%      Default |
|                               |                      |                  N/A |
+-------------------------------+----------------------+----------------------+
                                                                               
+-----------------------------------------------------------------------------+
| Proces

In [None]:
# Install the latest release of Haystack in your own environment
#! pip install farm-haystack

# Install the latest master of Haystack
!pip install --upgrade pip
!pip install git+https://github.com/deepset-ai/haystack.git#egg=farm-haystack[colab]

Looking in indexes: https://pypi.org/simple, https://us-python.pkg.dev/colab-wheels/public/simple/
Collecting farm-haystack[colab]
  Cloning https://github.com/deepset-ai/haystack.git to /tmp/pip-install-hgp4llap/farm-haystack_a5633021a4be413eab5d1cce473ce034
  Running command git clone --filter=blob:none --quiet https://github.com/deepset-ai/haystack.git /tmp/pip-install-hgp4llap/farm-haystack_a5633021a4be413eab5d1cce473ce034
  Resolved https://github.com/deepset-ai/haystack.git to commit a617ab950b603aab27e500bc66f40654ade69b22
  Installing build dependencies ... done
  Getting requirements to build wheel ... done
  Preparing metadata (pyproject.toml) ... done

In [None]:
from haystack.utils import clean_wiki_text, convert_files_to_docs, fetch_archive_from_http, print_answers
from haystack.nodes import FARMReader, TransformersReader

## Document Store

Haystack finds answers to queries within the documents stored in a `DocumentStore`. The current implementations of `DocumentStore` include `ElasticsearchDocumentStore`, `FAISSDocumentStore`,  `SQLDocumentStore`, and `InMemoryDocumentStore`.

**Here:** We recommend Elasticsearch because it comes preloaded with features like [full-text queries](https://www.elastic.co/guide/en/elasticsearch/reference/current/full-text-queries.html), [BM25 retrieval](https://www.elastic.co/elasticon/conf/2016/sf/improved-text-scoring-with-bm25), and [vector storage for text embeddings](https://www.elastic.co/guide/en/elasticsearch/reference/7.6/dense-vector.html).

**Alternatives:** If you are unable to set up an Elasticsearch instance, follow [Tutorial 3](https://github.com/deepset-ai/haystack/blob/master/tutorials/Tutorial3_Basic_QA_Pipeline_without_Elasticsearch.ipynb), which uses SQL/InMemory document stores.

**Hint**: This tutorial creates a new document store instance with texts about the Dragon Ball series (loaded from a CSV file). However, you can configure Haystack to work with your existing document stores.

### Start an Elasticsearch server
You can start Elasticsearch on your local machine using Docker. If Docker is not readily available in your environment (e.g. in Colab notebooks), you can manually download Elasticsearch and run it from the extracted archive.

In [None]:
# Recommended: Start Elasticsearch using Docker via the Haystack utility function
from haystack.utils import launch_es

launch_es()



In [None]:
# In Colab / No Docker environments: Start Elasticsearch from source
! wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.2-linux-x86_64.tar.gz -q
! tar -xzf elasticsearch-7.9.2-linux-x86_64.tar.gz
! chown -R daemon:daemon elasticsearch-7.9.2

import os
from subprocess import Popen, PIPE, STDOUT

es_server = Popen(
    ["elasticsearch-7.9.2/bin/elasticsearch"], stdout=PIPE, stderr=STDOUT, preexec_fn=lambda: os.setuid(1)  # as daemon
)
# wait until ES has started
! sleep 30
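A fixed 30-second sleep can be too short on a slow machine and wasteful on a fast one. As a minimal sketch, the wait could instead poll the server until it responds. The helper names `es_is_up` and `wait_until` are hypothetical, and the URL assumes Elasticsearch listens on its default port 9200 on localhost without authentication.

```python
import http.client
import time
import urllib.request


def es_is_up(url="http://localhost:9200", timeout=2.0):
    """Return True if an HTTP GET to `url` succeeds with status 200."""
    try:
        with urllib.request.urlopen(url, timeout=timeout) as response:
            return response.status == 200
    except (OSError, http.client.HTTPException):
        # Connection refused, timeouts, and HTTP errors all mean "not ready yet".
        return False


def wait_until(probe, max_wait=60.0, interval=1.0):
    """Call `probe()` repeatedly until it returns True or `max_wait` seconds pass."""
    deadline = time.monotonic() + max_wait
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

With these helpers, the `! sleep 30` line could be replaced by `wait_until(es_is_up)`, which returns `False` if the server never came up within the time limit.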

In [None]:
# Connect to Elasticsearch

from haystack.document_stores import ElasticsearchDocumentStore

document_store = ElasticsearchDocumentStore(host="localhost", username="", password="", index="document")

## Preprocessing of documents

Haystack provides a customizable pipeline for:
 - converting files into texts
 - cleaning texts
 - splitting texts
 - writing them to a Document Store

In this tutorial, we load texts about the Dragon Ball series from a CSV file on Google Drive, convert each row into the dictionary format Haystack expects, and index them in Elasticsearch.
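To make the cleaning and splitting steps listed above more concrete, here is a plain-Python sketch. It is not Haystack's own preprocessing code: the function names, the `passage_index` meta key, and the fixed word-count split are assumptions chosen for illustration.

```python
import re


def clean_text(text):
    """Collapse runs of whitespace (including newlines) into single spaces."""
    return re.sub(r"\s+", " ", text).strip()


def split_into_passages(text, words_per_passage=100):
    """Split `text` into consecutive passages of at most `words_per_passage` words."""
    words = text.split()
    return [
        " ".join(words[i : i + words_per_passage])
        for i in range(0, len(words), words_per_passage)
    ]


def to_documents(text, meta, words_per_passage=100):
    """Build {'content': ..., 'meta': ...} dicts from one raw text."""
    cleaned = clean_text(text)
    return [
        # `passage_index` is a hypothetical meta key added for traceability.
        {"content": passage, "meta": dict(meta, passage_index=i)}
        for i, passage in enumerate(split_into_passages(cleaned, words_per_passage))
    ]


sample = "Goku  meets Bulma.\n They search   for the Dragon Balls together."
docs_sketch = to_documents(sample, {"title": "example"}, words_per_passage=5)
```

Shorter passages give the Reader less text to scan per candidate, at the cost of possibly cutting an answer off from its context.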

In [None]:
from google.colab import drive
drive.mount('/content/drive')

Drive already mounted at /content/drive; to attempt to forcibly remount, call drive.mount("/content/drive", force_remount=True).


In [None]:
import pandas as pd

# Let's first fetch some documents that we want to query
# Here: data about the Dragon Ball Series
doc_dir = "/content/drive/MyDrive/dragonball_chapters.csv"
data = pd.read_csv(doc_dir)

data.columns = ['title', 'chapter_number', 'content']
data_dics = data.to_dict('records')

# Haystack needs the following structure for data:
# [
#   {'content': <text>, 'meta': {'title': <title>, 'chapter_number': <chapter number>}},
#   ...
# ]

docs = []
for dic in data_dics:
  dic_transf = {'content': dic['content'], 'meta': {'title': dic['title'], 'chapter_number': dic['chapter_number']}}

  docs.append(dic_transf)

docs[0]

{'content': "The cover features Goku riding with Bulma while she is driving the Capsule #9 Motorcycle through a forest. The cover art for this chapter is also featured in the Akira Toriyama - The World artbooks and Daizenshuu 1.\n Goku in the woods A young boy by the name Goku is seen rolling a tree stump down to his house while waving to some monkeys. When he gets home Goku throws the tree stump in the air to break it into firewood. After chopping wood he greets his grandfather's artifact and heads off through the woods, hunting for his next meal. While looking through the woods he decides to jump down a cliff to hunt for a Giant Fish which he catches by skinny dipping and luring it with his tail and then kills by kicking it.\n On his way back home, he is suddenly hit by the car of a girl named Bulma. Goku first thinks Bulma and her car to be monsters and ends up destroying the car. Bulma gets angry and starts shooting at him with a gun but after finding that it is not effective she q

In [None]:
# Let's have a look at the first entry:
print(docs[0])

# Now, let's write the dicts containing documents to our DB.
document_store.write_documents(docs)

{'content': "The cover features Goku riding with Bulma while she is driving the Capsule #9 Motorcycle through a forest. The cover art for this chapter is also featured in the Akira Toriyama - The World artbooks and Daizenshuu 1.\n Goku in the woods A young boy by the name Goku is seen rolling a tree stump down to his house while waving to some monkeys. When he gets home Goku throws the tree stump in the air to break it into firewood. After chopping wood he greets his grandfather's artifact and heads off through the woods, hunting for his next meal. While looking through the woods he decides to jump down a cliff to hunt for a Giant Fish which he catches by skinny dipping and luring it with his tail and then kills by kicking it.\n On his way back home, he is suddenly hit by the car of a girl named Bulma. Goku first thinks Bulma and her car to be monsters and ends up destroying the car. Bulma gets angry and starts shooting at him with a gun but after finding that it is not effective she q

## Initialize Retriever, Reader, & Pipeline

### Retriever

Retrievers narrow down the scope for the Reader to smaller units of text where a given question could be answered.
They use simple but fast algorithms.

**Here:** We use Elasticsearch's default BM25 algorithm.

**Alternatives:**

- Customize the `BM25Retriever` with custom queries (e.g. boosting) and filters
- Use `TfidfRetriever` in combination with a SQL or InMemory Document store for simple prototyping and debugging
- Use `EmbeddingRetriever` to find candidate documents based on the similarity of embeddings (e.g. created via Sentence-BERT)
- Use `DensePassageRetriever` to use different embedding models for passage and query (see Tutorial 6)

In [None]:
from haystack.nodes import BM25Retriever

retriever = BM25Retriever(document_store=document_store)
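To give an intuition for what BM25 ranking does, here is a small pure-Python sketch of the scoring formula. It is not Elasticsearch's implementation: it uses plain whitespace tokenization, the function name `bm25_scores` is hypothetical, and the toy documents are made up. The values `k1=1.2` and `b=0.75` are commonly used defaults.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each document against the query with a basic BM25 formula."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents does each term occur?
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            if term not in term_freq:
                continue
            # Rare terms get a higher inverse document frequency.
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            tf = term_freq[term]
            # Longer-than-average documents are penalized through `b`.
            norm = tf + k1 * (1 - b + b * len(tokens) / avg_len)
            score += idf * tf * (k1 + 1) / norm
        scores.append(score)
    return scores


toy_docs = [
    "goku trains with master roshi on the island",
    "bulma builds a dragon radar",
    "master roshi teaches goku the kamehameha",
]
toy_scores = bm25_scores("who trained goku", toy_docs)
toy_ranking = sorted(range(len(toy_docs)), key=lambda i: toy_scores[i], reverse=True)
```

Only the term "goku" matches here, so the second document scores zero, and the shorter of the two matching documents ranks first. Note that "trained" does not match "trains": BM25 is purely lexical, which is one motivation for the embedding-based alternatives listed above.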

In [None]:
# Alternative: An in-memory TfidfRetriever based on Pandas dataframes for building quick prototypes with a SQLite document store.

# from haystack.nodes import TfidfRetriever
# retriever = TfidfRetriever(document_store=document_store)

### Reader

A Reader scans the texts returned by the Retriever in detail and extracts the k best answers. Readers are based
on powerful but slower deep learning models.

Haystack currently supports Readers based on the frameworks FARM and Transformers.
With both you can either load a local model or one from Hugging Face's model hub (https://huggingface.co/models).
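As a rough illustration of how an extractive reader can pick the k best answers, the sketch below scores every candidate token span by adding a start score and an end score, then keeps the highest-scoring spans. This is a simplified assumption about the general technique, not the FARM or Transformers code, and all scores are made-up values; a real model produces them per token.

```python
import heapq


def top_k_spans(start_scores, end_scores, k=3, max_answer_len=10):
    """Return the k highest-scoring (score, start, end) token spans."""
    candidates = []
    for start, s_score in enumerate(start_scores):
        # Only consider spans of at most `max_answer_len` tokens.
        last = min(start + max_answer_len, len(end_scores))
        for end in range(start, last):
            candidates.append((s_score + end_scores[end], start, end))
    return heapq.nlargest(k, candidates)


tokens = ["Goku", "was", "trained", "by", "Master", "Roshi", "."]
start_scores = [0.1, 0.0, 0.2, 0.1, 2.5, 0.4, 0.0]  # made-up values
end_scores = [0.2, 0.0, 0.1, 0.0, 0.5, 3.0, 0.1]  # made-up values

for score, start, end in top_k_spans(start_scores, end_scores, k=2):
    print(round(score, 2), " ".join(tokens[start : end + 1]))
```

With these toy scores the best span is "Master Roshi", followed by "Roshi" alone, which shows why the lower-ranked answers of a reader often overlap with the top one.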

**Here:** a medium-sized RoBERTa QA model using a Reader based on FARM (https://huggingface.co/deepset/roberta-base-squad2)

**Alternatives (Reader):** TransformersReader (leveraging the `pipeline` of the Transformers package)

**Alternatives (Models):** e.g. "distilbert-base-uncased-distilled-squad" (fast) or "deepset/bert-large-uncased-whole-word-masking-squad2" (good accuracy)

**Hint:** You can adjust how readily the model returns "no answer possible" with the `no_ans_boost` parameter. Higher values make the model prefer "no answer possible".
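The effect of such a boost can be sketched as follows, assuming it is simply added to the score of the "no answer" option before that score is compared with the best span. The function `choose_answer` and all scores are hypothetical, and the reader's actual internals may differ.

```python
def choose_answer(best_span, best_span_score, no_answer_score, no_ans_boost=0.0):
    """Return the span text, or None when the boosted no-answer score wins."""
    if no_answer_score + no_ans_boost > best_span_score:
        return None
    return best_span


# Made-up scores for illustration only.
for boost in (-2.0, 0.0, 2.0):
    print(boost, choose_answer("Master Roshi", 4.0, 3.0, no_ans_boost=boost))
```

In this toy setup only the boost of `2.0` tips the decision to "no answer"; negative values push the model toward always returning a span.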

#### FARMReader

In [None]:
from haystack.nodes import BM25Retriever

retriever = BM25Retriever(document_store=document_store)

# Load a local model or any of the QA models on
# Hugging Face's model hub (https://huggingface.co/models)

reader = FARMReader(model_name_or_path="deepset/roberta-base-squad2", use_gpu=True)

INFO - haystack.modeling.utils -  Using devices: CUDA:0
INFO - haystack.modeling.utils -  Number of GPUs: 1
INFO - haystack.modeling.model.language_model -  LOADING MODEL
INFO - haystack.modeling.model.language_model -  Could not find deepset/roberta-base-squad2 locally.
INFO - haystack.modeling.model.language_model -  Looking on Transformers Model Hub (in local cache and online)...
INFO - haystack.modeling.model.language_model -  Loaded deepset/roberta-base-squad2
INFO - haystack.modeling.utils -  Using devices: CUDA
INFO - haystack.modeling.utils -  Number of GPUs: 1
INFO - haystack.modeling.infer -  Got ya 2 parallel workers to do inference ...
INFO - haystack.modeling.infer -   0     0  
INFO - haystack.modeling.infer -  /w\   /w\ 
INFO - haystack.modeling.infer -  /'\   / \ 


#### TransformersReader

In [None]:
# Alternative:
# reader = TransformersReader(model_name_or_path="distilbert-base-uncased-distilled-squad", tokenizer="distilbert-base-uncased", use_gpu=-1)

### Pipeline

With a Haystack `Pipeline`, you can combine your building blocks into a search pipeline.
Under the hood, `Pipelines` are Directed Acyclic Graphs (DAGs) that you can easily customize for your own use cases.
To speed things up, Haystack also comes with a few predefined Pipelines. One of them is the `ExtractiveQAPipeline` that combines a retriever and a reader to answer our questions.
You can learn more about `Pipelines` in the [docs](https://haystack.deepset.ai/docs/latest/pipelinesmd).
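The idea of chaining nodes and routing per-node parameters by name can be sketched in a few lines. `ToyPipeline`, `toy_retriever`, and `toy_reader` are hypothetical stand-ins written for this illustration; they are not Haystack's API, and a linear chain is only the simplest case of a DAG.

```python
class ToyPipeline:
    """A linear chain of named nodes; a simplified stand-in for a DAG pipeline."""

    def __init__(self):
        self.nodes = []  # list of (name, callable) in execution order

    def add_node(self, name, component):
        self.nodes.append((name, component))

    def run(self, query, params=None):
        params = params or {}
        data = {"query": query}
        for name, component in self.nodes:
            # Each node sees the accumulated data plus its own parameters.
            data.update(component(data, **params.get(name, {})))
        return data


def toy_retriever(data, top_k=2):
    # Hypothetical three-sentence corpus, ranked by word overlap with the query.
    corpus = ["Master Roshi trained Goku.", "Bulma built the radar.", "Goku has a tail."]
    query_words = set(data["query"].lower().split())

    def overlap(doc):
        return len(query_words & set(doc.lower().rstrip(".").split()))

    return {"documents": sorted(corpus, key=overlap, reverse=True)[:top_k]}


def toy_reader(data, top_k=1):
    # Stand-in for a neural reader: keep the words that are not part of the query.
    query_words = set(data["query"].lower().split())
    answers = []
    for doc in data["documents"][:top_k]:
        kept = [w for w in doc.rstrip(".").split() if w.lower() not in query_words]
        answers.append(" ".join(kept))
    return {"answers": answers}


toy_pipe = ToyPipeline()
toy_pipe.add_node("Retriever", toy_retriever)
toy_pipe.add_node("Reader", toy_reader)
toy_result = toy_pipe.run(
    "who trained goku", params={"Retriever": {"top_k": 2}, "Reader": {"top_k": 1}}
)
```

The `params` dictionary mirrors the shape used with `pipe.run` in this tutorial: each top-level key names a node, and its value holds that node's keyword arguments.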

In [None]:
from haystack.pipelines import ExtractiveQAPipeline

pipe = ExtractiveQAPipeline(reader, retriever)

## Voilà! Ask a question!

In [None]:
# You can configure how many candidates the Retriever and the Reader return.
# The higher the Retriever's top_k, the better (but also the slower) your answers.
#prediction = pipe.run(
#    query="Who gathered the Dragon Balls?", params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
#)
#prediction = pipe.run(
#    query="Who kills Raditz?", params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
#)
prediction = pipe.run(
    query="Who was Goku's mentor?", params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
)



In [None]:
prediction['answers'][0].answer

'Master Roshi'

In [None]:
# Now you can either print the object directly...
from pprint import pprint
prediction = pipe.run(
    query="Who was Goku's mentor?", params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
)
pprint(prediction)


{'answers': [<Answer {'answer': 'Master Roshi', 'type': 'extractive', 'score': 0.8219457268714905, 'context': 'Goku fires his Super Kamehameha Goku is charging a Super Kamehameha. Master Roshi then yells out at Goku to wait; if he kills Piccolo, Kami will also ', 'offsets_in_document': [{'start': 161, 'end': 173}], 'offsets_in_context': [{'start': 69, 'end': 81}], 'document_id': 'b22bc24d156b626cd75ea201c775e738', 'meta': {'title': 'The Super Kamehameha', 'chapter_number': '185'}}>,
             <Answer {'answer': 'King Kai', 'type': 'extractive', 'score': 0.7411094009876251, 'context': "'s office in the Other World, with his mentor for the past few months, King Kai amazed at Goku's progress and power. Teleporting to the office, Kami o", 'offsets_in_document': [{'start': 2303, 'end': 2311}], 'offsets_in_context': [{'start': 71, 'end': 79}], 'document_id': 'a7753c674e5421a87f3b690408e11dc0', 'meta': {'title': 'Back From the Other Side', 'chapter_number': '220 (DBZ 26)'}}>,
             




In [None]:
def get_answers(input_question):
    """Run the QA pipeline for one question and print each answer with its metadata."""
    prediction = pipe.run(
        query=input_question, params={"Retriever": {"top_k": 10}, "Reader": {"top_k": 5}}
    )

    print('\n\n', input_question)

    for model_answer in prediction['answers']:
        # Keep only the fields we care about: answer text, score, and source chapter.
        answers_dic = {
            'answer': model_answer.answer,
            'score': model_answer.score,
            'chapter_number': model_answer.meta['chapter_number'],
            'title': model_answer.meta['title'],
        }
        print(answers_dic)

In [None]:
#prediction = pipe.run(query="Who created Buu?", params={"Reader": {"top_k": 5}})
#prediction = pipe.run(query="Who is Vegeta married to?", params={"Reader": {"top_k": 5}})
get_answers('Who is Vegeta married to?')




 Who is Vegeta married to?
{'answer': 'Bulma', 'score': 0.6816011667251587, 'chapter_number': 'Series', 'title': 'Dragon Ball Super '}
{'answer': 'Chi-Chi and Bulma', 'score': 0.4590454697608948, 'chapter_number': '517 (DBZ 323)', 'title': 'A Happy Ending... And Then...'}
{'answer': 'Midori', 'score': 0.269614540040493, 'chapter_number': 'Series', 'title': 'Dr. Slump '}
{'answer': 'Android 18', 'score': 0.19324230402708054, 'chapter_number': '353 (DBZ 159)', 'title': 'Vegeta vs. Android #18, Round Two'}
{'answer': 'Goku', 'score': 0.188460111618042, 'chapter_number': '353 (DBZ 159)', 'title': 'Vegeta vs. Android #18, Round Two'}





In [None]:
pprint(prediction)

In [None]:
get_answers("Who created the Dragon Ball manga?")




 Who created the Dragon Ball manga?
{'answer': 'Akira Toriyama', 'score': 0.9799302220344543, 'chapter_number': 'Series', 'title': 'Dragon Ball Z '}
{'answer': 'Akira Toriyama', 'score': 0.9630944132804871, 'chapter_number': 'Series', 'title': 'Dragon Ball Z '}
{'answer': 'Akira Toriyama', 'score': 0.9487966299057007, 'chapter_number': 'Series', 'title': 'Dragon Ball '}
{'answer': 'Toei', 'score': 0.7679942846298218, 'chapter_number': 'Series', 'title': 'Dragon Ball Super '}
{'answer': 'Akira Toriyama', 'score': 0.7599756717681885, 'chapter_number': 'Series', 'title': 'Dragon Ball '}
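The output above contains the same answer string several times because it was extracted from different documents. One possible post-processing step, sketched here with a hypothetical helper `merge_duplicate_answers`, keeps only the highest-scoring entry per answer string. The input mirrors the dictionaries printed above (with `chapter_number` omitted for brevity).

```python
def merge_duplicate_answers(answers):
    """Keep one entry per answer string: the one with the highest score."""
    best = {}
    for entry in answers:
        # Compare answers case-insensitively and ignore surrounding whitespace.
        key = entry["answer"].strip().lower()
        if key not in best or entry["score"] > best[key]["score"]:
            best[key] = entry
    return sorted(best.values(), key=lambda entry: entry["score"], reverse=True)


raw_answers = [
    {"answer": "Akira Toriyama", "score": 0.9799302220344543, "title": "Dragon Ball Z "},
    {"answer": "Akira Toriyama", "score": 0.9630944132804871, "title": "Dragon Ball Z "},
    {"answer": "Akira Toriyama", "score": 0.9487966299057007, "title": "Dragon Ball "},
    {"answer": "Toei", "score": 0.7679942846298218, "title": "Dragon Ball Super "},
    {"answer": "Akira Toriyama", "score": 0.7599756717681885, "title": "Dragon Ball "},
]
merged_answers = merge_duplicate_answers(raw_answers)
```

This leaves two distinct candidates, "Akira Toriyama" and "Toei". Exact string matching would not merge variants such as "Gohan" and "Grandpa Gohan"; that would need fuzzier matching.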





In [None]:
get_answers("How many episodes does Dragon Ball Z have?")




 How many episodes does Dragon Ball Z have?
{'answer': '39', 'score': 0.8230448067188263, 'chapter_number': 'Series', 'title': 'Dragon Ball Z '}
{'answer': '53', 'score': 0.8091394603252411, 'chapter_number': 'Series', 'title': 'Dragon Ball Z '}
{'answer': '64', 'score': 0.7986690104007721, 'chapter_number': 'Series', 'title': 'Dragon Ball GT '}
{'answer': '39', 'score': 0.7823076844215393, 'chapter_number': 'Series', 'title': 'Dragon Ball Super '}
{'answer': '288 and 289', 'score': 0.7809144556522369, 'chapter_number': 'Series', 'title': 'Dragon Ball Super '}





In [None]:

get_answers("Who is Goku's grandpa?")




 Who is Goku's grandpa?
{'answer': 'Gohan', 'score': 0.9100764691829681, 'chapter_number': '12', 'title': "In Search of Kame-Sen'nin"}
{'answer': 'Gohan', 'score': 0.9060605764389038, 'chapter_number': '21', 'title': 'Full Moon'}
{'answer': 'Gohan', 'score': 0.855204164981842, 'chapter_number': '106', 'title': 'Strong vs. Strong'}
{'answer': 'Gohan', 'score': 0.7632024884223938, 'chapter_number': '110', 'title': 'The Pilaf Machine'}
{'answer': 'Gohan', 'score': 0.6940796673297882, 'chapter_number': '108', 'title': 'Son Gohan'}





In [None]:

get_answers("Who is Goku's grandfather?")




 Who is Goku's grandfather?
{'answer': 'Mr. Satan', 'score': 0.8372938334941864, 'chapter_number': '518 (DBZ 324)', 'title': '10 Years After'}
{'answer': 'Gohan', 'score': 0.8095550835132599, 'chapter_number': '21', 'title': 'Full Moon'}
{'answer': 'Grandpa Gohan', 'score': 0.5588533878326416, 'chapter_number': '108', 'title': 'Son Gohan'}
{'answer': 'Grandpa Gohan', 'score': 0.5538033843040466, 'chapter_number': '50', 'title': "Jackie's Shocking Secret"}
{'answer': 'Muten Rōshi', 'score': 0.534473717212677, 'chapter_number': '196 (DBZ 2)', 'title': 'Kakarrot'}





In [None]:
get_answers("What is the name of Bulma's child?")




 What is the name of Bulma's child?
{'answer': 'Trunks', 'score': 0.5983053147792816, 'chapter_number': '423 (DBZ 229)', 'title': 'A Hero Is Born!'}
{'answer': 'Chi-Chi', 'score': 0.4562258869409561, 'chapter_number': '13', 'title': 'Fanning the Flame'}
{'answer': 'Goku', 'score': 0.403253436088562, 'chapter_number': '1', 'title': 'Bloomers and the Monkey King'}
{'answer': 'Gohan', 'score': 0.21748258918523788, 'chapter_number': '197 (DBZ 3)', 'title': 'Tails of Future Not-Quite-Past'}
{'answer': 'Gohan decided to put on his new disguise, knowing that if he flies to school himself he will be faster than the Flying Nimbus and no one would know who he is. While flying, Gohan noticed a car driving recklessly on a freeway and stopped right in front of it. As the driver and his companion step out of the car and prepare to attack Gohan with a knife, they ask who Gohan was. Gohan replied by striking a pose and calling himself the Great Saiyaman', 'score': 0.21472012996673584, 'chapter_numbe




In [None]:
prediction = pipe.run(query="Who fights Gotenks?", params={"Reader": {"top_k": 5}})
pprint(prediction)

In [None]:
prediction = pipe.run(query="Who are the characters that fuse into Vegito?", params={"Reader": {"top_k": 5}})
pprint(prediction)

In [None]:
prediction = pipe.run(query="How many dragon balls are there?", params={"Reader": {"top_k": 5}})
pprint(prediction)

In [None]:
prediction = pipe.run(query="Who killed Bardock?", params={"Reader": {"top_k": 5}})
pprint(prediction)