# LAB | Abstractive Question Answering

Abstractive question-answering focuses on the generation of multi-sentence answers to open-ended questions. It usually works by searching massive document stores for relevant information and then using this information to synthetically generate answers. This notebook demonstrates how Pinecone helps you build an abstractive question-answering system. We need three main components:

- A vector index to store and run semantic search
- A retriever model for embedding context passages
- A generator model to generate answers
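
The way these three components fit together can be sketched with plain callables. This is only an illustration: `embed`, `search` and `generate` are hypothetical stand-ins for the retriever, the vector index and the generator, and the toy implementations below are not the models used in this notebook.

```python
from typing import Callable, List


def answer_question(
    question: str,
    embed: Callable[[str], List[float]],
    search: Callable[[List[float], int], List[str]],
    generate: Callable[[str], str],
    top_k: int = 3,
) -> str:
    """Retrieve-then-generate: embed the question, fetch passages, generate."""
    query_vector = embed(question)              # retriever step
    passages = search(query_vector, top_k)      # vector index step
    context = " ".join(f"<P> {p}" for p in passages)
    prompt = f"question: {question} context: {context}"
    return generate(prompt)                     # generator step


# Toy stand-ins (assumptions) so the sketch runs without any model or service.
toy_passages = ["Passage about radio.", "Passage about the moon."]
toy_embed = lambda text: [float(len(text))]
toy_search = lambda vector, k: toy_passages[:k]
toy_generate = lambda prompt: prompt.upper()

result = answer_question("who?", toy_embed, toy_search, toy_generate, top_k=1)
print(result)
```

The rest of the notebook replaces each stand-in with a real component.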

# Install Dependencies

In [27]:
import torch
from sentence_transformers import SentenceTransformer

# set device to GPU if available
device = 'cuda' if torch.cuda.is_available() else 'cpu'

# load the retriever model from the Hugging Face model hub
retriever = SentenceTransformer(
    "flax-sentence-embeddings/all_datasets_v3_mpnet-base",
    device=device
)

print("Retriever embedding dimension:", retriever.get_sentence_embedding_dimension())

Retriever embedding dimension: 768


In [2]:
!pip install -qU datasets==2.16.1 pinecone-client==3.1.0 sentence-transformers torch

[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m899.7/899.7 MB[0m [31m1.9 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m594.3/594.3 MB[0m [31m2.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m10.2/10.2 MB[0m [31m147.5 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m88.0/88.0 MB[0m [31m25.6 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m954.8/954.8 kB[0m [31m61.8 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m193.1/193.1 MB[0m [31m5.7 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m1.2/1.2 MB[0m [31m53.0 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━[0m [32m63.6/63.6 MB[0m [31m37.0 MB/s[0m eta [36m0:00:00[0m
[2K   [90m━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━

In [3]:
from google.colab import userdata
import os

# Retrieve the API keys
OPENAI_API_KEY = userdata.get('OPENAI_API_KEY')
PINECONE_API_KEY = userdata.get('PINECONE_API_KEY')

# Set as environment variables
os.environ['OPENAI_API_KEY'] = OPENAI_API_KEY
os.environ['PINECONE_API_KEY'] = PINECONE_API_KEY

print("OpenAI API key loaded and set as environment variable.")

OpenAI API key loaded and set as environment variable.


# Load and Prepare Dataset

This lab is written around the Wiki Snippets dataset, which contains over 17 million passages from Wikipedia. The code below, however, loads the SQuAD training split, whose records also carry Wikipedia context passages, and keeps only the records whose `title` contains "History". We cap the collection at 50,000 passages to keep indexing time short, although far fewer records pass the filter in this run. If you want, you can index a complete dataset; Pinecone is designed to manage millions of documents.

In [4]:
from datasets import load_dataset
# load the SQuAD dataset which contains Wikipedia contexts
# We will use streaming mode and shuffle it
wiki_data = load_dataset(
  'squad',
  split='train',
  streaming=True
).shuffle(seed=960)


We load the dataset in streaming mode so that we don't have to wait for the whole dataset to download (over 9GB in the case of Wiki Snippets). Instead, records are downloaded iteratively as we read them.
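
The idea behind streaming can be sketched with a plain Python generator. This is a toy illustration, not the `datasets` implementation: `record_stream` is a hypothetical stand-in for a streamed dataset, and the titles are made up.

```python
from itertools import islice
from typing import Dict, Iterator


def record_stream() -> Iterator[Dict[str, str]]:
    """Hypothetical stand-in for a streamed dataset: yields records lazily."""
    titles = ["History_of_science", "Beyoncé", "History_of_India", "Chopin"]
    for i in range(1_000_000):  # large on purpose; never fully consumed
        yield {"id": str(i), "title": titles[i % len(titles)]}


# Filtering is lazy too: nothing is produced until we iterate.
history_only = (r for r in record_stream() if "History" in r["title"])

# Stop after the first three matches instead of scanning everything.
first_three = list(islice(history_only, 3))
print([r["id"] for r in first_three])
```

Only as many records as needed to find three matches are ever generated, which is why a streamed dataset can be explored without a full download.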

In [5]:
# show the contents of a single document in the dataset
next(iter(wiki_data))

{'id': '56bf8c8aa10cfb140055116f',
 'title': 'Beyoncé',
 'context': 'In July 2002, Beyoncé continued her acting career playing Foxxy Cleopatra alongside Mike Myers in the comedy film, Austin Powers in Goldmember, which spent its first weekend atop the US box office and grossed $73 million. Beyoncé released "Work It Out" as the lead single from its soundtrack album which entered the top ten in the UK, Norway, and Belgium. In 2003, Beyoncé starred opposite Cuba Gooding, Jr., in the musical comedy The Fighting Temptations as Lilly, a single mother whom Gooding\'s character falls in love with. The film received mixed reviews from critics but grossed $30 million in the U.S. Beyoncé released "Fighting Temptation" as the lead single from the film\'s soundtrack album, with Missy Elliott, MC Lyte, and Free which was also used to promote the film. Another of Beyoncé\'s contributions to the soundtrack, "Summertime", fared better on the US charts.',
 'question': "How did the critics view the movie

In [6]:
# keep only records whose title contains "History"
history = wiki_data.filter(lambda x: 'title' in x and 'History' in x['title'])
history

IterableDataset({
    features: ['id', 'title', 'context', 'question', 'answers'],
    num_shards: 1
})

Let's iterate through the filtered dataset and collect up to 50,000 historical passages. From each record we take the `id`, store `title` as `article_title` and `context` as `passage_text`, and set `section_title` to "History".

In [7]:
from tqdm.auto import tqdm  # progress bar

total_doc_count = 50000
docs = []

for d in tqdm(history, desc="Collecting historical passages"):
    extracted_doc = {
        "id": d.get("id"),  # unique record ID, used later as the vector ID
        "article_title": d.get("title"),
        "section_title": "History",
        "passage_text": d.get("context")
    }
    if extracted_doc["passage_text"]:
        docs.append(extracted_doc)
        tqdm.write(f"Collected {len(docs)} valid passages")  # optional logging
    if len(docs) >= total_doc_count:
        break

Collecting historical passages: 0it [00:00, ?it/s]

In [8]:
import pandas as pd

# create a pandas dataframe with the documents we extracted
df = pd.DataFrame(docs)
df.head()

| | id | article_title | section_title | passage_text |
|---|---|---|---|---|
| 0 | 5726ebefdd62a815002e9552 | History_of_science | History | The English word scientist is relatively recen... |
| 1 | 5726fa9cdd62a815002e96bd | History_of_science | History | The astronomer Aristarchus of Samos was the fi... |
| 2 | 5726fa9cdd62a815002e96bf | History_of_science | History | The astronomer Aristarchus of Samos was the fi... |
| 3 | 5726f1c9f1498d1400e8f0a9 | History_of_science | History | Ancient Egypt made significant advances in ast... |
| 4 | 5726f997f1498d1400e8f18b | History_of_science | History | The important legacy of this period included s... |
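
Rows 1 and 2 in the preview above carry different ids but what looks like the same passage text, presumably because SQuAD attaches several questions to one context. If repeated passages are not wanted in the index, they could be dropped before embedding. The following is a minimal sketch; `dedupe_by_passage` and the sample records are hypothetical and not part of the original notebook.

```python
from typing import Dict, List


def dedupe_by_passage(docs: List[Dict[str, str]]) -> List[Dict[str, str]]:
    """Keep the first document seen for each distinct passage_text."""
    seen = set()
    unique = []
    for doc in docs:
        text = doc["passage_text"]
        if text not in seen:
            seen.add(text)
            unique.append(doc)
    return unique


# Made-up records that mimic the shape of the extracted documents.
sample_docs = [
    {"id": "a1", "passage_text": "Passage one."},
    {"id": "a2", "passage_text": "Passage one."},  # same context, different id
    {"id": "b1", "passage_text": "Passage two."},
]
unique_docs = dedupe_by_passage(sample_docs)
print([d["id"] for d in unique_docs])
```

With pandas, `df.drop_duplicates(subset="passage_text")` would be an equivalent one-liner on the DataFrame.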


# Initialize Pinecone Index

The Pinecone index stores vector representations of our historical passages, which we can retrieve later using another vector (the query vector). To build our vector index, we must first establish a connection with Pinecone. For this, we need an API key from Pinecone. You can get one for free [here](https://app.pinecone.io/). We then initialize the connection as follows:

In [9]:
import os
from pinecone import Pinecone

# initialize connection to pinecone (get API key at app.pinecone.io)
api_key = os.environ.get('PINECONE_API_KEY') or 'PINECONE_API_KEY'

# configure client
pc = Pinecone(api_key=api_key)

Now we set up our index specification, which lets us define the cloud provider and region where we want to deploy our index. You can find a list of all [available providers and regions here](https://docs.pinecone.io/docs/projects).

In [10]:
from pinecone import ServerlessSpec

cloud = os.environ.get('PINECONE_CLOUD') or 'aws'
region = os.environ.get('PINECONE_REGION') or 'us-east-1'

spec = ServerlessSpec(cloud=cloud, region=region)

Now we create a new index. We will name it "abstractive-question-answering", but you can name it anything you want. We specify the metric type as "cosine" and the dimension as 768 because the retriever we use to generate context embeddings is optimized for cosine similarity and outputs 768-dimensional vectors.
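
For reference, cosine similarity is the dot product of two vectors divided by the product of their lengths. A minimal pure-Python sketch with made-up two-dimensional vectors (the real embeddings have 768 dimensions, and both vectors are assumed to be non-zero):

```python
import math
from typing import Sequence


def cosine_similarity(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine of the angle between two non-zero vectors of equal dimension."""
    if len(a) != len(b):
        raise ValueError("vectors must have the same dimension")
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


same_direction = cosine_similarity([1.0, 2.0], [2.0, 4.0])  # parallel vectors
orthogonal = cosine_similarity([1.0, 0.0], [0.0, 1.0])      # perpendicular vectors
print(round(same_direction, 6), round(orthogonal, 6))
```

The dimension check mirrors why the index dimension must match the retriever output: vectors of different sizes cannot be compared.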

In [11]:
index_name = "abstractive-question-answering"  # give your index a meaningful name

In [12]:
import time

# check if index already exists (it shouldn't if this is first time)
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=768,      # retriever outputs 768-d embeddings
        metric="cosine",
        spec=spec,          # ServerlessSpec (cloud and region) defined in the previous cell
    )
    # wait until the index is ready
    while not pc.describe_index(index_name).status["ready"]:
        time.sleep(1)

# connect to the index and confirm that it is empty (all stats should be zero)
index = pc.Index(index_name)
print(index.describe_index_stats())

{'dimension': 768,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}


# Initialize Retriever

Next, we need to initialize our retriever. The retriever will mainly do two things:

- Generate embeddings for all historical passages (context vectors/embeddings)
- Generate embeddings for our questions (query vector/embedding)

The retriever will create embeddings such that the questions and the passages that hold the answers to them are close to one another in the vector space. We will use a SentenceTransformer model based on Microsoft's MPNet as our retriever. This model performs quite well for comparing the similarity between queries and documents. We can use cosine similarity to compute the similarity between the query and context vectors generated by this model (Pinecone does this for us automatically).
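
Conceptually, the search the index performs can be pictured as scoring every stored vector against the query vector and keeping the best few. The brute-force sketch below uses made-up three-dimensional vectors and hypothetical document ids; it illustrates the idea only and says nothing about how Pinecone implements search internally.

```python
import math
from typing import Dict, List, Tuple


def normalize(v: List[float]) -> List[float]:
    """Scale a non-zero vector to unit length."""
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def top_k(query: List[float], vectors: Dict[str, List[float]], k: int) -> List[Tuple[str, float]]:
    """Rank stored vectors by cosine similarity to the query (brute force)."""
    q = normalize(query)
    scored = []
    for doc_id, vec in vectors.items():
        d = normalize(vec)
        # for unit vectors the dot product equals the cosine similarity
        score = sum(a * b for a, b in zip(q, d))
        scored.append((doc_id, score))
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:k]


toy_vectors = {
    "doc-radio": [0.9, 0.1, 0.0],
    "doc-moon": [0.0, 0.2, 0.9],
    "doc-power": [0.7, 0.7, 0.0],
}
ranked = top_k([1.0, 0.0, 0.0], toy_vectors, k=2)
print([doc_id for doc_id, _ in ranked])
```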

In [None]:
import torch
from sentence_transformers import SentenceTransformer

# set device to GPU if available
device = 'cuda' if torch.cuda.is_available() else 'cpu'
# load the retriever model from the Hugging Face model hub
retriever = SentenceTransformer(
    "flax-sentence-embeddings/all_datasets_v3_mpnet-base",
    device=device
)
retriever

# Generate Embeddings and Upsert

Next, we need to generate embeddings for the context passages. We will do this in batches to help us more quickly generate embeddings and upload them to the Pinecone index. When passing the documents to Pinecone, we need an id (a unique value), context embedding, and metadata for each document representing context passages in the dataset. The metadata is a dictionary containing data relevant to our embeddings, such as the article title, section title, passage text, etc.
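
The batching and the record layout described above can be sketched without any model or index. `chunked` and `to_record` are hypothetical helper names, and the documents and the zero embedding are placeholders; in the notebook the embedding comes from the retriever.

```python
from typing import Dict, Iterator, List


def chunked(items: List[Dict], batch_size: int) -> Iterator[List[Dict]]:
    """Yield consecutive slices of at most batch_size items."""
    for start in range(0, len(items), batch_size):
        yield items[start:start + batch_size]


def to_record(doc: Dict[str, str], embedding: List[float]) -> Dict:
    """Build one upsert record: id, vector values, and metadata."""
    return {
        "id": doc["id"],
        "values": embedding,
        "metadata": {
            "article_title": doc["article_title"],
            "section_title": doc["section_title"],
            "passage_text": doc["passage_text"],
        },
    }


# Placeholder documents with the same keys as the extracted passages.
toy_docs = [
    {"id": str(i), "article_title": "T", "section_title": "History", "passage_text": f"p{i}"}
    for i in range(5)
]
batch_sizes = [len(batch) for batch in chunked(toy_docs, batch_size=2)]
record = to_record(toy_docs[0], embedding=[0.0, 0.0, 0.0])
print(batch_sizes, sorted(record))
```

The last batch is smaller than the others whenever the number of documents is not a multiple of the batch size.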

In [28]:
import time

# 1. Wait for the index to be ready
print(f"Waiting for index '{index_name}' to be ready...")
while True:
    try:
        # describe_index returns an IndexDescription object which has a status field
        index_description = pc.describe_index(index_name)
        if index_description.status.ready:
            print(f"Index '{index_name}' is ready.")
            break
        else:
            # Index is not ready yet, print current state and wait
            print(f"Index '{index_name}' is not ready yet. Status: {index_description.status.state}. Waiting...")
            time.sleep(10) # Wait for 10 seconds before checking again
    except Exception as e:
        # Catch potential errors if index is still being created/initialized by Pinecone
        print(f"Error checking index status: {e}. Retrying in 10 seconds...")
        time.sleep(10)

# 2. Ensure the stats are all zeros (for a freshly created/empty index)
# Use index.describe_index_stats() to get the current statistics
stats = index.describe_index_stats()

# Check if the total_vector_count is 0
if stats.total_vector_count == 0:
    print(f"Index '{index_name}' is empty (total_vector_count: {stats.total_vector_count}).")
    print(f"Dimension: {stats.dimension}")
else:
    print(f"WARNING: Index '{index_name}' is not empty. Current stats:")
    print(stats) # Print full stats if not empty

Waiting for index 'abstractive-question-answering' to be ready...
Index 'abstractive-question-answering' is ready.
Index 'abstractive-question-answering' is empty (total_vector_count: 0).
Dimension: 768


In [29]:
from pinecone import Pinecone, ServerlessSpec
from google.colab import userdata

PINECONE_API_KEY = userdata.get('PINECONE_API_KEY')

pc = Pinecone(api_key=PINECONE_API_KEY)

index_name = "abstractive-question-answering"

# if index exists but with wrong dimension, delete it first
indexes = pc.list_indexes()
if index_name in indexes.names():
    info = pc.describe_index(index_name)
    if info.dimension != 768:
        pc.delete_index(index_name)

# create index if needed
if index_name not in pc.list_indexes().names():
    pc.create_index(
        name=index_name,
        dimension=768,
        metric="cosine",
        spec=ServerlessSpec(
            cloud="aws",
            region="us-east-1"
        )
    )

index = pc.Index(index_name)
print(index.describe_index_stats())

{'dimension': 768,
 'index_fullness': 0.0,
 'namespaces': {},
 'total_vector_count': 0}


In [30]:
# We will use batches of 64
batch_size = 64

print(f"\nStarting embedding generation and upsert for {len(df)} documents into Pinecone index '{index_name}'...")

# Iterate through the DataFrame in batches
for i in tqdm(range(0, len(df), batch_size), desc="Upserting batches to Pinecone"):
    # Find the end index for the current batch
    i_end = min(i + batch_size, len(df))

    # Extract the current batch of documents from the DataFrame
    batch_df = df.iloc[i:i_end]

    # Extract the 'passage_text' for embedding generation
    batch_texts = batch_df['passage_text'].tolist()

    # Generate embeddings for the current batch of texts
    batch_embeddings = retriever.encode(batch_texts, show_progress_bar=False, device=device).tolist()

    # Prepare records for Pinecone upsert
    vectors_for_batch = []
    for j, row in enumerate(batch_df.itertuples(index=False)):
        # The unique ID for each passage comes from the 'id' column
        doc_id = str(row.id)

        # Create the metadata dictionary
        metadata = {
            'article_title': row.article_title,
            'section_title': row.section_title,
            'passage_text': row.passage_text,  # keep the original text so search results can return it
        }
        # Add any other relevant columns from `df` to the metadata here if you wish

        vectors_for_batch.append({
            'id': doc_id,
            'values': batch_embeddings[j], # 'j' is the index within the current batch_embeddings list
            'metadata': metadata
        })

    # Upsert the prepared batch of vectors to the Pinecone index
    try:
        index.upsert(vectors=vectors_for_batch)
    except Exception as e:
        print(f"\nError upserting batch {i}-{i_end}: {e}")
        # Consider logging the failing batch or retrying it

print("\nFinished generating embeddings and upserting all batches to Pinecone.")

# Final check: Describe index stats to verify the total number of vectors
print("\nFinal Pinecone Index Statistics:")
final_index_stats = index.describe_index_stats()
if '' in final_index_stats.namespaces:
    print(f"Total vectors in index: {final_index_stats.namespaces[''].vector_count}")
else:
    print("Total vectors in index: 0 (no default namespace found, likely an empty index).")
print(final_index_stats)


Starting embedding generation and upsert for 698 documents into Pinecone index 'abstractive-question-answering'...


Upserting batches to Pinecone:   0%|          | 0/11 [00:00<?, ?it/s]


Finished generating embeddings and upserting all batches to Pinecone.

Final Pinecone Index Statistics:
Total vectors in index: 698
{'dimension': 768,
 'index_fullness': 0.0,
 'namespaces': {'': {'vector_count': 698}},
 'total_vector_count': 698}
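
The upsert cell only prints an error when a batch fails. One possible way to retry a failed call is exponential backoff, sketched below. `with_retries` is a hypothetical helper, and `flaky_upsert` merely simulates a call that fails twice before succeeding; neither is part of the Pinecone client.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def with_retries(call: Callable[[], T], attempts: int = 3, base_delay: float = 0.5) -> T:
    """Run call(), retrying on any exception with exponentially growing delays."""
    if attempts < 1:
        raise ValueError("attempts must be at least 1")
    for attempt in range(attempts):
        try:
            return call()
        except Exception:
            if attempt == attempts - 1:
                raise  # out of attempts: surface the last error
            time.sleep(base_delay * 2 ** attempt)
    raise AssertionError("unreachable")


# Simulated unreliable operation (assumption for the demo).
calls = {"count": 0}


def flaky_upsert() -> str:
    calls["count"] += 1
    if calls["count"] < 3:
        raise ConnectionError("temporary failure")
    return "ok"


outcome = with_retries(flaky_upsert, attempts=3, base_delay=0.01)
print(outcome, calls["count"])
```

In the upsert loop, the call could be wrapped as `with_retries(lambda: index.upsert(vectors=vectors_for_batch))`.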


# Initialize Generator

We will use ELI5 BART as the generator. It is a sequence-to-sequence model trained on the ‘Explain Like I’m 5’ (ELI5) dataset. Sequence-to-sequence models take a text sequence as input and produce a different text sequence as output.

The input to the ELI5 BART model is a single string which is a concatenation of the query and the relevant documents providing the context for the answer. The documents are separated by a special token `<P>`, so the input string will look as follows:

>question: What is a sonic boom? context: `<P>` A sonic boom is a sound associated with shock waves created when an object travels through the air faster than the speed of sound. `<P>` Sonic booms generate enormous amounts of sound energy, sounding similar to an explosion or a thunderclap to the human ear. `<P>` Sonic booms due to large supersonic aircraft can be particularly loud and startling, tend to awaken people, and may cause minor damage to some structures. This led to prohibition of routine supersonic flight overland.

More detail on how the ELI5 dataset was built is available [here](https://arxiv.org/abs/1907.09190), and details on how the ELI5 BART model was trained are available [here](https://yjernite.github.io/lfqa.html).

Let's initialize the BART model using transformers.

In [31]:
from transformers import BartTokenizer, BartForConditionalGeneration

# load bart tokenizer and model from huggingface
tokenizer = BartTokenizer.from_pretrained('vblagoje/bart_lfqa')
generator = BartForConditionalGeneration.from_pretrained('vblagoje/bart_lfqa').to(device)

All the components of our abstractive QA system are now in place and ready to be queried. But first, let's write some helper functions to retrieve context passages from the Pinecone index and to format the query the way the generator expects.

In [32]:
def query_pinecone(query, top_k):
    # generate embeddings for the query
    xq = retriever.encode(query).tolist()
    # search pinecone index for context passage with the answer
    xc = index.query(vector=xq, top_k=top_k, include_metadata=True)
    return xc

In [33]:
def format_query(query, context):
    # extract passage_text from the Pinecone search results and prefix each with the <P> tag
    context = [f"<P> {m['metadata']['passage_text']}" for m in context]
    # concatenate all context passages
    context = " ".join(context)
    # concatenate the query and the context passages
    query = f"question: {query} context: {context}"
    return query

Let's test the helper functions. We will query the Pinecone index we created earlier with the `query_pinecone` function to get context passages, and then pass them to the `format_query` function.

In [34]:
query = "when was the first electric power system built?"
result = query_pinecone(query, top_k=1)
result

{'matches': [{'id': '5726fa9cdd62a815002e96c0',
              'metadata': {'article_title': 'History_of_science',
                           'passage_text': 'The astronomer Aristarchus of '
                                           'Samos was the first known person '
                                           'to propose a heliocentric model of '
                                           'the solar system, while the '
                                           'geographer Eratosthenes accurately '
                                           'calculated the circumference of '
                                           'the Earth. Hipparchus (c. 190 – c. '
                                           '120 BC) produced the first '
                                           'systematic star catalog. The level '
                                           'of achievement in Hellenistic '
                                           'astronomy and engineering is '
                               

In [35]:
from pprint import pprint

In [36]:
# format the query in the form generator expects the input
query = format_query(query, result["matches"])
pprint(query)

('question: when was the first electric power system built? context: <P> The '
 'astronomer Aristarchus of Samos was the first known person to propose a '
 'heliocentric model of the solar system, while the geographer Eratosthenes '
 'accurately calculated the circumference of the Earth. Hipparchus (c. 190 – '
 'c. 120 BC) produced the first systematic star catalog. The level of '
 'achievement in Hellenistic astronomy and engineering is impressively shown '
 'by the Antikythera mechanism (150-100 BC), an analog computer for '
 'calculating the position of planets. Technological artifacts of similar '
 'complexity did not reappear until the 14th century, when mechanical '
 'astronomical clocks appeared in Europe.')


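The query is now in the format the generator expects, although the retrieved passage is not about electric power: the index only holds the small set of passages we upserted. Now let's write a function to generate answers.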

In [37]:
def generate_answer(query):
    # tokenize the query to get input_ids
    inputs = tokenizer([query], max_length=1024, truncation=True, return_tensors="pt").to(device)
    # use generator to predict output ids
    ids = generator.generate(inputs["input_ids"], num_beams=2, min_length=20, max_length=40)
    # use tokenizer to decode the output ids
    answer = tokenizer.batch_decode(ids, skip_special_tokens=True, clean_up_tokenization_spaces=False)[0]
    # print the answer (pprint returns None, so the function returns None as well)
    return pprint(answer)

In [38]:
generate_answer(query)


('Electricity was invented in the late 19th century. The first electric power '
 'system was a battery powered by a small amount of coal. The first batteries '
 'were invented in the early 20th century')


The generator returns a fluent answer, but the retrieved context says nothing about electric power, so the answer is not grounded in the passage and should not be trusted. Let's run some more queries.

In [39]:
query = "How was the first wireless message sent?"
context = query_pinecone(query, top_k=5)
query = format_query(query, context["matches"])
generate_answer(query)

('The first wireless message was sent in the early 1900s. The first wireless '
 'message was sent in the early 1900s. The first wireless message was sent in '
 'the early 1900s. The first')


To confirm that this answer is correct, we can check the contexts used to generate the answer.

In [40]:
for doc in context["matches"]:
    print(doc["metadata"]["passage_text"], end='\n---\n')

The astronomer Aristarchus of Samos was the first known person to propose a heliocentric model of the solar system, while the geographer Eratosthenes accurately calculated the circumference of the Earth. Hipparchus (c. 190 – c. 120 BC) produced the first systematic star catalog. The level of achievement in Hellenistic astronomy and engineering is impressively shown by the Antikythera mechanism (150-100 BC), an analog computer for calculating the position of planets. Technological artifacts of similar complexity did not reappear until the 14th century, when mechanical astronomical clocks appeared in Europe.
---
The astronomer Aristarchus of Samos was the first known person to propose a heliocentric model of the solar system, while the geographer Eratosthenes accurately calculated the circumference of the Earth. Hipparchus (c. 190 – c. 120 BC) produced the first systematic star catalog. The level of achievement in Hellenistic astronomy and engineering is impressively shown by the Antikyt

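In this case, the retrieved contexts are about Hellenistic astronomy rather than wireless communication, and the same passage appears to be repeated, so the answer is not supported by them. If we ask a question and no relevant contexts are retrieved, the generator will typically return nonsensical or false answers, as with this question about COVID-19: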

In [41]:
query = "where did COVID-19 originate?"
context = query_pinecone(query, top_k=3)
query = format_query(query, context["matches"])
generate_answer(query)

('COVID-19 is a strain of the virus that causes the common cold. The virus is '
 'a type of retrovirus, which means that it is a type of retrovirus that')


In [42]:
for doc in context["matches"]:
    print(doc["metadata"]["passage_text"], end='\n---\n')

In 1847, Hungarian physician Ignác Fülöp Semmelweis dramatically reduced the occurrency of puerperal fever by simply requiring physicians to wash their hands before attending to women in childbirth. This discovery predated the germ theory of disease. However, Semmelweis' findings were not appreciated by his contemporaries and came into use only with discoveries by British surgeon Joseph Lister, who in 1865 proved the principles of antisepsis. Lister's work was based on the important findings by French biologist Louis Pasteur. Pasteur was able to link microorganisms with disease, revolutionizing medicine. He also devised one of the most important methods in preventive medicine, when in 1880 he produced a vaccine against rabies. Pasteur invented the process of pasteurization, to help prevent the spread of disease through milk and other foods.
---
In 1847, Hungarian physician Ignác Fülöp Semmelweis dramatically reduced the occurrency of puerperal fever by simply requiring physicians to wa
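
One possible guard against answers generated from unrelated contexts, as in the COVID-19 example above, is to drop low-scoring matches and skip generation when nothing is left. The sketch below assumes each match carries a `score` field next to its `metadata`; `relevant_matches`, the sample result and the 0.5 threshold are illustrative assumptions that would need tuning on real queries.

```python
from typing import Dict, List


def relevant_matches(result: Dict, min_score: float) -> List[Dict]:
    """Keep only the matches whose similarity score reaches min_score."""
    return [m for m in result["matches"] if m["score"] >= min_score]


# Made-up search result shaped like the query response used in this notebook.
toy_result = {
    "matches": [
        {"id": "a", "score": 0.71, "metadata": {"passage_text": "Relevant passage."}},
        {"id": "b", "score": 0.18, "metadata": {"passage_text": "Unrelated passage."}},
    ]
}

kept = relevant_matches(toy_result, min_score=0.5)
if kept:
    print([m["id"] for m in kept])
else:
    print("No sufficiently relevant context found; not answering.")
```

The filtered list can be passed to `format_query` in place of `context["matches"]`.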

Let’s finish with a final few questions.

In [43]:
query = "what was the war of currents?"
context = query_pinecone(query, top_k=5)
query = format_query(query, context["matches"])
generate_answer(query)

('The war of currents is a term used to refer to a series of conflicts that '
 'occurred in the early medieval period between the Christian and Muslim '
 'rulers of India. The conflict was the result of a')


In [44]:
query = "who was the first person on the moon?"
context = query_pinecone(query, top_k=10)
query = format_query(query, context["matches"])
generate_answer(query)

('The first person to go to the moon was Apollo 11. The Apollo 11 astronauts '
 'landed on the moon in 1969. The Apollo 11 astronauts landed on the moon in '
 '1969. The Apollo 11 astronauts')


In [45]:
query = "what was NASAs most expensive project?"
context = query_pinecone(query, top_k=3)
query = format_query(query, context["matches"])
generate_answer(query)

("I don't know if this counts as a project, but the US Navy's Project Orion "
 'was the most expensive project in the history of the US Navy. It cost $1.2 '
 'billion')


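As we can see, the generator produces fluent answers, but several of the ones above are inaccurate: with only 698 passages in the index, the retrieved contexts often do not cover the question. Indexing a larger and more relevant set of passages should give the generator better material to work with.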

#### Add a few more questions