[Previous Notebook](question-answering-deployment.ipynb) &emsp; [Home Page](../START_HERE_RIVA_BOOTCAMP.ipynb) &emsp; [1](question-answering-training.ipynb) [2](question-answering-deployment.ipynb) [3]

<img src="http://developer.download.nvidia.com/compute/machine-learning/frameworks/nvidia_logo.png" style="width: 90px; float: right;">

# Wikipedia Question Answering

This notebook walks through using Riva NLP services to answer an incoming user question: it queries Wikipedia for a summary of the topic in question and passes that summary as context to Riva's Question Answering module.

First, make sure the Wikipedia API package is installed. Alter the command below if you use a different package manager. Then import the required packages.




In [1]:
# Uncomment the next line if the package is missing (it is already installed in the container).
# !pip install wikipedia

In [4]:
import wikipedia as wiki

import grpc

import riva_api.riva_nlp_pb2 as rnlp
import riva_api.riva_nlp_pb2_grpc as rnlp_srv

### Wikipedia Summary
Next we accept user input. We use the Wikipedia API to find the most relevant articles. For the purpose of this example, we combine the top few article summaries. You can change the number of articles with `max_articles_combine`.

In [10]:
input_query = "What is Computer Science?"

# Search Wikipedia for article titles matching the query
wiki_articles = wiki.search(input_query)
max_articles_combine = 3  # number of top articles whose summaries are combined
combined_summary = ""

if len(wiki_articles) == 0:
    print("ERROR: Could not find any matching results in Wikipedia.")
else:
    # Slicing already stops at the end of the list, so no min() is needed
    for article in wiki_articles[:max_articles_combine]:
        print(f"Getting summary for: {article}")
        combined_summary += "\n" + wiki.summary(article)

Getting summary for: Computer science
Getting summary for: Outline of computer science
Getting summary for: List of unsolved problems in computer science
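
If a summary lookup raises an exception (for example on a disambiguation page or a missing article), it can be convenient to skip that title and move on. The following is a minimal sketch under that assumption. `combine_summaries` and its `fetch_summary` parameter are hypothetical names introduced here for illustration; they are not part of the `wikipedia` package. Note that, unlike the cell above, this variant keeps going until it has `max_articles` successful summaries rather than looking only at the first `max_articles` titles.

```python
from typing import Callable, Iterable, List


def combine_summaries(
    titles: Iterable[str],
    fetch_summary: Callable[[str], str],
    max_articles: int = 3,
) -> str:
    """Join the summaries of the first `max_articles` titles that can be fetched."""
    parts: List[str] = []
    for title in titles:
        if len(parts) >= max_articles:
            break
        try:
            parts.append(fetch_summary(title))
        except Exception as err:  # broad on purpose: any lookup failure skips the title
            print(f"Skipping {title!r}: {err}")
    return "\n".join(parts)
```

In this notebook the call would look like `combined_summary = combine_summaries(wiki_articles, wiki.summary, max_articles_combine)`.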


### Query Riva Server

Lastly, we open a gRPC channel to the Riva server and build the corresponding request. This lets us query the Riva server with the input query and the context taken from Wikipedia.

Make sure the gRPC channel points to the port your Riva server listens on.

In [12]:
# Alter if using a different port
channel = grpc.insecure_channel('localhost:50051')

riva_nlp = rnlp_srv.RivaLanguageUnderstandingStub(channel)

req = rnlp.NaturalQueryRequest()

req.query = input_query
req.context = combined_summary
resp = riva_nlp.NaturalQuery(req)

print(f"Query: {input_query}")
print(f"Answer: {resp.results[0].answer}")

Query: What is Computer Science?
Answer: the study of algorithmic processes, computational machines and computation itself.
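
When the same request is sent for several questions, the steps above can be wrapped in a small helper. This is a sketch, not part of the Riva client: `ask` is a hypothetical name, and the check for an empty `results` list is a defensive assumption, since indexing `results[0]` would otherwise fail if the response carried no results. It uses only the fields and the method already shown in the cell above (`query`, `context`, `NaturalQuery`, `results[0].answer`).

```python
from typing import Any, Callable, Optional


def ask(
    stub: Any,
    make_request: Callable[[], Any],
    query: str,
    context: str,
) -> Optional[str]:
    """Send one question/context pair and return the top answer, or None if there is none."""
    req = make_request()  # e.g. rnlp.NaturalQueryRequest
    req.query = query
    req.context = context
    resp = stub.NaturalQuery(req)
    if len(resp.results) == 0:
        return None
    return resp.results[0].answer
```

With the objects defined earlier, the call would be `ask(riva_nlp, rnlp.NaturalQueryRequest, input_query, combined_summary)`.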


Exercise: Use the following paragraphs from the SQuAD dataset as context and have the model generate answers to the questions below.

Paragraph 1:
"For a long time, it was thought that the Amazon rainforest was only ever sparsely populated, as it was impossible to sustain a large population through agriculture given the poor soil. Archeologist Betty Meggers was a prominent proponent of this idea, as described in her book Amazonia: Man and Culture in a Counterfeit Paradise. She claimed that a population density of 0.2 inhabitants per square kilometre (0.52/sq mi) is the maximum that can be sustained in the rainforest through hunting, with agriculture needed to host a larger population. However, recent anthropological findings have suggested that the region was actually densely populated. Some 5 million people may have lived in the Amazon region in AD 1500, divided between dense coastal settlements, such as that at Marajó, and inland dwellers. By 1900 the population had fallen to 1 million and by the early 1980s it was less than 200,000."

Paragraph 2: 
"The first European to travel the length of the Amazon River was Francisco de Orellana in 1542. The BBC's Unnatural Histories presents evidence that Orellana, rather than exaggerating his claims as previously thought, was correct in his observations that a complex civilization was flourishing along the Amazon in the 1540s. It is believed that the civilization was later devastated by the spread of diseases from Europe, such as smallpox. Since the 1970s, numerous geoglyphs have been discovered on deforested land dating between AD 0–1250, furthering claims about Pre-Columbian civilizations. Ondemar Dias is accredited with first discovering the geoglyphs in 1977 and Alceu Ranzi with furthering their discovery after flying over Acre. The BBC's Unnatural Histories presented evidence that the Amazon rainforest, rather than being a pristine wilderness, has been shaped by man for at least 11,000 years through practices such as forest gardening and terra preta."

Question 1: What is the name of the book written by Archeologist Betty Meggers?

Question 2: What maximum population density per square mile did Betty Meggers claim could be sustained in the rainforest?

Question 3: What would be needed to host a larger population?

Question 4: Which findings suggested that the region was densely populated?

Question 5: Who was the first European to travel the Amazon River?

Question 6: Since when have geoglyphs been discovered on deforested land?
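
One possible way to organize the inputs, offered as a hint rather than a reference solution: Questions 1 to 4 refer to Paragraph 1, and Questions 5 and 6 refer to Paragraph 2, so each question can be stored together with the key of its paragraph. The names `answer_all` and `ask_fn` are hypothetical; `ask_fn` stands for whatever function you write to send one question and one context to the Riva server.

```python
from typing import Callable, Dict, List, Tuple


def answer_all(
    contexts: Dict[str, str],
    questions: List[Tuple[str, str]],
    ask_fn: Callable[[str, str], str],
) -> List[Tuple[str, str]]:
    """Return (question, answer) pairs; each question names the context it refers to."""
    answers = []
    for context_key, question in questions:
        # Look up the paragraph this question belongs to and query the model with it
        answers.append((question, ask_fn(question, contexts[context_key])))
    return answers
```

Here `contexts` would map keys such as `"p1"` and `"p2"` to the two paragraphs, and `questions` would be a list of pairs such as `("p1", "What would be needed to host a larger population?")`.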

In [14]:
## First, import the required libraries and configure the port for your connection.
## Next, initialize variables with the input paragraphs and questions.

## Your solution goes here.