# Advanced Question Analysis

The goal of this assignment is to write a more flexible version of the interactive QA system. As in the previous assignment, the system should be able to take a question in natural language (English) as input, analyse the question, and generate a SPARQL query for it.

## Assignment: Additional requirements

* Make sure that your system can analyse at least two more question types, for example questions that start with *which* or *when*, or questions where the property is expressed by a verb.
* Apart from the techniques introduced last week (matching tokens on the basis of their lemma or part-of-speech), include at least one pattern that uses dependency relations to find the relevant property or entity in the question.
* Include 10 examples of questions that your system can handle and that show you cover additional question types.

## Examples

Here is a non-representative list of questions and question types to consider. See the list with all questions for more examples.

* For what movie did Leonardo DiCaprio win an Oscar?
* How long is Pulp Fiction?
* How many episodes does Twin Peaks have?
* In what capital was the film The Fault in Our Stars filmed?
* In what year was The Matrix released?
* When did Alan Rickman die?
* Where was Morgan Freeman born?
* Which actor played Aragorn in Lord of the Rings?
* Which actors played the role of James Bond?
* Who directed The Shawshank Redemption?
* Which movies are directed by Alice Wu?


In [9]:
import spacy
from spacy import displacy

nlp = spacy.load("en_core_web_sm") # this loads the model for analysing English text
                   

## Dependency Analysis with spaCy

All the spaCy functionality used in the last assignment is still available for question analysis.

In addition, use the dependency relations that spaCy assigns. A dependency relation is a directed, labeled arc between two tokens in the input. For example, in a passive question the parser can detect that *movie* is the passive subject (label `nsubjpass`), and that the head this subject depends on is the word *are*, with lemma *be*.
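To make the arcs concrete, the sketch below lists every arc in a parse as a (dependent, label, head, head lemma) tuple. It assumes spaCy v3, where the `Doc` constructor accepts `heads` as absolute token indices. The `Doc` is annotated by hand, so the labels are an illustrative assumption and not the output of `en_core_web_sm`; `dependency_arcs` is a hypothetical helper name.

```python
from spacy.tokens import Doc
from spacy.vocab import Vocab


def dependency_arcs(doc):
    """Return one (dependent, label, head, head lemma) tuple per token."""
    return [(tok.text, tok.dep_, tok.head.text, tok.head.lemma_) for tok in doc]


# Hand-annotated parse (an assumption, not model output) of a question
# from the example list. heads[i] is the index of the head of token i;
# the root token points at itself.
words = ["Who", "directed", "The", "Shawshank", "Redemption", "?"]
lemmas = ["who", "direct", "the", "Shawshank", "Redemption", "?"]
heads = [1, 1, 4, 4, 1, 1]
deps = ["nsubj", "ROOT", "det", "compound", "dobj", "punct"]
doc = Doc(Vocab(), words=words, lemmas=lemmas, heads=heads, deps=deps)

for dependent, label, head, head_lemma in dependency_arcs(doc):
    print(f"{dependent:<12}{label:<10}{head:<12}{head_lemma}")
```

In the notebook, `dependency_arcs(nlp(question))` gives the same listing for a real parse; the labels the model assigns may differ from this hand annotation.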


## Phrases

You can also match the full phrase that is the subject of the sentence, or the phrase under any other dependency relation, using the `subtree` attribute of a token.


In [10]:
def phrase(word) :
    children = []
    for child in word.subtree :
        children.append(child.text)
    return " ".join(children)
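As a hedged illustration of a dependency-based pattern, the sketch below takes the lemma of the root verb as the cue for the property and the full phrase under its direct object as the entity. It assumes spaCy v3 and uses a hand-annotated `Doc` (an assumption, not model output), so the result does not depend on the installed model. `verb_and_object` is a hypothetical helper name, and `phrase` is restated in compact form so the block runs on its own.

```python
from spacy.tokens import Doc
from spacy.vocab import Vocab


def phrase(word):
    """Join the tokens in the subtree of `word` into one string."""
    return " ".join(child.text for child in word.subtree)


def verb_and_object(doc):
    """Return (root verb lemma, direct object phrase), or None if no match."""
    for tok in doc:
        # The pattern: a direct object whose head is the root of the sentence
        if tok.dep_ == "dobj" and tok.head.dep_ == "ROOT":
            return tok.head.lemma_, phrase(tok)
    return None


# Hand-annotated parse; heads[i] is the index of the head of token i.
words = ["Who", "directed", "The", "Shawshank", "Redemption", "?"]
lemmas = ["who", "direct", "the", "Shawshank", "Redemption", "?"]
heads = [1, 1, 4, 4, 1, 1]
deps = ["nsubj", "ROOT", "det", "compound", "dobj", "punct"]
doc = Doc(Vocab(), words=words, lemmas=lemmas, heads=heads, deps=deps)

print(verb_and_object(doc))  # ('direct', 'The Shawshank Redemption')
```

Matching on the subtree keeps multi-word titles together, which a single-token match on `dobj` would reduce to *Redemption*.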




## Visualisation

For a quick understanding of what the parser does, and how it assigns part-of-speech tags, entities, etc., you can also visualise parse results. Below, the entity visualiser and the dependency visualiser are demonstrated.
This code is for illustration only; it is not part of the assignment.

In [11]:
from spacy import displacy

question = "Where was Morgan Freeman born?"

parse = nlp(question)

displacy.render(parse, jupyter=True, style="ent")

displacy.render(parse, jupyter=True, style="dep")

In [20]:
import spacy
import requests

nlp = spacy.load("en_core_web_sm")

def phrase(word) :
    children = []
    for child in word.subtree :
        children.append(child.text)
    return " ".join(children)

def first_qa():
    question = input('Please ask a question:\n')
    parse = nlp(question)
    
    # Find entity
    if len(parse.ents):
        entity = str(parse.ents[0])
    else:
        entity = list()
        for word in parse[1:]:
            if word.text.istitle():
                entity.append(word.text)
        entity = ' '.join(entity)
    # Find property
    verb_dict = {'direct': 'director', 
                 'die': 'death', 
                 'bear': 'birth', 
                 'produce': 'producer',
                 'release': 'release',
                 'portray': 'actor',
                 'long': 'duration'}
    attr_dict = {'when': 'date',
                 'how': 'cause',
                 'where': 'place'}
    prop = list()
    subject = ''  # stays empty when the parse has no nsubj/attr token
    pass_query = False
    for word in parse:
        if word.dep_ == 'nsubj' or word.dep_ == "attr":
            subject = phrase(word)
        if word.dep_ == 'nsubjpass':
            pass_query = True
        if word.lemma_ in verb_dict:
            prop.append(verb_dict[word.lemma_])
    
    if prop and parse[0].lemma_ in attr_dict and prop[0] != 'duration':
        prop.append(attr_dict[parse[0].lemma_])
    
    if not prop:
        # Fall back to the words of the subject phrase, minus the entity
        newsubject = subject.replace(entity, '')
        prop = newsubject.split()[1:-1]
    if not prop:
        return 'No answer found'
     
    # Look up the property ID via the Wikidata search API
    url = 'https://www.wikidata.org/w/api.php'
    params = {'action':'wbsearchentities', 
              'language':'en',
              'format':'json',
              'type':'property'}

    params['search'] = ' '.join(prop)
    # .get() avoids a KeyError if the API returns an error object
    # instead of a 'search' list
    data = requests.get(url, params).json()
    if data.get('search'):
        prop_id = data['search'][0]['id']
    else:
        return 'No answer found'
    
    # Look up candidate entity IDs with the same API (same url as above)
    entity_ids = list()
    params = {'action':'wbsearchentities', 
              'language':'en',
              'format':'json'}

    params['search'] = entity
    data = requests.get(url, params).json()
    
    if data.get('search'):
        entity_ids = [result['id'] for result in data['search']]
    else:
        return 'No answer found'
    
    # Create and run a SPARQL query for each candidate entity
    answers = list()
    for ent_id in entity_ids:
        if pass_query:
            query = "SELECT ?answerLabel WHERE { ?answer wdt:" + prop_id + " wd:" + ent_id + ". SERVICE wikibase:label { bd:serviceParam wikibase:language 'en' }}"
        else:
            query = "SELECT ?answerLabel WHERE { wd:" + ent_id + " wdt:" + prop_id + " ?answer. SERVICE wikibase:label { bd:serviceParam wikibase:language 'en' }}"    
        url = 'https://query.wikidata.org/sparql'
        results = requests.get(url, params={'query': query, 'format': 'json'}).json()
        if results['results']['bindings']:
            for result in results['results']['bindings']:
                answers.append(result['answerLabel']['value'])
    # dict.fromkeys removes duplicates while keeping first-seen order
    return '\n'.join(dict.fromkeys(answers))

In [24]:
print(first_qa())

Please ask a question:
Who produced The Avengers?
Jerry Weintraub
Kevin Feige
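
The two query shapes in `first_qa` differ only in which side of the triple holds the unknown. A minimal sketch of that choice as a standalone helper follows; `build_query` and its `inverse` parameter are hypothetical names, and the IDs Q42 (Douglas Adams) and P19 (place of birth) are used purely for illustration.

```python
LABEL_SERVICE = "SERVICE wikibase:label { bd:serviceParam wikibase:language 'en' }"


def build_query(entity_id, prop_id, inverse=False):
    """Build the SPARQL query string for one entity/property pair.

    inverse=False asks for the value of the property on the entity.
    inverse=True asks for items whose property points at the entity,
    which is the shape first_qa uses for passive questions.
    """
    if inverse:
        triple = f"?answer wdt:{prop_id} wd:{entity_id}"
    else:
        triple = f"wd:{entity_id} wdt:{prop_id} ?answer"
    # Doubled braces produce literal { and } in an f-string
    return f"SELECT ?answerLabel WHERE {{ {triple}. {LABEL_SERVICE} }}"


print(build_query("Q42", "P19"))
print(build_query("Q42", "P19", inverse=True))
```

Keeping the query text in one place makes it easier to add further shapes later, such as a count query for *how many* questions.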


# Questions:
* Q1: Who directed Parasite?
* Q2: Which movies were directed by Alice Wu?
* Q3: When did The Godfather release?
* Q4: How did Alan Rickman die?
* Q5: How long is Pulp Fiction?
* Q6: Which actors have portrayed the character James Bond?
* Q7: What is the country of citizenship of Black Panther?
* Q8: Where was Morgan Freeman born?
* Q9: Who was the director of photography of Fargo?
* Q10: Who produced The Avengers?