# Advanced Question Analysis

The goal of this assignment is to write a more flexible version of the interactive QA system. As in the previous assignment, the system should be able to take a question in natural language (English) as input, analyse the question, and generate a SPARQL query for it.

## Assignment  // Additional requirements

* Make sure that your system can analyse at least two more question types. E.g. questions that start with *which*, *when*, where the property is expressed by a verb, etc.
* Apart from the techniques introduced last week (matching tokens on the basis of their lemma or part-of-speech), also include at least one pattern where you use the dependency relations to find the relevant property or entity in the question. 
* Include 10 examples of questions that your system can handle, and that illustrate the fact that you cover additional question types

## Examples

Here is a non-representative list of questios and question types to consider. See the list with all questions for more examples

* For what movie did Leonardo DiCaprio win an Oscar?
* How long is Pulp Fiction?
* How many episodes does Twin Peaks have?
* In what capital was the film The Fault in Our Stars, filmed?
* In what year was The Matrix released?
* When did Alan Rickman die?
* Where was Morgan Freeman born?
* Which actor played Aragorn in Lord of the Rings?
* Which actors played the role of James Bond
* Who directed The Shawshank Redemption?
* Which movies are directed by Alice Wu?


In [1]:
import spacy

nlp = spacy.load("en_core_web_sm") # this loads the model for analysing English text
                   

## Dependency Analysis with Spacy

All the functionality of Spacy, as in the last assignment, is still available for doing question analysis. 

In addition, also use the dependency relations assigned by spacy. Note that a dependency relation is a directed, labeled, arc between two tokens in the input. In the example below, the system detects that *movie* is the subject of the passive sentence (with label nsubjpass), and that the head of which this subject is a dependent is the word *are* with lemma *be*. 


In [2]:
question = 'Which movies are directed by Alice Wu?'

parse = nlp(question) # parse the input 

for word in parse : # iterate over the token objects 
    print(word.lemma_, word.pos_, word.dep_, word.head.lemma_)

which DET det movie
movie NOUN nsubjpass direct
be AUX auxpass direct
direct VERB ROOT direct
by ADP agent direct
Alice PROPN compound Wu
Wu PROPN pobj by
? PUNCT punct direct


## Phrases

You can also match with the full phrase that is the subject of the sentence, or any other dependency relation, using the subtree function 


In [3]:
def phrase(word, remove = "") :
    children = []
    for child in word.subtree :
        children.append(child.text.replace(remove,''))
    return " ".join(children)
        
for word in parse:
    if word.dep_ == 'nsubjpass' or word.dep_ == 'agent' :
        phrase_text = phrase(word)
        print(phrase_text)
        

Which movies
by Alice Wu


## Visualisation

For a quick understanding of what the parser does, and how it assigns part-of-speech, entities, etc. you can also visualise parse results. Below, the entity visualiser and parsing visualiser is demonstrated. 
This code is for illustration only, it is not part of the assignment. 

In [4]:
from spacy import displacy

question = 'In how many films is Pulp Fiction?'

parse = nlp(question)

displacy.render(parse, jupyter=True, style="ent")

displacy.render(parse, jupyter=True, style="dep")

# Assignment Submission
### S3889807

## Code from last assignment
- Get wikidata IDs
- Generate SPARQL Queries
- Connect to wikidata endpoint to get SPARQL results

In [5]:
import requests

def get_wikidata_ids(name, search_property = False):
    """
    Returns a list of ID dictionaries (with labels and possibly descriptions)
    for a given name, either looking for entities or properties (set search_property:=True for the latter)
    Each dict contains keys: 'id', 'label', and possibly 'description'.
    If a description cannot be found, it will not be included in the dict.
    """
    all_results = []
    
    url = 'https://www.wikidata.org/w/api.php'
    params = {'action':'wbsearchentities', 
              'language':'en',
              'format':'json'}
    
    # add a param to the request if it needs to look for a property
    if search_property:
        params['type'] = 'property'
    
    params['search'] = name
    json = requests.get(url,params).json()
    
    # extract only the useful data from the json file
    try:
        for result in json['search']:
            # append an empty dictionary
            all_results.append({})
            # add the ID and label
            all_results[-1]['id'] = result['id']
            all_results[-1]['label'] = result['label']
            # add a description if it exists
            if 'description' in result.keys():
                all_results[-1]['description'] = result['description']
    except:
        # no results
        pass
    return all_results

In [6]:
def generate_sparql_query(entity_id, property_id):
    """ 
    Returns string with entity id and property id in place as a SPARQL query
    """
    query = f'''SELECT ?answerLabel WHERE {{
                wd:{entity_id} wdt:{property_id} ?answer.
                SERVICE wikibase:label {{ bd:serviceParam wikibase:language "en". }}
                }}'''
    return query

def getSPARQLresults(query):
    """
    Relates to previous assignment. Return results (string) for a SPARQL query.
    The format is arbitrary can can be changed as desired.
    """
    url = 'https://query.wikidata.org/sparql'
    results = ""
    data = requests.get(url, params={'query': query, 'format': 'json'}).json()
    for item in data['results']['bindings']:
        for var in item :
            results+=('{}\t{}\n'.format(var,item[var]['value']))
            
    return results

In [7]:
"""
Helpers
"""

def children(q, head, includeSelf):
    """
    Returns direct children and self if needed
    """
    children = []
    for token in q:
        if (token != head and token.head == head) or (includeSelf and token == head):
            children.append(token)
    return children

def get_root(doc):
    """
    Return the root of the dependency tree
    in a given nlp-parsed sentence (root)
    """
    for word in doc:
        if word.dep_ == "ROOT":
            return word
        
def phrase(word, remove = ""):
    """
    Given code: Return the phrase that the given word heads
    """
    children = []
    for child in word.subtree:
        children.append(child.text.replace(remove,''))
    return " ".join(children)

def findDep(q, dep):
    """
    Returns the first token corresponding to the dependency. Else false
    """
    for word in q:
        if word.dep_ in dep:
            return word
    return False

def nominalize(word):
    nom_dict = {
        'much' : 'quantity',
        'long' : 'duration',
        'many' : 'quantity',
        'often' : 'frequency',
        'birthday' : 'date of birth'
    }
    if word in nom_dict:
        return nom_dict[word]
    else:
        return word

In [8]:
### Entity extraction functions ###
import re
import spacy
from spacy.tokenizer import Tokenizer


def get_named_entities(doc):
    """ 
    spacy has entity recognition in-built, which might work well
    for names, but not for multi-word named entities (like movie titles)
    """
    return doc.ents

def custom_tokenizer(nlp):
    """
    spacy gives the programmer the ability to customize the tokenizer using regex.
    This one specifically looks for sets of contiguous words that all have an upper-
    case letter (i.e. that are titled). This can alternatively be done by using spacy's
    istitle() function on all combinations of words, but that is less efficient.
    e.g. "How I Met Your Mother" will be a single token using this.
    """
    token_re = re.compile(r"([A-Z][a-z']*(?:[\s][A-Z][a-z]+)*)")
    return Tokenizer(nlp.vocab, token_match = token_re.findall)

def get_entity_complex(q_str):
    """
    calls the above function on a query string
    """
    nlp.tokenizer = custom_tokenizer(nlp)
    doc = nlp(q_str)
    # return the last named entity since the needed 
    # entity is likely at the very end of the string
    return doc[-1].text

def get_closest_proper_noun(root, remove = ''):
    """
    It is often the case that the proper noun
    that is most closely associated with the root
    is the most relevent entity in question.
    This is a recursive function starting at the 
    root and doing a BFS through the tree
    """
    pn = None
    for child in root.children:
        if child.pos_ == 'PROPN':
            pn = phrase(child, remove)
            return pn
        
        pn = get_closest_proper_noun(child)
        if pn is not None:
            break
    
    return pn

### Parser for natural language query ###

def preprocess(query):
    """
    Preprocessing for looking for entities using
    non-dependency methods. This is not strictly
    necessary, but makes it slightly less brittle
    wrt the orthography of the sentence.
    """
    query = query.replace('?','')
    query = query.replace(query[0], query[0].lower(), 1)
    return query

def get_entities(query):
    """
    Return the possible entities of a given English query.
    The possibilites are:
        Proper noun phrase closest to root
        Named entities according to SPACY
        Regex-found entities (titled words in a row)
    """
    nlp = spacy.load("en_core_web_sm") # this loads the model for analysing English text
    entities = []
    
    doc = nlp(query)
    
    root = get_root(doc)
    entities.append(get_closest_proper_noun(root))
    
    query = preprocess(query)
    entities += [str(e) for e in get_named_entities(doc)]
    entities.append(get_entity_complex(query))
    
    entities = [str(e) for e in entities if " 's" not in e]
    
    return list(set(entities))

In [9]:
### Property Extraction functions ###
def reduce_based_on_ids(id_list):
    """
    If there are multiple ways of getting a list of properties,
    then they may be repeated. This simply removes duplicates,
    while not changing the relative order within the input list.
    """
    id_set = {}
    for obj in id_list:
        id_set[obj['id']] = obj

    return list(id_set.values())

def q_type_addition(doc):
    if doc[0].dep == 'prep' or doc[len(doc)-1].dep == 'prep':
        return True
    return False

def q_type_binary(doc):
    return doc[0].lemma_ in ['be', 'do', 'have']

"""Select question type"""
def questionType(doc):
    options = {
            'What' : whatOrWho,
            'Who' : whatOrWho,
            'When' : whenOrWhere,
            'Where' : whenOrWhere,
            'Howlong' : howLong,
            'Howmany' : howMany}
    if (q[0].text+q[1].text in options):
        options[q[0].text+q[1].text](q)
    elif (q[0].text in options):
        options[q[0].text](q)
    else:
        print("question type not supported, but we'll try...")
        passive(q)

def get_root_related_props(doc, entity):
    """
    Several methods to try and get properties with
    respect to the root of the question.
    
    ps <- list of possible properties
    For each child in root:
        (i) it cannot be a property if it is the entity
        (ii) it cannot be a property if it is a question word (w-word)
        (iii) if it is a nominal subject, add it to ps
        (iv) if it is a direct object, add it to ps
        (v) if it is an adjective, add it to ps
    If the root itself is not a simple word, add it to ps (e.g. if root := 'direct')
    
    return list of possible properties.
    
    Note: The lemmas and the phrases are added in order to make sure
          multi-word properties (e.g. 'voice actor') are also considered
    """
    ps = []
    root = get_root(doc)

    for child in root.children:
        if phrase(child) == entity:
            continue
        if child.text.lower() in ['who', 'what', 'when', 'how', 'which']:
            continue
        if child.dep_ == 'nsubj':
            ps.append(phrase(child))
            ps.append(child.text)
        if child.dep_ == 'dobj':
            ps.append(phrase(child))
            ps.append(child.text)
        if child.pos_ == 'ADJ':
            ps.append(nominalize(child.lemma_))
            ps.append(child.text)
            ps.append(child.lemma_)
    if root.lemma_ not in ['be', 'have', 'do']:
        ps.append(root.text)
        ps.append(root.lemma_)
    return ps

def get_properties(q, entity):
    """
    Returns list of possible properties (list of strings)
    """
    nlp = spacy.load("en_core_web_sm")
    doc = nlp(q)
    
    ps = get_root_related_props(doc, entity)

    # remove all Nones
    return [x for x in ps if x is not None]
    

In [10]:
### Functions that are majorly hueristic/custom ###

def entity_related_to_movies(entity_list):
    """
    Given a list of dictionaries with information about the entity,
    check if the description contains a word that is related to a movie.
    These have been chosen based on wordnet's synsets. This can easily be
    extended or made more complex using nltk, but kept straightforward for now
    as it works well enough. This helps remove non-relevent entities that have
    with the same name, but not related to movies (e.g. Lord of the Rings book series)
    """
    valid = []
    movie_relation = ['movie', 'film', 'picture', 'moving picture', 'motion', 'pic', 'flick', 'TV',
                      'television', 'show', 'animation', 'animation']
    character_relation = ['fiction', 'fictitious', 'character']
    actor_relation = ['actor', 'actress', 'thespian']
    music_relation = ['musician', 'music', 'score', 'compose', 'song']
    
    all_relations = []
    all_relations += movie_relation
    all_relations += character_relation
    all_relations += actor_relation
    all_relations += music_relation
    
    for word in all_relations:
        for e in entity_list:
            if 'description' in e.keys():
                if word in e['description']:
                    if e not in valid:
                        valid.append(e)
                    
    return valid
            

In [11]:
import time

def pipeline(query, moderated = False):
    """
    Combines the above functions to create a pipeline to answer questions.
    
    Input: English string of the form "Who/What was/is/were (the) X of Y?"
    Output: Result of that query if found

    Often if there are too many queries sent to the endpoint at once,
    it will return none, so an optional boolean moderation is added
    to add an artificial 0.5 seconds between each request. This can be slower
    but has a lower chance of producing a request-related error. If an error
    occurs, try running with moderated = True
    """
    result = ''
    
    # get entities
    entities = get_entities(query)

    entities += [entity.replace("the ", "") for entity in entities]
    entities += [entity.replace("'s", "").strip() for entity in entities]
    entities = list(set(entities))
    entities = sorted(entities, key=len, reverse=True)
#     print(entities)
    entity_ids = []
    for entity in entities:
        entity_ids += entity_related_to_movies(get_wikidata_ids(entity))
    
    entity_ids = reduce_based_on_ids(entity_ids)
#     entity_ids = sorted(entity_ids, key=len, reverse=True)
#     print(entity_ids)
    
    # get properties
    properties = get_properties(query, entity)
    
    property_ids = []
    for p in properties:
        property_ids += get_wikidata_ids(p, True)
     
    # remove duplicates
    property_ids = reduce_based_on_ids(property_ids)
   
    # wrap in a try/except to help with request errors
    try:
        # for each combination of entities and properties
        # it is likely that the entities and properties
        # are sorted by relevence/similarity by wikidata
        # so return the first result that it finds. This
        # is not guaranteed however
        for entity_id in entity_ids:
            for property_id in property_ids:
#                 print(entity_id['label'], property_id['label'])
                # general SPARQL query
                sparql_query = generate_sparql_query(entity_id['id'], property_id['id'])
                result = getSPARQLresults(sparql_query)

                # check if there is a result
                if result is not None and result!='':
                    print("Closest answer:")
                    print(f"        entity: {entity_id['label']}")
                    print(f"      property: {property_id['label']}\n")
                    return result

                if moderated:
                    time.sleep(0.75)
                    
    except:
        print("Error while searching!")
        if not moderated:
            print("Attempting moderated search!")
            return pipeline(query, moderated = True)
        
        else:
            pass # goes directly to final return statement
            
    return "Answer not found"

## Question handling

This QA system should be able to handle questions about movies of several types, but specifically desiged to be able to work with the following, with X being the property and Y being the entity:
- Who/What/When/etc was/is/were the/a/an X of Y? (from previous assignment, more passive, noun properties)
- Who/What/When/etc was/is/were Y X? (similar to above, more active, verb properties)
- How X is Y? (similar questions that use adjective properties)

The following are pairs of questions that the system is able to answer. These are in pairs to show that the same question that is phrased differently (as long as it follows an above format) should give the same answer. A noun property (e.g. height) can be translated to a adjective property (e.g. tall). Similarly, a verb property (acted) can be translated to a noun property (actor).

In [12]:
# qs = ['Who directed The Shawshank Redemption?'
#      ,'Who is the director of The Shawshank Redemption?'
      
#      ,'What is the birth date of Alan Rickman?'
#      ,'When was Alan Rickman born?'
      
#      ,'What is the height of Amitabh Bachchan?'
#      ,'How tall is Amitabh Bachchan?'
      
#      ,'What is the publication date of The Dark Knight?'
#      ,'When was The Dark Knight published?'
     
#      ,'Who acted as Gollum?'
#      ,'Which actor played Gollum?'
     
#      ,'What is the length of Interstellar?'
#      ,'How long does Interstellar run?'
#     ]
# q11 = '''When did Alan Rickman die?'''
# q12 = '''When was Pulp Fiction published?'''
# q13 = '''Where was Morgan Freeman born?'''
# q14 = '''Where does Home Alone originate?'''
# q15 = '''Which movies are directed by Alice Wu?'''
# q16 = '''How long is Pulp Fiction?'''
# q17 = '''How many episodes does Twin Peaks have?'''
# q18 = '''How long is Interstellar?'''
# q19 = '''Which character was married by Aragorn'''
# q20 = '''Which character did Aragorn marry?'''
# for i in range(12,20):
#     continue
#     qs.append(globals()['q'+str(i)])
    
# for q in qs:
#     print(f"Query: {q}")
#     print(pipeline(q))
#     print("\t**********\n")

In [13]:
print(pipeline("When was Kiki's Delivery Service published?"))

Closest answer:
        entity: Kiki's Delivery Service
      property: publication date

answerLabel	1989-07-29T00:00:00Z
answerLabel	1998-05-23T00:00:00Z
answerLabel	2004-03-31T00:00:00Z



In [14]:
from run_qs import get_q_list

qs = [q[0] for q in get_q_list()]

total_ans = 0
not_found = 0

f = open('answerlist.txt', 'w', encoding = 'utf-8')
f.write('No|Query|Answer\n')

for q in qs:
    print(f'{total_ans}) Query: {q}')
    ans = ''
    try:
        ans = pipeline(q)
    except:
        ans = 'Error'
        not_found += 1
    
    ans = ans.replace('answerLabel\t', "").strip()
    ans = ans.replace('\n', ", ").strip()
    
    try:
        f.write(f'{total_ans}|{q}|{ans}\n')
    except:
        print("Could not write to file")
        
    if ans == 'Answer not found':
        not_found += 1
    total_ans += 1
    
    print (f'\nAnswer: {ans}\n')

f.close()

print(f'Questions queried: {total_ans}')
print(f'Not found ratio: {not_found/total_ans}')

0) Query: What movie won best picture in the 2020 Academy Awards?

Answer: Answer not found

1) Query: Who voices Nala in the 2019 version of The Lion King?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

2) Query: What was the budget for Avengers: Endgame?
Closest answer:
        entity: The Avengers
      property: cost


Answer: 220000000

3) Query: Which is the longest Jim Carrey movie?

Answer: Answer not found

4) Query: What movie(s) feature both Jack Nicholson and Meryl Streep?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

5) Query: Which director has been nominated most for the Best Director Academy Award?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

6) Query: What is the name of the actor who's father was a hitman?

Answer: Error

7) Query: What movie(s) did M. Night Shyamalan direct before the year 2000?
Error 

Error while searching!

Answer: Answer not found

42) Query: Who are the executive producers of Suits?

Answer: Answer not found

43) Query: When did the pilot episode of The Blacklist air?

Answer: Error

44) Query: Who directed the 2018 biographical film about Queen?

Answer: Error

45) Query: In what country was Bohemian Rhapsody released on the 31st of October, 2018?
Closest answer:
        entity: Bohemian Rhapsody
      property: publication date


Answer: 2018-10-24T00:00:00Z, 2018-10-31T00:00:00Z, 2018-11-01T00:00:00Z, 2018-11-02T00:00:00Z

46) Query: What actor received an Oscar/Academy Award for their work in Bohemian Rhapsody?
Closest answer:
        entity: Rhapsody
      property: cast member


Answer: Elizabeth Taylor, Michael Chekhov, Vittorio Gassman, Celia Lovsky, Barbara Bates, John Mylong, Stuart Whitman, Louis Calhern, John Ericson, Richard Hageman, Marion Elisabeth Degler, Madge Blake, Konstantin Shayne, Stuart Holmes, Gordon Richards, Richard Lupino

47) Query: Fo

Error while searching!

Answer: Answer not found

97) Query: When did Charlie Chaplin pass away?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

98) Query: When was the subreddit for Star Wars created?
Closest answer:
        entity: Star Wars
      property: subreddit


Answer: StarWars

99) Query: What awards did Frozen receive?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

100) Query: Who is the director of the movie Midsommar?
Error while searching!
Attempting moderated search!
Closest answer:
        entity: Midsommar
      property: director


Answer: Ari Aster

101) Query: What are the main subjects of the movie Tenet?
Error while searching!
Attempting moderated search!

Answer: Answer not found

102) Query: What movie was Interstellar influenced by?
Closest answer:
        entity: Interstellar
      property: influenced by


Answer: 2001: A Space Odyssey

103) Query: Wh

Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

141) Query: When did Kung Fu Panda came out as a movie?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

142) Query: Who are the members of the Teenage Mutant Ninja Turtles?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

143) Query: Who is the director of Pulp Fiction?
Closest answer:
        entity: Pulp Fiction
      property: director


Answer: Quentin Tarantino

144) Query: Who is an enemy of Bilbo Baggins?

Answer: Answer not found

145) Query: Who is the father of Simba?
Closest answer:
        entity: Simba
      property: father


Answer: Mufasa

146) Query: What is the duration of I Am Legend?
Closest answer:
        entity: Legend
      property: instance of


Answer: film

147) Query: What genre is The Hangover?

Answer: Answer not found

148) Query: Who is Batman?

An

Closest answer:
        entity: Star Wars sequel trilogy
      property: director


Answer: J. J. Abrams, Rian Johnson

198) Query: Who played Hitler in Der Untergang?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

199) Query: Which movies did John Williams compose music for?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

200) Query: What year were the first Academy Awards hosted?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

201) Query: Who are the cast members in The Lighhouse?

Answer: Answer not found

202) Query: Where was The Seventh Seal filmed?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

203) Query: What is the original language of Memories of Murder?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

204) Q

Closest answer:
        entity: Interstellar
      property: publication date


Answer: 2014-10-26T00:00:00Z, 2014-11-05T00:00:00Z, 2014-11-06T00:00:00Z, 2014-11-07T00:00:00Z

256) Query: Where was Mark Ruffalo born?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

257) Query: How tall is Bryan Cranston?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

258) Query: What languages does Christoph Waltz speak?
Closest answer:
        entity: Christoph Waltz
      property: languages spoken, written or signed


Answer: French, German, English

259) Query: Who is the director of Fury?

Answer: Error

260) Query: How many awards has Quentin Tarentino won?

Answer: Answer not found

261) Query: When was The Shawshank Redemption released in the United States?
Closest answer:
        entity: The Shawshank Redemption
      property: publication date


Answer: 1994-09-10T00:00:00Z, 1994-09-23T

Closest answer:
        entity: Mission: Impossible – Fallout
      property: screenwriter


Answer: Christopher McQuarrie

313) Query: When did Paul Walker die?
Closest answer:
        entity: Paul Walker
      property: date of death


Answer: 2013-11-30T00:00:00Z

314) Query: How many twitter followers does Zac Efron have?
Closest answer:
        entity: Zac Efron
      property: social media followers


Answer: 15622151

315) Query: Who are the creators of How I met your mother?

Answer: Error

316) Query: Who is the main character in Hannah Montena?

Answer: Answer not found

317) Query: What year was the notebook made?

Answer: Error

318) Query: Who are the directors of Iron man

Answer: Answer not found

319) Query: Who won the first Academy Award?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

320) Query: Who won the most Academy Awards?

Answer: Answer not found

321) Query: What is the highest grossing movie of all time?


Answer: Answer not found

372) Query: What is the title of the theme music of the TV-show Friends?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

373) Query: When did Chadwick Boseman die?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

374) Query: Which actor plays the lead character in the TV-series the Mandalorian?
Closest answer:
        entity: The Mandalorian
      property: cast member


Answer: Werner Herzog, Nick Nolte, Mark Hamill, Rosario Dawson, Katee Sackhoff, Gina Carano, Ming-Na Wen, Amy Sedaris, Omid Abtahi, Carl Weathers, Temuera Morrison, Giancarlo Esposito, Bill Burr, Pedro Pascal, Emily Swallow

375) Query: Who is the creator of the TV-series the Mandalorian?
Closest answer:
        entity: The Mandalorian
      property: creator


Answer: Jon Favreau

376) Query: What are the main subjects of the Netflix miniseries the Queen's Gambit?

Answer: Answer not fo


Answer: Error

429) Query: When did the first Harry Potter movie come out?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

430) Query: Who played Frodo Baggins in The Lord of the Rings movies?
Closest answer:
        entity: Frodo Baggins
      property: performer


Answer: Ian Holm, Elijah Wood

431) Query: Where did the Oscars take place in 2020?
Error while searching!
Attempting moderated search!
Closest answer:
        entity: The Oscars
      property: country of origin


Answer: United States of America

432) Query: How long is the movie "127 Hours"?

Answer: Error

433) Query: What is the genre of the movie "Zootopia"?
Closest answer:
        entity: Zootopia
      property: genre


Answer: adventure film, family film, action comedy film, computer-animated film

434) Query: Who directed the movie "Zootopia"?
Closest answer:
        entity: Zootopia
      property: director


Answer: Byron Howard, Rich Moore, Jared Bush

435)

Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

481) Query: Who were the creators of one of the most popular TV shows of all times, 'Game of Thrones', often referred to by the fanbase as D and D?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

482) Query: What are the names of all three parts of the Lord of the Rings trilogy?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

483) Query: What is the release year of 'Die Hard'?

Answer: Error

484) Query: What is the country of origin of the animated series 'Death Note'?

Answer: Answer not found

485) Query: Who composed the music for the epic science fantasy space opera saga and media franchise 'Star Wars'?
Closest answer:
        entity: Star Wars: Episode IV – A New Hope
      property: genre


Answer: fantasy film, action film, adventure film, space opera, science fiction fil

Closest answer:
        entity: Ender's Game
      property: screenwriter


Answer: Gavin Hood

527) Query: Who is the oldest producer of the "Dark Knight"?

Answer: Answer not found

528) Query: When was Quentin Tarantino born?
Closest answer:
        entity: Quentin Tarantino
      property: date of birth


Answer: 1963-03-27T00:00:00Z

529) Query: What's the name of the character played by Samuel L. Jackson in The Avengers Saga?
Closest answer:
        entity: Samuel L. Jackson
      property: given name


Answer: Samuel, Leroy

530) Query: What's the name of the character played by Samuel L. Jackson in The Avengers Saga?
Closest answer:
        entity: Samuel L. Jackson
      property: given name


Answer: Samuel, Leroy

531) Query: Who created Star Wars?
Closest answer:
        entity: Star Wars: Episode IV – A New Hope
      property: original language of film or TV show


Answer: English

532) Query: What were the filming locations of The Matrix?

Answer: Answer not found

533) 

Closest answer:
        entity: Chandler Bing
      property: performer


Answer: Matthew Perry

577) Query: When did the Avatar film hit cinemas in Germany?
Closest answer:
        entity: Avatar
      property: genre


Answer: science fiction, action film, adventure film, science fiction film, military science fiction, live-action animated film

578) Query: What is the first movie in the Harry Potter series?

Answer: Answer not found

579) Query: What is the original language of Harry Potter and the Chamber of Secrets?
Closest answer:
        entity: Harry Potter and the Chamber of Secrets
      property: original language of film or TV show


Answer: English

580) Query: When is J.K.Rowling's birthday?

Answer: Error

581) Query: What is the series ordinal of Detective Conan: The Private Eyes' Requiem?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

582) Query: Who is the director of Detective Conan: The Private Eyes' Requiem?
Cl

Closest answer:
        entity: Terence Hill
      property: date of birth


Answer: 1939-03-29T00:00:00Z

639) Query: What is the title of Pulp Fiction?
Error while searching!
Attempting moderated search!
Closest answer:
        entity: Pulp Fiction
      property: title


Answer: Pulp Fiction

640) Query: What is Pulp Fiction?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

641) Query: What are the given names of Leonardo DiCaprio?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

642) Query: Who is the mother of Leonardo DiCaprio?
Closest answer:
        entity: Leonardo DiCaprio
      property: mother


Answer: Irmelin DiCaprio

643) Query: How many people live in Hollywood?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

644) Query: What is Schindler's List about?

Answer: Answer not found

645) Query: Who are the directors

Closest answer:
        entity: Reservoir Dogs
      property: director


Answer: Quentin Tarantino

692) Query: Who is Hans Zimmer?

Answer: Answer not found

693) Query: Who won the oscar for Best Actress between 2012 - 2016
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

694) Query: Who played Wilm Hosenfeld in The Pianist?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

695) Query: Who is the voice of Shrek?
Closest answer:
        entity: Shrek
      property: voice actor


Answer: Eddie Murphy, Cameron Diaz, Mike Myers, Vincent Cassel, Chris Miller, Jim Cummings, John Lithgow, Cody Cameron, Conrad Vernon, Aron Warner, Christopher Knights

696) Query: How many awards did Christopher Nolan receive?
Closest answer:
        entity: Christopher Nolan
      property: award received


Answer: Order of the British Empire, Saturn Award, Sundance Film Festival, Edgar Awards, Commande

Closest answer:
        entity: Monsters, Inc.
      property: followed by


Answer: Monsters University

741) Query: In what format is David Attenborough: A Life on Our Planet available?

Answer: Answer not found

742) Query: Where was David Attenborough educated?

Answer: Answer not found

743) Query: When premiered the prequel to Incredibles 2?
Closest answer:
        entity: Incredibles 2
      property: follows


Answer: The Incredibles

744) Query: How long does Home Alone take?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

745) Query: What is the site of the 1997 Titanic movie?
Closest answer:
        entity: Titanic
      property: official website


Answer: http://www.titanicmovie.com

746) Query: Of how many movies consists the Harry Potter series?
Closest answer:
        entity: Harry Potter
      property: child


Answer: Albus Severus Potter, Lily Luna Potter, James Sirius Potter

747) Query: Which genre is Johnny Eng

Closest answer:
        entity: Rocky IV
      property: screenwriter


Answer: Sylvester Stallone

795) Query: Which production company produced Twister?
Error while searching!
Attempting moderated search!
Closest answer:
        entity: Twister
      property: production company


Answer: Amblin Entertainment

796) Query: What is the country of origin of "10 Things I Hate About You"?

Answer: Error

797) Query: Which song is the theme music of "No Time to Die"?

Answer: Error

798) Query: What genres is the movie Grease?

Answer: Answer not found

799) Query: What is the duration of The Fast and the Furious?
Closest answer:
        entity: The Fast and the Furious
      property: duration


Answer: 103

800) Query: Who voices the character Po in Kung Fu Panda?
Closest answer:
        entity: Panda
      property: has quality


Answer: debut single

801) Query: What is the publication date of The Road to El Dorado?
Closest answer:
        entity: El Dorado
      property: publication 

Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

851) Query: Which movies, directed by David O. Russell, did Jennifer Lawrence act in?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

852) Query: When was Quentin Tarantino's most recent movie released?
Closest answer:
        entity: Death Proof
      property: director


Answer: Quentin Tarantino

853) Query: What are the nicknames of the lead character of The Big Lebowski?

Answer: Answer not found

854) Query: When were the fiftieth Oscars held?

Answer: Answer not found

855) Query: In what year did Meryl Streep start acting?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

856) Query: When did Rushmore come out?
Closest answer:
        entity: Rushmore
      property: followed by


Answer: The Royal Tenenbaums

857) Query: What are the notable works of composer John Williams

Closest answer:
        entity: Pulp Fiction
      property: duration


Answer: 154

912) Query: Where was Joker filmed?
Closest answer:
        entity: Filmed in Supermarionation
      property: genre


Answer: documentary film

913) Query: What genre is Schindler's List?

Answer: Answer not found

914) Query: When was James Cameron born?
Closest answer:
        entity: James Cameron
      property: date of birth


Answer: 1954-08-16T00:00:00Z

915) Query: How much did The LEGO Batman Movie make at box office?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

916) Query: What movies has Ben Affleck directed?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

917) Query: What genre is Argo?

Answer: Answer not found

918) Query: Who are the cast members of The Blair Witch Project?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

919

Closest answer:
        entity: Hotel
      property: title


Answer: Hotel

967) Query: What is the duration of The Grand Budapest Hotel?
Closest answer:
        entity: Hotel
      property: duration


Answer: 114

968) Query: Where is the headquarter of Marvel Studios?
Closest answer:
        entity: Marvel Studios
      property: headquarters location


Answer: Burbank

969) Query: What character does Bruce Willis play in Die Hard?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

970) Query: Who created the Star Trek franchise?
Closest answer:
        entity: Star Trek is…
      property: media franchise


Answer: Star Trek

971) Query: Who is the composer of Star Trek III: The Search for Spock?
Closest answer:
        entity: Star Trek III: The Search for Spock
      property: composer


Answer: James Horner

972) Query: In what year was the documentary Miss Americana released?

Answer: Answer not found

973) Query: What species


Answer: Answer not found

1022) Query: Where was Roger Federer born?

Answer: Answer not found

1023) Query: What is the gender of Elon Musk?

Answer: Answer not found

1024) Query: Who founded LG Electronics?

Answer: Answer not found

1025) Query: When was the Playstation 5 released?

Answer: Answer not found

1026) Query: What was the life expectancy in The Netherlands in 1999?

Answer: Answer not found

1027) Query: Who is the composer of Iron Man?
Closest answer:
        entity: Iron Man
      property: composer


Answer: Ramin Djawadi

1028) Query: When was Jurassic Park first released?
Closest answer:
        entity: Jurassic Park
      property: publication date


Answer: 1993-06-11T00:00:00Z, 1993-09-02T00:00:00Z, 1993-09-03T00:00:00Z, 1993-10-20T00:00:00Z

1029) Query: Where was The Lord of the Rings: The Fellowship of the Ring filmed?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

1030) Query: What movie genre is How to


Answer: Error

1075) Query: When was Anne Hathaway born?
Error while searching!
Attempting moderated search!
Closest answer:
        entity: Anne Hathaway
      property: date of birth


Answer: 1982-11-12T00:00:00Z

1076) Query: What time periods is City of God set in?

Answer: Answer not found

1077) Query: On what book is Apocalypse Now based?

Answer: Error

1078) Query: How much did Titanic cost to make?
Error while searching!
Attempting moderated search!
Closest answer:
        entity: Titanic
      property: cost


Answer: 200000000

1079) Query: What language is spoken in Parasite?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

1080) Query: How many children does David Lynch have?
Closest answer:
        entity: David Lynch
      property: child


Answer: Jennifer Lynch, Austin Jack Lynch, Lula Boginia Lynch, Riley Sweeney Lynch

1081) Query: Who directed The Big Lebowski?
Closest answer:
        entity: The Big Lebowski: 

Closest answer:
        entity: Fight Club
      property: director


Answer: David Fincher

1129) Query: When was the movie Fight Club released for the first time?
Closest answer:
        entity: Fight Club
      property: director


Answer: David Fincher

1130) Query: When was the academy awards Oscars organised for the first time?

Answer: Error

1131) Query: Who is the founder and creator of Star Wars?
Closest answer:
        entity: Star Wars
      property: founded by


Answer: George Lucas

1132) Query: Who directed The Shawshank Redemption?
Closest answer:
        entity: The Shawshank Redemption
      property: director


Answer: Frank Darabont

1133) Query: What's the name of Batman?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

1134) Query: When was Brad Pitt born?
Error while searching!
Attempting moderated search!
Error while searching!

Answer: Answer not found

1135) Query: Where was Christopher Nolan born?
Closest 

In [15]:
print(pipeline("Is Leonardo DiCaprio an actor?"))

Answer not found


In [17]:
import pandas as pd
df = pd.read_csv('answerlist.txt', sep='|')

print('Questions that could not be answered:')
filtered = (df[df['Answer']=='Answer not found'])

for index, row in filtered.iterrows():
    print(row["Query"])

Questions that could not be answered:
What movie won best picture in the 2020 Academy Awards?
Who voices Nala in the 2019 version of The Lion King?
Which is the longest Jim Carrey movie?
What movie(s) feature both Jack Nicholson and Meryl Streep?
Which director has been nominated most for the Best Director Academy Award?
What movie(s) did M. Night Shyamalan direct before the year 2000?
Which actors have portrayed the character James Bond?
Where was Titanic shot?
What kind of film is Iron Man?
How much did it cost to film Tenet?
In what year did the first Harry Potter movie come out?
What are the titles of the Harry Potter movies?
What is the Kijkwijzer rating of The Hitman's Bodygard?
When was the last Lord of the Rings movie released
Which awards did Blade Runner receive
In which aspect ratio was Kingsman: The Secret Service filmed
What university did Adam Sandler attend?
How many Pokémon episodes are there?
Who are the executive producers of Suits?
What is the Netflix ID of the film 