# Assignment no.2: 
Implementation of information retrieval system (TREC Collection)

## 💬 Commentary
This solution was supposed to implement similar solution to the one from first assignment 
(https://github.com/adamhospodka/Information_Retrieval_System/blob/main/Implementation_of_information_retrieval_system.ipynb) 
adding the ML capabilities.

However due to time/performance issues I had to downgrade from the additional ML process. It only relies on TF-IDF model.

Unfortunatelly, even this simple solution still scores *0.00 MAP*. I give up.



Steps:

1. List every word from every document using ```multi_buildPairs()``` function
2. Remove duplicates using the ```uniq()``` function
3. Based on these pairs create inverted index using the ```buildInvertedIndex()``` function
4. Use knowledge of inverted index to build frequency index (inv. index with word frequencies) using the ```multi_buildFrequencyIndex()``` function
5. Build index that contains length of each document using the ```buildDocumentsLengthIndex()``` function
6. Prepare Pandas DataFrame with document instances ready to be ranked using the ```buildRankingDf()``` function
7. Initialize the search with ```IRSystem.search()```
8. Clean the given query with the ```cleanQuery()``` function
9. Rank the relevant documents with the ```rank()``` function

## 📚 Imports

In [None]:
! pip install git+https://gitlab.fi.muni.cz/xstefan3/pv211-utils.git@master | grep '^Successfully'
! pip install pandas
! pip install numpy
! pip install nltk

In [1]:
import math
import time
import pickle
import json
import numpy as np
import pandas as pd

import nltk
from nltk.corpus import stopwords

from tqdm import tqdm
from collections import OrderedDict
from itertools import chain

from pv211_utils.trec.loader import load_documents
from pv211_utils.trec.entities import TrecQueryBase
from pv211_utils.trec.loader import load_queries
from pv211_utils.trec.entities import TrecDocumentBase
from pv211_utils.trec.loader import load_judgements
from pv211_utils.trec.leaderboard import TrecLeaderboard
from pv211_utils.trec.eval import TrecEvaluation
from pv211_utils.trec.irsystem import TrecIRSystemBase


In [None]:
stemmer = nltk.PorterStemmer()
stopwords = stopwords.words('english')

In [None]:
nltk.download('punkt')
nltk.download('stopwords')

## 🏗 Classes & Instances loading

#### Documents/Corpus

In [2]:
class Document(TrecDocumentBase):
    """
    A preprocessed TREC collection document
    """

    def __init__(self, document_id: str, body: str):        
        super().__init__(document_id, body)

In [3]:
documents = load_documents(Document, cache_download='/var/tmp/pv211/trec_documents.json.gz')

Computing MD5: /var/tmp/pv211/trec_documents.json.gz
MD5 matches: /var/tmp/pv211/trec_documents.json.gz


In [4]:
CORPUS = documents

#### Queries

In [5]:
class Query(TrecQueryBase):
    """
    A preprocessed TREC collection query
    """    

    def __init__(self, query_id: int, title: str, body: str, narrative: str):

        super().__init__(query_id, title, body, narrative)

In [6]:
train_queries = load_queries(Query, 'train')
validation_queries = load_queries(Query, 'validation')

bigger_train_queries = OrderedDict(chain(train_queries.items(), validation_queries.items()))

#### Judgements

In [7]:
train_judgements = load_judgements(train_queries, documents, 'train')
validation_judgements = load_judgements(validation_queries, documents, 'validation')

bigger_train_judgements = train_judgements | validation_judgements

---

## ⚙️ Preprocessing

#### Tokenizing bodies

In [2]:
def tokenize(document):
    """
    Creates a dictionary with document ids (keys) and list of stemmed, tokenized terms (values).
    Constructed to be compatible with the pool.map() function.
    """
    
    doc_id = document.document_id
    body = document.body
    body = nltk.word_tokenize(body)
    body = [stemmer.stem(token) for token in body]
    
    return {doc_id: body}

In [None]:
# Parallel processing
with Pool(None) as pool:
    tokenized_bodies = pool.map(tokenize, tqdm(CORPUS.values()) )

In [None]:
# Reforming
tokenized_bodies_unzipped = {}
for document in tokenized_bodies:
    key = list(document.keys())[0]
    tokenized_bodies_unzipped[key] = document[key]

##### Save and load pickle

In [None]:
# Save
with open('/data/xhospodk/tokenized_bodies_unzipped.pkl', 'wb') as f:
    pickle.dump(tokenized_bodies_unzipped, f)

In [8]:
# Load
with open('/data/xhospodk/tokenized_bodies_unzipped.pkl', 'rb') as f:
    tokenized_bodies_unzipped = pickle.load(f)

#### Build Pairs

In [None]:
def multi_buildPairs(document):
    """
    Crawls every document instance and creates a tuples -> (token, document id). 
    It only does it so when token obeys certain conditions.
    """
    
    pairs = []
    punc = '''!(),---[]{};::'"\ <>./?@#$%^&*_␣~\n\n'''
       
    tokens = tokenized_bodies_unzipped[document.document_id]

    token = ""
    doc_id = ""

    for token in tokens:
        if len(token) > 2:
            if token not in stopwords:
                for letter in token:
                    if letter in punc:
                        token = token.replace(letter, "")
                            
                doc_id = document.document_id
                pairs.append((token, doc_id))
    
    return pairs

In [None]:
# Parallel processing
with Pool() as pool:
    multi_pairs = pool.map(multi_buildPairs, tqdm(CORPUS.values() ))

In [None]:
# Reforming
multi_pairs_unzipped = []
for array_of_tuples in tqdm(multi_pairs):
    for tuple in array_of_tuples:
        multi_pairs_unzipped.append(tuple)

#### Uniq

In [None]:
def uniq(sorted_list):
    """
    Reduces the list of pairs only to unique tuples (a term can be only once in a document)
    """
    
    if len(sorted_list) <= 1:
        return sorted_list

    uniq_list = sorted_list[:1]
    previous_value = sorted_list[0]

    for value in sorted_list[1:]:
        if value != previous_value:
            uniq_list.append(value)
            previous_value = value
                
    return uniq_list

In [None]:
uniq_pairs = uniq(sorted(tqdm(multi_pairs_unzipped), key=lambda x: (x[0].lower(), x[1])))

#### Inverted Index

In [None]:
def buildInvertedIndex(uniq_pairs):
      """ 
      Creates inverted index using the list of unique pairs (token, document id).
      Returns inverted index (dictionary)
      """

    inverted_index = {}

    for term, document_id in tqdm(multi_pairs_unzipped):
        if term not in inverted_index:
            inverted_index[term] = []

        inverted_index[term].append(document_id)
    
    return inverted_index

In [None]:
inverted_index = buildInvertedIndex(uniq_pairs)

##### Save and load pickle

In [None]:
# Save
with open('/data/xhospodk/inverted_index.pkl', 'wb') as f:  # Save
    pickle.dump(inverted_index, f)

In [9]:
# Load
with open('/data/xhospodk/inverted_index.pkl', 'rb') as f:
    inverted_index = pickle.load(f)

#### Frequency Index

In [None]:
def multi_buildFrequencyIndex(key):
    """ 
    Creates frequency index (inverted index with term frequencies)
    using inverted index as heuristics for crawling.
    Returns frequency index (dictinoary).
    """
  
    local_list = {}
    relevant_documents = inverted_index[key]

    for doc_id in relevant_documents:
      body = tokenized_bodies_unzipped[doc_id]
      frequency = body.count(key)
      local_list[doc_id] = frequency

    return local_list

In [None]:
# Parallel processing
with Pool(processes=60) as pool:
    frequency_index = pool.map(multi_buildFrequencyIndex, tqdm(inverted_index.keys()) )

##### Save and load pickle

In [None]:
with open('/data/xhospodk/frequency_index.pkl', 'wb') as f:  # Save
    pickle.dump(frequency_index, f)

In [None]:
with open('frequency_index.pkl', 'rb') as f:
    frequency_index = pickle.load(f)

#### Documents Length Index

In [10]:
def buildDocumentsLengthIndex():
    """ 
    Creates index containing information about documents lengths.
    Returns documents length index (dictionary).
    """

    documents_length_index = {}

    for key in tqdm(CORPUS.keys()):
        documents_length_index[key] = len(tokenized_bodies_unzipped[key])
    
    return documents_length_index

In [11]:
documents_length_index = buildDocumentsLengthIndex()

100%|██████████| 527890/527890 [00:01<00:00, 480614.34it/s]


## 🧮 Ranking Dataframe

In [20]:
def buildRankingDf():
    """ 
    Prepares dataframe (table) with document instances loaded.
    Ranking and sorting will be happening within this structure once called.
    Returns dataframe.
    """
    
    id_as_list = [key for key, value in tqdm(CORPUS.items())]
    text_as_list = [value.body[:10] for key, value in tqdm(CORPUS.items())]

    df = pd.DataFrame({"ID":  id_as_list, "Values": text_as_list, "Rank": 0, "Matches": "" }) 
    df = df.set_index("ID")

    return df

In [21]:
ranking_df = buildRankingDf()

100%|██████████| 527890/527890 [00:00<00:00, 1108061.72it/s]
100%|██████████| 527890/527890 [00:02<00:00, 263076.86it/s]


In [25]:
ranking_df.shape

(527890, 3)

## 🧹 Query Cleaning

In [57]:
def cleanQuery(query):
  """ 
  Cleans passed query's body.
  Tokenization, stemming, stopwords and (words< 1) removed.
  Returns clean query body (array).
  """        

  query = query.body
  query = nltk.word_tokenize(query)
  query = [stemmer.stem(token) for token in query]
  query = [token for token in query if len(token) > 1 if token not in stopwords]

  return query

___

## 💯 Ranking Mechanism

In [134]:
def rank(query):
    """ 
    For every token in query ranks documents +1 for every token appearing in the document
    """

    global ranking_df
    ranking_df["Rank"] = 0.0
    ranking_df["Matches"] = ""

    for term in query:
        if term in inverted_index:
            for relevant_document_id in inverted_index[term]:
                if term not in ranking_df.at[relevant_document_id, "Matches"]:
                    ranking_df.at[relevant_document_id, "Rank"] += 1
                    ranking_df.at[relevant_document_id, "Matches"] +=  " " + term


    ranking_df = ranking_df.sort_values("Rank", ascending = False)
    sorted_documents = ranking_df["Values"].tolist()

    return sorted_documents

## 🚧 Information retrieval system and evaluation

In [135]:
class IRSystem(TrecIRSystemBase):

    def __init__(self, print_matrix):
        self.print_matrix = print_matrix
        random_documents = sorted(set(document for query, document in train_judgements))
        self.random_documents = random_documents

    def search(self, query: Query) -> Iterable[Document]:
        query = cleanQuery(query)
        sorted_documents = rank(query)

        if self.print_matrix == True:
          print("Query: ",query)
          print(ranking_df.head(20))

        return(sorted_documents)

In [136]:
submit_result = False
author_name = "Hospodka, Adam"
system = IRSystem(print_matrix = True)


print('Initializing your system ...', end='', flush=True)
test_queries = load_queries(Query, 'test')
test_judgements = load_judgements(test_queries, documents, 'test')
evaluation = TrecEvaluation(system, test_judgements, TrecLeaderboard(), author_name)
print(end='\r', flush=True)
evaluation.evaluate(tqdm(test_queries.values(), desc="Querying your system, brother", leave=False), submit_result)

Initializing your system ...

Querying your system, brother:   0%|          | 0/50 [00:00<?, ?it/s]

<Query 401 “What language and cultural differences impede the  ...”>
<class '__main__.Query'>
Query:  ['languag', 'cultur', 'differ', 'imped', 'integr', 'foreign', 'minor', 'germani']
                        Values  Rank  \
ID                                     
FBIS4-20472         Policy (Wh   8.0   
FBIS3-42766         Language:    8.0   
FBIS3-36078         Language:    8.0   
FBIS3-30533         Language:    7.0   
FBIS3-41666         Federation   7.0   
FT924-3751          FOR THE tr   7.0   
FBIS3-20098         Language:    7.0   
FBIS3-31672         Language:    7.0   
FBIS4-46969         CSO [Round   7.0   
FBIS3-23561         Article      7.0   
FBIS3-24361         Language:    7.0   
FBIS3-44708         Language:    7.0   
FBIS3-35195         Language:    7.0   
FBIS3-31620         Language:    7.0   
FBIS4-46766         CSO [Dialo   7.0   
FR940602-1-00023  Thursday\n\n   7.0   
FBIS3-19645         Language:    7.0   
FBIS3-7981          Language:    7.0   
FBIS4-40131     

Querying your system, brother:   2%|▏         | 1/50 [00:25<21:04, 25.80s/it]

<Query 402 “What is happening in the field of behavioral genet ...”>
<class '__main__.Query'>
Query:  ['happen', 'field', 'behavior', 'genet', 'studi', 'rel', 'influenc', 'genet', 'environment', 'factor', 'individu', "'s", 'behavior', 'person']
                      Values  Rank  \
ID                                   
FBIS4-20472       Policy (Wh  11.0   
LA100790-0251     Everything  10.0   
FBIS3-60342       the Former  10.0   
FBIS4-68893       CSO [Forew  10.0   
FR940104-0-00034  21 CFR Par  10.0   
FBIS3-37947       Language:   10.0   
FBIS3-24469       Language:   10.0   
LA062590-0042     At a confe  10.0   
LA122489-0017     THE WORLD   10.0   
FR940727-2-00007  National O  10.0   
FBIS3-54460       Language:   10.0   
FBIS3-14832       Language:   10.0   
FBIS3-36078       Language:   10.0   
FBIS3-61193       Language:   10.0   
FBIS4-29          Foreign     10.0   
FBIS3-41666       Federation  10.0   
FBIS3-40450       Language:    9.0   
FBIS3-58841       Language:    9.

Querying your system, brother:   4%|▍         | 2/50 [00:53<21:21, 26.70s/it]

<Query 403 “Find information on the effects of the dietary int ...”>
<class '__main__.Query'>
Query:  ['find', 'inform', 'effect', 'dietari', 'intak', 'potassium', 'magnesium', 'fruit', 'veget', 'determin', 'bone', 'miner', 'densiti', 'elderli', 'men', 'women', 'thu', 'prevent', 'osteoporosi', 'bone', 'decay']
                         Values  Rank  \
ID                                      
FBIS4-20472          Policy (Wh  15.0   
FR940112-1-00044    Wednesday\n  15.0   
FR940104-0-00034     21 CFR Par  14.0   
LA091489-0031        Many of us  14.0   
LA033089-0013        The scient  14.0   
FR940104-0-00033     DEPARTMENT  14.0   
FR940204-1-00053     DEPARTMENT  13.0   
FBIS3-41666          Federation  13.0   
LA020989-0063        The carica  12.0   
FBIS3-22119          Language:   12.0   
FBIS4-66382          Undertakin  12.0   
LA020490-0136        With the h  12.0   
LA111689-0032        Are you am  12.0   
FR940104-0-00032     DEPARTMENT  12.0   
FR941130-0-00122    Wednesday\n 

Querying your system, brother:   6%|▌         | 3/50 [01:28<23:56, 30.55s/it]

<Query 404 “How often were the peace talks in Ireland delayed  ...”>
<class '__main__.Query'>
Query:  ['often', 'peac', 'talk', 'ireland', 'delay', 'disrupt', 'result', 'act', 'violenc']
                     Values  Rank  \
ID                                  
FBIS4-19453      BFN [Inter   8.0   
FBIS3-31843      Language:    8.0   
FBIS3-37947      Language:    8.0   
FBIS3-207        Language:    8.0   
FBIS4-43233      BFN [Speec   8.0   
FBIS4-33915      BFN [Inter   8.0   
FBIS3-14832      Language:    8.0   
FBIS3-36078      Language:    8.0   
LA011589-0005    What follo   8.0   
FBIS4-67533    CSO \n\n       8.0   
FBIS4-15599      BFN [Inter   8.0   
FBIS4-44356      BFN [EC Do   7.0   
FBIS4-26002      BFN ["Text   7.0   
FBIS3-15218      Language:    7.0   
FBIS3-33775      Language:    7.0   
FBIS3-1164       Language:    7.0   
FBIS4-46649      1-464 FOR    7.0   
LA090389-0004    THE mornin   7.0   
FT934-5486       JANUARY 11   7.0   
FBIS4-63176      BFN [Inter   7.0   

Querying your system, brother:   8%|▊         | 4/50 [01:58<23:15, 30.34s/it]

<Query 405 “What unexpected or unexplained cosmic events or ce ...”>
<class '__main__.Query'>
Query:  ['unexpect', 'unexplain', 'cosmic', 'event', 'celesti', 'phenomena', 'radiat', 'supernova', 'outburst', 'new', 'comet', 'detect']
                        Values  Rank  \
ID                                     
FBIS3-42400         88) pp 1-2   7.0   
FBIS4-66382         Undertakin   7.0   
FBIS3-40352         Language:    7.0   
FBIS3-41666         Federation   6.0   
FBIS3-59752         Language:    6.0   
FBIS4-46649         1-464 FOR    6.0   
FBIS3-40388         OFFICIAL U   6.0   
FBIS3-43042         Language:    6.0   
FBIS3-43004         Language:    6.0   
FBIS4-46469         CSO [Book:   6.0   
FT944-864           Lo, the St   6.0   
FBIS4-66185       CSO \n\n       5.0   
FBIS4-45649         CSO ["Draf   5.0   
LA040590-0149       The upturn   5.0   
FBIS4-38            BOOKS,   J   5.0   
FBIS3-21268         Updated 29   5.0   
FR940602-1-00023  Thursday\n\n   5.0   
LA090889

Querying your system, brother:  10%|█         | 5/50 [02:21<20:55, 27.89s/it]

<Query 406 “What is being done to treat the symptoms of Parkin ...”>
<class '__main__.Query'>
Query:  ['done', 'treat', 'symptom', 'parkinson', "'s", 'diseas', 'keep', 'patient', 'function', 'long', 'possibl']
                         Values  Rank  \
ID                                      
FBIS3-37947          Language:    9.0   
FR940602-1-00023   Thursday\n\n   9.0   
FBIS3-41666          Federation   9.0   
FR940822-0-00067  Monday\n\n\nA   9.0   
LA101090-0146        Every morn   9.0   
FR940203-1-00069   Thursday\n\n   9.0   
LA030190-0065        These days   8.0   
FR940118-0-00013     One commen   8.0   
FT932-7262           The drugs    8.0   
FR940406-0-00190    Wednesday\n   8.0   
LA081290-0070        KIMON BEAZ   8.0   
LA110490-0018        Dr. Robert   8.0   
LA101090-0147        Although A   8.0   
LA061490-0184        John arriv   8.0   
FBIS3-60342          the Former   8.0   
LA111289-0076        A common g   8.0   
LA031890-0074        Don't suga   8.0   
FBIS3-36078

Querying your system, brother:  12%|█▏        | 6/50 [02:45<19:25, 26.48s/it]

<Query 407 “What is the impact of poaching on the world's vari ...”>
<class '__main__.Query'>
Query:  ['impact', 'poach', 'world', "'s", 'variou', 'wildlif', 'preserv']
                        Values  Rank  \
ID                                     
FBIS3-41666         Federation   6.0   
FBIS3-60342         the Former   6.0   
LA021990-0093       Almost eve   5.0   
LA091390-0020       Just west    5.0   
LA050889-0063       This is wh   5.0   
LA061889-0100       Don David    5.0   
LA121089-0204       Kaolelo Ul   5.0   
FBIS3-207           Language:    5.0   
FBIS4-67533       CSO \n\n       5.0   
FBIS4-57817         BFN [State   5.0   
LA051389-0024       Madagascar   5.0   
LA122690-0005       Like an in   5.0   
FBIS4-27085         BFN [Repor   5.0   
FBIS4-62092         BFN [Concl   4.0   
FT942-8138          In an over   4.0   
LA092189-0193       More than    4.0   
FBIS4-66185       CSO \n\n       4.0   
FT933-5757          NO ISSUE d   4.0   
FR940831-2-00002    DEPARTMENT 

Querying your system, brother:  14%|█▍        | 7/50 [02:59<15:59, 22.32s/it]

<Query 408 “What tropical storms (hurricanes and typhoons) hav ...”>
<class '__main__.Query'>
Query:  ['tropic', 'storm', 'hurrican', 'typhoon', 'caus', 'signific', 'properti', 'damag', 'loss', 'life']
                        Values  Rank  \
ID                                     
FBIS4-20472         Policy (Wh   9.0   
FBIS4-67533       CSO \n\n       9.0   
FBIS3-41666         Federation   8.0   
FBIS4-66382         Undertakin   8.0   
FBIS3-36078         Language:    8.0   
FR941026-2-00082    NUCLEAR RE   8.0   
FT923-4394          KLAUS CONR   8.0   
LA120489-0058       Two months   7.0   
FBIS4-68275         CSO [Book    7.0   
FBIS4-47642         Media        7.0   
FBIS4-26773         BFN [By re   7.0   
FBIS4-45041       CSO \n\n  [T   7.0   
FBIS3-42811         Language:    7.0   
FBIS3-42399         Language:    7.0   
FR941221-2-00104    Biweekly N   7.0   
FBIS3-22119         Language:    7.0   
FBIS3-54461         Language:    7.0   
FBIS3-42459         Article      7.0  

Querying your system, brother:  16%|█▌        | 8/50 [03:20<15:26, 22.06s/it]

<Query 409 “What legal actions have resulted from the destruct ...”>
<class '__main__.Query'>
Query:  ['legal', 'action', 'result', 'destruct', 'pan', 'flight', '103', 'lockerbi', 'scotland', 'decemb', '21', '1988']
                     Values  Rank  \
ID                                  
FBIS3-29         Foreign      9.0   
FBIS4-43233      BFN [Speec   9.0   
LA031789-0067    Airline an   9.0   
LA091190-0052    Secretary    8.0   
LA091590-0045    Secretary    8.0   
FBIS3-24678      Table        8.0   
FT923-13692      A US jury    8.0   
FBIS3-60342      the Former   8.0   
FBIS4-19450      BFN [Unatt   8.0   
LA121589-0082    The State    8.0   
LA031190-0197    Last week    8.0   
FBIS3-61336      Language:    8.0   
FBIS3-23561      Article      8.0   
FBIS3-61335      Language:    8.0   
FBIS4-20504      Program 11   8.0   
FBIS3-67         Foreign      8.0   
FBIS4-42         Table        8.0   
FBIS4-66185    CSO \n\n       8.0   
FBIS4-22027      BFN [Unatt   8.0   
FBIS3-4

Querying your system, brother:  18%|█▊        | 9/50 [03:46<15:44, 23.04s/it]

<Query 410 “Who is involved in the Schengen agreement to elimi ...”>
<class '__main__.Query'>
Query:  ['involv', 'schengen', 'agreement', 'elimin', 'border', 'control', 'western', 'europ', 'hope', 'accomplish']
                   Values  Rank  \
ID                                
FBIS3-29       Foreign      9.0   
FBIS4-31159    BFN [Inter   9.0   
FBIS3-43186    JAPAN:       9.0   
FBIS3-58       Table        9.0   
FBIS3-49035    Language:    9.0   
FBIS4-46469    CSO [Book:   9.0   
FBIS4-59772    BFN [Inter   9.0   
FBIS3-43142    Europe       9.0   
FBIS4-68893    CSO [Forew   9.0   
FBIS4-66185  CSO \n\n       9.0   
FBIS3-40524    Article Ty   9.0   
FBIS3-43160    Foreign      9.0   
FBIS3-43214    Foreign      9.0   
FBIS4-68275    CSO [Book    9.0   
FBIS4-23662    BFN [Speec   9.0   
FBIS4-43047    BFN [Repor   9.0   
FBIS4-19       Table        9.0   
FBIS4-67533  CSO \n\n       9.0   
FBIS3-43132    JAPAN:       9.0   
FBIS3-41975    OFFICIAL U   9.0   

                  

Querying your system, brother:  20%|██        | 10/50 [04:18<17:12, 25.81s/it]

<Query 411 “Find information on shipwreck salvaging: the recov ...”>
<class '__main__.Query'>
Query:  ['find', 'inform', 'shipwreck', 'salvag', 'recoveri', 'attempt', 'recoveri', 'treasur', 'sunken', 'ship']
                         Values  Rank  \
ID                                      
FBIS4-66382          Undertakin   8.0   
FBIS4-67533        CSO \n\n       8.0   
FBIS3-24469          Language:    7.0   
FBIS4-66185        CSO \n\n       7.0   
FBIS3-41666          Federation   6.0   
FR941019-0-00079    Wednesday\n   6.0   
FT922-4205           ON THE sta   6.0   
FBIS4-47609          Table        6.0   
LA011589-0005        What follo   6.0   
FBIS3-61057          Language:    6.0   
FBIS3-43132          JAPAN:       6.0   
FR940927-0-00113  Tuesday\n\n\n   6.0   
FBIS3-43186          JAPAN:       6.0   
FR940511-0-00076     (b) Applic   6.0   
FR941206-1-00134  Tuesday\n\n\n   6.0   
FT934-5058           Mr Deputy    5.0   
LA062689-0025        Despite it   5.0   
FBIS4-46649  

Querying your system, brother:  22%|██▏       | 11/50 [04:37<15:33, 23.94s/it]

<Query 412 “What security measures are in effect or are propos ...”>
<class '__main__.Query'>
Query:  ['secur', 'measur', 'effect', 'propos', 'go', 'effect', 'airport']
                      Values  Rank                              Matches
ID                                                                     
FBIS4-66382       Undertakin   5.0   secur measur effect propos airport
LA121089-0204     Kaolelo Ul   5.0      measur effect propos go airport
FBIS3-31670       Language:    5.0   secur measur effect propos airport
FBIS4-2734        BFN [Xinji   5.0   secur measur effect propos airport
FBIS4-46857       BFN [Comme   5.0   secur measur effect propos airport
FBIS3-20481       Language:    5.0   secur measur effect propos airport
FBIS4-4375        BFN [Speec   5.0   secur measur effect propos airport
FBIS3-40497       Low-Pollut   5.0   secur measur effect propos airport
LA020989-0133     When crude   5.0   secur measur effect propos airport
FR940324-2-00137  Passenger    5.0   se

Querying your system, brother:  24%|██▍       | 12/50 [05:06<16:11, 25.56s/it]

<Query 413 “What are new methods of producing steel?”>
<class '__main__.Query'>
Query:  ['new', 'method', 'produc', 'steel']
                     Values  Rank                   Matches
ID                                                         
FBIS4-66382      Undertakin   4.0   new method produc steel
FBIS3-40065      Language:    4.0   new method produc steel
FT923-9281       US STEEL,    4.0   new method produc steel
LA062589-0181    Mark and C   4.0   new method produc steel
FBIS3-21263      Updated 29   4.0   new method produc steel
FT932-7004       Next Septe   4.0   new method produc steel
FT932-2344       China's ne   4.0   new method produc steel
FBIS4-5167     BFN \n\n  [T   4.0   new method produc steel
FBIS4-1627       BFN ["Repo   4.0   new method produc steel
LA051390-0165    If there w   4.0   new method produc steel
FBIS3-45822      Language:    4.0   new method produc steel
FT922-9093       ALONG THE    4.0   new method produc steel
FT923-1121       THE FIRST    4.0  

Querying your system, brother:  26%|██▌       | 13/50 [05:33<15:57, 25.88s/it]

<Query 414 “How much sugar does Cuba export and which countrie ...”>
<class '__main__.Query'>
Query:  ['much', 'sugar', 'doe', 'cuba', 'export', 'countri', 'import']
                     Values  Rank                                     Matches
ID                                                                           
LA031490-0104    Sandinista   7.0   much sugar doe cuba export countri import
LA022790-0098    The Bush A   7.0   much sugar doe cuba export countri import
FBIS3-32536      Language:    7.0   much sugar doe cuba export countri import
FBIS4-33295      CSO [Artic   7.0   much sugar doe cuba export countri import
FBIS4-34971    BFN \n\n       7.0   much sugar doe cuba export countri import
FBIS4-47156      CSO [Comme   7.0   much sugar doe cuba export countri import
FBIS3-50510      Article      7.0   much sugar doe cuba export countri import
FBIS4-68893      CSO [Forew   7.0   much sugar doe cuba export countri import
FBIS4-68782      [Signed to   7.0   much sugar doe cub

Querying your system, brother:  28%|██▊       | 14/50 [06:08<17:11, 28.64s/it]

<Query 415 “What is known about drug trafficking in the "Golde ...”>
<class '__main__.Query'>
Query:  ['known', 'drug', 'traffick', "''", 'golden', 'triangl', "''", 'area', 'burma', 'thailand', 'lao', 'meet']
                     Values  Rank  \
ID                                  
FT921-3152       THE BURMES  10.0   
FBIS3-21886      Language:   10.0   
FBIS3-41211      Language:   10.0   
FBIS3-28711      Language:    9.0   
FBIS4-26800    BFN \n\n  [T   9.0   
FBIS3-41250      Language:    9.0   
FBIS4-66983      CSO [Repor   9.0   
LA010189-0079    In northea   8.0   
FBIS3-33963      Language:    8.0   
FBIS3-41174      Language:    8.0   
FBIS4-67209      BFN [By Su   8.0   
LA043090-0142    At first b   8.0   
FBIS4-30503    BFN \n\n  [T   8.0   
FBIS4-26397    BFN \n\n  [T   8.0   
FT921-3261       BURMA, Tha   8.0   
FBIS3-60083      Language:    8.0   
FBIS3-60189      Language:    8.0   
FBIS4-1796     BFN \n\n       8.0   
FBIS3-21611      Language:    8.0   
FBIS3-9994    

Querying your system, brother:  30%|███       | 15/50 [06:31<15:39, 26.84s/it]

<Query 416 “What is the status of The Three Gorges Project?”>
<class '__main__.Query'>
Query:  ['statu', 'three', 'gorg', 'project']
                        Values  Rank                    Matches
ID                                                             
FBIS4-47629         Table        4.0   statu three gorg project
FBIS4-67533       CSO \n\n       4.0   statu three gorg project
FBIS3-26359         Language:    4.0   statu three gorg project
FBIS4-46649         1-464 FOR    4.0   statu three gorg project
FBIS3-11          SUMMARY \n\n   4.0   statu three gorg project
FBIS4-1843          BFN [Sichu   4.0   statu three gorg project
FBIS4-25951         BFN [By Zh   4.0   statu three gorg project
FBIS3-41666         Federation   4.0   statu three gorg project
FBIS4-20472         Policy (Wh   4.0   statu three gorg project
FBIS4-66185       CSO \n\n       4.0   statu three gorg project
LA031989-0066       Shortly af   4.0   statu three gorg project
FT944-6438          A bronze s   4.

Querying your system, brother:  32%|███▏      | 16/50 [06:47<13:24, 23.67s/it]

<Query 417 “Find ways of measuring creativity.”>
<class '__main__.Query'>
Query:  ['find', 'way', 'measur', 'creativ']
                    Values  Rank                   Matches
ID                                                        
FBIS3-59506     Language:    4.0   find way measur creativ
LA032689-0053   Values flo   4.0   find way measur creativ
LA091690-0207   IT'S LATE    4.0   find way measur creativ
FBIS3-21193     Language:    4.0   find way measur creativ
FBIS3-43003     Language:    4.0   find way measur creativ
FBIS4-46584     CSO [Artic   4.0   find way measur creativ
FBIS4-46897     CSO [Inter   4.0   find way measur creativ
LA070289-0152   While the    4.0   find way measur creativ
FBIS3-30086     Language:    4.0   find way measur creativ
FBIS3-23987     Language:    4.0   find way measur creativ
FBIS4-44407     CSO [Artic   4.0   find way measur creativ
FT932-15955     IF ALL the   4.0   find way measur creativ
LA110289-0085   Five candi   4.0   find way measur crea

Querying your system, brother:  34%|███▍      | 17/50 [07:05<12:05, 21.99s/it]

<Query 418 “In what ways have quilts been used to generate inc ...”>
<class '__main__.Query'>
Query:  ['way', 'quilt', 'use', 'gener', 'incom']
                     Values  Rank                     Matches
ID                                                           
FBIS4-24522      BFN [Speec   5.0   way quilt use gener incom
LA052090-0246    THERE IS A   5.0   way quilt use gener incom
LA083190-0132    Marine Mas   5.0   way quilt use gener incom
FBIS3-21892      Language:    5.0   way quilt use gener incom
LA050489-0034    Retirement   5.0   way quilt use gener incom
FBIS3-59506      Language:    4.0         way use gener incom
FBIS4-56568      BFN [Speec   4.0         way use gener incom
FBIS4-19981      BFN [Inter   4.0         way use gener incom
LA122690-0029    Tens of mi   4.0         way use gener incom
FT933-5770       Some are i   4.0         way use gener incom
LA053189-0120    The Depart   4.0         way use gener incom
FBIS3-4871       Language:    4.0         way use 

Querying your system, brother:  36%|███▌      | 18/50 [07:40<13:42, 25.70s/it]

<Query 419 “What new uses have been developed for old automobi ...”>
<class '__main__.Query'>
Query:  ['new', 'use', 'develop', 'old', 'automobil', 'tire', 'mean', 'tire', 'recycl']
                       Values  Rank  \
ID                                    
LA020489-0135      Having she   8.0   
LA010289-0021      There's go   8.0   
LA011090-0115      Orange Cou   8.0   
FBIS3-24678        Table        8.0   
FBIS4-20472        Policy (Wh   8.0   
FBIS4-44893        Show 94FE0   8.0   
LA072090-0133      Ferrari bu   7.0   
FBIS4-32817        BFN [News    7.0   
FBIS3-22420        Language:    7.0   
LA042489-0048      Oliver Cou   7.0   
FR940831-2-00166  Appendix \n   7.0   
LA082690-0231      In an inst   7.0   
FBIS3-43132        JAPAN:       7.0   
FBIS3-43186        JAPAN:       7.0   
FBIS3-29           Foreign      7.0   
FBIS3-42810        Language:    7.0   
FBIS4-22629        CSO [Artic   7.0   
FBIS3-40450        Language:    7.0   
FBIS4-66162        CSO [Artic   7.0   

Querying your system, brother:  38%|███▊      | 19/50 [08:29<17:01, 32.95s/it]

<Query 420 “How widespread is carbon monoxide poisoning on a g ...”>
<class '__main__.Query'>
Query:  ['widespread', 'carbon', 'monoxid', 'poison', 'global', 'scale']
                         Values  Rank  \
ID                                      
FR940602-1-00023   Thursday\n\n   6.0   
FBIS3-41666          Federation   6.0   
FT921-8131           Environmen   5.0   
FBIS3-40497          Low-Pollut   5.0   
FBIS4-20472          Policy (Wh   5.0   
FR940822-0-00067  Monday\n\n\nA   5.0   
FT941-12578          The scale    5.0   
FR941130-0-00122    Wednesday\n   5.0   
FBIS4-66382          Undertakin   5.0   
LA121089-0232        HEAD Landf   5.0   
FBIS4-67533        CSO \n\n       4.0   
FBIS4-20504          Program 11   4.0   
FT934-3768           Anyone in    4.0   
FBIS4-23114          Table        4.0   
LA043089-0156        Thriving i   4.0   
FBIS3-60342          the Former   4.0   
FBIS4-66185        CSO \n\n       4.0   
FBIS3-24469          Language:    4.0   
LA111689-0032

Querying your system, brother:  40%|████      | 20/50 [08:34<12:09, 24.30s/it]

<Query 421 “How is the disposal of industrial waste being acco ...”>
<class '__main__.Query'>
Query:  ['dispos', 'industri', 'wast', 'accomplish', 'industri', 'manag', 'throughout', 'world']
                        Values  Rank  \
ID                                     
FR940602-1-00023  Thursday\n\n   7.0   
FBIS4-44724         CSO [Artic   7.0   
FBIS4-31787         BFN [Text    7.0   
FBIS3-60336         Decisions    7.0   
FBIS3-13309         Language:    7.0   
FBIS4-1866          BFN [Shand   7.0   
FBIS3-43186         JAPAN:       7.0   
FBIS3-24648         BOOKS,  JO   7.0   
FBIS3-4210          Language:    7.0   
FBIS3-23561         Article      7.0   
FBIS3-43196         Europe       7.0   
FBIS3-37944         Language:    7.0   
FBIS3-13223         Language:    7.0   
FBIS3-43142         Europe       7.0   
FR940804-1-00057  Thursday\n\n   7.0   
FBIS4-58121         BFN [Speec   7.0   
FBIS4-22629         CSO [Artic   7.0   
FBIS4-49021         BFN [Repor   7.0   
FBIS3-29 

Querying your system, brother:  42%|████▏     | 21/50 [09:05<12:51, 26.60s/it]

<Query 422 “What incidents have there been of stolen or forged ...”>
<class '__main__.Query'>
Query:  ['incid', 'stolen', 'forg', 'art']
                        Values  Rank                 Matches
ID                                                          
LA092489-0079       Today an e   4.0   incid stolen forg art
FBIS3-24145         Language:    4.0   incid stolen forg art
LA041689-0134       When veter   4.0   incid stolen forg art
LA123190-0042       There's go   4.0   incid stolen forg art
FT944-14427         UNDER MY S   3.0          incid forg art
LA020790-0088       Listen to    3.0         stolen forg art
FBIS3-24152         Language:    3.0         stolen forg art
LA031989-0036       Of course,   3.0         stolen forg art
FBIS4-24633       BFN \n\n       3.0         stolen forg art
LA010990-0001       Jesse Jame   3.0       incid stolen forg
FBIS3-24197         Language:    3.0         stolen forg art
FBIS4-1843          BFN [Sichu   3.0          incid forg art
FT922-365

Querying your system, brother:  44%|████▍     | 22/50 [09:10<09:16, 19.86s/it]

<Query 423 “Find references to Milosevic's wife, Mirjana Marko ...”>
<class '__main__.Query'>
Query:  ['find', 'refer', 'milosev', "'s", 'wife', 'mirjana', 'markov']
                   Values  Rank                                  Matches
ID                                                                      
FBIS4-32111    BFN [Artic   6.0   find refer milosev wife mirjana markov
FT943-7734     With an un   6.0   find refer milosev wife mirjana markov
FT942-13554    Duga, the    5.0         find milosev wife mirjana markov
FT942-11666    Calls for    5.0         find milosev wife mirjana markov
FBIS3-30431    Language:    5.0        refer milosev wife mirjana markov
FBIS4-8497     CSO [Artic   5.0        refer milosev wife mirjana markov
FBIS4-30835    BFN [Inter   4.0                  find refer milosev wife
FT944-8386     We are in    4.0                   find refer wife markov
FBIS4-32072  CSO \n\n       4.0              milosev wife mirjana markov
FBIS3-8272     Language:    4.0

Querying your system, brother:  46%|████▌     | 23/50 [09:19<07:32, 16.77s/it]

<Query 424 “Give examples of alleged suicides that aroused sus ...”>
<class '__main__.Query'>
Query:  ['give', 'exampl', 'alleg', 'suicid', 'arous', 'suspicion', 'death', 'actual', 'murder']
                   Values  Rank  \
ID                                
LA020189-0108  When she w   8.0   
FBIS3-13962    Language:    8.0   
FBIS4-68275    CSO [Book    8.0   
FBIS4-31681    CSO [Inter   7.0   
LA091690-0206  SMURF C WA   7.0   
FT922-7713     THE arcane   7.0   
FBIS4-58121    BFN [Speec   7.0   
LA020490-0003  The Hollyw   7.0   
LA093090-0066  ' "This ca   7.0   
LA012490-0173  On a sprin   7.0   
FBIS4-47008    CSO [Inter   6.0   
FBIS3-24469    Language:    6.0   
FBIS3-39883    Language:    6.0   
FBIS4-1628     BFN ["Work   6.0   
LA050789-0008  JAMES A. V   6.0   
LA040890-0248  THE GUNS B   6.0   
LA060489-0006  IN HER AIR   6.0   
LA011090-0058  The full s   6.0   
FBIS3-60959    Language:    6.0   
LA092390-0185  When Marti   6.0   

                                      

Querying your system, brother:  48%|████▊     | 24/50 [09:36<07:17, 16.84s/it]

<Query 425 “What counterfeiting of money is being done in mode ...”>
<class '__main__.Query'>
Query:  ['counterfeit', 'money', 'done', 'modern', 'time']
                   Values  Rank                              Matches
ID                                                                  
FBIS4-20770    CSO [Artic   5.0   counterfeit money done modern time
FBIS3-67       Foreign      5.0   counterfeit money done modern time
FBIS4-2721     BFN [Repor   5.0   counterfeit money done modern time
FBIS4-58610    CSO [All f   5.0   counterfeit money done modern time
FBIS4-49107    BFN ["Exce   5.0   counterfeit money done modern time
FBIS3-46614    Language:    5.0   counterfeit money done modern time
FBIS4-20472    Policy (Wh   5.0   counterfeit money done modern time
FBIS4-46889    CSO [Accou   5.0   counterfeit money done modern time
FBIS4-1843     BFN [Sichu   5.0   counterfeit money done modern time
FBIS3-24145    Language:    5.0   counterfeit money done modern time
FT931-12092    PARI

Querying your system, brother:  50%|█████     | 25/50 [10:01<08:00, 19.21s/it]

<Query 426 “Provide information on the use of dogs worldwide f ...”>
<class '__main__.Query'>
Query:  ['provid', 'inform', 'use', 'dog', 'worldwid', 'law', 'enforc', 'purpos']
                        Values  Rank  \
ID                                     
FBIS4-66382         Undertakin   8.0   
FBIS4-66185       CSO \n\n       8.0   
FBIS3-36078         Language:    8.0   
FR940602-1-00023  Thursday\n\n   8.0   
FR941130-0-00122   Wednesday\n   7.0   
FR940324-2-00020    Office of    7.0   
FBIS4-404           BFN [State   7.0   
FBIS3-43132         JAPAN:       7.0   
FR940112-1-00044   Wednesday\n   7.0   
FBIS3-51695         Language:    7.0   
FBIS3-48483         Language:    7.0   
FBIS3-22833         Language:    7.0   
FBIS4-9354          BFN ["Text   7.0   
FR940406-0-00093    DEPARTMENT   7.0   
FBIS3-43186         JAPAN:       7.0   
FBIS4-22377         Narcotics    7.0   
FBIS4-1860          BFN [Guang   7.0   
FBIS3-20900         Language:    7.0   
FBIS3-72            This

Querying your system, brother:  52%|█████▏    | 26/50 [10:40<10:01, 25.05s/it]

<Query 427 “Find documents that discuss the damage ultraviolet ...”>
<class '__main__.Query'>
Query:  ['find', 'document', 'discuss', 'damag', 'ultraviolet', 'uv', 'light', 'sun', 'eye']
                        Values  Rank  \
ID                                     
FBIS4-66382         Undertakin   7.0   
FBIS3-43132         JAPAN:       7.0   
FBIS3-43186         JAPAN:       7.0   
LA081990-0198       The road t   7.0   
FBIS3-42399         Language:    7.0   
FR940112-1-00044   Wednesday\n   7.0   
FBIS4-67533       CSO \n\n       7.0   
FBIS3-46291         Language:    7.0   
FBIS3-23561         Article      7.0   
FBIS4-46469         CSO [Book:   7.0   
LA081290-0070       KIMON BEAZ   6.0   
FBIS4-67877       CSO \n\n       6.0   
FBIS3-54461         Language:    6.0   
FBIS4-50901         BFN [Gover   6.0   
LA090789-0051       "Silence E   6.0   
FBIS4-20472         Policy (Wh   6.0   
FBIS4-20736         CSO [Part    6.0   
FBIS3-40450         Language:    6.0   
FBIS3-35189  

Querying your system, brother:  54%|█████▍    | 27/50 [10:58<08:52, 23.14s/it]

<Query 428 “Do any countries other than the U.S. and China hav ...”>
<class '__main__.Query'>
Query:  ['ani', 'countri', 'u.s.', 'china', 'declin', 'birth', 'rate']
                     Values  Rank                               Matches
ID                                                                     
LA041590-0153    In the 20    6.0   ani countri china declin birth rate
FBIS4-49862      BFN [Artic   6.0   ani countri china declin birth rate
FBIS3-4209       Language:    6.0   ani countri china declin birth rate
FBIS3-2516       Language:    6.0   ani countri china declin birth rate
FBIS3-30086      Language:    6.0   ani countri china declin birth rate
FBIS3-43174      Foreign      6.0   ani countri china declin birth rate
FBIS3-14832      Language:    6.0   ani countri china declin birth rate
FBIS4-54622    BFN \n\n  BO   6.0   ani countri china declin birth rate
FBIS3-59343      Language:    6.0   ani countri china declin birth rate
LA042990-0214    Fifteen ye   6.0   ani cou

Querying your system, brother:  56%|█████▌    | 28/50 [11:33<09:47, 26.70s/it]

<Query 429 “Identify outbreaks of Legionnaires' disease.”>
<class '__main__.Query'>
Query:  ['identifi', 'outbreak', 'legionnair', 'diseas']
                      Values  Rank                               Matches
ID                                                                      
FR940202-2-00107  G. Other P   4.0   identifi outbreak legionnair diseas
FR940202-2-00110  In hospita   4.0   identifi outbreak legionnair diseas
FR940817-0-00002  DEPARTMENT   3.0              identifi outbreak diseas
FBIS3-22561       Article      3.0              identifi outbreak diseas
FBIS3-60454       Language:    3.0              identifi outbreak diseas
FR940318-0-00006  Accordingl   3.0              identifi outbreak diseas
LA060890-0001     Southern C   3.0              identifi outbreak diseas
LA091790-0001     Henry Voss   3.0              identifi outbreak diseas
LA040689-0078     Thorough h   3.0              identifi outbreak diseas
LA110689-0060     Stooped an   3.0              identifi

Querying your system, brother:  58%|█████▊    | 29/50 [11:37<06:55, 19.79s/it]

<Query 430 “Identify instances of attacks on humans by African ...”>
<class '__main__.Query'>
Query:  ['identifi', 'instanc', 'attack', 'human', 'african', 'killer', 'bee']
                     Values  Rank  \
ID                                  
LA011589-0005    What follo   7.0   
LA080389-0111    Something    6.0   
LA060890-0001    Southern C   6.0   
FBIS4-66185    CSO \n\n       6.0   
FBIS4-1214     BFN \n\n  [T   5.0   
LA020590-0004    Trapped in   5.0   
FBIS3-1509       Language:    5.0   
LA112189-0045    It wasn't    5.0   
FBIS3-13962      Language:    5.0   
LA031190-0011    THERE IS S   5.0   
FBIS4-67533    CSO \n\n       5.0   
FBIS3-36078      Language:    5.0   
LA061689-0067    Agricultur   5.0   
FBIS3-60342      the Former   5.0   
FBIS3-43563      Article      5.0   
LA061789-0067    A homeless   5.0   
FBIS3-60454      Language:    5.0   
LA050789-0065    A bespecta   5.0   
LA082889-0040    When Prome   4.0   
LA092490-0070    President    4.0   

            

Querying your system, brother:  60%|██████    | 30/50 [11:46<05:32, 16.63s/it]

<Query 431 “What are the latest developments in robotic techno ...”>
<class '__main__.Query'>
Query:  ['latest', 'develop', 'robot', 'technolog']
                     Values  Rank                          Matches
ID                                                                
FBIS3-20888      Language:    4.0   latest develop robot technolog
FT911-129        FOR all th   4.0   latest develop robot technolog
FT932-3748       Steven Spi   4.0   latest develop robot technolog
FT943-8066       Ever since   4.0   latest develop robot technolog
FT922-4420       THE world    4.0   latest develop robot technolog
FBIS3-43166      JAPAN:       4.0   latest develop robot technolog
FBIS4-67877    CSO \n\n       4.0   latest develop robot technolog
LA082290-0041    George Jet   4.0   latest develop robot technolog
FBIS3-58         Table        4.0   latest develop robot technolog
FBIS4-20652      Conceptual   4.0   latest develop robot technolog
FT923-12755      An interna   4.0   latest develop

Querying your system, brother:  62%|██████▏   | 31/50 [12:01<05:07, 16.18s/it]

<Query 432 “Do police departments use "profiling" to stop moto ...”>
<class '__main__.Query'>
Query:  ['polic', 'depart', 'use', '``', 'profil', "''", 'stop', 'motorist']
                   Values  Rank                          Matches
ID                                                              
LA062490-0138  Extraction   5.0     polic depart use profil stop
LA021289-0138  Some well-   5.0   polic depart use stop motorist
LA081990-0095  Not so lon   5.0   polic depart use stop motorist
LA012590-0157  Los Angele   5.0   polic depart use stop motorist
LA051090-0134  Facing wha   5.0   polic depart use stop motorist
LA083190-0008  San Diegan   5.0   polic depart use stop motorist
LA041389-0067  Inglewood'   5.0     polic depart use profil stop
LA012089-0090  For the la   5.0   polic depart use stop motorist
FT934-5058     Mr Deputy    5.0   polic depart use stop motorist
FBIS3-45197    Language:    5.0     polic depart use profil stop
FT944-4458     Government   5.0   polic depart us

Querying your system, brother:  64%|██████▍   | 32/50 [12:26<05:35, 18.62s/it]

<Query 433 “Is there contemporary interest in the Greek philos ...”>
<class '__main__.Query'>
Query:  ['contemporari', 'interest', 'greek', 'philosophi', 'stoicism']
                   Values  Rank                                  Matches
ID                                                                      
FBIS3-59837    Language:    4.0   contemporari interest greek philosophi
LA011589-0005  What follo   4.0   contemporari interest greek philosophi
LA031989-0196  What the h   4.0       interest greek philosophi stoicism
FT933-3941     THE Royal    4.0   contemporari interest greek philosophi
FBIS3-60342    the Former   4.0   contemporari interest greek philosophi
FBIS3-42399    Language:    4.0   contemporari interest greek philosophi
LA012890-0190  IT IS DECE   4.0   contemporari interest greek philosophi
FBIS3-36078    Language:    4.0   contemporari interest greek philosophi
FT934-11393    One of the   3.0              contemporari interest greek
LA111989-0213  After six    3.0

Querying your system, brother:  66%|██████▌   | 33/50 [12:36<04:34, 16.18s/it]

<Query 434 “What is the state of the economy of Estonia?”>
<class '__main__.Query'>
Query:  ['state', 'economi', 'estonia']
                     Values  Rank                 Matches
ID                                                       
FBIS3-61258      p 8 944K05   3.0   state economi estonia
FBIS3-42851      Language:    3.0   state economi estonia
FBIS3-38333      Language:    3.0   state economi estonia
FBIS3-42766      Language:    3.0   state economi estonia
FT942-14636      Shortly af   3.0   state economi estonia
FBIS4-46790      CSO [Artic   3.0   state economi estonia
FBIS3-55112      Language:    3.0   state economi estonia
FT934-17320      Western Eu   3.0   state economi estonia
FT924-1740       IT WAS bac   3.0   state economi estonia
FBIS4-64177    BFN \n\n  [T   3.0   state economi estonia
FBIS4-68662      CSO [Artic   3.0   state economi estonia
FT934-11297      THE CENTRA   3.0   state economi estonia
FT911-3287       REBELLIOUS   3.0   state economi estonia
FT943-

Querying your system, brother:  68%|██████▊   | 34/50 [12:56<04:38, 17.40s/it]

<Query 435 “What measures have been taken worldwide and what c ...”>
<class '__main__.Query'>
Query:  ['measur', 'taken', 'worldwid', 'countri', 'effect', 'curb', 'popul', 'growth']
                     Values  Rank  \
ID                                  
FBIS3-67         Foreign      8.0   
FBIS3-29         Foreign      8.0   
FT941-12411      According    8.0   
FBIS3-17354      Language:    8.0   
FT944-6732       Fears abou   8.0   
FBIS3-24653      Table        8.0   
FBIS3-24678      Table        8.0   
FBIS4-44468      CSO [Artic   8.0   
FBIS3-10291      Language:    8.0   
FBIS4-67533    CSO \n\n       8.0   
LA041590-0153    In the 20    8.0   
FBIS3-37947      Language:    8.0   
FBIS4-4249       BFN [Artic   8.0   
FBIS3-5105       Language:    7.0   
FBIS4-22685      CSO [Inter   7.0   
FT932-1558       One year a   7.0   
FBIS4-50133      BFN [Artic   7.0   
FBIS3-28303      Language:    7.0   
FBIS4-25085      BFN ["Spec   7.0   
FBIS3-24679      Europe       7.0   

   

Querying your system, brother:  70%|███████   | 35/50 [13:29<05:28, 21.88s/it]

<Query 436 “What are the causes of railway accidents throughou ...”>
<class '__main__.Query'>
Query:  ['caus', 'railway', 'accid', 'throughout', 'world']
                 Values  Rank                               Matches
ID                                                                 
FT933-13657  ON FEBRUAR   5.0   caus railway accid throughout world
FBIS3-2516   Language:    5.0   caus railway accid throughout world
FBIS4-50901  BFN [Gover   5.0   caus railway accid throughout world
FBIS4-49832  BFN [Yunna   5.0   caus railway accid throughout world
FBIS4-1860   BFN [Guang   5.0   caus railway accid throughout world
FBIS4-4067   BFN [Gover   5.0   caus railway accid throughout world
FBIS3-40200  Language:    5.0   caus railway accid throughout world
FBIS3-41666  Federation   5.0   caus railway accid throughout world
FBIS4-20472  Policy (Wh   5.0   caus railway accid throughout world
FBIS4-66382  Undertakin   5.0   caus railway accid throughout world
FBIS4-2439   BFN [Jiang   5.0 

Querying your system, brother:  72%|███████▏  | 36/50 [13:43<04:35, 19.65s/it]

<Query 437 “What has been the experience of residential utilit ...”>
<class '__main__.Query'>
Query:  ['ha', 'experi', 'residenti', 'util', 'custom', 'follow', 'deregul', 'ga', 'electr']
                         Values  Rank  \
ID                                      
FBIS4-67533        CSO \n\n       7.0   
FBIS4-19             Table        7.0   
FBIS3-23             Table        7.0   
FBIS3-43166          JAPAN:       7.0   
FBIS4-20472          Policy (Wh   7.0   
FBIS3-43220          JAPAN:       7.0   
FBIS3-24653          Table        7.0   
FBIS3-58             Table        7.0   
FR940304-1-00115     The rest o   6.0   
FBIS4-66162          CSO [Artic   6.0   
FBIS3-40450          Language:    6.0   
FBIS4-31029        BFN \n\n  ST   6.0   
FBIS4-47609          Table        6.0   
FBIS3-13309          Language:    6.0   
FBIS4-23084          Table        6.0   
FT941-2928           In a taste   6.0   
FBIS3-24678          Table        6.0   
FBIS3-43174          Foreign      

Querying your system, brother:  74%|███████▍  | 37/50 [14:02<04:10, 19.27s/it]

<Query 438 “What countries are experiencing an increase in tou ...”>
<class '__main__.Query'>
Query:  ['countri', 'experienc', 'increas', 'tourism']
                        Values  Rank                             Matches
ID                                                                      
FBIS3-32488         Language:    4.0   countri experienc increas tourism
FBIS3-3164          Language:    4.0   countri experienc increas tourism
FBIS4-7494          BFN [Inter   4.0   countri experienc increas tourism
LA073190-0040       Expansion    4.0   countri experienc increas tourism
FBIS3-51398         Language:    4.0   countri experienc increas tourism
FBIS3-13220         Language:    4.0   countri experienc increas tourism
FBIS4-25051         BFN ["Spee   4.0   countri experienc increas tourism
FBIS4-66770         CSO [Inter   4.0   countri experienc increas tourism
FBIS4-34168         BFN ["High   4.0   countri experienc increas tourism
FT933-4753          Israeli op   4.0   countri e

Querying your system, brother:  76%|███████▌  | 38/50 [14:22<03:55, 19.62s/it]

<Query 439 “What new inventions or scientific discoveries have ...”>
<class '__main__.Query'>
Query:  ['new', 'invent', 'scientif', 'discoveri', 'made']
                   Values  Rank                              Matches
ID                                                                  
FBIS4-68445    CSO [Artic   5.0   new invent scientif discoveri made
LA022389-0081  There's no   5.0   new invent scientif discoveri made
FBIS3-59903    Language:    5.0   new invent scientif discoveri made
FBIS3-42399    Language:    5.0   new invent scientif discoveri made
FT923-5533     for money,   5.0   new invent scientif discoveri made
FBIS3-21348    Language:    5.0   new invent scientif discoveri made
LA010990-0070  Discoverin   5.0   new invent scientif discoveri made
FBIS3-24469    Language:    5.0   new invent scientif discoveri made
FBIS4-20472    Policy (Wh   5.0   new invent scientif discoveri made
FT941-6376     Europe is    5.0   new invent scientif discoveri made
FT921-12579    THE 

Querying your system, brother:  78%|███████▊  | 39/50 [14:53<04:12, 22.99s/it]

<Query 440 “What steps are being taken by governments or corpo ...”>
<class '__main__.Query'>
Query:  ['step', 'taken', 'govern', 'corpor', 'elimin', 'abus', 'child', 'labor']
                       Values  Rank  \
ID                                    
FBIS4-33740        BFN [Speec   8.0   
FBIS4-68893        CSO [Forew   8.0   
FBIS3-42677        Language:    8.0   
FBIS4-23089        Europe       8.0   
FBIS3-1849         Article      8.0   
FBIS4-50706        BFN [Fujia   8.0   
FBIS4-2049         BFN ["Exce   7.0   
FR941102-1-00119  Wednesday\n   7.0   
FBIS3-20796        Language:    7.0   
FBIS4-47609        Table        7.0   
FBIS4-1861         BFN [Hebei   7.0   
FBIS3-8249         Language:    7.0   
FBIS3-24145        Language:    7.0   
FBIS3-32560        Language:    7.0   
FBIS4-1860         BFN [Guang   7.0   
FBIS4-49021        BFN [Repor   7.0   
FBIS4-1843         BFN [Sichu   7.0   
FBIS3-59334        Language:    7.0   
LA011589-0005      What follo   7.0   
FBIS4

Querying your system, brother:  80%|████████  | 40/50 [15:22<04:08, 24.81s/it]

<Query 441 “How do you prevent and treat Lyme disease?”>
<class '__main__.Query'>
Query:  ['prevent', 'treat', 'lyme', 'diseas']
                      Values  Rank                     Matches
ID                                                            
FT923-453         DISEASE-MO   4.0   prevent treat lyme diseas
FBIS4-10326       BFN [Secon   3.0        prevent treat diseas
LA112090-0144     You are lo   3.0        prevent treat diseas
LA070189-0089     More than    3.0         prevent lyme diseas
FBIS3-1945        Language:    3.0        prevent treat diseas
FT943-1589        From the c   3.0        prevent treat diseas
LA090690-0116     The image    3.0        prevent treat diseas
FBIS3-48296       Language:    3.0        prevent treat diseas
FT932-1044        A 51-year-   3.0        prevent treat diseas
LA112689-0092     Feral cats   3.0        prevent treat diseas
LA050690-0016     By any sta   3.0        prevent treat diseas
FR941107-2-00231  Table 3_Ca   3.0         prevent l

Querying your system, brother:  82%|████████▏ | 41/50 [15:27<02:50, 18.92s/it]

<Query 442 “Find accounts of selfless heroic acts by individua ...”>
<class '__main__.Query'>
Query:  ['find', 'account', 'selfless', 'heroic', 'act', 'individu', 'small', 'group', 'benefit', 'caus']
                   Values  Rank  \
ID                                
FBIS3-42399    Language:   10.0   
FBIS3-37947    Language:   10.0   
FBIS3-43196    Europe       9.0   
FBIS4-22690    CSO [Inter   9.0   
FBIS4-57286    BFN [Debat   9.0   
FBIS3-43142    Europe       9.0   
FBIS3-4209     Language:    9.0   
FBIS4-66308    CSO [Repor   9.0   
FBIS3-2516     Language:    9.0   
FBIS4-49245    BFN [Anhui   9.0   
FBIS4-49021    BFN [Repor   9.0   
FBIS3-42400    88) pp 1-2   9.0   
FBIS3-5642     Language:    9.0   
FBIS3-24469    Language:    9.0   
FBIS4-1844     BFN [Tianj   9.0   
FBIS4-66382    Undertakin   9.0   
FBIS4-66161    CSO [Artic   9.0   
FBIS4-27085    BFN [Repor   9.0   
FBIS4-25261  BFN \n\n  [T   9.0   
FBIS4-11145    CSO [Artic   8.0   

                             

Querying your system, brother:  84%|████████▍ | 42/50 [16:09<03:26, 25.86s/it]

<Query 443 “What is the extent of U.S. (government and private ...”>
<class '__main__.Query'>
Query:  ['extent', 'u.s.', 'govern', 'privat', 'invest', 'sub-saharan', 'africa']
                     Values  Rank                              Matches
ID                                                                    
FBIS3-1510       Language:    5.0   extent govern privat invest africa
FT941-16502      For a grou   5.0   extent govern privat invest africa
FT921-6710       The Intern   5.0   extent govern privat invest africa
FBIS3-61089      Language:    5.0   extent govern privat invest africa
LA021690-0060    Jesse Jack   5.0   extent govern privat invest africa
FBIS4-20400      CSO [Artic   5.0   extent govern privat invest africa
FBIS4-68893      CSO [Forew   5.0   extent govern privat invest africa
FBIS3-14832      Language:    5.0   extent govern privat invest africa
FBIS3-22757      Language:    5.0   extent govern privat invest africa
FBIS3-20446      Language:    5.0   extent 

Querying your system, brother:  86%|████████▌ | 43/50 [16:36<03:02, 26.06s/it]

<Query 444 “What are the potential uses for supercritical flui ...”>
<class '__main__.Query'>
Query:  ['potenti', 'use', 'supercrit', 'fluid', 'environment', 'protect', 'measur']
                         Values  Rank  \
ID                                      
FBIS4-44913        CSO \n\n  [T   7.0   
FBIS4-20472          Policy (Wh   7.0   
FBIS4-20504          Program 11   6.0   
LA121089-0225        Southern C   6.0   
FBIS3-58             Table        6.0   
FR941026-2-00082     NUCLEAR RE   6.0   
FBIS4-44815          Developmen   6.0   
FBIS3-22119          Language:    6.0   
FBIS3-20670          Language:    6.0   
FR941206-1-00134  Tuesday\n\n\n   6.0   
FR941107-2-00226     Immunocomp   6.0   
FBIS3-20948          153-157, 1   6.0   
FBIS3-10291          Language:    6.0   
FR940705-2-00151     Appendix Q   6.0   
FR940817-2-00266     5.2.3. Exp   6.0   
FBIS3-43040          Language:    6.0   
FR940104-0-00032     DEPARTMENT   6.0   
FBIS3-24469          Language:    6.0   
F

Querying your system, brother:  88%|████████▊ | 44/50 [17:02<02:36, 26.13s/it]

<Query 445 “What other countries besides the United States are ...”>
<class '__main__.Query'>
Query:  ['countri', 'besid', 'unit', 'state', 'consid', 'approv', 'women', 'clergi', 'person']
                     Values  Rank  \
ID                                  
FBIS3-41666      Federation   9.0   
FBIS3-24678      Table        8.0   
LA030189-0034    Each year    8.0   
LA042389-0005    KHALIL TAB   8.0   
FBIS4-49809      BFN ["Exce   8.0   
FBIS3-6216       Language:    8.0   
FBIS3-24145      Language:    8.0   
FBIS3-1411       Language:    8.0   
FBIS3-23832      Language:    8.0   
LA040289-0106    When asked   8.0   
FBIS4-45041    CSO \n\n  [T   8.0   
FBIS4-34315      BFN ["Excl   8.0   
FBIS4-46588      BFN ["Pole   8.0   
FBIS3-14452      Language:    8.0   
FBIS3-43160      Foreign      8.0   
FBIS4-68545      CSO [Artic   8.0   
FBIS4-50901      BFN [Gover   8.0   
FBIS4-50993      BFN [Repor   8.0   
FBIS4-68275      CSO [Book    8.0   
FBIS3-43214      Foreign      8.0 

Querying your system, brother:  90%|█████████ | 45/50 [17:54<02:49, 33.97s/it]

<Query 446 “Where are tourists likely to be subjected to acts  ...”>
<class '__main__.Query'>
Query:  ['tourist', 'like', 'subject', 'act', 'violenc', 'caus', 'bodili', 'harm', 'death']
                        Values  Rank  \
ID                                     
LA020689-0027       It came to   8.0   
FBIS3-37947         Language:    8.0   
FBIS3-60941         Language:    8.0   
FBIS3-24145         Language:    8.0   
FBIS3-36078         Language:    8.0   
FBIS4-42            Table        7.0   
FBIS4-46775         CSO [Artic   7.0   
FBIS4-66185       CSO \n\n       7.0   
LA121190-0075       Jesse B. S   7.0   
FBIS4-28712         BFN [Artic   7.0   
LA100889-0040       When I hea   7.0   
FBIS3-207           Language:    7.0   
FR940817-0-00065  (2) \n\nDead   7.0   
FBIS4-68893         CSO [Forew   7.0   
FBIS3-13653         Language:    7.0   
FBIS4-29            Foreign      7.0   
FBIS4-34471         CSO [Inter   7.0   
FBIS3-34820         Language:    7.0   
FBIS3-35329   

Querying your system, brother:  92%|█████████▏| 46/50 [18:21<02:07, 31.95s/it]

<Query 447 “What new developments and applications are there f ...”>
<class '__main__.Query'>
Query:  ['new', 'develop', 'applic', 'stirl', 'engin']
                        Values  Rank                          Matches
ID                                                                   
FT933-6707          It's a bad   5.0   new develop applic stirl engin
FBIS3-21198         Material 1   5.0   new develop applic stirl engin
FT923-9701          In the pub   5.0   new develop applic stirl engin
FBIS3-40450         Language:    5.0   new develop applic stirl engin
FT931-8135          SPECULATIO   4.0          new develop stirl engin
FBIS4-28316       BFN \n\n  [T   4.0         new develop applic engin
FR940505-1-00065    A bump-up    4.0         new develop applic engin
FBIS3-40507        on \n Aircr   4.0         new develop applic engin
FR940516-1-00114    While adeq   4.0         new develop applic engin
FT934-14189         SINCE THE    4.0         new develop applic engin
FT923-1186 

Querying your system, brother:  94%|█████████▍| 47/50 [18:56<01:38, 32.81s/it]

<Query 448 “Identify instances in which weather was a main or  ...”>
<class '__main__.Query'>
Query:  ['identifi', 'instanc', 'weather', 'wa', 'main', 'contribut', 'factor', 'loss', 'ship', 'sea']
                        Values  Rank  \
ID                                     
FBIS4-66382         Undertakin   9.0   
FBIS3-23561         Article      9.0   
FBIS4-67533       CSO \n\n       9.0   
FBIS4-46469         CSO [Book:   9.0   
FBIS4-20472         Policy (Wh   9.0   
FBIS3-42400         88) pp 1-2   9.0   
FBIS4-19            Table        9.0   
FBIS3-58            Table        9.0   
FBIS4-46649         1-464 FOR    9.0   
FBIS3-60342         the Former   9.0   
FBIS3-24678         Table        8.0   
FBIS3-43132         JAPAN:       8.0   
FBIS3-42399         Language:    8.0   
FBIS3-29            Foreign      8.0   
FR940602-1-00023  Thursday\n\n   8.0   
FBIS4-67877       CSO \n\n       8.0   
FBIS3-41666         Federation   8.0   
FBIS3-43186         JAPAN:       8.0   
FBI

Querying your system, brother:  96%|█████████▌| 48/50 [19:16<00:57, 28.98s/it]

<Query 449 “What has caused the current ineffectiveness of ant ...”>
<class '__main__.Query'>
Query:  ['ha', 'caus', 'current', 'ineffect', 'antibiot', 'infect', 'prognosi', 'new', 'drug']
                        Values  Rank  \
ID                                     
FR940112-1-00044   Wednesday\n   7.0   
FBIS4-33295         CSO [Artic   7.0   
LA022689-0006       The 1980s    7.0   
LA030190-0139       A major Ha   6.0   
LA070489-0051       The 65-yea   6.0   
LA120890-0059       Milk and m   6.0   
FT932-16160         The virus    6.0   
LA081290-0070       KIMON BEAZ   6.0   
LA111190-0179       SOMETIMES    6.0   
LA052190-0106       Puppeteer    6.0   
LA092689-0119       Health wor   6.0   
FT933-7438          A virulent   6.0   
FR940203-1-00069  Thursday\n\n   6.0   
LA061390-0002       Every week   6.0   
FR940202-2-00103    V. Risk Fa   6.0   
LA021590-0211       Americans    6.0   
FR941027-1-00030  Thursday\n\n   6.0   
LA092689-0080       Health wor   6.0   
LA062590-00

Querying your system, brother:  98%|█████████▊| 49/50 [19:48<00:29, 29.81s/it]

<Query 450 “How significant a figure over the years was the la ...”>
<class '__main__.Query'>
Query:  ['signific', 'figur', 'year', 'wa', 'late', 'jordanian', 'king', 'hussein', 'peac', 'middl', 'east']
                     Values  Rank  \
ID                                  
FT944-14204     MONDAY 24\n   9.0   
FT943-13705    18\nMONDAY\n   9.0   
LA033189-0105    During his   9.0   
FBIS4-66185    CSO \n\n       8.0   
FT924-8277       JORDAN'S r   8.0   
FBIS4-67533    CSO \n\n       8.0   
FT943-12422     MONDAY 25\n   8.0   
LA021289-0041    A one-time   8.0   
FT911-1699       A joint in   8.0   
FBIS3-36078      Language:    8.0   
FBIS3-37947      Language:    8.0   
LA082790-0013    As I write   8.0   
LA021989-0037    What a dif   8.0   
FBIS4-36533      BFN [Studi   8.0   
FBIS4-37832      BFN [Inter   8.0   
FT932-12056      WHEN King    8.0   
FT921-7455       Talking of   8.0   
FT944-15625      Just how d   8.0   
LA053190-0100    Arab leade   8.0   
LA040189-0010    Kin

                                                                              

Your system achieved **0.00% MAP score**.

You need at least **10%** to pass. 😢

Try playing with the preprocessing of queries and documents! 💡

Set `submit_result = True` and write your name to the `author_name` variable to submit your result to [the leaderboard](https://docs.google.com/spreadsheets/d/e/2PACX-1vQ33YdFZtGH6g2bDbkD9aLozLdVVGNuP09sRh-F9d_EY9nWntOrLHSyNATFsXw4v9lw3UA3vOzl5l0s/pubhtml). 🏆

The best submissions on the leaderboard will receive *small awards during the semester*, and some *__seriously big__ awards* after the personal check at the end of the competition (2021-05-16). Please be polite, do not spoil the game for the others, and **have fun!** 😉