**Team Project: Web Search and Information Retrieval - Topic 4: Efficient Vector Retrieval**

# Retrieval Time & MAP Evaluation Notebook -- last updated 05/26/2019

**Purpose:** Test the VSM retrieval model on the **doc_dump.txt** text collection.

# Evaluation

### Structure
- 1. Evaluate 'vanilla'
- 2. Evaluate 'standard' (invertedIndex)
- 3. Evaluate 'postingMergeIntersection'
- 4. Evaluate 'preclustering'
- 5. Evaluate 'tieredIndex'

### Test Retrieval Performance on Full Document Collection - 'doc_dump.txt'

### Note: Here we use doc_dump.txt, i.e. the raw, unpreprocessed data!

In [1]:
# read in raw doc_dump.txt text and split into one document per line
raw_texts = open('data/raw/doc_dump.txt', encoding="utf-8").read()
doc_list = raw_texts.split("\n")
len(doc_list)

5371

Show a typical example of a document in doc_dump.txt

In [2]:
# Show typical example of document in doc_dump.txt
doc_list[1].split("\t")

['MED-2',
 'http://www.ncbi.nlm.nih.gov/pubmed/22809476',
 'A statistical regression model for the estimation of acrylamide concentrations in French fries for excess lifetime cancer risk assessment. - PubMed - NCBI',
 'Abstract Human exposure to acrylamide (AA) through consumption of French fries and other foods has been recognized as a potential health concern. Here, we used a statistical non-linear regression model, based on the two most influential factors, cooking temperature and time, to estimate AA concentrations in French fries. The R(2) of the predictive model is 0.83, suggesting the developed model was significant and valid. Based on French fry intake survey data conducted in this study and eight frying temperature-time schemes which can produce tasty and visually appealing French fries, the Monte Carlo simulation results showed that if AA concentration is higher than 168 ppb, the estimated cancer risk for adolescents aged 13-18 years in Taichung City would be already higher t

In [3]:
# create document collection D
doc_collection = dict()

for i in range(len(doc_list)):
    list_ = doc_list[i].split("\t")
    if len(list_) == 4:
        key_ = list_[0]
        title_ = list_[2]
        text_ = list_[3]
        value_ = title_ + " " + text_
        doc_collection.update({key_: value_})
    else:
        continue

Use the same list of documents that was used for testing in the original paper.

In [4]:
# load list of documents used for testing
raw_texts = open('data/raw/test.docs.ids').read()
test_list = raw_texts.split("\n")
len(test_list)

3163

In [5]:
# note that the last element in list is simply an empty string and will be removed
test_list[3162]
test_list.pop()
len(test_list)

3162

In [6]:
# create test document collection that will be used to compute speed, MAP
doc_collection_test = dict()
for idx in doc_collection.keys():
    # print(idx)
    if idx in test_list:
        key_ = idx
        value_ = doc_collection[idx]
        doc_collection_test.update({key_: value_})
    else:
        continue
        
len(doc_collection_test)

3162

In [7]:
# delete full doc_collection to avoid confusion!
del doc_collection

### *LOAD OWN IMPLEMENTATION OF VECTOR SPACE RETRIEVAL FUNCTIONS*

In [8]:
# load own implementations of VSM
from VSM_functions import *

[nltk_data] Downloading package wordnet to /home/roman/nltk_data...
[nltk_data]   Package wordnet is already up-to-date!


### *INDEXING*

In [9]:
# compute global idf scores on D
idfs = compute_Idf(doc_collection_test)

Idf computation done in 18.0459s.
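
For orientation, a minimal sketch of how global idf scores can be computed. The internals of `compute_Idf` are not shown in this notebook, so the whitespace tokenization, the `log(N / df)` formula and the name `sketch_idf` are assumptions made for illustration only.

```python
import math
from collections import Counter

def sketch_idf(docs):
    """Return {term: log(N / df)} for a {doc_id: text} collection (assumed formula)."""
    n_docs = len(docs)
    df = Counter()
    for text in docs.values():
        # a term counts once per document, no matter how often it occurs there
        df.update(set(text.lower().split()))
    return {term: math.log(n_docs / count) for term, count in df.items()}

# toy collection, not taken from doc_dump.txt
toy_docs = {"d1": "protein diet", "d2": "vegan diet", "d3": "vegan protein diet"}
toy_idfs = sketch_idf(toy_docs)
```

A term that occurs in every document (here `diet`) gets an idf of 0 under this formula, so it contributes nothing to the ranking.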


In [10]:
# Compute tfidf scores
tfidfs = compute_TfIdf(doc_collection_test, idfs)

Tf-idf computation done in 24.4949s.


In [11]:
# Construct inverted index
inverted_index = construct_invertedIndex(doc_collection_test, idfs, tfidfs)

InvertedIndex construction done in 0.2234s.
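
A minimal sketch of what an inverted index over tf-idf weights can look like. The data layout used by `construct_invertedIndex` is not shown here, so the `{term: [(doc_id, weight), ...]}` structure and the name `sketch_inverted_index` are assumptions.

```python
from collections import defaultdict

def sketch_inverted_index(tfidf_by_doc):
    """Map each term to a posting list of (doc_id, weight) pairs."""
    index = defaultdict(list)
    for doc_id, weights in tfidf_by_doc.items():
        for term, weight in weights.items():
            index[term].append((doc_id, weight))
    return dict(index)

# toy tf-idf weights, not taken from the real collection
toy_tfidf = {
    "d1": {"protein": 0.4, "diet": 0.1},
    "d2": {"vegan": 0.4, "diet": 0.1},
}
toy_index = sketch_inverted_index(toy_tfidf)
```

At query time only the posting lists of the query terms need to be visited, instead of every document vector.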


In [12]:
# construct term document matrix
tdm = create_tdm(doc_collection_test, tfidfs)

TDM construction done in 8.9105s.


In [13]:
# Compute dict that stores vector norm length for each document d in D
doc_lengths = construct_docLengthDict(doc_collection_test, tfidfs)

DocLength index construction done in 0.1214s.
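
A minimal sketch of the document length dictionary, assuming that "vector norm length" means the Euclidean norm of each tf-idf vector. `sketch_doc_lengths` is a hypothetical name; the actual `construct_docLengthDict` is not shown in this notebook.

```python
import math

def sketch_doc_lengths(tfidf_by_doc):
    """Return {doc_id: Euclidean norm of the document's weight vector}."""
    return {
        doc_id: math.sqrt(sum(w * w for w in weights.values()))
        for doc_id, weights in tfidf_by_doc.items()
    }

# toy weights chosen so the norms are easy to check by hand
toy_lengths = sketch_doc_lengths({"d1": {"a": 3.0, "b": 4.0}, "d2": {"a": 1.0}})
```

Precomputing these norms once at indexing time avoids recomputing them for every query during cosine scoring.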


In [14]:
# Compute preclustering with sqrt(n) random leaders
clusters = pre_cluster(doc_collection_test)

Preclustering done in 22.6574s.
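
A minimal sketch of preclustering with sqrt(n) random leaders, as mentioned in the comment above. How `pre_cluster` works internally is not shown, so the sparse dict vectors, the fixed seed and both function names are assumptions.

```python
import math
import random

def cosine(u, v):
    """Cosine similarity of two sparse vectors given as {term: weight} dicts."""
    dot = sum(w * v.get(t, 0.0) for t, w in u.items())
    norm_u = math.sqrt(sum(w * w for w in u.values()))
    norm_v = math.sqrt(sum(w * w for w in v.values()))
    return dot / (norm_u * norm_v) if norm_u and norm_v else 0.0

def sketch_pre_cluster(vectors, seed=0):
    """Pick about sqrt(n) random leaders and attach every document to its nearest leader."""
    rng = random.Random(seed)
    n_leaders = max(1, round(math.sqrt(len(vectors))))
    leaders = rng.sample(sorted(vectors), n_leaders)
    clusters = {leader: [] for leader in leaders}
    for doc_id, vec in vectors.items():
        best = max(leaders, key=lambda lead: cosine(vec, vectors[lead]))
        clusters[best].append(doc_id)
    return clusters

# toy vectors, not taken from the real collection
toy_vectors = {
    "d1": {"vegan": 1.0},
    "d2": {"vegan": 0.9, "diet": 0.1},
    "d3": {"protein": 1.0},
    "d4": {"protein": 0.8, "diet": 0.2},
}
toy_clusters = sketch_pre_cluster(toy_vectors)
```

At query time the query would be compared with the leaders first and then only with the followers of the closest leader, which trades ranking quality for fewer similarity computations.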


In [15]:
# Construct tiered index
tiered_index = construct_tiered_index(doc_collection_test, inverted_index, t=0.5)

TieredIndex construction done in 0.2100s.
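
A minimal sketch of a two-tier index. The exact meaning of `t` in `construct_tiered_index` is not documented in this notebook; the sketch assumes it is a fraction of each term's maximum posting weight, and `sketch_tiered_index` and the tier names are hypothetical.

```python
def sketch_tiered_index(index, t=0.5):
    """Split every non-empty posting list into a high-weight and a low-weight tier."""
    tiers = {}
    for term, postings in index.items():
        # assumed rule: a posting is "high" if it reaches t times the best weight
        cutoff = t * max(weight for _, weight in postings)
        tiers[term] = {
            "tier1": [(d, w) for d, w in postings if w >= cutoff],
            "tier2": [(d, w) for d, w in postings if w < cutoff],
        }
    return tiers

# toy posting list, not taken from the real index
toy_tiers = sketch_tiered_index(
    {"diet": [("d1", 0.8), ("d2", 0.3), ("d3", 0.5)]}, t=0.5
)
```

Under this assumption a larger `t` makes tier 1 smaller, so fewer postings are scored per query at the risk of missing relevant documents.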


Full indexing takes roughly 1 to 2 minutes (about 75 s in this run).

### *QUERYING*

### Load in query collection - Using test.nontopic-titles.queries

Here we use test.nontopic-titles.queries, the same query set as in the paper.

In [16]:
# read in raw text and split
raw_texts = open('data/test/test.nontopic-titles.queries', encoding="utf-8").read()
query_list = raw_texts.split("\n")
len(query_list)

145

In [17]:
# create dictionary that holds the queryID + queryText
query_col = dict()

for i in range(len(query_list)):
    if len(query_list[i].split("\t")) == 2:
        key_, value_ = query_list[i].split("\t")
        query_col.update({key_ : value_})
        
len(query_col)

144

A typical non-topic page title resembles a query that an average user would type. For example:

In [18]:
# display an example query
query_col["PLAIN-2590"]

'do vegetarians get enough protein ?'

### Load in gold standard for relevance judgements - Using test.2-1-0.qrel

In [19]:
# read in raw text and split
raw_texts = open('data/test/test.2-1-0.qrel').read()
rel_list = raw_texts.split("\n")
len(rel_list)

12335

In [20]:
# create dictionary that holds queryID + emptyList for all queries in Q
gold_col = dict()

for i in range(len(rel_list)):
    list_ = rel_list[i].split("\t")
    if len(list_)==4:
        key_ = list_[0]
        value_ = list()
        if key_ in query_col.keys():
            gold_col.update({key_ : value_})
    else:
        continue

In [21]:
for i in range(len(rel_list)):
    list_ = rel_list[i].split("\t")
    if len(list_)==4:
        key_ = list_[0]
        value_ = list_[2]
        if key_ in gold_col.keys():
            gold_col[key_].append(value_)
    else:
        continue
        
# show length of gold_col
len(gold_col)

144

Show relevant documents for query

In [22]:
# show list of relevant documents for PLAIN-2590
# To measure MAP all documents in this list are considered to be of equal relevance
# See nDCG evaluation for ordered relevance judgments!
gold_col["PLAIN-2590"]

['MED-2288',
 'MED-3137',
 'MED-2290',
 'MED-2291',
 'MED-2292',
 'MED-2293',
 'MED-2294',
 'MED-2295',
 'MED-2296',
 'MED-2498',
 'MED-2517',
 'MED-2519',
 'MED-2501',
 'MED-2502',
 'MED-2513',
 'MED-2504',
 'MED-2505',
 'MED-2506',
 'MED-2507',
 'MED-5239',
 'MED-2509',
 'MED-2510',
 'MED-2511',
 'MED-2512',
 'MED-3000',
 'MED-2765',
 'MED-2997',
 'MED-3001',
 'MED-2999',
 'MED-4313',
 'MED-3148',
 'MED-3149',
 'MED-3242',
 'MED-3243',
 'MED-3244',
 'MED-3245',
 'MED-3270',
 'MED-3271',
 'MED-3272',
 'MED-3273',
 'MED-3274',
 'MED-3275',
 'MED-3276',
 'MED-3277',
 'MED-3278',
 'MED-3279',
 'MED-3280',
 'MED-3281',
 'MED-3282',
 'MED-3283',
 'MED-3580',
 'MED-3581',
 'MED-3582',
 'MED-3583',
 'MED-3584',
 'MED-3858',
 'MED-4094',
 'MED-3860',
 'MED-3862',
 'MED-4107',
 'MED-4299',
 'MED-4298',
 'MED-4600']

# Document Retrieval

- Use example query to show topK retrieval, where k = 10
- Loop 10 times over all queries in the test query collection to measure retrieval speed (i.e. 1440 retrievals)
- Loop over all queries in the test query collection to measure MAP and r-precision

### Example query

In [23]:
query_col["PLAIN-2590"]

'do vegetarians get enough protein ?'

In [24]:
query1 = query_col["PLAIN-2590"]

# Approach 1: 'vanilla' (using Term-Document Matrix)

### topK document retrieval with k=10 

Show / return documents once!

In [25]:
top_k_retrieval(q = query1, TDM=tdm, idfDict=idfs,
                D = doc_collection_test, k = 10, strategy="vanilla",
                show_documents=True)

Retrieval time ca. 0.50370526 seconds.
Highest cosine similarity:
	MED-4984 : 0.23064
	MED-5340 : 0.22193
	MED-1771 : 0.20181
	MED-4163 : 0.19685
	MED-1540 : 0.19446
	MED-1541 : 0.19444
	MED-1135 : 0.18747
	MED-2294 : 0.17669
	MED-2939 : 0.17560
	MED-2290 : 0.17405

Vegetarian and vegan diets in type 2 diabetes management. - PubMed - NCBI Abstract Vegetarian and vegan diets offer significant benefits for diabetes management. In observational studies, individuals following vegetarian diets are about half as likely to develop diabetes, compared with non-vegetarians. In clinical trials in individuals with type 2 diabetes, low-fat vegan diets improve glycemic control to a greater extent than conventional diabetes diets. Although this effect is primarily attributable to greater weight loss, evidence also suggests that reduced intake of saturated fats and high-glycemic-index foods, increased intake of dietary fiber and vegetable protein, reduced intramyocellular lipid concentrations, and dec

### Evaluation - Mean Average Precision (MAP)

In [26]:
# Correct evaluation requires ranking all 3162 documents!
len(doc_collection_test)

3162

In [27]:
# Evaluate MAP 
AP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    
    topK_scores = top_k_retrieval(q = query, TDM=tdm, idfDict=idfs,
                        D = doc_collection_test, k = 3162, strategy="vanilla",
                        show_documents=False, print_scores=False,
                       return_results=True, return_speed=False)
    
    avg_precision = evaluate_AveragePrecision(y_pred=topK_scores, y_true=gold_list)
    AP.append(avg_precision)

In [28]:
MAP = np.mean(AP)
print("MAP: {:.4f}".format(MAP))

MAP: 0.1447
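
A minimal sketch of average precision for a single query, to make the MAP figure easier to interpret. `evaluate_AveragePrecision` is not shown in this notebook; the sketch assumes a ranked list of document IDs and binary relevance, and `sketch_average_precision` is a hypothetical name.

```python
def sketch_average_precision(ranked_ids, relevant_ids):
    """Mean of precision@rank over the ranks at which a relevant document appears."""
    relevant = set(relevant_ids)
    if not relevant:
        return 0.0
    hits = 0
    precision_sum = 0.0
    for rank, doc_id in enumerate(ranked_ids, start=1):
        if doc_id in relevant:
            hits += 1
            precision_sum += hits / rank
    # relevant documents that are never retrieved contribute a precision of 0
    return precision_sum / len(relevant)

# toy ranking: relevant documents sit at ranks 2 and 4
toy_ap = sketch_average_precision(["d3", "d1", "d7", "d2"], ["d1", "d2"])
```

In the toy case the precisions at the two hits are 1/2 and 2/4, so the average precision is 0.5. MAP is then the mean of these per-query values over all queries.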


### Evaluation - r-precision

In [29]:
# Evaluate r-precision
RP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list = gold_col[qIDX]
    K = len(gold_list)
    
    topK_scores = top_k_retrieval(q = query, TDM=tdm, idfDict=idfs,
                        D = doc_collection_test, k = K, strategy="vanilla",
                        show_documents=False, print_scores=False,
                       return_results=True, return_speed=False)
    
    r_precision = evaluate_pAtRank(y_pred=topK_scores, y_true=gold_list, atRank=K)
    RP.append(r_precision)

In [30]:
RP_avg = np.mean(RP)
print("R-precision: {:.4f}".format(RP_avg))

R-precision: 0.1608
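
A minimal sketch of r-precision, i.e. precision at rank R where R is the number of relevant documents for the query (the cell above sets `K = len(gold_list)` for this reason). `evaluate_pAtRank` is not shown here; `sketch_r_precision` is a hypothetical stand-in that assumes a ranked list of document IDs.

```python
def sketch_r_precision(ranked_ids, relevant_ids):
    """Fraction of the top-R results that are relevant, with R = number of relevant docs."""
    relevant = set(relevant_ids)
    r = len(relevant)
    if r == 0:
        return 0.0
    return sum(1 for doc_id in ranked_ids[:r] if doc_id in relevant) / r

# toy ranking: two relevant documents, only one of them in the top 2
toy_rp = sketch_r_precision(["d1", "d3", "d2"], ["d1", "d2"])
```

Here only `d1` is among the top 2 results, so the r-precision is 0.5.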


### Retrieval Speed

In [31]:
speed_list = list()

for i in range(10):
    
    for qIDX, qTEXT in query_col.items():
        query = qTEXT

        topK_scores, speed = top_k_retrieval(q = query, TDM=tdm, idfDict=idfs,
                            D = doc_collection_test, k = 10, strategy="vanilla",
                            show_documents=False, print_scores=False,
                           return_results=True, return_speed=True)


        speed_list.append(speed)  

In [32]:
speed_avg = np.mean(speed_list)
print("Retrieval time in sec. (average over 10 iterations): {:.4f}".format(speed_avg))

Retrieval time in sec. (average over 10 iterations): 0.5623
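
How `top_k_retrieval` measures `speed` internally is not shown in this notebook. A minimal sketch of the same measurement pattern (repeat all queries several times and record one wall-clock time per retrieval) using `time.perf_counter`; `time_queries` and the `retrieve` callable are hypothetical.

```python
import time
from statistics import mean

def time_queries(retrieve, queries, repeats=10):
    """Run every query `repeats` times and return one wall-clock timing per call."""
    timings = []
    for _ in range(repeats):
        for query in queries:
            start = time.perf_counter()
            retrieve(query)
            timings.append(time.perf_counter() - start)
    return timings

# toy stand-in for a retrieval function: it just sorts the query tokens
toy_timings = time_queries(lambda q: sorted(q.split()), ["b a", "d c"], repeats=3)
toy_avg = mean(toy_timings)
```

With 144 queries and 10 repeats this pattern yields the 1440 timings that are averaged above.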


# Approach 2: 'standard' (using invertedIndex)

### topK document retrieval with k=10 

Note: Documents are not displayed here to keep the notebook readable.

In [33]:
top_k_retrieval(q = query1, D = doc_collection_test, k = 10, strategy="standard",
                idfDict = idfs, invertedIdx = inverted_index,
                lengthIdx = doc_lengths,
                show_documents=False)

Retrieval time ca. 0.00351310 seconds.
Highest cosine similarity:
	MED-4984 : 0.36616
	MED-5340 : 0.35233
	MED-1771 : 0.32038
	MED-4163 : 0.31251
	MED-1540 : 0.30872
	MED-1541 : 0.30870
	MED-1135 : 0.29762
	MED-2294 : 0.28051
	MED-2939 : 0.27878
	MED-2290 : 0.27632
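
A minimal sketch of term-at-a-time scoring over an inverted index, which is one common way to implement this kind of retrieval. The actual `"standard"` strategy in VSM_functions is not shown, so `sketch_top_k` and its data layout are assumptions. The sketch divides only by the document length; leaving out the query norm changes the scores by a constant factor per query but not the ranking, which would be consistent with the scores above being about 1.59 times the 'vanilla' scores while the order is identical.

```python
import heapq
from collections import defaultdict

def sketch_top_k(query_weights, index, doc_lengths, k=10):
    """Accumulate dot products per document, normalise by doc length, return top k."""
    scores = defaultdict(float)
    for term, q_weight in query_weights.items():
        # only documents that contain at least one query term are touched
        for doc_id, d_weight in index.get(term, []):
            scores[doc_id] += q_weight * d_weight
    normalised = {
        doc_id: score / doc_lengths[doc_id]
        for doc_id, score in scores.items()
        if doc_lengths[doc_id] > 0
    }
    return heapq.nlargest(k, normalised.items(), key=lambda item: item[1])

# toy index and lengths, not taken from the real collection
toy_postings = {"vegan": [("d1", 2.0), ("d2", 1.0)], "protein": [("d2", 3.0)]}
toy_norms = {"d1": 2.0, "d2": 5.0}
toy_top = sketch_top_k({"vegan": 1.0, "protein": 1.0}, toy_postings, toy_norms, k=2)
```

Because documents without any query term are never visited, the work per query depends on the length of the relevant posting lists rather than on the size of the collection.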



### Evaluation - Mean Average Precision (MAP)

In [34]:
# Evaluate MAP 
AP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    
    topK_scores = top_k_retrieval(q = query, D = doc_collection_test, k = 3162,
                                idfDict = idfs, invertedIdx = inverted_index,
                                lengthIdx = doc_lengths,
                                show_documents=False, return_results=True, print_scores=False)
    
    avg_precision = evaluate_AveragePrecision(y_pred=topK_scores, y_true=gold_list)
    AP.append(avg_precision)

In [35]:
MAP = np.mean(AP)
print("MAP: {:.4f}".format(MAP))

MAP: 0.1447


### Evaluation - r-precision

In [36]:
# Evaluate r-precision
RP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    K = len(gold_list)
    
    topK_scores = top_k_retrieval(q = query, D = doc_collection_test, k = K,
                                idfDict = idfs, invertedIdx = inverted_index,
                                lengthIdx = doc_lengths,
                                show_documents=False, return_results=True, print_scores=False)
    
    r_precision = evaluate_pAtRank(y_pred=topK_scores, y_true=gold_list, atRank=K)
    RP.append(r_precision)

In [37]:
RP_avg = np.mean(RP)
print("R-precision: {:.4f}".format(RP_avg))

R-precision: 0.1608


### Retrieval Speed

In [38]:
speed_list = list()

for i in range(10):
    
    for qIDX, qTEXT in query_col.items():
        query = qTEXT

        topK_scores, speed = top_k_retrieval(q = query, D = doc_collection_test, k = 10,
                            idfDict = idfs, invertedIdx = inverted_index,
                            lengthIdx = doc_lengths,
                            show_documents=False, print_scores=False,
                                             return_results=True, return_speed=True
                                            )
        
        speed_list.append(speed)


In [39]:
speed_avg = np.mean(speed_list)
print("Retrieval time in sec. (avg. over 10 iterations): {:.4f}".format(speed_avg))

Retrieval time in sec. (avg. over 10 iterations): 0.0030


# Approach 3: 'postingMergeIntersection'

### topK document retrieval with k=10 

In [40]:
top_k_retrieval(q = query1, D = doc_collection_test, k = 10, strategy="intersection",
                idfDict = idfs, invertedIdx = inverted_index,
                lengthIdx = doc_lengths,
                show_documents=False)

Retrieval time ca. 0.00361586 seconds.
Highest cosine similarity:
	MED-4984 : 0.36616
	MED-5340 : 0.35233
	MED-1771 : 0.32038
	MED-4163 : 0.31251
	MED-1135 : 0.29762
	MED-2294 : 0.28051
	MED-2939 : 0.27878
	MED-2290 : 0.27632
	MED-1723 : 0.26479
	MED-1613 : 0.25572
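
A minimal sketch of the classic posting-list merge that computes the intersection of two sorted posting lists. Whether the `"intersection"` strategy in VSM_functions works exactly this way is an assumption; `intersect_postings` is a hypothetical name.

```python
def intersect_postings(p1, p2):
    """Walk two sorted lists of doc IDs in parallel and keep the IDs found in both."""
    result = []
    i = j = 0
    while i < len(p1) and j < len(p2):
        if p1[i] == p2[j]:
            result.append(p1[i])
            i += 1
            j += 1
        elif p1[i] < p2[j]:
            i += 1  # advance the list with the smaller current ID
        else:
            j += 1
    return result

# toy posting lists of integer doc IDs
toy_common = intersect_postings([1, 3, 5, 8], [2, 3, 8, 9])
```

If candidates are restricted to documents that contain every query term, documents missing a single term drop out of the ranking entirely. This may be why MED-1540 and MED-1541 from the previous ranking no longer appear here.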



### Evaluation - Mean Average Precision (MAP)

In [41]:
# Evaluate MAP 
AP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    
    topK_scores = top_k_retrieval(q = query, D = doc_collection_test, k = 3162, strategy="intersection",
                                    idfDict = idfs, invertedIdx = inverted_index,
                                    lengthIdx = doc_lengths,
                                    show_documents=False, print_scores=False,
                                    return_results=True, return_speed=False)
    
    avg_precision = evaluate_AveragePrecision(y_pred=topK_scores, y_true=gold_list)
    AP.append(avg_precision)

In [42]:
MAP = np.mean(AP)
print("MAP: {:.4f}".format(MAP))

MAP: 0.0329


### Evaluation - r-precision

In [43]:
# Evaluate r-precision 
RP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    K = len(gold_list)
    
    topK_scores = top_k_retrieval(q = query, D = doc_collection_test, k = K, strategy="intersection",
                                    idfDict = idfs, invertedIdx = inverted_index,
                                    lengthIdx = doc_lengths,
                                    show_documents=False, print_scores=False,
                                    return_results=True, return_speed=False)
    
    r_precision = evaluate_pAtRank(y_pred=topK_scores, y_true=gold_list, atRank=K)
    RP.append(r_precision)

In [44]:
RP_avg = np.mean(RP)
print("R-precision: {:.4f}".format(RP_avg))

R-precision: 0.0266


### Retrieval Speed

In [45]:
speed_list = list()

for i in range(10):
    
    for qIDX, qTEXT in query_col.items():
        query = qTEXT

        topK_scores, speed = top_k_retrieval(q = query, D = doc_collection_test, k = 10,
                                             strategy="intersection",
                                            idfDict = idfs, invertedIdx = inverted_index,
                                            lengthIdx = doc_lengths,
                                            show_documents=False, print_scores=False,
                                            return_results=True, return_speed=True)

        speed_list.append(speed)

In [46]:
speed_avg = np.mean(speed_list)
print("Retrieval time in sec. (avg. over 10 iterations): {:.4f}".format(speed_avg))

Retrieval time in sec. (avg. over 10 iterations): 0.0029


# Approach 4: 'preclustering'

### topK document retrieval with k=10 

In [47]:
top_k_retrieval(q = query1, D = doc_collection_test, k = 10, strategy="preclustering",
                idfDict = idfs, invertedIdx = inverted_index,
                lengthIdx = doc_lengths, preClusterDict=clusters,
                show_documents=False)

Retrieval time ca. 0.02604771 seconds.
Highest cosine similarity:
	MED-5340 : 0.35233
	MED-4163 : 0.31251
	MED-1541 : 0.30870
	MED-2294 : 0.28051
	MED-2290 : 0.27632
	MED-3236 : 0.27344
	MED-3779 : 0.24989
	MED-4637 : 0.24899
	MED-3557 : 0.24485
	MED-1530 : 0.22907



### Evaluation - Mean Average Precision (MAP)

In [48]:
# Evaluate MAP
AP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    
    topK_scores = top_k_retrieval(q = query, D = doc_collection_test, k = 3162, strategy="preclustering",
                                         idfDict = idfs, invertedIdx = inverted_index,
                                         lengthIdx = doc_lengths, preClusterDict=clusters,
                                         show_documents=False, print_scores=False,
                                         return_results=True, return_speed=False)
    
    avg_precision = evaluate_AveragePrecision(y_pred=topK_scores, y_true=gold_list)
    AP.append(avg_precision)

In [49]:
MAP = np.mean(AP)
print("MAP: {:.4f}".format(MAP))

MAP: 0.0638


### Evaluation - r-precision

In [50]:
# Evaluate r-precision
RP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    K = len(gold_list)
    
    topK_scores = top_k_retrieval(q = query, D = doc_collection_test, k = K, strategy="preclustering",
                                         idfDict = idfs, invertedIdx = inverted_index,
                                         lengthIdx = doc_lengths, preClusterDict=clusters,
                                         show_documents=False, print_scores=False,
                                         return_results=True, return_speed=False)
    
    r_precision = evaluate_pAtRank(y_pred=topK_scores, y_true=gold_list, atRank=K)
    RP.append(r_precision)

In [51]:
RP_avg = np.mean(RP)
print("R-precision: {:.4f}".format(RP_avg))

R-precision: 0.0848


### Retrieval Speed

In [52]:
speed_list = list()

for i in range(10):
    
    for qIDX, qTEXT in query_col.items():
        query = qTEXT

        topK_scores, speed = top_k_retrieval(q = query, D = doc_collection_test, k = 10, strategy="preclustering",
                                             idfDict = idfs, invertedIdx = inverted_index,
                                             lengthIdx = doc_lengths, preClusterDict=clusters,
                                             show_documents=False, print_scores=False,
                                             return_results=True, return_speed=True)

        speed_list.append(speed)

In [53]:
speed_avg = np.mean(speed_list)
print("Retrieval time in sec. (avg. over 10 iterations): {:.4f}".format(speed_avg))

Retrieval time in sec. (avg. over 10 iterations): 0.0324


# Approach 5: 'tiered_index'

### topK document retrieval with k=10 

In [54]:
top_k_retrieval(q = query1, D = doc_collection_test, k = 10, strategy="tiered",
                idfDict = idfs, invertedIdx = inverted_index,
                lengthIdx = doc_lengths, preClusterDict=clusters, tieredIdx = tiered_index,
                show_documents=False)

Retrieval time ca. 0.00654340 seconds.
Highest cosine similarity:
	MED-5340 : 0.35233
	MED-1771 : 0.32038
	MED-4163 : 0.31251
	MED-1540 : 0.30872
	MED-1541 : 0.30870
	MED-4984 : 0.30004
	MED-1135 : 0.29762
	MED-2939 : 0.27878
	MED-3236 : 0.27344
	MED-1723 : 0.26479



### Evaluation - Mean Average Precision (MAP)

In [55]:
# Evaluate MAP 
AP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    
    topK_scores = top_k_retrieval(q = query, D = doc_collection_test, k = 3162, strategy="tiered",
                                         idfDict = idfs, invertedIdx = inverted_index,
                                         lengthIdx = doc_lengths, tieredIdx = tiered_index,
                                         show_documents=False, print_scores=False,
                                         return_results=True, return_speed=False)
    
    avg_precision = evaluate_AveragePrecision(y_pred=topK_scores, y_true=gold_list)
    AP.append(avg_precision)

In [56]:
MAP = np.mean(AP)
print("MAP: {:.4f}".format(MAP))

MAP: 0.1432


### Evaluation - r-precision

In [57]:
# Evaluate r-precision 
RP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    K = len(gold_list)
    
    topK_scores = top_k_retrieval(q = query, D = doc_collection_test, k = K, strategy="tiered",
                                         idfDict = idfs, invertedIdx = inverted_index,
                                         lengthIdx = doc_lengths, tieredIdx = tiered_index,
                                         show_documents=False, print_scores=False,
                                         return_results=True, return_speed=False)
    
    r_precision = evaluate_pAtRank(y_pred=topK_scores, y_true=gold_list, atRank=K)
    RP.append(r_precision)

In [58]:
RP_avg = np.mean(RP)
print("R-precision: {:.4f}".format(RP_avg))

R-precision: 0.1601


### Retrieval Speed

In [59]:
speed_list = list()

for i in range(10):
    
    for qIDX, qTEXT in query_col.items():
        query = qTEXT

        topK_scores, speed = top_k_retrieval(q = query, D = doc_collection_test, k = 10, strategy="tiered",
                                             idfDict = idfs, invertedIdx = inverted_index,
                                             lengthIdx = doc_lengths, tieredIdx = tiered_index,
                                             show_documents=False, print_scores=False,
                                             return_results=True, return_speed=True)

        speed_list.append(speed)

In [60]:
speed_avg = np.mean(speed_list)
print("Retrieval time in sec. (avg. over 10 iterations): {:.4f}".format(speed_avg))

Retrieval time in sec. (avg. over 10 iterations): 0.0034


# Approach 5b: 'tiered_index' with t=0.8

In [61]:
# Construct tiered index
tiered_index = construct_tiered_index(doc_collection_test, inverted_index, t=0.8)

TieredIndex construction done in 0.1575s.


### topK document retrieval with k=10 

In [62]:
top_k_retrieval(q = query1, D = doc_collection_test, k = 10, strategy="tiered",
                idfDict = idfs, invertedIdx = inverted_index,
                lengthIdx = doc_lengths, preClusterDict=clusters, tieredIdx = tiered_index,
                show_documents=False)

Retrieval time ca. 0.00424910 seconds.
Highest cosine similarity:
	MED-4163 : 0.31251
	MED-1540 : 0.30872
	MED-1541 : 0.30870
	MED-4984 : 0.30004
	MED-3236 : 0.27344
	MED-4644 : 0.26043
	MED-5340 : 0.25942
	MED-4632 : 0.25273
	MED-3779 : 0.24989
	MED-4637 : 0.24899



### Evaluation - Mean Average Precision (MAP)

In [63]:
# Evaluate MAP 
AP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    
    topK_scores = top_k_retrieval(q = query, D = doc_collection_test, k = 3162, strategy="tiered",
                                         idfDict = idfs, invertedIdx = inverted_index,
                                         lengthIdx = doc_lengths, tieredIdx = tiered_index,
                                         show_documents=False, print_scores=False,
                                         return_results=True, return_speed=False)
    
    avg_precision = evaluate_AveragePrecision(y_pred=topK_scores, y_true=gold_list)
    AP.append(avg_precision)

In [64]:
MAP = np.mean(AP)
print("MAP: {:.4f}".format(MAP))

MAP: 0.1369


### Evaluation - r-precision

In [65]:
# Evaluate r-precision 
RP = list()

for qIDX, qTEXT in query_col.items():
    query = qTEXT
    gold_list=gold_col[qIDX]
    K = len(gold_list)
    
    topK_scores = top_k_retrieval(q = query, D = doc_collection_test, k = K, strategy="tiered",
                                         idfDict = idfs, invertedIdx = inverted_index,
                                         lengthIdx = doc_lengths, tieredIdx = tiered_index,
                                         show_documents=False, print_scores=False,
                                         return_results=True, return_speed=False)
    
    r_precision = evaluate_pAtRank(y_pred=topK_scores, y_true=gold_list, atRank=K)
    RP.append(r_precision)

In [66]:
RP_avg = np.mean(RP)
print("R-precision: {:.4f}".format(RP_avg))

R-precision: 0.1571


### Retrieval Speed

In [67]:
speed_list = list()

for i in range(10):
    
    for qIDX, qTEXT in query_col.items():
        query = qTEXT

        topK_scores, speed = top_k_retrieval(q = query, D = doc_collection_test, k = 10, strategy="tiered",
                                             idfDict = idfs, invertedIdx = inverted_index,
                                             lengthIdx = doc_lengths, tieredIdx = tiered_index,
                                             show_documents=False, print_scores=False,
                                             return_results=True, return_speed=True)

        speed_list.append(speed)

In [68]:
speed_avg = np.mean(speed_list)
print("Retrieval time in sec. (avg. over 10 iterations): {:.4f}".format(speed_avg))

Retrieval time in sec. (avg. over 10 iterations): 0.0029
