# Performance metrics

In this notebook, we evaluate how well a search system returns relevant results for a search query, using the established metrics of precision and recall.

## Query

Let's start by recording which of the documents retrieved by our query we deem relevant.

Each document is represented by **1** (a relevant document) or **0** (an irrelevant document).

The relevance assessments for a search are saved, in the order the documents were retrieved, in a list that we call **query**, a container that we can use for computation.

### Query with relevance assessment

In [None]:
query = [0,1,0,0,1,0,1,0,0,1,0,0,1,0,0,0,0,0,0,1]

## Precision

Precision is the proportion of retrieved documents that are relevant.

It answers the question *how precise is the query?*

It is computed as the number of relevant retrieved documents divided by the number of all retrieved documents.

In [None]:
relevant_docs_in_query = query.count(1)
print("Number of relevant documents:", relevant_docs_in_query)

retrieved_docs = len(query)
print("Number of retrieved documents:", retrieved_docs)

precision = relevant_docs_in_query / retrieved_docs
print("Precision for the search =", precision, "≈", round(precision * 100), "%.")

## Recall

Recall is the proportion of relevant documents that are retrieved.

It answers the question *how many of the relevant documents have been retrieved by the query?*

It is computed as the number of relevant retrieved documents divided by the number of all relevant documents in the document set.

Assume that there are 15 relevant documents in the document set.

In [None]:
all_relevant_docs = 15

recall = relevant_docs_in_query / all_relevant_docs

print("Recall for the search =", recall, "≈", round(recall * 100), "%.")
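
The two definitions above can also be wrapped in a small reusable function. This is only a sketch: the function name `precision_recall` and its parameter `total_relevant` are made up for illustration, and the zero guards are an added assumption for empty result lists.

```python
def precision_recall(relevance, total_relevant):
    """Return (precision, recall) for a list of 0/1 relevance judgments."""
    retrieved = len(relevance)
    relevant_retrieved = sum(relevance)
    # Assumption: return 0.0 instead of dividing by zero when nothing was
    # retrieved or when the document set holds no relevant documents.
    precision = relevant_retrieved / retrieved if retrieved else 0.0
    recall = relevant_retrieved / total_relevant if total_relevant else 0.0
    return precision, recall


query = [0,1,0,0,1,0,1,0,0,1,0,0,1,0,0,0,0,0,0,1]
p, r = precision_recall(query, total_relevant=15)
print("Precision:", round(p, 2), "Recall:", round(r, 2))
```

With the list above, 6 of the 20 retrieved documents are relevant, so the sketch gives a precision of 0.3 and a recall of 0.4.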

## Precision at n

Precision at n (or p @ n) is the proportion of relevant documents among the first n retrieved documents.

It is usually computed at cutoffs of 5, 10 and 20, since these typically match the number of documents shown on the first page of search results.

Below, we'll compute p @ n where n = 10.

In [None]:
n = 10

In [None]:
relevant_docs_in_query_at_n = sum(query[:n])
print("Number of relevant documents =", relevant_docs_in_query_at_n)

p_at_n = relevant_docs_in_query_at_n / n
print("Precision at n =", p_at_n, "≈", round(p_at_n * 100), "%.")
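
Since p @ n is usually reported at several cutoffs, the same computation can be repeated for n = 5, 10 and 20. The helper name `precision_at` below is hypothetical, and the sketch assumes n is never larger than the result list.

```python
query = [0,1,0,0,1,0,1,0,0,1,0,0,1,0,0,0,0,0,0,1]


def precision_at(relevance, n):
    """Proportion of relevant documents among the first n results."""
    return sum(relevance[:n]) / n


# Compute p @ n for each of the usual cutoffs.
p_at = {n: precision_at(query, n) for n in (5, 10, 20)}

for n, value in p_at.items():
    print(f"P@{n} = {value:.2f}")
```

For this query, the first 5 results contain 2 relevant documents, the first 10 contain 4, and all 20 contain 6.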

## R-precision

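R-precision is the precision at the Rth position in the retrieved set, where R is the total number of relevant documents for a query. The code below takes R to be the number of relevant documents found in the retrieved list.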

In [None]:
print("Total number of relevant documents for the query (R):", relevant_docs_in_query)

# Count the relevant documents among the first R results.
r_position = sum(query[:relevant_docs_in_query])
print("Number of relevant documents at R:", r_position)

r_precision = r_position / relevant_docs_in_query
print("R-precision =", r_precision, "≈", round(r_precision * 100), "%.")
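
R is often taken as the number of relevant documents in the whole document set rather than in the retrieved list. As a sketch of that variant, which is an assumption and not the computation used in this notebook, R can be set to the 15 relevant documents mentioned in the recall section. The name `r_precision_collection` is introduced here for illustration only.

```python
query = [0,1,0,0,1,0,1,0,0,1,0,0,1,0,0,0,0,0,0,1]

# Assumption: R is the number of relevant documents in the document set.
all_relevant_docs = 15

# Count the relevant documents among the first R retrieved documents.
relevant_in_top_r = sum(query[:all_relevant_docs])

r_precision_collection = relevant_in_top_r / all_relevant_docs
print("Relevant documents in the first R results:", relevant_in_top_r)
print("R-precision with R = 15:", round(r_precision_collection, 3))
```

For this particular query, both choices of R happen to give the same value: 5 of the first 15 results are relevant, just as 2 of the first 6 are.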

## Average precision at document cut off value

Average precision at a document cut-off value (AP/DCV) is used to measure how effectively the results of a query are ranked.

Below, we'll compute AP/DCV where DCV = 10.

In [1]:
dcv = 10

We'll start by building a numbered list of the relevant documents.

In [None]:
relevant_docs_in_query_at_dcv = sum(query[:dcv])

# Number the relevant documents 1, 2, 3, ... in the order they are found.
doc_list = list(range(1, relevant_docs_in_query_at_dcv + 1))

print("Document list for the search:", doc_list)

Then, we'll investigate the positions of each relevant retrieved document.

In [None]:
recall_levels = []

# Record the 1-based rank position of every relevant document up to the DCV.
for position, relevance in enumerate(query[:dcv], start=1):
  if relevance == 1:
    recall_levels.append(position)

print("Recall levels for the search:", recall_levels)

Next, we'll compute precision at each recall level.

In [None]:
precision_at_recall = []

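# each_precision is the number of relevant documents found so far,
# each_recall is the rank position at which that document was found.
for each_precision, each_recall in zip(doc_list, recall_levels):
  precision_at_recall.append(each_precision / each_recall)

print("Precision at each recall level:", precision_at_recall)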

We'll sum the precision values over all recall levels.

In [None]:
sum_precision = sum(precision_at_recall)

print("Total precision:", sum_precision)

We'll compute the total number of relevant documents at DCV.

In [None]:
relevant_docs_at_dcv = query[:dcv].count(1)

print("Relevant documents at the document cut off value:", relevant_docs_at_dcv)

Finally, we'll compute AP/DCV for the query.

In [None]:
ap_dcv = sum_precision / dcv

print("AP/DCV for the search =", ap_dcv, "≈", round(ap_dcv * 100), "%.")
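
The steps above can be condensed into a single loop. The sketch below uses a hypothetical function `average_precision_at` with an optional `normalizer` argument. By default it divides by the cut-off value, as this notebook does. Dividing by the number of relevant documents found up to the cut-off is another common convention, shown here only for comparison.

```python
def average_precision_at(relevance, cutoff, normalizer=None):
    """Sum precision at each relevant rank up to cutoff, then normalize."""
    hits = 0
    precision_sum = 0.0
    for rank, rel in enumerate(relevance[:cutoff], start=1):
        if rel == 1:
            hits += 1
            # Precision at this rank: relevant documents so far / rank.
            precision_sum += hits / rank
    if normalizer is None:
        # Default: divide by the cut-off value, as in the cells above.
        normalizer = cutoff
    return precision_sum / normalizer if normalizer else 0.0


query = [0,1,0,0,1,0,1,0,0,1,0,0,1,0,0,0,0,0,0,1]

ap_by_cutoff = average_precision_at(query, 10)
# Alternative convention (assumption): divide by the relevant documents found.
ap_by_hits = average_precision_at(query, 10, normalizer=sum(query[:10]))

print("AP/DCV divided by the cut-off:", round(ap_by_cutoff, 3))
print("AP/DCV divided by relevant documents found:", round(ap_by_hits, 3))
```

The two conventions give different values for the same ranking (about 0.173 and 0.432 here), so it is worth stating which one is used when reporting results.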

## F1 measure

The F1 measure combines precision and recall into a single score: their harmonic mean.

F1 = 2 / ((1/r) + (1/p)), where r is recall and p is precision.

In [None]:
recall_f1 = 1/recall

precision_f1 = 1/precision

precision_recall_f1 = recall_f1 + precision_f1

f1 = 2 / precision_recall_f1

print("F1 measure for the search =", f1, "≈", round(f1 * 100), "%.")
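
The harmonic mean can be rewritten as 2pr / (p + r), which avoids dividing by precision or recall when either is zero. The sketch below uses a hypothetical helper `f1_score_from`, and returning 0.0 when both values are zero is an assumption made for illustration.

```python
def f1_score_from(precision, recall):
    """Harmonic mean of precision and recall, written as 2pr / (p + r)."""
    if precision + recall == 0:
        # Assumption: define F1 as 0.0 when both precision and recall are 0.
        return 0.0
    return 2 * precision * recall / (precision + recall)


# Values from this notebook: precision = 6/20, recall = 6/15.
f1_alt = f1_score_from(6 / 20, 6 / 15)
print("F1 =", round(f1_alt, 3))
```

For precision 0.3 and recall 0.4 this gives about 0.343, the same value as the reciprocal form above.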

## Summary

In [None]:
print("Precision ≈", round(precision * 100), "%.")

print("Recall ≈", round(recall * 100), "%.")

print("Precision @ n ≈", round(p_at_n * 100), "%.")

print("R-precision ≈", round(r_precision * 100), "%.")

print("AP/DCV ≈", round(ap_dcv * 100), "%.")

print("F1 ≈", round(f1 * 100), "%.")