# Neural search for question answering

## Tasks

1. Read the documentation of the [document store](https://docs.haystack.deepset.ai/docs/document_store) and
   the [retriever](https://docs.haystack.deepset.ai/docs/retriever) in the 
   [Haystack framework](https://haystack.deepset.ai/).

2. Install the Haystack framework (e.g. with `pip install 'farm-haystack[all]'`).

3. Configure a document store based on Faiss, backed by a multilingual E5 model:
   1. For Faiss, use the [multilingual E5](https://huggingface.co/intfloat/multilingual-e5-base) or [silver retriever base](https://huggingface.co/ipipan/silver-retriever-base-v1) encoder.
   2. **Warning:** If you use E5, make sure to [properly configure](https://github.com/deepset-ai/haystack/issues/5242) the store.
   3. If you have problems using Faiss, you can use `InMemoryDocumentStore`, but this requires re-indexing
      all documents each time the script is run, which is time-consuming.
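
As background for the warning above: the E5 model card compares L2-normalized embeddings, which amounts to cosine similarity. The sketch below uses made-up 3-dimensional vectors (not real embeddings) to show how an unnormalized dot product can rank a long vector above a better-aligned one. Whether this is the exact problem discussed in the linked issue is an assumption and is not verified here.

```python
import math

def dot(a, b):
    """Plain dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))

def cosine(a, b):
    """Dot product after L2 normalization of both vectors."""
    return dot(a, b) / (math.sqrt(dot(a, a)) * math.sqrt(dot(b, b)))

# Hypothetical toy vectors, chosen only for illustration.
query = [1.0, 0.0, 0.0]
aligned = [0.9, 0.1, 0.0]    # points almost the same way as the query
long_vec = [5.0, 5.0, 5.0]   # large norm, only loosely aligned

# The dot product prefers long_vec; cosine similarity prefers aligned.
print(dot(query, aligned), dot(query, long_vec))
print(cosine(query, aligned), cosine(query, long_vec))
```

With normalized embeddings the two measures agree, which is why the similarity setting of the store matters for this model.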

In [3]:
!python --version

Python 3.10.16


In [70]:
from torch import Tensor
from transformers import AutoTokenizer, AutoModel
from sentence_transformers import SentenceTransformer
import faiss
import torch
import torch.nn.functional as F
import numpy as np
from haystack import Pipeline
from haystack.document_stores import FAISSDocumentStore
from haystack.nodes import EmbeddingRetriever, PromptNode
from haystack.document_stores import InMemoryDocumentStore

In [72]:
# Pipeline, FAISSDocumentStore and EmbeddingRetriever are already imported in the cell above.
document_store = FAISSDocumentStore(faiss_index_factory_str="Flat")


                1. delete_all_documents() method is deprecated, please use delete_documents method
                For more details, please refer to the issue: https://github.com/deepset-ai/haystack/issues/1045
                


In [73]:
retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="intfloat/multilingual-e5-base",
    model_format="transformers",
    pooling_strategy="reduce_mean",
    top_k=5,
    max_seq_len=512,
)
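
The `pooling_strategy="reduce_mean"` setting above is assumed here to mean masked mean pooling, which is also what the E5 model card uses: token vectors are averaged while padding positions are ignored. A minimal pure-Python sketch with a hypothetical helper `mean_pool` and toy 2-dimensional token vectors:

```python
def mean_pool(token_vectors, attention_mask):
    """Average the token vectors whose mask entry is 1 (padding is ignored)."""
    dim = len(token_vectors[0])
    total = [0.0] * dim
    count = 0
    for vec, keep in zip(token_vectors, attention_mask):
        if keep:
            total = [t + v for t, v in zip(total, vec)]
            count += 1
    # max(count, 1) avoids division by zero for a fully masked input.
    return [t / max(count, 1) for t in total]

tokens = [[1.0, 2.0], [3.0, 4.0], [9.0, 9.0]]  # the last vector is padding
mask = [1, 1, 0]
print(mean_pool(tokens, mask))  # [2.0, 3.0]
```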

In [44]:

document_store_in_memory = InMemoryDocumentStore(similarity="cosine", embedding_dim=768)

retriever_for_in_memory = EmbeddingRetriever(
    document_store=document_store_in_memory,
    embedding_model="intfloat/multilingual-e5-base",
    model_format="transformers",
    pooling_strategy="reduce_mean",
    top_k=5,
    max_seq_len=512,
)

In [46]:
document_store_in_memory.update_embeddings(retriever_for_in_memory)

Updating Embedding: 0 docs [00:00, ? docs/s]


In [74]:
print(retriever.batch_size)

32


4. Load the documents (passages) from the FiQA corpus.

In [8]:
from datasets import load_dataset
import pandas as pd
import numpy as np

def modify_df(df, labels):
    """Strip a leading underscore from each listed column name and cast the column to int64."""
    for label in labels:
        if label[0] == '_':
            df = df.rename(columns={label: label[1:]})
            label = label[1:]
        df[label] = df[label].astype(np.int64)
    return df

ds = load_dataset("clarin-knext/fiqa-pl", "corpus")
df_corpus = pd.DataFrame(ds['corpus'])
df_corpus = modify_df(df_corpus, ['_id'])
print(df_corpus.head())

ds_q = load_dataset("clarin-knext/fiqa-pl", "queries")
ds_q = pd.DataFrame(ds_q['queries'])
ds_q = modify_df(ds_q, ['_id'])
print(ds_q.head())

ds_qrels = load_dataset("clarin-knext/fiqa-pl-qrels")
df_qrels = pd.DataFrame(ds_qrels['train'])
df_qrels_test = pd.DataFrame(ds_qrels['test'])
df_qrels = modify_df(df_qrels, ['query-id', 'corpus-id', 'score'])
df_qrels_test = modify_df(df_qrels_test, ['query-id', 'corpus-id', 'score'])
print(df_qrels.head())
print(df_qrels_test.head())

   id title                                               text
0   3        Nie mówię, że nie podoba mi się też pomysł szk...
1  31        Tak więc nic nie zapobiega fałszywym ocenom po...
2  56        Nigdy nie możesz korzystać z FSA dla indywidua...
3  59        Samsung stworzył LCD i inne technologie płaski...
4  63        Oto wymagania SEC: Federalne przepisy dotycząc...
   id title                                               text
0   0        Co jest uważane za wydatek służbowy w podróży ...
1   4        Wydatki służbowe - ubezpieczenie samochodu pod...
2   5                        Rozpoczęcie nowego biznesu online
3   6           „Dzień roboczy” i „termin płatności” rachunków
4   7        Nowy właściciel firmy – Jak działają podatki d...
   query-id  corpus-id  score
0         0      18850      1
1         4     196463      1
2         5      69306      1
3         6     560251      1
4         6     188530      1
   query-id  corpus-id  score
0         8     566392      1
1   

In [9]:
ds_q_mapper = dict()
ds_corpus_mapper = dict()

for idx, query in ds_q.iterrows():
    ds_q_mapper[query['id']] = query['text']

for idx, corpus in df_corpus.iterrows():
    ds_corpus_mapper[corpus['id']] = corpus['text']


print(ds_q_mapper[5993])

Dlaczego ktokolwiek miałby chcieć najpierw spłacić swoje długi w inny sposób niż „najwyższe odsetki”?


In [10]:
df_joined_test = pd.merge(df_corpus, df_qrels_test, left_on='id', right_on='corpus-id', how='left').drop(columns=['corpus-id'])
df_joined_test.fillna({'score': 0, 'query-id': -1}, inplace=True)
df_joined_test['query-id'] = df_joined_test['query-id'].astype(np.int64)
print(df_joined_test.head())

   id title                                               text  query-id  \
0   3        Nie mówię, że nie podoba mi się też pomysł szk...        -1   
1  31        Tak więc nic nie zapobiega fałszywym ocenom po...        -1   
2  56        Nigdy nie możesz korzystać z FSA dla indywidua...        -1   
3  59        Samsung stworzył LCD i inne technologie płaski...        -1   
4  63        Oto wymagania SEC: Federalne przepisy dotycząc...        -1   

   score  
0    0.0  
1    0.0  
2    0.0  
3    0.0  
4    0.0  


In [20]:
passages = []
for _, row in df_corpus.iterrows():
    # E5 expects indexed texts to carry the "passage: " prefix.
    # The corpus id goes into meta because Haystack assigns its own document ids.
    passages.append({'content': f"passage: {row['text']}", 'meta': {'id': row['id']}})
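
The retrieval output later in this notebook includes a document whose content is just `'passage: '` (corpus id 597929), so some corpus entries appear to be empty. A possible way to skip them while building the list is sketched below; `build_passages` is a hypothetical helper, and it takes plain `(corpus_id, text)` pairs instead of the DataFrame so that it runs on its own.

```python
def build_passages(rows):
    """rows: iterable of (corpus_id, text) pairs; returns Haystack-style dicts."""
    passages = []
    for corpus_id, text in rows:
        text = (text or "").strip()
        if not text:
            continue  # skip passages with no content
        passages.append({"content": f"passage: {text}", "meta": {"id": corpus_id}})
    return passages

# Toy rows: one real text, one empty string, one whitespace-only string.
sample = [(3, "Some text"), (597929, ""), (59, "   ")]
print(build_passages(sample))
```

With the DataFrame from this notebook, the pairs could be produced with `zip(df_corpus['id'], df_corpus['text'])`.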
    

In [75]:
print(passages[0:4])

[{'content': 'passage: Nie mówię, że nie podoba mi się też pomysł szkolenia w miejscu pracy, ale nie możesz oczekiwać, że firma to zrobi. Szkolenie pracowników to nie ich praca – oni tworzą oprogramowanie. Być może systemy edukacyjne w Stanach Zjednoczonych (lub ich studenci) powinny trochę martwić się o zdobycie umiejętności rynkowych w zamian za ich ogromne inwestycje w edukację, zamiast wychodzić z tysiącami zadłużonych studentów i narzekać, że nie są do niczego wykwalifikowani.', 'meta': {'id': 3}}, {'content': 'passage: Tak więc nic nie zapobiega fałszywym ocenom poza dodatkową kontrolą ze strony rynku/inwestorów, ale istnieją pewne nowsze kontrole, które uniemożliwiają instytucjom korzystanie z nich. W ramach DFA banki nie mogą już polegać wyłącznie na ratingach kredytowych jako należytej staranności przy zakupie instrumentu finansowego, więc to jest plus. Intencją jest to, że jeśli instytucje finansowe wykonują swoją własną pracę, to *być może* dojdą do wniosku, że określony CDO

In [76]:
document_store.delete_all_documents()
document_store.write_documents(passages)

                1. delete_all_documents() method is deprecated, please use delete_documents method
                For more details, please refer to the issue: https://github.com/deepset-ai/haystack/issues/1045
                
Writing Documents: 60000it [01:51, 539.68it/s]                           


In [47]:
document_store_in_memory.delete_documents()
document_store_in_memory.write_documents(passages)

In [77]:
document_store.update_embeddings(retriever)

Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.34 Batches/s]
Inferencing Samples: 100%|██████████| 313/313 [58:21<00:00, 11.19s/ Batches]
Inferencing Samples: 100%|██████████| 313/313 [57:51<00:00, 11.09s/ Batches]
Inferencing Samples: 100%|██████████| 313/313 [57:57<00:00, 11.11s/ Batches]
Inferencing Samples: 100%|██████████| 313/313 [57:52<00:00, 11.09s/ Batches]
Inferencing Samples: 100%|██████████| 313/313 [59:55<00:00, 11.49s/ Batches]
Inferencing Samples: 100%|██████████| 238/238 [45:58<00:00, 11.59s/ Batches]
Documents Processed: 60000 docs [5:38:51,  2.95 docs/s]


In [48]:
document_store_in_memory.update_embeddings(retriever_for_in_memory)

Inferencing Samples: 100%|██████████| 313/313 [59:37<00:00, 11.43s/ Batches]
Inferencing Samples: 100%|██████████| 313/313 [59:23<00:00, 11.38s/ Batches]
Inferencing Samples: 100%|██████████| 313/313 [59:27<00:00, 11.40s/ Batches]
Inferencing Samples: 100%|██████████| 313/313 [59:19<00:00, 11.37s/ Batches]
Inferencing Samples: 100%|██████████| 313/313 [59:38<00:00, 11.43s/ Batches]
Inferencing Samples: 100%|██████████| 238/238 [45:18<00:00, 11.42s/ Batches]
Documents Processed: 60000 docs [5:43:10,  2.91 docs/s]


Timing note: encoding 1000 samples with a batch size of 256 took 6 min 38 s.
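
Assuming the timing note refers to encoding 1000 passages in 6 min 38 s, a linear extrapolation to the full 60,000-passage corpus can be sketched as follows. `estimate_hours` is a hypothetical helper. The logged full runs above took roughly 5.6 to 5.7 hours, so the estimate is only a rough guide.

```python
def estimate_hours(n_docs, sample_docs, sample_seconds):
    """Linear extrapolation of encoding time from a timed sample."""
    return n_docs * (sample_seconds / sample_docs) / 3600

sample_seconds = 6 * 60 + 38  # 6m38s for 1000 samples
print(round(estimate_hours(60_000, 1000, sample_seconds), 2))  # 6.63
```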

In [78]:
document_store.save(index_path="./data/index2/document_index.faiss")

In [80]:
document_store.load(index_path="./data/index2/document_index.faiss")

<haystack.document_stores.faiss.FAISSDocumentStore at 0x7f9fd4434370>

In [81]:
retriever.retrieve(query="What is haystack?", top_k=5)

Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.81 Batches/s]


[<Document: {'content': 'passage: Dla Maca to zdecydowanie iFinance.', 'content_type': 'text', 'score': 0.9072425449685203, 'meta': {'id': 281322, 'vector_id': '22006'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '6bcefb03b1b61c71fb5364758f5ee8d1'}>,
 <Document: {'content': 'passage: ', 'content_type': 'text', 'score': 0.9049649822622359, 'meta': {'id': 597929, 'vector_id': '19650'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': '61f526150704962cb6b420e009e00ac9'}>,
 <Document: {'content': 'passage: Tak, możesz. nazywa się Odd Lot', 'content_type': 'text', 'score': 0.9029244803001298, 'meta': {'id': 194404, 'vector_id': '47622'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': 'd68b76e74408bc6055d29a5f7145d1de'}>,
 <Document: {'content': 'passage: 2 rzeczy:', 'content_type': 'text', 'score': 0.8984933453295506, 'meta': {'id': 339473, 'vector_id': '44576'}, 'id_hash_keys': ['content'], 'embedding': None, 'id': 'ca27e31718841de4a9c5e3f96b62d78b'}>,
 <Document: {'

In [50]:
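# Haystack document ids are content hashes (see the retrieval output above), so the
# corpus id 3 is not a valid lookup key here; the corpus id is stored in meta['id'].
print(document_store.get_document_by_id(id=3))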

None


5. Use the questions and relevance scores defined in this corpus to compute NDCG@5 for the dense retriever.

In [27]:
corpus_query_mapping = {} # query_id -> [corpus_ids]

for i, row in df_joined_test.iterrows():
    if row['query-id'] == -1:
        continue
    if row['query-id'] not in corpus_query_mapping:
        corpus_query_mapping[int(row['query-id'])] = []
    corpus_query_mapping[int(row['query-id'])].append(row['id'])

ranking = []
maxi = 0
id_maxi = -1
for query_id, corpus_ids in corpus_query_mapping.items():
    ranking.append((corpus_ids, query_id))
    if len(corpus_ids) > maxi:
        maxi = len(corpus_ids)
        id_maxi = query_id

ranking = sorted(ranking, key=lambda x: len(x[0]), reverse=True)
best_queries = []
for idx, (corpus_ids, query_id) in enumerate(ranking):
    print("query_id: ", query_id, " amount of documents: ", len(corpus_ids), "documents: ", corpus_ids)
    best_queries.append(int(query_id))
    if idx == 10: break

query_id:  5993  amount of documents:  15 documents:  [5827, 55084, 63501, 63690, 94373, 128574, 160193, 224918, 230215, 272866, 287571, 352638, 367375, 426120, 431212]
query_id:  2348  amount of documents:  15 documents:  [134864, 146441, 211622, 211867, 247486, 265874, 268261, 306430, 352271, 381757, 410166, 447619, 474234, 543714, 566573]
query_id:  6005  amount of documents:  13 documents:  [73310, 135415, 149500, 176498, 270856, 345895, 384626, 390689, 414288, 414534, 478457, 507544, 572272]
query_id:  6131  amount of documents:  12 documents:  [2460, 170204, 218088, 235452, 252534, 258465, 326094, 334111, 365263, 368806, 381720, 416679]
query_id:  776  amount of documents:  12 documents:  [10440, 124027, 127263, 220127, 332373, 467044, 496899, 583640, 591516, 592680, 597247, 597880]
query_id:  6002  amount of documents:  11 documents:  [34389, 154181, 233472, 273501, 359862, 390642, 391819, 404605, 516848, 519346, 593434]
query_id:  5511  amount of documents:  10 documents:  [127

In [40]:
def calculate_ndcg_for_QA(result, searching_query_id, n):
    """Binary-relevance NDCG@n for one query, given the retrieved Haystack documents."""
    relevant_ids = corpus_query_mapping[int(searching_query_id)]

    # DCG over the top-n retrieved documents only (gain is 1 for a relevant document).
    DCG = 0
    for idx, answer in enumerate(result[:n]):
        if answer.meta['id'] in relevant_ids:
            DCG += 1 / np.log2(idx + 2)

    # Ideal DCG: all relevant documents (at most n of them) ranked first.
    IDCG = 0
    for idx in range(min(len(relevant_ids), n)):
        IDCG += 1 / np.log2(idx + 2)

    return DCG / IDCG if IDCG > 0 else 0
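
As a worked example of the computation above, the standalone sketch below uses plain 0/1 relevance flags instead of Haystack documents; `ndcg_at_n` is a hypothetical helper, not part of the notebook. With hits at ranks 1 and 4 and 13 relevant documents in total, DCG = 1/log2(2) + 1/log2(5) and IDCG is the sum of 1/log2(i + 1) for i = 1..5. The result, about 0.4852, equals the NDCG@5 reported for query 6005 in the in-memory run, which would be consistent with hits at those ranks, although the actual positions are not shown in the output.

```python
import math

def ndcg_at_n(hits, n_relevant, n):
    """hits: 0/1 relevance flags of the retrieved documents, in rank order."""
    dcg = sum(h / math.log2(rank + 2) for rank, h in enumerate(hits[:n]))
    idcg = sum(1 / math.log2(rank + 2) for rank in range(min(n_relevant, n)))
    return dcg / idcg if idcg > 0 else 0.0

# Relevant documents at ranks 1 and 4, with 13 relevant documents in total.
print(ndcg_at_n([1, 0, 0, 1, 0], n_relevant=13, n=5))
```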

In [82]:
n = 5
scores_5 = {}   # query_id -> NDCG@5
scores_10 = {}  # query_id -> NDCG@10
for i, (_, query_id) in enumerate(ranking):
    # E5 expects the "query: " prefix on search queries.
    result = retriever.retrieve(query="query: " + ds_q_mapper[query_id], top_k=n)
    ndcg_5 = calculate_ndcg_for_QA(result, query_id, 5)
    # Only top_k=5 documents are retrieved, so ndcg10 is a lower bound on the true NDCG@10.
    ndcg_10 = calculate_ndcg_for_QA(result, query_id, 10)
    scores_5[query_id] = ndcg_5
    scores_10[query_id] = ndcg_10
    print(f"query_id: {query_id}, ndcg5: {ndcg_5}, ndcg10: {ndcg_10}")
    if i == 20: break

Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.08 Batches/s]


query_id: 5993, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.86 Batches/s]


query_id: 2348, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.33 Batches/s]


query_id: 6005, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.25 Batches/s]


query_id: 6131, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.94 Batches/s]


query_id: 776, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.14 Batches/s]


query_id: 6002, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.84 Batches/s]


query_id: 5511, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.28 Batches/s]


query_id: 659, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.24 Batches/s]


query_id: 10497, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.78 Batches/s]


query_id: 3909, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.35 Batches/s]


query_id: 4409, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.30 Batches/s]


query_id: 3724, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.72 Batches/s]


query_id: 2075, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.18 Batches/s]


query_id: 5951, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.30 Batches/s]


query_id: 2296, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.00 Batches/s]


query_id: 2204, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.33 Batches/s]


query_id: 8974, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.36 Batches/s]


query_id: 2685, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.18 Batches/s]


query_id: 11039, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.15 Batches/s]


query_id: 2376, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.79 Batches/s]

query_id: 6221, ndcg5: 0.0, ndcg10: 0.0





In [83]:
n = 5
scores_5 = {}   # query_id -> NDCG@5
scores_10 = {}  # query_id -> NDCG@10
for i, (_, query_id) in enumerate(ranking):
    # E5 expects the "query: " prefix on search queries.
    result = retriever_for_in_memory.retrieve(query="query: " + ds_q_mapper[query_id], top_k=n)
    ndcg_5 = calculate_ndcg_for_QA(result, query_id, 5)
    # Only top_k=5 documents are retrieved, so ndcg10 is a lower bound on the true NDCG@10.
    ndcg_10 = calculate_ndcg_for_QA(result, query_id, 10)
    scores_5[query_id] = ndcg_5
    scores_10[query_id] = ndcg_10
    print(f"query_id: {query_id}, ndcg5: {ndcg_5}, ndcg10: {ndcg_10}")
    if i == 20: break

Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.77 Batches/s]


query_id: 5993, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.91 Batches/s]


query_id: 2348, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.05 Batches/s]


query_id: 6005, ndcg5: 0.48522855511632257, ndcg10: 0.31488013066763093


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.96 Batches/s]


query_id: 6131, ndcg5: 0.16958010263680806, ndcg10: 0.11004588314904008


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.47 Batches/s]


query_id: 776, ndcg5: 0.3391602052736161, ndcg10: 0.22009176629808017


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.78 Batches/s]


query_id: 6002, ndcg5: 0.6548086577531307, ndcg10: 0.424926013816671


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.46 Batches/s]


query_id: 5511, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.07 Batches/s]


query_id: 659, ndcg5: 0.5087403079104241, ndcg10: 0.33013764944712026


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.99 Batches/s]


query_id: 10497, ndcg5: 0.6992148198508501, ndcg10: 0.4537425745411855


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.40 Batches/s]


query_id: 3909, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.89 Batches/s]


query_id: 4409, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.67 Batches/s]


query_id: 3724, ndcg5: 0.3391602052736161, ndcg10: 0.22009176629808017


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.06 Batches/s]


query_id: 2075, ndcg5: 0.3391602052736161, ndcg10: 0.23504554941448536


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.07 Batches/s]


query_id: 5951, ndcg5: 0.3391602052736161, ndcg10: 0.23504554941448536


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.74 Batches/s]


query_id: 2296, ndcg5: 0.5531464700081437, ndcg10: 0.38334277998463445


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.35 Batches/s]


query_id: 2204, ndcg5: 0.7227265726449519, ndcg10: 0.5390031312763821


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  1.91 Batches/s]


query_id: 8974, ndcg5: 0.3391602052736161, ndcg10: 0.2529427027676571


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.01 Batches/s]


query_id: 2685, ndcg5: 0.13120507751234178, ndcg10: 0.09785159463516042


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.46 Batches/s]


query_id: 11039, ndcg5: 0.0, ndcg10: 0.0


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.46 Batches/s]


query_id: 2376, ndcg5: 0.8687949224876582, ndcg10: 0.647939623894138


Inferencing Samples: 100%|██████████| 1/1 [00:00<00:00,  2.44 Batches/s]


query_id: 6221, ndcg5: 0.0, ndcg10: 0.0
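
The loop above prints one score per query, while a single NDCG@5 figure for the retriever is usually reported as the mean over queries. A minimal sketch, using a hypothetical helper `mean_score` and only the first four in-memory NDCG@5 values printed above (rounded), so the result is illustrative rather than the score of the full run:

```python
def mean_score(scores):
    """Average a {query_id: score} mapping; returns 0.0 when it is empty."""
    return sum(scores.values()) / len(scores) if scores else 0.0

# First four in-memory NDCG@5 values from the output above, rounded to 4 places.
sample_scores_5 = {5993: 0.0, 2348: 0.0, 6005: 0.4852, 6131: 0.1696}
print(round(mean_score(sample_scores_5), 4))  # 0.1637
```

In the notebook itself the same helper could be applied to `scores_5` after the loop finishes.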


6. Compare the NDCG score from this exercise with the score from [lab 2](2-fts.md) and from [lab 6](6-classification.md).

In [58]:
scores_from_lab2 = {
    "5993": 0.0267,
    "2348": 0.13595,
    "6005": 0.029648,
    "6131": 0.029969,
    "776": 0.04750,
    "6002": 0.103275,
    "5511": 0.017775,
    "659": 0.070973,
    "3909": 0.075426,
    "4409": 0.091913,
}

scores_from_lab6 = {
    "5993": [0.2837, 0.2048],
    "2348": [0.2048, 0.3301],
    "6005": [0.2719, 0.3882],
    "6131": [0.2201, 0.2201],
    "776": [0.2837, 0.1952],
    "6002": [0.6236, 0.4331],
    "5511": [0, 0],
    "659": [0, 0],
    "3909": [0, 0],
    "4409": [0, 0],
}

In [68]:
from prettytable import PrettyTable

nDCG_results_table = PrettyTable(['query_id', 'ndcg5', 'ndcg10', 'lab2', 'lab6_FTS', 'lab6_Reranker'])

# Look scores up by query id. Zipping the dicts pairs rows by position, which
# misaligns them because scores_5 also holds queries that are missing from lab 2.
for query_id, lab2 in scores_from_lab2.items():
    lab6 = scores_from_lab6[query_id]
    ndcg5 = scores_5[int(query_id)]
    ndcg10 = scores_10[int(query_id)]
    nDCG_results_table.add_row([query_id, round(ndcg5, 4), round(ndcg10, 4), lab2, lab6[0], lab6[1]])

In [69]:
print(nDCG_results_table)

+----------+--------+--------+----------+----------+---------------+
| query_id | ndcg5  | ndcg10 |   lab2   | lab6_FTS | lab6_Reranker |
+----------+--------+--------+----------+----------+---------------+
|   5993   |  0.0   |  0.0   |  0.0267  |  0.2837  |     0.2048    |
|   2348   |  0.0   |  0.0   | 0.13595  |  0.2048  |     0.3301    |
|   6005   | 0.4852 | 0.3149 | 0.029648 |  0.2719  |     0.3882    |
|   6131   | 0.1696 |  0.11  | 0.029969 |  0.2201  |     0.2201    |
|   776    | 0.3392 | 0.2201 |  0.0475  |  0.2837  |     0.1952    |
|   6002   | 0.6548 | 0.4249 | 0.103275 |  0.6236  |     0.4331    |
|   5511   |  0.0   |  0.0   | 0.017775 |    0     |       0       |
|   659    | 0.5087 | 0.3301 | 0.070973 |    0     |       0       |
|   3909   |  0.0   |  0.0   | 0.075426 |    0     |       0       |
|   4409   |  0.0   |  0.0   | 0.091913 |    0     |       0       |
+----------+--------+--------+----------+----------+---------------+


7. **Bonus** (+2p) Combine dense retrieval with the classification model from [lab 6](6-classification.md) to implement a two-step
   retrieval. Compute NDCG@5 for this combined model.

8. **Bonus** (+2p) Use a different dense encoder, e.g. [E5 large](https://huggingface.co/intfloat/multilingual-e5-large) or [Polish Roberta Base](https://huggingface.co/sdadas/mmlw-retrieval-roberta-base) and compute NDCG@5.
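
The two-step retrieval from task 7 can be sketched as retrieve-then-rerank: a fast first stage fetches a larger candidate set, and a slower pairwise scorer reorders it. Everything below is a hypothetical stand-in so the sketch runs without any model: `toy_retrieve` would be the dense retriever and `toy_score` the lab 6 classifier, and the default sizes `first_k=50` and `final_k=5` are arbitrary assumptions.

```python
def two_step_retrieve(query, retrieve, score_pair, first_k=50, final_k=5):
    """Retrieve first_k candidates, then reorder them with a pairwise scorer."""
    candidates = retrieve(query, first_k)
    reranked = sorted(candidates, key=lambda doc: score_pair(query, doc), reverse=True)
    return reranked[:final_k]

# Toy corpus and stand-in stages (assumptions, not the notebook's models).
corpus = ["tax on dividends", "car insurance", "dividend tax rates in Poland"]

def toy_retrieve(query, k):
    """First stage: returns the first k corpus entries, ignoring the query."""
    return corpus[:k]

def toy_score(query, doc):
    """Second stage: counts distinct words shared by the query and the document."""
    return len(set(query.lower().split()) & set(doc.lower().split()))

print(two_step_retrieve("dividend tax", toy_retrieve, toy_score, first_k=3, final_k=2))
```

NDCG@5 for the combined model would then be computed on the reranked list in the same way as for the dense retriever alone.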

## Questions (2 points)

Disclaimer:
The Faiss document store did not work (every NDCG score was 0), so the answers below are based on `InMemoryDocumentStore`.

1. Which of the methods: lexical match (e.g. ElasticSearch) or dense representation works better?\
    **Dense representation works better overall. FTS and the reranker win on some queries, but dense retrieval reaches the highest NDCG scores.**
2. Which of the methods is faster?\
    **Including preprocessing, ElasticSearch is faster: computing the dense embeddings for the 60,000 passages took more than five hours. Answering a single query takes about the same time with both.**
3. Try to determine the other pros and cons of using lexical search and dense document retrieval models.
    - **Dense retrieval model**
        - Pros:
            - it takes meaning into account, so it can match paraphrases and more complex language constructions
        - Cons:
            - it is computationally intensive
            - it can miss exact keywords
            - it may require fine-tuning or task-specific training
    - **Lexical search**
        - Pros:
            - it finds exact word matches, which some queries require
            - it is relatively simple and fast
        - Cons:
            - it handles synonyms, misspellings and paraphrases poorly
            - it ignores the meaning of words, which can lead to useless results
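
The synonym problem of lexical search can be illustrated with a toy token-overlap scorer. This is a sketch with made-up English sentences and a hypothetical helper `lexical_overlap`; it is not how ElasticSearch scores documents. A document sharing words with the query gets a positive score, while a paraphrase with the same meaning but different words gets zero.

```python
def lexical_overlap(query, document):
    """Number of query tokens that literally occur in the document."""
    doc_tokens = set(document.lower().split())
    return sum(1 for token in query.lower().split() if token in doc_tokens)

query = "cheap car insurance"
exact = "car insurance for business travel"
paraphrase = "affordable vehicle coverage"

print(lexical_overlap(query, exact), lexical_overlap(query, paraphrase))  # 2 0
```

A dense encoder is expected to place the paraphrase close to the query in embedding space, at the cost of the embedding computation discussed above.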