## Installing Haystack

To start, let's install the latest release of Haystack with `pip`:

In [1]:
# %%bash

# pip install --upgrade pip
# pip install farm-haystack[colab]

Set the logging level to INFO for Haystack, leaving other libraries at WARNING:

In [1]:
import logging

logging.basicConfig(format="%(levelname)s - %(name)s -  %(message)s", level=logging.WARNING)
logging.getLogger("haystack").setLevel(logging.INFO)

## Initializing the ElasticsearchDocumentStore


1. Download, extract, and set the permissions for the Elasticsearch installation image:

In [2]:
import elasticsearch

In [None]:
%%bash

wget https://artifacts.elastic.co/downloads/elasticsearch/elasticsearch-7.9.2-linux-x86_64.tar.gz -q
tar -xzf elasticsearch-7.9.2-linux-x86_64.tar.gz
chown -R daemon:daemon elasticsearch-7.9.2

2. Start the server:

In [5]:
%%bash --bg

sudo -u daemon -- elasticsearch-7.9.2/bin/elasticsearch

3. Wait 30 seconds for the server to fully start up:

In [3]:
import time
time.sleep(30)
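
A fixed 30-second sleep can be too short on a slow machine and wasteful on a fast one. As a minimal sketch, the helper below polls a readiness check until it succeeds or a timeout expires. The names `wait_until_ready` and `elasticsearch_is_up`, and the URL `http://localhost:9200` (Elasticsearch's default HTTP port), are assumptions made for illustration and are not part of Haystack.

```python
import time
import urllib.error
import urllib.request


def elasticsearch_is_up(url="http://localhost:9200"):
    """Return True if an HTTP GET to the (assumed) Elasticsearch URL succeeds."""
    try:
        with urllib.request.urlopen(url, timeout=2) as response:
            return response.status == 200
    except (urllib.error.URLError, OSError):
        # Connection refused, timeout, or HTTP error: the server is not ready yet
        return False


def wait_until_ready(probe, timeout=60.0, interval=1.0):
    """Call `probe` repeatedly until it returns True or `timeout` seconds pass."""
    deadline = time.monotonic() + timeout
    while True:
        if probe():
            return True
        if time.monotonic() >= deadline:
            return False
        time.sleep(interval)
```

Calling `wait_until_ready(elasticsearch_is_up)` in place of `time.sleep(30)` would return as soon as the server answers, and return `False` if it never does within the timeout.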

4. Initialize the ElasticsearchDocumentStore:


In [4]:
from haystack.utils import launch_es

# Start Elasticsearch in a Docker container (an alternative to the manual download above)
launch_es()

In [5]:
import os
from haystack.document_stores import ElasticsearchDocumentStore

# Get the host where Elasticsearch is running, default to localhost
host = os.environ.get("ELASTICSEARCH_HOST", "localhost")

document_store = ElasticsearchDocumentStore(
    host=host,
    username="",
    password="",
    index="document"
)



## Indexing Documents with a Pipeline


1. Load the question-and-answer data, clean it, and write it to a text file:

In [6]:
import pandas as pd
import re

df = pd.read_csv("pt_question_answers.csv")

df.shape

df[["pt_title", "pt_body", "pt_answer"]]

df["text"] = "question: " + df["pt_title"] + "\n" + df["pt_body"] + "\n" + "answer: " + df["pt_answer"]
# df["text"] = "question: " + df["pt_title"] + "\n" + "answer: " + df["pt_answer"]

df = df[["text"]]

# Strip HTML tags left over from the original posts
CLEANR = re.compile(r"<.*?>")


def cleanhtml(raw_html):
    return re.sub(CLEANR, "", raw_html)


df["text"] = df["text"].apply(cleanhtml)

df["text"] = df["text"].str.lower()

df

Unnamed: 0,text
0,question: extracting the top-k value-indices from a 1-d tensor\nanswer: as o...
1,question: how to display custom images in tensorboard (e.g. matplotlib plots...
2,"question: python wheels: cp27mu not supported\nanswer: yes, that is possible..."
3,question: loading torch7 trained models (.t7) in pytorch\nanswer: view() res...
4,question: pytorch: how to use dataloaders for custom datasets\nanswer: while...
...,...
14588,question: how to disable neptune callback in transformers trainer runs?\nans...
14589,question: bgr to rgb for cub_200 images by image.split()\nanswer: i would st...
14590,question: neural networks extending learning domain\nanswer: what you want i...
14591,question: how do i multiply tensors like this?\nanswer: you should familiari...
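
The tag-stripping regex removes markup such as `<p>` but leaves HTML entities such as `&gt;` and `&quot;` in the text. As a minimal sketch, the standard library's `html.unescape` can decode them after the tags are removed. The names `clean_html` and `TAG_PATTERN` are hypothetical and not part of the notebook.

```python
import html
import re

# Matches anything that looks like an HTML tag, e.g. <p> or </code>
TAG_PATTERN = re.compile(r"<[^>]+>")


def clean_html(raw_html: str) -> str:
    """Remove HTML tags, then decode entities such as &gt; and &quot;."""
    # Strip tags first so that decoded "<" and ">" characters are not
    # mistaken for markup and removed.
    without_tags = TAG_PATTERN.sub("", raw_html)
    return html.unescape(without_tags)


print(clean_html("<p>x &gt; 0 &amp; &quot;ok&quot;</p>"))
```

Applied with `df["text"].apply(clean_html)`, this would store readable characters instead of entity codes in the indexed documents.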


In [7]:
# Write all question-answer texts into a single file for indexing
d = df["text"].tolist()
with open("data_qa.txt", "w") as file:
    file.writelines(d)


2. Initialize the pipeline, TextConverter, and PreProcessor:

In [8]:
from haystack import Pipeline
from haystack.nodes import TextConverter, PreProcessor

indexing_pipeline = Pipeline()
text_converter = TextConverter()
preprocessor = PreProcessor(
    clean_whitespace=True,
    clean_header_footer=True,
    clean_empty_lines=True,
    split_by="word",
    split_length=1024,
    split_overlap=20,
    split_respect_sentence_boundary=True,
)
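
The `split_by`, `split_length`, and `split_overlap` settings describe a sliding window over words. The sketch below illustrates that idea only: it is not Haystack's implementation, and it ignores sentence boundaries and the cleaning options. The function name `split_by_words` is hypothetical.

```python
def split_by_words(text, split_length, split_overlap):
    """Split text into chunks of `split_length` words that share `split_overlap` words."""
    if split_overlap >= split_length:
        raise ValueError("split_overlap must be smaller than split_length")

    words = text.split()
    step = split_length - split_overlap  # how far the window moves each time
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + split_length]))
        # Stop once the current window has reached the end of the text
        if start + split_length >= len(words):
            break
    return chunks


sample = " ".join(f"w{i}" for i in range(10))
for chunk in split_by_words(sample, split_length=4, split_overlap=1):
    print(chunk)
```

With `split_length=1024` and `split_overlap=20`, consecutive chunks would share their last and first 20 words under this simplified scheme.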


In [9]:
import os

indexing_pipeline.add_node(component=text_converter, name="TextConverter", inputs=["File"])
indexing_pipeline.add_node(component=preprocessor, name="PreProcessor", inputs=["TextConverter"])
indexing_pipeline.add_node(component=document_store, name="DocumentStore", inputs=["PreProcessor"])


3. Run the indexing pipeline to write the text data into the DocumentStore:

In [10]:
files_to_index = ['data_qa.txt']
indexing_pipeline.run_batch(file_paths=files_to_index)

INFO - haystack.pipelines.base -  It seems that an indexing Pipeline is run, so using the nodes' run method instead of run_batch.


Converting files:   0%|          | 0/1 [00:00<?, ?it/s]

Preprocessing:   0%|          | 0/1 [00:00<?, ?docs/s]



{'documents': [<Document: {'content': 'question: extracting the top-k value-indices from a 1-d tensor\nanswer: as of pull request #496 torch now includes a built-in api named torch.topk. example:\n\n&gt; t = torch.tensor{9, 1, 8, 2, 7, 3, 6, 4, 5}\n\n-- obtain the 3 smallest elements\n&gt; res = t:topk(3)\n&gt; print(res)\n1\n2\n3\n[torch.doubletensor of size 3]\n\n-- you can also get the indices in addition\n&gt; res, ind = t:topk(3)\n&gt; print(ind)\n2\n4\n6\n[torch.longtensor of size 3]\n\n-- alternatively you can obtain the k largest elements as follow\n-- (see the api documentation for more details)\n&gt; res = t:topk(3, true)\n&gt; print(res)\n9\n8\n7\n[torch.doubletensor of size 3]\n\nat the time of writing the cpu implementation follows a sort and narrow approach (there are plans to improve it in the future). that being said an optimized gpu implementation for cutorch is currently being reviewed.\nquestion: how to display custom images in tensorboard (e.g. matplotlib plots)?\na

## Initializing the Retriever


## BM25Retriever and FARMReader (RoBERTa)

In [11]:
from haystack.nodes import BM25Retriever, EmbeddingRetriever
from haystack.nodes import FARMReader, TransformersReader
from haystack import Pipeline


bm_retriever = BM25Retriever(document_store=document_store)
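
`BM25Retriever` delegates scoring to Elasticsearch, so no scoring code is needed here. To show what a BM25 ranking rewards, the sketch below applies a textbook form of the formula to a toy corpus. The function name `bm25_scores`, the whitespace tokenization, and the values `k1=1.2` and `b=0.75` are assumptions for illustration and may differ from what the server actually does.

```python
import math
from collections import Counter


def bm25_scores(query, documents, k1=1.2, b=0.75):
    """Score each document against the query with a textbook BM25 formula."""
    tokenized = [doc.lower().split() for doc in documents]
    n_docs = len(tokenized)
    if n_docs == 0:
        return []
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs

    # Document frequency: in how many documents each term appears
    doc_freq = Counter()
    for tokens in tokenized:
        doc_freq.update(set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue
            # Rare terms get a higher inverse document frequency
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            # Longer-than-average documents are penalized through b
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


docs = [
    "save a trained model with torch save",
    "reshape a tensor with view",
    "plot images in tensorboard",
]
print(bm25_scores("save model", docs))
```

Only documents that share terms with the query receive a non-zero score, which is why keyword retrieval can miss passages that answer a question in different words.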

In [12]:
farm_reader_roberta = FARMReader(model_name_or_path="deepset/roberta-base-squad2", use_gpu=True)

INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1
INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1
INFO - haystack.modeling.model.language_model -   * LOADING MODEL: 'deepset/roberta-base-squad2' (Roberta)
INFO - haystack.modeling.model.language_model -  Auto-detected model language: english
INFO - haystack.modeling.model.language_model -  Loaded 'deepset/roberta-base-squad2' (Roberta model) from model hub.
INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1


In [13]:
## Creating the Retriever-Reader Pipeline

bm25_querying_pipeline = Pipeline()
bm25_querying_pipeline.add_node(component=bm_retriever, name="Retriever", inputs=["Query"])
bm25_querying_pipeline.add_node(component=farm_reader_roberta, name="Reader", inputs=["Retriever"])


## EmbeddingRetriever (sentence-transformers) and FARMReader (ALBERT xxlarge v1)

In [14]:
embedding_retriever = EmbeddingRetriever(
    document_store=document_store,
    embedding_model="sentence-transformers/multi-qa-mpnet-base-dot-v1",
    model_format="sentence_transformers",
)

INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1
INFO - haystack.nodes.retriever.dense -  Init retriever using embeddings of model sentence-transformers/multi-qa-mpnet-base-dot-v1
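
An embedding retriever ranks documents by comparing a query vector with document vectors. The model name used here suggests it was tuned for dot-product similarity, so the sketch below ranks toy vectors by dot product. The vectors and the function names `dot` and `rank_by_dot_product` are made up for illustration; real embeddings come from the model and have hundreds of dimensions.

```python
def dot(u, v):
    """Dot product of two equal-length vectors."""
    return sum(a * b for a, b in zip(u, v))


def rank_by_dot_product(query_vector, document_vectors, top_k=5):
    """Return (index, score) pairs for the top_k highest dot products."""
    scored = [(i, dot(query_vector, vec)) for i, vec in enumerate(document_vectors)]
    scored.sort(key=lambda pair: pair[1], reverse=True)
    return scored[:top_k]


# Toy 2-dimensional "embeddings" standing in for real model output
document_vectors = [[1.0, 0.0], [0.5, 0.5], [0.0, 1.0]]
print(rank_by_dot_product([1.0, 0.2], document_vectors, top_k=2))
```

In Haystack 1.x, dense retrievers generally rely on document embeddings that have already been stored, typically through `document_store.update_embeddings(retriever)`. That step is not shown in this notebook, so it is assumed to have been run elsewhere.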


In [15]:
farm_reader_albert = FARMReader(model_name_or_path="ahotrod/albert_xxlargev1_squad2_512", use_gpu=True)

INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1
INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1
INFO - haystack.modeling.model.language_model -   * LOADING MODEL: 'ahotrod/albert_xxlargev1_squad2_512' (Albert)
INFO - haystack.modeling.model.language_model -  Auto-detected model language: english
INFO - haystack.modeling.model.language_model -  Loaded 'ahotrod/albert_xxlargev1_squad2_512' (Albert model) from model hub.
INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1


In [16]:
embedding_querying_pipeline = Pipeline()
embedding_querying_pipeline.add_node(component=embedding_retriever, name="Retriever", inputs=["Query"])
embedding_querying_pipeline.add_node(component=farm_reader_albert, name="Reader", inputs=["Retriever"])


## EmbeddingRetriever (sentence-transformers) and FARMReader (ALBERT xxlarge v2)

In [17]:
farm_reader_albertv2 = FARMReader(model_name_or_path="mfeb/albert-xxlarge-v2-squad2", use_gpu=True)
## Creating the Retriever-Reader Pipeline

embedding_querying_pipeline1 = Pipeline()
embedding_querying_pipeline1.add_node(component=embedding_retriever, name="Retriever", inputs=["Query"])
embedding_querying_pipeline1.add_node(component=farm_reader_albertv2, name="Reader", inputs=["Retriever"])


INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1
INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1
INFO - haystack.modeling.model.language_model -   * LOADING MODEL: 'mfeb/albert-xxlarge-v2-squad2' (Albert)
INFO - haystack.modeling.model.language_model -  Auto-detected model language: english
INFO - haystack.modeling.model.language_model -  Loaded 'mfeb/albert-xxlarge-v2-squad2' (Albert model) from model hub.
INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1


## EmbeddingRetriever (sentence-transformers) and TransformersReader (ALBERT xxlarge v2)

In [18]:
transformers_reader_albertv2 = TransformersReader(model_name_or_path="mfeb/albert-xxlarge-v2-squad2", use_gpu=True)
## Creating the Retriever-Reader Pipeline

embedding_querying_pipeline2 = Pipeline()
embedding_querying_pipeline2.add_node(component=embedding_retriever, name="Retriever", inputs=["Query"])
embedding_querying_pipeline2.add_node(component=transformers_reader_albertv2, name="Reader", inputs=["Retriever"])


INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1


## DensePassageRetriever and FARMReader

In [19]:
from haystack.document_stores import FAISSDocumentStore
from haystack.nodes import DensePassageRetriever

#dpr_document_store = FAISSDocumentStore(similarity="dot_product")

dpr_retriever = DensePassageRetriever(
    document_store=document_store,
    query_embedding_model="facebook/dpr-question_encoder-single-nq-base",
    passage_embedding_model="facebook/dpr-ctx_encoder-single-nq-base"
)

farm_reader_albertv2 = FARMReader(model_name_or_path="mfeb/albert-xxlarge-v2-squad2", use_gpu=True)

dense_querying_pipeline = Pipeline()
dense_querying_pipeline.add_node(component=dpr_retriever, name="Retriever", inputs=["Query"])
dense_querying_pipeline.add_node(component=farm_reader_albertv2, name="Reader", inputs=["Retriever"])


INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1
INFO - haystack.modeling.model.language_model -  Auto-detected model language: english
The tokenizer class you load from this checkpoint is not the same type as the class this function is called from. It may result in unexpected tokenization. 
The tokenizer class you load from this checkpoint is 'DPRQuestionEncoderTokenizer'. 
The class this function is called from is 'DPRContextEncoderTokenizerFast'.
INFO - haystack.modeling.model.language_model -  Auto-detected model language: english
INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1
INFO - haystack.modeling.utils -  Using devices: CUDA:0 - Number of GPUs: 1
INFO - haystack.modeling.model.language_model -   * LOADING MODEL: 'mfeb/albert-xxlarge-v2-squad2' (Albert)
INFO - haystack.modeling.model.language_model -  Auto-detected model language: english
INFO - haystack.modeling.model.language_model -  Loaded 'mfeb/albert-xxlarge-v2-squad

## Asking a Question


In [33]:
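from pprint import pprint


def get_answer(querying_pipeline, query):
    """Run the query through the pipeline and return the text of the top answer."""
    prediction = querying_pipeline.run(
        query=query,
        params={
            "Retriever": {"top_k": 5},  # number of documents passed to the Reader
            "Reader": {"top_k": 1, "debug": True},  # keep only the best answer
        },
    )

    return prediction["answers"][0].answer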
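
Indexing `prediction["answers"][0]` raises an `IndexError` if a pipeline returns no answers. As a hedged sketch, the helper below returns a default value instead. The name `top_answer_text` is hypothetical, and the `SimpleNamespace` objects are stand-ins that only mimic the `.answer` attribute used above, not real Haystack objects.

```python
from types import SimpleNamespace


def top_answer_text(prediction, default=None):
    """Return the text of the first answer, or `default` when there is none."""
    answers = prediction.get("answers") or []
    if not answers:
        return default
    return answers[0].answer


# Stand-in predictions that only mimic the fields used above
with_answer = {"answers": [SimpleNamespace(answer="torch.cuda.is_available()")]}
without_answer = {"answers": []}

print(top_answer_text(with_answer))
print(top_answer_text(without_answer, default=""))
```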
    
    
    
    

In [21]:
top_10_questions = pd.read_csv("top100questions.csv").iloc[:10].question.tolist()

top_10_questions

['How do I check if PyTorch is using the GPU?\n',
 'How do I save a trained model in PyTorch?\n',
 'What does .view() do in PyTorch?\n',
 'Why do we need to call zero_grad() in PyTorch?\n',
 'How do I print the model summary in PyTorch?\n',
 'How do I initialize weights in PyTorch?\n',
 'What does model.eval() do in pytorch?\n',
 "What's the difference between reshape and view in pytorch?\n",
 'What does model.train() do in PyTorch?\n',
 'What does .contiguous() do in PyTorch?\n']

In [22]:
## bm25_querying_pipeline - BM25Retriever and FarmReader roberta

bm25_roberta = []
for query in top_10_questions:
    answer = get_answer(bm25_querying_pipeline, query)
    
    print("Query: ", query)
    print("Answer: ", answer)
    bm25_roberta.append(answer)
    print("\n\n\n")

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]



Query:  How do I check if PyTorch is using the GPU?

Answer:  identify the model of your graphics card






Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]



Query:  How do I save a trained model in PyTorch?

Answer:  however you want






Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]



Query:  What does .view() do in PyTorch?

Answer:  all the elements






Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]



Query:  Why do we need to call zero_grad() in PyTorch?

Answer:  when we want to &quot;conserve&quot; ram with massive datasets






Inferencing Samples:   0%|          | 0/3 [00:00<?, ? Batches/s]



Query:  How do I print the model summary in PyTorch?

Answer:  batch_size






Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]



Query:  How do I initialize weights in PyTorch?

Answer:  randomly






Inferencing Samples:   0%|          | 0/3 [00:00<?, ? Batches/s]



Query:  What does model.eval() do in pytorch?

Answer:  changes






Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]



Query:  What's the difference between reshape and view in pytorch?

Answer:  flatten a tensor






Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]



Query:  What does model.train() do in PyTorch?

Answer:  run multiple times and print output






Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  What does .contiguous() do in PyTorch?

Answer:  to convert non-contiguous tensors to contiguous tensors






In [24]:
## embedding_querying_pipeline - EmbeddingRetriever using sentence-transformers and FarmReader albert xxlargev1

embedding_albertxxlv1 = []
for query in top_10_questions:
    answer = get_answer(embedding_querying_pipeline, query)
    
    print("Query: ", query)
    print("Answer: ", answer)
    embedding_albertxxlv1.append(answer)
    print("\n\n\n")

Batches:   0%|          | 0/1 [00:00<?, ?it/s]



Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  How do I check if PyTorch is using the GPU?

Answer:  torch.cuda.is_available()






Batches:   0%|          | 0/1 [00:00<?, ?it/s]



Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  How do I save a trained model in PyTorch?

Answer:  you can directly save the model itself, or you can save a dictionary that includes multiple models






Batches:   0%|          | 0/1 [00:00<?, ?it/s]



Inferencing Samples:   0%|          | 0/12 [00:00<?, ? Batches/s]

ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-12202, -12171) with a span answer. 
ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-21231, -21199) with a span answer. 


Query:  What does .view() do in PyTorch?

Answer:  creates a view with different dimensions of the storage associated with tensor






Batches:   0%|          | 0/1 [00:00<?, ?it/s]



Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  Why do we need to call zero_grad() in PyTorch?

Answer:  reduce memory consumption






Batches:   0%|          | 0/1 [00:00<?, ?it/s]



Inferencing Samples:   0%|          | 0/3 [00:00<?, ? Batches/s]

Query:  How do I print the model summary in PyTorch?

Answer:  #print(cost(x,y,beta))






Batches:   0%|          | 0/1 [00:00<?, ?it/s]



Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  How do I initialize weights in PyTorch?

Answer:  self.weight = torch.nn.linear(in_features, out_featues)






Batches:   0%|          | 0/1 [00:00<?, ?it/s]



Inferencing Samples:   0%|          | 0/11 [00:00<?, ? Batches/s]

ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-28611, -28593) with a span answer. 
ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-28999, -28992) with a span answer. 
ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-1801, -1784) with a span answer. 


Query:  What does model.eval() do in pytorch?

Answer:  fix the parameters of bn






Batches:   0%|          | 0/1 [00:00<?, ?it/s]



Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  What's the difference between reshape and view in pytorch?

Answer:  explicit exception






Batches:   0%|          | 0/1 [00:00<?, ?it/s]



Inferencing Samples:   0%|          | 0/18 [00:00<?, ? Batches/s]

ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-884, -848) with a span answer. 


Query:  What does model.train() do in PyTorch?

Answer:  train(model, batch)






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  What does .contiguous() do in PyTorch?

Answer:  a no-op
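
The `INFO - haystack.schema` message is printed again for every query in the output above. A minimal sketch of one way to quiet it, assuming the message really is emitted by the logger named `haystack.schema` (the name shown in the log prefix) and that the other Haystack INFO messages should stay visible:

```python
import logging

# Keep Haystack's INFO messages in general, as configured at the top of the notebook.
logging.getLogger("haystack").setLevel(logging.INFO)

# Raise the threshold only for the child logger that prints the
# "Setting the ID manually" message, so ERROR lines from other
# Haystack modules are still shown.
logging.getLogger("haystack.schema").setLevel(logging.WARNING)
```

Run this before the querying loops. It only changes what is logged, not the answers the pipelines return.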






In [25]:
# embedding_querying_pipeline1: EmbeddingRetriever (sentence-transformers) + FARMReader (ALBERT xxlarge v2)

embedding_albertxxlv2 = []  # answers, in the same order as top_10_questions
for query in top_10_questions:
    answer = get_answer(embedding_querying_pipeline1, query)

    print("Query: ", query)
    print("Answer: ", answer)
    embedding_albertxxlv2.append(answer)
    print("\n\n\n")

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  How do I check if PyTorch is using the GPU?

Answer:  torch.cuda.is_available()






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  How do I save a trained model in PyTorch?

Answer:  checkpoint = {'state_dict': model.state_dict(),'optimizer' :optimizer.state_dict()}
torch.save(checkpoint, 'checkpoint.pth')






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/12 [00:00<?, ? Batches/s]

ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-19759, -19229) with a span answer. 
ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-16439, -16416) with a span answer. 


Query:  What does .view() do in PyTorch?

Answer:  reshapes the tensor to a






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  Why do we need to call zero_grad() in PyTorch?

Answer:  you are passing the map_location to the wrong function






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/3 [00:00<?, ? Batches/s]

Query:  How do I print the model summary in PyTorch?

Answer:  summary(self.model, (1, 34, 8))






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  How do I initialize weights in PyTorch?

Answer:  from the weights of the pretrained model






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/11 [00:00<?, ? Batches/s]

ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-21637, -21597) with a span answer. 
ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-11030, -10964) with a span answer. 
ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-18328, -18256) with a span answer. 


Query:  What does model.eval() do in pytorch?

Answer:  fix the parameters of bn






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  What's the difference between reshape and view in pytorch?

Answer:  .contiguous().view(shape) will create a copy






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/18 [00:00<?, ? Batches/s]

ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-19817, -19772) with a span answer. 
ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-31681, -31663) with a span answer. 


Query:  What does model.train() do in PyTorch?

Answer:  train(model, batch)






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

Query:  What does .contiguous() do in PyTorch?

Answer:  tensor.permute(2,0,1,3).contiguous()






In [26]:
# embedding_querying_pipeline2: EmbeddingRetriever (sentence-transformers) + TransformersReader (ALBERT xxlarge v2)

embedding_transformersReader = []  # answers, in the same order as top_10_questions
for query in top_10_questions:
    answer = get_answer(embedding_querying_pipeline2, query)

    print("Query: ", query)
    print("Answer: ", answer)
    embedding_transformersReader.append(answer)
    print("\n\n\n")

Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  How do I check if PyTorch is using the GPU?

Answer:   torch.cuda.is_available()






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  How do I save a trained model in PyTorch?

Answer:  
is it possible to save a file from test_step() function?






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  What does .view() do in PyTorch?

Answer:   reshapes the tensor to a different but compatible shape.






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  Why do we need to call zero_grad() in PyTorch?

Answer:   too many indices for array"






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  How do I print the model summary in PyTorch?

Answer:   summary(self.model, (1, 34, 8))






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  How do I initialize weights in PyTorch?

Answer:   randomly:






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  What does model.eval() do in pytorch?

Answer:   fix the parameters of bn,






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  What's the difference between reshape and view in pytorch?

Answer:   z0 is a new view of x, but z1 is a copy:






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  What does model.train() do in PyTorch?

Answer:  
train(model, batch)






Batches:   0%|          | 0/1 [00:00<?, ?it/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  What does .contiguous() do in PyTorch?

Answer:   essentially a no-op,
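
Scrolling between the two outputs above makes the readers hard to compare. The sketch below lines up the collected answers per query. `compare_answers` is a hypothetical helper, not part of Haystack, and it assumes `get_answer` returns a string or `None`. The sample data is two queries copied from the outputs above; in the notebook you would pass `top_10_questions` and the full answer lists instead.

```python
def compare_answers(questions, answers_by_pipeline):
    """Return one dict per question holding every pipeline's answer."""
    rows = []
    for i, question in enumerate(questions):
        row = {"query": question}
        for name, answers in answers_by_pipeline.items():
            # Tolerate a shorter list (interrupted loop) or a None answer.
            answer = answers[i] if i < len(answers) else None
            row[name] = (answer or "").strip()
        rows.append(row)
    return rows


# Two queries and their answers, copied from the outputs above.
questions = [
    "How do I check if PyTorch is using the GPU?",
    "What does .view() do in PyTorch?",
]
rows = compare_answers(
    questions,
    {
        "embedding_albertxxlv2": [
            "torch.cuda.is_available()",
            "reshapes the tensor to a",
        ],
        "embedding_transformersReader": [
            " torch.cuda.is_available()",
            " reshapes the tensor to a different but compatible shape.",
        ],
    },
)

# Print each query followed by the answer from every pipeline.
for row in rows:
    print(row["query"])
    for name, answer in row.items():
        if name != "query":
            print(f"  {name}: {answer}")
```

Stripping whitespace before comparing matters here because the TransformersReader answers above carry a leading space that the FARMReader answers do not.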






In [27]:
# dense_querying_pipeline: DensePassageRetriever + FARMReader

dense_retriever = []  # answers, in the same order as top_10_questions
for query in top_10_questions:
    answer = get_answer(dense_querying_pipeline, query)

    print("Query: ", query)
    print("Answer: ", answer)
    dense_retriever.append(answer)
    print("\n\n\n")

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  How do I check if PyTorch is using the GPU?

Answer:  compare

import numpy as np
a = np.ones(5)
b = a

followed by either

np.add(a, 1, out=a)
print(b)

or

a = a + 1
print(b)






Inferencing Samples:   0%|          | 0/18 [00:00<?, ? Batches/s]

ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-10955, -10920) with a span answer. 
ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-28931, -28892) with a span answer. 
INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.

Query:  How do I save a trained model in PyTorch?

Answer:  _singleprocessdataloaderiter' object is not callabl






Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  What does .view() do in PyTorch?

Answer:  sigma.view(out_n, 1, 1, 1).repeat(out_n, in_c, out_h, out_w)






Inferencing Samples:   0%|          | 0/10 [00:00<?, ? Batches/s]

ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-32467, -32438) with a span answer. 
ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-8114, -8068) with a span answer. 
INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.

Query:  Why do we need to call zero_grad() in PyTorch?

Answer:  # pred is also a non-leaf tensor






Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  How do I print the model summary in PyTorch?

Answer:  -> onnx -> coreml






Inferencing Samples:   0%|          | 0/10 [00:00<?, ? Batches/s]

ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-26665, -26589) with a span answer. 
ERROR - haystack.modeling.model.predictions -  Invalid end offset: 
(-25977, -25887) with a span answer. 
INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.

Query:  How do I initialize weights in PyTorch?

Answer:  y in numericalize(self, arr, device)
321                 arr = self.postproc






Inferencing Samples:   0%|          | 0/2 [00:00<?, ? Batches/s]

INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  What does model.eval() do in pytorch?

Answer:  requires input dimension to be correctly put






INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  What's the difference between reshape and view in pytorch?

Answer:  this method retrain all weights






INFO - haystack.schema -  Setting the ID manually. This might cause a mismatch with the ID that would be generated from the document content and id_hash_keys value.


Query:  What does model.train() do in PyTorch?

Answer:  pytorch already has built in pruning related package






Query:  What does .contiguous() do in PyTorch?

Answer:  variable(torch.from_numpy(nopeak_mask)






In [28]:
# Collect each pipeline's answers as one column, with one row per question.
data = {
    'questions': top_10_questions,
    'bm25_roberta': bm25_roberta,
    'embedding_albertxxlv1': embedding_albertxxlv1,
    'embedding_albertxxlv2': embedding_albertxxlv2,
    'embedding_transformersReader_v2': embedding_transformersReader,
    'dense_retriever': dense_retriever,
}
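
`pd.DataFrame` raises a `ValueError` when the lists in such a dict have different lengths, so it can help to check them before building the table. A minimal sketch, assuming every value is a plain Python list; `check_equal_lengths` and the `sample` dict are hypothetical stand-ins, not part of the notebook:

```python
def check_equal_lengths(columns):
    """Return {column name: length}; raise ValueError if the lengths differ."""
    lengths = {name: len(values) for name, values in columns.items()}
    if len(set(lengths.values())) > 1:
        raise ValueError(f"Column lengths differ: {lengths}")
    return lengths


# Hypothetical stand-in for the notebook's `data` dict.
sample = {
    "questions": ["q1", "q2"],
    "bm25_roberta": ["a1", "a2"],
    "dense_retriever": ["b1", "b2"],
}
lengths = check_equal_lengths(sample)
print(lengths)
```

If a pipeline was skipped for one of the questions, the error message shows which column is short.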

In [27]:
df1 = pd.DataFrame(data)
df1

Unnamed: 0,questions,bm25_roberta,embedding_albertxxlv1,embedding_albertxxlv2,embedding_transformersReader_v2,dense_retriever
0,How do I check if PyTorch is using the GPU?\n,identify the model of your graphics card,torch.cuda.is_available(),torch.cuda.is_available(),torch.cuda.is_available(),compare\n\nimport numpy as np\na = np.ones(5)\nb = a\n\nfollowed by either\n...
1,How do I save a trained model in PyTorch?\n,however you want,"you can directly save the model itself, or you can save a dictionary that in...","checkpoint = {'state_dict': model.state_dict(),'optimizer' :optimizer.state_...",\nis it possible to save a file from test_step() function?,_singleprocessdataloaderiter' object is not callabl
2,What does .view() do in PyTorch?\n,expects the new shape to be provided by individual int arguments,creates a view with different dimensions of the storage associated with tensor,reshapes the tensor to a,reshapes the tensor to a different but compatible shape.,"sigma.view(out_n, 1, 1, 1).repeat(out_n, in_c, out_h, out_w)"
3,Why do we need to call zero_grad() in PyTorch?\n,when we want to &quot;conserve&quot; ram with massive datasets,reduce memory consumption,you are passing the map_location to the wrong function,"too many indices for array""",# pred is also a non-leaf tensor
4,How do I print the model summary in PyTorch?\n,forward_pass,"#print(cost(x,y,beta))","summary(self.model, (1, 34, 8))","summary(self.model, (1, 34, 8))",-> onnx -> coreml
5,How do I initialize weights in PyTorch?\n,adjust\nnewval,"self.weight = torch.nn.linear(in_features, out_featues)",from the weights of the pretrained model,randomly:,"y in numericalize(self, arr, device)\n321 arr = self.postproc"
6,What does model.eval() do in pytorch?\n,fix the parameters of bn,fix the parameters of bn,fix the parameters of bn,"fix the parameters of bn,",requires input dimension to be correctly put
7,What's the difference between reshape and view in pytorch?\n,two different methods,explicit exception,.contiguous().view(shape) will create a copy,"z0 is a new view of x, but z1 is a copy:",this method retrain all weights
8,What does model.train() do in PyTorch?\n,run multiple times and print output,"train(model, batch)","train(model, batch)","\ntrain(model, batch)",pytorch already has built in pruning related package
9,What does .contiguous() do in PyTorch?\n,convert non-contiguous tensors to contiguous tensors,a no-op,"tensor.permute(2,0,1,3).contiguous()","essentially a no-op,",variable(torch.from_numpy(nopeak_mask)
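
One quick way to read this table is to count, per question, how many pipelines return the same answer once case, surrounding whitespace, and trailing punctuation are ignored. A minimal sketch using only the standard library; the `normalize` and `agreement` helpers are hypothetical, and the two rows are copied from rows 6 and 8 of the table:

```python
from collections import Counter


def normalize(answer):
    """Lowercase, trim whitespace, and drop trailing , . ; : characters."""
    return answer.strip().lower().rstrip(",.;:")


def agreement(row):
    """Return (most common normalized answer, number of pipelines giving it)."""
    counts = Counter(normalize(answer) for answer in row.values())
    return counts.most_common(1)[0]


# Row 6: "What does model.eval() do in pytorch?"
row_6 = {
    "bm25_roberta": "fix the parameters of bn",
    "embedding_albertxxlv1": "fix the parameters of bn",
    "embedding_albertxxlv2": "fix the parameters of bn",
    "embedding_transformersReader_v2": "fix the parameters of bn,",
    "dense_retriever": "requires input dimension to be correctly put",
}

# Row 8: "What does model.train() do in PyTorch?"
row_8 = {
    "bm25_roberta": "run multiple times and print output",
    "embedding_albertxxlv1": "train(model, batch)",
    "embedding_albertxxlv2": "train(model, batch)",
    "embedding_transformersReader_v2": "\ntrain(model, batch)",
    "dense_retriever": "pytorch already has built in pruning related package",
}

result_6 = agreement(row_6)
result_8 = agreement(row_8)
print(result_6)  # ('fix the parameters of bn', 4)
print(result_8)  # ('train(model, batch)', 3)
```

Agreement is only a rough signal, not a measure of correctness: in row 8, three pipelines agree on `train(model, batch)`, which does not explain what `model.train()` does.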


In [28]:
# Save the comparison table to disk; the DataFrame index is written as the first column.
df1.to_csv('haystack_comparison.csv')