Update docstring in DPR for embed_title (#459)
tholor committed Oct 2, 2020
1 parent 9b58374 commit 029d1b7
Showing 1 changed file with 6 additions and 1 deletion.
7 changes: 6 additions & 1 deletion haystack/retriever/dense.py
@@ -50,7 +50,12 @@ def __init__(self,
 :param max_seq_len: Longest length of each sequence
 :param use_gpu: Whether to use gpu or not
 :param batch_size: Number of questions or passages to encode at once
-:param embed_title: Whether to concatenate title and passage to a text pair that is then used to create the embedding
+:param embed_title: Whether to concatenate title and passage to a text pair that is then used to create the embedding.
+This is the approach used in the original paper and is likely to improve performance if your
+titles contain meaningful information for retrieval (topic, entities etc.) .
+The title is expected to be present in doc.meta["name"] and can be supplied in the documents
+before writing them to the DocumentStore like this:
+{"text": "my text", "meta": {"name": "my title"}}.
 :param remove_sep_tok_from_untitled_passages: If embed_title is ``True``, there are different strategies to deal with documents that don't have a title.
 If this param is ``True`` => Embed passage as single text, similar to embed_title = False (i.e [CLS] passage_tok1 ... [SEP]).
 If this param is ``False`` => Embed passage as text pair with empty title (i.e. [CLS] [SEP] passage_tok1 ... [SEP])
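
The token layouts described in the docstring above can be sketched in plain Python. This is only an illustration, not the code in haystack/retriever/dense.py: the function name `sketch_passage_layout` is hypothetical, and whitespace splitting stands in for the real tokenizer. Only the document shape ({"text": ..., "meta": {"name": ...}}) and the two parameter names are taken from the docstring.

```python
from typing import Dict, List


def sketch_passage_layout(
    doc: Dict,
    embed_title: bool = True,
    remove_sep_tok_from_untitled_passages: bool = True,
) -> List[str]:
    """Return the token layout for one passage (hypothetical helper).

    Whitespace splitting is an assumption standing in for a real tokenizer.
    """
    passage_tokens = doc["text"].split()
    # The docstring says the title is expected in doc.meta["name"].
    title = (doc.get("meta") or {}).get("name")

    if not embed_title:
        # Single text: [CLS] passage ... [SEP]
        return ["[CLS]", *passage_tokens, "[SEP]"]
    if title:
        # Text pair: [CLS] title ... [SEP] passage ... [SEP]
        return ["[CLS]", *title.split(), "[SEP]", *passage_tokens, "[SEP]"]
    if remove_sep_tok_from_untitled_passages:
        # Untitled passage embedded as single text, as with embed_title=False
        return ["[CLS]", *passage_tokens, "[SEP]"]
    # Untitled passage embedded as a text pair with an empty title
    return ["[CLS]", "[SEP]", *passage_tokens, "[SEP]"]


titled = {"text": "my text", "meta": {"name": "my title"}}
untitled = {"text": "my text", "meta": {}}

print(sketch_passage_layout(titled))
print(sketch_passage_layout(untitled, remove_sep_tok_from_untitled_passages=True))
print(sketch_passage_layout(untitled, remove_sep_tok_from_untitled_passages=False))
```

The three calls show the titled text pair, the untitled passage as single text, and the untitled passage with an empty title slot, in that order.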
