Hi authors, I'm trying to replicate your FiD project and have a question about the data preprocessing.
I found that the "passages_index" files for the Natural Questions and TriviaQA datasets are simply downloaded from https://dl.fbaipublicfiles.com/FiD/data/[dataset-name].tar.gz. However, I could not find any details on how these passages_index files were generated. Are the passages just ranked in descending order of their Lucene-BM25 scores (excluding passages that do not contain the answer)? Or did you use another method to generate the passages_index?
Looking forward to your reply.
The passages we have released in our repository were obtained by distilling the reader into the retriever. The method is described here: https://arxiv.org/pdf/2012.04584.pdf. The retriever can be downloaded from the repo.
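To make the contrast with BM25 ranking concrete, here is a rough sketch of the general idea, not the repository's actual code. It assumes a dense retriever that scores each passage by the dot product between a question embedding and a passage embedding, and a distillation loss that pushes the retriever's score distribution toward relevance scores produced by the reader. The function names, the toy vectors, and the direction of the KL term are all assumptions made for illustration; the linked paper is the reference for the real procedure.

```python
import math


def softmax(scores):
    """Numerically stable softmax over a list of floats."""
    m = max(scores)
    exps = [math.exp(s - m) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]


def rank_passages(question_vec, passage_vecs):
    """Score passages by dot product and return (ranked indices, scores).

    Hypothetical helper: the embeddings here are plain lists of floats,
    standing in for the output of a trained retriever.
    """
    scores = [sum(q * p for q, p in zip(question_vec, vec)) for vec in passage_vecs]
    order = sorted(range(len(scores)), key=lambda i: scores[i], reverse=True)
    return order, scores


def distillation_loss(retriever_scores, reader_scores):
    """KL(reader || retriever) between softmax-normalized score lists.

    Assumption: reader_scores are per-passage relevance signals taken
    from the reader; the retriever is trained to match them.
    """
    target = softmax(reader_scores)
    pred = softmax(retriever_scores)
    return sum(t * math.log(t / p) for t, p in zip(target, pred) if t > 0)


# Toy example: three passages embedded in two dimensions.
order, scores = rank_passages([1.0, 0.0], [[0.2, 0.9], [0.8, 0.1], [0.5, 0.5]])
```

In this toy setup the passage order comes from the learned scores rather than from BM25 term matching, and the loss is zero only when the retriever's distribution over passages matches the reader's.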