Right now, it's possible to train DPR in a single command via the `tevatron.driver.train` module. However, evaluation requires a more complex series of commands (involving lower-level for loops), e.g. for DPR on NQ. I think it would be nicer if all this could be reduced to one or two commands.
Note the usage of `tevatron.driver.evaluate`: it keeps `driver.encode` at a lower level and backward compatible, while `evaluate` would be for higher-level usage like reproducing results. Moreover, `tevatron.driver.evaluate` could throw an error if pyserini is not available, e.g.:
ImportError: could not import pyserini, a library needed to save as format "pyserini_dpr". Please install with `pip install pyserini`
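A minimal sketch of how such a check could look, assuming a hypothetical helper named `require_module` (it is not part of tevatron or pyserini, and the proposed `tevatron.driver.evaluate` module does not exist yet). It only tests whether a top-level package is installed and raises the error message above if it is not:

```python
import importlib.util


def require_module(module_name: str, save_format: str) -> None:
    """Raise ImportError if `module_name` is not installed.

    Hypothetical helper for illustration only; `module_name` must be a
    top-level package name such as "pyserini".
    """
    # find_spec returns None when a top-level package cannot be found,
    # without actually importing it.
    if importlib.util.find_spec(module_name) is None:
        raise ImportError(
            f"could not import {module_name}, a library needed to save as "
            f'format "{save_format}". Please install with '
            f"`pip install {module_name}`"
        )


if __name__ == "__main__":
    # Assumed usage at the start of a higher-level evaluation entry point.
    require_module("pyserini", "pyserini_dpr")
```

Checking up front means the failure happens before any expensive encoding step rather than at save time.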
Hi @xhluca,
Thanks for the suggestion.
I guess one reason we keep the encoding process separate is to keep it flexible with respect to tasks (e.g. NQ/MSMARCO) and GPU/RAM resources.
I agree that the evaluation process for DPR can be simpler. Maybe we can have a simpler DPR evaluation in pyserini.
I'll take a look.