Training script for multi context training from ConveRT #14
Added a training script for the multi-context conversational model described in the ConveRT paper. The code is adapted from @vasudevgupta7's code-search training script. The losses are updated to cover the 3 objectives mentioned in the paper.
However, the paper doesn't mention how the three losses are combined (weighted or simple average).
I have used a simple average for now. If there is a better way to combine them, please let me know and I can update it as needed.
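For reference, the choice between a simple and a weighted average can be sketched as below. This is a minimal illustration, not the code in this PR: `combine_losses` and its `weights` parameter are hypothetical names, and plain floats stand in for the per-objective loss values (the same arithmetic would apply to tensors).

```python
from typing import Optional, Sequence


def combine_losses(
    losses: Sequence[float],
    weights: Optional[Sequence[float]] = None,
) -> float:
    """Combine per-objective losses into one scalar.

    With weights=None this is the simple average used for now.
    Passing weights gives a weighted average instead (an assumption,
    since the paper does not say how the losses are combined).
    """
    if len(losses) == 0:
        raise ValueError("at least one loss is required")
    if weights is None:
        # Simple average: every objective contributes equally.
        return sum(losses) / len(losses)
    if len(weights) != len(losses):
        raise ValueError("weights and losses must have the same length")
    total_weight = sum(weights)
    if total_weight == 0:
        raise ValueError("weights must not sum to zero")
    # Weighted average: normalise by the sum of the weights.
    return sum(w * l for w, l in zip(weights, losses)) / total_weight


# Placeholder loss values for the three objectives.
simple = combine_losses([1.0, 2.0, 3.0])
weighted = combine_losses([1.0, 2.0, 3.0], weights=[2.0, 1.0, 1.0])
```

Keeping the weights optional means the current behaviour stays the default, and a weighting scheme can be tried later without touching the call sites.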
Past contexts are concatenated (instead of being separated by a `[SEP]` token), as mentioned in the paper and as implemented here. Contexts are sorted with the most recent context first, followed by progressively older ones.
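The ordering and concatenation described above can be sketched roughly as follows. The function name `build_past_context`, the `max_turns` parameter, and the single-space join are assumptions made for illustration; the actual script may join or truncate the contexts differently.

```python
from typing import List, Optional


def build_past_context(
    past_turns: List[str],
    max_turns: Optional[int] = None,
) -> str:
    """Concatenate past contexts with the most recent one first.

    past_turns is assumed to be in chronological order (oldest first).
    No [SEP] token is inserted; turns are joined with a single space,
    which is an assumption about how the concatenation is done.
    """
    # Reverse so the most recent turn comes first.
    ordered = list(reversed(past_turns))
    if max_turns is not None:
        # Keep only the most recent max_turns turns (hypothetical option).
        ordered = ordered[:max_turns]
    return " ".join(turn.strip() for turn in ordered)


history = ["hi there", "hello , how can i help ?", "i need a taxi"]
past_context = build_past_context(history)
# past_context == "i need a taxi hello , how can i help ? hi there"
```

Putting the most recent turn first means that any truncation to a maximum sequence length drops the oldest context rather than the newest.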
I have tested this on a GPU and the script works. I will update this PR with multi-context evaluation and sync it with the other recent changes made to the code-search training script.
Suggestions or feedback on this PR are welcome.