Standardize RoBERTa Tensorizer Vocab Creation #1113

kartikayk · 2019-11-07T03:19:46Z

Summary: As part of the Tensorizer refactor, I standardize vocab creation for the RoBERTa tensorizer, i.e., remove vocab creation from the tokenizer and bring it into the tensorizer. I also make the special tokens for RoBERTa configurable so that we don't need a separate tensorizer if we decide to train RoBERTa with different special tokens. I also revert some of the changes made in D17974656, which break loading of the fairseq vocab for all tensorizers.

Reviewed By: chenyangyu1988

Differential Revision: D18289234
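
To illustrate the idea in the summary, here is a minimal sketch of a tensorizer that owns vocab creation and takes its special tokens from a config. It is not the code in this diff: `SpecialTokens`, `build_vocab`, and `SketchTensorizer` are hypothetical names, and the default token strings and their ordering are assumed to follow the usual fairseq dictionary layout.

```python
from dataclasses import dataclass
from typing import Dict, Iterable, List, Optional


@dataclass(frozen=True)
class SpecialTokens:
    # Hypothetical config object. The defaults are an assumption based on
    # the common fairseq dictionary layout (<s>, <pad>, </s>, <unk>).
    bos: str = "<s>"
    pad: str = "<pad>"
    eos: str = "</s>"
    unk: str = "<unk>"


def build_vocab(tokens: Iterable[str], special: SpecialTokens) -> Dict[str, int]:
    """Assign ids with the special tokens first, then the remaining tokens in order."""
    vocab: Dict[str, int] = {}
    for token in [special.bos, special.pad, special.eos, special.unk, *tokens]:
        if token not in vocab:  # skip duplicates so ids stay contiguous
            vocab[token] = len(vocab)
    return vocab


class SketchTensorizer:
    """Hypothetical tensorizer: the vocab is built here, not in the tokenizer."""

    def __init__(self, tokens: Iterable[str], special: Optional[SpecialTokens] = None):
        self.special = special or SpecialTokens()
        self.vocab = build_vocab(tokens, self.special)

    def numberize(self, tokens: List[str]) -> List[int]:
        # Unknown tokens map to the configured unk id; the sequence is
        # wrapped in the configured bos/eos ids.
        unk_id = self.vocab[self.special.unk]
        ids = [self.vocab.get(token, unk_id) for token in tokens]
        return [self.vocab[self.special.bos], *ids, self.vocab[self.special.eos]]
```

With this shape, training with different special tokens only means passing a different `SpecialTokens` instance, for example `SketchTensorizer(tokens, SpecialTokens(bos="[CLS]", eos="[SEP]", pad="[PAD]", unk="[UNK]"))`, rather than writing a separate tensorizer.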

fbshipit-source-id: b0432df63a6aab3c0e2ee9ff392ff64349342599

facebook-github-bot · 2019-11-07T03:20:15Z

This pull request was exported from Phabricator. Differential Revision: D18289234

facebook-github-bot · 2019-11-07T04:14:38Z

This pull request has been merged in f9765dc.

facebook-github-bot added the CLA Signed and fb-exported labels Nov 7, 2019

facebook-github-bot closed this in f9765dc Nov 7, 2019

facebook-github-bot added the Merged label Nov 7, 2019
