Open Source MLM Implementation in Fairseq #635

kartikayk · 2019-04-15T23:29:23Z

Summary: Adds a task and the models, datasets, and criteria needed to train cross-lingual language models, similar to the Masked Language Model (MLM) used in XLM (Lample and Conneau, 2019 - https://arxiv.org/abs/1901.07291).

Differential Revision: D14943776

Summary: Pull Request resolved: facebookresearch#635

Adding a task and relevant models, datasets and criteria needed for training Cross-lingual Language Models similar to Masked Language Model used in XLM (Lample and Conneau, 2019 - https://arxiv.org/abs/1901.07291).

Reviewed By: liezl200

Differential Revision: D14943776

fbshipit-source-id: 9835d82e9741c2ff9091f24cdbe4bb4be654c5a5
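
For readers unfamiliar with the masking objective the summary refers to, here is a minimal sketch in plain Python. It is not the code added by this PR. It assumes the scheme described in the XLM paper (sample roughly 15% of positions; of those, replace 80% with a mask token, 10% with a random token, and leave 10% unchanged). The function name `mask_tokens` and the parameters `vocab_size`, `mask_id`, `pad_id`, `mask_prob`, and `seed` are hypothetical and chosen only for illustration.

```python
import random


def mask_tokens(tokens, vocab_size, mask_id, pad_id, mask_prob=0.15, seed=None):
    """Illustrative MLM corruption of a list of token ids.

    Returns (inputs, targets). `targets` holds the original token at every
    position selected for prediction and `pad_id` everywhere else, so a loss
    function can ignore the unselected positions.
    """
    rng = random.Random(seed)
    inputs = list(tokens)
    targets = [pad_id] * len(tokens)

    for i, tok in enumerate(tokens):
        # Skip positions that are not selected for prediction.
        if rng.random() >= mask_prob:
            continue

        targets[i] = tok
        roll = rng.random()
        if roll < 0.8:
            # 80% of selected positions: replace with the mask token.
            inputs[i] = mask_id
        elif roll < 0.9:
            # 10%: replace with a random id. This sketch does not exclude
            # special symbols; a real implementation probably would.
            inputs[i] = rng.randrange(vocab_size)
        # Remaining 10%: keep the original token in the input.

    return inputs, targets


if __name__ == "__main__":
    example = [10, 11, 12, 13, 14, 15, 16, 17]
    corrupted, labels = mask_tokens(example, vocab_size=100, mask_id=4, pad_id=1, seed=0)
    print(corrupted)
    print(labels)
```

Using `pad_id` as the "ignore" label is one possible design choice for this sketch; the merged implementation may represent unselected positions differently.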

facebook-github-bot · 2019-04-16T23:38:20Z

This pull request has been merged in 8776928.

hanyh · 2019-04-17T15:29:35Z

@kartikayk After this PR, I get the following error. It seems some files are missing from it?

from fairseq.data.masked_lm_dataset import MaskedLMDataset
fairseq/data/masked_lm_dataset.py", line 18, in
from fairseq.data.fb_block_pair_dataset import BlockPairDataset
ModuleNotFoundError: No module named 'fairseq.data.fb_block_pair_dataset'

kartikayk · 2019-04-17T16:34:53Z

@hanyh I'm working on fixing this right now. Will send out an update soon. Sorry for the inconvenience!

stefan-it · 2019-04-17T22:10:21Z

@kartikayk Thanks for that implementation :+1: I have one question: could you also provide an example that shows a) how to load a trained model and b) how to get embeddings for each subtoken in a given sentence from that model? That would really help me :)

Summary: Pull Request resolved: facebookresearch/fairseq#635

Adding a task and relevant models, datasets and criteria needed for training Cross-lingual Language Models similar to Masked Language Model used in XLM (Lample and Conneau, 2019 - https://arxiv.org/abs/1901.07291).

Reviewed By: liezl200

Differential Revision: D14943776

fbshipit-source-id: 3e416a730303d1dd4f5b92550c78db989be27073

Add the missing step to add the arguments to the parser.

facebook-github-bot added the CLA Signed label Apr 15, 2019

kartikayk force-pushed the export-D14943776 branch from e9d0158 to 13194a0 April 16, 2019 20:37

facebook-github-bot closed this in 8776928 Apr 16, 2019

facebook-github-bot added the Merged label Apr 16, 2019

yfyeung pushed a commit to yfyeung/fairseq that referenced this pull request Dec 6, 2023

Update train.py (facebookresearch#635)

6709bf1

Add the missing step to add the arguments to the parser.
