Unclear what differences if any exist in handling trainpref/validpref/testpref #2565

munael · 2020-09-03T12:54:09Z

📚 Documentation

See: https://fairseq.readthedocs.io/en/latest/command_line_tools.html#Preprocessing

| Flag | Description |
| --- | --- |
| `--trainpref` | train file prefix |
| `--validpref` | comma separated, valid file prefixes |
| `--testpref` | comma separated, test file prefixes |

Besides that, does fairseq-preprocess do any extra processing of the data (that is, besides binarization) for any of the splits? Is any processing different between the splits?


myleott · 2020-09-04T00:40:40Z

Good point, I'll add some docs. `--trainpref` produces the dictionary (`dict.txt`). `--validpref` and `--testpref` use that dictionary, and any words missing from it are replaced with `<unk>`. There is no difference between `--validpref` and `--testpref` other than the output filenames.
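To make the behavior described above concrete, here is a toy sketch in plain Python. It does not call fairseq: `build_vocab` and `apply_unk` are hypothetical helpers written only for this illustration, and the real dictionary likely involves other special symbols and options that this ignores. The only point is that the vocabulary comes from the training split alone, and the valid and test splits are mapped against it in exactly the same way.

```python
UNK = "<unk>"

def build_vocab(train_lines):
    """Build a token -> index map from the training split only (toy version)."""
    vocab = {UNK: 0}
    for line in train_lines:
        for tok in line.split():
            # Assign the next free index the first time a token is seen.
            vocab.setdefault(tok, len(vocab))
    return vocab

def apply_unk(line, vocab):
    """Replace every token that is not in the vocabulary with <unk>."""
    return " ".join(tok if tok in vocab else UNK for tok in line.split())

def encode(line, vocab):
    """Map a line to indices; out-of-vocabulary tokens get the <unk> index."""
    return [vocab.get(tok, vocab[UNK]) for tok in line.split()]

# Hypothetical data standing in for the files behind each prefix.
train = ["the cat sat", "the dog sat"]
valid = ["the bird sat"]
test = ["a cat ran"]

vocab = build_vocab(train)  # only the training split contributes words

# Valid and test go through the identical code path.
valid_mapped = [apply_unk(line, vocab) for line in valid]
test_mapped = [apply_unk(line, vocab) for line in test]

print(valid_mapped)  # ['the <unk> sat']
print(test_mapped)   # ['<unk> cat <unk>']
```

In this sketch, "bird", "a" and "ran" never appear in the training data, so they become `<unk>`; nothing else distinguishes how the valid and test lines are handled.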

Summary:
- Rename type -> key in fairseq/tasks/sentence_prediction.py (fixes facebookresearch/fairseq#2746)
- Update preprocessing docs (fixes facebookresearch/fairseq#2565)
- Turn off logging in test_fp16_optimizer.TestGradientScaling
- Documentation updates
- Remove some unused code
- Fix noisychannel example (fixes facebookresearch/fairseq#2213)

Pull Request resolved: facebookresearch/fairseq#2786
Reviewed By: shruti-bh
Differential Revision: D24515146
Pulled By: myleott
fbshipit-source-id: 86b0f5516c57610fdca801c60e58158ef052fc3a

munael added documentation needs triage labels Sep 3, 2020

myleott removed the needs triage label Sep 4, 2020

myleott self-assigned this Sep 4, 2020

myleott added a commit that referenced this issue Oct 23, 2020

Update preprocessing docs (fixes #2565)

34b6501

myleott mentioned this issue Oct 23, 2020

Misc fixes #2786

Closed

myleott added a commit that referenced this issue Oct 26, 2020

Update preprocessing docs (fixes #2565)

135ba51

facebook-github-bot closed this as completed in 1bc83c7 Oct 27, 2020
