Getting an empty gzipped phrase table #1
Hi @jsenellart @srush ,
Input: training files containing SentencePiece-tokenized sentences.
For reference, this was discussed on the forum: https://forum.opennmt.net/t/get-vmap-from-the-corpus-to-be-used-in-ctranslate2/3573
There were some issues with the training data (mostly empty lines). This is fixed by c44b9ff, which adds basic filtering.
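For anyone who wants to clean their files before running the tool, here is a minimal sketch of this kind of filtering. It is not the code from c44b9ff; the function name `filter_empty_pairs` and the sample sentences are made up for illustration, and it assumes the source and target files are line-aligned.

```python
from typing import Iterable, Iterator, Tuple


def filter_empty_pairs(
    src_lines: Iterable[str], tgt_lines: Iterable[str]
) -> Iterator[Tuple[str, str]]:
    """Yield aligned (source, target) pairs, skipping any pair with an empty side.

    Hypothetical helper for illustration; not part of the tool itself.
    """
    for src, tgt in zip(src_lines, tgt_lines):
        src, tgt = src.strip(), tgt.strip()
        # Drop the whole pair if either side has no tokens,
        # so the two outputs stay line-aligned.
        if src and tgt:
            yield src, tgt


# Made-up SentencePiece-style sample data.
pairs = list(filter_empty_pairs(
    ["▁Hello ▁world", "", "▁Good ▁morning"],
    ["▁Bonjour ▁le ▁monde", "▁Vide", "   "],
))
print(pairs)
```

Dropping the pair rather than a single line matters here: removing an empty line from only one side would shift every following sentence out of alignment.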