truecasing #12
Hi, no, we never used truecasing / lowercasing. This was quite popular in phrase-based SMT (PBSMT) models, but for NMT the best approach is to use BPE. BPE is typically applied to sentences with regular casing.
We used lowercasing + BPE for XNLI though, as in that case the task is sentence classification and casing is not very useful. But in MT, where you need to generate text, it is better to directly generate the correct casing, and BPE handles this very well.
@glample Moses truecasing only modifies the case of the first word in a sentence (it does not modify things like named entities, though). This reduces the sparsity of the vocabulary (why have both "Starting" and "starting" in the vocab?). With BPE you have the same issue with the first wordpiece of the first word (e.g. "Start ing" vs "start ing").
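To make the sparsity point concrete, here is a toy sketch (not the Moses or XLM pipeline): a hypothetical `truecase` helper that lowercases a sentence-initial word when its lowercase form is at least as frequent in the corpus, which collapses "Starting" and "starting" into a single vocabulary entry.

```python
# Toy illustration of how sentence-initial casing inflates the vocabulary,
# and how a frequency-based truecaser collapses the duplicate entries.
# This is a simplified sketch, not the actual Moses truecase.perl logic.
from collections import Counter

corpus = [
    "Starting the engine now",
    "We are starting the engine",
    "Starting over is hard",
    "Consider starting early",
]

def word_counts(sentences):
    counts = Counter()
    for s in sentences:
        counts.update(s.split())
    return counts

def truecase(sentence, counts):
    # Lowercase the first word if its lowercase form is at least
    # as frequent as the capitalized form; leave it otherwise
    # (so sentence-initial names like "We" keep their case here).
    words = sentence.split()
    if counts[words[0].lower()] >= counts[words[0]]:
        words[0] = words[0].lower()
    return " ".join(words)

counts = word_counts(corpus)
raw_vocab = {w for s in corpus for w in s.split()}
tc_vocab = {w for s in corpus for w in truecase(s, counts).split()}

print(sorted(raw_vocab))  # both "Starting" and "starting" appear
print(sorted(tc_vocab))   # only "starting" remains
```

The same duplication then shows up one level down in the BPE vocabulary ("Start" + "ing" vs "start" + "ing"), which is why the question of truecasing before BPE comes up at all.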
Yes, you could use truecasing in combination with BPE, but it probably wouldn't make a big difference. It's also nice to limit the number of preprocessing steps in practice; I guess that is also why people don't use truecasing anymore. But it certainly wouldn't hurt to use it.
@glample I didn't realize that people don't use truecasing anymore, so I will look more into it. Thanks for pointing that out!
Hi,
did you do truecasing/lowercasing in your MT experiments? I can't find any sign of it in the code.
Is there any specific reason to do / not do it?
Thanks