Hi, I'm looking to train the punctuation and capitalization model for the French language.
We haven't done any experiments with this model on non-English data, but the model should work for other languages out of the box.
The quickest way to try this model with French data would be to use a pre-trained BERT-like model, for example,
model.language_model.pretrained_model_name=bert-base-multilingual-cased
or amine/bert-base-5lang-cased. To prepare data for the punctuation and capitalization tasks, please see this tutorial. The Tatoeba dataset contains French sentences as well; you would need to modify this line to get French data.
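In case it helps, here is a rough sketch of what the French data preparation could look like. It rests on two assumptions that you should check against the tutorial and the dataset: that the Tatoeba dump is tab-separated as id, language code, sentence, with French tagged `fra`, and that the model expects a lowercased, punctuation-free text line plus one two-character label per word (the punctuation mark after the word or `O`, then `U` if the word was capitalized or `O`). The function names `filter_language` and `to_text_and_labels` are hypothetical and are not part of NeMo.

```python
PUNCT = ",.?"  # assumed label set; adjust to the marks you want to predict


def filter_language(tsv_lines, lang="fra"):
    """Yield sentences from rows shaped like: id<TAB>lang<TAB>text (assumed layout)."""
    for line in tsv_lines:
        parts = line.rstrip("\n").split("\t", 2)
        if len(parts) == 3 and parts[1] == lang:
            yield parts[2]


def to_text_and_labels(sentence):
    """Return (lowercased text without punctuation, space-separated labels)."""
    words, labels = [], []
    for token in sentence.split():
        core = token.rstrip(PUNCT)
        trailing = token[len(core):]
        punct = trailing[0] if trailing else "O"
        if not core:
            # French typography puts a space before "?", so the mark can be
            # a token of its own; attach it to the previous word's label.
            if labels:
                labels[-1] = punct + labels[-1][1]
            continue
        capit = "U" if core[0].isupper() else "O"
        words.append(core.lower())
        labels.append(punct + capit)
    return " ".join(words), " ".join(labels)


rows = ["1\teng\tHello.\n", "2\tfra\tBonjour, comment vas-tu ?\n"]
for sentence in filter_language(rows):
    text, label_line = to_text_and_labels(sentence)
    print(text)        # bonjour comment vas-tu
    print(label_line)  # ,U OO ?O
```

Marks outside `PUNCT` (for example `!` or `;`) are left attached to the word here, so you would probably want to strip or map them before training.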