BERT model trained from scratch on Finnish.
September 30, 2019: We release a beta version of the BERT base cased model trained from scratch on a corpus of Finnish news, online discussions, and crawled data.
Download the model here: bert-base-finnish-cased.zip
If you want to use the model with the huggingface/transformers library, follow the steps in huggingface_transformers.md
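As a rough illustration of what loading might look like, here is a minimal sketch. It assumes the archive has been downloaded locally and that, once unpacked (and converted if necessary), the directory holds files in the layout that transformers expects, such as a config file, `vocab.txt`, and model weights. The archive's actual contents are not described here, and the helper names `extract_model` and `load_model` are hypothetical, so huggingface_transformers.md remains the authoritative guide.

```python
import zipfile
from pathlib import Path


def extract_model(zip_path, target_dir):
    """Unpack a downloaded model archive and return the target directory."""
    target = Path(target_dir)
    with zipfile.ZipFile(zip_path) as archive:
        archive.extractall(target)
    return target


def load_model(model_dir):
    """Load tokenizer and model from a directory in transformers' layout.

    Assumption: model_dir already contains the config, vocabulary, and
    weights in a format transformers can read.
    """
    # Imported lazily so extract_model works without transformers installed.
    from transformers import BertModel, BertTokenizer

    # The model is cased, so lowercasing is switched off explicitly.
    tokenizer = BertTokenizer.from_pretrained(str(model_dir), do_lower_case=False)
    model = BertModel.from_pretrained(str(model_dir))
    return tokenizer, model


# Example usage (paths are placeholders):
# model_dir = extract_model("bert-base-finnish-cased.zip", "bert-base-finnish-cased")
# tokenizer, model = load_model(model_dir)
```

Passing `do_lower_case=False` matters for a cased model, because lowercasing the input would discard the case distinctions the model was trained on.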
Initial evaluation results for the model are given below. They are as yet unpublished and should therefore be treated as unofficial:
Named Entity Recognition on the FiNER data
|BERT-Base Multilingual Cased (Google)||88%|
UD_Finnish-TDT test set, gold segmentation
|BERT-Base Multilingual Cased (Google)||96.93%|