Models included in this release:
ru2_nerus_800ks_96
- width=96 (for CPU and GPU **)
- POS score: 87.9
- DEP score: 87.1
- NER score: 95.3
- trained on Nerus
- LICENSE: MIT
Itn  Tag Loss    Tag %   Dep Loss     UAS     LAS
20   612196.679  91.566  2285020.336  91.676  85.352
ru2_combined_400ks_96 *
- width=96 (for CPU and GPU **)
- POS score: 89.2
- DEP score: 87.9
- NER score: 94.73
- LICENSE: CC BY-NC-SA 4.0
Itn  Tag Loss    Tag %   Dep Loss     UAS     LAS
20   468998.154  92.414  1774568.248  92.134  86.241
ru2_grameval_96
- width=96 (for CPU and GPU **)
- POS score: 89.0
- DEP score: 87.9
- NER score: 0.0
- POS tagging and DEP parsing only (no NER)
- LICENSE: CC BY-NC-SA 4.0
Itn  Tag Loss    Tag %   Dep Loss     UAS     LAS
20   207172.379  93.661  926799.585   94.010  88.752
ru2_grameval_300
- width=300 (for GPU **)
- POS score: 90.0
- DEP score: 91.3
- NER score: 0.0
- POS tagging and DEP parsing only (no NER)
- LICENSE: CC BY-NC-SA 4.0
Itn  Tag Loss    Tag %   Dep Loss     UAS     LAS
20   54762.824   95.291  394716.120   98.595  94.527
Notes:
- All models are based on Navec vectors and pymorphy2 morphology, so the combined vector model includes about 2.5 million words.
- POS and DEP scores are a weighted average of model quality over the GramEval subsets: score = (3 × news + 3 × fiction + wiki + social) / 8.
- * "combined" dataset = GramEval 2020 + a part of Nerus
- ** CPU time grows with the square of the network width, so the width-300 model is about 10x slower on CPU than the width-96 model, while GPU speed stays almost constant. Throughput in words per second (WPS):
width=48: CPU WPS=8000 GPU WPS=12000
width=96: CPU WPS=3600 GPU WPS=12000
width=192: CPU WPS=1300 GPU WPS=10000
width=300: CPU WPS=600 GPU WPS=8000
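
To compare the quadratic rule of thumb with the throughput figures listed above, here is a minimal sketch. The WPS numbers are copied from this release note. The helper names (`quadratic_slowdown`, `measured_slowdown`) are hypothetical, and the pure width² cost model is an assumption, not a description of how the parser is implemented.

```python
# Words-per-second figures copied from the throughput list above.
MEASURED_WPS = {
    48: {"cpu": 8000, "gpu": 12000},
    96: {"cpu": 3600, "gpu": 12000},
    192: {"cpu": 1300, "gpu": 10000},
    300: {"cpu": 600, "gpu": 8000},
}


def quadratic_slowdown(width, base_width=96):
    """Slowdown predicted by a pure width**2 cost model (an assumption)."""
    return (width / base_width) ** 2


def measured_slowdown(width, device="cpu", base_width=96):
    """Slowdown taken from the listed WPS figures, relative to base_width."""
    return MEASURED_WPS[base_width][device] / MEASURED_WPS[width][device]


for width in sorted(MEASURED_WPS):
    print(
        f"width={width:>3}  "
        f"predicted x{quadratic_slowdown(width):.2f}  "
        f"measured CPU x{measured_slowdown(width):.2f}  "
        f"measured GPU x{measured_slowdown(width, 'gpu'):.2f}"
    )
```

For width=300 the quadratic model predicts a slowdown of roughly 10x relative to width=96, while the listed CPU figures give 3600 / 600 = 6x, so the rule is best read as an upper-bound estimate. The GPU figures drop only from 12000 to 8000 WPS over the same range.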
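
The weighted POS/DEP score defined in the notes, score = (3 × news + 3 × fiction + wiki + social) / 8, can be computed as in the sketch below. The function name `weighted_score` and the per-subset numbers are hypothetical and serve only to illustrate the weighting. They are not measured results for any model in this release.

```python
# Subset weights from the formula in the notes: news and fiction count 3x.
WEIGHTS = {"news": 3, "fiction": 3, "wiki": 1, "social": 1}


def weighted_score(subset_scores):
    """Weighted average over the GramEval subsets; the weights sum to 8."""
    total_weight = sum(WEIGHTS.values())
    weighted_sum = sum(WEIGHTS[name] * subset_scores[name] for name in WEIGHTS)
    return weighted_sum / total_weight


# Hypothetical per-subset scores, for illustration only.
example = {"news": 90.0, "fiction": 88.0, "wiki": 86.0, "social": 84.0}
print(weighted_score(example))
```

Because news and fiction together carry 6 of the 8 weight units, a model's reported score is dominated by its quality on those two subsets.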