Running the following version of UD tools:
commit 78ce4b21495c6e4c17a7b07925bec1267d833d14
Author: Dan Zeman <zeman@ufal.mff.cuni.cz>
Date: Sun May 5 09:21:16 2024 +0200
Evaluating the following revision of UD_Turkish-IMST:
commit 7aaa466c94038dedaa6796ab27f554a74b6e35ab
Author: Dan Zeman <zeman@ufal.mff.cuni.cz>
Date: Mon Nov 6 13:10:00 2023 +0100
Size: counted 58096 of 58096 words (nodes).
Size: min(0, log((N/1000)**2)) = 8.12419362934396.
Size: maximum value 13.815511 is for 1000000 words or more.
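The size figures above can be reproduced with a short sketch. It assumes the natural logarithm and a clamp at zero from below: the logged value 8.124 is positive, which fits max(0, ...). The normalization by the 1000000-word cap is inferred from score{size} later in this log, not taken from the tool's source. The function name and parameters are hypothetical.

```python
import math

def size_score(n_words: int, cap_words: int = 1_000_000) -> float:
    """Normalized size score in [0, 1], reconstructed from the logged values.

    Assumption: raw = max(0, ln((N/1000)^2)), divided by the same
    expression evaluated at cap_words and clipped to 1.
    """
    if n_words <= 0:
        return 0.0
    raw = max(0.0, math.log((n_words / 1000) ** 2))
    cap = math.log((cap_words / 1000) ** 2)  # about 13.815511
    return min(1.0, raw / cap)

if __name__ == "__main__":
    # 58096 words, as counted in the log
    print(size_score(58096))
```

For 58096 words this agrees with the score{size} value reported in the weighted breakdown.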
Split: Found more than 10000 training words.
Split: Found at least 10000 development words.
Split: Found at least 10000 test words.
Lemmas: source of annotation (from README) factor is 0.8.
Universal POS tags: 14 out of 17 found in the corpus.
Universal POS tags: source of annotation (from README) factor is 0.8.
Features: 35585 out of 58096 total words have one or more features.
Features: source of annotation (from README) factor is 0.8.
Universal relations: 33 out of 37 found in the corpus.
Universal relations: source of annotation (from README) factor is 0.8.
Udapi:
TOTAL 8699
Udapi: found 8699 bugs.
Udapi: worst expected case (threshold) is one bug per 10 words. There are 58096 words.
Genres: found 2 out of 17 known.
/net/work/people/zeman/unidep/tools/validate.py --lang tr --max-err=10 UD_Turkish-IMST/tr_imst-ud-dev.conllu
[Line 4716 Sent 00045224_1 Node 12]: [L3 Warning fixed-gap] Gaps in fixed expression [12, 16] 'göbek * * * da'
[Line 9889 Sent 00220166_36 Node 12]: [L3 Warning fixed-gap] Gaps in fixed expression [12, 14] 'aralarından * da'
Warnings: 2
*** PASSED ***
/net/work/people/zeman/unidep/tools/validate.py --lang tr --max-err=10 UD_Turkish-IMST/tr_imst-ud-test.conllu
*** PASSED ***
/net/work/people/zeman/unidep/tools/validate.py --lang tr --max-err=10 UD_Turkish-IMST/tr_imst-ud-train.conllu
[Line 8394 Sent 00038121_25 Node 1]: [L3 Warning fixed-gap] Gaps in fixed expression [1, 4] 'Ev * * da'
[Line 14918 Sent 00084111_21 Node 2]: [L3 Warning fixed-gap] Gaps in fixed expression [2, 4] 'öldürmek * da'
[Line 26915 Sent 00131266_41 Node 12]: [L3 Warning fixed-gap] Gaps in fixed expression [12, 15] 'bakkaldan * * da'
[Line 50224 Sent 21040000_68 Node 8]: [L3 Warning fixed-gap] Gaps in fixed expression [8, 10] 'yasal * da'
Warnings: 4
*** PASSED ***
Validity: 1
(weight=0.0769230769230769) * (score{features}=0.8) = 0.0615384615384615
(weight=0.0769230769230769) * (score{genres}=0.117647058823529) = 0.00904977375565611
(weight=0.0769230769230769) * (score{lemmas}=0.8) = 0.0615384615384615
(weight=0.256410256410256) * (score{size}=0.588048743856272) = 0.150781729193916
(weight=0.0512820512820513) * (score{split}=1) = 0.0512820512820513
(weight=0.0769230769230769) * (score{tags}=0.658823529411765) = 0.0506787330316742
(weight=0.307692307692308) * (score{udapi}=0.01) = 0.00307692307692308
(weight=0.0769230769230769) * (score{udeprels}=0.713513513513514) = 0.0548856548856549
(TOTAL score=0.442831788302799) * (availability=1) * (validity=1) = 0.442831788302799
STARS = 2
UD_Turkish-IMST 0.442831788302799 2
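
The total above is a weighted sum of the per-criterion scores, multiplied by availability and validity. A minimal sketch that recomputes it from the logged numbers follows; the dictionary and function names are hypothetical, and the weights and scores are copied from the lines above rather than derived from the tool.

```python
import math

# criterion: (weight, score), copied verbatim from the log
COMPONENTS = {
    "features": (0.0769230769230769, 0.8),
    "genres":   (0.0769230769230769, 0.117647058823529),
    "lemmas":   (0.0769230769230769, 0.8),
    "size":     (0.256410256410256, 0.588048743856272),
    "split":    (0.0512820512820513, 1.0),
    "tags":     (0.0769230769230769, 0.658823529411765),
    "udapi":    (0.307692307692308, 0.01),
    "udeprels": (0.0769230769230769, 0.713513513513514),
}

def total_score(components, availability=1.0, validity=1.0):
    """Weighted sum of criterion scores, scaled by availability and validity."""
    weighted = math.fsum(w * s for w, s in components.values())
    return weighted * availability * validity

if __name__ == "__main__":
    print(total_score(COMPONENTS))
```

The weights sum to 1, so the udapi criterion (weight about 0.31, score 0.01) and the size criterion (weight about 0.26) account for most of the gap between this treebank's total and the maximum of 1.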