Lemmas are from Henry George Liddell and Robert Scott, *A Greek-English Lexicon*.
Models are available in releases.
| task | Accuracy | Accuracy Ambiguous |
|---|---|---|
| case | 0.9612 | 0.8854 |
| degree | 0.9926 | 0.9596 |
| gender | 0.9436 | 0.8296 |
| lemma | 0.954 | 0.9097 |
| mood | 0.9913 | 0.957 |
| num | 0.9841 | 0.9589 |
| pers | 0.9864 | 0.9219 |
| pos | 0.9287 | 0.8805 |
| tense | 0.9917 | 0.9588 |
| voice | 0.9915 | 0.9606 |
- Run `build.py` to get the "simple" training data
  - Warning: default output is NFKD
- Run `build-normalized.py` to get NFD and NFC data
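
To make the difference between these normalization forms concrete, here is a minimal sketch using only Python's standard `unicodedata` module. It does not reproduce what `build.py` or `build-normalized.py` do internally; the helper name `normalize_forms` and the sample word are assumptions chosen for illustration.

```python
import unicodedata

def normalize_forms(text):
    """Return the NFC, NFD and NFKD forms of `text` (hypothetical helper)."""
    return {form: unicodedata.normalize(form, text) for form in ("NFC", "NFD", "NFKD")}

# Sample word written with a precomposed eta + perispomeni (U+1FC6).
word = "\u03bc\u1fc6\u03bd\u03b9\u03bd"
forms = normalize_forms(word)

# NFC keeps the precomposed letter; NFD and NFKD split it into
# base letter + combining mark, so the string grows by one code point.
for name, value in forms.items():
    print(name, len(value), [f"U+{ord(c):04X}" for c in value])

# NFKD additionally folds compatibility characters: the micro sign
# (U+00B5) becomes a Greek mu (U+03BC), while NFD leaves it untouched.
micro = "\u00b5"
print(unicodedata.normalize("NFD", micro) == micro)
print(unicodedata.normalize("NFKD", micro) == "\u03bc")
```

Because the three forms produce different code point sequences for the same visible text, training and inference data should presumably use the same form.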
- "Gorman Trees", Vanessa Gorman, University of Nebraska-Lincoln, https://github.com/perseids-publications/gorman-trees, https://doi.org/10.5281/zenodo.3596009
- "Daphne Trees", Francesco Mambrini, https://github.com/perseids-publications/daphne-trees
- "Pedalion Trees", Toon Van Hal et al., https://github.com/perseids-publications/cst-trees
- "Perseus Treebank Data", G. Celano et al., https://github.com/PerseusDL/treebank_data
- "Harrington Trees", J. Matthew Harrington, https://github.com/perseids-publications/harrington-trees.git
I do not know the annotation status of the following sources (gold? silver? bronze? wood?):
- https://github.com/ezhenrik/sematia-tb
- https://github.com/DigitalHill/treebank-data
- https://github.com/danielrruf/AristarchusTreebank-Lit
- https://github.com/Drewlatimer/student-data
- https://github.com/polinayordanova/Treebank-of-Aphtonius-Progymnasmata
The licences are those of the original repositories. Converted data inherits the
Mozilla Public Licence.
- 1,068,131 tokens, including 115,412 punctuation signs
- 56,133 different sentences
91 distinct characters found:
| Char | Count |
|---|---|
| (space) | 7743 |
| " | 4219 |
| % | 4 |
| ' | 6745 |
| ( | 704 |
| ) | 702 |
| , | 142218 |
| - | 7085 |
| . | 66860 |
| 0 | 1 |
| 1 | 5727 |
| 2 | 3197 |
| 3 | 1616 |
| 4 | 2 |
| : | 7638 |
| ; | 7268 |
| < | 72 |
| > | 74 |
| ? | 137 |
| [ | 577 |
| ] | 571 |
| j | 3 |
| { | 1 |
| ~ | 38 |
| · | 31204 |
| ʽ | 17 |
| ̀ | 230277 |
| ́ | 1123673 |
| ̄ | 25 |
| ̆ | 8 |
| ̈ | 3682 |
| ̓ | 584276 |
| ̔ | 287290 |
| ͂ | 249187 |
| ͅ | 38177 |
| Α | 24953 |
| Β | 1412 |
| Γ | 1957 |
| Δ | 4253 |
| Ε | 7741 |
| Ζ | 2358 |
| Η | 2125 |
| Θ | 2724 |
| Ι | 4642 |
| Κ | 9669 |
| Λ | 5939 |
| Μ | 6123 |
| Ν | 1777 |
| Ξ | 728 |
| Ο | 3754 |
| Π | 9063 |
| Ρ | 2739 |
| Σ | 6155 |
| Τ | 5237 |
| Υ | 586 |
| Φ | 3391 |
| Χ | 903 |
| Ψ | 34 |
| Ω | 346 |
| α | 957329 |
| β | 53775 |
| γ | 152992 |
| δ | 248067 |
| ε | 880724 |
| ζ | 23108 |
| η | 294280 |
| θ | 112297 |
| ι | 845411 |
| κ | 294851 |
| λ | 281371 |
| μ | 315232 |
| ν | 617318 |
| ξ | 30632 |
| ο | 968199 |
| π | 330404 |
| ρ | 379429 |
| ς | 479697 |
| σ | 271423 |
| τ | 541687 |
| υ | 398026 |
| φ | 81370 |
| χ | 95052 |
| ψ | 8992 |
| ω | 340318 |
| ϝ | 13 |
| — | 388 |
| ‘ | 2 |
| ’ | 5404 |
| “ | 4 |
| † | 74 |
| ⏑ | 4 |
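
The table lists combining diacritics as separate rows, which suggests the counts were taken over decomposed text. A character inventory of this kind could be sketched as follows; this is an assumption about the method, not the repository's actual script, and `char_counts` is a hypothetical helper.

```python
from collections import Counter
import unicodedata

def char_counts(tokens):
    """Count code points over tokens after NFD decomposition (assumed form)."""
    counts = Counter()
    for token in tokens:
        # Decompose so that accents and breathings are counted on their own.
        counts.update(unicodedata.normalize("NFD", token))
    return counts

# Two illustrative tokens: a word starting with alpha + psili + oxia
# (U+1F04), followed by a comma.
tokens = ["\u1f04\u03bd\u03b4\u03c1\u03b1", ","]
counts = char_counts(tokens)

print(f"{len(counts)} chars found")
print("| Char | Count |")
print("|---|---|")
for char, count in sorted(counts.items()):
    print(f"| {char} | {count} |")
```

Sorting by code point gives the same ordering as the table above: ASCII punctuation first, then combining marks, then the Greek letters.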