We moved our treebanks to https://github.com/perseids-publications/pedalion-trees/!
See also http://en.pedalion.org/treebanks for more background.
Here you can find our modifications to the Perseus Dependency Treebanks of Ancient Greek.
Modifications in the Perseus Dependency Treebanks of Ancient Greek
What? This database-generated list contains a number of modifications to the existing Ancient Greek Dependency Treebanks. We are currently conducting experiments with automated parsing of Greek, and we are therefore attempting to homogenize the training corpus. The modifications are of various kinds. What we believe to be clear mistakes make up only a minor, though not insubstantial, part of the file: most suggestions are made for purposes of homogenization. Later versions of this file will likely specify the nature of each modification. As this is work in progress, it is safe to say that this file might also contain a number of changes for the worse.
The modifications are implemented in our own treebank search device, DendroSearch.
Who? Toon Van Hal and Alek Keersmaekers.
How much? The current release version contains modifications of ca. 120K tokens.
First release? 2018
Updates? March 2019; April 2019.
For our experiments with automated analysis, we gratefully rely on the large number of readily available treebanks:
- Perseus Treebanks: https://perseusdl.github.io/treebank_data/
- PROIEL Treebanks: https://proiel.github.io/
- The Gorman Treebanks: https://github.com/rgorman/Greek_Dependency_Treebanks
- Harrington's Treebanks: https://perseids-project.github.io/harrington_trees/
- The Sematia Project: https://github.com/ezhenrik/sematia
Our treebank data was created and edited with the help of the Arethusa application (https://github.com/alpheios-project/arethusa) as provided by the Perseids Project at Tufts University (https://perseids.org). Arethusa has received support from the Andrew W. Mellon Foundation, the Institute of Museum and Library Services, Tufts University, and the Humboldt Chair of Digital Humanities at Leipzig. Arethusa is now jointly maintained by the Perseids Project at Tufts University and The Alpheios Project, Ltd.
Since January 2019, this work has also been partly funded through an FWO research grant (Research Foundation Flanders).
We will assign a Creative Commons licence to our treebanks, probably the following one: https://creativecommons.org/licenses/by-sa/4.0/. Please feel free to contact us with any further questions.
toon -dot- vanhal -emailsign- kuleuven.be; alek -dot- keersmaekers -emailsign- kuleuven.be