Dependency treebanks for individual Tupían languages are curated within the Universal Dependencies framework, that is, in repositories under the UniversalDependencies GitHub organization.
TuDeT aggregates the data from these repositories (using the git submodules mechanism) to publish a unified set of treebanks as part of TuLar.
For a list of the currently included repositories/languages, see .gitmodules.
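As a minimal sketch of how the list of included repositories could be read programmatically, the snippet below parses the text of a `.gitmodules` file with Python's standard `configparser`. The function name `list_submodules` and the sample entries (`UD_Example_A`, `UD_Example_B`, and their URLs) are hypothetical placeholders for illustration, not actual TuDeT submodules.

```python
import configparser

# Hypothetical sample content; the real entries live in TuDeT's .gitmodules.
SAMPLE_GITMODULES = """
[submodule "UD_Example_A"]
    path = UD_Example_A
    url = https://github.com/UniversalDependencies/UD_Example_A.git
    branch = dev
[submodule "UD_Example_B"]
    path = UD_Example_B
    url = https://github.com/UniversalDependencies/UD_Example_B.git
"""


def list_submodules(gitmodules_text):
    """Return one dict per submodule with its name, path, url and branch."""
    # Strip indentation so every key parses as an ordinary option line.
    cleaned = "\n".join(line.strip() for line in gitmodules_text.splitlines())
    # Disable interpolation so a literal '%' in a URL cannot raise an error.
    parser = configparser.ConfigParser(interpolation=None)
    parser.read_string(cleaned)

    prefix = 'submodule "'
    submodules = []
    for section in parser.sections():
        if not (section.startswith(prefix) and section.endswith('"')):
            continue
        submodules.append({
            "name": section[len(prefix):-1],
            "path": parser.get(section, "path", fallback=None),
            "url": parser.get(section, "url", fallback=None),
            # None when no branch is recorded for the submodule.
            "branch": parser.get(section, "branch", fallback=None),
        })
    return submodules


if __name__ == "__main__":
    for module in list_submodules(SAMPLE_GITMODULES):
        print(module["name"], module["branch"])
```

To run this against a checkout, the file contents can be passed in directly, for example `list_submodules(open(".gitmodules", encoding="utf-8").read())`.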
Note that TuDeT aggregates the data from the dev branches of these repositories, and thus may differ from the data released at Universal Dependencies.
TuDeT offers annotated texts of different types in Tupían languages. The annotations follow the Universal Dependencies framework.
TuDeT is being compiled within the CrossLingference project.