donnekgit/autoglosser

## Bangor autoglosser ##

The code here was produced to POS-tag the conversational corpora assembled by the ESRC Centre for Research on Bilingualism in Theory & Practice at University of Wales Bangor.

The data is bilingual conversational running text, which the autoglosser tags in a single pass using constraint grammar rules written for each language.
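To give a rough feel for constraint-grammar-style tagging, here is a minimal Python sketch. It is not the code or the rule files from this repository: the lexicon, the tag names, and both rules are hypothetical, made up purely for illustration. Each token starts with every reading the lexicon allows, and contextual rules then remove readings that do not fit, never deleting a token's last remaining reading.

```python
from typing import Callable, List, Set, Tuple

Reading = Tuple[str, str]   # (language code, POS tag)
Cohort = Set[Reading]       # all candidate readings for one token

# Hypothetical toy lexicon; a real system would use full dictionaries.
LEXICON = {
    "the": {("eng", "DET")},
    "can": {("eng", "AUX"), ("eng", "N")},
    "y": {("cym", "DET"), ("spa", "CONJ")},
    "gath": {("cym", "N")},
}


def lookup(tokens: List[str]) -> List[Cohort]:
    """Attach every lexicon reading to each token (unknown words get UNK)."""
    return [set(LEXICON.get(t.lower(), {("unk", "UNK")})) for t in tokens]


def remove_aux_after_det(cohorts: List[Cohort], i: int) -> Cohort:
    """Hypothetical rule: drop AUX readings right after an unambiguous determiner."""
    if i > 0 and all(pos == "DET" for _, pos in cohorts[i - 1]):
        return {r for r in cohorts[i] if r[1] == "AUX"}
    return set()


def remove_other_language(cohorts: List[Cohort], i: int) -> Cohort:
    """Hypothetical rule: drop readings whose language differs from a
    right-hand neighbour that belongs to exactly one known language."""
    if i + 1 < len(cohorts):
        langs = {lang for lang, _ in cohorts[i + 1]}
        if len(langs) == 1 and "unk" not in langs:
            (lang,) = langs
            return {r for r in cohorts[i] if r[0] != lang}
    return set()


RULES: List[Callable[[List[Cohort], int], Cohort]] = [
    remove_aux_after_det,
    remove_other_language,
]


def disambiguate(tokens: List[str]) -> List[Cohort]:
    """Single left-to-right pass applying every rule to every token."""
    cohorts = lookup(tokens)
    for i in range(len(cohorts)):
        for rule in RULES:
            remaining = cohorts[i] - rule(cohorts, i)
            if remaining:  # never delete the last reading
                cohorts[i] = remaining
    return cohorts


if __name__ == "__main__":
    print(disambiguate(["the", "can"]))
    print(disambiguate(["y", "gath"]))
```

In this toy setup, "can" keeps only its noun reading after "the", and "y" keeps only its Welsh determiner reading before the Welsh noun "gath".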

Note that this code is not really packaged properly, because a lot of the work was done ad hoc. (For a smaller, cleaner implementation, try the Gàidhlig autoglosser.) Hopefully this will be remedied (at least for Welsh) as part of the work on the new CorCenCC (Corpus Cenedlaethol Cymraeg Cyfoes, the National Corpus of Contemporary Welsh).
