This code was written to part-of-speech (POS) tag the conversational corpora assembled by the ESRC Centre for Research on Bilingualism in Theory & Practice at the University of Wales, Bangor.
The data was running text from bilingual conversations, and the autoglosser tags it in a single pass using Constraint Grammar rules written for each language.
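In Constraint Grammar, each word starts out with every reading the dictionary allows, and context rules then remove the readings that do not fit. The toy Python sketch below illustrates that idea only; it is not the autoglosser's own code, and the lexicon entries, the single rule, and the names `lookup`, `remove_verb_after_determiner`, and `tag` are all hypothetical.

```python
# Illustrative sketch of constraint-grammar-style disambiguation.
# Everything here (lexicon, rule, function names) is a hypothetical example.

Reading = tuple[str, str]  # (language code, POS tag)

# Toy bilingual lexicon: each word form maps to all of its candidate readings.
LEXICON: dict[str, list[Reading]] = {
    "roedd": [("cym", "V")],
    "y": [("cym", "DET")],
    "party": [("eng", "N"), ("eng", "V")],
}


def lookup(token: str) -> list[Reading]:
    """Return every candidate reading for a token, or an 'unknown' reading."""
    return list(LEXICON.get(token.lower(), [("unk", "UNK")]))


def remove_verb_after_determiner(
    readings: list[Reading], prev: list[Reading]
) -> list[Reading]:
    """Drop verb readings when the previous word can only be a determiner."""
    if prev and all(pos == "DET" for _, pos in prev):
        kept = [r for r in readings if r[1] != "V"]
        return kept or readings  # never delete the last remaining reading
    return readings


RULES = [remove_verb_after_determiner]


def tag(tokens: list[str]) -> list[tuple[str, list[Reading]]]:
    """Tag tokens left to right in one pass, applying each rule in turn."""
    tagged = []
    prev: list[Reading] = []
    for token in tokens:
        readings = lookup(token)
        for rule in RULES:
            readings = rule(readings, prev)
        tagged.append((token, readings))
        prev = readings
    return tagged


if __name__ == "__main__":
    for token, readings in tag(["roedd", "y", "party"]):
        print(token, readings)
```

Here "party" begins with both a noun and a verb reading, and the rule discards the verb reading because the preceding word is unambiguously a determiner. Each reading also carries a language code, so a mixed-language utterance can be tagged word by word.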
Note that this code is not properly packaged, because much of the work was done ad hoc. (For a smaller, cleaner implementation, try the Gàidhlig autoglosser.) Hopefully this will be remedied (at least for Welsh) as part of the work on the new CorCenCC (Corpws Cenedlaethol Cymraeg Cyfoes - National Corpus of Contemporary Welsh).