Common Parallel Corpora: A high-quality, community-driven extension of multitext-nllb-seed, flores-200, and ntrex-128 to more languages: nqo_Nkoo, ful_Adlm (coming soon).
Fria||el is a collaborative parallel text curation system that tracks individual segments through a translation and copyedit workflow. Each segment is first translated by one translator and then copyedited in sequence by other translators. Translators can inspect variants of the source segment in several languages at once. As a result, segments are translated and copyedited in the context of different subsets of source languages. Besides the final parallel corpus, Fria||el also yields copyedit logs, which could be valuable in various modeling scenarios.
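The workflow described above could be modeled roughly as follows. This is a minimal sketch, not Fria||el's actual data model or API: the class names (`Segment`, `LogEntry`), field names, language codes, and placeholder strings are all hypothetical, and the rule that a copyeditor must differ from the original translator is an assumption read off the phrase "other translators".

```python
from dataclasses import dataclass, field
from typing import Dict, FrozenSet, Iterable, List, Optional


@dataclass(frozen=True)
class LogEntry:
    """One step in a segment's history (hypothetical schema)."""
    actor: str
    action: str                 # "translate" or "copyedit"
    before: Optional[str]       # target text before this step, if any
    after: str                  # target text after this step
    context: FrozenSet[str]     # source languages the actor had open


@dataclass
class Segment:
    segment_id: str
    sources: Dict[str, str]     # language code -> source variant
    target: Optional[str] = None
    translator: Optional[str] = None
    log: List[LogEntry] = field(default_factory=list)

    def _context(self, langs: Iterable[str]) -> FrozenSet[str]:
        # Only languages that actually have a source variant can be context.
        context = frozenset(langs)
        unknown = context.difference(self.sources)
        if unknown:
            raise ValueError(f"no source variant for: {sorted(unknown)}")
        return context

    def translate(self, actor: str, text: str, langs: Iterable[str]) -> None:
        # A segment is translated exactly once, by one translator.
        if self.target is not None:
            raise ValueError("segment is already translated")
        context = self._context(langs)
        self.translator, self.target = actor, text
        self.log.append(LogEntry(actor, "translate", None, text, context))

    def copyedit(self, actor: str, text: str, langs: Iterable[str]) -> None:
        # Copyedits are applied one after another to the current target.
        if self.target is None:
            raise ValueError("segment has not been translated yet")
        if actor == self.translator:
            # Assumption: copyeditors are translators other than the author.
            raise ValueError("copyeditor must differ from the translator")
        context = self._context(langs)
        self.log.append(LogEntry(actor, "copyedit", self.target, text, context))
        self.target = text


# Placeholder strings stand in for real source and target text.
seg = Segment("s1", {"eng_Latn": "Good morning.", "fra_Latn": "Bonjour."})
seg.translate("translator_a", "draft translation", ["eng_Latn"])
seg.copyedit("translator_b", "revised translation", ["eng_Latn", "fra_Latn"])
```

Here `seg.target` holds the text that would go into the final corpus, while `seg.log` plays the role of the copyedit log: each entry records the text before and after the step along with the subset of source languages that was visible at the time.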
"Machine Translation for Nko: Tools, Corpora and Baseline Results." paper code
(coming soon)