ConcretePlanning

Alex and Mike discussing the system and the LREC paper we want to write about it.

input verification: is the user putting in obviously wrong text?
- character set identification
- unicode checks
- langid -- maybe with langid.py by default?
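
A rough sketch of what the checks above might look like, assuming uploads arrive as raw bytes and that UTF-8 is the only encoding we accept. The function names and the 5% threshold are made up for illustration. `guess_language` only wraps `langid.classify` from langid.py and returns None if the package is not installed.

```python
import unicodedata


def decode_utf8(raw: bytes):
    """Return the decoded text, or None if the bytes are not valid UTF-8."""
    try:
        return raw.decode("utf-8")
    except UnicodeDecodeError:
        return None


def suspicious_ratio(text: str) -> float:
    """Fraction of characters that are control, private-use, unassigned, or U+FFFD."""
    if not text:
        return 1.0  # treat empty input as suspicious
    bad = 0
    for ch in text:
        cat = unicodedata.category(ch)
        if ch == "\ufffd" or (cat in ("Cc", "Co", "Cn") and ch not in "\n\r\t"):
            bad += 1
    return bad / len(text)


def looks_like_text(raw: bytes, max_bad: float = 0.05) -> bool:
    """Hypothetical gate: valid UTF-8 and few suspicious characters."""
    text = decode_utf8(raw)
    return text is not None and suspicious_ratio(text) <= max_bad


def guess_language(text: str):
    """Return a language code from langid.py, or None if it is unavailable."""
    try:
        import langid  # third-party; optional here
    except ImportError:
        return None
    lang, _score = langid.classify(text)
    return lang
```

A site could then reject an upload when `looks_like_text` is false, or warn when `guess_language` disagrees with the language the user claimed.
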
sentence segmenters for your language: this should be pluggable.
- check out OmegaT's segmenters and how they work.
Also text normalization routines. Let's write about GuaraniTextProcessing on a separate page...
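
One way the "pluggable" segmenter idea could work, as a sketch only: a registry mapping language codes to segmenter functions, with a naive regex fallback. All names here are hypothetical, and the fallback is a placeholder, not proper segmentation and not how OmegaT does it.

```python
import re
from typing import Callable, Dict, List

Segmenter = Callable[[str], List[str]]

# Hypothetical registry: language code -> segmenter function.
_SEGMENTERS: Dict[str, Segmenter] = {}


def register_segmenter(lang: str, fn: Segmenter) -> None:
    """Plug in a segmenter for one language."""
    _SEGMENTERS[lang] = fn


def default_segmenter(text: str) -> List[str]:
    """Naive fallback: split after '.', '!' or '?' followed by whitespace."""
    parts = re.split(r"(?<=[.!?])\s+", text.strip())
    return [p for p in parts if p]


def segment(text: str, lang: str) -> List[str]:
    """Use the registered segmenter for lang, else the naive fallback."""
    return _SEGMENTERS.get(lang, default_segmenter)(text)
```

A Guampa install would call `register_segmenter` once per language it cares about, and everything else would fall through to the default.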

Can you anonymously upload documents?
Can you anonymously provide translations?
Other sites (installs of Guampa) might want to do political things. There's that site for leftist translators already -- what is it? Does Mike have the link?

maybe the interface should be in the document's target language
this should be configurable for sure. Is there built-in support for l10n in Angular?
For me and Mike, for example, it would be easy to translate Spanish articles into English, and easiest for us if the interface is in English.

Well, one thing is Guampa is novel in part because it's FOSS and there doesn't seem to be a good FOSS tool for this.
unlike tatoeba:
- we are for TM, MT, and CAT (in the long run). But we're explicitly about MT.
- We are for translating documents.
unlike traduwiki:
- we are for MT and have proper sentence segmentation
- (one similarity: traduwiki is for translating documents too.)
deep question: what's the benefit for volunteer translators?
- Well, we do have activists and students...
how is this different from Pootle?
- Pootle's interface is kind of nonobvious, and it seems to be meant for UI strings, not documents? And it's not for reading.
- but it does have terminology come up...

canned data?
maybe we should use it ourselves: translate some Wikipedia documents es->en.
