Translations: indicate when they are out of date with the English version #138

Closed
hcayless opened this issue Feb 3, 2016 · 42 comments
Labels: conversion:i18n (this issue relates to handling of multiple languages); resp: council; status: inProgress (ticket has been assigned and someone is working on it)

@hcayless
Member

hcayless commented Feb 3, 2016

A sizable chunk of the text in the reference pages is translated into other languages (mostly French). It would be useful to indicate on the ref pages when the English text has been revised more recently than its translation. This could be determined from the @versionDate on <desc> and <remarks>.

Moved from TEIC/TEI#1348

@martindholmes
Contributor

So, to clarify: every instance where the @versionDate on the English version postdates the @versionDate on another language version, there should be some signal?

That's going to be a scary number of instances. It had better be a rather unobtrusive signal. What do we suggest? If it's text, we'll have to go and get all the translations of that piece of text ahead of time, otherwise it's pointless. If it's an icon or a symbol, it will probably need descriptive text for mouseover, in which case ditto.

@lb42
Member

lb42 commented Apr 28, 2018

I had assumed that all that would be required would be to add a signal in the HTML (say, add class="outofdate" to the relevant block) so that it can be rendered in a different way, e.g. wavy type to indicate uncertainty or something. The stylesheet doesn't need to do more than compare the @versionDate on the chunk it's handling with that of its sibling with xml:lang="en" (assuming that the English one is always up to date, ha ha)
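
The comparison described above can be sketched roughly as follows. This is a minimal Python illustration, not the XSLT used in the Stylesheets: the function name `out_of_date_translations` and the sample spec are hypothetical, and it assumes that each construct has exactly one English sibling and that @versionDate values are ISO dates (YYYY-MM-DD).

```python
import xml.etree.ElementTree as ET
from datetime import date

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def out_of_date_translations(parent, name="desc"):
    """Return non-English <name> children of parent dated before the English one."""
    children = parent.findall(f"{{{TEI_NS}}}{name}")
    english = [c for c in children if c.get(XML_LANG) == "en" and c.get("versionDate")]
    if len(english) != 1:
        # No dated English sibling, or several: correspondence is ambiguous, flag nothing.
        return []
    en_date = date.fromisoformat(english[0].get("versionDate"))
    stale = []
    for child in children:
        lang, stamp = child.get(XML_LANG), child.get("versionDate")
        if lang in (None, "en") or stamp is None:
            continue  # skip the English version and undated or unlabelled children
        if date.fromisoformat(stamp) < en_date:
            stale.append(child)
    return stale


# Hypothetical sample data, not taken from the Guidelines source.
SAMPLE = """
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="example">
  <desc versionDate="2017-06-19" xml:lang="en">An English description.</desc>
  <desc versionDate="2007-06-12" xml:lang="fr">Une description.</desc>
  <desc versionDate="2018-01-05" xml:lang="de">Eine Beschreibung.</desc>
</elementSpec>
"""

spec = ET.fromstring(SAMPLE)
stale_langs = [d.get(XML_LANG) for d in out_of_date_translations(spec)]
print(stale_langs)
```

In this sample only the French description predates the English one, so it is the only one flagged. The sketch treats the English date as authoritative, which is exactly the assumption questioned above.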

@martindholmes
Contributor

It's easy to do that, but unless you know what it means, it's not helpful. I don't know of any generally-accepted visual/design signal for "out-of-date translation", so somewhere, somehow, it has to be explained; and if you're in need of the translation for a gloss or desc, the chances are you'd be equally in need of a translated explanation for the signal.

@lb42
Member

lb42 commented Apr 28, 2018

So what is your suggestion? Insert the phrase "[traduction douteuse/dépassée/périmée]" vel sim? It seems to me that there are already quite a few visual conventions in the way the Glines are displayed on screen (e.g. use of red box for deprecated examples). Or is the suggestion that out of date translations should be silently suppressed/replaced by English or what?

@martindholmes
Contributor

I don't have a suggestion; I'm just looking for guidance. The ticket was assigned to me, but I don't know exactly what to do to handle it.

@martindholmes
Contributor

Quick note to say that this is problematic with <remarks> because there may be multiple remarks in various languages and it's not clear which remark in (say) Japanese should correspond with which remark in English. So I'm working on <gloss> and <desc> at the moment.

martindholmes added a commit that referenced this issue Apr 30, 2018
…ated from glosses and descs to signify when they're out of date.
@martindholmes
Contributor

First stage is done in commit 863fed8: the odd2lite.xsl process now adds @rend='outOfDateTranslation' to <seg> elements generated from <gloss> and <desc> elements which are dated before the English version. The next stage is to process that @rend attribute into a class in the HTML generation. (Do we care about PDF?)

martindholmes added the status: inProgress label Apr 30, 2018
@martindholmes
Contributor

martindholmes commented Apr 30, 2018

Oops -- I just realized that I'm not supposed to do this ticket, it was just assigned for the Stylesheets coop group meeting. Oh well, I've done a bit of it. Anyone who runs this:

java -jar Utilities/lib/saxon9he.jar -xsl:`pwd`/../../Stylesheets/odds/odd2lite.xsl -s:p5.xml -o:test.xml documentationLanguage=ja lang=ja language=ja doclang=ja

in the P5 directory will get a test.xml file of TEI Lite in Japanese with the seg[@rend='outOfDateTranslation'] elements in it (assuming the path to your local Stylesheets dev checkout is the same as mine, relative to P5). Anyone else in the group want to have a go at figuring out why that doesn't get turned into a hi element with a class in the HTML guidelines output?

Incidentally, I cannot for the life of me tell the difference between the parameters documentationLanguage, lang, language, and doclang. I just kept adding new ones until I got the output I wanted.

@martindholmes
Contributor

Hmmph. Now I have to fix the Stylesheets tests. :-(

@martinascholger
Member

Sorry that I'm just joining the discussion now. Let me try to summarize for myself:
we need some kind of indicator for outdated translations, which means either a) text rendered in some distinctive way or b) an icon. Both have to be explained. I personally lean towards an icon before or after the affected text with a tooltip saying "translation out of date". I think it rarely affects <gloss>; it more frequently affects <desc> and <remarks> (if we find a solution for the allocation problem).
Should we use @corresp for this purpose in the future? I'm not suggesting we add this to every single spec now, but we could make those changes whenever we are working on a specific spec anyway.
And, we must not forget attributes and their values ...
I don't know either if there is a generally-accepted icon for out-of-date-translation. Maybe an icon like this: https://cdn2.iconfinder.com/data/icons/social-productivity-line-art-1/128/history-512.png

@jamescummings
Member

jamescummings commented May 1, 2018

I like b). Just an icon or asterisk or something afterwards with a hover/title tooltip saying "Translation out of date".

In discussion I suggested that whatever system there is should feed into a method for submitting an up-to-date translation (e.g. it goes to an email address that is monitored by IFTTT, which sticks it into a Google spreadsheet or some other system). Everyone thought this a good idea, but that it is a separate ticket, for once this one is done.

@martindholmes
Contributor

Stylesheets coop group likes the idea of an icon with a tooltip in the HTML output. We need to get translations in all the languages of the explanatory text, and then figure out how to put those into the i18n system, then make the icon and tooltip appear in the text.

@sydb
Member

sydb commented May 1, 2018

Of the 518 constructs with child <remarks>, only 6 have different counts of individual languages:

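| construct | en | fr | ja | es | de |
|---|---|---|---|---|---|
| @degree of att.damaged | 1 | 1 | 2 | 1 | 0 |
| att.editLike | 2 | 1 | 1 | 1 | 0 |
| @unit of att.milestoneUnit | 1 | 2 | 1 | 0 | 0 |
| <expan> | 2 | 1 | 1 | 0 | 2 |
| @type of <list> | 2 | 1 | 1 | 0 | 2 |
| <locus> | 2 | 1 | 0 | 0 | 0 |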

@martindholmes
Contributor

Thanks Syd. I can confirm that my code also handles gloss and desc in valItems.

@martindholmes
Contributor

Thanks @sydb. In the cases where English has more than the others, I guess there are new ones waiting for translation. In the two cases where Japanese has an extra one, or French has an extra one, we should probably take a look and see what's happening there.
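
For anyone wanting to reproduce that kind of per-language count, here is a rough Python sketch. It is not the script Syd used; `remarks_counts` and `has_uneven_remarks` are hypothetical names, the sample is invented, and the sketch assumes that only languages actually present on a construct are compared (a language with no <remarks> at all is not counted as a mismatch).

```python
import xml.etree.ElementTree as ET
from collections import Counter

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def remarks_counts(construct):
    """Count the <remarks> children of one construct per xml:lang value."""
    return Counter(
        r.get(XML_LANG, "nolang") for r in construct.findall(f"{{{TEI_NS}}}remarks")
    )


def has_uneven_remarks(construct):
    """True when the languages present do not all have the same number of <remarks>."""
    return len(set(remarks_counts(construct).values())) > 1


# Hypothetical sample: two English remarks, one French.
SAMPLE = """
<elementSpec xmlns="http://www.tei-c.org/ns/1.0" ident="example">
  <remarks xml:lang="en"><p>First remark.</p></remarks>
  <remarks xml:lang="en"><p>Second remark.</p></remarks>
  <remarks xml:lang="fr"><p>Première remarque.</p></remarks>
</elementSpec>
"""

spec = ET.fromstring(SAMPLE)
counts = remarks_counts(spec)
print(dict(counts), has_uneven_remarks(spec))
```

A construct flagged this way is one where it is unclear which translated remark corresponds to which English remark, so date comparison alone cannot be trusted there.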

@martinascholger
Member

@martindholmes I wanted to run this test, but can't find text.xml
java -jar Utilities/lib/saxon9he.jar -xsl:`pwd`/../../Stylesheets/odds/odd2lite.xsl -s:p5.xml -o:test.xml documentationLanguage=ja lang=ja language=ja doclang=ja
Or did you already fix the issue with hi ?

@martindholmes
Contributor

@martinascholger You'd be running this in order to create "test.xml". It assumes you have p5.xml already in your TEI/P5 folder, and that the Stylesheets repo is in the same folder as the TEI repo. What happens when you run it?

@martinascholger
Member

Do we agree to say "Translation out of date" in the tooltip? We could start to collect the translations then.
ko, zh-TW, ja, fr, es, it, de

@martindholmes
Contributor

That works for me.

@martinascholger
Member

martinascholger commented Jun 3, 2018

Ok, we need translations of the explanatory text "Translation out of date" in the tooltip.
@lb42, @raffazizzi could you please help us with this and add translations?

| language | info text |
|---|---|
| en | Translation out of date |
| de | Übersetzung veraltet |
| ko | |
| zh-TW | |
| ja | |
| fr | Traduction périmée |
| es | |
| it | |

@lb42
Member

lb42 commented Jun 3, 2018

Traduction périmée

@lb42
Member

lb42 commented Jun 3, 2018

Otoh the translation is not necessarily wrong, even if it's "out of date". It may just lack some nuance added in the English. I don't think anything is edited for the first time in any non-English language, is it?

@lb42
Member

lb42 commented Jun 4, 2018

Sorry @martindholmes, but the evidence does not support your assertion. I ran the attached stylesheet langCounts.txt against the current version of p5subset to look at "out of date" desc elements in French (there are 351 of them) in comparison with the "correct" English version. I got bored eyeballing the results, but well over 70% of those I checked were perfectly OK. Try it for yourself.

@martindholmes
Contributor

@lb42 That suggests one of two things: either the dates on the English versions were changed without their contents being changed, or the French translations were updated without their version dates being updated. I think the latter is more plausible.

Nevertheless, you can't know without checking, can you? So they all have to be looked at.

@lb42
Member

lb42 commented Jun 4, 2018

Speculation as to whether contents were changed without dates being changed could be resolved by attention to the version history for the file I suppose, but I don't think that's what's happening here. My speculation is that @versionDate changed when any change was made to the specification concerned, whether or not it invalidated existing descriptions.

@martindholmes
Contributor

If people are arbitrarily changing the @versionDate on the English <gloss> or <desc> when they didn't actually change the <gloss> or <desc> itself, then we have a problem. Why would anyone do that?

@lb42
Member

lb42 commented Jun 4, 2018

My suspicions focus on commit 087964ad7e7b3b369368e4974d7481d0f7ddbef5, which is the one that introduced @versionDate values for English-language descs (previously missing). Who knows where the date that was introduced there came from?

@lb42
Member

lb42 commented Jun 5, 2018

FWIW, here are the counts for elements which have been translated in the current p5subset.
I'll do another table for the allegedly out of date ones later. This shows that there's still some way to go on German, and that the French are excessively fond of glosses.

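| element | en | de | fr | ja | ko | es | it | zh-TW | nolang |
|---|---|---|---|---|---|---|---|---|---|
| DESC | 1808 | 292 | 1471 | 1410 | 1374 | 1396 | 1417 | 1287 | 24 |
| GLOSS | 530 | 209 | 680 | 10 | 460 | 488 | 478 | 306 | 3 |
| REMARKS | 523 | 87 | 391 | 367 | 0 | 69 | 0 | 0 | 0 |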

@martindholmes
Contributor

How does French have more glosses than English?

@lb42
Member

lb42 commented Jun 5, 2018

You may well ask....

martindholmes removed their assignment Jul 3, 2018
@martindholmes
Contributor

martindholmes commented Jul 3, 2018

Stylesheets group believes the manifestation of this should be an asterisk with a mouseover explanation. We need to strike a balance between worrying people and soliciting corrections. Council will try to decide on a method to allow people to provide a proposed correction. Should the English equivalent be available to the reader of the non-English version of the Guidelines somehow, so that they can more easily provide a translation?
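
To make the asterisk-with-mouseover idea concrete, here is a small Python sketch of the HTML it might produce. This is only an illustration of the output shape, not the Stylesheets implementation: `render_translation` is a hypothetical function, the class name is borrowed from the @rend value used in odd2lite.xsl, and falling back to the English tooltip for languages whose translation has not been supplied yet is an assumption.

```python
from html import escape

# Tooltip texts collected in this thread; the other languages were still missing,
# so this sketch falls back to English for them (an assumption).
TOOLTIPS = {
    "en": "Translation out of date",
    "de": "Übersetzung veraltet",
    "fr": "Traduction périmée",
}


def render_translation(text, lang, out_of_date):
    """Return an HTML span for a translated gloss or desc, with an asterisk if stale."""
    body = escape(text)
    if not out_of_date:
        return f'<span lang="{escape(lang)}">{body}</span>'
    tip = escape(TOOLTIPS.get(lang, TOOLTIPS["en"]))
    # The title attribute supplies the mouseover explanation for the asterisk.
    marker = f'<span class="outOfDateTranslation" title="{tip}">*</span>'
    return f'<span lang="{escape(lang)}">{body}{marker}</span>'


print(render_translation("Une description.", "fr", True))
```

Keeping the marker in its own span means a site stylesheet could later swap the asterisk for an icon without touching the translated text.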

@ebeshero
Member

ebeshero commented Jul 3, 2018

See TEIC/TEI#1780, which should be an agenda item for the next Council meeting.

@martindholmes
Contributor

martindholmes commented Jul 6, 2018

Counts of out-of-date items versus total items for each language:

| GI | en | de | fr | ja | ko | es | it | zh-TW |
|---|---|---|---|---|---|---|---|---|
| gloss | 0 / 530 | 6 / 209 | 121 / 680 | 0 / 10 | 17 / 460 | 104 / 488 | 63 / 478 | 101 / 306 |
| desc | 0 / 1800 | 22 / 292 | 350 / 1449 | 250 / 1389 | 268 / 1374 | 409 / 1396 | 652 / 1396 | 603 / 1266 |
| remarks | 2 / 523 | 1 / 87 | 147 / 391 | 128 / 367 | 0 / 0 | 26 / 69 | 0 / 0 | 0 / 0 |

(Syd and I generated this.)
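
A tally of this shape could be produced along the following lines. This Python sketch is not the outOfDate.xsl script: `tally` is a hypothetical function, the sample is invented, and it covers only <gloss> and <desc>, because with <remarks> it is not always clear which translation corresponds to which English text. It also assumes ISO-formatted @versionDate values, which compare correctly as plain strings.

```python
import xml.etree.ElementTree as ET
from collections import defaultdict

TEI_NS = "http://www.tei-c.org/ns/1.0"
XML_LANG = "{http://www.w3.org/XML/1998/namespace}lang"


def tally(root, names=("gloss", "desc")):
    """Map (element name, language) to [out-of-date count, total count]."""
    totals = defaultdict(lambda: [0, 0])
    for parent in root.iter():
        for name in names:
            kids = parent.findall(f"{{{TEI_NS}}}{name}")
            en_dates = [k.get("versionDate") for k in kids if k.get(XML_LANG) == "en"]
            # Only compare when there is exactly one English sibling.
            en_date = en_dates[0] if len(en_dates) == 1 else None
            for kid in kids:
                cell = totals[(name, kid.get(XML_LANG, "nolang"))]
                cell[1] += 1
                stamp = kid.get("versionDate")
                if en_date and stamp and stamp < en_date:
                    cell[0] += 1
    return dict(totals)


# Hypothetical sample: one stale and one current French description.
SAMPLE = """
<div xmlns="http://www.tei-c.org/ns/1.0">
  <elementSpec ident="a">
    <desc versionDate="2017-06-19" xml:lang="en">A.</desc>
    <desc versionDate="2007-06-12" xml:lang="fr">A (fr).</desc>
  </elementSpec>
  <elementSpec ident="b">
    <desc versionDate="2010-01-01" xml:lang="en">B.</desc>
    <desc versionDate="2012-03-04" xml:lang="fr">B (fr).</desc>
  </elementSpec>
</div>
"""

counts = tally(ET.fromstring(SAMPLE))
for (name, lang), (stale, total) in sorted(counts.items()):
    print(f"{name} {lang}: {stale} / {total}")
```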

@martinascholger
Member

@sydb and @martindholmes: since you already have a script, could you please let me know which German gloss/desc/remarks are out-of-date? Since there aren't that many, I'd update them.

@sydb
Member

sydb commented Jul 6, 2018

OK. (@martindholmes, if you want to send me a copy of the script, I can modify for German-only and say which ones; @martinascholger: I will try to do that tomorrow night your time)

@lb42
Member

lb42 commented Jul 6, 2018

To answer Martin's question of a while ago (why does French have more glosses than English?) -- gloss is only (should only be) provided if the meaning of an element's GI is not apparent from its form, i.e. we don't explain that the tag "editor" is so called because it contains an editor, though we do explain that the tag "fs" contains a feature structure. The French translators, being French and logical, reasoned that therefore they should provide a literal translation of the GI for each element, and tag it as gloss, since (e.g.) "editor" is not a French word (well it is nearly but it means something different).

@martindholmes
Contributor

Here it is (quick and dirty as usual).
outOfDate.xsl.zip

@sydb
Member

sydb commented Jul 8, 2018

Just sent the results to @martinascholger, a mere 24 hours late.

@martinascholger
Member

Thanks for the list @sydb. I had a look at the translations today. Some of the German glosses aren't out of date, although they have an older date than the English gloss. This is because the spelling has been changed to lower-case in some of the English glosses, which doesn't make a difference for the German translations. How should we deal with that? For practical reasons it would make sense to change the date for the German translations, but that would distort the history.
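
One possible way to spot such cases, sketched in Python purely as an illustration: if the previous and current English texts can be recovered (from the version history, say, which is an assumption here), a case-insensitive comparison shows whether the English change was only a matter of capitalization. `is_case_only_change` is a hypothetical helper, not part of any existing script.

```python
def is_case_only_change(old_text, new_text):
    """True when two versions of an English gloss differ only in letter case."""
    return old_text != new_text and old_text.casefold() == new_text.casefold()


print(is_case_only_change("Feature Structure", "feature structure"))
print(is_case_only_change("feature structure", "feature system"))
```

This does not settle whether the translation's date should then be changed; it would only help identify translations that need no real revision.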

@martinascholger
Member

Updated German translations with TEIC/TEI@1e148284d78b6c52a526f3642769d3b36cf6bc02

lb42 removed their assignment Apr 22, 2020
martindholmes added the conversion:i18n and resp: council labels May 26, 2020
hcayless added a commit that referenced this issue Apr 17, 2022
hcayless added this to the Release 7.53.0 milestone Apr 19, 2022
@hcayless
Member Author

I contend this is done.

8 participants