Tags handling in xliff files #490

Closed
domeav opened this issue Jul 9, 2014 · 14 comments · Fixed by #2364
Labels
enhancement Adding or requesting a new feature. translate-toolkit Issues which need to be fixed in the translate-toolkit

domeav commented Jul 9, 2014

Hello,
I'm experiencing an issue with XLIFF files containing tags in some of their translation units (Weblate 1.9, but the issue persists with a fresh git clone). Here is a real-world example with an en/fr unit:

<trans-unit id="1108488165166414712" datatype="html" approved="yes">
    <source><x id="XXX_1"/> / <x id="XXX_2"/> events succesfully imported</source>
    <target state="translated"><x id="XXX_1"/> / <x id="XXX_2"/> événements importés</target>
</trans-unit>

Weblate handles the source string as "/ events succesfully imported" and the target string as "/ événements importés"; the tags are somehow lost during the import. I've been fiddling around with datatypes but it doesn't seem to help.

Moreover, when editing and saving the translation in Weblate (let's say the new French translation for this is "duh"), here is what appears in my XLIFF file:

<trans-unit id="1108488165166414712" datatype="html" approved="yes">
    <source><x id="XXX_1"/> / <x id="XXX_2"/> events succesfully imported</source>
    <target state="translated">duh<x id="XXX_1"/> / <x id="XXX_2"/> événements importés</target>
</trans-unit>

The new translation is prepended to the existing string instead of replacing it.
Thanks for your help!


Member

nijel commented Jul 9, 2014

Hmm, looks like something is wrong in translate-toolkit or in the way Weblate uses it for XLIFF files. What version of translate-toolkit are you using?

@nijel nijel added the bug label Jul 9, 2014
Author

domeav commented Jul 10, 2014

I've been careful to use the latest 1.11.0 release.
TTK xliffunit instances indeed return stripped strings from getsource() and gettarget() calls, but I have no idea whether it's a TTK bug, whether they are misused by Weblate (they do have the complete info as _rich_source and _rich_target), or whether something is wrong with my XLIFF files. Here are my files for reference:
http://pastebin.com/G3Pnt9mj (fr.xliff) and http://pastebin.com/3dvJmEiM (en.xliff).
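
The difference between a stripped string and one that keeps the inline tags can be illustrated without translate-toolkit. What follows is only a minimal sketch built on Python's standard xml.etree.ElementTree, not the toolkit's actual implementation; the helper names plain_text and inner_xml are made up for this illustration.

```python
import xml.etree.ElementTree as ET

UNIT = (
    '<trans-unit id="1108488165166414712" datatype="html" approved="yes">'
    '<source><x id="XXX_1"/> / <x id="XXX_2"/> events succesfully imported</source>'
    '<target state="translated"><x id="XXX_1"/> / <x id="XXX_2"/> événements importés</target>'
    '</trans-unit>'
)


def plain_text(elem):
    """Concatenate only the text nodes; inline tags such as <x/> vanish."""
    return "".join(elem.itertext())


def inner_xml(elem):
    """Serialize the element's content, keeping inline tags as markup."""
    # ET.tostring() on a child also emits the text that follows it (its tail).
    return (elem.text or "") + "".join(
        ET.tostring(child, encoding="unicode") for child in elem
    )


source = ET.fromstring(UNIT).find("source")
print(repr(plain_text(source)))  # the placeholders are gone
print(inner_xml(source))         # the placeholders survive as <x .../> markup
```

A text-only view of the unit loses the placeholders in the same way as the strings reported above, while a serialization of the full content keeps them.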

Author

domeav commented Jul 10, 2014

We're migrating from Pootle and I've just checked how it performed with these files: the string ' / events succesfully imported' is handled properly in the Pootle UI as "{XXX_1} / {XXX_2} events succesfully imported", so the culprit could be the way Weblate handles XLIFF.
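
For illustration, a brace-style rendering like the one described for the Pootle UI could be produced along these lines. This is a hedged sketch using only the standard library, not Pootle's code; the function name with_braces is hypothetical.

```python
import xml.etree.ElementTree as ET


def with_braces(elem):
    """Render each <x id="..."/> child as {id}, keeping the surrounding text."""
    parts = [elem.text or ""]
    for child in elem:
        if child.tag == "x":
            parts.append("{%s}" % child.get("id"))
        else:
            # Assumption of this sketch: other inline elements
            # are reduced to their text content.
            parts.append("".join(child.itertext()))
        parts.append(child.tail or "")
    return "".join(parts)


source = ET.fromstring(
    '<source><x id="XXX_1"/> / <x id="XXX_2"/> events succesfully imported</source>'
)
print(with_braces(source))
# {XXX_1} / {XXX_2} events succesfully imported
```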

Member

nijel commented Sep 8, 2014

You're right, Weblate should use rich_source when possible and correctly store and interpret placeables from it...

@nijel nijel added enhancement Adding or requesting a new feature. and removed bug Something is broken. labels Sep 8, 2014
Contributor

mapx commented Apr 29, 2016

Same issue in Weblate 2.6 :)

Member

nijel commented Apr 29, 2016

That's expected. Nobody has fixed it :-).

@nijel nijel added this to TODO in File format support Jan 30, 2018
@nijel nijel added the translate-toolkit Issues which need to be fixed in the translate-toolkit label Oct 26, 2018
Member

nijel commented Oct 26, 2018

The best approach to address this is probably overriding the set_source, get_source and get_target methods in XliffUnit to use rich_source/rich_target from translate-toolkit. That way the strings would get into Weblate. To make them behave like placeholders, an additional flag would be needed on the unit, along with a corresponding check that validates and extracts the placeables when displaying.
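
As a rough idea of what such a placeables check could do, here is a minimal sketch that compares the <x/> ids found in a source and a target string. It assumes the strings are stored as XML fragments and uses only the standard library; placeable_ids and check_placeables are hypothetical names, not Weblate or translate-toolkit APIs.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def placeable_ids(fragment):
    """Return a multiset of the id attributes of all <x/> tags in an XML fragment."""
    # Wrap the fragment so that mixed text and tags parse as one document.
    root = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    return Counter(x.get("id") for x in root.iter("x"))


def check_placeables(source, target):
    """Return (missing, extra) placeable ids of the target relative to the source."""
    src, tgt = placeable_ids(source), placeable_ids(target)
    return sorted((src - tgt).elements()), sorted((tgt - src).elements())


source = '<x id="XXX_1"/> / <x id="XXX_2"/> events succesfully imported'
print(check_placeables(source, "duh"))
# (['XXX_1', 'XXX_2'], [])
print(check_placeables(source, '<x id="XXX_1"/> / <x id="XXX_2"/> événements importés'))
# ([], [])
```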

@PowerKiKi
Contributor

I started working on this, but I would need a bit more guidance. I haven't written Python for years, and I really don't have a good understanding of how the data flows.

So far I overrode the methods as suggested, and completed XliffUnit.get_flags. So, given the following XLIFF unit:

<trans-unit id="id-1">
    <source xml:space="preserve">source "<x id="INTERPOLATION" equiv-text="{{ angularExpression }}"/>", source.</source>
    <target xml:space="preserve">target "<x id="INTERPOLATION" equiv-text="{{ angularExpression }}"/>", target.</target>
</trans-unit>

I ended up with the following content in the DB:

[image]

Is serializing StringElem that way a good idea? Which part should I edit to be able to unserialize properly? Would that be the role of a check?

Is there any documentation describing how the app internals work, to help a developer like me understand it?

Also @nijel, would you prefer that I create a work-in-progress PR to discuss this?

Member

nijel commented Oct 30, 2018

The easiest approach is probably to serialize it back to XML in get_source. The perfect solution would be to store the rich structure in the database and have some editor for it, but I'd suggest staying with the simple approach: store the XML in the database and let the user edit it with proper checking.
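
A minimal sketch of the write-back half of that simple approach, assuming the edited string is stored as an XML fragment: the fragment is parsed first (so malformed input is rejected) and then replaces the whole content of the target element rather than being prepended to it. This uses only the standard library and is not the code that was eventually merged; set_inner_xml is a hypothetical helper.

```python
import xml.etree.ElementTree as ET


def set_inner_xml(elem, fragment):
    """Replace all content of elem with the parsed XML fragment.

    Raises ET.ParseError if the fragment is not well-formed.
    """
    wrapper = ET.fromstring(f"<wrapper>{fragment}</wrapper>")
    # clear() also drops attributes and the tail, so save and restore them.
    attrib, tail = dict(elem.attrib), elem.tail
    elem.clear()
    elem.attrib.update(attrib)
    elem.tail = tail
    elem.text = wrapper.text
    elem.extend(list(wrapper))


target = ET.fromstring(
    '<target state="translated"><x id="XXX_1"/> / <x id="XXX_2"/> événements importés</target>'
)
set_inner_xml(target, "duh")
print(ET.tostring(target, encoding="unicode"))
# <target state="translated">duh</target>
```

Parsing before touching the element means a broken edit leaves the stored unit unchanged.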

As for docs there is https://docs.weblate.org/en/latest/contributing.html and https://docs.weblate.org/en/latest/internals.html, but these are far from being complete, sorry :-(.

PowerKiKi added a commit to PowerKiKi/weblate that referenced this issue Nov 29, 2018
This should allow translating units containing placeholders such
as the ones used by Angular i18n.

PoXliff will not support placeholders, because it probably doesn't
make sense to support XLIFF-specific placeholders in a format that
actually embeds PO placeholders.

Fixes WeblateOrg#490
Fixes WeblateOrg#1535

Signed-off-by: Adrien Crivelli <adrien.crivelli@gmail.com>
@nijel nijel added this to the 3.4 milestone Nov 30, 2018
@nijel nijel self-assigned this Nov 30, 2018
File format support automation moved this from TODO to Done Nov 30, 2018
nijel added a commit that referenced this issue Nov 30, 2018
Fixes #490
Fixes #1535

Signed-off-by: Michal Čihař <michal@cihar.com>
Member

nijel commented Nov 30, 2018

Thank you for your report, the issue you have reported has just been fixed.

  • In case you see a problem with the fix, please comment on this issue.
  • In case you see a similar problem, please open a separate issue.
  • If you are happy with the outcome, consider supporting Weblate by donating.

@ramyakrishnapandian


Hi Nijel,
Could you please help me translate a Business Central XLIFF file using Weblate?

Thanks in Advance

Ramya

Member

nijel commented Sep 4, 2020

@ramyakrishnapandian This issue should be fixed; if you see a similar issue, please open a separate one. For business inquiries, please write to sales@weblate.org (see https://weblate.org/support/).


sntxerror commented Nov 2, 2021

@nijel So, for now, is any XML tag inside source/target nodes supported? Or only the XLIFF x placeholder tags?

I mean, we are using XLIFF and we have strings like Plain text part <someTag attr1="value" /> other plaintext part, or even some table data in XML.

Should we add data-type="xml" to trans-unit? Or change something in our config?

Because it looks like we have to always escape xml brackets with &lt; and &gt; which annoys our editors and translators...
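
To make the escaping described here concrete, the following small sketch shows what such a string looks like once the angle brackets are escaped, and that the escaping is reversible. It uses Python's standard xml.sax.saxutils and only illustrates the string transformation, not anything Weblate does internally.

```python
from xml.sax.saxutils import escape, unescape

raw = 'Plain text part <someTag attr1="value" /> other plaintext part'

# escape() replaces &, < and > with entities; quotes are left as they are.
escaped = escape(raw)
print(escaped)
# Plain text part &lt;someTag attr1="value" /&gt; other plaintext part

# unescape() restores the original string.
print(unescape(escaped) == raw)
# True
```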

Member

nijel commented Nov 3, 2021

Maybe you're hitting #3081?
