New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Load original file metadata when loading Xliff 1.2 files #29148

Open
wants to merge 1 commit into
base: master
from

Conversation

Projects
None yet
3 participants
@eternoendless

eternoendless commented Nov 8, 2018

Q A
Branch? master
Bug fix? no
New feature? no
BC breaks? no
Deprecations? no
Tests pass? yes
Fixed tickets n/a
License MIT
Doc PR n/a

At PrestaShop, we maintain our translations catalog automatically using an internal tool based on our TranslationToolsBundle, which is capable of reverse building a MessageCatalogue by parsing the source code, and then saving it to Xliff files.

Currently, this tool is only capable of building catalogs from scratch. We are currently moving to an incremental catalog where we only add new wordings, and keep old ones even if they are no longer present in the code (because of B/C). To do that, instead of starting from a clean MessageCatalogue, we load our current catalog using XliffLoader, and use that MessageCatalogue as a base. Easy peasy. But then we found a problem...

The Xliff 1.2 standard defines a list of <trans-unit> elements within a collection of <file> elements. The <file> element has a required attribute named original, which is supposed to contain the name of the file where the wordings are used in (at least in our case it does). This attribute is currently ignored by XliffFileLoader.

This means that it's currently impossible to read a Xliff 1.2 file using XliffFileloader, and save it back to Xliff without losing data.

This Pull Request adds a new file element to the messages' metadata (alongside notes, target-attributes and id). Right now, it only contains original, but it could be extended to contain all the other attributes from the <file> element if needed.

This required a small change in the loader where we loop through <file> elements before fetching their <trans-unit> children, instead of fetching all <trans-unit> elements at once.

@nicolas-grekas

This comment has been minimized.

Member

nicolas-grekas commented Nov 8, 2018

This attribute is currently ignored by XliffFileLoader.

does it mean we have a spec violation? should we consider this a bug?

@eternoendless

This comment has been minimized.

eternoendless commented Nov 8, 2018

does it means we have spec violation?

I guess it depends on where you draw the line regarding the scope of XliffFileLoader and MessageCatalogue. Is it just about loading the translatables and making them available for the Translator? Then it's not a bug.

In my opinion, loading metadata like notes is a "nice to have" that actually goes beyond the scope of MessageCatalogue. It could be considered a "leak" of vendor-specific (Xliff in this case) data into MessageCatalogue. In the end I don't think it hurts, albeit for a little overhead when reading the files and a small increase in the catalogue's memory footprint.

// If the xlf file has another encoding specified, try to convert it because
// simple_xml will always return utf-8 encoded values
$target = $this->utf8ToCharset(
(string) (isset($translation->target) ? $translation->target : $source),

This comment has been minimized.

@nicolas-grekas

nicolas-grekas Dec 1, 2018

Member

please keep this call on one line - but use the ?? operator

@nicolas-grekas

LGTM with one minor comment thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment