String extraction breaks on non-unicode strings #190

Closed
pascalchevrel opened this Issue Mar 11, 2014 · 2 comments

@pascalchevrel
Mozilla Francophone member

If a string cannot be extracted as UTF-8, tmxmaker fails and stops processing that locale. That means we generate incomplete cache_xx.php arrays and incomplete XML files for TMX.

I just noticed this with Occitan on the release channel: one of the mail files is not encoded correctly, so only 8000 strings are extracted instead of 18000 (and even those are not usable because the generated data is broken).

We need to ignore these errors and just skip the wrongly encoded string, so that at least Transvision does not break on this content.
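
As a rough illustration of the "skip and keep going" approach, here is a minimal Python sketch. It is not the actual tmxmaker code and does not use silme: the function names (`extract_strings`, `write_php_cache`), the `(entity, raw bytes)` input shape, and the `$tmx` array layout are all assumptions made for the example. The point is only that a decoding error is caught per string, so the output array is always closed.

```python
def extract_strings(raw_entries):
    """Decode (entity_id, raw_bytes) pairs, skipping values that are not valid UTF-8.

    Hypothetical helper: returns (kept, skipped), where kept maps
    entity_id -> decoded text and skipped lists the rejected entity ids.
    """
    kept = {}
    skipped = []
    for entity_id, raw_value in raw_entries:
        try:
            kept[entity_id] = raw_value.decode("utf-8")
        except UnicodeDecodeError:
            # Skip the wrongly encoded string instead of aborting the locale.
            skipped.append(entity_id)
    return kept, skipped


def php_quote(text):
    """Escape text for use inside a single-quoted PHP string."""
    return text.replace("\\", "\\\\").replace("'", "\\'")


def write_php_cache(strings):
    """Render the decoded strings as a PHP array (assumed layout).

    The closing bracket is appended unconditionally, so the array is
    well-formed even when some strings were skipped upstream.
    """
    lines = ["<?php", "$tmx = ["]
    for entity_id, text in sorted(strings.items()):
        lines.append("'%s' => '%s'," % (php_quote(entity_id), php_quote(text)))
    lines.append("];")
    return "\n".join(lines)
```

For example, feeding it one Latin-1 encoded value among valid UTF-8 values would drop only that entry, and the skipped ids can be logged so the broken file can be reported to the locale team.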

@pascalchevrel pascalchevrel added the bug label Mar 11, 2014
@pascalchevrel pascalchevrel self-assigned this Mar 11, 2014
@pascalchevrel pascalchevrel added a commit that referenced this issue Mar 11, 2014
@pascalchevrel pascalchevrel Issue #190:
When using silme for string extraction, don't stop the script on errors, as that prevents the PHP array and the TMX file from being closed properly. Instead, just skip the string and keep indexing the repo.
9372a9c
@pascalchevrel
Mozilla Francophone member

Note that it also breaks the 'onestring' view on the release channel, since that view includes the broken Occitan data:
http://transvision.mozfr.org/string/?entity=browser/chrome/browser/aboutHome.dtd:abouthome.downloadsButton.label&repo=release

@pascalchevrel
Mozilla Francophone member

Merged. I also manually generated the Occitan files on the server from the transvision-beta repo, as a temporary fix for the broken onestring view.
