Rails yml import: the key 'one' is loaded as plural form 'zero' if the key 'zero' is missing #4484

Open
TruffeCendree opened this issue Sep 10, 2020 · 7 comments
Labels
backlog This is not on the Weblate roadmap for now. Can be prioritized by sponsorship. enhancement Adding or requesting a new feature. translate-toolkit Issues which need to be fixed in the translate-toolkit

Comments

@TruffeCendree

Describe the bug

I changed the plural count and formula of French to support the 'zero' case.

  • plural count: 3
  • plural formula: (n == 0) ? 0 : (n == 1 ? 1 : 2)
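As a quick sanity check, the formula above can be restated in Python. This is only a sketch: the function name `plural_index` is made up for illustration and is not part of Weblate.

```python
def plural_index(n: int) -> int:
    """Python restatement of (n == 0) ? 0 : (n == 1 ? 1 : 2)."""
    if n == 0:
        return 0  # form 0: zero
    return 1 if n == 1 else 2  # form 1: one, form 2: other


for n in (0, 1, 2, 5):
    print(n, plural_index(n))
```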

Then, I load the following rails i18n file:

fr:
  successful_key:
    one: Un événement
    other: '%{count} événements'
    zero: Aucun événement
  failing_key:
    one: Un participant
    other: '%{count} participant'

The first key, successful_key, is parsed correctly. But the second key, failing_key, is loaded as if the input were:

fr:
  failing_key:
    one: '%{count} participant'
    other: 
    zero: Un participant

In other words, if the 'zero' key is missing, the value of the 'one' key is used for the plural form n = 0.

To Reproduce

  • create a new project
  • update the locale properties with plural count: 3 and plural formula: (n == 0) ? 0 : (n == 1 ? 1 : 2)
  • import the example file above
  • see the imported result

Expected behavior

If the 'zero' key is missing during import, the other plural forms should not be corrupted.
Rails allows the 'zero' key to be omitted, so we only set it when necessary.

Screenshots

[Screen recording: Peek 10-09-2020 16-40]

Server configuration and status

  • Weblate: 4.2.2
  • Django: 3.1.1
  • siphashc: 1.3
  • Whoosh: 2.7.4
  • translate-toolkit: 3.0.0
  • lxml: 4.5.2
  • Pillow: 7.2.0
  • bleach: 3.1.5
  • python-dateutil: 2.8.1
  • social-auth-core: 3.3.3
  • social-auth-app-django: 4.0.0
  • django-crispy-forms: 1.9.2
  • oauthlib: 3.1.0
  • django-compressor: 2.4
  • djangorestframework: 3.11.1
  • django-filter: 2.3.0
  • django-appconf: 1.0.4
  • user-agents: 2.1
  • filelock: 3.0.12
  • setuptools: 40.8.0
  • jellyfish: 0.8.2
  • openpyxl: 3.0.5
  • celery: 4.4.7
  • kombu: 4.6.11
  • translation-finder: 2.1
  • html2text: 2020.1.16
  • pycairo: 1.16.2
  • pygobject: 3.30.4
  • diff-match-patch: 20200713
  • requests: 2.24.0
  • django-redis: 4.12.1
  • hiredis: 1.1.0
  • sentry_sdk: 0.16.5
  • Cython: 0.29.21
  • misaka: 2.1.1
  • GitPython: 3.1.7
  • borgbackup: 1.1.13
  • pyparsing: 2.4.7
  • Python: 3.7.3
  • Git: 2.20.1
  • psycopg2: 2.8.5
  • psycopg2-binary: 2.8.5
  • phply: 1.2.5
  • chardet: 3.0.4
  • ruamel.yaml: 0.16.10
  • tesserocr: 2.5.1
  • akismet: 1.1
  • boto3: 1.14.53
  • zeep: 3.4.0
  • aeidon: 1.7.0
  • iniparse: 0.5
  • mysqlclient: 2.0.1
  • Mercurial: 5.5.1
  • git-svn: 2.20.1
  • git-review: 1.28.0
  • hub: 2.13.0
  • lab: 0.16
  • Redis server: 4.0.14
  • PostgreSQL server: 11.9
  • Database backends: django.db.backends.postgresql
  • Cache backends: default:RedisCache, avatar:FileBasedCache
  • Email setup: django.core.mail.backends.smtp.EmailBackend: 127.0.0.1
  • OS encoding: filesystem=utf-8, default=utf-8
  • Celery: redis://cache:6379/1, redis://cache:6379/1, regular
  • Platform: Linux 5.3.0-1035-aws (x86_64)
@TruffeCendree changed the title from "Rails yml import: the key 'one' is loaded as plural form 'zero' is the key 'zero' is missing" to "Rails yml import: the key 'one' is loaded as plural form 'zero' if the key 'zero' is missing" on Sep 10, 2020
@nijel
Member

nijel commented Sep 10, 2020

The parser in translate-toolkit has no knowledge of the plurals; it only assumes that CLDR ordering is used. So there is probably no way to fix this other than adding the missing keys to the file.

See https://github.com/translate/translate/blob/a88557a4a00eec3cc153b1c6edaf31ab377ebe16/translate/storage/yaml.py#L221-L227 for parsing code.
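To make the effect concrete, here is a simplified model of positional plural handling. It is not the actual translate-toolkit code: the helper names `flatten_plural` and `assign_by_position` are hypothetical, and the model assumes only what is described above (values are ordered by CLDR category and the keys themselves are discarded).

```python
# CLDR category order assumed by the parser.
CLDR_ORDER = ["zero", "one", "two", "few", "many", "other"]


def flatten_plural(entry: dict) -> list:
    """Keep only the values, ordered by CLDR category; the keys are lost."""
    return [entry[cat] for cat in CLDR_ORDER if cat in entry]


def assign_by_position(values: list, plural_count: int) -> list:
    """Fill plural forms 0..plural_count-1 in order, padding with ''."""
    return values + [""] * (plural_count - len(values))


failing_key = {"one": "Un participant", "other": "%{count} participant"}
forms = assign_by_position(flatten_plural(failing_key), plural_count=3)
print(forms)  # ['Un participant', '%{count} participant', '']
```

With the three-form formula from the report, form 0 is used for n = 0, so "Un participant" ends up as the zero form, the 'other' text becomes the 'one' form, and form 2 stays empty.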

@TruffeCendree
Author

Thanks for your help.

Forcing users to always specify the 'zero' key is a bit tedious. In our case, it means adding a thousand useless strings to enable the feature for a few keys. The alternatives are to stop using the zero quantifier or to patch the parser/serializer (we use the Docker image with the auto-update script, so that would be more difficult).

This issue affects the default configuration of Weblate too. With plural_count = 2 for French, having only the key 'other' (without 'one') will silently corrupt the loaded hash. It happened to us in production.

There is a real risk of corrupting imported files, especially when these files were hand-crafted or exported from localeapp.
If the parser still relies exclusively on CLDR ordering, I recommend adding some assertions. For example, the parser could check that all pluralized strings contain the same keys. For production use, I prefer an explicit error message to silent corruption.
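A minimal sketch of such a check, assuming the expected set of plural keys for the language is known up front. The function `check_plural_keys` is hypothetical and does not exist in translate-toolkit.

```python
def check_plural_keys(entry: dict, expected: set) -> None:
    """Raise ValueError if a plural entry lacks or adds plural keys."""
    found = set(entry)
    if found != expected:
        missing = sorted(expected - found)
        extra = sorted(found - expected)
        raise ValueError(
            f"plural keys mismatch: missing={missing}, extra={extra}"
        )


expected = {"zero", "one", "other"}
check_plural_keys(
    {"zero": "Aucun événement", "one": "Un événement", "other": "%{count} événements"},
    expected,
)  # passes silently

try:
    check_plural_keys(
        {"one": "Un participant", "other": "%{count} participant"}, expected
    )
except ValueError as error:
    print(error)  # plural keys mismatch: missing=['zero'], extra=[]
```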

@nijel
Member

nijel commented Sep 12, 2020

Can you please describe how the consumer determines which key to use? Does it define the custom plural equation as well? I'm just trying to understand how you use the resulting files, because that's an important part of the picture as well.

Having a "one" key doesn't really define which counts it should be used for. For some languages it's really just for one, but there are languages where this form is used for all numbers ending in one (for example https://unicode-org.github.io/cldr-staging/charts/37/supplemental/language_plural_rules.html#be). That's why it's a good idea to have a consistent definition of the plurals.

@TruffeCendree
Author

Oh... You have a hard job supporting all these locales 👍

Our usage

The current implementation works pretty well for the nominal case, so it is important to keep it easy to use.
I wrote some monkey patches to fit our needs, which are covered by this issue and by #4487.

These patches change two behaviours:

  • built-in plural counts and formulae should not be overridden at startup.
  • the YAML parser and serializer should map 'zero' => 0, 'one' => 1 and 'other' => 2 (and vice versa) because that matches my custom plural formula. The serializer does not export the 'zero' key when it is not defined (possible because zero is not a CLDR category for French and English).
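The second behaviour can be sketched roughly as follows. This is not the actual monkey patch, only an illustration of the key-based mapping described above; `PLURAL_MAP`, `OPTIONAL`, `load_plural` and `dump_plural` are names invented for this example.

```python
# Explicit mapping between YAML plural keys and plural form indexes.
PLURAL_MAP = {"zero": 0, "one": 1, "other": 2}
# Keys that may be left out of the exported file when they have no text.
OPTIONAL = {"zero"}


def load_plural(entry: dict) -> list:
    """Place each text at the index of its key; missing keys stay empty."""
    forms = [""] * len(PLURAL_MAP)
    for key, text in entry.items():
        forms[PLURAL_MAP[key]] = text
    return forms


def dump_plural(forms: list) -> dict:
    """Rebuild the keyed entry, skipping optional keys that are empty."""
    entry = {}
    for key, index in PLURAL_MAP.items():
        if key in OPTIONAL and not forms[index]:
            continue
        entry[key] = forms[index]
    return entry


failing_key = {"one": "Un participant", "other": "%{count} participant"}
forms = load_plural(failing_key)
print(forms)               # ['', 'Un participant', '%{count} participant']
print(dump_plural(forms))  # same content as failing_key, without 'zero'
```

Because the lookup goes through the key rather than the position, a missing 'zero' leaves form 0 empty instead of shifting the other forms.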

Example of usage
Let's imagine a text that shows the total number of users.

In our Ruby code, we would write:

I18n.t('controllers.users.show_count', count: User.count)

With Weblate as it works today, en.yml could contain:

en:
  controllers:
    users:
      show_count:
        one: '%{count} user confirmed its account.'
        other: '%{count} users confirmed their accounts.'

If User.count = 0, English uses the 'other' form, so the resulting string is '0 users confirmed their accounts.'

But Ruby on Rails also allows writing:

en:
  controllers:
    users:
      show_count:
        zero: 'Nobody registered at this time'
        one: 'One user confirmed its account.'
        other: '%{count} users confirmed their accounts.'

In this case, the 'zero' key is not required by CLDR, since neither English nor French has a separate plural form for 0. It is useful for writing more natural text.

Some ideas for a generic implementation

Configuration elements could be:

  • for the locale: plural count and formulae
  • for the storage, an explicit map between keys and results of plural formulae.

I think that storage options should be customizable on a "per project and format" basis.

Considering your example with 'one' defined as any number ending with '1', the default configuration (without zero) could be:

  • plural count = 4 (one + few + many + other)
  • plural formula = (n % 10 == 1 && n % 100 != 11) ? 0 : (...other cases...)
  • plural map (configured for yaml storage) = { 0 => 'one', 1 => 'few', 2 => 'many', 3 => 'other' }

If the customer needs to support zero, they could change the settings to:

  • plural count = 5 (zero + one + few + many + other)
  • plural formula = (n == 0) ? 0 : ((n % 10 == 1 && n % 100 != 11) ? 1 : (...other cases...))
  • plural map (configured for yaml storage) = { 0 => 'zero', 1 => 'one', 2 => 'few', 3 => 'many', 4 => 'other' }
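The step from the first configuration to the second is mechanical: zero becomes form 0 and every existing form shifts up by one. A small sketch of that idea follows, using the two-form English gettext formula (n != 1) as the base because the formula above is elided. The helper `with_zero` is hypothetical.

```python
from typing import Callable


def with_zero(base: Callable[[int], int]) -> Callable[[int], int]:
    """Wrap a plural formula so n == 0 gets form 0 and the rest shift by one."""
    def formula(n: int) -> int:
        return 0 if n == 0 else base(n) + 1
    return formula


def english(n: int) -> int:
    """Two-form gettext formula (n != 1): form 0 is 'one', form 1 is 'other'."""
    return int(n != 1)


english_keys = ["one", "other"]

english_with_zero = with_zero(english)
keys_with_zero = ["zero"] + english_keys

for n in (0, 1, 2):
    print(n, keys_with_zero[english_with_zero(n)])
```

The plural map is extended the same way, by putting 'zero' in front of the existing keys.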

And finally, in the locale configuration, we could mark 'zero' as optional. In Weblate, this would allow translating a key even when 'zero' is not in the source. In translate-toolkit, it would export the file without the 'zero' key and let the software gracefully fall back to the 'one' or 'other' key.

This covers our needs, but I don't have enough experience with other locales to understand all the implications.
One drawback of this proposal is the complexity it adds to the project. I have no idea how many people would benefit from it compared with the cost of developing and maintaining it.

I hope this will help you :-)

@nijel
Member

nijel commented Sep 14, 2020

Okay, so this is specific to the Ruby i18n module and applies only to the zero case. See https://github.com/ruby-i18n/i18n/blob/1b5e34553003ca3b42b842769e86c98d5e3b71d4/lib/i18n/backend/pluralization.rb#L27-L30:

The :zero key is always picked directly when count equals 0 AND the translation data has the key :zero. This way translators are free to either pick a special :zero translation even for languages where the pluralizer does not return a :zero key.
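The quoted rule can be modelled in a few lines of Python. This is a simplified model of the behaviour described in the comment, not a port of the Ruby code; `pick_plural` and `english_pluralizer` are hypothetical names.

```python
def english_pluralizer(count: int) -> str:
    """Return the plural key for English: 'one' for 1, 'other' otherwise."""
    return "one" if count == 1 else "other"


def pick_plural(entry: dict, count: int, pluralizer=english_pluralizer) -> str:
    """Prefer 'zero' when count is 0 and the key exists, else ask the pluralizer."""
    key = "zero" if count == 0 and "zero" in entry else pluralizer(count)
    if key not in entry:
        raise KeyError(f"missing plural key {key!r} for count {count}")
    return entry[key]


with_zero_key = {
    "zero": "Nobody registered at this time",
    "one": "One user confirmed its account.",
    "other": "%{count} users confirmed their accounts.",
}
without_zero_key = {k: v for k, v in with_zero_key.items() if k != "zero"}

print(pick_plural(with_zero_key, 0))     # Nobody registered at this time
print(pick_plural(without_zero_key, 0))  # %{count} users confirmed their accounts.
```

The 'zero' key is an optional override on the consumer side, which is why a file may legitimately contain it for some strings and not for others.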

To address this behavior properly in Weblate, the following changes would be needed:

  • Support differentiating the plural formula on a per-string basis (either on user request or based on the source language; the latter would not need a user interface).
  • Translate-toolkit needs to report this situation somehow. It only returns a list of strings right now. Weblate might assume that if there is no zero in the plural forms and there is one extra string, it is this case.
  • Translate-toolkit needs to be able to serialize strings with an additional zero. Again, this could be based on the string count and the lack of zero in the per-language tags.
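The heuristic from the second point could look roughly like this. It is a sketch under the assumption that the strings arrive in CLDR order, so an extra 'zero' string would come first; `split_extra_zero` is a hypothetical helper that exists in neither Weblate nor translate-toolkit.

```python
from typing import List, Optional, Tuple


def split_extra_zero(
    strings: List[str], categories: List[str]
) -> Tuple[Optional[str], List[str]]:
    """Return (zero_text, regular_strings); zero_text is None if not detected."""
    has_extra = len(strings) == len(categories) + 1
    if "zero" not in categories and has_extra:
        # 'zero' sorts first in CLDR order, so the extra string is at index 0.
        return strings[0], strings[1:]
    return None, strings


french = ["one", "other"]
print(split_extra_zero(["Aucun événement", "Un événement", "%{count} événements"], french))
print(split_extra_zero(["Un participant", "%{count} participant"], french))
```

The guess is only unambiguous when exactly one string is extra and the language itself has no 'zero' category.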

Given that it looks like a complex change, I'm not sure it's worth the effort...

@nijel added the backlog, enhancement, and translate-toolkit labels on Sep 14, 2020
@github-actions

The issue you've reported needs to be addressed in the translate-toolkit. Please file the issue there and do not forget to include links to any relevant specifications about the formats (if applicable).

@github-actions

This issue has been added to the backlog. It is not scheduled on our roadmap, but it eventually might be implemented. In case you desperately need this feature, please consider helping or funding the development.
