New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Add helper for plural rules placeholders #131
Conversation
357f0c8
to
d18ea0c
Compare
78f4003
to
2c2e449
Compare
openformats/utils/icu.py
Outdated
corresponding strings | ||
:rtype: str | ||
""" | ||
return self.serialize_strings(pluralized_string.string, delimiter) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
What is the purpose of this method? It just calls another
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It was used by the JSON handler. You are right, it doesn't make any sense to keep it.
openformats/utils/icu.py
Outdated
allow_numeric_plural_values=True): | ||
"""Update the given content so that it contains all | ||
necessary placeholders the target language supports, | ||
no more, no less. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The "no more, no less" part can be removed. Or you can specify that this happens "independently of the source language's rules"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I liked the "no more, no less" part, but I like your suggestion more.
openformats/utils/icu.py
Outdated
offset += len(line) | ||
if separator_pos > -1: | ||
offset += sep_length | ||
continue # noqa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Why the # noqa
here?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
(this was actually replaced with a # pragma: no cover
on the next commit, I assume you had commented on the commit diff only)
The reason this was added is that during testing, execution never reached that point. I spent 1-2 hours on this and I am pretty sure that the code was actually reached, but because of the way continue
is handled in terms of execution, the coverage script did not understand that. I had to add the no cover
mark, otherwise the coverage fell to a point that the PR was blocked from merging.
openformats/utils/icu.py
Outdated
offset += len(line) | ||
if separator_pos > -1: | ||
offset += sep_length | ||
continue # noqa |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
The 4 lines above are repeated. Should we move them in a different helper and re-use this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
If we made such a helper, we would have to pass 3 arguments (line
or its length, separator_pos
and sep_length
) in order to make it work. And we would still have to manually call continue
. So I don't think it makes much sense, it would be just noise.
openformats/utils/icu.py
Outdated
strings_by_rule = { | ||
rule: hash_str + str(index) | ||
for index, rule in enumerate(target_plural_forms) | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Since this method is big, should we move the above 3 commands to a different method?
openformats/utils/icu.py
Outdated
# } | ||
hash_str = icu_string.strings_by_rule[5] | ||
hash_str = hash_str[:-1] # remove the last char, which is the plural index | ||
strings_by_rule = { |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think it's better to rename this as hashes_by_rule
or plural_placeholders
. There are no actual strings there.
openformats/utils/icu.py
Outdated
|
||
# If there are no more lines, stop iterating | ||
if offset > 0 and separator_pos == -1: | ||
break |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
We could make the code more descriptive here. Something like:
no_more_lines = offset > 0 and separator_pos == -1
if no_more_lines:
break
Should we move this to the end of the loop and change the while
condition to something like: while more_lines
?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried it and I prefer the current way.
Although I would prefer to not have while True
myself, changing the logic to while <condition>
requires significant changes due to the the 2 continue
blocks we have. I'm reluctant to make such as change at this point.
self.assertSetEqual( | ||
set(icu_string.strings_by_rule.keys()), | ||
rule_set, | ||
) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that the following cases are missing from the unittests:
- A case with
\r\n
as separator - A case with a line without braces
- A case with a line with braces but without ICU contents
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sounds good, I'll amend the tests.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Update: the previous implementation made assumptions about the file format (assumed key-value syntax). The functionality of the ICU methods and functions is now specific to ICU syntax only, agnostic to any specific file format. Therefore, the tests mentioned above are no longer valid.
u'keyE=brace {\n' | ||
) | ||
|
||
def test_with_alternate_newline_char(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think we should use "alternative", not "alternate"
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Test was removed.
openformats/utils/icu.py
Outdated
for a specific set of plural rules. This works regardless | ||
how many plural rules are found in `icu_string` (source language | ||
string) and how many languages are found in `plural_rules` | ||
(target language). The reason this works is that all placholders |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Typo: placholders
serialized, | ||
) | ||
|
||
def test_with_less_target_languages(self): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think that this test should be named test_with_less_target_plural_rules
, not languages
. Also, the plural rules are equal in quantity but different. Should we add some more tests on this?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Renamed to test_with_less_target_plurals
. Added an additional test for equal number of rules, but different types.
Add a helper function that replaces all serialized plural rule placeholders (hashes), that correspond to the source language, with those that correspond to the given target language rules. This is useful when the template does not serialize all pluralized content for a string (e.g. "{count, plural, <hash>}"), but rather adds one hash placeholder for each rule (e.g. "{count, plural, one {<hash1>} other {<hash2>}"). In the latter case, when compiling, the existing source placeholders need to be removed completely, and the target placeholders need to appear in their place.
86a699f
to
fda56ed
Compare
Checklist (for the reviewer)
Problem and/or solution
Add a helper function that replaces all serialized plural rule placeholders (hashes), that correspond to the source language, with those that correspond to the given target language rules.
This is useful when the template does not serialize all pluralized content for a string (e.g. "{count, plural, }"), but rather adds one hash placeholder for each rule (e.g. "{count, plural, one {} other {}").
In the latter case, when compiling, the existing source placeholders need to be removed completely, and the target placeholders need to appear in their place.