Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

java-format check should be enabled by default on Android strings.xml format #4623

Closed
eighthave opened this issue Sep 30, 2020 · 12 comments
Closed
Assignees
Labels
enhancement Adding or requesting a new feature.
Milestone

Comments

@eighthave
Copy link
Contributor

The java-format check is described as checking "Java format string does not match source", but it is failing to catch issues. For example, these are not flagged:

https://hosted.weblate.org/translate/calyxos/seedvault/fr/?checksum=c124a012361960e
source: Note: Your %1$s needs to be plugged in for this to work.
translation: Remarque : votre %1$ doit être branché pour que ceci fonctionne.
problem: missing s

https://hosted.weblate.org/translate/f-droid/f-droid/ar/?checksum=87fd8aaaedef575f
source: Disabled %1$s.
translation: تم تعطيل ٪1$s.
problem: uses Arabic ٪ instead of ASCII %

Expected behavior

Should flag those issues in the standard check method.

Screenshots

Screenshot from 2020-09-30 12-11-12
Screenshot from 2020-09-30 12-11-27

Server configuration and status

hosted.weblate.org

@eighthave
Copy link
Contributor Author

I have a simple Python version of this check working in fdroidclient, you can use it under any FSF-free software license you need.

https://gitlab.com/fdroid/fdroidclient/-/blob/1.9/tools/check-format-strings.py

#!/usr/bin/env python3

# Remove extra translations

import glob
import os
import sys
import re
from xml.etree import ElementTree

formatRe = re.compile(r'(%%|%[^%](\$.)?)')

simpleFormatRe = re.compile(r'%[ds]')
numberedFormatRe = re.compile(r'%[0-9]+\$')
validFormatRe = re.compile(r'^(%%|%[sd]|%[0-9]\$[sd])$')
oddQuotingRe = re.compile(r'^"\s*(.+?)\s*"$')

resdir = os.path.join(os.path.dirname(__file__), '..', 'app', 'src', 'main', 'res')

count = 0

for d in sorted(glob.glob(os.path.join(resdir, 'values-*'))):

    str_path = os.path.join(d, 'strings.xml')
    if not os.path.exists(str_path):
        continue

    with open(str_path, encoding='utf-8') as fp:
        fulltext = fp.read()

    tree = ElementTree.parse(str_path)
    root = tree.getroot()

    for e in root.findall('.//string') + root.findall('.//item'):
        if e.tag == "item" and e.text is None:
            continue

        found_simple_format = False  # %s
        found_numbered_format = False  # %1$s
        if e.text is None:
            fulltext = re.sub(r' *<string name="' + e.attrib['name'] + r'"> *</string> *\n?', '', fulltext)
            print('%s: %s is blank!' % (str_path, e.attrib['name']))
            count += 1
            continue
        for m in formatRe.finditer(e.text):
            s = m.group(0)
            if simpleFormatRe.match(s):
                found_simple_format = True
            if numberedFormatRe.match(s):
                found_numbered_format = True
            if validFormatRe.match(s):
                continue
            count += 1
            print('%s: Invalid format "%s" in "%s"' % (str_path, s, e.text))

        if found_simple_format and found_numbered_format:
            count += 1
            print('%s: Invalid mixed formats "%s" in "%s"' % (str_path, s, e.text))

        m = oddQuotingRe.search(e.text)
        if m:
            print('%s: odd quoting in %s' % (str_path, e.tag))
            print('found', fulltext.rfind(e.text))
            fulltext = fulltext.replace(e.text, m.group(1))
            count += 1
            if e.text != m.group(1):
                print(e.text, '-=<' + m.group(1) + '>=-')

    with open(str_path, 'w', encoding='utf-8') as fp:
        fp.write(fulltext)

if count > 0:
    print("%d misformatted strings found!" % count)
    sys.exit(1)

@grote

@eighthave
Copy link
Contributor Author

I just looked at the Weblate source. My guess is that the problem is that the Weblate check does not make a list of the formats that are in the source string, then compare that to the list of the formats in the translation. Basically, if %1$s exists in the source, it must also exist in the translation.

@nijel
Copy link
Member

nijel commented Oct 1, 2020

You first need to tell Weblate using a flag that this expects to be a format string. See https://docs.weblate.org/en/latest/user/checks.html#formatted-strings

I don't think that all Android strings have to be used as format string, so it doesn't make sense to enable it automatically for all strings.

@eighthave
Copy link
Contributor Author

Oh, its not enabled by default? Sorry this whole bug report was based on my assumption that it was enabled by default.

I think it should be enabled by default on all Android strings since strings there is no separate type for string formats, they are just plain <string name=""> entries, without any special marking. (I'd say the same for Python with gettext/babel). I think it would be extremely rare to have an Android string with %1$s in it that was not meant to be a format. Even %s or %d would be rare.

It would be quite laborious to manually mark all the Android strings as containing a format, and there is no official way to do that. I guess it is easy to add java-format globally per-project, but that still requires someone to know about that. IIRC, there are some translation formats that automatically enable checks.

@chirayudesai
Copy link

Additionally, we enabled for the 'calyxos/seedvault' project, I see 'java-format' enabled in Settings.

However, I don't see it under https://hosted.weblate.org/checks/?project=calyxos&component=seedvault nor do I see any errors at the string in question - https://hosted.weblate.org/translate/calyxos/seedvault/fr/?checksum=c124a012361960e

I do get a warning if I try to change the punctuation, from Remarque : votre to Remarque: votre - removing the space before the colon.

@eighthave
Copy link
Contributor Author

Ok, I looked at Python+gettext code, it looks like gettext actually adds the python-brace-format and python-format to the .po files per-string. The Android strings.xml format doesn't define any method to tag a string at all, and the Android tools do not add any tags. So this is different than Python or other gettext things. The end effect is that Python+gettext format strings are automatically set up with the checks. I think we want Android to also be automatically setup with the checks, and the only way that seems to be possible is to add it globally by default for the Android strings.xml format.

@chirayudesai I think the checks are only run when the strings are edited, so enabling the checks won't take effect on existing strings.

@nijel
Copy link
Member

nijel commented Oct 2, 2020

I think we want Android to also be automatically setup with the checks, and the only way that seems to be possible is to add it globally by default for the Android strings.xml format.

That's easily possible, I just don't have the knowledge to decide whether it's really desired - whether the strings are always treated as format strings. Weblate already does this for some formats:

check_flags = ("i18next-interpolation",)

I think the checks are only run when the strings are edited, so enabling the checks won't take effect on existing strings.

The checks should be updated in background, though I don't recall whether it's scheduled immediately.

@nijel
Copy link
Member

nijel commented Oct 2, 2020

Additionally, we enabled for the 'calyxos/seedvault' project, I see 'java-format' enabled in Settings.

It has to be added to flags to trigger the check. You've added it to enforced checks, which defines which checks can not be ignored by users.

I've done the change now and the check now properly fires.

@eighthave
Copy link
Contributor Author

That's easily possible, I just don't have the knowledge to decide whether it's really desired - whether the strings are always treated as format strings. Weblate already does this for some formats:

I'm now 100% sure that the Android strings.xml should automatically enable the java-format check. I've run that checker script on fdroidclient for three years now, and I handle Weblate for a number of other Android apps where Android's lint reports these errors. I cannot recall a single false positive from this checkers.

@eighthave eighthave changed the title java-format check failing on Android strings.xml format java-format check should be enabled by default on Android strings.xml format Oct 2, 2020
@nijel nijel self-assigned this Oct 2, 2020
@nijel nijel added the enhancement Adding or requesting a new feature. label Oct 2, 2020
@nijel nijel added this to the 4.3 milestone Oct 2, 2020
@nijel nijel closed this as completed in fe86811 Oct 2, 2020
@github-actions
Copy link

github-actions bot commented Oct 2, 2020

Thank you for your report, the issue you have reported has just been fixed.

  • In case you see a problem with the fix, please comment on this issue.
  • In case you see a similar problem, please open a separate issue.
  • If you are happy with the outcome, consider supporting Weblate by donating.

@mikeNG
Copy link

mikeNG commented Nov 3, 2020

We are still getting broken strings as you can see in seedvault-app/seedvault@e9a3d1c#diff-7049c01dbaca09e41462205bc840bb2bdf6766f0b832e1ff68059c7fdfafadc8R73

The strings look correct in weblate interface but not in the exported git commits.

@nijel
Copy link
Member

nijel commented Nov 4, 2020

@mikeNG That seems completely unrelated here, this issue is about quality checks enabled by default for Android formats. The issue you are hitting is most likely addressed by translate/translate#4137

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement Adding or requesting a new feature.
Projects
None yet
Development

No branches or pull requests

4 participants