gettext bug while parsing plural-forms metadata #62098

Closed
straz mannequin opened this issue May 3, 2013 · 11 comments

@straz
Mannequin

straz mannequin commented May 3, 2013

BPO 17898
Nosy @akuchling, @terryjreedy, @nedbat, @ned-deily, @bitdancer
Files
  • pybug.tar.gz: minimal reproducible test case
  • issue17898.patch: Patch with fix and test
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/akuchling'
    closed_at = <Date 2015-04-14.14:37:15.653>
    created_at = <Date 2013-05-03.18:19:46.302>
    labels = ['type-bug', 'library']
    title = 'gettext bug while parsing plural-forms metadata'
    updated_at = <Date 2015-04-14.14:37:15.653>
    user = 'https://bugs.python.org/straz'

    bugs.python.org fields:

    activity = <Date 2015-04-14.14:37:15.653>
    actor = 'akuchling'
    assignee = 'akuchling'
    closed = True
    closed_date = <Date 2015-04-14.14:37:15.653>
    closer = 'akuchling'
    components = ['Library (Lib)']
    creation = <Date 2013-05-03.18:19:46.302>
    creator = 'straz'
    dependencies = []
    files = ['30206', '38915']
    hgrepos = ['304']
    issue_num = 17898
    keywords = ['patch']
    message_count = 11.0
    messages = ['188318', '188323', '188324', '188866', '188873', '240613', '240629', '240632', '240676', '240891', '240893']
    nosy_count = 7.0
    nosy_names = ['akuchling', 'terry.reedy', 'nedbat', 'ned.deily', 'r.david.murray', 'python-dev', 'straz']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue17898'
    versions = ['Python 2.7', 'Python 3.4', 'Python 3.5']

    @straz
    Mannequin Author

    straz mannequin commented May 3, 2013

    The gettext.py parser used by Django (lib/python2.7/gettext.py) has a bug in
    GNUTranslations._parse(), around line 313: the loop "for item in tmsg.splitlines():"
    does not start each iteration with clean values for k,v, so stale values carry over from the previous header line.

    To reproduce the problem (see traceback, below), try parsing a .PO file containing two headers like this, with a comment header immediately following a plurals header. This example was created by calling msgcat to combine several .po files into a single .po file. Msgcat inserted the comment line.

    "Plural-Forms: nplurals=2; plural=(n != 1);\n"
    "#-#-#-#-# messages.po (EdX Studio) #-#-#-#-#\n"

    Parsing the first header binds the inner loop variables:
    k= plural-forms v= ['nplurals=2', ' plural=(n != 1)', '']

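    The second header contains no colon, so parsing it leaves k,v untouched: k is still
    'plural-forms' and v is still the list from the previous iteration. The parser then improperly
    repeats the plural-forms handling (since the line is a comment, no further parsing of k,v should occur) and fails at:
    v = v.split(';')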

    Bug workaround: I use polib to read and immediately save the file. This reorders the metadata to avoid presenting the parser with something that will break it.

    Recommended bug fix: on each iteration over tmsg.splitlines, reset the values of k,v = (None, None)
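To make the failure mode concrete, here is a minimal sketch of the header loop. It is a simplified stand-in written for illustration, not the actual body of GNUTranslations._parse(): the function name parse_headers and the reset_each_iteration flag are hypothetical, and only the plural-forms branch is modelled. With the flag off, the stale list left in v triggers the AttributeError from the traceback; with the flag on, the comment line is treated as a continuation and nothing is re-split.

```python
def parse_headers(tmsg, reset_each_iteration):
    """Simplified stand-in for the catalog-header loop (illustration only)."""
    info = {}
    plural_expr = None
    lastk = k = v = None
    for item in tmsg.splitlines():
        item = item.strip()
        if not item:
            continue
        if reset_each_iteration:
            # The recommended fix: forget the previous line's key and value.
            k = v = None
        if ':' in item:
            k, v = item.split(':', 1)
            k = k.strip().lower()
            v = v.strip()
            info[k] = v
            lastk = k
        elif lastk:
            # No colon: treat the line as a continuation of the last header.
            info[lastk] += '\n' + item
        if k == 'plural-forms':
            v = v.split(';')  # v becomes a list here
            plural_expr = v[1].split('plural=')[1]
    return info, plural_expr


HEADER = (
    "Plural-Forms: nplurals=2; plural=(n != 1);\n"
    "#-#-#-#-# messages.po (EdX Studio) #-#-#-#-#\n"
)

# Without the reset, the second line reuses k == 'plural-forms' and a list v.
try:
    parse_headers(HEADER, reset_each_iteration=False)
except AttributeError as exc:
    print("unfixed loop failed:", exc)

# With the reset, the comment line no longer re-enters the plural-forms branch.
info, plural_expr = parse_headers(HEADER, reset_each_iteration=True)
print("fixed loop parsed plural expression:", plural_expr)
```

The sketch keeps the comment line as part of the plural-forms value, which is only one possible way to handle it; a parser could equally choose to skip such marker lines.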

    --------------------
    Traceback:
    File "/Users/sstrassmann/src/mitx_all/python/lib/python2.7/site-packages/django/core/handlers/base.py" in get_response
    89. response = middleware_method(request)
    File "/Users/sstrassmann/src/mitx_all/python/lib/python2.7/site-packages/django/middleware/locale.py" in process_request
    24. translation.activate(language)
    File "/Users/sstrassmann/src/mitx_all/python/lib/python2.7/site-packages/django/utils/translation/__init__.py" in activate
    105. return _trans.activate(language)
    File "/Users/sstrassmann/src/mitx_all/python/lib/python2.7/site-packages/django/utils/translation/trans_real.py" in activate
    201. _active.value = translation(language)
    File "/Users/sstrassmann/src/mitx_all/python/lib/python2.7/site-packages/django/utils/translation/trans_real.py" in translation
    191. current_translation = _fetch(language, fallback=default_translation)
    File "/Users/sstrassmann/src/mitx_all/python/lib/python2.7/site-packages/django/utils/translation/trans_real.py" in _fetch
    180. res = _merge(localepath)
    File "/Users/sstrassmann/src/mitx_all/python/lib/python2.7/site-packages/django/utils/translation/trans_real.py" in _merge
    156. t = _translation(path)
    File "/Users/sstrassmann/src/mitx_all/python/lib/python2.7/site-packages/django/utils/translation/trans_real.py" in _translation
    138. t = gettext_module.translation('django', path, [loc], DjangoTranslation)
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gettext.py" in translation
    480. t = _translations.setdefault(key, class_(fp))
    File "/Users/sstrassmann/src/mitx_all/python/lib/python2.7/site-packages/django/utils/translation/trans_real.py" in __init__
    76. gettext_module.GNUTranslations.__init__(self, *args, **kw)
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gettext.py" in __init__
    180. self._parse(fp)
    File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gettext.py" in _parse
    315. v = v.split(';')

    Exception Type: AttributeError at /
    Exception Value: 'list' object has no attribute 'split'

    @straz straz mannequin added the stdlib (Python modules in the Lib dir) and type-bug (An unexpected behavior, bug, or error) labels May 3, 2013

    @bitdancer
    Member

    Does this bear any relationship to bpo-1475523? (And yes, I know it is...sad...that that issue hasn't been fixed yet.)

    @straz
    Mannequin Author

    straz mannequin commented May 3, 2013

    There seem to be several bugs involving this particular inner loop in gettext._parse(), but I don't think they're equivalent.

    The present bug (bpo-17898) is that parsing a plural header breaks the following header when it happens to be a comment.

    bpo-1475523 seems to involve multi-line handling

    bpo-12425 seems to involve breaking when the plural-forms value is empty.

    Perhaps a useful design pattern to follow for code which executes this inner loop would be to have some initialization and loop invariants which are asserted true on each iteration. For example, properly initializing k and v on each iteration.
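One way to read the suggestion above is sketched below. This is an illustrative assumption, not code from gettext.py: the generator name iter_header_items is hypothetical. Each iteration starts from fresh bindings and asserts the invariant that the value handed to later parsing is always a string, so a stale list from an earlier line cannot leak through.

```python
def iter_header_items(tmsg):
    """Yield one (key, value) pair per non-empty header line.

    key is None for lines that contain no colon (comments, continuations).
    Hypothetical helper for illustration only.
    """
    for item in tmsg.splitlines():
        item = item.strip()
        if not item:
            continue
        # Initialization: fresh bindings on every iteration.
        key, value = None, item
        if ':' in item:
            key, value = item.split(':', 1)
            key, value = key.strip().lower(), value.strip()
        # Loop invariants: nothing from a previous iteration survives.
        assert key is None or isinstance(key, str)
        assert isinstance(value, str)
        yield key, value


header = (
    "Plural-Forms: nplurals=2; plural=(n != 1);\n"
    "#-#-#-#-# messages.po (EdX Studio) #-#-#-#-#\n"
)
for key, value in iter_header_items(header):
    if key == 'plural-forms':
        # Safe: value is guaranteed to be a str at this point.
        print(value.split(';')[1].split('plural=')[1])
```

Separating the line splitting from the per-key handling keeps the plural-forms branch from ever seeing state it did not just receive.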

    @terryjreedy
    Member

    Since the other two listed inner-loop issues apply to 3.x, I would guess that this does also. Steve, can you test with 3.3? And provide a *minimal* test case?

    @straz
    Mannequin Author

    straz mannequin commented May 10, 2013

    Sorry, I haven't installed python 3.*, I just have default Mac OS python 2.7.

    Here's a minimal test case. The tarball expands to this
    file structure:
    ./test.py
    ./en/LC_MESSAGES/messages.po

    $ ./test.py 
    Traceback (most recent call last):
      File "./test.py", line 28, in <module>
        test()
      File "./test.py", line 23, in test
        gettext.install('messages', localedir=LOCALEDIR)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gettext.py", line 494, in install
        t = translation(domain, localedir, fallback=True, codeset=codeset)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gettext.py", line 479, in translation
        t = _translations.setdefault(key, class_(fp))
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gettext.py", line 180, in __init__
        self._parse(fp)
      File "/System/Library/Frameworks/Python.framework/Versions/2.7/lib/python2.7/gettext.py", line 314, in _parse
        v = v.split(';')
    AttributeError: 'list' object has no attribute 'split'
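The attached tarball is not reproduced in this thread. As a rough stand-in, the sketch below builds a tiny compiled catalog in memory and passes it to gettext.GNUTranslations, so no files on disk are needed. The build_mo helper is hypothetical and written for this illustration; it assumes the little-endian GNU .mo layout (magic number, revision 0, string count, two offset tables, no hash table). On an interpreter that includes the fix, the catalog should load cleanly; on an affected version, the same input would be expected to fail with the AttributeError shown above.

```python
import gettext
import io
import struct


def build_mo(messages):
    """Pack {msgid: msgstr} (both bytes) into a little-endian .mo image.

    Hypothetical helper for illustration; the hash table is omitted.
    """
    items = sorted(messages.items())
    n = len(items)
    orig_table = 7 * 4               # offset tables follow the 7-word header
    trans_table = orig_table + n * 8
    strings_start = trans_table + n * 8
    entries = []
    blob = b''
    for msgid, _ in items:
        entries.append((len(msgid), strings_start + len(blob)))
        blob += msgid + b'\0'
    for _, msgstr in items:
        entries.append((len(msgstr), strings_start + len(blob)))
        blob += msgstr + b'\0'
    out = struct.pack('<7I', 0x950412de, 0, n, orig_table, trans_table, 0, 0)
    for length, offset in entries:
        out += struct.pack('<II', length, offset)
    return out + blob


# The empty msgid carries the catalog metadata, as in the report.
metadata = (
    b"Plural-Forms: nplurals=2; plural=(n != 1);\n"
    b"#-#-#-#-# messages.po (EdX Studio) #-#-#-#-#\n"
)
mo_bytes = build_mo({b'': metadata})

# GNUTranslations parses the file object passed to its constructor.
translations = gettext.GNUTranslations(io.BytesIO(mo_bytes))
print(translations.info()['plural-forms'])
```

Feeding the bytes through io.BytesIO avoids the locale directory lookup that gettext.install() performs, which keeps the reproduction to a single file.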

    @akuchling
    Member

    Proposed patch against 3.5.

    @akuchling
    Member

    Adding a link to a bitbucket repo.

    @akuchling
    Member

    I would apply this change to 3.4 and 3.5. Should I also backport it to 2.7? I think the same bug applies there, though I haven't verified this or tried my patch.

    @ned-deily
    Member

    LGTM. I think it should be backported to 2.7.

    @python-dev
    Mannequin

    python-dev mannequin commented Apr 14, 2015

    New changeset c3d269c01671 by Andrew Kuchling in branch '2.7':
    bpo-17898: reset k and v so that the loop doesn't use an old value
    https://hg.python.org/cpython/rev/c3d269c01671

    @python-dev
    Mannequin

    python-dev mannequin commented Apr 14, 2015

    New changeset 54df02192bfc by Andrew Kuchling in branch '3.4':
    bpo-17898: reset k and v so that the loop doesn't use an old value
    https://hg.python.org/cpython/rev/54df02192bfc

    @akuchling akuchling self-assigned this Apr 14, 2015
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022