Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

collections.UserString.__rmod__() raises NameError #69838

Closed
jcgoble3 mannequin opened this issue Nov 18, 2015 · 21 comments
Closed

collections.UserString.__rmod__() raises NameError #69838

jcgoble3 mannequin opened this issue Nov 18, 2015 · 21 comments
Labels
3.7 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@jcgoble3
Copy link
Mannequin

jcgoble3 mannequin commented Nov 18, 2015

BPO 25652
Nosy @rhettinger, @pitrou, @bitdancer, @vadmium, @serhiy-storchaka, @zhangyangyu, @jcgoble3, @miss-islington
PRs
  • bpo-25652: Fix __rmod__ of UserString #13326
  • Files
  • collections.UserString
  • collections.UserString1
  • userstringerror.py: demonstration of error in action
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = <Date 2022-01-17.00:02:23.265>
    created_at = <Date 2015-11-18.05:16:09.748>
    labels = ['3.7', 'type-bug', 'library']
    title = 'collections.UserString.__rmod__() raises NameError'
    updated_at = <Date 2022-01-17.00:02:23.265>
    user = 'https://github.com/jcgoble3'

    bugs.python.org fields:

    activity = <Date 2022-01-17.00:02:23.265>
    actor = 'iritkatriel'
    assignee = 'none'
    closed = True
    closed_date = <Date 2022-01-17.00:02:23.265>
    closer = 'iritkatriel'
    components = ['Library (Lib)']
    creation = <Date 2015-11-18.05:16:09.748>
    creator = 'jcgoble3'
    dependencies = []
    files = ['41065', '41066', '46854']
    hgrepos = []
    issue_num = 25652
    keywords = ['patch']
    message_count = 21.0
    messages = ['254830', '254832', '254833', '254835', '261222', '261223', '261235', '261239', '261251', '261271', '261272', '261273', '270106', '270139', '270140', '280000', '293366', '293378', '293398', '293451', '343091']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'pitrou', 'r.david.murray', 'martin.panter', 'serhiy.storchaka', 'xiang.zhang', 'jcgoble3', 'miss-islington']
    pr_nums = ['13326']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue25652'
    versions = ['Python 3.6', 'Python 3.7']

    @jcgoble3
    Copy link
    Mannequin Author

    jcgoble3 mannequin commented Nov 18, 2015

    In an instance of collections.UserString, any call to the __rmod__() method will raise NameError, due to the undefined "args" name in the method.

    @jcgoble3 jcgoble3 mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Nov 18, 2015
    @jcgoble3
    Copy link
    Mannequin Author

    jcgoble3 mannequin commented Nov 18, 2015

    c06b2480766d appears to be the offending changeset.

    @zhangyangyu
    Copy link
    Member

    It seems this is a typo. The args should be self.data. And since unicodeobject doesn't support __rmod__, at least I think we should str(format) or remove __rmod__ totally. I attach a patch.

    @zhangyangyu
    Copy link
    Member

    Quite sorry for making some mistakes. unicodeobject does get a __radd__ method. It is added by addoperators though I cannot see it in PyNumberMethods. Now I think just fixing the typo is enough.

    @rhettinger rhettinger self-assigned this Nov 18, 2015
    @jcgoble3
    Copy link
    Mannequin Author

    jcgoble3 mannequin commented Mar 5, 2016

    *ping* Can this be reviewed? It's a simple fix to a problem that happens consistently, and it would be nice to get it into the next bugfix release.

    @rhettinger
    Copy link
    Contributor

    I'll apply this shortly.

    @serhiy-storchaka
    Copy link
    Member

    Would be nice to add some tests. For example: 0 % collections.UserString('').

    @rhettinger
    Copy link
    Contributor

    The UserString1 patch is incorrect as it leads to infinite recursion because the __rmod__ operation only gets called when the other argument doesn't have __mod__.

    One possible fix is to coerce the template to a regular str:

        def __rmod__(self, template):
            return self.__class__(str(template) % self)

    That would get the following test to pass:

        class S:
            'strlike class without a __mod__'
            def __init__(self, value):
                self.value = value
            def __str__(self):
                return str(self.value)
    assert S('say %s') % UserString('hello') == 'say hello'
    

    That said, the goal of UserString is to parallel what a regular string would do. In this case, a TypeError is raised:

        >>> print(S('answer is %s') % 'hello')
        Traceback (most recent call last):
          ...
        TypeError: unsupported operand type(s) for %: 'S' and 'str'

    Serhiy, what do you think should be done, coerce to a str or remove the __rmod__ method entirely?

    @serhiy-storchaka
    Copy link
    Member

    In normal case there is no infinite recursion in collections.UserString1, since UserString.__rmod__(S) falls back to S.__mod__(str) or str.__rmod__(S).

    >>> S('say %s') % UserString('hello')
    Traceback (most recent call last):
      File "<stdin>", line 1, in <module>
      File "/home/serhiy/py/cpython/Lib/collections/__init__.py", line 1156, in __rmod__
        return self.__class__(format % self.data)
    TypeError: unsupported operand type(s) for %: 'S' and 'str'

    Infinite recursion is possible if we make recursive UserString (us.data = us), but in this case we will get infinite recursion in all operations.

    collections.UserString1 LGTM at the same degree as UserString.__le__ (it would be better to handle NotImplemented correctly).

    There is yet one consideration. In 2.x UserString simulates both 8-bit and Unicode strings. In 3.x UserString simulates both str and bytes (except decode(), bytes formatting, etc). I think it is worth once to make UserString to support bytes formatting. In this case it would be incorrect to coerce the template to a regular str.

    @rhettinger
    Copy link
    Contributor

    Antoine, do you have any thoughts on this one? I'm inclined to remove __rmod__ entirely. That would match what str does and it avoids having to guess what the user meant when writing "A(s) % UserString(t)" when A does not have a __mod__ formatting method.

    @pitrou
    Copy link
    Member

    pitrou commented Mar 6, 2016

    I admit I don't understand what __rmod__ is meant to achieve here, and following str's behaviour sounds like a reasonable decision.

    @serhiy-storchaka
    Copy link
    Member

    What about such example?

    >>> class F:
    ...     def __init__(self, value):
    ...         self.value = value
    ...     def __mod__(self, other):
    ...         if isinstance(other, str):
    ...             return self.value % other
    ...         return NotImplemented
    ... 
    >>> from collections import UserString
    >>> F('say %s') % UserString('hello')
    'say hello'

    @serhiy-storchaka serhiy-storchaka removed their assignment Mar 12, 2016
    @bitdancer
    Copy link
    Member

    This was set to commit review when Raymond was ready to commit it, but subsequent discussion indicates the patch isn't yet ready for commit. In particular, there ought to be some tests, even if the decision is to remove __rmod__.

    The fact that UserString passes its tests with this buggy method indicates that there is a lack somewhere in the general string tests that test_userstring relies on.

    @vadmium
    Copy link
    Member

    vadmium commented Jul 10, 2016

    Jonathon: Do you have a use case for __rmod__(), or did you find this bug by code analysis?

    UserString.__rmod__() was added as part of bpo-22189. The main reason seems to be so that UserString has all the methods listed by dir(str). The result was that UserString.__rmod__() is documented as existing since 3.5.

    Ideally I would agree with removing UserString.__rmod__(), but perhaps it would be safer to just have it return NotImplemented. As long as UserString properly supports all combinations of user_string_a % (user_string_b, user_string_c) etc, whether or how __rmod__() is implemented is an implementation detail. The existance of str.__rmod__() does not seem to be documented. It just seems to be a side-effect of how the C-level nb_remainder slot works.

    For Serhiy’s F class, which explicitly only works with a single str argument, I think it is reasonable that using UserString instead would raise TypeError. This is what Python 2’s UserString does.

    @jcgoble3
    Copy link
    Mannequin Author

    jcgoble3 mannequin commented Jul 10, 2016

    Code analysis, if it can even be called that. I was simply looking through the source of the collections module one day out of curiosity (mainly to see how various things in it were implemented) and this bug jumped out at me as I was reading the code. I do not have a use case for the __rmod__() method.

    @serhiy-storchaka
    Copy link
    Member

    bpo-28598 may be related.

    @jcgoble3
    Copy link
    Mannequin Author

    jcgoble3 mannequin commented May 10, 2017

    Any decision on this? I recently played around and found a reasonable use case where UserString.__rmod__ does get called; run the attached userstringerror.py to see it in action.

    Basically, it seems the idea of UserString is to subclass it, tweak as desired, and use your subclass as necessary. If you then subclass your subclass (e.g. for a portion of your code with a specialized need) and extend __mod__ and __rmod__ in that sub-subclass (calling the parent method with super()), then any case of "subclass instance % sub-subclass instance" results in Python calling the sub-subclass's __rmod__ method directly without trying a __mod__ method. If that method then calls super().__rmod__ (e.g. it just needed to pre-process the data, such as to normalize it, before the formatting operation), then UserString.__rmod__ will be called and result in the NameError.

    @jcgoble3 jcgoble3 mannequin added the 3.7 only security fixes label May 10, 2017
    @rhettinger
    Copy link
    Contributor

    Unless there are objections, I think the wisest course is to remove __rmod__ entirely (refuse the temptation to guess at what the user would expect the semantics to be).

    @serhiy-storchaka
    Copy link
    Member

    I think any solution is better than keeping a bug in the current code.

    @jcgoble3
    Copy link
    Mannequin Author

    jcgoble3 mannequin commented May 10, 2017

    I would prefer to keep __rmod__ and fix the bug, given that the use case I described above would otherwise create an inconsistency in subclasses, which would be able to easily extend __mod__ by calling super(), but would be forced to fully implement __rmod__ themselves.

    That said, I will defer to those who are more experienced here.

    @miss-islington
    Copy link
    Contributor

    New changeset 7abf8c6 by Miss Islington (bot) (Batuhan Taşkaya) in branch 'master':
    bpo-25652: Fix __rmod__ of UserString (GH-13326)
    7abf8c6

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.7 only security fixes stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    8 participants