Doubled backslash in repr() method for unicode #43094

Closed
Cito mannequin opened this issue Mar 27, 2006 · 9 comments

@Cito
Mannequin

Cito mannequin commented Mar 27, 2006

BPO 1459029
Nosy @Cito, @hyeshik
Files
  • uni.diff: patch for test
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/hyeshik'
    closed_at = <Date 2006-03-28.07:39:56.000>
    created_at = <Date 2006-03-27.02:54:37.000>
    labels = ['expert-unicode']
    title = 'Doubled backslash in repr() method for unicode'
    updated_at = <Date 2006-03-28.07:39:56.000>
    user = 'https://github.com/Cito'

    bugs.python.org fields:

    activity = <Date 2006-03-28.07:39:56.000>
    actor = 'anthonybaxter'
    assignee = 'hyeshik.chang'
    closed = True
    closed_date = None
    closer = None
    components = ['Unicode']
    creation = <Date 2006-03-27.02:54:37.000>
    creator = 'cito'
    dependencies = []
    files = ['1933']
    hgrepos = []
    issue_num = 1459029
    keywords = []
    message_count = 9.0
    messages = ['27885', '27886', '27887', '27888', '27889', '27890', '27891', '27892', '27893']
    nosy_count = 4.0
    nosy_names = ['nnorwitz', 'anthonybaxter', 'cito', 'hyeshik.chang']
    pr_nums = []
    priority = 'high'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1459029'
    versions = ['Python 2.4']

    @Cito
    Mannequin Author

    Cito mannequin commented Mar 27, 2006

    Here is an issue that caused Kid templates (used by
    Turbogears) to malfunction in Python 2.4.3c1.

    The problem shows up with the following code:

    class s1:
        def __repr__(self):
            return '\\n'
    
    class s2:
        def __repr__(self):
            return u'\\n'

    print repr(s1()), repr(s2())

    I get the following results:

    Python 2.3.5: \n \n
    Python 2.4.2: \n \n
    Python 2.4.3c1: \n \\n

    In the output for Python 2.4.3c1, the backslash in the
    representation of class s2 appears doubled. This did not
    happen in earlier Python versions and seems to be a bug.

    My vague guess is that the issue may have crept in with
    an attempted fix of Bug bpo-1379994.

    -- Christoph

    @Cito Cito mannequin closed this as completed Mar 27, 2006
    @Cito Cito mannequin assigned hyeshik Mar 27, 2006
    @Cito Cito mannequin added the topic-unicode label Mar 27, 2006
    @anthonybaxter
    Mannequin

    anthonybaxter mannequin commented Mar 27, 2006

    Logged In: YES
    user_id=29957

    Confirmed - it's also broken in the trunk, and backing out
    the patch for http://www.python.org/sf/1379994 (r41728)
    fixes the problem. Perky, you checked this in - can you look
    at this soon, please? I don't want to release 2.4.3 until
    it's fixed, but I also want to get 2.4.3 out this week.

    Thanks for the bug report!

    @nnorwitz
    Mannequin

    nnorwitz mannequin commented Mar 27, 2006

    Logged In: YES
    user_id=33168

    Attached a patch for the test case to be added with fix.

    @hyeshik
    Contributor

    hyeshik commented Mar 27, 2006

    Logged In: YES
    user_id=55188

    Looking at the C code, unicode_repr is behaving correctly.
    The inconsistency comes from PyObject_Repr.
    This change introduced it, and it was intentional:

    ------------------------------------------------------------------------
    r16198 | effbot | 2000-07-09 02:43:32 +0900 (Sun, 09 Jul 2000) | 6 lines

    • changed __repr__ to use "unicode escape" encoding for unicode
      strings, instead of the default encoding.
      (see "minidom" thread for discussion, and also patch bpo-100706)

    @hyeshik
    Contributor

    hyeshik commented Mar 27, 2006

    Logged In: YES
    user_id=55188

    Found it!:
    http://mail.python.org/pipermail/python-dev/2000-July/005353.html
    But their intention never actually took effect before 2.4.3.
    What problems would there be if we changed PyObject_Repr to use
    the default encoding instead of unicode-escape? (revert r16198)

    @anthonybaxter
    Mannequin

    anthonybaxter mannequin commented Mar 27, 2006

    Logged In: YES
    user_id=29957

    I'm confused how a checkin from 5+ years ago broke a change
    from 3 months ago?

    Or am I misunderstanding you?

    @hyeshik
    Contributor

    hyeshik commented Mar 27, 2006

    Logged In: YES
    user_id=55188

    Because the unicode-escape codec didn't escape \,
    PyObject_Repr(u'\\') passed backslashes through unchanged.
    But Martin and Fredrik made PyObject_Repr use the
    unicode-escape codec for unicode values returned from
    __repr__ 5 years ago. So fixing the unicode-escape codec
    3 months ago made their intention take effect for the
    first time.
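
The interaction described above can be sketched in a few lines. This is only a rough model written for Python 3 (where every str is unicode), not the C code in question: the names escape_pre_fix, escape_post_fix and model_object_repr are hypothetical, and the "pre-fix" codec is approximated by simply letting backslashes through unescaped.

```python
def escape_pre_fix(text):
    """Assumed model of the codec before the fix: backslashes pass through."""
    pieces = []
    for ch in text:
        if ch == "\\":
            pieces.append(ch)  # old behaviour: backslash left as-is
        else:
            pieces.append(ch.encode("unicode_escape").decode("ascii"))
    return "".join(pieces)


def escape_post_fix(text):
    """Model of the codec after the fix: backslashes are escaped as well."""
    return text.encode("unicode_escape").decode("ascii")


def model_object_repr(repr_result, escape):
    """Stand-in for the r16198 behaviour: escape a unicode __repr__ result."""
    return escape(repr_result)


s2_result = "\\n"  # what s2.__repr__ returns: a backslash followed by "n"
print(model_object_repr(s2_result, escape_pre_fix))   # \n
print(model_object_repr(s2_result, escape_post_fix))  # \\n
```

With the pre-fix model the unicode result comes out looking the same as the byte-string case, which matches the 2.3.5 and 2.4.2 output in the report; with the post-fix model the backslash is doubled, as seen in 2.4.3c1.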

    @nnorwitz
    Mannequin

    nnorwitz mannequin commented Mar 27, 2006

    Logged In: YES
    user_id=33168

    We need to retain the old behaviour, but also fix the bug.
    How can we do that?

    @anthonybaxter
    Mannequin

    anthonybaxter mannequin commented Mar 28, 2006

    Logged In: YES
    user_id=29957

    Ok. After talking to perky, I reverted the fix for 1379994
    on the release24-maint branch, and reverted /F's ancient
    change on the trunk. This seemed the best combination of
    practicality and purity. Fix will be in 2.4.3 final.

    Thanks for the bug report. Man, unicode and repr is a twisty
    ball of horrors.
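
To illustrate the trade-off between the two encodings discussed in this thread, here is a small Python 3 sketch. It assumes the default encoding is ASCII, as it normally was in Python 2, and the helper name convert_repr_result is hypothetical; it only models the conversion step, not PyObject_Repr itself.

```python
def convert_repr_result(result, encoding):
    """Hypothetical model: turn a unicode __repr__ result into bytes."""
    return result.encode(encoding)


# A backslash survives unchanged under the (assumed ASCII) default encoding,
# but is doubled by a unicode-escape codec that escapes backslashes.
print(convert_repr_result("\\n", "ascii"))           # b'\\n'   (2 bytes)
print(convert_repr_result("\\n", "unicode_escape"))  # b'\\\\n' (3 bytes)

# The price of the default encoding: non-ASCII text cannot be converted.
try:
    convert_repr_result("\u00e9", "ascii")
    non_ascii_ok = True
except UnicodeEncodeError:
    non_ascii_ok = False
print(non_ascii_ok)  # False

# unicode-escape handles the same character by escaping it.
print(convert_repr_result("\u00e9", "unicode_escape"))  # b'\\xe9'
```

So the default encoding keeps backslashes intact but fails on non-ASCII results, while unicode-escape accepts any text at the cost of rewriting backslashes.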

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022