Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PEP 3137 patch - str8/str comparison should return false #45604

Closed
thomaslee mannequin opened this issue Oct 11, 2007 · 16 comments
Closed

PEP 3137 patch - str8/str comparison should return false #45604

thomaslee mannequin opened this issue Oct 11, 2007 · 16 comments
Assignees
Labels
interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-bug An unexpected behavior, bug, or error

Comments

@thomaslee
Copy link
Mannequin

thomaslee mannequin commented Oct 11, 2007

BPO 1263
Nosy @gvanrossum, @brettcannon
Files
  • unicode-string-eq-false-r2.patch
  • unicode-string-eq-false-structmember-c-r1.patch
  • unicode-string-eq-false-r3.patch
  • unicode-string-eq-false-all-r4.patch
  • r4-revised.patch
  • fix_test_compile.diff
  • fix_test_format.diff
  • fix_modulefinder.diff
  • sqlite_fix.diff
  • fix_test_str.diff
  • fix_test_subprocess.diff
  • fix_test_struct.diff
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = 'https://github.com/brettcannon'
    closed_at = <Date 2007-10-22.20:25:29.076>
    created_at = <Date 2007-10-11.12:20:47.660>
    labels = ['interpreter-core', 'type-bug', 'release-blocker']
    title = 'PEP 3137 patch - str8/str comparison should return false'
    updated_at = <Date 2008-01-06.22:29:45.526>
    user = 'https://bugs.python.org/thomaslee'

    bugs.python.org fields:

    activity = <Date 2008-01-06.22:29:45.526>
    actor = 'admin'
    assignee = 'brett.cannon'
    closed = True
    closed_date = <Date 2007-10-22.20:25:29.076>
    closer = 'brett.cannon'
    components = ['Interpreter Core']
    creation = <Date 2007-10-11.12:20:47.660>
    creator = 'thomaslee'
    dependencies = []
    files = ['8510', '8511', '8512', '8539', '8572', '8573', '8574', '8579', '8580', '8581', '8582', '8590']
    hgrepos = []
    issue_num = 1263
    keywords = ['patch']
    message_count = 16.0
    messages = ['56343', '56344', '56437', '56452', '56566', '56576', '56578', '56580', '56614', '56615', '56617', '56618', '56619', '56624', '56642', '56656']
    nosy_count = 3.0
    nosy_names = ['gvanrossum', 'brett.cannon', 'thomaslee']
    pr_nums = []
    priority = 'release blocker'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue1263'
    versions = ['Python 3.0']

    @thomaslee
    Copy link
    Mannequin Author

    thomaslee mannequin commented Oct 11, 2007

    The main patch - while exactly what is needed to make str8/str equality
    checks return False - breaks a bunch of tests due to PyString_* still
    being used elsewhere when it should be using PyUnicode.

    The second patch modifies structmember.c to use PyUnicode_* where it was
    previously using PyString_*, which fixes the first problem I stumbled
    across in trying to get test_unicode to run.

    Unfortunately, similar errors are present in Python/codecs.c and other
    places (maybe Python/modsupport.c too? not 100% sure yet) - these still
    need to be fixed!

    @thomaslee thomaslee mannequin added type-feature A feature request or enhancement interpreter-core (Objects, Python, Grammar, and Parser dirs) labels Oct 11, 2007
    @thomaslee
    Copy link
    Mannequin Author

    thomaslee mannequin commented Oct 11, 2007

    Oops - use unicode-string-eq-false-r3.patch, not
    unicode-string-eq-false-r2.patch.

    @gvanrossum gvanrossum self-assigned this Oct 11, 2007
    @thomaslee
    Copy link
    Mannequin Author

    thomaslee mannequin commented Oct 15, 2007

    Hack to make Python/codecs.c use Unicode strings internally. I recognize
    the way I have fixed it here is probably not ideal (basically ripped out
    PyString_*, replaced with a PyMem_Malloc/PyMem_Free call) but it fixes
    10-12 tests that were failing with my earlier changes. If anybody can
    recommend a "nice" way to fix this, I'm all ears.

    The following still fail for me with this patch applied:

    -- test_compile

    This is due to PyString_/PyUnicode_/PyBytes_* confusion in the
    assembler struct (specifically: a_lnotab and a_bytecode) in
    Python/compile.c - tried replacing PyString_* calls with PyBytes_*
    calls, but this raises a TypeError because PyBytes is not hashable ...
    not sure what exactly is causing this.

    -- test_ctypes
    Looks like a simple case of ctypes using str8 instead of str. Appears to
    be an easy fix.

    -- test_modulefinder
    Failing because str8 >= str is now an invalid operation

    -- test_set
    This test needs some love.

    -- test_sqlite
    Not sure what's going on here.

    -- test_str

    This one is a little tricky: str8/str with __str__/unicode ... how
    is this test supposed to behave with the fix in this patch?

    -- test_struct
    "unpack/pack not transitive" - what does that mean?

    -- test_subprocess
    Like modulefinder, this is probably just due to the use of str8 over str
    internally in the subprocess module. Likely to be an easy fix.

    The following tests fail for me irrespective of whether or not I have r4
    of my patch applied:

    -- test_doctest
    -- test_email
    -- test_nis
    -- test_pty

    If anybody can give this new patch a try and let me know the result it
    would be much appreciated.

    @gvanrossum
    Copy link
    Member

    I'll look at this at some point. One quick comment: the lnotab and bytecode
    should use PyString, which will be 'bytes' in 3.0a2. They must be immutable
    because code objects must be immutable (it must not be possible to modify an
    existing code object).

    On 10/15/07, Thomas Lee <report@bugs.python.org> wrote:

    Thomas Lee added the comment:

    Hack to make Python/codecs.c use Unicode strings internally. I recognize
    the way I have fixed it here is probably not ideal (basically ripped out
    PyString_*, replaced with a PyMem_Malloc/PyMem_Free call) but it fixes
    10-12 tests that were failing with my earlier changes. If anybody can
    recommend a "nice" way to fix this, I'm all ears.

    The following still fail for me with this patch applied:

    -- test_compile

    This is due to PyString_/PyUnicode_/PyBytes_* confusion in the
    assembler struct (specifically: a_lnotab and a_bytecode) in
    Python/compile.c - tried replacing PyString_* calls with PyBytes_*
    calls, but this raises a TypeError because PyBytes is not hashable ...
    not sure what exactly is causing this.

    -- test_ctypes
    Looks like a simple case of ctypes using str8 instead of str. Appears to
    be an easy fix.

    -- test_modulefinder
    Failing because str8 >= str is now an invalid operation

    -- test_set
    This test needs some love.

    -- test_sqlite
    Not sure what's going on here.

    -- test_str

    This one is a little tricky: str8/str with __str__/unicode ... how
    is this test supposed to behave with the fix in this patch?

    -- test_struct
    "unpack/pack not transitive" - what does that mean?

    -- test_subprocess
    Like modulefinder, this is probably just due to the use of str8 over str
    internally in the subprocess module. Likely to be an easy fix.

    The following tests fail for me irrespective of whether or not I have r4
    of my patch applied:

    -- test_doctest
    -- test_email
    -- test_nis
    -- test_pty

    If anybody can give this new patch a try and let me know the result it
    would be much appreciated.


    Tracker <report@bugs.python.org>
    <http://bugs.python.org/issue1263\>


    @gvanrossum
    Copy link
    Member

    I've committed the half of this patch that doesn't break any tests: the
    changes to codecs.c and structmember.c.

    Committed revision 58551.

    I'm seeking help getting the remaining unit tests to pass. (Thanks
    Thomas for the enumeration of the test failures!)

    @brettcannon
    Copy link
    Member

    The file I just uploaded is unicode-string-eq-false-all-r4.patch with
    the codecs.c and structmember.c parts of the patch removed.

    @brettcannon
    Copy link
    Member

    Attached is a patch to fix test_compile. Simple fix of turning an empty
    string into str8('').

    @brettcannon
    Copy link
    Member

    Attached a fix for test_format.

    It was testing string interpolation on both str8 and str and using a str
    for the comparison. Since string interpolation is going away for str8
    once it becomes bytes I just removed the testing of str8.

    The failures I know of left are:
    test_modulefinder
    test_sqlite
    test_str
    test_struct
    test_subprocess

    @brettcannon
    Copy link
    Member

    Attached is a fix for modulefinder.

    It is an ugly hack as modulefinder took the numeric opcode numbers from
    dis and passed them to chr(). But that doesn't work since that returns
    Unicode. So I took those single characters and passed them to str8().

    Once str8() has its constructor match bytes() then the chr() call can be
    ditched and the dis values can be tossed into a single-item list.

    @brettcannon
    Copy link
    Member

    Attached is a fix for sqlite3.

    First issue was that the dictionary that was being used to store
    converters was having keys in Python code as Unicode but being compared
    against str8 in C.

    The second issue was that when an object was serialized using
    __conform__ and a Unicode object was returned, it was being unserialized
    as a str8 no matter what type of argument was returned. That makes the
    most sense if only a single type is going to be returned, so I left it
    as such and fixed the test to decode str8 to UTF-8 if passed to __init__.

    @brettcannon
    Copy link
    Member

    Attached is a patch to fix test_str.

    Basically there were a bunch of redundant tests for making sure that
    calling str() on an object called it's __str__ method. str8 no longer
    is directly relevant since it is no longer an actual string type.

    @brettcannon
    Copy link
    Member

    Guido, what do you want to do about the struct module for the various
    string formats (i.e., c, s, p)? Should they return str8 or Unicode?

    @brettcannon
    Copy link
    Member

    Attached is a fix for test_subprocess.

    Simply had to change a call to str8() to str().

    I am going to run the test suite, but that should leave only test_struct
    failing and that can be fixed as soon as Guido makes a call on whether
    str8 or str should be used for the string formats.

    @gvanrossum
    Copy link
    Member

    Guido, what do you want to do about the struct module for the various
    string formats (i.e., c, s, p)? Should they return str8 or Unicode?

    Oh, tough call. I think they should return str8 (i.e. bytes after the
    rename) because the encoding isn't known. Even though this will break
    more code, since I'm sure there's lots of code out there that assumes
    they return "text".

    @brettcannon
    Copy link
    Member

    Attached is a fix for test_struct.

    All of the string tests now assume str8 is returned when arguments of
    bytes, str8 or str are given for the various string formats.

    All tests now pass. Re-assigning to myself to check everything in when
    it isn't so late at night. =)

    @brettcannon brettcannon added release-blocker type-bug An unexpected behavior, bug, or error and removed type-feature A feature request or enhancement labels Oct 22, 2007
    @brettcannon
    Copy link
    Member

    Everything committed in r58596. Thanks for the help, Thomas!

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    interpreter-core (Objects, Python, Grammar, and Parser dirs) release-blocker type-bug An unexpected behavior, bug, or error
    Projects
    None yet
    Development

    No branches or pull requests

    2 participants