Add the "namereplace" error handler #63875

Closed
serhiy-storchaka opened this issue Nov 21, 2013 · 13 comments
Labels
expert-unicode type-feature A feature request or enhancement

Comments

@serhiy-storchaka
Member

serhiy-storchaka commented Nov 21, 2013

BPO 19676
Nosy @malemburg, @amauryfa, @ncoghlan, @vstinner, @ned-deily, @ezio-melotti, @stevendaprano, @ethanfurman, @serhiy-storchaka
Files
  • namereplace_errors.patch
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    GitHub fields:

    assignee = 'https://github.com/serhiy-storchaka'
    closed_at = <Date 2014-11-26.20:27:44.576>
    created_at = <Date 2013-11-21.07:41:46.661>
    labels = ['type-feature', 'expert-unicode']
    title = 'Add the "namereplace" error handler'
    updated_at = <Date 2014-11-26.20:27:44.575>
    user = 'https://github.com/serhiy-storchaka'

    bugs.python.org fields:

    activity = <Date 2014-11-26.20:27:44.575>
    actor = 'ned.deily'
    assignee = 'serhiy.storchaka'
    closed = True
    closed_date = <Date 2014-11-26.20:27:44.576>
    closer = 'ned.deily'
    components = ['Unicode']
    creation = <Date 2013-11-21.07:41:46.661>
    creator = 'serhiy.storchaka'
    dependencies = []
    files = ['32748']
    hgrepos = []
    issue_num = 19676
    keywords = ['patch', 'needs review']
    message_count = 13.0
    messages = ['203579', '203580', '231647', '231649', '231650', '231652', '231653', '231654', '231672', '231700', '231701', '231727', '231728']
    nosy_count = 10.0
    nosy_names = ['lemburg', 'amaury.forgeotdarc', 'ncoghlan', 'vstinner', 'ned.deily', 'ezio.melotti', 'steven.daprano', 'ethan.furman', 'python-dev', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue19676'
    versions = ['Python 3.5']

    @serhiy-storchaka
    Member Author

    serhiy-storchaka commented Nov 21, 2013

    The proposed patch adds the "namereplace" error handler. It is almost the same as the "backslashreplace" error handler, but uses \N{...} escape sequences when the character has a name in the Unicode database. The result is a little more human-readable (but less portable) than with "backslashreplace".

    >>> '∀ x∈ℜ'.encode('ascii', 'namereplace')
    b'\\N{FOR ALL} x\\N{ELEMENT OF}\\N{BLACK-LETTER CAPITAL R}'

    The proposal was discussed and bikeshedded on Python-Ideas: http://comments.gmane.org/gmane.comp.python.ideas/21296
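
    As a minimal sketch of the difference described above (assuming Python 3.5 or later, where the handler is available), the same text can be encoded with both handlers. The private-use code point U+E000 is picked here only as an illustration of a character without a name in the Unicode database; the variable names are not from the patch.

```python
text = '∀ x∈ℜ'

# Both handlers produce pure-ASCII bytes; they differ only in the escape style.
named = text.encode('ascii', 'namereplace')
escaped = text.encode('ascii', 'backslashreplace')

# U+E000 is a private-use code point with no name in the Unicode database,
# so "namereplace" falls back to a numeric escape for it.
unnamed = '\ue000'.encode('ascii', 'namereplace')

print(named)
print(escaped)
print(unnamed)
```

    The first result matches the output shown in the report; the second uses \uXXXX escapes for the same three characters.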

    @serhiy-storchaka serhiy-storchaka added expert-unicode type-feature A feature request or enhancement labels Nov 21, 2013
    @vstinner
    Member

    vstinner commented Nov 21, 2013

    See also issue bpo-18234.

    @serhiy-storchaka
    Member Author

    serhiy-storchaka commented Nov 25, 2014

    Ping.

    @serhiy-storchaka serhiy-storchaka self-assigned this Nov 25, 2014
    @amauryfa
    Member

    amauryfa commented Nov 25, 2014

    The patch looks good to me.
    But it seems that the reverse operation is not possible in the general case: .decode('unicode_escape') assumes a latin-1 or ascii encoding.
    Should we document this?
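
    A small sketch of the limitation raised here, assuming Python 3.5 or later. The sample text and the choice of ISO 8859-5 as the non-Latin-1 target encoding are illustrative only.

```python
text = 'Привет ∀'

# With an ASCII target every non-ASCII character is escaped, so
# unicode_escape can undo the encoding.
ascii_bytes = text.encode('ascii', 'namereplace')
ascii_roundtrip = ascii_bytes.decode('unicode_escape')

# With ISO 8859-5 the Cyrillic letters are stored as single bytes >= 0x80 and
# only U+2200 is escaped. unicode_escape reads those bytes as Latin-1, so the
# letters come back as unrelated characters while the escape is still resolved.
cyrillic_bytes = text.encode('iso8859_5', 'namereplace')
latin1_guess = cyrillic_bytes.decode('unicode_escape')

print(ascii_roundtrip == text)
print(latin1_guess == text)
```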

    @malemburg
    Member

    malemburg commented Nov 25, 2014

    The patch looks good.

    One nit: the name buffer length should be NAME_MAXLEN instead of 100.

    @ncoghlan
    Contributor

    ncoghlan commented Nov 25, 2014

    Patch looks good to me, too.

    As far as Amaury's question goes, isn't the general reverse operation the same as for the existing backslashreplace handler?

    That is, decode with the appropriate ASCII compatible encoding (since ASCII compatibility is needed for the escape sequences to be valid), then run the result through ast.literal_eval?

    (I'll grant we don't currently provide guidance on reversing backslashreplace either, but addressing that sounds like a separate question from this change)
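
    The two-step reversal suggested here could be sketched as follows. `undo_namereplace` is a hypothetical helper, not part of the patch or the standard library, and wrapping the text in double quotes is an assumption that only holds when the original text contains no double quotes, backslashes, or line breaks.

```python
import ast


def undo_namereplace(data, encoding):
    """Hypothetical helper: reverse namereplace/backslashreplace output.

    Assumes the original text had no double quotes, backslashes or newlines,
    since those would change how the quoted literal is parsed.
    """
    # Step 1: decode with the ASCII-compatible encoding that was used to encode.
    escaped = data.decode(encoding)
    # Step 2: let the string-literal parser resolve the \N{...} and \uXXXX escapes.
    return ast.literal_eval('"' + escaped + '"')


original = 'Привет ∀'
encoded = original.encode('iso8859_5', 'namereplace')
restored = undo_namereplace(encoded, 'iso8859_5')
print(restored == original)
```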

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 25, 2014

    New changeset 32d08aacffe0 by Serhiy Storchaka in branch 'default':
    Issue bpo-19676: Added the "namereplace" error handler.
    https://hg.python.org/cpython/rev/32d08aacffe0

    @serhiy-storchaka
    Member Author

    serhiy-storchaka commented Nov 25, 2014

    Thank you all for the reviews.

    > One nit: the name buffer length should be NAME_MAXLEN instead of 100.

    NAME_MAXLEN is a private name available only in Modules/unicodedata.c. Making it a public name would be a separate issue. I have increased the buffer size to 256.
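
    As a rough sanity check on the 256-byte buffer, the longest character name known to the running interpreter can be measured from Python. This is only a sketch: the result depends on the Unicode database version bundled with the interpreter, and it says nothing about how the C code sizes its output.

```python
import sys
import unicodedata

# Scan every code point; unnamed ones (including surrogates) yield ''.
longest = max(
    (unicodedata.name(chr(cp), '') for cp in range(sys.maxunicode + 1)),
    key=len,
)

print(len(longest), longest)
```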

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 25, 2014

    New changeset b6fab008d63a by Berker Peksag in branch 'default':
    Issue bpo-19676: Tweak documentation a bit.
    https://hg.python.org/cpython/rev/b6fab008d63a

    @python-dev
    Mannequin

    python-dev mannequin commented Nov 26, 2014

    New changeset 21d1571c0533 by Serhiy Storchaka in branch 'default':
    Issue bpo-19676: Fixed integer overflow issue in "namereplace" error handler.
    https://hg.python.org/cpython/rev/21d1571c0533

    @serhiy-storchaka
    Member Author

    serhiy-storchaka commented Nov 26, 2014

    Thank you Berker.

    @ned-deily
    Member

    ned-deily commented Nov 26, 2014

    ../../source/Python/codecs.c:1022:16: error: use of undeclared identifier 'out'; did you mean 'outp'?
            assert(out == start + ressize);
                   ^~~
                   outp

    @ned-deily ned-deily reopened this Nov 26, 2014
    @ned-deily
    Member

    ned-deily commented Nov 26, 2014

    Fixed in ce8a8531d29a

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022