UTF-8-Sig codec #41808

Closed
doerwalter opened this issue Apr 5, 2005 · 7 comments

@doerwalter
Contributor

BPO 1177307
Nosy @loewis, @doerwalter
Files
  • diff.txt
  • diff2.txt: Better handling of partial BOMs
  • diff3.txt
  • UnicodeBOM.txt
  Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/loewis'
    closed_at = <Date 2006-01-08.10:46:05.000>
    created_at = <Date 2005-04-05.19:26:11.000>
    labels = ['library']
    title = 'UTF-8-Sig codec'
    updated_at = <Date 2006-01-08.10:46:05.000>
    user = 'https://github.com/doerwalter'

    bugs.python.org fields:

    activity = <Date 2006-01-08.10:46:05.000>
    actor = 'loewis'
    assignee = 'loewis'
    closed = True
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2005-04-05.19:26:11.000>
    creator = 'doerwalter'
    dependencies = []
    files = ['6593', '6594', '6595', '6596']
    hgrepos = []
    issue_num = 1177307
    keywords = ['patch']
    message_count = 7.0
    messages = ['48154', '48155', '48156', '48157', '48158', '48159', '48160']
    nosy_count = 2.0
    nosy_names = ['loewis', 'doerwalter']
    pr_nums = []
    priority = 'normal'
    resolution = 'accepted'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue1177307'
    versions = []

    @doerwalter
    Contributor Author

    This patch implements a UTF-8-Sig codec. This codec
    works like UTF-8 but adds a BOM on writing and skips
    (at most) one BOM on reading.
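
The behavior described above can be illustrated with the `utf-8-sig` codec as it ships in Python 3 today. This is a minimal sketch and not the patch itself; the sample string is an arbitrary assumption.

```python
import codecs

text = "Hello, w\u00f6rld"  # arbitrary sample text (assumption)

# Writing: utf-8-sig prepends the three-byte UTF-8 BOM (EF BB BF).
encoded = text.encode("utf-8-sig")
plain = text.encode("utf-8")

# Reading: exactly one leading BOM is skipped.
decoded = encoded.decode("utf-8-sig")

# A second BOM is not skipped; it survives as U+FEFF in the result.
doubled = (codecs.BOM_UTF8 + encoded).decode("utf-8-sig")

# Input without any BOM decodes just like plain UTF-8.
no_bom = plain.decode("utf-8-sig")

# The plain utf-8 codec keeps the BOM as a U+FEFF character.
kept = encoded.decode("utf-8")
```

Because decoding tolerates input with or without a BOM, `utf-8-sig` is a reasonable choice for reading files of unknown origin, while writing with it only makes sense when the consumer expects the signature.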

    @doerwalter doerwalter added the stdlib Python modules in the Lib dir label Apr 5, 2005
    @doerwalter
    Contributor Author

    Logged In: YES
    user_id=89016

    This second version of the patch will return starting bytes
    immediately, if they don't look like a BOM.
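
The partial-BOM handling can be observed with the incremental decoder of the codec as it exists in current Python 3, on the assumption that it still behaves the way this comment describes. The helper `feed_bytewise` is a hypothetical name introduced only for this sketch.

```python
import codecs

def feed_bytewise(data):
    """Feed data to a utf-8-sig incremental decoder one byte at a time.

    Returns the list of strings produced by each decode() call,
    followed by the result of the final flush.
    """
    decoder = codecs.getincrementaldecoder("utf-8-sig")()
    outputs = [decoder.decode(data[i:i + 1]) for i in range(len(data))]
    outputs.append(decoder.decode(b"", final=True))
    return outputs

# With a BOM: the three BOM bytes are held back and then dropped.
with_bom = feed_bytewise(codecs.BOM_UTF8 + b"hi")

# Without a BOM: the first byte cannot start a BOM, so it is
# returned immediately instead of being buffered.
without_bom = feed_bytewise(b"hi")

# U+FF41 encodes as EF BD 81. The first byte matches the start of a
# BOM and is held back; the second byte rules the BOM out, and the
# character is returned once its UTF-8 sequence is complete.
lookalike = feed_bytewise("\uff41".encode("utf-8"))
```

Returning non-BOM bytes right away matters for stream readers, which would otherwise stall on short inputs while waiting for three bytes to arrive.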

    @loewis
    Mannequin

    loewis mannequin commented Aug 7, 2005

    Logged In: YES
    user_id=21627

    The patch looks fine, but lacks documentation changes.

    @doerwalter
    Contributor Author

    Logged In: YES
    user_id=89016

    This version (diff3.txt) of the patch adds a note to
    Misc/NEWS and a section to Doc/lib/libcodecs.tex. Is this
    the correct place to add the documentation?

    @loewis
    Mannequin

    loewis mannequin commented Aug 9, 2005

    Logged In: YES
    user_id=21627

    The place is right, but I feel this documentation is
    still incomplete. The library reference should explain
    somewhere what the difference between utf-8 and utf-8-sig
    is. Perhaps a footnote could be added. I think I would
    prefer a separate subsection on the BOM, explaining byte
    order in UTF-{16,32}, and how the BOM can be used as a magic
    signature for UTF-8.
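
The two roles of the BOM mentioned here, marking byte order in UTF-16 and UTF-32 and acting as a signature for UTF-8, can be sketched with the constants in the `codecs` module. The function `sniff_bom` is a hypothetical helper written for this illustration, not part of the standard library or of the patch.

```python
import codecs

# Candidate BOMs mapped to a codec that consumes them on decoding.
# UTF-32 is checked before UTF-16 because the UTF-32 LE BOM
# (FF FE 00 00) begins with the UTF-16 LE BOM (FF FE).
_BOMS = [
    (codecs.BOM_UTF32_LE, "utf-32"),
    (codecs.BOM_UTF32_BE, "utf-32"),
    (codecs.BOM_UTF8, "utf-8-sig"),
    (codecs.BOM_UTF16_LE, "utf-16"),
    (codecs.BOM_UTF16_BE, "utf-16"),
]

def sniff_bom(data):
    """Return a codec name suggested by a leading BOM, or None."""
    for bom, name in _BOMS:
        if data.startswith(bom):
            return name
    return None

# Byte order changes the serialized form of the same character.
little = "A".encode("utf-16-le")   # 41 00
big = "A".encode("utf-16-be")      # 00 41

# The generic utf-16 decoder reads the BOM to choose the byte order.
from_big = (codecs.BOM_UTF16_BE + big).decode("utf-16")
from_little = (codecs.BOM_UTF16_LE + little).decode("utf-16")

# In UTF-8 there is no byte order; the BOM only identifies the encoding.
guess = sniff_bom("caf\u00e9".encode("utf-8-sig"))
```

BOM sniffing is only a heuristic: data without a BOM yields `None`, and UTF-16 LE text whose first character is U+0000 is indistinguishable from a UTF-32 LE BOM.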

    @doerwalter
    Contributor Author

    Logged In: YES
    user_id=89016

    OK, here's a text that explains what the BOM is used for in
    various Unicode encodings. I hope that this can be turned
    into something useful.

    @loewis
    Mannequin

    loewis mannequin commented Jan 8, 2006

    Logged In: YES
    user_id=21627

    Thanks for the patch. Committed as 41977.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022