gzip.open() needs an optional encoding argument #56768

Closed
rhettinger opened this issue Jul 14, 2011 · 7 comments
Labels
stdlib Python modules in the Lib dir type-feature A feature request or enhancement

Comments

@rhettinger
Contributor

BPO 12559
Nosy @rhettinger, @amauryfa, @pitrou, @merwok, @durban, @serhiy-storchaka
Files
  • issue12559.patch: Patch 1
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2012-06-26.21:02:21.025>
    created_at = <Date 2011-07-14.15:31:22.219>
    labels = ['type-feature', 'library']
    title = 'gzip.open() needs an optional encoding argument'
    updated_at = <Date 2012-06-26.21:02:21.023>
    user = 'https://github.com/rhettinger'

    bugs.python.org fields:

    activity = <Date 2012-06-26.21:02:21.023>
    actor = 'nadeem.vawda'
    assignee = 'none'
    closed = True
    closed_date = <Date 2012-06-26.21:02:21.025>
    closer = 'nadeem.vawda'
    components = ['Library (Lib)']
    creation = <Date 2011-07-14.15:31:22.219>
    creator = 'rhettinger'
    dependencies = []
    files = ['22661']
    hgrepos = []
    issue_num = 12559
    keywords = ['patch']
    message_count = 7.0
    messages = ['140341', '140369', '140373', '140407', '140462', '163928', '164106']
    nosy_count = 8.0
    nosy_names = ['rhettinger', 'amaury.forgeotdarc', 'pitrou', 'nadeem.vawda', 'eric.araujo', 'daniel.urban', 'rasmusory', 'serhiy.storchaka']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = 'resolved'
    status = 'closed'
    superseder = None
    type = 'enhancement'
    url = 'https://bugs.python.org/issue12559'
    versions = ['Python 3.3']

    @rhettinger
    Contributor Author

    gzip.open() should parallel the built-in open() so that gzipped files can be read in the same way as regular files:

    for line in gzip.open('notes.txt', 'r', encoding='latin-1'):
        print(line.rstrip())

    @rhettinger rhettinger added stdlib Python modules in the Lib dir type-feature A feature request or enhancement labels Jul 14, 2011
    @durban
    Mannequin

    durban mannequin commented Jul 14, 2011

    Here is a patch. If the code changes are acceptable I can also make a documentation patch.

    (I'm surprised to see 3.2 in "Versions". I thought 3.2 only gets bugfixes...)

    @amauryfa
    Member

    There remains a difference between open() and gzip.open():
    open(filename, 'r', encoding=None) returns a text file (with a default encoding), whereas gzip.open() with the same arguments returns a binary file.

    Don't know how to fix this though.
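
The mismatch described above can be reproduced with a short script. This is only a minimal sketch: the file names are hypothetical and are created in a temporary directory purely for illustration.

```python
import gzip
import os
import tempfile

# Hypothetical file names, created in a temporary directory for illustration.
tmpdir = tempfile.mkdtemp()
plain_path = os.path.join(tmpdir, "notes.txt")
gz_path = os.path.join(tmpdir, "notes.txt.gz")

# Write the same Latin-1 content to a plain file and to a gzipped file.
with open(plain_path, "w", encoding="latin-1") as f:
    f.write("caf\u00e9\n")
with gzip.open(gz_path, "wb") as f:
    f.write("caf\u00e9\n".encode("latin-1"))

# Built-in open(): mode 'r' means text, so readline() returns str.
with open(plain_path, "r", encoding="latin-1") as f:
    plain_line = f.readline()

# gzip.open(): mode 'r' means binary, so readline() returns bytes.
with gzip.open(gz_path, "r") as f:
    gz_line = f.readline()

print(type(plain_line).__name__, type(gz_line).__name__)
```

The same mode string therefore yields `str` from one call and `bytes` from the other, and the bytes have to be decoded by hand to recover the text.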

    @pitrou
    Member

    pitrou commented Jul 15, 2011

    If we go this way, the "errors" and "newline" arguments should be added as well.
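
For reference, `gzip.open()` in Python 3.3 and later does accept `errors` and `newline` alongside `encoding` when a text mode is used. The sketch below illustrates what those two arguments do; the file name and the sample bytes are made up for the example.

```python
import gzip
import os
import tempfile

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "data.txt.gz")

# Store bytes that are not valid UTF-8 (the lone 0xFF) and use CRLF line endings.
with gzip.open(path, "wb") as f:
    f.write(b"ok\r\nbad \xff byte\r\n")

# errors="replace" substitutes U+FFFD for undecodable bytes instead of raising.
# newline="" keeps the original line endings instead of translating them to "\n".
with gzip.open(path, "rt", encoding="utf-8", errors="replace", newline="") as f:
    lines = f.readlines()

print(lines)
```

Both arguments have the same meaning as in the built-in `open()`.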

    @durban
    Mannequin

    durban mannequin commented Jul 15, 2011

    If we go this way, the "errors" and "newline" arguments should be added
    as well.

    Yeah, I thought about that. I can make a new patch that implements this, if needed. There does seem to be a real problem, though: the one that Amaury Forgeot d'Arc mentioned. I can't think of a way to solve it in a backwards-compatible way.

    @serhiy-storchaka
    Member

    Why not use io.TextIOWrapper? I think it is the right answer for this issue.

    @nadeemvawda
    Mannequin

    nadeemvawda mannequin commented Jun 26, 2012

    I already fixed this without knowing about this issue; see 55202ca694d7.

    storchaka:

    Why not use io.TextIOWrapper? I think it is the right answer for this issue.

    The proposed patch (and the code I committed) *do* use TextIOWrapper.

    Unless you mean that callers should create the TextIOWrapper themselves.
    This is certainly possible, but quite inconvenient for something that is
    conceptually simple, and not difficult to implement.
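
For comparison, the do-it-yourself alternative mentioned above might look roughly like the following. It is a sketch only: the file name is hypothetical, and the caller has to know to combine a binary-mode `gzip.open()` with `io.TextIOWrapper`.

```python
import gzip
import io
import os
import tempfile

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "notes.txt.gz")

# Create a gzipped file holding Latin-1 encoded text.
with gzip.open(path, "wb") as f:
    f.write("na\u00efve\n".encode("latin-1"))

# The caller wraps the binary gzip stream in a TextIOWrapper by hand.
# Closing the wrapper also closes the underlying gzip file.
with io.TextIOWrapper(gzip.open(path, "rb"), encoding="latin-1") as text_file:
    text = text_file.read()

print(text)
```

This works, but every caller has to repeat the wrapping step, which is the inconvenience the comment refers to.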

    amaury.forgeotdarc:

    There remains a difference between open() and gzip.open():
    open(filename, 'r', encoding=None) returns a text file (with a default encoding), whereas gzip.open() with the same arguments returns a binary file.

    The committed code unfortunately still has gzip.open(filename, "r")
    returning a binary file. This is something that cannot be fixed without
    breaking backward compatibility.

    However, it does provide a way to open a text file with the system's
    default encoding (encoding=None, or no encoding argument specified).
    To do this, you can use the "rt"/"wt"/"at" modes, just like with
    builtins.open(). Of course, this also works if you do specify an encoding
    explicitly.
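
A minimal sketch of the text modes described above, assuming Python 3.3 or later. The file name and contents are hypothetical; the read loop mirrors the one in the original request, with "rt" in place of "r".

```python
import gzip
import os
import tempfile

# Hypothetical file name in a temporary directory.
path = os.path.join(tempfile.mkdtemp(), "notes.txt.gz")

# "wt" writes text; str is encoded with the given encoding before compression.
with gzip.open(path, "wt", encoding="latin-1") as f:
    f.write("first line\n")

# "at" appends text to the existing gzipped file.
with gzip.open(path, "at", encoding="latin-1") as f:
    f.write("d\u00e9j\u00e0 vu\n")

# "rt" reads text; iterating yields decoded str lines.
lines = []
with gzip.open(path, "rt", encoding="latin-1") as f:
    for line in f:
        lines.append(line.rstrip())

print(lines)
```

Omitting the `encoding` argument in any of these calls falls back to the system's default encoding, as the comment notes.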

    @nadeemvawda nadeemvawda mannequin closed this as completed Jun 26, 2012
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022