MemoryError on zip.read in shutil._unpack_zipfile #87816

Closed
igorvoltaic mannequin opened this issue Mar 28, 2021 · 5 comments
Labels: 3.9 (only security fixes), 3.10 (only security fixes), 3.11 (only security fixes), performance (Performance or resource usage), stdlib (Python modules in the Lib dir)

Comments

@igorvoltaic
Mannequin

igorvoltaic mannequin commented Mar 28, 2021

BPO 43650
Nosy @gpshead, @animalize, @miss-islington, @igorvoltaic
PRs
  • bpo-43650: Fix MemoryError on zip.read in shutil._unpack_zipfile for large files #25058
  • [3.10] bpo-43650: Fix MemoryError on zip.read in shutil._unpack_zipfile for large files (GH-25058) #26190
  • [3.9] bpo-43650: Fix MemoryError on zip.read in shutil._unpack_zipfile for large files (GH-25058) #26191
Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = 'https://github.com/gpshead'
    closed_at = <Date 2021-05-17.17:36:05.781>
    created_at = <Date 2021-03-28.21:21:23.775>
    labels = ['3.11', 'library', '3.9', '3.10', 'performance']
    title = 'MemoryError on zip.read in shutil._unpack_zipfile'
    updated_at = <Date 2021-05-17.17:36:05.780>
    user = 'https://github.com/igorvoltaic'

    bugs.python.org fields:

    activity = <Date 2021-05-17.17:36:05.780>
    actor = 'gregory.p.smith'
    assignee = 'gregory.p.smith'
    closed = True
    closed_date = <Date 2021-05-17.17:36:05.781>
    closer = 'gregory.p.smith'
    components = ['Library (Lib)']
    creation = <Date 2021-03-28.21:21:23.775>
    creator = 'igorvoltaic'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 43650
    keywords = ['patch']
    message_count = 5.0
    messages = ['389652', '393692', '393819', '393820', '393821']
    nosy_count = 5.0
    nosy_names = ['gregory.p.smith', 'python-dev', 'malin', 'miss-islington', 'igorvoltaic']
    pr_nums = ['25058', '26190', '26191']
    priority = 'normal'
    resolution = 'fixed'
    stage = 'commit review'
    status = 'closed'
    superseder = None
    type = 'resource usage'
    url = 'https://bugs.python.org/issue43650'
    versions = ['Python 3.9', 'Python 3.10', 'Python 3.11']

    @igorvoltaic
    Mannequin Author

    igorvoltaic mannequin commented Mar 28, 2021

    MemoryError: null
    ...
    File "....", line 13, in repack__file
    shutil.unpack_archive(local_file_path, local_dir)
    File "python3.6/shutil.py", line 983, in unpack_archive
    func(filename, extract_dir, **kwargs)
    File "python3.6/shutil.py", line 901, in _unpack_zipfile
    data = zip.read(info.filename)
    File "python3.6/zipfile.py", line 1338, in read
    return fp.read()
    File "python3.6/zipfile.py", line 858, in read
    buf += self._read1(self.MAX_N)
    File "python3.6/zipfile.py", line 948, in _read1
    data = self._decompressor.decompress(data, n)

    shutil.unpack_archive tries to read each whole member file into memory without making use of any buffer at all, so Python fails with a MemoryError on really large files. In my case the archive is ~1.7G and the unpacked data is ~10G. Interestingly, zipfile.ZipFile.extractall handles this case more effectively.
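
For illustration, the buffered approach the report asks for can be sketched as below. This is a minimal sketch of the general technique (streaming each member through `ZipFile.open` and `shutil.copyfileobj`), not necessarily the exact patch that was merged. The helper name `unpack_zip_streaming` and its `chunk_size` parameter are hypothetical and exist only for this example.

```python
import os
import shutil
import zipfile


def unpack_zip_streaming(archive_path, extract_dir, chunk_size=1024 * 1024):
    """Hypothetical helper: extract a zip archive member by member,
    copying each file in fixed-size chunks instead of reading it whole."""
    with zipfile.ZipFile(archive_path) as zf:
        for info in zf.infolist():
            name = info.filename
            # Conservative filter: skip absolute paths and any name
            # containing "..", to avoid writing outside extract_dir.
            if name.startswith("/") or ".." in name:
                continue
            target = os.path.join(extract_dir, *name.split("/"))
            if name.endswith("/"):
                # Directory entry: create it and move on.
                os.makedirs(target, exist_ok=True)
                continue
            parent = os.path.dirname(target)
            if parent:
                os.makedirs(parent, exist_ok=True)
            # Stream the decompressed data; at most roughly chunk_size
            # bytes of member data are held in the copy buffer at a time.
            with zf.open(info, "r") as source, open(target, "wb") as dest:
                shutil.copyfileobj(source, dest, chunk_size)
```

On interpreter versions without a fix, calling `zipfile.ZipFile(path).extractall(dest)` directly is a possible workaround, since the report notes that it copes with the same archive.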

    @igorvoltaic igorvoltaic mannequin added type-crash A hard crash of the interpreter, possibly with a core dump 3.7 (EOL) end of life 3.8 only security fixes 3.9 only security fixes stdlib Python modules in the Lib dir labels Mar 28, 2021
    @igorvoltaic
    Mannequin Author

    igorvoltaic mannequin commented May 14, 2021

    Please review.

    @gpshead gpshead added 3.10 only security fixes 3.11 only security fixes and removed 3.7 (EOL) end of life 3.8 only security fixes labels May 17, 2021
    @gpshead gpshead self-assigned this May 17, 2021
    @gpshead gpshead added performance Performance or resource usage 3.10 only security fixes 3.11 only security fixes and removed type-crash A hard crash of the interpreter, possibly with a core dump labels May 17, 2021
    @miss-islington
    Contributor

    New changeset 049c412 by Miss Islington (bot) in branch '3.9':
    bpo-43650: Fix MemoryError on zip.read in shutil._unpack_zipfile for large files (GH-25058)
    049c412

    @gpshead
    Member

    gpshead commented May 17, 2021

    New changeset 7a58862 by Miss Islington (bot) in branch '3.10':
    bpo-43650: Fix MemoryError on zip.read in shutil._unpack_zipfile for large files (GH-25058) (GH-26190)
    7a58862

    @gpshead
    Member

    gpshead commented May 17, 2021

    Thanks for the patch!
