Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

zipfile - extract truncates (existing) file when bad password provided (zip encryption weakness) #80428

Open
CristiFati mannequin opened this issue Mar 8, 2019 · 5 comments
Labels
3.8 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@CristiFati
Copy link
Mannequin

CristiFati mannequin commented Mar 8, 2019

BPO 36247
Nosy @Yhg1s, @ned-deily, @serhiy-storchaka, @CristiFati
PRs
  • [3.6] bpo-36247: zipfile - extract truncates (existing) file when bad password provided (zip encryption weakness) #12242
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.

    Show more details

    GitHub fields:

    assignee = None
    closed_at = None
    created_at = <Date 2019-03-08.22:56:50.804>
    labels = ['3.8', 'type-bug', 'library']
    title = 'zipfile - extract truncates (existing) file when bad password provided (zip encryption weakness)'
    updated_at = <Date 2019-03-18.02:02:48.092>
    user = 'https://github.com/CristiFati'

    bugs.python.org fields:

    activity = <Date 2019-03-18.02:02:48.092>
    actor = 'ned.deily'
    assignee = 'none'
    closed = False
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2019-03-08.22:56:50.804>
    creator = 'CristiFati'
    dependencies = []
    files = []
    hgrepos = []
    issue_num = 36247
    keywords = ['patch']
    message_count = 5.0
    messages = ['337543', '337545', '337570', '337610', '338155']
    nosy_count = 5.0
    nosy_names = ['twouters', 'alanmcintyre', 'ned.deily', 'serhiy.storchaka', 'CristiFati']
    pr_nums = ['12242']
    priority = 'normal'
    resolution = None
    stage = 'patch review'
    status = 'open'
    superseder = None
    type = 'behavior'
    url = 'https://bugs.python.org/issue36247'
    versions = ['Python 3.8']

    @CristiFati
    Copy link
    Mannequin Author

    CristiFati mannequin commented Mar 8, 2019

    PKWARE encryption password pre check algorithm (relying on an 8 bits value to differentiate passwords) is insanely short.

    Most of the wrong passwords are filtered out by the check, but some of them aren't. For the ones in the latter category, when trying to extract an archive member, a 0 lengthed file with its name will be created on the FS (overwriting any previous version).

    Usecase:

    1. Extract an archive member using the good password. File extracted
    2. Extract the same member using a wrong password:
      2.1 For most of the passwords, they will be detected and the operation cancelled
      2.2 But some of them, they won't be detected (false positives), but the decryption itself will fail overwriting the file (from Support "bpo-" in Misc/NEWS #1.) on FS but leaving it with 0 bytes content

    This is the about #2.2.

    More details on [SO]: zipfile.BadZipFile: Bad CRC-32 when extracting a password protected .zip & .zip goes corrupt on extract (@CristiFati's answer).

    @CristiFati CristiFati mannequin added stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error labels Mar 8, 2019
    @CristiFati
    Copy link
    Mannequin Author

    CristiFati mannequin commented Mar 8, 2019

    Submitted: #12242.

    As a note, it applies to any Python version.

    @serhiy-storchaka
    Copy link
    Member

    When you pass an incorrect password, it is not possible to guarantee that you will get an exception and the destination file will be kept unchanged. It is possible that you will get an incorrectly deciphered file without noticing.

    I do not think that the zipfile code needs to be changed. If you want to keep the content of the target file in case of error, just extract the file into other place.

    @CristiFati
    Copy link
    Mannequin Author

    CristiFati mannequin commented Mar 10, 2019

    Hm, I assumed that a bad password, will raise an exception (at some point). but, if it doesn't, the destination file will be overwritten (with the messed up content), which also happens now (so, no behavior change).

    This is trying to make wrong passwords behavior (when an exception is raised) consistent.

    What I can think of is that when some bytes were already extracted when the exception occurs, overwrite the existing file (if any), so at the end the faulty content will be kept (although I don't know haw this could help anyone), but in any case a 0 lengthed file is not a good option (from my PoV).

    Of course specifying another dir is an option, but I only see it as a workaround.

    @ned-deily
    Copy link
    Member

    I'll let it up to other coredevs to decide whether this is behavior that should be changed in master for the next feature release and/or whether a doc change is needed. But we are not going to change the behavior of 3.6 for sure so I am closing PR 12242.

    @ned-deily ned-deily added 3.8 (EOL) end of life and removed invalid labels Mar 18, 2019
    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022
    Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
    Labels
    3.8 (EOL) end of life stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error
    Projects
    Status: No status
    Development

    No branches or pull requests

    2 participants