zipfile zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf' #80754

Open
JozefCernak mannequin opened this issue Apr 9, 2019 · 7 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@JozefCernak

JozefCernak mannequin commented Apr 9, 2019

BPO 36573
Nosy @Yhg1s, @serhiy-storchaka

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = None
created_at = <Date 2019-04-09.09:49:43.095>
labels = ['type-bug', 'library']
title = "zipfile zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf'"
updated_at = <Date 2019-04-09.12:06:00.945>
user = 'https://bugs.python.org/JozefCernak'

bugs.python.org fields:

activity = <Date 2019-04-09.12:06:00.945>
actor = 'serhiy.storchaka'
assignee = 'none'
closed = False
closed_date = None
closer = None
components = ['Library (Lib)']
creation = <Date 2019-04-09.09:49:43.095>
creator = 'Jozef Cernak'
dependencies = []
files = []
hgrepos = []
issue_num = 36573
keywords = []
message_count = 7.0
messages = ['339722', '339727', '339728', '339729', '339730', '339734', '339736']
nosy_count = 4.0
nosy_names = ['twouters', 'alanmcintyre', 'serhiy.storchaka', 'Jozef Cernak']
pr_nums = []
priority = 'normal'
resolution = None
stage = None
status = 'open'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue36573'
versions = ['Python 3.5']

@JozefCernak

JozefCernak mannequin commented Apr 9, 2019

Hi,
the short program below works well for a 4-character password, but when I change the password length (parameter MAXD) I get this error:

Traceback (most recent call last):
  File "p33.py", line 54, in <module>
    zf.extractall( pwd=password.encode('cp850','replace'))
  File "/usr/lib/python3.5/zipfile.py", line 1347, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1335, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1399, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.5/shutil.py", line 73, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/zipfile.py", line 844, in read
    data = self._read1(n)
  File "/usr/lib/python3.5/zipfile.py", line 934, in _read1
    self._update_crc(data)
  File "/usr/lib/python3.5/zipfile.py", line 862, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf'

program:
import string, zipfile, zlib

from zipfile import ZipFile

zf= ZipFile('11_02_2019.pdf.zip')

MAXD=6

upper_case=string.ascii_uppercase
uc=list(upper_case)

n=len(uc)
print (n)

pos=[]
for k in range(0,MAXD):
    pos.append(0)
    
print (pos) 


for let in range(0,n):
    print (let, uc[let]) 








let=0
koniec=0;
k3=0
p=0

while koniec != MAXD :
    
 
    
    k=0
    
    password=''
    for k2 in range(0,MAXD):
        
        password=password+uc[pos[k2]]
        
    print   (password)
  
           
    try:

        with zipfile.ZipFile('11_02_2019.pdf.zip') as zf:
            zf.extractall( pwd=password.encode('cp850','replace'))
            print ("Password found:" + password)
            exit(0)
        
    except RuntimeError:
        pass
    
    except zlib.error:
        pass
        
    
    #print "ppppppppppppppppppppppppp",p,  paswd
    pos[0]=pos[0]+1

    for k2  in range(0,MAXD-1):
        if pos[k2]>=n:
            pos[k2]=0
            pos[k2+1]=pos[k2+1]+1

    koniec=0

    for k2 in range(0,MAXD):
        if pos[k2] >= n-1:
            koniec=koniec+1

I observed similar behaviour in an older version of Python (2.7) and the corresponding library.

The zip archive is protected by the simple password 'ABCD'; the file is not big, less than 1 MB.

Best regards
Jozef

@JozefCernak JozefCernak mannequin added type-crash A hard crash of the interpreter, possibly with a core dump stdlib Python modules in the Lib dir labels Apr 9, 2019
@SilentGhost SilentGhost mannequin added type-bug An unexpected behavior, bug, or error and removed type-crash A hard crash of the interpreter, possibly with a core dump labels Apr 9, 2019
@serhiy-storchaka

Do you get an error when you try to extract the file using the valid password?

@JozefCernak

JozefCernak mannequin commented Apr 9, 2019

Dear Serhiy,
with the correct password, the program works well:

OACD
PACD
QACD
RACD
SACD
TACD
UACD
VACD
WACD
XACD
YACD
ZACD
ABCD
Password found:ABCD

for five characters:
RRJBA
Traceback (most recent call last):
  File "p33.py", line 54, in <module>
    zf.extractall( pwd=password.encode('cp850','replace'))
  File "/usr/lib/python3.5/zipfile.py", line 1347, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1335, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1399, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.5/shutil.py", line 73, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/zipfile.py", line 844, in read
    data = self._read1(n)
  File "/usr/lib/python3.5/zipfile.py", line 934, in _read1
    self._update_crc(data)
  File "/usr/lib/python3.5/zipfile.py", line 862, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf'

specially for RRJBA
AAAAA
Traceback (most recent call last):
  File "p33.py", line 54, in <module>
    zf.extractall( pwd=password.encode('cp850','replace'))
  File "/usr/lib/python3.5/zipfile.py", line 1347, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1335, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1399, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.5/shutil.py", line 73, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/zipfile.py", line 844, in read
    data = self._read1(n)
  File "/usr/lib/python3.5/zipfile.py", line 934, in _read1
    self._update_crc(data)
  File "/usr/lib/python3.5/zipfile.py", line 862, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf'
for six characters:
KMQAAA
LMQAAA
MMQAAA
NMQAAA
OMQAAA
PMQAAA
QMQAAA
RMQAAA
SMQAAA
TMQAAA
UMQAAA
VMQAAA
WMQAAA
XMQAAA
YMQAAA
ZMQAAA
ANQAAA
Traceback (most recent call last):
  File "p33.py", line 54, in <module>
    zf.extractall( pwd=password.encode('cp850','replace'))
  File "/usr/lib/python3.5/zipfile.py", line 1347, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1335, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1399, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.5/shutil.py", line 73, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/zipfile.py", line 844, in read
    data = self._read1(n)
  File "/usr/lib/python3.5/zipfile.py", line 934, in _read1
    self._update_crc(data)
  File "/usr/lib/python3.5/zipfile.py", line 862, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019.pdf'

It seems that after a certain number of attempts the call behaves differently
than in the previous attempts to call
zf.extractall( pwd=password.encode('cp850','replace'))

Best regards

Jozef


@serhiy-storchaka

If you try to extract the file using an invalid password, it is an expected behavior.
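
For a password-testing loop like the one in the report, one option is to treat `zipfile.BadZipFile` the same way as the `RuntimeError` and `zlib.error` cases the script already ignores. The following is only a minimal sketch under that assumption: `password_opens` is a hypothetical helper name, not part of `zipfile`, and it reads members into memory with `ZipFile.read` instead of calling `extractall`, so nothing is written to disk while guessing.

```python
import zipfile
import zlib


def password_opens(archive_path, password: bytes) -> bool:
    """Return True only if every member can be read with `password`.

    Hypothetical helper for illustration. A wrong password may surface as
    RuntimeError (rejected by the early check), zlib.error (garbage fed to
    the decompressor), or BadZipFile (CRC-32 mismatch after decryption).
    """
    try:
        with zipfile.ZipFile(archive_path) as zf:
            for name in zf.namelist():
                # read() decrypts, decompresses and verifies the CRC-32.
                zf.read(name, pwd=password)
        return True
    except RuntimeError:
        return False
    except (zlib.error, zipfile.BadZipFile):
        # Note: BadZipFile is also raised for a genuinely corrupt archive,
        # so this branch cannot tell "wrong password" from "damaged file".
        return False
```

In the reporter's loop this would replace the `try`/`except` block: call `password_opens('11_02_2019.pdf.zip', password.encode('cp850', 'replace'))` and stop at the first `True`.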

@JozefCernak

JozefCernak mannequin commented Apr 9, 2019

OK, however the behaviour appears only after several attempts, i.e. it is
not regular but depends on the previous history: how, or how many times,
the function was called. I think such behaviour suggests that the function
stores previous data, i.e. history.
Best regards
Jozef



@JozefCernak

JozefCernak mannequin commented Apr 9, 2019

Hi,
I changed the zip file's password to the new string "RRJBB", a
combination that comes after RRJBA, to see what would happen.
At the input combination KWFEA
I got the message:

KWFEA
Traceback (most recent call last):
  File "p33.py", line 54, in <module>
    zf.extractall( pwd=password.encode('cp850','replace'))
  File "/usr/lib/python3.5/zipfile.py", line 1347, in extractall
    self.extract(zipinfo, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1335, in extract
    return self._extract_member(member, path, pwd)
  File "/usr/lib/python3.5/zipfile.py", line 1399, in _extract_member
    shutil.copyfileobj(source, target)
  File "/usr/lib/python3.5/shutil.py", line 73, in copyfileobj
    buf = fsrc.read(length)
  File "/usr/lib/python3.5/zipfile.py", line 844, in read
    data = self._read1(n)
  File "/usr/lib/python3.5/zipfile.py", line 934, in _read1
    self._update_crc(data)
  File "/usr/lib/python3.5/zipfile.py", line 862, in _update_crc
    raise BadZipFile("Bad CRC-32 for file %r" % self.name)
zipfile.BadZipFile: Bad CRC-32 for file '11_02_2019B.pdf'

Jozef



@serhiy-storchaka

This is how the weak encryption in ZIP files works. In 255 cases out of 256 a wrong password can be detected early (which makes the encryption even weaker). But in 1 case out of 256 this check passes, and you will get either a CRC mismatch error or a compressor-specific error if compression is used. There is even a very small chance (1 in 2**32 or so) that you will silently get incorrectly decrypted data.

It is better not to use the weak encryption in ZIP files. If you need to encrypt data safely, use third-party encryption libraries.
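
The 1-in-256 figure can be illustrated with a small pure-Python sketch of the traditional PKWARE ("ZipCrypto") cipher. It assumes the usual layout in which each encrypted member starts with a 12-byte header whose last byte is a known check value; `ZipCryptoKeys`, `passes_check_byte` and `count_false_positives` are hypothetical names for illustration, and the header contents and check byte `0xA7` are made-up test data, not taken from the reporter's archive.

```python
import itertools
import string


def _make_crc_table():
    # Standard reflected CRC-32 table (polynomial 0xEDB88320).
    table = []
    for i in range(256):
        crc = i
        for _ in range(8):
            crc = (crc >> 1) ^ 0xEDB88320 if crc & 1 else crc >> 1
        table.append(crc)
    return table


_CRC_TABLE = _make_crc_table()


class ZipCryptoKeys:
    """Key state of the traditional ZIP stream cipher (illustrative sketch)."""

    def __init__(self, password: bytes):
        self.k0, self.k1, self.k2 = 0x12345678, 0x23456789, 0x34567890
        for b in password:
            self._update(b)

    @staticmethod
    def _crc32_step(crc, byte):
        return (crc >> 8) ^ _CRC_TABLE[(crc ^ byte) & 0xFF]

    def _update(self, plain_byte):
        self.k0 = self._crc32_step(self.k0, plain_byte)
        self.k1 = (self.k1 + (self.k0 & 0xFF)) & 0xFFFFFFFF
        self.k1 = (self.k1 * 134775813 + 1) & 0xFFFFFFFF
        self.k2 = self._crc32_step(self.k2, self.k1 >> 24)

    def _stream_byte(self):
        t = self.k2 | 2
        return ((t * (t ^ 1)) >> 8) & 0xFF

    def encrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for p in data:
            out.append(p ^ self._stream_byte())
            self._update(p)  # keys are always updated with the plaintext byte
        return bytes(out)

    def decrypt(self, data: bytes) -> bytes:
        out = bytearray()
        for c in data:
            p = c ^ self._stream_byte()
            self._update(p)
            out.append(p)
        return bytes(out)


def passes_check_byte(encrypted_header: bytes, password: bytes, check: int) -> bool:
    """The early test: only 8 bits of the decrypted header are compared."""
    return ZipCryptoKeys(password).decrypt(encrypted_header)[11] == check


def count_false_positives(real_password: bytes, check: int = 0xA7) -> int:
    """Count wrong 3-letter uppercase passwords that still pass the check."""
    header = bytes(range(11)) + bytes([check])  # made-up header contents
    encrypted = ZipCryptoKeys(real_password).encrypt(header)
    candidates = (
        "".join(t).encode("ascii")
        for t in itertools.product(string.ascii_uppercase, repeat=3)
    )
    return sum(
        1
        for pwd in candidates
        if pwd != real_password and passes_check_byte(encrypted, pwd, check)
    )
```

Calling `count_false_positives(b"ABC")` tries all 26**3 = 17576 candidates; because only one byte is compared, roughly 1 in 256 wrong passwords should slip through. Those are the guesses that go on to fail later with a bad CRC-32 or a decompressor error, which is why the failures in the reports above look irregular rather than depending on earlier calls.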

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022