zipfile.is_zipfile wrongly recognizes non-zip as zip #60939

Closed
bkabrda mannequin opened this issue Dec 20, 2012 · 10 comments
Labels
stdlib Python modules in the Lib dir type-bug An unexpected behavior, bug, or error

Comments

@bkabrda
Mannequin

bkabrda mannequin commented Dec 20, 2012

BPO 16735
Nosy @bitdancer, @serhiy-storchaka

Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


GitHub fields:

assignee = None
closed_at = <Date 2012-12-20.16:21:52.537>
created_at = <Date 2012-12-20.08:30:11.572>
labels = ['invalid', 'type-bug', 'library']
title = 'zipfile.is_zipfile wrongly recognizes non-zip as zip'
updated_at = <Date 2012-12-21.09:47:43.391>
user = 'https://bugs.python.org/bkabrda'

bugs.python.org fields:

activity = <Date 2012-12-21.09:47:43.391>
actor = 'serhiy.storchaka'
assignee = 'none'
closed = True
closed_date = <Date 2012-12-20.16:21:52.537>
closer = 'r.david.murray'
components = ['Library (Lib)']
creation = <Date 2012-12-20.08:30:11.572>
creator = 'bkabrda'
dependencies = []
files = []
hgrepos = []
issue_num = 16735
keywords = []
message_count = 10.0
messages = ['177804', '177806', '177807', '177830', '177831', '177834', '177835', '177836', '177866', '177872']
nosy_count = 3.0
nosy_names = ['r.david.murray', 'serhiy.storchaka', 'bkabrda']
pr_nums = []
priority = 'normal'
resolution = 'not a bug'
stage = 'resolved'
status = 'closed'
superseder = None
type = 'behavior'
url = 'https://bugs.python.org/issue16735'
versions = ['Python 2.7', 'Python 3.3']

@bkabrda
Mannequin Author

bkabrda mannequin commented Dec 20, 2012

When I call zipfile.is_zipfile on the fastjar binary from libgcj (sample uploaded at [1]), I get True, although I should get False (reproducible with fastjar from libgcj 4.7.2 on Fedora 18).
This happens because the stringEndArchive string is present in the file, even though the file is not a zip archive. Would it be possible to add further checks to eliminate this kind of error? I'd like to submit a patch, but I'm not sure what to check for; maybe some other constants mentioned in the ZIP format definition?

Thanks a lot.

[1] http://bkabrda.fedorapeople.org/fastjar
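The report above can be approximated without the fastjar sample. The following is a minimal sketch, assuming that a non-zip file which merely embeds the end-of-central-directory signature is enough to exercise the same code path. The file contents and the helper name make_fake_binary are made up for illustration, and the result may depend on the Python version, so no expected output is shown.

```python
import os
import tempfile
import zipfile

# End-of-central-directory signature. The zipfile module keeps the same
# bytes in its undocumented module-level constant stringEndArchive.
EOCD_SIGNATURE = b"PK\x05\x06"


def make_fake_binary(path):
    """Write a non-zip file that only embeds the signature as a constant.

    Hypothetical stand-in for a zip-handling executable; the bytes are
    invented and do not come from the fastjar sample.
    """
    with open(path, "wb") as f:
        f.write(b"\x7fELF" + b"\x00" * 60)  # pretend executable header
        f.write(EOCD_SIGNATURE)              # constant a zip tool would carry
        f.write(b"\x90" * 64)                # unrelated trailing bytes


with tempfile.TemporaryDirectory() as tmp:
    fake = os.path.join(tmp, "fake_binary")
    make_fake_binary(fake)
    # No zip archive was written, yet the signature is present in the file.
    detected = zipfile.is_zipfile(fake)
    print("is_zipfile:", detected)
```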

@bkabrda bkabrda mannequin added the stdlib Python modules in the Lib dir label Dec 20, 2012
@serhiy-storchaka
Member

You can upload a sample file to the bug tracker.

Actually, jar files are just zip files (with some limitations and special files). zipfile.is_zipfile should return True for a jar file.

@bkabrda
Mannequin Author

bkabrda mannequin commented Dec 20, 2012

Oh, sorry, I will upload it to the bug tracker next time.

I know that jar files are zip files, but this is not a jar (although it has "jar" in its file name). This is a binary.

@bitdancer
Member

I'm imagining that it creates jar files, and thus has the signature as a constant. The is_zipfile check is much more complicated than just looking for that string, though, so what is going on must be even more perverse than that. It would be interesting to know if other zip tools have an issue with it, although be careful when comparing, since is_zipfile only does the initial check, whereas running another unzip tool against it may produce an error, but only later in the process (after the zip tool has decided it is a zip file and tries to process it).
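The distinction between the initial check and later processing can be sketched in code. This is an illustrative workaround, not something proposed in this thread: the helper opens_as_zip is a hypothetical name, and it simply asks zipfile.ZipFile to parse the central directory and treats a failure as "not a zip". The fake payload is constructed so that its end record points at bytes that are not a central directory.

```python
import io
import struct
import zipfile


def opens_as_zip(source):
    """Return True only if zipfile can parse the central directory.

    `source` may be a path or a seekable binary file object.
    (Hypothetical helper, stricter than zipfile.is_zipfile.)
    """
    try:
        with zipfile.ZipFile(source) as archive:
            archive.namelist()
        return True
    except (zipfile.BadZipFile, OSError, ValueError):
        # BadZipFile: the structures are inconsistent.
        # OSError/ValueError: seeking to a bogus offset failed.
        return False


# A genuine archive built in memory.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hi")
real_zip = buf.getvalue()

# Not an archive: 100 filler bytes followed by a well-formed end record
# that claims a 46-byte central directory at offset 0.
fake = b"A" * 100 + struct.pack("<4s4H2LH", b"PK\x05\x06", 0, 0, 1, 1, 46, 0, 0)

for label, payload in (("real", real_zip), ("fake", fake)):
    print(label,
          "is_zipfile:", zipfile.is_zipfile(io.BytesIO(payload)),
          "opens_as_zip:", opens_as_zip(io.BytesIO(payload)))
```

The trade-off is cost: opening the archive reads the whole central directory, whereas is_zipfile is meant to be a quick check.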

@serhiy-storchaka
Member

$ zipinfo fastjar
Archive:  fastjar
Zip file size: 47664 bytes, number of entries: 31883

[fastjar]:
Zipfile is disk 33807 of a multi-disk archive, and this is not the disk on
which the central zipfile directory begins (disk 190).

I.e. zipinfo detects fastjar as a zip file but fails to read its contents (unzip -l fastjar and python -m zipfile -l fastjar fail too). The file contains obviously incorrect values in the control structures.
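To see what "incorrect values in the control structures" can look like, the fixed 22-byte end-of-central-directory record can be decoded with struct. This is a rough sketch under stated assumptions: the field layout follows the ZIP format's end record, the names EndRecord, find_end_record and plausibility_problems are hypothetical, the sanity checks are only examples of what a stricter test might look at, and the garbage values are invented rather than taken from the fastjar file.

```python
import io
import struct
import zipfile
from collections import namedtuple

EOCD_SIGNATURE = b"PK\x05\x06"
EOCD_FORMAT = "<4s4H2LH"  # little-endian layout of the fixed 22-byte record
EOCD_SIZE = struct.calcsize(EOCD_FORMAT)
CENTRAL_HEADER_SIZE = 46  # fixed part of one central directory entry

EndRecord = namedtuple(
    "EndRecord",
    "signature disk_number cd_start_disk entries_on_disk entries_total "
    "cd_size cd_offset comment_length",
)


def find_end_record(data):
    """Decode the last end-record candidate in `data`, or return None."""
    start = data.rfind(EOCD_SIGNATURE)
    if start < 0 or len(data) - start < EOCD_SIZE:
        return None
    return EndRecord._make(struct.unpack_from(EOCD_FORMAT, data, start))


def plausibility_problems(record, file_size):
    """List reasons the record looks wrong (illustrative checks only)."""
    problems = []
    if record.disk_number != record.cd_start_disk:
        problems.append("claims to be part of a multi-disk archive")
    if record.cd_offset + record.cd_size > file_size:
        problems.append("central directory extends past end of file")
    if record.entries_total * CENTRAL_HEADER_SIZE > record.cd_size:
        problems.append("more entries than the central directory can hold")
    return problems


# Made-up garbage after the signature, standing in for executable bytes.
garbage = b"\x7fELF" + b"\x00" * 28 + struct.pack(
    EOCD_FORMAT, EOCD_SIGNATURE, 7, 3, 500, 500, 10, 1_000_000, 0)

# A genuine single-entry archive for comparison.
buf = io.BytesIO()
with zipfile.ZipFile(buf, "w") as zf:
    zf.writestr("hello.txt", "hi")
genuine = buf.getvalue()

for label, data in (("garbage", garbage), ("genuine", genuine)):
    record = find_end_record(data)
    print(label, record, plausibility_problems(record, len(data)))
```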

@bitdancer
Member

So, it looks like this is not a bug in Python, just a weirdness of fastjar. Or, if you prefer, a bug in fastjar (they could assemble the signature instead of coding it as a single constant).

@bitdancer bitdancer added invalid type-bug An unexpected behavior, bug, or error labels Dec 20, 2012
@serhiy-storchaka
Member

It's rather a bug in the ZIP format design.

@bitdancer
Member

Well, yes, but that ship has already sunk :)

@bkabrda
Mannequin Author

bkabrda mannequin commented Dec 21, 2012

Tried is_zipfile on /usr/bin/zip and it returns True, too, so it seems that this is a more general problem for zip-handling binaries... Anyway, thank you both, I agree that there is not much that can be done here.

@serhiy-storchaka
Member

zipinfo detects /usr/bin/zip as a zip archive too.

@ezio-melotti ezio-melotti transferred this issue from another repository Apr 10, 2022