ZipFile does not support Unicode Path Extra Field (0x7075) zip header field #86094
See attached sample. The well-known unzip command-line tool lists its contents correctly:

$ unzip -l 23.zip
Archive: 23.zip
Length Date Time Name
--------- ---------- -----
--------- -------

But ZipFile lists the same file inside this archive under a different, incorrect name. This is because ZipFile completely ignores the Unicode Path Extra Field (0x7075) zip header field. See the .ZIP specification for details on this field's meaning and usage.
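For illustration, here is a minimal sketch of how the 0x7075 field could be read from an entry's raw extra-field bytes. It assumes the layout given in the .ZIP specification: a 1-byte version (currently 1), a 4-byte CRC-32 of the file name as stored in the header, then the UTF-8 name. The function name `unicode_path_from_extra` is hypothetical; this is not the code from the PR, nor anything ZipFile currently does.

```python
import struct
import zlib

UNICODE_PATH_ID = 0x7075  # Info-ZIP Unicode Path Extra Field header ID


def unicode_path_from_extra(extra, raw_name):
    """Return the UTF-8 name stored in a 0x7075 extra field, or None.

    extra    -- raw extra-field bytes of one entry
    raw_name -- file name bytes exactly as stored in the entry header
    (Hypothetical helper, for illustration only.)
    """
    pos = 0
    # The extra field is a sequence of (id, size, data) records.
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        data = extra[pos + 4:pos + 4 + size]
        pos += 4 + size
        # Skip other records, and records that are truncated or too short.
        if header_id != UNICODE_PATH_ID or len(data) != size or size < 5:
            continue
        version, name_crc = struct.unpack_from("<BL", data, 0)
        # The field only applies if its CRC matches the header file name;
        # otherwise the name was changed later and the field is stale.
        if version != 1 or name_crc != zlib.crc32(raw_name):
            return None
        try:
            return data[5:].decode("utf-8")
        except UnicodeDecodeError:
            return None
    return None
```

With the standard library, the extra bytes are available as `ZipInfo.extra`. The raw header name is not exposed directly, so a caller would have to re-encode the decoded name with whatever encoding was used to read it, which is an assumption this sketch leaves to the caller.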
Grand unified algorithm to read filenames from zip files correctly:
p7zip with the oemcp patch (https://github.com/unxed/oemcp/) uses exactly this method and processes all zip files in my test set correctly (the test set contains several zips generated by different packers on Windows, macOS, and Linux, and by online services). The same algorithm should be used in any zip unpacker that wants to handle non-Latin filenames as gracefully as possible.
More than a month ago I submitted a PR that adds support for the Unicode Path Extra Field in ZipFile.