Trailing spaces at the end of file or directory names are not handled when extracting zip files on Windows. #94018

Closed
Rygone opened this issue Jun 20, 2022 · 3 comments
Labels
OS-windows type-bug An unexpected behavior, bug, or error

Comments

@Rygone
Contributor

Rygone commented Jun 20, 2022

Bug report

Trailing spaces at the end of file or directory names are not handled when extracting zip files on Windows.
FileNotFoundError: [Errno 2] No such file or directory: 'C:\\Documents \\test.txt'

Can be tested with this Documents.zip
and this piece of code:

from zipfile import ZipFile
with ZipFile('Documents.zip', 'r') as zip:
    zip.extractall()
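Since the attachment is not reproduced here, a similar archive can be built for testing. This is only a minimal sketch: the entry name 'Documents /test.txt' is inferred from the error message above, and make_test_zip is a hypothetical helper, not part of zipfile.

```python
import io
from zipfile import ZipFile

def make_test_zip():
    """Return the bytes of a zip whose only entry lives in 'Documents ' (note the trailing space)."""
    buf = io.BytesIO()
    with ZipFile(buf, 'w') as zf:
        # Entry names inside a zip always use '/' as the separator.
        zf.writestr('Documents /test.txt', '')
    return buf.getvalue()
```

Writing the returned bytes to a file named Documents.zip should give an archive usable with the snippet above.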

Fix proposal

cpython/Lib/zipfile.py : 1690

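# remove end spaces
def remove_end_spaces(x):
    # Scan a reversed copy and drop one trailing character per space found.
    for c in x[::-1]:
        if c == ' ':
            x = x[:-1]
        else:
            return x
    # Reached when x is empty or all spaces; avoids returning None.
    return x
arcname = (remove_end_spaces(x) for x in arcname)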

Your environment

  • CPython versions tested on: Python 3.9
  • Operating system and architecture: Windows 10 Professionnel 21H2 19044.1706
@Rygone Rygone added the type-bug An unexpected behavior, bug, or error label Jun 20, 2022
@dignissimus
Contributor

Removing trailing spaces can be done using str.rstrip.
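For illustration, a minimal sketch of str.rstrip applied to each path component. The name strip_trailing_spaces and the default separator are made up for this example.

```python
def strip_trailing_spaces(path, sep='\\'):
    """Strip trailing spaces from every component of a path string."""
    # rstrip(' ') removes only spaces, and only from the right-hand end.
    return sep.join(part.rstrip(' ') for part in path.split(sep))
```

Unlike a hand-written loop, rstrip also handles empty and all-space components without a special case.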

I don't think this is an issue with the zipfile library. Testing the file with unzip shows that the filename includes trailing spaces, and unzip extracts the file into a directory whose name keeps them. I don't think this behaviour needs to change.

sam@samtop /tmp % zip -Tv Documents.zip 
	zip warning: undefined bits used in flags = 0x0808: Documents /test.txt
Archive:  Documents.zip
    testing: Documents /test.txt      OK
No errors detected in compressed data of Documents.zip.
test of Documents.zip OK
sam@samtop /tmp % 

Testing with p7zip shows the same

sam@samtop /tmp % 7z t Documents.zip -bb3

7-Zip [64] 17.04 : Copyright (c) 1999-2021 Igor Pavlov : 2017-08-28
p7zip Version 17.04 (locale=en_GB.UTF-8,Utf16=on,HugeFiles=on,64 bits,6 CPUs x64)

Scanning the drive for archives:
1 file, 210 bytes (1 KiB)

Testing archive: Documents.zip
--
Path = Documents.zip
Type = zip
Physical Size = 210

T Documents /test.txt
Everything is Ok

Size:       0
Compressed: 210
sam@samtop /tmp % 
sam@samtop /tmp % 7z l Documents.zip -bb3

7-Zip [64] 17.04 : Copyright (c) 1999-2021 Igor Pavlov : 2017-08-28
p7zip Version 17.04 (locale=en_GB.UTF-8,Utf16=on,HugeFiles=on,64 bits,6 CPUs x64)

Scanning the drive for archives:
1 file, 210 bytes (1 KiB)

Listing archive: Documents.zip

--
Path = Documents.zip
Type = zip
Physical Size = 210

   Date      Time    Attr         Size   Compressed  Name
------------------- ----- ------------ ------------  ------------------------
2022-06-20 03:19:08 .....            0            2  Documents /test.txt
------------------- ----- ------------ ------------  ------------------------
2022-06-20 03:19:08                  0            2  1 files

@Rygone
Contributor Author

Rygone commented Jun 20, 2022

I completely agree about str.rstrip.

However, the problem is specific to Windows machines.
Windows Explorer extracts them without the trailing spaces because Windows does not allow file or directory names that end with spaces.
That's why I propose making the change in _sanitize_windows_name().

This is already done for trailing dots at line 1688.

So, a new proposal:
cpython/Lib/zipfile.py : 1687

# remove trailing dots and spaces
arcname = (x.rstrip(' .') for x in arcname.split(pathsep))
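To see the effect of that line in isolation, here is a standalone sketch. It is not the CPython source: strip_parts_sketch is a hypothetical name, and dropping components that end up empty is an assumption made for this example.

```python
def strip_parts_sketch(arcname, pathsep='\\'):
    """Strip trailing dots and spaces from each path component, then rejoin."""
    # remove trailing dots and spaces
    parts = (x.rstrip(' .') for x in arcname.split(pathsep))
    # rejoin, skipping components that became empty (assumption of this sketch)
    return pathsep.join(x for x in parts if x)
```

With this, the path from the error message, 'Documents \\test.txt', would become 'Documents\\test.txt'.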

@dignissimus
Copy link
Contributor

OK, if it causes errors on Windows, then updating the sanitisation function for Windows sounds very reasonable.

@ambv ambv closed this as completed in 176fd55 Jun 28, 2022
gvanrossum pushed a commit to gvanrossum/cpython that referenced this issue Jun 30, 2022
…honGH-94040)

Closes python#94018.

Co-authored-by: Sam Ezeh <sam.z.ezeh@gmail.com>
Co-authored-by: Shantanu <12621235+hauntsaninja@users.noreply.github.com>
Co-authored-by: Zachary Ware <zachary.ware@gmail.com>