zipfile: inconsistent filenames with InfoZip "unzip" #38653

Closed
gward mannequin opened this issue Jun 15, 2003 · 6 comments
Labels
stdlib Python modules in the Lib dir

Comments

@gward
Mannequin

gward mannequin commented Jun 15, 2003

BPO 755031
Nosy @gvanrossum
Files
  • Demo.zip: zip file exhibiting inconsistent filenames
  • Note: these values reflect the state of the issue at the time it was migrated and might not reflect the current state.


    GitHub fields:

    assignee = None
    closed_at = <Date 2003-06-18.01:08:58.000>
    created_at = <Date 2003-06-15.21:23:46.000>
    labels = ['library']
    title = 'zipfile: inconsistent filenames with InfoZip "unzip"'
    updated_at = <Date 2003-06-18.01:08:58.000>
    user = 'https://bugs.python.org/gward'

    bugs.python.org fields:

    activity = <Date 2003-06-18.01:08:58.000>
    actor = 'gward'
    assignee = 'gward'
    closed = True
    closed_date = None
    closer = None
    components = ['Library (Lib)']
    creation = <Date 2003-06-15.21:23:46.000>
    creator = 'gward'
    dependencies = []
    files = ['920']
    hgrepos = []
    issue_num = 755031
    keywords = []
    message_count = 6.0
    messages = ['16417', '16418', '16419', '16420', '16421', '16422']
    nosy_count = 4.0
    nosy_names = ['gvanrossum', 'gward', 'sjones', 'ahlstromjc']
    pr_nums = []
    priority = 'normal'
    resolution = 'fixed'
    stage = None
    status = 'closed'
    superseder = None
    type = None
    url = 'https://bugs.python.org/issue755031'
    versions = []

    @gward
    Mannequin Author

    gward mannequin commented Jun 15, 2003

    zipfile.py gives filenames inconsistent with the
    InfoZIP "unzip" utility for certain ZIP files. My
    source is an email virus, so the ZIP files are almost
    certainly malformed. Nevertheless, it would be nice if
    "unzip -l" and ZipFile.namelist() gave consistent
    filenames.

    Example: the attached Demo.zip (extracted from an email
    virus caught on mail.python.org) looks like this
    according to InfoZip:

    $ unzip -l /tmp/Demo.zip
    Archive:  /tmp/Demo.zip
      Length     Date   Time    Name
     --------    ----   ----    ----
        44544  01-26-03 20:49   DOCUME~1\CHRISS~1\LOCALS~1\Temp\Demo.exe
     --------                   -------
        44544                   1 file

    But according to ZipFile.namelist(), the name of that
    file is:

    DOCUME~1\CHRISS~1\LOCALS~1\Temp\Demo.exescr000000000000000000.txt

    Getting the same result with Python 2.2.2 and a
    ~2-week-old build of 2.3 CVS.

    @gward gward mannequin closed this as completed Jun 15, 2003
    @gward gward mannequin self-assigned this Jun 15, 2003
    @gward gward mannequin added the stdlib Python modules in the Lib dir label Jun 15, 2003
    @gvanrossum
    Member

    Logged In: YES
    user_id=6380

    That almost sounds like an intentional inconsistency. Could
    it be that the central directory has one name but the local
    header has a different one? Or that there's a null byte in
    the filename so that the filename length is inconsistent?
    The front of the file looks like this according to od -c:

    0000000 P K 003 004 \n \0 \0 \0 \0 \0 * Š : . c Ì
    0000020 \v g \0 ® \0 \0 \0 ® \0 \0 D \0 \0 \0 D O
    0000040 C U M E ~ 1 \ C H R I S S ~ 1 \
    0000060 L O C A L S ~ 1 \ T e m p \ D e
    0000100 m o . e x e \0 \0 s c r \0 0 0 0 0
    0000120 0 0 0 0 0 0 0 0 0 0 0 0 0 0 . t
    0000140 x t M Z 220 \0 003 \0 \0 \0 004 \0 \0 \0 ÿ ÿ
    0000160 \0 \0 � \0 \0 \0 \0 \0 \0 \0 @ \0 \0 \0 \0 \0
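
The filename-length question can be checked by decoding the local file header directly. The following is a minimal sketch, assuming the standard 30-byte local header layout from the ZIP appnote. `parse_local_header` is a hypothetical helper, not part of zipfile, and the sample bytes are a synthetic header built for illustration, not the bytes of Demo.zip.

```python
import struct

# Local file header layout per the ZIP appnote: signature, version needed,
# flags, compression method, mod time, mod date, CRC-32, compressed size,
# uncompressed size, filename length, extra field length (30 bytes total).
LOCAL_HEADER = struct.Struct("<4sHHHHHIIIHH")


def parse_local_header(data):
    """Return (declared_length, raw_name) from a local file header.

    Hypothetical helper for illustration; not part of zipfile.
    """
    fields = LOCAL_HEADER.unpack_from(data, 0)
    if fields[0] != b"PK\x03\x04":
        raise ValueError("not a local file header")
    name_len = fields[9]
    raw_name = data[LOCAL_HEADER.size:LOCAL_HEADER.size + name_len]
    return name_len, raw_name


# Synthetic example (assumption): a name with embedded null bytes.
name = b"Demo.exe\x00\x00scr\x00000.txt"
header = LOCAL_HEADER.pack(b"PK\x03\x04", 10, 0, 0, 0, 0, 0, 0, 0, len(name), 0)
declared, raw = parse_local_header(header + name)

print(declared, raw)       # declared length and the full name bytes
print(raw.index(b"\x00"))  # offset of the first null byte inside the name
```

If the declared length covers bytes past the first null, the name is consistent with the header and the difference lies in how each tool interprets the null.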

    @sjones
    Mannequin

    sjones mannequin commented Jun 16, 2003

    Logged In: YES
    user_id=589306

    The actual filename from the zipfile is:
    filename =
    'DOCUME~1\\CHRISS~1\\LOCALS~1\\Temp\\Demo.exe\x00\x00scr\x00000000000000000000.txt'

    Notice there is a \x00 after Demo.exe. My guess is InfoZip
    stores the filename in a null terminated string and this
    extra null character in the filename terminates it at this
    point. Python doesn't care if you have nulls in the string,
    so it prints the entire filename.

    You can see the zip file format description at
    ftp://ftp.info-zip.org/pub/infozip/doc/appnote-981119-iz.zip

    The format does say:
    2) String fields are not null terminated, since the
    length is given explicitly.

    But it doesn't really say if strings are allowed to have
    nulls in them.

    So does Python or InfoZip get this right?
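
The two readings of the name can be compared side by side. This is only a sketch: the byte string is reconstructed from the name quoted above, and reading it through a ctypes buffer merely imitates C-style null termination. Whether InfoZip actually handles the name this way is the guess stated above, not something confirmed here.

```python
import ctypes

# Name bytes reconstructed from the comment above (assumption: 18 ASCII
# zeros between the last null byte and ".txt").
raw = (b"DOCUME~1\\CHRISS~1\\LOCALS~1\\Temp\\Demo.exe\x00\x00scr\x00"
       + b"0" * 18 + b".txt")

# Length-based view: Python keeps every byte, nulls included.
print(len(raw), raw)

# C-string view: the same buffer read as a null-terminated string
# stops at the first null byte.
c_view = ctypes.create_string_buffer(raw).value
print(len(c_view), c_view)
```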

    @ahlstromjc
    Mannequin

    ahlstromjc mannequin commented Jun 16, 2003

    Logged In: YES
    user_id=64929

    The analysis by sjones is correct. Python and the zip file
    format both allow null bytes in file names. But in this case,
    the file is infected with the "I-Worm.Lentin.o" virus and the
    file name is designed to hide this. The file name ends in ".txt"
    but the file name up to the null byte ends in ".exe". The
    intention is that a virus scanner would skip this file because it
    ends in ".txt" (a non-executable text file), but that
    the ".exe" would be seen (an executable program file) if the
    file were clicked, and so the file would be executed.

    Testing this on my machine, my virus scanner (Kaspersky)
    nevertheless flags the ".zip" file as containing a virus, but this
    depends on the particular virus scanner and its settings.

    I suggest that zipfile.py should terminate file names at a null
    byte as InfoZip does.
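
A rough sketch of that suggestion, and of why it matters for extension checks. `truncate_at_null` is a hypothetical helper written for illustration; the actual patch may do this differently.

```python
import os.path


def truncate_at_null(filename):
    """Return the part of filename before the first null character.

    Hypothetical helper sketching the suggestion above.
    """
    head, _, _ = filename.partition("\x00")
    return head


# Name reconstructed from the thread (assumption: 18 zeros before ".txt").
full = ("DOCUME~1\\CHRISS~1\\LOCALS~1\\Temp\\Demo.exe\x00\x00scr\x00"
        + "0" * 18 + ".txt")
short = truncate_at_null(full)

print(os.path.splitext(full)[1])   # extension a check on the full name sees
print(os.path.splitext(short)[1])  # extension after truncation
```

A check on the full name sees ".txt", while the truncated name exposes ".exe", which matches what "unzip -l" reports.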

    @ahlstromjc
    Mannequin

    ahlstromjc mannequin commented Jun 17, 2003

    Logged In: YES
    user_id=64929

    I submitted a patch for this. It is 755987. See further
    comments there.

    @gward
    Mannequin Author

    gward mannequin commented Jun 18, 2003

    Logged In: YES
    user_id=14422

    Fixed with patch bpo-755987.

    @ezio-melotti ezio-melotti transferred this issue from another repository Apr 9, 2022