ZeroDivisionError in the end of zip file #13

Closed
tropicoo opened this issue Jun 28, 2021 · 5 comments

@tropicoo

Thanks for the lib.
I got ZeroDivisionError: integer division or modulo by zero while processing the zip file's last chunk, using the example code snippet:

Traceback (most recent call last):
  File "***\tmp.py", line 9, in <module>
    for file_name, file_size, unzipped_chunks in stream_unzip(zipped_chunks()):
  File "***\venv_win32_39py\lib\site-packages\stream_unzip.py", line 180, in stream_unzip
    for _ in yield_all():
  File "***\venv_win32_39py\lib\site-packages\stream_unzip.py", line 35, in _yield_all
    offset = (offset + to_yield) % len(chunk)
ZeroDivisionError: integer division or modulo by zero
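
The failing line in the traceback takes a modulo by len(chunk), so a single empty chunk would be enough to trigger the error. A minimal sketch of that arithmetic, where advance_offset is a hypothetical stand-in for that one line (not the library's actual function) and the argument values are made up for illustration:

```python
def advance_offset(offset, to_yield, chunk):
    # Same expression as the line shown in the traceback
    return (offset + to_yield) % len(chunk)


# With a non-empty chunk the offset simply wraps around the chunk length
print(advance_offset(3, 2, b'abcd'))

# With a zero-length chunk, len(chunk) is 0, so the modulo raises
try:
    advance_offset(0, 0, b'')
except ZeroDivisionError as exc:
    print('ZeroDivisionError:', exc)
```

The exact wording of the exception message can differ between Python versions.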

Code snippet:

import httpx
from stream_unzip import stream_unzip


def zipped_chunks():
    # Any iterable that yields a zip file
    with httpx.stream('GET', 'https://www.gyan.dev/ffmpeg/builds/packages/ffmpeg-4.4-essentials_build.zip') as r:
        yield from r.iter_bytes()


for file_name, file_size, unzipped_chunks in stream_unzip(zipped_chunks()):
    for chunk in unzipped_chunks:
        # print(chunk)
        print(file_name, file_size)

Python 3.9.5 (Windows 10)
stream-unzip 0.0.23

@michalc
Member

michalc commented Jul 4, 2021

Thanks for the report! I have reproduced the issue, and am investigating.

@michalc
Member

michalc commented Jul 4, 2021

I'm not entirely sure that this isn't an issue with httpx... it surprises me it yields any zero-length chunks, even at the end.

I've started a discussion at encode/httpx#1733

In the meantime, you can filter out zero-length chunks with an intermediate generator:

import httpx
from stream_unzip import stream_unzip

def without_zero_length(chunks):
    for chunk in chunks:
        if chunk:
            yield chunk

def zipped_chunks():
    with httpx.stream('GET', 'https://www.gyan.dev/ffmpeg/builds/packages/ffmpeg-4.4-essentials_build.zip') as r:
        yield from without_zero_length(r.iter_bytes())

for file_name, file_size, unzipped_chunks in stream_unzip(zipped_chunks()):
    for chunk in unzipped_chunks:
        print(file_name, file_size)
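
The filter can be checked without any network access. This is a small sketch using made-up in-memory chunks (the byte values are placeholders, not a valid zip file) with an empty chunk at the end, as in the report:

```python
def without_zero_length(chunks):
    # Pass through only the chunks that contain at least one byte
    for chunk in chunks:
        if chunk:
            yield chunk


# Simulated stream: two data chunks followed by a trailing empty chunk
simulated = [b'PK\x03\x04', b'payload', b'']
filtered = list(without_zero_length(simulated))
print(filtered)
```

Only the two non-empty chunks should reach stream_unzip.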

@michalc
Member

michalc commented Jul 4, 2021

Ah, or in this case httpx's iter_raw also works, without needing to filter out zero-length chunks.

import httpx
from stream_unzip import stream_unzip

def zipped_chunks():
    with httpx.stream('GET', 'https://www.gyan.dev/ffmpeg/builds/packages/ffmpeg-4.4-essentials_build.zip') as r:
        yield from r.iter_raw()

for file_name, file_size, unzipped_chunks in stream_unzip(zipped_chunks()):
    for chunk in unzipped_chunks:
        print(file_name, file_size)

I suspect you can only replace iter_bytes with iter_raw if there isn't any additional content encoding on the HTTP response. It would be strange if there were, since the content is a zip file, but you never know...
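
One way to guard against that case is to look at the Content-Encoding response header before choosing between the two. The following is only a sketch: zip_body_chunks is a hypothetical helper name, and it assumes a response object exposing headers, iter_raw() and iter_bytes(chunk_size=...) the way the httpx snippets above use them.

```python
def zip_body_chunks(response, chunk_size=65536):
    # Hypothetical helper: use the raw bytes only when no content encoding is applied
    encoding = response.headers.get('content-encoding', 'identity').strip().lower()
    if encoding in ('', 'identity'):
        chunks = response.iter_raw()
    else:
        # Let the client decode the body (e.g. gzip) before it reaches stream_unzip
        chunks = response.iter_bytes(chunk_size=chunk_size)
    for chunk in chunks:
        # Defensive: drop any zero-length chunks in either case
        if chunk:
            yield chunk
```

Inside zipped_chunks this would be used as yield from zip_body_chunks(r).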

@michalc
Member

michalc commented Jul 4, 2021

Ah, found a more robust workaround: specifying chunk_size in the call to iter_bytes makes it avoid the zero-length chunk that stream-unzip doesn't handle:

import httpx
from stream_unzip import stream_unzip

def zipped_chunks():
    with httpx.stream('GET', 'https://www.gyan.dev/ffmpeg/builds/packages/ffmpeg-4.4-essentials_build.zip') as r:
        yield from r.iter_bytes(chunk_size=65536)

for file_name, file_size, unzipped_chunks in stream_unzip(zipped_chunks()):
    for chunk in unzipped_chunks:
        print(file_name, file_size)
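
For sources other than httpx, a similar effect can be had by re-chunking the stream yourself. This is a rough sketch of fixed-size re-chunking, not httpx's actual implementation; rechunk is a hypothetical name. Because it only yields full buffers plus a non-empty remainder, it never emits a zero-length chunk:

```python
def rechunk(chunks, chunk_size):
    # Accumulate incoming bytes and emit them in pieces of exactly chunk_size
    buffer = bytearray()
    for chunk in chunks:
        buffer.extend(chunk)
        while len(buffer) >= chunk_size:
            yield bytes(buffer[:chunk_size])
            del buffer[:chunk_size]
    # Emit whatever is left at the end, but only if there is something left
    if buffer:
        yield bytes(buffer)


# Empty chunks in the input simply add nothing to the buffer
print(list(rechunk([b'abc', b'', b'defgh', b''], 4)))
```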

@michalc
Member

michalc commented Jul 10, 2021

I've opted to not change the code, and instead change the README to have an example that works, and explicitly state that zero length chunks are not supported.

It's a tricky call to make, but all things being equal, I'm happier with the error since it indicates something is unexpected earlier in the processing.

[Users who would rather not fail on zero-length chunks can filter them out, as in the example above.]
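
Users who like the early failure but want a more descriptive message than ZeroDivisionError could put a strict guard in front of stream_unzip. This is a sketch only; reject_zero_length is a hypothetical helper and not part of stream-unzip:

```python
def reject_zero_length(chunks):
    # Fail fast, naming the position of the unexpected empty chunk
    for index, chunk in enumerate(chunks):
        if not chunk:
            raise ValueError(f'chunk {index} is empty; zero-length chunks are not supported')
        yield chunk


try:
    list(reject_zero_length([b'data', b'']))
except ValueError as exc:
    print(exc)
```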

michalc closed this as completed Jul 10, 2021