http.client.IncompleteRead crash during extract #6

Closed
chfoo opened this issue May 26, 2014 · 1 comment

@chfoo
Owner

chfoo commented May 26, 2014

Traceback (most recent call last):
  File "/0/home/waxy/usr/local/lib/python3.4/runpy.py", line 170, in _run_module_as_main
    "__main__", mod_spec)
  File "/0/home/waxy/usr/local/lib/python3.4/runpy.py", line 85, in _run_code
    exec(code, run_globals)
  File "/0/home/waxy/usr/local/lib/python3.4/site-packages/warcat/__main__.py", line 154, in <module>
    main()
  File "/0/home/waxy/usr/local/lib/python3.4/site-packages/warcat/__main__.py", line 70, in main
    command_info[1](args)
  File "/0/home/waxy/usr/local/lib/python3.4/site-packages/warcat/__main__.py", line 131, in extract_command
    tool.process()
  File "/0/home/waxy/usr/local/lib/python3.4/site-packages/warcat/tool.py", line 112, in process
    raise e
  File "/0/home/waxy/usr/local/lib/python3.4/site-packages/warcat/tool.py", line 106, in process
    self.action(record)
  File "/0/home/waxy/usr/local/lib/python3.4/site-packages/warcat/tool.py", line 229, in action
    shutil.copyfileobj(response, f)
  File "/0/home/waxy/usr/local/lib/python3.4/shutil.py", line 66, in copyfileobj
    buf = fsrc.read(length)
  File "/0/home/waxy/usr/local/lib/python3.4/http/client.py", line 500, in read
    return super(HTTPResponse, self).read(amt)
  File "/0/home/waxy/usr/local/lib/python3.4/http/client.py", line 529, in readinto
    return self._readinto_chunked(b)
  File "/0/home/waxy/usr/local/lib/python3.4/http/client.py", line 621, in _readinto_chunked
    n = self._safe_readinto(mvb)
  File "/0/home/waxy/usr/local/lib/python3.4/http/client.py", line 680, in _safe_readinto
    raise IncompleteRead(bytes(mvb[0:total_bytes]), len(b))
http.client.IncompleteRead: IncompleteRead(7052 bytes read, 16384 more expected)
chfoo added the bug label May 26, 2014
@waxpancake

A little more information to reproduce this crash... I was running warcat on this 25GB megawarc using this command:

python3 -m warcat extract ~/archives/incoming/upcoming_20130420095943.megawarc.warc.gz --output-dir expanded/ --verbose --progress

It dies right after extracting this file:

INFO:warcat.tool:Extracted <urn:uuid:6eebc1d1-cdda-4e1a-b499-184e9681f1e6> to expanded/upcoming.yahoo.com/event/2715307/LA/New-Orleans/The-Louisiana-State-Museum-Jazz-Collection/Louisiana-State-Museum/_index_da39a3

Hope that helps.

chfoo closed this as completed in 5cd2cee Jun 5, 2014
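
For readers who hit the same traceback: it shows `shutil.copyfileobj(response, f)` letting `http.client.IncompleteRead` propagate when a chunked response body ends early. Below is a minimal sketch of one way to tolerate that, by copying in a loop and keeping whatever bytes the exception carries. It is not necessarily what commit 5cd2cee does. The names `copy_tolerant` and `TruncatedSource` are hypothetical and exist only for illustration, and how much data `IncompleteRead.partial` holds may depend on the Python version.

```python
import http.client
import io


def copy_tolerant(src, dst, length=16 * 1024):
    """Copy src to dst in chunks (hypothetical helper, not warcat's API).

    If src.read() raises http.client.IncompleteRead, write the partial
    bytes attached to the exception and stop instead of crashing.
    Returns (bytes_written, complete).
    """
    written = 0
    while True:
        try:
            buf = src.read(length)
        except http.client.IncompleteRead as error:
            # The body was cut short; keep what was received.
            dst.write(error.partial)
            return written + len(error.partial), False
        if not buf:
            # Normal end of stream.
            return written, True
        dst.write(buf)
        written += len(buf)


class TruncatedSource:
    """Test double that imitates a response whose body ends early."""

    def __init__(self, chunks, partial):
        self._chunks = list(chunks)
        self._partial = partial

    def read(self, length):
        if self._chunks:
            return self._chunks.pop(0)
        # Mirror the shape of the error in the traceback above.
        raise http.client.IncompleteRead(self._partial, 16384)


# Usage: one full chunk arrives, then the stream is truncated.
source = TruncatedSource([b"a" * 10], b"xyz")
target = io.BytesIO()
written, complete = copy_tolerant(source, target)
print(written, complete)
```

A caller could log a warning when `complete` is false so that a single truncated record does not abort the extraction of a large archive. Whether to keep or discard the partial file is a design choice this sketch leaves open.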