Connection errors aren't caught #19

Closed
Hunter-Github opened this issue Aug 3, 2016 · 7 comments
Comments

@Hunter-Github

requests.exceptions.ChunkedEncodingError is raised all too often.

Traceback (most recent call last):
  File "~/virt_env/bin/waybackpack", line 9, in <module>
    load_entry_point('waybackpack==0.3.2', 'console_scripts', 'waybackpack')()
  File "~/virt_env/lib/python2.7/site-packages/waybackpack/cli.py", line 88, in main
    root=args.root,
  File "~/virt_env/lib/python2.7/site-packages/waybackpack/pack.py", line 63, in download_to
    root=root
  File "~/virt_env/lib/python2.7/site-packages/waybackpack/asset.py", line 45, in fetch
    res = session.get(url)
  File "~/virt_env/lib/python2.7/site-packages/waybackpack/session.py", line 20, in get
    **kwargs
  File "~/virt_env/lib/python2.7/site-packages/requests/api.py", line 71, in get
    return request('get', url, params=params, **kwargs)
  File "~/virt_env/lib/python2.7/site-packages/requests/api.py", line 57, in request
    return session.request(method=method, url=url, **kwargs)
  File "~/virt_env/lib/python2.7/site-packages/requests/sessions.py", line 475, in request
    resp = self.send(prep, **send_kwargs)
  File "~/virt_env/lib/python2.7/site-packages/requests/sessions.py", line 617, in send
    r.content
  File "~/virt_env/lib/python2.7/site-packages/requests/models.py", line 741, in content
    self._content = bytes().join(self.iter_content(CONTENT_CHUNK_SIZE)) or bytes()
  File "~/virt_env/lib/python2.7/site-packages/requests/models.py", line 667, in generate
    raise ChunkedEncodingError(e)
requests.exceptions.ChunkedEncodingError: ('Connection broken: IncompleteRead(0 bytes read)', IncompleteRead(0 bytes read))
@jsvine
Owner

jsvine commented Aug 4, 2016

Thanks for flagging! Do you have an example URL where that raises this error?

@Hunter-Github
Author

Hunter-Github commented Aug 4, 2016

Sure:

waybackpack -d BugExample --from-date 20130301190431 --to-date 20130301190431 http://www.reuters.com/finance/deals/

(Although I guess it may be somewhat non-deterministic: when I plugged the same date into the Web Archive manually, I got the archived page.)

@jsvine
Owner

jsvine commented Aug 4, 2016

Thanks, and indeed quite strange. I'm having an experience similar to yours: When I visit the archive page for that link, I sometimes get data, and other times an empty response. In terms of handling those errors, would you rather:

(a) waybackpack skip those snapshots, or

(b) retry up to x times, or

(c) follow some other behavior?

Also: @wumpus, any thoughts on what might be happening re. these Wayback Machine responses?

@Hunter-Github
Author

The simplest option, IMO, would be to leave the decision to the user:

  • catch the exception;
  • log the error to stderr in an easily greppable/parseable format;
  • move on.

Rationale:

  • retries in close succession are not guaranteed to work and may be considered a DoS/ToS violation by the Waybackers;
  • skipping snapshots without telling the user is kinda bad UX.
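
A minimal sketch of the catch / log / move on behaviour proposed above. This is not waybackpack's actual code: `fetch_all`, the `fetch` callable, and the `FETCH_ERROR` line format are all hypothetical names chosen for illustration. The exception types to skip are passed in as a parameter; with requests, one would likely pass `(requests.exceptions.RequestException,)`, which covers `ChunkedEncodingError`.

```python
import sys


def fetch_all(urls, fetch, errors=(OSError,), log=sys.stderr):
    """Fetch each URL; on failure, log one tab-separated line and continue.

    `fetch` is any callable that takes a URL and returns its body
    (hypothetical; stands in for the real download step).
    `errors` is the tuple of exception types treated as skippable.
    """
    results = {}
    for url in urls:
        try:
            results[url] = fetch(url)
        except errors as exc:
            # One line per failure: fixed prefix, URL, exception type, message.
            # Tab-separated so it is easy to grep or split.
            log.write("FETCH_ERROR\t{}\t{}\t{}\n".format(
                url, type(exc).__name__, exc))
    return results
```

With output like this, a user could run something along the lines of `grep '^FETCH_ERROR' errors.log | cut -f2` to collect the failed URLs and decide whether to retry them later.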

@jsvine
Owner

jsvine commented Aug 24, 2016

First pass at handling this, here: #20

Adds --ignore-errors flag. Though perhaps it should be --skip-errors?

Does this look/work as expected? Or were you thinking of another approach?
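
For readers following along, an opt-in flag of this kind could be wired up roughly as below. This is only a sketch under assumptions, not the code from #20: `build_parser`, `run`, and the `fetch` callable are hypothetical, and only the `--ignore-errors` flag name comes from this thread.

```python
import argparse
import sys


def build_parser():
    # Hypothetical parser; only the --ignore-errors name is from the thread.
    parser = argparse.ArgumentParser(prog="example")
    parser.add_argument("urls", nargs="*")
    parser.add_argument(
        "--ignore-errors", action="store_true",
        help="log fetch errors and continue instead of exiting")
    return parser


def run(args, fetch, log=sys.stderr):
    """Fetch every URL; re-raise on error unless --ignore-errors was given."""
    fetched = []
    for url in args.urls:
        try:
            fetch(url)
            fetched.append(url)
        except OSError as exc:
            if not args.ignore_errors:
                raise  # default: keep the existing fail-fast behaviour
            log.write("skipped {}: {}\n".format(url, exc))
    return fetched
```

Keeping the flag off by default means existing users see no change in behaviour, while anyone who passes it still gets a record of what was skipped.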

@Hunter-Github
Author

Haven't tried the test yet, but the changes look sound to me - 9603712

@jsvine
Owner

jsvine commented Sep 1, 2016

Merged, incorporated into v0.3.3, and pushed to PyPI!

jsvine closed this as completed Sep 1, 2016