New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Make upload more robust by ignoring spurious errors while polling the scan status. #480
Changes from all commits
File filter
Filter by extension
Conversations
Jump to
Diff view
Diff view
There are no files selected for viewing
Original file line number | Diff line number | Diff line change |
---|---|---|
|
@@ -20,13 +20,13 @@ | |
import time | ||
import os | ||
|
||
import requests | ||
from concurrent.futures import ThreadPoolExecutor | ||
from progressbar import (ProgressBar, Percentage, Bar, AnimatedMarker) | ||
from requests_toolbelt import (MultipartEncoder, MultipartEncoderMonitor) | ||
|
||
from .common import ( | ||
get_oauth_session, | ||
is_scan_completed, | ||
retry, | ||
) | ||
from .compat import open, quote_plus, urljoin | ||
|
@@ -318,6 +318,30 @@ def get_post_files(metadata=None): | |
return files | ||
|
||
|
||
def is_scan_completed(response): | ||
"""Return True if the response indicates the scan process completed.""" | ||
if response is None: | ||
# To cope with spurious connection failures lacking a proper response: | ||
# either we'll retry and succeed or we fail for all retries and report | ||
# an error. | ||
return False | ||
if response.ok: | ||
return response.json().get('completed', False) | ||
return False | ||
|
||
|
||
def get_scan_status(session, url): | ||
try: | ||
resp = session.get(url) | ||
return resp | ||
except (requests.ConnectionError, requests.HTTPError): | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. What was HTTPError, will we loop forever if the resource does not exist? There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
I don't remember encountering it in this particular case but the code involved can trigger it. I can remove it if you prefer.
No. The caller retries a fixed number of times and will report the last error. |
||
# Something went wrong and we couldn't acquire the status. Upper | ||
# level (is_scan_completed) will deal with the None response | ||
# meaning we don't know the status. This avoid a spurious | ||
# connection error breaking an upload for a wrong reason. | ||
return None | ||
There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. According to [1], we should return an exception instead of None. That makes sense because the name of the exception will make it's meaning clearer than None. [1] https://www.goodreads.com/book/show/23020812-effective-python There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more.
No, that's the bug. The caller (intermediate, hidden, out of our reach) can't catch it nor act properly. Returning None tells the caller: nah, not completed yet. There was a problem hiding this comment. Choose a reason for hiding this commentThe reason will be displayed to describe this comment to others. Learn more. I see, @Retry doesn't let you handle the exception. The comments make this clear, so I'm ok with this. |
||
|
||
|
||
def get_scan_data(session, status_url): | ||
"""Return metadata about the state of the upload scan process.""" | ||
# initial retry after 5 seconds | ||
|
@@ -328,7 +352,7 @@ def get_scan_data(session, status_url): | |
delay=SCAN_STATUS_POLL_DELAY, | ||
backoff=1) | ||
def get_status(): | ||
return session.get(status_url) | ||
return get_scan_status(session, status_url) | ||
|
||
response, aborted = get_status() | ||
|
||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think I would name these helper funcions with a leading underscore, to make it clear that they are called only from higher level functions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There is a refactor in progress in another branch, I'd rather address that there ?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
works for me.