-
Notifications
You must be signed in to change notification settings - Fork 41
PR2-Feature changes to support hash verification, file resume, drive… #17
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
dustinvtran
left a comment
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Excellent work as mentioned before. I only have a bunch of minor comments. What I'll do is merge it now and submit it a new PR making the minor changes and ask you for feedback.
|
|
||
| def check_capabilities(url, start=120, num_bytes=300, user_agent=None, | ||
| _print=True): | ||
| """ Check capabilities of a specified URL. This function queries the URL |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Remove leading space.
|
|
||
| def maybe_download_and_extract(directory, url, extract=True): | ||
| def _get_google_confirm_id_token(url, session): | ||
| """Helper function to get confirmation token for google drive |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Add period at end of all first docstring sentences.
|
|
||
|
|
||
| def _gdrive_params(url, session): | ||
| """Helper function reurn google drive specific parameters |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Helper function returning google drive specific parameters.
| def check_capabilities(url, start=120, num_bytes=300, user_agent=None, | ||
| _print=True): | ||
| """ Check capabilities of a specified URL. This function queries the URL | ||
| to get a response on the remote server capabilities.Requests for num_bytes |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
. Requests
|
|
||
| def humanize_size(n): | ||
| """Convert file size in bytes to a friendly human readable form. | ||
| E.g humanize_size(1024) - '1 KB'. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Examples are written at end of docstring like:
#### Examples
```python
humanize_size(1024)
## '1 KB'
See, e.g., https://github.com/blei-lab/edward/blob/master/edward/inferences/inference.py.
| Args: | ||
| file_path: str. | ||
| Full path of the file. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
file_path vs path seems inconsistent; should stick with path.
|
|
||
| def _print_progress(file_path, dl_bytes, file_size, block_size, count, | ||
| _start_time): | ||
| """ Helper function that is called to print progress of download. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
"""Helper function
| Iteration count. | ||
| _start_time: object. | ||
| Time object to indicate when download started. | ||
| Returns: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Time object ...
Returns:
| dict: Dictionary of remote server capabilities, required prior to full | ||
| download. | ||
| """ | ||
| supports_range = False |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Should supports_range be called supports_resume?
| def normal_download(url, file_path, session, params={}, headers={}, | ||
| hash_true=None, timeout=10, block_size=1024 * 1024): | ||
| """start download of a file. No pause/resume capability. | ||
| Args: |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Start download.... blankline between all sections
|
Pls go ahead with your changes. Happy to contribute. |
New features
a) Better progress update giving ETA/Speed
b) Human readable size and time
c) pause/resume downloads with resume=True option
d) hash verification if hash is provided (hash_true option)
e) Special handling for google drive downloads