-
Notifications
You must be signed in to change notification settings - Fork 69
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Added minimum viable download_url function. #29
Conversation
Comments on the interface are welcome - once we've got it, we're kind of stuck with it... @IanHopkinson @drj11 @scraperdragon |
'OPTIONS': requests.options} | ||
|
||
|
||
def download_url(url, method='GET', retry=False, cache_seconds=None, **kwargs): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Prefer to avoid forwarding kwargs
where possible. At least you should explicitly have all of the useful ones you can think of in the function signature. Otherwise what happens is that people build stuff on top of it and you end up looking five functions deep each of which peels off a couple of arguments.
Also good if forwarding kwargs
to specify some useful ones in the docstring and preferably a reference to the inner documentation.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
In this case isn't it about letting the caller use all of the keyword arguments from requests.get without us getting in the way? We're not in a position to predict/document/know the keyword arguments to requests.get and it would be annoying if the caller was unable to use one.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Agree in principle, but the sole purpose of this function is to thinly-wrap the requests module - people should look there (and not 5 modules down!) for documentation, really.
The last thing I want to do is diverge from the requests interface in the future (ie by not supporting a future argument)
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
To be clear, I'm not against forwarding kwargs
. The minimum is to have a link pointing directly at the place it's forwarded to. Slightly better is to also explicitly write out the useful ones which are forwarded. Even better is not to have kwargs
.
I encountered this just the other day with dragon where we had to go to something like five places in the code to find out what arguments were possible and what they meant because none of the forwarders documented what were sensible arguments or where to find them.
The most disappointing thing about the interface is that it's not just "requests". I'd prefer an interface where I can go:
and magically all my requests are cached. Maybe there is a scraperwiki.contrib.cache.control I can call too. And similarly for the retry functionality:
I like this because then I don't have to change my code which is probably already using requests.get |
It's a fair point, but I'm pretty selective about which requests I cache (see requests-cache/requests-cache#12) I usually don't cache index pages, but do cache things they link to. The auto retry one is a good point, although I don't particularly want to be in the business of monkey patching requests :( I don't share your pain of having written requests.get everywhere - I've basically always wrapped it in a download_url function :) |
wow, 7 months old. Seems like yesterday. Can we merge or close this already? |
A lot of this functionality seems to already be in https://github.com/scraperwiki/data-services-helpers/blob/master/dshelpers.py - predominantly because it's been 7 months. Should we be focussing on one or the other? |
@paulfurley Reopen this if you feel strongly that it needs considering further. |
Nice one - sorry that was untidy of me. |
Request for comment, don't merge yet :)