Added minimum viable download_url function. #29

Closed
fawkesley wants to merge 1 commit

Conversation

fawkesley
Contributor

Request for comment, don't merge yet :)

@fawkesley
Contributor Author

Comments on the interface are welcome - once we've got it, we're kind of stuck with it... @IanHopkinson @drj11 @scraperdragon

'OPTIONS': requests.options}


def download_url(url, method='GET', retry=False, cache_seconds=None, **kwargs):
Contributor

Prefer to avoid forwarding kwargs where possible. At the least, you should explicitly have all of the useful ones you can think of in the function signature. Otherwise what happens is that people build stuff on top of it and you end up looking five functions deep, each of which peels off a couple of arguments.

If you do forward kwargs, it's also good to specify some useful ones in the docstring, and preferably to include a reference to the inner function's documentation.
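
For concreteness, a sketch of what's being suggested: the commonly useful requests arguments spelled out in the signature, everything else still forwarded, and the docstring pointing at the requests documentation. The explicit parameters chosen here are illustrative, not taken from the PR itself.

import requests

METHODS = {'GET': requests.get,
           'POST': requests.post,
           'OPTIONS': requests.options}

def download_url(url, method='GET', retry=False, cache_seconds=None,
                 params=None, headers=None, timeout=None, **kwargs):
    """Fetch url with the given HTTP method.

    params, headers and timeout are passed straight through to
    requests, as is anything else in kwargs; see
    http://docs.python-requests.org/en/latest/api/ for what they mean.
    """
    # retry / cache_seconds handling is omitted from this sketch.
    return METHODS[method](url, params=params, headers=headers,
                           timeout=timeout, **kwargs)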

Contributor

In this case, isn't it about letting the caller use all of the keyword arguments from requests.get without us getting in the way? We're not in a position to predict, document, or know the keyword arguments to requests.get, and it would be annoying if the caller was unable to use one.

Contributor Author

Agree in principle, but the sole purpose of this function is to thinly wrap the requests module - people should look there (and not five modules down!) for documentation, really.

The last thing I want to do is diverge from the requests interface in the future (i.e. by not supporting a future argument).

Contributor

To be clear, I'm not against forwarding kwargs. The minimum is to have a link pointing directly at the place it's forwarded to. Slightly better is to also explicitly write out the useful ones which are forwarded. Even better is not to have kwargs.

I encountered this just the other day with dragon, where we had to go to something like five places in the code to find out what arguments were possible and what they meant, because none of the forwarders documented which arguments were sensible or where to find them.

@pwaller pwaller mentioned this pull request Aug 19, 2013
@drj11
Contributor

drj11 commented Aug 19, 2013

The most disappointing thing about the interface is that it's not just "requests".

I'd prefer an interface where I can go:

import scraperwiki.contrib.cache.auto

and magically all my requests are cached. Maybe there is a scraperwiki.contrib.cache.control I can call too.

And similarly for the retry functionality:

import scraperwiki.contrib.retry.auto

I like this because then I don't have to change my code, which is probably already using requests.get.
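
A rough sketch of how the import-with-side-effects idea could work (the scraperwiki.contrib module path is drj11's proposal rather than existing code, and requests_cache is just one possible way to do the patching):

# scraperwiki/contrib/cache/auto.py -- importing this module installs a
# global cache, so existing requests.get calls are served from cache
# without any code changes. expire_after is an arbitrary choice here.
import requests_cache
requests_cache.install_cache('scraperwiki_cache', expire_after=3600)

Calling code then stays exactly as it was:

import scraperwiki.contrib.cache.auto  # hypothetical, per the comment above
import requests

html = requests.get('http://example.com/').text  # cached on later runs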

@fawkesley
Contributor Author

It's a fair point, but I'm pretty selective about which requests I cache (see requests-cache/requests-cache#12).

I usually don't cache index pages, but do cache things they link to.
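
The selective version is straightforward with an explicit cached session rather than a global patch; a sketch (requests_cache.CachedSession is real, extract_links is a hypothetical helper):

import requests
import requests_cache

cached = requests_cache.CachedSession('pages')

# The index changes as new items appear, so always fetch it fresh...
index = requests.get('http://example.com/index.html')

# ...but the pages it links to are stable, so serve those from cache.
for link in extract_links(index.text):  # extract_links: hypothetical
    page = cached.get(link)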

The auto retry one is a good point, although I don't particularly want to be in the business of monkey-patching requests :(

I don't share your pain of having written requests.get everywhere - I've basically always wrapped it in a download_url function :)
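
A retry flavour that stays a thin, explicit wrapper (no monkey-patching) could look something like this; the number of tries and the backoff policy are invented for the sketch:

import time
import requests

def download_url(url, retry=False, **kwargs):
    tries = 3 if retry else 1  # 3 attempts is an arbitrary choice
    for attempt in range(tries):
        try:
            response = requests.get(url, **kwargs)
            response.raise_for_status()
            return response
        except requests.RequestException:
            if attempt == tries - 1:
                raise  # out of attempts; propagate the failure
            time.sleep(2 ** attempt)  # simple exponential backoff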

@pwaller
Contributor

pwaller commented Mar 24, 2014

wow, 7 months old. Seems like yesterday. Can we merge or close this already?

@scraperdragon
Contributor

A lot of this functionality seems to already be in https://github.com/scraperwiki/data-services-helpers/blob/master/dshelpers.py - predominantly because it's been 7 months.

Should we be focussing on one or the other?

@scraperdragon
Contributor

@paulfurley Reopen this if you feel strongly that it needs considering further.

@fawkesley
Contributor Author

Nice one - sorry that was untidy of me.

@fawkesley fawkesley deleted the download-url-contrib branch April 29, 2014 09:06