
Switch to depend on requests library #129

Closed
peterbe opened this issue Jun 30, 2015 · 19 comments

Comments

@peterbe
Owner

peterbe commented Jun 30, 2015

I have had more than one problem with the built-in urlopen() when downloading certain URLs.
This one, for example, fails with SSL trouble in urlopen but opens fine with requests from my OS X laptop.

I thought Python 3 was supposed to have requests built in, so maybe this is only necessary for Python 2.
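
For context, the swap being discussed is roughly the following. A minimal sketch only, with a placeholder URL rather than the failing one from the report:

from urllib.request import urlopen  # urllib2.urlopen on Python 2

import requests

url = 'https://example.com/theme.css'  # placeholder

# Built-in opener: the call that trips over SSL for some hosts
css_via_urlopen = urlopen(url).read().decode('utf-8')

# requests: handles SSL, redirects and gzip, and decodes to unicode
css_via_requests = requests.get(url).text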

@peterbe
Owner Author

peterbe commented Jun 30, 2015

What do you think @graingert @elidickinson @ionelmc ?

@graingert
Contributor

python3 doesn't have requests built in.

@peterbe
Owner Author

peterbe commented Jun 30, 2015

python3 doesn't have requests built in.

So that was just a rumor I heard, then. Well, that makes it easier: we just depend on it as a new external dependency.

@graingert
Contributor

I think you should go for something like http://weasyprint.org/docs/tutorial/#url-fetchers

@peterbe
Owner Author

peterbe commented Jun 30, 2015

Interesting, but I'm not sure I follow. How is that better or different from requests?

@graingert
Contributor

Their default implementation looks pretty robust:

https://github.com/Kozea/WeasyPrint/blob/2d7f68e8a17821374d3974ea987710eecae586ad/weasyprint/urls.py#L241

And users of this library can override it as they see fit. E.g. for my site I check for local static URLs:

from urllib.parse import urlparse

import weasyprint

def my_fetcher(url):
    # Serve /static/ URLs via the site's own helper, everything else via WeasyPrint's default fetcher.
    if url.startswith('/static/'):
        return django_static_file_finder(urlparse(url).path)
    else:
        return weasyprint.default_url_fetcher(url)

weasyprint.HTML(doc, url_fetcher=my_fetcher)

@graingert
Contributor

You could also create a requests implementation:

try:
    import requests
except ImportError:
    # urllib-based fallback when requests isn't installed
    def _default_url_fetcher(url):
        ...
else:
    session = requests.Session()

    def _default_url_fetcher(url):
        ...
        fileobj = session.get(url, stream=True).raw
        ...


# For autocomplete and static analysis etc.
def default_url_fetcher(url):
    return _default_url_fetcher(url)


def premailer(doc, url_fetcher=default_url_fetcher):
    ...

@graingert
Contributor

I'd also add an argument for an HTTP 'Accept' header, so that for:

 <link rel="stylesheet" type="text/css" href="https://example.com/theme.css">

You'd call:

default_url_fetcher(url='https://example.com/theme.css', mimetype='text/css')
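
One possible shape for that, as a hedged sketch built on requests; the fetcher name and the mimetype-to-Accept mapping are illustrative, not an agreed premailer API:

import requests

def default_url_fetcher(url, mimetype=None):
    # Ask the server for the document type the referencing tag declared,
    # falling back to "anything" when no mimetype is known.
    headers = {'Accept': mimetype or '*/*'}
    response = requests.get(url, headers=headers)
    response.raise_for_status()
    return response.text

# For the <link> tag above you would call:
css = default_url_fetcher('https://example.com/theme.css', mimetype='text/css')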

@ionelmc
Contributor

ionelmc commented Jul 1, 2015

I think it's fine to have some default_url_fetcher abstraction in case someone wants to use something other than requests.

@peterbe
Owner Author

peterbe commented Jul 1, 2015

But why do you need all of that? What's wrong with simply doing requests.get('https://example.com/theme.css').text? That'll get the CSS payload as a unicode string. And if the response is gzipped, requests will take care of that.

@graingert
Contributor

The Accept header has nothing to do with gzip. Servers can return different documents depending on the Accept header.

@peterbe
Owner Author

peterbe commented Jul 1, 2015

But is that likely? Have you ever done urlopen('some.com/url.css') and gotten the wrong content back because of headers? Note, in premailer we only use urlopen to download external stylesheets.

(Sorry if I sound like I'm "fighting". I'm just trying to learn, and to keep things simple :) at the same time.)

@graingert
Contributor

It's just part of the HTTP spec. I've seen some sites render their entire page off of '/' using the Accept header.

@peterbe
Owner Author

peterbe commented Jul 1, 2015

Haha! Well, they're nuts! :)

@graingert
Contributor

What was the consensus on this?

@peterbe
Owner Author

peterbe commented Jul 22, 2015

My view is that anything overly complex is going to mean more code to maintain. A switch to relying on requests is straightforward, and it's an established, well-regarded package. It just swaps out urllib.urlopen for a different opener, and that's about it. It doesn't try to solve other problems as well.
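
In code terms, the change described here is roughly the following. A sketch only; the helper name is illustrative and not premailer's actual internals:

import requests

def _download_stylesheet(url):
    # Replaces the old urllib.urlopen(url).read() call.
    # requests takes care of SSL, redirects, gzip and charset decoding.
    response = requests.get(url)
    response.raise_for_status()
    return response.text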

@alexphelps

I just ran into this bug today. This is the error I get when using an SSL (https) URL for the CSS file. The only fix I could find was to not use SSL for that URL, which makes configuring my app harder because all other pages, files, etc. need SSL.

URLError: <urlopen error [Errno 1] _ssl.c:510: error:14077438:SSL routines:SSL23_GET_SERVER_HELLO:tlsv1 alert internal error>

@peterbe
Owner Author

peterbe commented Sep 29, 2015

@alexphelps I bet if you take that same URL (the https one) and try to do requests.get(that_url) you'll find that it works.
@graingert is working on a fancy wrapper API on top of requests.get but if he doesn't have time, which is fine, we'll just replace urllib with requests.

@p12tic
Contributor

p12tic commented Oct 4, 2018

@peterbe: I think this issue can be closed.

@peterbe peterbe closed this as completed Oct 4, 2018