added special handler for about:blank urls following the standard. #81

Closed. 2 commits proposed for merge.

Contributor

rmax commented Jan 25, 2012

It returns an empty HTML response. See http://en.wikipedia.org/wiki/About_URI_scheme

Owner

dangra commented Jan 25, 2012

I don't see why Scrapy needs to handle the about: scheme, but if we agree it does, this patch renders a blank page for every about: request it gets. IMHO returning a 404 for anything other than about:blank makes more sense.
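
For illustration, the behaviour suggested here could be sketched roughly as follows. This is not the code from the patch and does not use Scrapy's API; `handle_about_url` is a hypothetical helper that returns a `(status, body)` pair, with an empty 200 response for about:blank and a 404 for any other about: URL.

```python
from urllib.parse import urlsplit


def handle_about_url(url):
    """Hypothetical helper: map an about: URL to a (status, body) pair."""
    parts = urlsplit(url)
    if parts.scheme != "about":
        raise ValueError("not an about: URL: %r" % url)
    if parts.path == "blank":
        # about:blank is rendered as an empty page.
        return 200, b""
    # Any other about: URL is treated as not found.
    return 404, b""
```

With this sketch, `handle_about_url("about:blank")` gives an empty 200 response, while something like `handle_about_url("about:config")` gives a 404.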

Contributor

rmax commented Jan 25, 2012

The purpose is to handle the case where a web page redirects (meta refresh or 30x) to about:blank, as in the example below:

$> curl http://www.kijkjerijk.nl/
> GET / HTTP/1.1
> User-Agent: Mozilla/5.1; MSIE; YB/9.5.1 MEGAUPLOAD 1.0;
> Host: www.kijkjerijk.nl
> Accept: */*
> Referer: 
> 
< HTTP/1.1 200 OK
< Content-Length: 57
< Content-Type: text/html
< Content-Location: http://www.kijkjerijk.nl/Index.html
< Last-Modified: Wed, 04 Nov 2009 08:35:42 GMT
< Accept-Ranges: bytes
< ETag: "fa659acb295dca1:2f6"
< Server: Microsoft-IIS/6.0
< X-Powered-By: ASP.NET
< X-Powered-By: PleskWin
< Date: Wed, 25 Jan 2012 19:25:30 GMT
< 
* Connection #0 to host www.kijkjerijk.nl left intact
* Closing connection #0
<meta http-equiv="refresh" content="0;URL=about:blank" />
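
As a rough sketch of what a crawler sees in this response, the snippet below extracts the meta refresh target from the body using only the Python standard library. `MetaRefreshParser` and `refresh_target` are hypothetical names chosen for illustration, not part of Scrapy, and the parsing of the `content` attribute is deliberately simplified.

```python
from html.parser import HTMLParser
from urllib.parse import urlsplit


class MetaRefreshParser(HTMLParser):
    """Hypothetical parser that records the first meta refresh target."""

    def __init__(self):
        super().__init__()
        self.target = None

    def handle_starttag(self, tag, attrs):
        attrs = dict(attrs)
        if tag != "meta" or (attrs.get("http-equiv") or "").lower() != "refresh":
            return
        # content looks like "0;URL=about:blank": a delay, then the target.
        _, _, rest = (attrs.get("content") or "").partition(";")
        key, _, value = rest.strip().partition("=")
        if key.strip().lower() == "url" and self.target is None:
            self.target = value.strip().strip("'\"")


def refresh_target(html):
    """Return the meta refresh target URL, or None if there is none."""
    parser = MetaRefreshParser()
    parser.feed(html)
    parser.close()
    return parser.target


body = '<meta http-equiv="refresh" content="0;URL=about:blank" />'
target = refresh_target(body)
scheme = urlsplit(target).scheme
```

Here `target` is `about:blank` and `scheme` is `about`, so a downloader that only registers handlers for http and https would have nothing to dispatch the follow-up request to.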
Owner

dangra commented Jan 8, 2013

The change by itself looks fine, but I still can't see what the advantage of supporting about:blank is.

dangra closed this Jan 8, 2013
