Is there any plan to support downloading "data:image/jpeg;base64" URIs? #2156

Closed
yssource opened this issue Jul 31, 2016 · 6 comments

@yssource commented Jul 31, 2016

[scrapy] ERROR: Error downloading <GET data:/robots.txt>: Unsupported URL scheme 'data': no handler available for that scheme
...
scrapy.exceptions.NotSupported: Unsupported URL scheme 'data': no handler available for that scheme

I am trying to download an image whose src attribute is a "data:image/jpeg;base64" URI, but I get the error above.

@ArturGaspar (Contributor) commented Aug 2, 2016

You can copy the code of this download handler: https://github.com/ArturGaspar/scrapy-qtwebkit/blob/master/scrapy_qtwebkit/data_downloader.py

And enable it in the settings with:

DOWNLOAD_HANDLERS = {
    'data': 'your_module.DataURLDownloadHandler'
}

To the Scrapy team, if there is any interest in supporting this in Scrapy, I can write tests and make a pull request.
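
For readers wondering what a handler for the 'data' scheme has to do: there is nothing to fetch over the network, because the response body is already embedded in the URI. Below is a minimal, standard-library-only sketch of the decoding step, assuming the RFC 2397 layout data:[<mediatype>][;base64],<data>. The names decode_data_uri and DataURI are hypothetical and are not taken from the linked handler, from Scrapy, or from w3lib; a real handler would additionally wrap the decoded bytes in a Scrapy response object.

```python
import base64
import re
from typing import NamedTuple
from urllib.parse import unquote_to_bytes


class DataURI(NamedTuple):
    """Hypothetical container for the pieces of a decoded data: URI."""
    media_type: str
    is_base64: bool
    data: bytes


# Split "data:<meta>,<payload>" at the first comma (RFC 2397 layout).
_DATA_URI_RE = re.compile(
    r"^data:(?P<meta>[^,]*),(?P<payload>.*)$",
    re.DOTALL | re.IGNORECASE,
)


def decode_data_uri(uri: str) -> DataURI:
    """Decode a data: URI into its media type and raw bytes (a sketch)."""
    match = _DATA_URI_RE.match(uri)
    if match is None:
        raise ValueError("not a data: URI")

    # The metadata is "<mediatype>[;param=value...][;base64]".
    parts = match.group("meta").split(";")
    is_base64 = parts[-1].strip().lower() == "base64"
    if is_base64:
        parts = parts[:-1]

    # RFC 2397 defaults to text/plain when no media type is given.
    media_type = (parts[0].strip().lower() if parts else "") or "text/plain"

    # The payload may be percent-encoded; undo that before base64-decoding.
    raw = unquote_to_bytes(match.group("payload"))
    data = base64.b64decode(raw) if is_base64 else raw
    return DataURI(media_type, is_base64, data)


if __name__ == "__main__":
    # Illustrative input: the first four bytes of a JPEG header, base64-encoded.
    sample = "data:image/jpeg;base64," + base64.b64encode(b"\xff\xd8\xff\xe0").decode()
    result = decode_data_uri(sample)
    print(result.media_type, len(result.data))
```

This sketch ignores media type parameters such as charset and does not tolerate whitespace or missing padding inside the base64 payload, which real-world src attributes sometimes contain.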

@eliasdorneles (Member) commented Aug 2, 2016

Nice project, @ArturGaspar!
Yeah, that download handler looks useful; I think it would be valuable in Scrapy too.

I don't quite follow all those tricky regexes, though. Wouldn't it be clearer if they were spelled out?

@redapple (Contributor) commented Aug 11, 2016

Thanks @ArturGaspar!
I second @eliasdorneles on this, I'd also like to see this in scrapy core.
Waiting for the PR :)

@redapple (Contributor) commented Aug 16, 2016

@yssource, see #2175 and scrapy/w3lib#71.
It should land in Scrapy soon enough.

@yssource (Author) commented Aug 17, 2016

Thanks everyone. This works for me very well.
Please feel free to close this issue.

@redapple added this to the v1.3 milestone Sep 15, 2016

@redapple (Contributor) commented Sep 15, 2016

@yssource, we'll close the issue once the PR in w3lib is merged, a new w3lib is released, and Scrapy updates its download handlers to use the functionality.
