GitHub - saasindustries/scrapy-zenscrape

Scrapy Zenscrape Middleware

Acknowledgements

Thanks to arimbr and ScrapingBee; this is an adaptation of their work.

Installation

pip install scrapy-zenscrape

Configuration

Add your ZENSCRAPE_API_KEY and the ZenscrapeMiddleware to your project's settings.py. Set CONCURRENT_REQUESTS to match the concurrency allowed by your Zenscrape plan.

ZENSCRAPE_API_KEY = 'REPLACE-WITH-YOUR-API-KEY'

DOWNLOADER_MIDDLEWARES = {
    'scrapy_zenscrape.ZenscrapeMiddleware': 700,
}

CONCURRENT_REQUESTS = 1
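
If you would rather not commit the key to version control, the same settings can be read from the environment. This is only a sketch: the environment variable names ZENSCRAPE_API_KEY and ZENSCRAPE_CONCURRENCY are assumptions made for this example, not something the middleware looks up on its own.

```python
import os

# Assumption: the key is exported in the shell under this name.
# Falls back to the placeholder from the settings above when it is unset.
ZENSCRAPE_API_KEY = os.environ.get('ZENSCRAPE_API_KEY', 'REPLACE-WITH-YOUR-API-KEY')

DOWNLOADER_MIDDLEWARES = {
    'scrapy_zenscrape.ZenscrapeMiddleware': 700,
}

# Hypothetical override variable; defaults to 1 as in the settings above.
CONCURRENT_REQUESTS = int(os.environ.get('ZENSCRAPE_CONCURRENCY', '1'))
```

Scrapy loads settings.py as an ordinary Python module, so plain os.environ lookups like these work without extra tooling.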

Usage

Inherit your spiders from ZenscrapeSpider and yield a ZenscrapeRequest.

Below you can see an example from the spider in httpbin.py.

from scrapy_zenscrape import ZenscrapeSpider, ZenscrapeRequest

class HttpbinSpider(ZenscrapeSpider):
    name = 'httpbin'
    start_urls = [
        'https://httpbin.org',
    ]

    def start_requests(self):
        for url in self.start_urls:
            yield ZenscrapeRequest(url, params={
                # 'render': False,
                # 'block_ads': True,
                # 'block_resources': False,
                # 'premium': True,
                # 'location': 'fr',
                # 'wait_for': 5,
                # 'wait_for_css': '#swagger-ui',
            },
            headers={
                # 'Accept-Language': 'En-US',
            },
            cookies={
                # 'name_1': 'value_1',
            })

    def parse(self, response):
        ...

Pass Zenscrape parameters in the params argument of a ZenscrapeRequest. Headers and cookies are passed as in a normal Scrapy Request. ZenscrapeRequest converts all parameters, headers, and cookies into the format expected by the API.
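
To give a rough idea of what this kind of formatting involves, the sketch below folds a target URL and a params dict into a single query string. It is not the middleware's actual code: build_api_url is a hypothetical helper, the endpoint is a placeholder, and sending booleans as lowercase strings is an assumption.

```python
from urllib.parse import urlencode

# Placeholder endpoint for illustration only; not the real API address.
API_ENDPOINT = 'https://api.example.com/v1/get'


def build_api_url(target_url, params=None):
    """Hypothetical helper: encode the target URL and params into one API URL."""
    query = {'url': target_url}
    for key, value in (params or {}).items():
        # Assumption: booleans are sent as lowercase 'true' / 'false'.
        if isinstance(value, bool):
            value = str(value).lower()
        query[key] = value
    # urlencode percent-encodes the target URL so it survives as one parameter.
    return API_ENDPOINT + '?' + urlencode(query)


api_url = build_api_url('https://httpbin.org', {'render': True, 'location': 'fr'})
print(api_url)
```

With ZenscrapeRequest you do not need to do any of this yourself; the sketch only shows why the target URL and options end up as parameters of one request to the API.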

Examples

Add your API key to settings.py.

To run the examples you need to clone this repository. In your terminal, go to examples/httpbin/httpbin and run the example spider with:

scrapy crawl httpbin
