
Quick start single process

1. Create your spider

Create your Scrapy project as you usually do. Enter a directory where you’d like to store your code and then run:

scrapy startproject tutorial

This will create a tutorial directory with the following contents:

tutorial/
    scrapy.cfg
    tutorial/
        __init__.py
        items.py
        pipelines.py
        settings.py
        spiders/
            __init__.py
            ...

These are basically:

  • scrapy.cfg: the project configuration file
  • tutorial/: the project’s Python module; you’ll later import your code from here.
  • tutorial/items.py: the project’s items file.
  • tutorial/pipelines.py: the project’s pipelines file.
  • tutorial/settings.py: the project’s settings file.
  • tutorial/spiders/: a directory where you’ll later put your spiders.
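If you don’t have a spider yet, a minimal one saved under tutorial/spiders/ could look like the sketch below. The spider name, seed URL, and link-following logic are placeholders, not part of any Frontera API; adapt them to your target site:

import scrapy


class MySpider(scrapy.Spider):
    name = 'myspider'  # referenced by 'scrapy crawl myspider' in step 5
    start_urls = ['http://example.com']  # placeholder seed URL

    def parse(self, response):
        # Yield a request for every link found on the page. Once Frontera
        # is integrated (step 3), these requests are routed through the
        # frontier, which decides which of them get crawled, and when.
        for href in response.css('a::attr(href)').extract():
            yield scrapy.Request(response.urljoin(href), callback=self.parse)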

2. Install Frontera

See :doc:`installation`.
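Frontera is published on PyPI, so installation is typically just:

pip install frontera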

3. Integrate your spider with Frontera

The article on :doc:`integration with Scrapy <scrapy-integration>` explains this step in detail.
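In short, the integration happens in your Scrapy settings module, where you hand scheduling over to Frontera. The sketch below follows the paths documented for recent Frontera releases; check the linked article for the exact values matching your version:

# tutorial/settings.py

SPIDER_MIDDLEWARES = {
    'frontera.contrib.scrapy.middlewares.schedulers.SchedulerSpiderMiddleware': 1000,
}
DOWNLOADER_MIDDLEWARES = {
    'frontera.contrib.scrapy.middlewares.schedulers.SchedulerDownloaderMiddleware': 1000,
}

# Replace Scrapy's default scheduler with Frontera's
SCHEDULER = 'frontera.contrib.scrapy.schedulers.frontier.FronteraScheduler'

# Module holding the frontier settings (see step 4)
FRONTERA_SETTINGS = 'tutorial.frontera.settings'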

4. Choose your backend

Configure frontier settings to use a built-in backend like in-memory BFS:

BACKEND = 'frontera.contrib.backends.memory.BFS'
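The BACKEND option belongs in the frontier settings module that FRONTERA_SETTINGS points to (tutorial.frontera.settings in the sketch from step 3). A minimal module could look like this; MAX_REQUESTS and MAX_NEXT_REQUESTS are standard Frontera settings, and the values shown are only examples:

# tutorial/frontera/settings.py

BACKEND = 'frontera.contrib.backends.memory.BFS'

MAX_REQUESTS = 2000     # stop the crawl after this many requests
MAX_NEXT_REQUESTS = 10  # how many requests the frontier hands out per batch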

5. Run the spider

Run your Scrapy spider as usual from the command line:

scrapy crawl myspider

And that's it! Your spider is now running, integrated with Frontera.

What else?

You’ve seen a simple example of how to use Frontera with Scrapy, but this is just the surface. Frontera provides many powerful features for making frontier management easy and efficient, such as: