Document the Scrapy component API #5439

Open
wants to merge 8 commits into base: master
4 changes: 4 additions & 0 deletions docs/index.rst
@@ -170,6 +170,7 @@ Solving specific problems
topics/jobs
topics/coroutines
topics/asyncio
topics/class-methods

:doc:`faq`
Get answers to most frequently asked questions.
@@ -216,6 +217,9 @@ Solving specific problems
:doc:`topics/asyncio`
Use :mod:`asyncio` and :mod:`asyncio`-powered libraries.

:doc:`topics/class-methods`
Learn how Scrapy components are instantiated from other objects, such as crawlers and settings.

.. _extending-scrapy:

Extending Scrapy
17 changes: 7 additions & 10 deletions docs/topics/api.rst
@@ -167,16 +167,6 @@ SpiderLoader API
the :class:`scrapy.interfaces.ISpiderLoader` interface to guarantee an
errorless execution.

.. method:: from_settings(settings)

This class method is used by Scrapy to create an instance of the class.
It's called with the current project settings, and it loads the spiders
found recursively in the modules of the :setting:`SPIDER_MODULES`
setting.

:param settings: project settings
:type settings: :class:`~scrapy.settings.Settings` instance

.. method:: load(spider_name)

Get the Spider class with the given name. It'll look into the previously
@@ -198,6 +188,13 @@ SpiderLoader API
:param request: queried request
:type request: :class:`~scrapy.Request` instance

:meth:`from_settings`:

This class method is used by Scrapy to create an instance of the class.
It's called with the current project settings, and it loads the spiders
found recursively in the modules of the :setting:`SPIDER_MODULES`
setting.
Comment on lines +191 to +196 (review comment from a project member):
I am thinking factory method signatures should still be kept in the API reference. It is especially important because some components may get extra arguments, I believe, and in those cases we should document them in one of the methods (e.g. from_crawler) and in the other method refer to the former for parameter details.

What we can do now is simplify the description, both of from_crawler and from_settings, to something like:

Factory method. See :ref:`class-methods`.

Assuming we also add .. _class-methods: at the beginning of the new topic.

Also, I imagine (please check) that this class also supports from_crawler. In that case, we should include an entry for that class method as well, with its own signature, but the same description that basically points to the new topic.


.. _topics-api-signals:

Signals API
29 changes: 29 additions & 0 deletions docs/topics/class-methods.rst
@@ -0,0 +1,29 @@
===========================
Class Instantiation Methods
===========================

These class methods create an instance of the implementing class,
extracting everything the instance needs from the single argument they receive.

.. py:classmethod:: from_crawler(cls, crawler)

Factory method that, if present, is used to create an instance of the
implementing class from a :class:`~scrapy.crawler.Crawler`. It must return a
new instance of that class. The Crawler object provides access to all Scrapy
core components, such as settings and signals; it is the way for the
implementing class to access them and hook its functionality into Scrapy.

:param crawler: crawler that uses this middleware
:type crawler: :class:`~scrapy.crawler.Crawler` object
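
To make the pattern concrete, here is a minimal sketch of a component implementing ``from_crawler``. The class name and the ``STATS_ENABLED`` setting are hypothetical, used only for illustration; ``Settings.getbool`` is the real Scrapy settings accessor.

```python
class StatsPipeline:
    """Hypothetical component configured through from_crawler."""

    def __init__(self, stats_enabled):
        self.stats_enabled = stats_enabled

    @classmethod
    def from_crawler(cls, crawler):
        # Pull what the instance needs from the crawler's settings
        # and return a new instance of the class.
        return cls(stats_enabled=crawler.settings.getbool("STATS_ENABLED", True))
```

Scrapy calls this method itself when it builds the component; user code rarely needs to invoke it directly.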


.. py:classmethod:: from_settings(cls, settings)

This class method is used by Scrapy to create an instance of the implementing
class from the given settings object. Implementations typically consume only
the settings relevant to the component, ignoring the rest of the settings object.


:param settings: project settings
:type settings: :class:`~scrapy.settings.Settings` instance
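
A minimal sketch of the ``from_settings`` pattern follows. The component class is hypothetical; ``USER_AGENT`` is a real Scrapy setting, read here with a plain ``get`` call.

```python
class RobotsChecker:
    """Hypothetical component configured through from_settings."""

    def __init__(self, user_agent):
        self.user_agent = user_agent

    @classmethod
    def from_settings(cls, settings):
        # Only the settings this component cares about are read;
        # everything else in the settings object is ignored.
        return cls(user_agent=settings.get("USER_AGENT", "Scrapy"))
```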
OrestisKan marked this conversation as resolved.
Show resolved Hide resolved
12 changes: 3 additions & 9 deletions docs/topics/downloader-middleware.rst
@@ -163,16 +163,10 @@ object gives you access, for example, to the :ref:`settings <topics-settings>`.
:param spider: the spider for which this request is intended
:type spider: :class:`~scrapy.Spider` object

.. method:: from_crawler(cls, crawler)
:meth:`from_crawler`:

If present, this classmethod is called to create a middleware instance
from a :class:`~scrapy.crawler.Crawler`. It must return a new instance
of the middleware. Crawler object provides access to all Scrapy core
components like settings and signals; it is a way for middleware to
access them and hook its functionality into Scrapy.

:param crawler: crawler that uses this middleware
:type crawler: :class:`~scrapy.crawler.Crawler` object
Class method that, if present, is used to create a middleware
instance from a :class:`~scrapy.crawler.Crawler`.
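
Besides reading settings, ``from_crawler`` is also where a middleware can hook into Scrapy signals. The sketch below is hypothetical; real code would pass ``scrapy.signals.spider_opened`` to ``connect`` instead of the plain string used here for illustration.

```python
class SpiderLifecycleMiddleware:
    """Hypothetical downloader middleware using from_crawler to hook signals."""

    def __init__(self):
        self.opened = []

    @classmethod
    def from_crawler(cls, crawler):
        mw = cls()
        # Register a callback so the middleware reacts when a spider opens.
        # Real code would use signal=scrapy.signals.spider_opened.
        crawler.signals.connect(mw.spider_opened, signal="spider_opened")
        return mw

    def spider_opened(self, spider):
        self.opened.append(spider)
```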

.. _topics-downloader-middleware-ref:

14 changes: 5 additions & 9 deletions docs/topics/email.rst
@@ -67,14 +67,6 @@ rest of the framework.
:param smtpssl: enforce using a secure SSL connection
:type smtpssl: bool

.. classmethod:: from_settings(settings)

Instantiate using a Scrapy settings object, which will respect
:ref:`these Scrapy settings <topics-email-settings>`.

:param settings: the e-mail recipients
:type settings: :class:`scrapy.settings.Settings` object

.. method:: send(to, subject, body, cc=None, attachs=(), mimetype='text/plain', charset=None)

Send email to the given recipients.
@@ -102,8 +94,12 @@ rest of the framework.
:type mimetype: str

:param charset: the character encoding to use for the e-mail contents
:type charset: str

:meth:`from_settings`:

Instantiate a MailSender using a Scrapy settings object, which will respect
:ref:`these Scrapy settings <topics-email-settings>`.
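
As a rough sketch of the behavior, ``from_settings`` maps documented email settings onto constructor arguments. The class below is a simplified stand-in for ``MailSender``, showing only two of the settings (``MAIL_HOST`` and ``MAIL_FROM`` are real Scrapy email settings; the defaults shown are assumptions).

```python
class MailSenderSketch:
    """Simplified, illustrative stand-in for MailSender.from_settings."""

    def __init__(self, smtphost="localhost", mailfrom="scrapy@localhost"):
        self.smtphost = smtphost
        self.mailfrom = mailfrom

    @classmethod
    def from_settings(cls, settings):
        # Each constructor argument is looked up in the settings object,
        # falling back to the documented default when the setting is unset.
        return cls(
            smtphost=settings.get("MAIL_HOST", "localhost"),
            mailfrom=settings.get("MAIL_FROM", "scrapy@localhost"),
        )
```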

.. _topics-email-settings:

12 changes: 3 additions & 9 deletions docs/topics/item-pipeline.rst
@@ -60,16 +60,10 @@ Additionally, they may also implement the following methods:
:param spider: the spider which was closed
:type spider: :class:`~scrapy.Spider` object

.. method:: from_crawler(cls, crawler)
:meth:`from_crawler`:

If present, this classmethod is called to create a pipeline instance
from a :class:`~scrapy.crawler.Crawler`. It must return a new instance
of the pipeline. Crawler object provides access to all Scrapy core
components like settings and signals; it is a way for pipeline to
access them and hook its functionality into Scrapy.

:param crawler: crawler that uses this pipeline
:type crawler: :class:`~scrapy.crawler.Crawler` object
Class method that, if present, is used to create a pipeline
instance from a :class:`~scrapy.crawler.Crawler`.
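
A short hypothetical pipeline illustrating the pattern: the pipeline name and the ``MIN_TITLE_LENGTH`` setting are invented for this sketch, and a plain ``ValueError`` stands in for ``scrapy.exceptions.DropItem`` to keep the example self-contained.

```python
class DropShortTitlesPipeline:
    """Hypothetical item pipeline built via from_crawler."""

    def __init__(self, min_length):
        self.min_length = min_length

    @classmethod
    def from_crawler(cls, crawler):
        # The minimum length comes from the project settings.
        return cls(min_length=crawler.settings.getint("MIN_TITLE_LENGTH", 5))

    def process_item(self, item, spider):
        # Real code would raise scrapy.exceptions.DropItem here.
        if len(item.get("title", "")) < self.min_length:
            raise ValueError("title too short")
        return item
```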


Item pipeline example
14 changes: 4 additions & 10 deletions docs/topics/spider-middleware.rst
@@ -169,16 +169,10 @@ object gives you access, for example, to the :ref:`settings <topics-settings>`.
:param spider: the spider to whom the start requests belong
:type spider: :class:`~scrapy.Spider` object

.. method:: from_crawler(cls, crawler)

If present, this classmethod is called to create a middleware instance
from a :class:`~scrapy.crawler.Crawler`. It must return a new instance
of the middleware. Crawler object provides access to all Scrapy core
components like settings and signals; it is a way for middleware to
access them and hook its functionality into Scrapy.

:param crawler: crawler that uses this middleware
:type crawler: :class:`~scrapy.crawler.Crawler` object
:meth:`from_crawler`:

Class method that, if present, is used to create a middleware
instance from a :class:`~scrapy.crawler.Crawler`.

.. _topics-spider-middleware-ref:
