[MRG+1] Per-key priorities for dict-like settings by promoting dicts to Settings instances #1149

Merged
168 changes: 6 additions & 162 deletions docs/topics/api.rst
@@ -140,170 +140,14 @@ Settings API
For a detailed explanation of each settings source, see:
:ref:`topics-settings`.

.. class:: Settings(values={}, priority='project')
.. autofunction:: get_settings_priority

This object stores Scrapy settings for the configuration of internal
components, and can be used for any further customization.

After instantiation of this class, the new object will have the global
default settings described on :ref:`topics-settings-ref` already
populated.

Additional values can be passed on initialization with the ``values``
argument, and they take the given ``priority`` level. If the latter
argument is a string, the priority name will be looked up in
:attr:`~scrapy.settings.SETTINGS_PRIORITIES`. Otherwise, a specific
integer should be provided.

Once the object is created, new settings can be loaded or updated with the
:meth:`~scrapy.settings.Settings.set` method, and can be accessed with the
square bracket notation of dictionaries, or with the
:meth:`~scrapy.settings.Settings.get` method of the instance and its value
conversion variants. When requesting a stored key, the value with the
highest priority will be retrieved.
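
The priority rule described above can be sketched in plain Python. This is a minimal illustration, not Scrapy's implementation: the ``PrioritySettings`` class and the numeric levels in ``PRIORITIES`` are assumptions made for the example.

```python
# Hypothetical priority table; names and numbers are illustrative assumptions.
PRIORITIES = {"default": 0, "command": 10, "project": 20, "cmdline": 40}


class PrioritySettings:
    """Toy store that keeps, per key, the value set with the highest priority."""

    def __init__(self):
        self._store = {}  # name -> (value, numeric priority)

    def set(self, name, value, priority="project"):
        # A string priority is looked up by name; an integer is used as-is.
        level = PRIORITIES[priority] if isinstance(priority, str) else priority
        current = self._store.get(name)
        # Only overwrite when the new priority is at least as high.
        if current is None or level >= current[1]:
            self._store[name] = (value, level)

    def get(self, name, default=None):
        entry = self._store.get(name)
        return entry[0] if entry is not None else default

    def __getitem__(self, name):
        return self.get(name)


settings = PrioritySettings()
settings.set("DOWNLOAD_DELAY", 0, priority="default")
settings.set("DOWNLOAD_DELAY", 2, priority="project")
settings.set("DOWNLOAD_DELAY", 5, priority="default")  # ignored: lower priority
```

Here ``settings["DOWNLOAD_DELAY"]`` yields the value stored at ``project`` priority, because the later ``default`` write cannot displace it.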

.. method:: set(name, value, priority='project')

Store a key/value attribute with a given priority.

Settings should be populated *before* configuring the Crawler object
(through the :meth:`~scrapy.crawler.Crawler.configure` method),
otherwise they won't have any effect.

:param name: the setting name
:type name: string

:param value: the value to associate with the setting
:type value: any

:param priority: the priority of the setting. Should be a key of
:attr:`~scrapy.settings.SETTINGS_PRIORITIES` or an integer
:type priority: string or int

.. method:: setdict(values, priority='project')

Store key/value pairs with a given priority.

This is a helper function that calls
:meth:`~scrapy.settings.Settings.set` for every item of ``values``
with the provided ``priority``.

:param values: the settings names and values
:type values: dict

:param priority: the priority of the settings. Should be a key of
:attr:`~scrapy.settings.SETTINGS_PRIORITIES` or an integer
:type priority: string or int

.. method:: setmodule(module, priority='project')

Store settings from a module with a given priority.

This is a helper function that calls
:meth:`~scrapy.settings.Settings.set` for every globally declared
uppercase variable of ``module`` with the provided ``priority``.

:param module: the module or the path of the module
:type module: module object or string

:param priority: the priority of the settings. Should be a key of
:attr:`~scrapy.settings.SETTINGS_PRIORITIES` or an integer
:type priority: string or int

.. method:: get(name, default=None)

Get a setting value without affecting its original type.

:param name: the setting name
:type name: string

:param default: the value to return if no setting is found
:type default: any

.. method:: getbool(name, default=False)

Get a setting value as a boolean. For example, ``1``, ``'1'`` and
``True`` return ``True``, while ``0``, ``'0'``, ``False`` and ``None``
return ``False``.

For example, settings populated through environment variables set to ``'0'``
will return ``False`` when using this method.

:param name: the setting name
:type name: string

:param default: the value to return if no setting is found
:type default: any
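
The conversion described for ``getbool`` can be approximated as follows. ``as_bool`` is a hypothetical helper written for this illustration; it is a sketch of the documented behaviour, not the library's code.

```python
def as_bool(value):
    """Convert 1/'1'/True to True and 0/'0'/False/None to False."""
    if value is None:
        return False
    # int() accepts both integers and numeric strings such as '0' or '1'.
    return bool(int(value))


truthy = [as_bool(v) for v in (1, "1", True)]
falsy = [as_bool(v) for v in (0, "0", False, None)]
```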

.. method:: getint(name, default=0)

Get a setting value as an int.

:param name: the setting name
:type name: string

:param default: the value to return if no setting is found
:type default: any

.. method:: getfloat(name, default=0.0)

Get a setting value as a float.

:param name: the setting name
:type name: string

:param default: the value to return if no setting is found
:type default: any

.. method:: getlist(name, default=None)

Get a setting value as a list. If the setting's original type is a list, a
copy of it will be returned. If it's a string, it will be split by ``,``.

For example, settings populated through environment variables set to
``'one,two'`` will return the list ``['one', 'two']`` when using this method.

:param name: the setting name
:type name: string

:param default: the value to return if no setting is found
:type default: any
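
A rough sketch of the list conversion described for ``getlist``, assuming a hypothetical ``as_list`` helper (the handling of ``None`` is an assumption of this example):

```python
def as_list(value):
    """Return a list copy, splitting comma-separated strings."""
    if value is None:
        return []
    if isinstance(value, str):
        return value.split(",")
    # Copy so that callers cannot mutate the stored list.
    return list(value)


from_env = as_list("one,two")
original = ["a", "b"]
copied = as_list(original)
```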

.. method:: getdict(name, default=None)

Get a setting value as a dictionary. If the setting's original type is a
dictionary, a copy of it will be returned. If it's a string, it will be
evaluated as a JSON dictionary.

:param name: the setting name
:type name: string

:param default: the value to return if no setting is found
:type default: any
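
The dictionary conversion described for ``getdict`` could look roughly like this. ``as_dict`` is a hypothetical helper for illustration only; treating ``None`` as an empty dict is an assumption.

```python
import json


def as_dict(value):
    """Return a dict copy, parsing strings as JSON objects."""
    if value is None:
        return {}
    if isinstance(value, str):
        # A string is expected to hold a JSON object, e.g. '{"key": 1}'.
        value = json.loads(value)
    return dict(value)


parsed = as_dict('{"json": "myproject.exporters.Custom"}')
source = {"a": 1}
copied = as_dict(source)
```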

.. method:: copy()

Make a deep copy of current settings.

This method returns a new instance of the :class:`Settings` class,
populated with the same values and their priorities.

Modifications to the new object won't be reflected on the original
settings.

.. method:: freeze()

Disable further changes to the current settings.

After calling this method, the present state of the settings will become
immutable. Any attempt to change values through the :meth:`~set` method or
its variants will be rejected and reported.
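
The relationship between ``copy()``, ``freeze()`` and ``frozencopy()`` can be sketched with a toy class. ``FreezableSettings`` is hypothetical, and raising ``TypeError`` on a frozen object is an assumption of this sketch rather than a statement about the library.

```python
import copy


class FreezableSettings:
    """Toy settings object that can be deep-copied and frozen."""

    def __init__(self, values=None):
        self._values = dict(values or {})
        self._frozen = False

    def set(self, name, value):
        if self._frozen:
            # Assumed failure mode for this sketch.
            raise TypeError("Trying to modify frozen settings")
        self._values[name] = value

    def get(self, name, default=None):
        return self._values.get(name, default)

    def copy(self):
        # Deep copy: changes to the clone never reach the original.
        return copy.deepcopy(self)

    def freeze(self):
        self._frozen = True

    def frozencopy(self):
        # Equivalent to calling freeze() on the object returned by copy().
        clone = self.copy()
        clone.freeze()
        return clone


base = FreezableSettings({"LOG_LEVEL": "DEBUG"})
frozen = base.frozencopy()
base.set("LOG_LEVEL", "INFO")  # the original stays mutable
```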

.. method:: frozencopy()

Return an immutable copy of the current settings.
.. autoclass:: Settings
:show-inheritance:
:members:

Alias for a :meth:`~freeze` call on the object returned by :meth:`copy`.
.. autoclass:: BaseSettings
:members:

.. _topics-api-spiderloader:

24 changes: 11 additions & 13 deletions docs/topics/downloader-middleware.rst
@@ -23,22 +23,20 @@ Here's an example::
'myproject.middlewares.CustomDownloaderMiddleware': 543,
}

The :setting:`DOWNLOADER_MIDDLEWARES` setting is merged with the
:setting:`DOWNLOADER_MIDDLEWARES_BASE` setting defined in Scrapy (and not meant to
be overridden) and then sorted by order to get the final sorted list of enabled
middlewares: the first middleware is the one closer to the engine and the last
is the one closer to the downloader.

To decide which order to assign to your middleware see the
:setting:`DOWNLOADER_MIDDLEWARES_BASE` setting and pick a value according to
The specified :setting:`DOWNLOADER_MIDDLEWARES` setting is merged with the
default one (i.e. it does not overwrite it) and then sorted by order to get the
final sorted list of enabled middlewares: the first middleware is the one
closer to the engine and the last is the one closer to the downloader.

To decide which order to assign to your middleware see the default
:setting:`DOWNLOADER_MIDDLEWARES` setting and pick a value according to
where you want to insert the middleware. The order does matter because each
middleware performs a different action and your middleware could depend on some
previous (or subsequent) middleware being applied.
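
As a rough illustration of the merge-then-sort behaviour, the sketch below combines a default mapping with a project mapping, drops entries whose order is ``None``, and sorts the rest by order. The ``build_component_list`` helper and the ``builtin.*`` paths are hypothetical names chosen for this example; only the ``myproject`` path and its order come from the text above.

```python
def build_component_list(defaults, custom):
    """Merge two {path: order} dicts and return enabled paths sorted by order."""
    merged = {**defaults, **custom}  # project values win on key clashes
    # An order of None disables the component.
    enabled = {path: order for path, order in merged.items() if order is not None}
    return sorted(enabled, key=enabled.get)


# Hypothetical defaults, for illustration only.
defaults = {
    "builtin.UserAgentMiddleware": 400,
    "builtin.RetryMiddleware": 500,
}
custom = {
    "myproject.middlewares.CustomDownloaderMiddleware": 543,
    "builtin.UserAgentMiddleware": None,
}
ordered = build_component_list(defaults, custom)
```

The first element of ``ordered`` is the component closest to the engine and the last is the one closest to the downloader.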

If you want to disable a built-in middleware (the ones defined in
:setting:`DOWNLOADER_MIDDLEWARES_BASE` and enabled by default) you must define it
in your project's :setting:`DOWNLOADER_MIDDLEWARES` setting and assign `None`
as its value. For example, if you want to disable the user-agent middleware::
If you want to disable a built-in middleware you must define it in your
project's :setting:`DOWNLOADER_MIDDLEWARES` setting and assign ``None`` as its
value. For example, if you want to disable the user-agent middleware::

DOWNLOADER_MIDDLEWARES = {
'myproject.middlewares.CustomDownloaderMiddleware': 543,
@@ -164,7 +162,7 @@ middleware, see the :ref:`downloader middleware usage guide
<topics-downloader-middleware>`.

For a list of the components enabled by default (and their orders) see the
:setting:`DOWNLOADER_MIDDLEWARES_BASE` setting.
:setting:`DOWNLOADER_MIDDLEWARES` setting.

.. _cookies-mw:

20 changes: 9 additions & 11 deletions docs/topics/extensions.rst
@@ -42,17 +42,15 @@ by a string: the full Python path to the extension's class name. For example::

As you can see, the :setting:`EXTENSIONS` setting is a dict where the keys are
the extension paths, and their values are the orders, which define the
extension *loading* order. Extensions orders are not as important as middleware
orders though, and they are typically irrelevant, i.e. it doesn't matter in
which order the extensions are loaded because they don't depend on each other
[1].
extension *loading* order. The specified :setting:`EXTENSIONS` setting is merged
with the default one (i.e. it does not overwrite it) and then sorted by order
to get the final sorted list of enabled extensions.

However, this feature can be exploited if you need to add an extension which
depends on other extensions already loaded.

[1] This is why the :setting:`EXTENSIONS_BASE` setting in Scrapy (which
contains all built-in extensions enabled by default) defines all the extensions
with the same order (``500``).
As extensions typically do not depend on each other, their loading order is
irrelevant in most cases. This is why the default :setting:`EXTENSIONS` setting
defines all extensions with the same order (``500``). However, this feature can
be exploited if you need to add an extension which depends on other extensions
already loaded.

Available, enabled and disabled extensions
==========================================
@@ -65,7 +63,7 @@ Disabling an extension
======================

In order to disable an extension that comes enabled by default (i.e. those
included in the :setting:`EXTENSIONS_BASE` setting) you must set its order to
included in the default :setting:`EXTENSIONS` setting) you must set its order to
``None``. For example::

EXTENSIONS = {
41 changes: 17 additions & 24 deletions docs/topics/feed-exports.rst
@@ -265,16 +265,6 @@ Whether to export empty feeds (ie. feeds with no items).
FEED_STORAGES
-------------

Default: ``{}``

A dict containing additional feed storage backends supported by your project.
The keys are URI schemes and the values are paths to storage classes.

.. setting:: FEED_STORAGES_BASE

FEED_STORAGES_BASE
------------------

Default::

{
@@ -285,36 +275,39 @@ Default::
'ftp': 'scrapy.extensions.feedexport.FTPFeedStorage',
}

A dict containing the built-in feed storage backends supported by Scrapy.
A dict containing all feed storage backends supported by your project. The keys
are URI schemes and the values are paths to storage classes.

When you set :setting:`FEED_STORAGES` manually, e.g. in your project's settings
module, it is merged with the default rather than overwriting it. To disable
any of the default feed storage backends, assign ``None`` as its value.
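
The sketch below illustrates how a merged storage mapping might be consulted by URI scheme. It is an assumption-laden example: ``merge_storages``, ``storage_path_for`` and the ``myproject.storages.SFTPFeedStorage`` path are hypothetical, and the default mapping is deliberately partial, showing only the ``ftp`` entry quoted above.

```python
from urllib.parse import urlparse

# Partial default mapping; only the 'ftp' entry is taken from the text above.
DEFAULT_FEED_STORAGES = {
    "ftp": "scrapy.extensions.feedexport.FTPFeedStorage",
}


def merge_storages(defaults, custom):
    """Merge project storages into the defaults; None disables a scheme."""
    merged = dict(defaults)
    merged.update(custom)
    return {scheme: path for scheme, path in merged.items() if path is not None}


def storage_path_for(uri, storages):
    """Return the storage class path registered for the URI scheme, if any."""
    return storages.get(urlparse(uri).scheme)


project_storages = {
    "ftp": None,  # disable the built-in FTP backend
    "sftp": "myproject.storages.SFTPFeedStorage",  # hypothetical backend
}
storages = merge_storages(DEFAULT_FEED_STORAGES, project_storages)
```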

.. setting:: FEED_EXPORTERS

FEED_EXPORTERS
--------------

Default: ``{}``

A dict containing additional exporters supported by your project. The keys are
URI schemes and the values are paths to :ref:`Item exporter <topics-exporters>`
classes.

.. setting:: FEED_EXPORTERS_BASE

FEED_EXPORTERS_BASE
-------------------

Default::

FEED_EXPORTERS_BASE = {
{
'json': 'scrapy.exporters.JsonItemExporter',
'jsonlines': 'scrapy.exporters.JsonLinesItemExporter',
'jl': 'scrapy.exporters.JsonLinesItemExporter',
'csv': 'scrapy.exporters.CsvItemExporter',
'xml': 'scrapy.exporters.XmlItemExporter',
'marshal': 'scrapy.exporters.MarshalItemExporter',
'pickle': 'scrapy.exporters.PickleItemExporter',
}

A dict containing the built-in feed exporters supported by Scrapy.
A dict containing all feed exporters supported by your project. The keys are
URI schemes and the values are paths to :ref:`Item exporter <topics-exporters>`
classes.

When you set :setting:`FEED_EXPORTERS` manually, e.g. in your project's settings
module, it is merged with the default rather than overwriting it. To disable
any of the default feed exporters, assign ``None`` as its value.

.. _URI: http://en.wikipedia.org/wiki/Uniform_Resource_Identifier
.. _Amazon S3: http://aws.amazon.com/s3/