diff --git a/docs/source/actions.rst b/docs/source/actions.rst
index 65198084..730b5c19 100644
--- a/docs/source/actions.rst
+++ b/docs/source/actions.rst
@@ -475,7 +475,7 @@ SPIDERMON_REPORT_S3_REGION_ENDPOINT
 .. _actions-sentry-action:

 Sentry action
-============
+=============

 This action allows you to send custom messages to `Sentry`_ when your
 monitor suites finish their execution. To use this action
@@ -533,7 +533,7 @@ It could be set to any level provided by `Sentry Log Level`_
 .. _SPIDERMON_SENTRY_FAKE:

 SPIDERMON_SENTRY_FAKE
---------------------
+---------------------

 Default: ``False``

diff --git a/docs/source/getting-started.rst b/docs/source/getting-started.rst
index f1d3391f..de3d1252 100644
--- a/docs/source/getting-started.rst
+++ b/docs/source/getting-started.rst
@@ -229,8 +229,8 @@ Item validation

 Item validators allows you to match your returned items with predetermined structure
 ensuring that all fields contains data in the expected format. Spidermon allows
-you to choose between schematics_ or `JSON Schema`_ to define the structure
-of your item.
+you to choose from schematics_,`JSON Schema`_ or `cerberus`_ to define structure and
+validation tool needed for your item.

 In this tutorial, we will use a schematics_ model to make sure that all required fields
 are populated and they are all of the correct format.
@@ -385,6 +385,7 @@ The resulted item will look like this:

 .. _`JSON Schema`: https://json-schema.org/
 .. _`schematics`: https://schematics.readthedocs.io/en/latest/
+.. _`cerberus`: https://docs.python-cerberus.org/en/latest/index.html
 .. _`Scrapy`: https://scrapy.org/
 .. _`Scrapy items`: https://docs.scrapy.org/en/latest/topics/items.html
 .. 
_`Scrapy Tutorial`: https://doc.scrapy.org/en/latest/intro/tutorial.html
diff --git a/docs/source/item-validation.rst b/docs/source/item-validation.rst
index 8b7043ae..1d1a98f5 100644
--- a/docs/source/item-validation.rst
+++ b/docs/source/item-validation.rst
@@ -18,7 +18,7 @@ the first step is to enable the built-in item pipeline in your project settings:
     }

 After that, you need to choose which validation library will be used. Spidermon
-accepts schemas defined using schematics_ or `JSON Schema`_.
+accepts schemas defined using schematics_, `JSON Schema`_ or cerberus_.

 With schematics
 ---------------
@@ -87,6 +87,34 @@ an example of a schema for the quotes item from the :doc:`tutorial `.
+
+.. code-block:: json
+
+    {
+        "quote": {"type": "string", "required": true},
+        "author": {"type": "string", "required": true},
+        "author_url": {"type": "string"},
+        "tags": {"type": "list"}
+    }
+
+To use Cerberus validation, you would need to add
+:ref:`SPIDERMON_VALIDATION_CERBERUS` setting to your `settings.py`
+
 Settings
 --------

@@ -193,6 +221,36 @@ as a `dict`:
         OtherItem: '/path/to/otheritem_schema.json',
     }

+.. _SPIDERMON_VALIDATION_CERBERUS:
+
+SPIDERMON_VALIDATION_CERBERUS
+^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
+
+Default: ``None``
+
+A `list` containing the local path of the item schema.
+
+.. code-block:: python
+
+    # settings.py
+
+    SPIDERMON_VALIDATION_CERBERUS = [
+        '/path/to/schema.json',
+        'http://example.com/mycerberusschema',
+        {"Field": {"type": "number", "required":True}}
+    ]
+
+If you are working on a spider that produces multiple items types, you can define paths to schema for each item as `dict` as shown below:
+
+    # settings.py
+
+    from quotes.items import DummyItem, OtherItem
+
+    SPIDERMON_VALIDATION_CERBERUS = {
+        DummyItem: '/path/to/dummyitem_schema.json',
+        OtherItem: '/path/to/otheritem_schema.json',
+    }
+
 Validation in Monitors
 ----------------------

@@ -238,3 +296,6 @@ Some examples:
 .. 
_`guide`: http://json-schema.org/learn/getting-started-step-by-step.html
 .. _`schematics models`: https://schematics.readthedocs.io/en/latest/usage/models.html
 .. _`jsonschema`: https://pypi.org/project/jsonschema/
+.. _`cerberus`: https://pypi.org/project/Cerberus/
+.. _`usage`: http://docs.python-cerberus.org/en/latest/usage.html
+.. _`validation-rules`: http://docs.python-cerberus.org/en/latest/validation-rules.html
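
Review note: the patch documents a Cerberus schema both as a JSON file (with ``true``) and as an inline Python dict (with ``True``). The sketch below may help readers see how the two relate and what the schema accepts. It loads the JSON schema from the patch and, if Cerberus is installed, validates two sample items with Cerberus' own ``Validator`` class. The helper ``check_item`` and the sample items are hypothetical and only for illustration; this is not how Spidermon wires the setting internally, which the patch does not show.

```python
import json

# Schema copied from the JSON block added to item-validation.rst in this patch.
SCHEMA_JSON = """
{
    "quote": {"type": "string", "required": true},
    "author": {"type": "string", "required": true},
    "author_url": {"type": "string"},
    "tags": {"type": "list"}
}
"""

# json.loads maps JSON `true` to Python `True`, producing the same kind of
# dict as the inline entry shown for SPIDERMON_VALIDATION_CERBERUS.
schema = json.loads(SCHEMA_JSON)

try:
    from cerberus import Validator  # third-party package: pip install Cerberus
except ImportError:
    Validator = None  # the schema-loading part above still works without it


def check_item(item):
    """Return Cerberus' error dict for `item`; an empty dict means valid.

    Hypothetical helper for illustration only, not part of Spidermon.
    """
    if Validator is None:
        raise RuntimeError("Cerberus is not installed")
    validator = Validator(schema)
    validator.validate(item)
    return validator.errors


if Validator is not None:
    good = {"quote": "A sample quote", "author": "Jane Doe", "tags": ["life"]}
    bad = {"quote": "A sample quote", "tags": "life"}  # no author, tags is not a list
    print(check_item(good))
    print(sorted(check_item(bad)))
```

Running a quick check like this against a few scraped items before enabling the pipeline can catch schema mistakes (for example a misspelled field name) earlier than a full crawl would.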