Commit

Deprecate ReppyRobotParser (#6099)
wRAR committed Oct 18, 2023
1 parent 991121f commit 39ee8d1
Showing 2 changed files with 5 additions and 1 deletion.
3 changes: 2 additions & 1 deletion docs/topics/downloader-middleware.rst
@@ -1039,8 +1039,8 @@ RobotsTxtMiddleware

 * :ref:`Protego <protego-parser>` (default)
 * :ref:`RobotFileParser <python-robotfileparser>`
-* :ref:`Reppy <reppy-parser>`
 * :ref:`Robotexclusionrulesparser <rerp-parser>`
+* :ref:`Reppy <reppy-parser>` (deprecated)

 You can change the robots.txt_ parser with the :setting:`ROBOTSTXT_PARSER`
 setting. Or you can also :ref:`implement support for a new parser <support-for-new-robots-parser>`.
@@ -1133,6 +1133,7 @@ In order to use this parser:

 .. warning:: `Upstream issue #122
 <https://github.com/seomoz/reppy/issues/122>`_ prevents reppy usage in Python 3.9+.
+Because of this the Reppy parser is deprecated.

 * Set :setting:`ROBOTSTXT_PARSER` setting to
 ``scrapy.robotstxt.ReppyRobotParser``
3 changes: 3 additions & 0 deletions scrapy/robotstxt.py
@@ -1,7 +1,9 @@
 import logging
 import sys
 from abc import ABCMeta, abstractmethod
+from warnings import warn

+from scrapy.exceptions import ScrapyDeprecationWarning
 from scrapy.utils.python import to_unicode

 logger = logging.getLogger(__name__)
@@ -79,6 +81,7 @@ def allowed(self, url, user_agent):

 class ReppyRobotParser(RobotParser):
     def __init__(self, robotstxt_body, spider):
+        warn("ReppyRobotParser is deprecated.", ScrapyDeprecationWarning, stacklevel=2)
         from reppy.robots import Robots

         self.spider = spider
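
Note on the pattern used in this hunk: the warning is emitted in `__init__`, so it fires when the parser is constructed rather than when the module is imported, and `stacklevel=2` attributes it to the calling code. The sketch below shows that pattern in isolation so it runs without Scrapy or reppy installed. `StandInDeprecationWarning`, `StandInParser` and `build_parser` are hypothetical names invented for this illustration; they are not part of Scrapy.

```python
import warnings


class StandInDeprecationWarning(Warning):
    """Hypothetical stand-in for scrapy.exceptions.ScrapyDeprecationWarning."""


class StandInParser:
    """Hypothetical stand-in for ReppyRobotParser; it parses nothing."""

    def __init__(self, robotstxt_body, spider):
        # Same shape as the commit: warn on construction, with stacklevel=2
        # so the warning points at the caller instead of this line.
        warnings.warn(
            "StandInParser is deprecated.",
            StandInDeprecationWarning,
            stacklevel=2,
        )
        self.spider = spider


def build_parser():
    # Record warnings instead of printing them, so they can be inspected.
    with warnings.catch_warnings(record=True) as caught:
        warnings.simplefilter("always")
        parser = StandInParser(b"User-agent: *\nDisallow: /private", spider=None)
    return parser, caught


parser, caught = build_parser()
for entry in caught:
    print(entry.category.__name__, entry.message)
```

Because the warning is raised only on construction, a project that never sets :setting:`ROBOTSTXT_PARSER` to ``scrapy.robotstxt.ReppyRobotParser`` should not see it. Projects that do can switch to one of the other parsers listed in the docs hunk, such as Protego (the default).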