[MRG] update redirect links to python3 #4445

Merged: 4 commits (Apr 15, 2020)
Changes from all commits
15 changes: 7 additions & 8 deletions docs/intro/tutorial.rst
@@ -25,16 +25,16 @@ Scrapy.
If you're already familiar with other languages, and want to learn Python quickly, the `Python Tutorial`_ is a good resource.

If you're new to programming and want to start with Python, the following books
-may be useful to you:
+may be useful to you:

* `Automate the Boring Stuff With Python`_

-* `How To Think Like a Computer Scientist`_
+* `How To Think Like a Computer Scientist`_

-* `Learn Python 3 The Hard Way`_
+* `Learn Python 3 The Hard Way`_

You can also take a look at `this list of Python resources for non-programmers`_,
-as well as the `suggested resources in the learnpython-subreddit`_.
+as well as the `suggested resources in the learnpython-subreddit`_.

.. _Python: https://www.python.org/
.. _this list of Python resources for non-programmers: https://wiki.python.org/moin/BeginnersGuide/NonProgrammers
@@ -62,7 +62,7 @@ This will create a ``tutorial`` directory with the following contents::
__init__.py

items.py # project items definition file

middlewares.py # project middlewares file

pipelines.py # project pipelines file
@@ -287,8 +287,8 @@ to be scraped, you can at least get **some** data.

Besides the :meth:`~scrapy.selector.SelectorList.getall` and
:meth:`~scrapy.selector.SelectorList.get` methods, you can also use
-the :meth:`~scrapy.selector.SelectorList.re` method to extract using `regular
-expressions`_:
+the :meth:`~scrapy.selector.SelectorList.re` method to extract using
+:doc:`regular expressions <library/re>`:

>>> response.css('title::text').re(r'Quotes.*')
['Quotes to Scrape']
@@ -305,7 +305,6 @@ with a selector (see :ref:`topics-developer-tools`).
`Selector Gadget`_ is also a nice tool to quickly find CSS selector for
visually selected elements, which works in many browsers.

-.. _regular expressions: https://docs.python.org/3/library/re.html
.. _Selector Gadget: https://selectorgadget.com/


4 changes: 1 addition & 3 deletions docs/topics/contracts.rst
@@ -136,7 +136,7 @@ Detecting check runs
====================

When ``scrapy check`` is running, the ``SCRAPY_CHECK`` environment variable is
-set to the ``true`` string. You can use `os.environ`_ to perform any change to
+set to the ``true`` string. You can use :data:`os.environ` to perform any change to
your spiders or your settings when ``scrapy check`` is used::

import os
@@ -148,5 +148,3 @@ your spiders or your settings when ``scrapy check`` is used::
def __init__(self):
if os.environ.get('SCRAPY_CHECK'):
pass # Do some scraper adjustments when a check is running
-
-.. _os.environ: https://docs.python.org/3/library/os.html#os.environ
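
For illustration, a spider that adapts itself during ``scrapy check`` runs might look as follows (a sketch: the spider name, the ``DOWNLOAD_DELAY`` override and the contract lines are illustrative, not part of this diff)::

    import os

    from scrapy import Spider


    class ExampleSpider(Spider):
        name = "example"

        # Hypothetical adjustment: disable throttling while contract
        # checks are running.
        if os.environ.get("SCRAPY_CHECK"):
            custom_settings = {"DOWNLOAD_DELAY": 0}

        def parse(self, response):
            """
            @url http://www.example.com
            @returns items 0 0
            """
            return
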
5 changes: 2 additions & 3 deletions docs/topics/coroutines.rst
@@ -76,8 +76,8 @@ becomes::

Coroutines may be used to call asynchronous code. This includes other
coroutines, functions that return Deferreds and functions that return
-`awaitable objects`_ such as :class:`~asyncio.Future`. This means you can use
-many useful Python libraries providing such code::
+:term:`awaitable objects <awaitable>` such as :class:`~asyncio.Future`.
+This means you can use many useful Python libraries providing such code::

class MySpider(Spider):
# ...
@@ -107,4 +107,3 @@ Common use cases for asynchronous code include:
:ref:`the screenshot pipeline example<ScreenshotPipeline>`).

.. _aio-libs: https://github.com/aio-libs
-.. _awaitable objects: https://docs.python.org/3/glossary.html#term-awaitable
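
As a concrete sketch of the pattern described above (it assumes the asyncio reactor is enabled and that ``aiohttp`` is installed; both URLs are placeholders)::

    import aiohttp

    from scrapy import Spider


    class AsyncSpider(Spider):
        name = "async_example"
        start_urls = ["https://example.com"]

        async def parse(self, response):
            # Await an awaitable from an asyncio-based library inside
            # a coroutine callback.
            async with aiohttp.ClientSession() as session:
                async with session.get("https://api.example.com/extra") as resp:
                    extra = await resp.json()
            yield {"url": response.url, "extra": extra}
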
8 changes: 2 additions & 6 deletions docs/topics/downloader-middleware.rst
@@ -739,7 +739,7 @@ HttpProxyMiddleware
This middleware sets the HTTP proxy to use for requests, by setting the
``proxy`` meta value for :class:`~scrapy.http.Request` objects.

-Like the Python standard library modules `urllib`_ and `urllib2`_, it obeys
+Like the Python standard library module :mod:`urllib.request`, it obeys
the following environment variables:

* ``http_proxy``
@@ -751,9 +751,6 @@ HttpProxyMiddleware
Keep in mind this value will take precedence over ``http_proxy``/``https_proxy``
environment variables, and it will also ignore ``no_proxy`` environment variable.

-.. _urllib: https://docs.python.org/2/library/urllib.html
-.. _urllib2: https://docs.python.org/2/library/urllib2.html

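For reference, setting the ``proxy`` meta key per request looks like this (a sketch; the proxy URL is a placeholder)::

    import scrapy


    class ProxySpider(scrapy.Spider):
        name = "proxy_example"

        def start_requests(self):
            # This per-request value takes precedence over the
            # http_proxy/https_proxy environment variables.
            yield scrapy.Request(
                "https://example.com",
                meta={"proxy": "http://user:pass@proxy.example.com:8080"},
            )

        def parse(self, response):
            self.logger.info("Fetched %s via proxy", response.url)
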
RedirectMiddleware
------------------

@@ -1036,8 +1033,7 @@ Scrapy uses this parser by default.
RobotFileParser
~~~~~~~~~~~~~~~

-Based on `RobotFileParser
-<https://docs.python.org/3.7/library/urllib.robotparser.html>`_:
+Based on :class:`~urllib.robotparser.RobotFileParser`:

* is Python's built-in robots.txt_ parser

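For context, the robots.txt parser implementation is selected via the ``ROBOTSTXT_PARSER`` setting; a sketch (treat the exact import path as an assumption to verify against the settings reference)::

    # settings.py
    ROBOTSTXT_OBEY = True
    ROBOTSTXT_PARSER = "scrapy.robotstxt.PythonRobotParser"  # stdlib-based parser
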
14 changes: 7 additions & 7 deletions docs/topics/dynamic-content.rst
@@ -115,7 +115,7 @@ data from it depends on the type of response:
- If the response is HTML or XML, use :ref:`selectors
<topics-selectors>` as usual.

-- If the response is JSON, use `json.loads`_ to load the desired data from
+- If the response is JSON, use :func:`json.loads` to load the desired data from
:attr:`response.text <scrapy.http.TextResponse.text>`::

data = json.loads(response.text)
@@ -130,8 +130,9 @@ data from it depends on the type of response:
- If the response is JavaScript, or HTML with a ``<script/>`` element
containing the desired data, see :ref:`topics-parsing-javascript`.

-- If the response is CSS, use a `regular expression`_ to extract the desired
-data from :attr:`response.text <scrapy.http.TextResponse.text>`.
+- If the response is CSS, use a :doc:`regular expression <library/re>` to
+extract the desired data from
+:attr:`response.text <scrapy.http.TextResponse.text>`.

.. _topics-parsing-images:

@@ -168,8 +169,9 @@ JavaScript code:
Once you have a string with the JavaScript code, you can extract the desired
data from it:

-- You might be able to use a `regular expression`_ to extract the desired
-data in JSON format, which you can then parse with `json.loads`_.
+- You might be able to use a :doc:`regular expression <library/re>` to
+extract the desired data in JSON format, which you can then parse with
+:func:`json.loads`.

For example, if the JavaScript code contains a separate line like
``var data = {"field": "value"};`` you can extract that data as follows:
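
(A minimal sketch of that extraction, with a hard-coded ``js_code`` standing in for the JavaScript source; this is not the example elided from this diff)::

    import json
    import re

    js_code = 'var data = {"field": "value"};'

    match = re.search(r"\bvar\s+data\s*=\s*(\{.*?\})\s*;", js_code)
    if match:
        data = json.loads(match.group(1))
        print(data["field"])  # -> value
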
@@ -241,9 +243,7 @@ along with `scrapy-selenium`_ for seamless integration.
.. _headless browser: https://en.wikipedia.org/wiki/Headless_browser
.. _JavaScript: https://en.wikipedia.org/wiki/JavaScript
.. _js2xml: https://github.com/scrapinghub/js2xml
-.. _json.loads: https://docs.python.org/3/library/json.html#json.loads
.. _pytesseract: https://github.com/madmaze/pytesseract
-.. _regular expression: https://docs.python.org/3/library/re.html
.. _scrapy-selenium: https://github.com/clemfromspace/scrapy-selenium
.. _scrapy-splash: https://github.com/scrapy-plugins/scrapy-splash
.. _Selenium: https://www.selenium.dev/
4 changes: 1 addition & 3 deletions docs/topics/email.rst
@@ -7,16 +7,14 @@ Sending e-mail
.. module:: scrapy.mail
:synopsis: Email sending facility

-Although Python makes sending e-mails relatively easy via the `smtplib`_
+Although Python makes sending e-mails relatively easy via the :mod:`smtplib`
library, Scrapy provides its own facility for sending e-mails which is very
easy to use and it's implemented using :doc:`Twisted non-blocking IO
<twisted:core/howto/defer-intro>`, to avoid interfering with the non-blocking
IO of the crawler. It also provides a simple API for sending attachments and
it's very easy to configure, with a few :ref:`settings
<topics-email-settings>`.

-.. _smtplib: https://docs.python.org/2/library/smtplib.html

Quick example
=============

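(The quick example itself is elided above; basic usage is roughly the following sketch, where the addresses are placeholders)::

    from scrapy.mail import MailSender

    mailer = MailSender()  # or MailSender.from_settings(settings)
    mailer.send(
        to=["someone@example.com"],
        subject="Some subject",
        body="Some body",
        cc=["another@example.com"],
    )
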
20 changes: 6 additions & 14 deletions docs/topics/exporters.rst
@@ -311,7 +311,7 @@ CsvItemExporter

The additional keyword arguments of this ``__init__`` method are passed to the
:class:`BaseItemExporter` ``__init__`` method, and the leftover arguments to the
-`csv.writer`_ ``__init__`` method, so you can use any ``csv.writer`` ``__init__`` method
+:func:`csv.writer` function, so you can use any :func:`csv.writer` function
argument to customize this exporter.

A typical output of this exporter would be::
@@ -320,8 +320,6 @@ CsvItemExporter
Color TV,1200
DVD player,200

-.. _csv.writer: https://docs.python.org/2/library/csv.html#csv.writer

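For illustration, forwarding a ``csv.writer`` argument such as ``delimiter`` looks like this (a sketch; the file name, delimiter and item values are arbitrary)::

    from scrapy.exporters import CsvItemExporter

    with open("items.csv", "wb") as f:
        exporter = CsvItemExporter(f, delimiter=";")  # passed through to csv.writer
        exporter.start_exporting()
        exporter.export_item({"product": "Color TV", "price": "1200"})
        exporter.finish_exporting()
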
PickleItemExporter
------------------

@@ -335,15 +333,13 @@ PickleItemExporter
:param protocol: The pickle protocol to use.
:type protocol: int

-For more information, refer to the `pickle module documentation`_.
+For more information, see :mod:`pickle`.

The additional keyword arguments of this ``__init__`` method are passed to the
:class:`BaseItemExporter` ``__init__`` method.

Pickle isn't a human readable format, so no output examples are provided.

-.. _pickle module documentation: https://docs.python.org/2/library/pickle.html

PprintItemExporter
------------------

@@ -372,8 +368,8 @@ JsonItemExporter
Exports Items in JSON format to the specified file-like object, writing all
objects as a list of objects. The additional ``__init__`` method arguments are
passed to the :class:`BaseItemExporter` ``__init__`` method, and the leftover
-arguments to the `JSONEncoder`_ ``__init__`` method, so you can use any
-`JSONEncoder`_ ``__init__`` method argument to customize this exporter.
+arguments to the :class:`~json.JSONEncoder` ``__init__`` method, so you can use any
+:class:`~json.JSONEncoder` ``__init__`` method argument to customize this exporter.

:param file: the file-like object to use for exporting the data. Its ``write`` method should
accept ``bytes`` (a disk file opened in binary mode, a ``io.BytesIO`` object, etc)
@@ -393,8 +389,6 @@ JsonItemExporter
stream-friendly format, consider using :class:`JsonLinesItemExporter`
instead, or splitting the output in multiple chunks.

-.. _JSONEncoder: https://docs.python.org/2/library/json.html#json.JSONEncoder

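For illustration, forwarding :class:`~json.JSONEncoder` arguments looks like this (a sketch; ``indent`` and ``ensure_ascii`` are standard encoder arguments, and the item values are arbitrary)::

    from scrapy.exporters import JsonItemExporter

    with open("items.json", "wb") as f:
        exporter = JsonItemExporter(f, indent=4, ensure_ascii=False)
        exporter.start_exporting()
        exporter.export_item({"name": "Café", "price": 1200})
        exporter.finish_exporting()
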
JsonLinesItemExporter
---------------------

@@ -403,8 +397,8 @@ JsonLinesItemExporter
Exports Items in JSON format to the specified file-like object, writing one
JSON-encoded item per line. The additional ``__init__`` method arguments are passed
to the :class:`BaseItemExporter` ``__init__`` method, and the leftover arguments to
-the `JSONEncoder`_ ``__init__`` method, so you can use any `JSONEncoder`_
-``__init__`` method argument to customize this exporter.
+the :class:`~json.JSONEncoder` ``__init__`` method, so you can use any
+:class:`~json.JSONEncoder` ``__init__`` method argument to customize this exporter.

:param file: the file-like object to use for exporting the data. Its ``write`` method should
accept ``bytes`` (a disk file opened in binary mode, a ``io.BytesIO`` object, etc)
@@ -417,8 +411,6 @@ JsonLinesItemExporter
Unlike the one produced by :class:`JsonItemExporter`, the format produced by
this exporter is well suited for serializing large amounts of data.

-.. _JSONEncoder: https://docs.python.org/2/library/json.html#json.JSONEncoder

MarshalItemExporter
-------------------

3 changes: 1 addition & 2 deletions docs/topics/extensions.rst
@@ -364,13 +364,12 @@ Debugger extension

.. class:: Debugger

-Invokes a `Python debugger`_ inside a running Scrapy process when a `SIGUSR2`_
+Invokes a :doc:`Python debugger <library/pdb>` inside a running Scrapy process when a `SIGUSR2`_
signal is received. After the debugger is exited, the Scrapy process continues
running normally.

For more info see `Debugging in Python`_.

This extension only works on POSIX-compliant platforms (i.e. not Windows).

-.. _Python debugger: https://docs.python.org/2/library/pdb.html
.. _Debugging in Python: https://pythonconquerstheuniverse.wordpress.com/2009/09/10/debugging-in-python/
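
For reference, the signal can be sent from another process like so (a sketch; the PID is a placeholder, and this works on POSIX platforms only)::

    import os
    import signal

    scrapy_pid = 12345  # hypothetical PID of the running Scrapy process
    os.kill(scrapy_pid, signal.SIGUSR2)

Equivalently, run ``kill -USR2 <pid>`` from a shell.
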
33 changes: 11 additions & 22 deletions docs/topics/items.rst
@@ -15,17 +15,15 @@ especially in a larger project with many spiders.

To define common output data format Scrapy provides the :class:`Item` class.
:class:`Item` objects are simple containers used to collect the scraped data.
-They provide a `dictionary-like`_ API with a convenient syntax for declaring
-their available fields.
+They provide an API similar to the :class:`dict` API with a convenient syntax
+for declaring their available fields.

Various Scrapy components use extra information provided by Items:
exporters look at declared fields to figure out columns to export,
serialization can be customized using Item fields metadata, :mod:`trackref`
tracks Item instances to help find memory leaks
(see :ref:`topics-leaks-trackrefs`), etc.

-.. _dictionary-like: https://docs.python.org/2/library/stdtypes.html#dict

.. _topics-items-declaring:

Declaring Items
@@ -79,7 +77,7 @@ Working with Items

Here are some examples of common tasks performed with items, using the
``Product`` item :ref:`declared above <topics-items-declaring>`. You will
-notice the API is very similar to the `dict API`_.
+notice the API is very similar to the :class:`dict` API.

Creating items
--------------
@@ -145,7 +143,7 @@ KeyError: 'Product does not support field: lala'
Accessing all populated values
------------------------------

-To access all populated values, just use the typical `dict API`_:
+To access all populated values, just use the typical :class:`dict` API:

>>> product.keys()
['price', 'name']
@@ -162,11 +160,9 @@ Copying items
To copy an item, you must first decide whether you want a shallow copy or a
deep copy.

-If your item contains mutable_ values like lists or dictionaries, a shallow
-copy will keep references to the same mutable values across all different
-copies.
-
-.. _mutable: https://docs.python.org/3/glossary.html#term-mutable
+If your item contains :term:`mutable` values like lists or dictionaries,
+a shallow copy will keep references to the same mutable values across all
+different copies.

For example, if you have an item with a list of tags, and you create a shallow
copy of that item, both the original item and the copy have the same list of
@@ -175,9 +171,7 @@ other item as well.

If that is not the desired behavior, use a deep copy instead.

-See the `documentation of the copy module`_ for more information.
-
-.. _documentation of the copy module: https://docs.python.org/3/library/copy.html
+See :mod:`copy` for more information.

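For illustration, shallow versus deep copying behaves as follows (a sketch assuming a ``Product`` item with a declared ``tags`` field and the item copy methods described in this section)::

    import scrapy


    class Product(scrapy.Item):
        name = scrapy.Field()
        tags = scrapy.Field()


    product = Product(name="TV", tags=["electronics"])

    shallow = product.copy()   # or Product(product)
    deep = product.deepcopy()

    product["tags"].append("sale")
    print(shallow["tags"])  # ['electronics', 'sale'] -- the list is shared
    print(deep["tags"])     # ['electronics'] -- fully independent
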
To create a shallow copy of an item, you can either call
:meth:`~scrapy.item.Item.copy` on an existing item
@@ -235,8 +229,8 @@ Item objects

Return a new Item optionally initialized from the given argument.

-Items replicate the standard `dict API`_, including its ``__init__`` method, and
-also provide the following additional API members:
+Items replicate the standard :class:`dict` API, including its ``__init__``
+method, and also provide the following additional API members:

.. automethod:: copy

@@ -249,22 +243,17 @@ Item objects
:class:`Field` objects used in the :ref:`Item declaration
<topics-items-declaring>`.

-.. _dict API: https://docs.python.org/2/library/stdtypes.html#dict

Field objects
=============

.. class:: Field([arg])

-The :class:`Field` class is just an alias to the built-in `dict`_ class and
+The :class:`Field` class is just an alias to the built-in :class:`dict` class and
doesn't provide any extra functionality or attributes. In other words,
:class:`Field` objects are plain-old Python dicts. A separate class is used
to support the :ref:`item declaration syntax <topics-items-declaring>`
based on class attributes.

-.. _dict: https://docs.python.org/2/library/stdtypes.html#dict

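For illustration, declaring fields with metadata looks like this (a sketch; the field names and the ``serializer`` key are illustrative)::

    import scrapy


    class Product(scrapy.Item):
        name = scrapy.Field()                   # a Field is a plain dict
        price = scrapy.Field(serializer=float)  # arbitrary metadata for other components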

Other classes related to Item
=============================
