Major update to "using mypy with existing codebase"
See python#13681

In particular, it's really common to want to make progress towards
`--strict`, and we currently don't really have any guidance on doing so.
Per-module ignore_errors is also a really useful tool for adopting mypy.

Also try to link more to existing documentation elsewhere.
hauntsaninja committed Sep 19, 2022
1 parent 11be378 commit 0dee1f4
Showing 1 changed file with 155 additions and 90 deletions.
245 changes: 155 additions & 90 deletions docs/source/existing_code.rst
@@ -7,46 +7,94 @@ This section explains how to get started using mypy with an existing,
significant codebase that has little or no type annotations. If you are
a beginner, you can skip this section.

These steps will get you started with mypy on an existing codebase:
Start small
-----------

1. Start small -- get a clean mypy build for some files, with few
annotations
If your codebase is large, pick a subset of your codebase (say, 5,000 to 50,000
lines) and get mypy to run successfully only on this subset at first, *before
adding annotations*. This should be doable in a day or two. The sooner you get
some form of mypy passing on your codebase, the sooner you benefit.

2. Write a mypy runner script to ensure consistent results
You'll likely need to fix some mypy errors, either by inserting
annotations requested by mypy or by adding ``# type: ignore``
comments to silence errors you don't want to fix now.

3. Run mypy in Continuous Integration to prevent type errors
We'll mention some tips for getting mypy passing on your codebase in various
sections below.

4. Gradually annotate commonly imported modules
Run mypy consistently and prevent regressions
---------------------------------------------

5. Write annotations as you modify existing code and write new code
Make sure all developers on your codebase run mypy the same way.
One way to ensure this is adding a small script with your mypy
invocation to your codebase, or adding your mypy invocation to
existing tools you use to run tests, like ``tox``.

6. Use :doc:`monkeytype:index` or `PyAnnotate`_ to automatically annotate legacy code
* Make sure everyone runs mypy with the same options. Checking a mypy
:ref:`configuration file <config-file>` into your codebase can help
with this.

We discuss all of these points in some detail below, and a few optional
follow-up steps.
* Make sure everyone type checks the same set of files. See
:ref:`specifying-code-to-be-checked` for details.

Start small
-----------
* Make sure everyone runs mypy with the same version of mypy, for instance
by pinning mypy with the rest of your dev requirements.
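
One way to keep the invocation consistent is a small runner script. The following is only a minimal sketch: the pinned version ``0.971`` is taken from the CI example in this document, while ``my_project``, the ``mypy.ini`` path, and the function names are assumptions chosen for illustration.

```python
import subprocess
import sys
from importlib import metadata

PINNED_MYPY_VERSION = "0.971"  # assumption: keep in sync with your dev requirements
CONFIG_FILE = "mypy.ini"       # assumption: config file checked into the repository
TARGETS = ("my_project",)      # assumption: the paths everyone type checks


def build_mypy_command(targets=TARGETS, config_file=CONFIG_FILE):
    """Return the single mypy command line that developers and CI should share."""
    return [sys.executable, "-m", "mypy", "--config-file", config_file, *targets]


def installed_mypy_version():
    """Return the installed mypy version, or None if mypy is not installed."""
    try:
        return metadata.version("mypy")
    except metadata.PackageNotFoundError:
        return None


def main():
    """Refuse to run with an unexpected mypy version, then run the shared command."""
    version = installed_mypy_version()
    if version != PINNED_MYPY_VERSION:
        print(f"expected mypy {PINNED_MYPY_VERSION}, found {version}", file=sys.stderr)
        return 1
    return subprocess.run(build_mypy_command()).returncode
```

A script entry point would then call ``sys.exit(main())``, so that CI fails whenever mypy reports errors or the installed version has drifted from the pin.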

If your codebase is large, pick a subset of your codebase (say, 5,000
to 50,000 lines) and run mypy only on this subset at first,
*without any annotations*. This shouldn't take more than a day or two
to implement, so you start enjoying benefits soon.
In particular, you'll want to make sure to run mypy as part of your
Continuous Integration (CI) system as soon as possible. This will
prevent new type errors from being introduced into your codebase.

You'll likely need to fix some mypy errors, either by inserting
annotations requested by mypy or by adding ``# type: ignore``
comments to silence errors you don't want to fix now.
A simple CI script could look something like this:

.. code-block:: text

    python3 -m pip install mypy==0.971
    # Run your standardised mypy invocation, e.g.
    mypy my_project
    # This could also look like `scripts/run_mypy.sh`, `tox -e mypy`, `make mypy`, etc

Ignoring errors from certain modules
------------------------------------

In particular, mypy often generates errors about modules that it can't
find or that don't have stub files:
By default mypy will follow imports in your code and try to check everything.
This means even if you only pass in a few files to mypy, it may still process a
large number of imported files. This could potentially result in lots of errors
you don't want to deal with at the moment.

One way to deal with this is to ignore errors in modules you aren't yet ready to
type check. The :confval:`ignore_errors` option is useful for this. For instance,
if you aren't yet ready to deal with errors from ``package_to_fix_later``:

.. code-block:: text

    [mypy-package_to_fix_later.*]
    ignore_errors = True

You could even invert this, by setting ``ignore_errors = True`` in your global
config section and only enabling error reporting with ``ignore_errors = False``
for the set of modules you are ready to type check.
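
The inverted setup might look like the config embedded in the sketch below. The package names ``my_project.core`` and ``my_project.utils`` are hypothetical, and the helper is only a convenience for listing which sections re-enable error reporting; it does not reproduce how mypy matches modules to sections.

```python
import configparser

# Assumption: `my_project.core` and `my_project.utils` are hypothetical packages
# that are already clean; errors everywhere else are silenced for now.
INVERTED_CONFIG = """
[mypy]
ignore_errors = True

[mypy-my_project.core.*]
ignore_errors = False

[mypy-my_project.utils.*]
ignore_errors = False
"""


def sections_reporting_errors(config_text):
    """List the per-module sections that turn error reporting back on."""
    parser = configparser.ConfigParser()
    parser.read_string(config_text)
    return [
        name
        for name in parser.sections()
        if name.startswith("mypy-")
        and not parser.getboolean(name, "ignore_errors", fallback=True)
    ]
```

As more modules become clean, you would add a section for each of them, until the global ``ignore_errors = True`` can be dropped.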

Fixing errors related to imports
--------------------------------

A common class of errors you will encounter is about modules that mypy
can't find, that don't have types, or that don't have stub files:

.. code-block:: text

    core/config.py:7: error: Cannot find implementation or library stub for module named 'frobnicate'
    core/model.py:9: error: Cannot find implementation or library stub for module named 'acme'
    ...

This is normal, and you can easily ignore these errors. For example,
Sometimes these can be fixed by installing the relevant packages or
stub libraries in the environment you're running ``mypy`` in.

See :ref:`ignore-missing-imports` for a complete reference on these errors
and the ways in which you can fix them.

You'll likely find that you want to suppress all errors from importing
a given module that doesn't have types. If you only import that module
in one or two places, you can use ``# type: ignore`` comments. For example,
here we ignore an error about a third-party module ``frobnicate`` that
doesn't have stubs using ``# type: ignore``:

@@ -56,9 +104,9 @@ doesn't have stubs using ``# type: ignore``:
    ...
    frobnicate.initialize()  # OK (but not checked)

You can also use a mypy configuration file, which is convenient if
there are a large number of errors to ignore. For example, to disable
errors about importing ``frobnicate`` and ``acme`` everywhere in your
But if you import the module in many places, this becomes unwieldy. In this
case, we recommend using a :ref:`configuration file <config-file>`. For example,
to disable errors about importing ``frobnicate`` and ``acme`` everywhere in your
codebase, use a config like this:

.. code-block:: text

@@ -69,69 +117,33 @@ codebase, use a config like this:
    [mypy-acme.*]
    ignore_missing_imports = True

You can add multiple sections for different modules that should be
ignored.

If your config file is named ``mypy.ini``, this is how you run mypy:

.. code-block:: text

    mypy --config-file mypy.ini mycode/

If you get a large number of errors, you may want to ignore all errors
about missing imports. This can easily cause problems later on and
hide real errors, and it's only recommended as a last resort.
For more details, look :ref:`here <follow-imports>`.
about missing imports, for instance by setting :confval:`ignore_missing_imports`
to true globally. This can hide errors later on, so we recommend avoiding this
if possible.

Mypy follows imports by default. This can result in a few files passed
on the command line causing mypy to process a large number of imported
files, resulting in lots of errors you don't want to deal with at the
moment. There is a config file option to disable this behavior, but
since this can hide errors, it's not recommended for most users.
Finally, mypy allows fine-grained control over specific import following
behaviour. It's very easy to silently shoot yourself in the foot when playing
around with these, so it's mostly recommended as a last resort. For more
details, look :ref:`here <follow-imports>`.

Mypy runner script
------------------

Introduce a mypy runner script that runs mypy, so that every developer
will use mypy consistently. Here are some things you may want to do in
the script:

* Ensure that the correct version of mypy is installed.

* Specify mypy config file or command-line options.

* Provide set of files to type check. You may want to implement
inclusion and exclusion filters for full control of the file
list.

Continuous Integration
----------------------

Once you have a clean mypy run and a runner script for a part
of your codebase, set up your Continuous Integration (CI) system to
run mypy to ensure that developers won't introduce bad annotations.
A simple CI script could look something like this:

.. code-block:: text

    python3 -m pip install mypy==0.790  # Pinned version avoids surprises
    scripts/mypy  # Run the mypy runner script you set up

Annotate widely imported modules
--------------------------------
Prioritise annotating widely imported modules
---------------------------------------------

Most projects have some widely imported modules, such as utilities or
model classes. It's a good idea to annotate these pretty early on,
since this allows code using these modules to be type checked more
effectively. Since mypy supports gradual typing, it's okay to leave
some of these modules unannotated. The more you annotate, the more
useful mypy will be, but even a little annotation coverage is useful.
effectively.

Mypy is designed to support gradual typing, i.e. letting you add annotations at
your own pace, so it's okay to leave some of these modules unannotated. The more
you annotate, the more useful mypy will be, but even a little annotation
coverage is useful.
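
As a rough illustration of why widely imported modules are a good place to start, consider the sketch below. The helper ``parse_port`` and both callers are hypothetical: annotating the shared helper once lets mypy check every annotated caller, while unannotated callers keep working unchanged.

```python
from typing import Optional


# Hypothetical widely imported helper, e.g. from a shared `utils` module.
def parse_port(value: str, default: Optional[int] = None) -> int:
    """Parse a port number, falling back to `default` for an empty string."""
    if not value:
        if default is None:
            raise ValueError("no port given and no default configured")
        return default
    return int(value)


# An unannotated caller: mypy treats it as dynamically typed and, by default,
# does not check its body, so it can stay as-is for now.
def legacy_caller(raw):
    return parse_port(raw, default=8080)


# An annotated caller: mypy checks this call against parse_port's signature.
# A mistake such as `port + "!"` would be reported, since `port` is an int.
def new_caller(raw: str) -> str:
    port = parse_port(raw)
    return f"listening on {port}"
```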

Write annotations as you go
---------------------------

Now you are ready to include type annotations in your development
workflows. Consider adding something like these in your code style
Consider adding something like the following to your code style
conventions:

1. Developers should add annotations for any new code.
@@ -143,9 +155,9 @@ codebase without much effort.
Automate annotation of legacy code
----------------------------------

There are tools for automatically adding draft annotations
based on type profiles collected at runtime. Tools include
:doc:`monkeytype:index` (Python 3) and `PyAnnotate`_.
There are tools for automatically adding draft annotations based on simple
static analysis or on type profiles collected at runtime. Tools include
:doc:`monkeytype:index`, `autotyping`_ and `PyAnnotate`_.

A simple approach is to collect types from test runs. This may work
well if your test coverage is good (and if your tests aren't very
@@ -156,6 +168,68 @@ fraction of production network requests. This clearly requires more
care, as type collection could impact the reliability or the
performance of your service.
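
To make the idea of a runtime type profile concrete, here is a toy sketch. It is not how MonkeyType or PyAnnotate are implemented; the decorator, the ``observed_types`` mapping, and the ``scale`` function are all made up for illustration.

```python
import functools
from collections import defaultdict

# Maps function name -> set of (positional argument type names, return type name).
observed_types = defaultdict(set)


def record_types(func):
    """Toy decorator that records the types seen at runtime.

    Keyword arguments are ignored for brevity.
    """
    @functools.wraps(func)
    def wrapper(*args, **kwargs):
        result = func(*args, **kwargs)
        arg_types = tuple(type(arg).__name__ for arg in args)
        observed_types[func.__qualname__].add((arg_types, type(result).__name__))
        return result
    return wrapper


@record_types
def scale(value, factor):  # hypothetical legacy function without annotations
    return value * factor


# Simulate a test run that exercises the function with different inputs.
scale(2, 3)
scale(1.5, 2)
```

The recorded profile only reflects the calls that actually happened, which is why the resulting annotations are drafts that need review rather than finished signatures.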

Introduce stricter options
--------------------------

Mypy is very configurable. Once you get started with static typing, you may want
to explore the various strictness options mypy provides to catch more bugs. For
example, you can use :confval:`disallow_untyped_defs` to require annotations for
all functions in certain modules, which avoids accidentally introducing code that
won't be type checked. Refer to :ref:`config-file` for the details.

An excellent goal to aim for is to have your codebase pass when run against ``mypy --strict``.
This basically ensures that you will never have a type-related error without an explicit
circumvention somewhere (such as a ``# type: ignore`` comment).

The following config is equivalent to ``--strict``:

.. code-block:: text

    # Start off with these
    warn_unused_configs = True
    warn_redundant_casts = True
    warn_unused_ignores = True
    no_implicit_optional = True

    # Getting these passing should be easy
    strict_equality = True
    strict_concatenate = True

    # Strongly recommend enabling this one as soon as you can
    check_untyped_defs = True

    # These shouldn't be too much additional work, but may be tricky to
    # get passing if you use a lot of untyped libraries
    disallow_subclassing_any = True
    disallow_untyped_decorators = True
    disallow_any_generics = True

    # These next few are various gradations of forcing use of type annotations
    disallow_untyped_calls = True
    disallow_incomplete_defs = True
    disallow_untyped_defs = True

    # This one isn't too hard to get passing, but return on investment is lower
    no_implicit_reexport = True

    # This one can be tricky to get passing if you use a lot of untyped libraries
    warn_return_any = True

Note that you can also start with ``--strict`` and subtract, for instance:

.. code-block:: text

    strict = True
    warn_return_any = False

Remember that many of these options can be enabled on a per-module basis. For instance,
you may want to enable ``disallow_untyped_defs`` for modules which you've completed
annotations for, in order to prevent new code from being added without annotations.
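
To give a feel for what two of these options look for, here is a small sketch. The function names are hypothetical, and the comments describe what mypy would be expected to report when the named option is enabled for the module.

```python
from typing import Optional


# With disallow_untyped_defs enabled for this module, mypy reports this
# definition because it has no annotations.
def greet_untyped(name, punctuation="!"):
    return "Hello, " + name + punctuation


# Fully annotated: accepted under disallow_untyped_defs.
def greet(name: str, punctuation: str = "!") -> str:
    return "Hello, " + name + punctuation


# With no_implicit_optional enabled, a `None` default needs an explicit
# Optional annotation; `title: str = None` would be reported.
def greet_titled(name: str, title: Optional[str] = None) -> str:
    prefix = f"{title} " if title is not None else ""
    return greet(prefix + name)
```

All three functions behave the same at runtime; the options only change what mypy accepts.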

And if you want, it doesn't stop at ``--strict``. Mypy has additional checks
that are not part of ``--strict`` that can be useful. See the complete
:ref:`command-line` reference and :ref:`error-codes-optional`.

Speed up mypy runs
------------------

@@ -165,14 +239,5 @@ this will be. If your project has at least 100,000 lines of code or
so, you may also want to set up :ref:`remote caching <remote-cache>`
for further speedups.

Introduce stricter options
--------------------------

Mypy is very configurable. Once you get started with static typing, you may want
to explore the various strictness options mypy provides to catch more bugs. For
example, you can ask mypy to require annotations for all functions in certain
modules to avoid accidentally introducing code that won't be type checked using
:confval:`disallow_untyped_defs`, or type check code without annotations as well
with :confval:`check_untyped_defs`. Refer to :ref:`config-file` for the details.

.. _PyAnnotate: https://github.com/dropbox/pyannotate
.. _autotyping: https://github.com/JelleZijlstra/autotyping
