Creating Stubs For Python Modules

Frederick edited this page Jun 15, 2017 · 5 revisions

Introduction

The type checker only needs a stubs file to let programs access a Python module; there is no need to port the entire module to mypy. A stubs file is also a good starting point for porting an entire Python module to mypy. Writing stubs can also highlight potential areas of improvement in the mypy type system.

A stubs file only contains a description of the public interface of the module without any implementations. Stubs can be dynamically typed, statically typed or a mixture of both. Based on preliminary results, it seems that over 95% of the names defined in the Python standard library can be given static types. The rest will be dynamically typed (use type 'Any') or have 'Any' components, but that is fine, since mypy supports seamless mixing of dynamic and static types in programs.

Mypy uses stubs from https://github.com/python/typeshed -- all stub changes should be contributed there. Mypy developers periodically pull the latest changes from typeshed to be included with mypy. There are stubs for both Python 2.7 and 3.x, though not every stub supports both versions. Some stubs can be shared between Python 2.7 and 3.x.

Examples

The file stubs/builtins.py is the stubs file for Python builtins. Here are a few examples from the stubs file (module global variables and a function):

True = ... # type: bool
False = ... # type: bool
...
def chr(code: int) -> str: ...

The unittest module stubs contain a good example of a stub for a class:

class TestSuite(Testable):
    def __init__(self, tests: Iterable[Testable] = None) -> None: ...
    def addTest(self, test: Testable) -> None: ...
    def addTests(self, tests: Iterable[Testable]) -> None: ...
    def run(self, result: TestResult) -> None: ...
    def debug(self) -> None: ...
    def countTestCases(self) -> int: ...

Using and testing new stubs files

Follow these steps to be able to use your newly created stubs file:

  • Create a directory named 'stubs' (it can be anywhere, but the name has to be exactly 'stubs').
  • Set the environment variable MYPYPATH to refer to the above directory. For example:
$ export MYPYPATH=/home/jukka/work/stubs
  • Add the stub file to the stubs directory. Use the normal Python file name conventions for modules, e.g. 'csv.py' for module 'csv'; use a subdirectory with __init__.py for packages.
  • That's it! Now you can access the module in mypy programs and test that it works correctly.
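
The directory layout described in the steps above can be sketched in Python. This is only an illustration: the module name 'mymod', the package name 'mypkg', and their function signatures are hypothetical, and a temporary directory stands in for wherever you keep your stubs.

```python
import os
import tempfile


def create_stub_layout(root: str) -> str:
    """Create a 'stubs' directory under root holding two hypothetical stubs."""
    stubs_dir = os.path.join(root, "stubs")  # the name must be exactly 'stubs'
    os.makedirs(stubs_dir, exist_ok=True)

    # Stub for a hypothetical module 'mymod': one file named after the module.
    with open(os.path.join(stubs_dir, "mymod.py"), "w") as f:
        f.write("def greet(name: str) -> str: ...\n")

    # Stub for a hypothetical package 'mypkg': a subdirectory with __init__.py.
    pkg_dir = os.path.join(stubs_dir, "mypkg")
    os.makedirs(pkg_dir, exist_ok=True)
    with open(os.path.join(pkg_dir, "__init__.py"), "w") as f:
        f.write("def helper(x: int) -> int: ...\n")

    return stubs_dir


stubs_dir = create_stub_layout(tempfile.mkdtemp())
print(stubs_dir)  # point MYPYPATH at this directory
```

MYPYPATH would then be set to the printed path, as in the export example above.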

Hints

Use the Python interactive interpreter to introspect modules and try them interactively. You should also consult the Python library reference, of course, since it is more authoritative than docstrings. Example:

>>> import os
>>> dir(os)
[..., 'abort', 'access', ...]   # List of names defined in os
>>> help(os)
Help on module os:
... (contents omitted)
>>> help(os.abort)
Help on built-in function abort in module posix:

abort(...):
    abort() -> does not return!
    ...

You can use help() with functions, methods, classes and modules.
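
The same introspection can be scripted when a module has many names. The sketch below uses the standard inspect module; the helper names public_names and describe are made up for this example, and inspect.signature cannot produce a signature for every built-in function, so the code falls back to a placeholder in that case.

```python
import inspect
import os


def public_names(module):
    """Return the names in a module that do not start with an underscore."""
    return sorted(name for name in dir(module) if not name.startswith("_"))


def describe(module, name):
    """Return 'name(signature)', or 'name(...)' if no signature is available."""
    obj = getattr(module, name)
    try:
        return f"{name}{inspect.signature(obj)}"
    except (TypeError, ValueError):
        # Raised for non-callables and for built-ins without signature metadata.
        return f"{name}(...)"


names = public_names(os)
print(len(names), "public names in os")
print(describe(os, "makedirs"))
```

The printed signatures only give parameter names and defaults; the types still have to come from the documentation or from experimentation.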

Dynamically typed stubs

Often it's easier to start with dynamically typed stubs for a module. These still let mypy code access the module, but static checking and type inference will be limited.

Here is a small example:

from typing import Any

x = ... # type: Any   # variable

def f(x, y=None): ...     # function

class C:
    def __init__(self, foo): ...    # method
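
A dynamically typed stub can later be tightened one signature at a time. The following sketch shows the same names with some types filled in; the chosen types (int, str) are assumptions made for illustration, since the example module is not a real one.

```python
from typing import Any, Optional

x = ...  # type: Any   # still dynamically typed


# Argument types are now static; the return type is left as Any for the moment.
def f(x: int, y: Optional[str] = None) -> Any: ...


class C:
    def __init__(self, foo: str) -> None: ...    # fully typed method
```

Mixing typed and untyped signatures like this is fine, since mypy supports mixing dynamic and static types.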

Naming

If you introduce helpers in your stubs that are not part of the actual module interface, such as type variables or type aliases, prefix them with an underscore. This way, if someone uses from m import *, the internal helpers do not pollute the namespace. Generally avoid defining __all__: it is mostly redundant, since stubs contain mostly public functionality. Imports made within the stub are currently visible via a * import, but this will likely change, so that stubs don't export imported names by default.

Common issues

Sometimes it may not be obvious how to add static types to some parts of a Python module. This section covers some of the more common issues.

Named tuples

You can use typing.NamedTuple to define named tuples with statically typed items. Here is an example:

from typing import NamedTuple

ClassName = NamedTuple('ClassName', [('item', int), ('another', str)])
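
For reference, here is a short sketch of how code that uses such a definition behaves at runtime. The variable names and values are arbitrary.

```python
from typing import NamedTuple

ClassName = NamedTuple('ClassName', [('item', int), ('another', str)])

value = ClassName(item=1, another='x')

# Items are available by attribute name, by index, and by unpacking.
print(value.item)
print(value[1])
item, another = value
```

With this definition the type checker can treat value.item as an int and value.another as a str.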

Type of function return value depends on external state

Sometimes the type of a function return value depends on the value of a variable or other external state. This case generally can't be modelled statically (or it's not worth the complexity). However, this is not a problem per se: just use type 'Any' for the return value. Alternatively, you can use a type with some 'Any' components, such as 'Tuple[int, Any]'. Accompany these with a comment explaining your reasoning.
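
A minimal sketch of both options follows. The functions get_option and read_record are hypothetical and only illustrate the annotation style.

```python
from typing import Any, Tuple


# The return type depends on external configuration, so it cannot be
# described statically; plain Any is used.
def get_option(name: str) -> Any: ...


# The first item is always an int; the type of the second item depends on
# the stored data, so only that component is Any.
def read_record(key: str) -> Tuple[int, Any]: ...
```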

Python documentation does not give enough detail

The Python documentation sometimes fails to properly explain the type of some function arguments or return values. In this case you can always leave these types as 'Any', and add a TODO comment describing why. Somebody can later figure out the precise type. If you are more persistent, you can try doing some of the following:

  • Try calling the function with various argument types and see what works. This has the potential problem that sometimes a function accepts certain values by accident, and it might not be portable across Python implementations.
  • Browse the module source code (C or Python) or unit tests within the CPython source package and see what's going on. This is obviously only realistic if you are already somewhat familiar with the CPython source code structure or motivated enough to learn.
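
The first approach can be partly automated. The helper below is a rough sketch (the name probe is made up for this example): it calls a one-argument function with several candidate values and records which of them raise TypeError. As noted above, an accepted value is only a hint, not proof that the type is officially supported.

```python
def probe(func, candidates):
    """Map each candidate's type name to whether func accepted the value."""
    results = {}
    for value in candidates:
        try:
            func(value)
        except TypeError:
            # Only TypeError is treated as a rejection; other exceptions propagate.
            results[type(value).__name__] = False
        else:
            results[type(value).__name__] = True
    return results


print(probe(len, ["abc", [1], 3, None]))
# → {'str': True, 'list': True, 'int': False, 'NoneType': False}
```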

A function returns an anonymous class instance

Sometimes a function returns an instance, and only the structure of the instance is described in the Python documentation -- the class has no documented name in Python. In this case, you can introduce a new class or interface with a short and descriptive name, and use this for the return value type. Explain prominently in comments that the class name is not derived from the Python implementation. The mypy-to-Python translator has to be modified to remove references to these synthetic class names during translation to Python, since they are not valid in Python code (e.g. from m import MyClassName would be an error in Python code). You don't have to do this; just mention the issue in a commit message and somebody else can deal with it. (TODO document how this is done)

Alternatively, the class may actually have a name, but perhaps that name is undocumented. You can use Python introspection to figure this out: just use print(type(ret_val)) and see if the result looks like something you could use (if the name is _xvf_internal_wrapper or similar you should not use it in the stubs; the name is likely to change between Python versions and is not intended to be a part of the public interface).
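
The introspection step can be sketched as follows. The helper names returned_class_name and looks_private are invented for this example, and re.match is used only as a convenient function whose return class is worth inspecting; the exact class name it reports differs between Python versions.

```python
import re


def returned_class_name(ret_val) -> str:
    """Return the module-qualified class name of a returned object."""
    cls = type(ret_val)
    return f"{cls.__module__}.{cls.__qualname__}"


def looks_private(cls) -> bool:
    """Heuristic: a leading underscore suggests an internal class name."""
    return cls.__name__.startswith("_")


match = re.match(r"a", "abc")
print(returned_class_name(match))
print(looks_private(type(match)))
```

A name that looks private, or that lives in an underscore-prefixed module, is a sign that a synthetic class in the stub is the safer choice.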

Function accepts arguments of various types (overloaded function)

If a function accepts arguments of various types, you generally need an overloaded function stub or a generic function stub.
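
As a minimal sketch of the two approaches, the example below uses typing.overload for a function whose return type follows its argument type, and a type variable for a generic function. The functions double and first are hypothetical. They carry implementations here so that the example runs as ordinary Python; in a stubs file only the signatures with ... bodies would appear.

```python
from typing import Sequence, TypeVar, Union, overload

_T = TypeVar("_T")


# Overloaded function: each variant pairs an argument type with a return type.
@overload
def double(x: int) -> int: ...
@overload
def double(x: str) -> str: ...
def double(x: Union[int, str]) -> Union[int, str]:
    return x * 2


# Generic function: the return type is whatever item type the sequence has.
def first(items: Sequence[_T]) -> _T:
    return items[0]


print(double(2), double("ab"), first([3, 4]))
```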

TODO add examples

Protocols such as Iterable and Sequence

Often a function expects an argument that implements a protocol such as Iterable.

Mypy builtins (import these from typing) include interfaces that match common Python protocols. Here is a partial list (have a look at stubs/3.2/typing.py for more):

  • Iterable[t] (this is a generic type -- you can replace t with any type, e.g. Iterable[int], Iterable[Any])
  • Iterator[t]
  • Sequence[t]
  • Sized (can be used with len)
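
The following sketch shows each of the listed types in a signature. The function names are made up for illustration, and the bodies are included only so the example can be run; a stub would contain just the signatures.

```python
from typing import Iterable, Iterator, Sequence, Sized


def total(values: Iterable[int]) -> int:
    # Accepts lists, tuples, sets, generators: anything that can be iterated.
    return sum(values)


def last(items: Sequence[str]) -> str:
    # Sequence additionally promises indexing and len().
    return items[-1]


def is_empty(container: Sized) -> bool:
    # Sized only promises len().
    return len(container) == 0


def countdown(n: int) -> Iterator[int]:
    # A generator function is naturally described as returning an Iterator.
    while n > 0:
        yield n
        n -= 1


print(total(countdown(3)), last(("a", "b")), is_empty([]))
```

Annotating with the least specific protocol that the function really needs keeps the stub usable with the widest range of argument types.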

File objects

TODO

Important modules that don't have stubs

It would be useful to have stubs for additional third-party modules. Here are links to pages that suggest useful modules: