Collection of Python files that contain API definitions of third-party libraries extended for Python static analysis tools
Python
Latest commit d5c7a6c Aug 31, 2016 @vlasovskikh vlasovskikh committed on GitHub Merge pull request #20 from nicoddemus/pytest-30
Update skeletons for pytest 3.0
asyncio Skeleton for 'asyncio.events.get_event_loop()` Jul 1, 2014
django Django form models added Mar 10, 2016
lettuce each_outline added to lettuce (supported since 0.22) May 10, 2016
multiprocessing Fix misspelling of "children" in multiprocessing skeleton Feb 18, 2016
nose Headers for skeletons of third-party libraries Oct 31, 2013
numpy updated skeletons for unary numpy.ndarray operators May 16, 2016
os Sync with PyCharm 4.5 Sep 9, 2015
py Added skeleton for py.error. May 11, 2016
pytest Add pytest.register_assert_rewrite function Aug 31, 2016
AUTHORS.txt Added license and authors files Nov 12, 2013
LICENSE.txt Added license and authors files Nov 12, 2013
README.md Sync with PyCharm 4.5 Sep 9, 2015
StringIO.py Added skeleton for StringIO Oct 22, 2013
__builtin__.py Add skeletons for the built-in collections `set` and `frozenset` Jun 20, 2016
alembic.py Add skeleton for alembic. Aug 14, 2015
behave.py PY-13641 Behave: unresolved reference: false positive for uppercase s… Aug 7, 2014
builtins.py Merge pull request #17 from east825/set-frozenset-skeletons Jun 21, 2016
cStringIO.py Provide return values of the correct type where possible Oct 30, 2013
collections.py Sync with PyCharm 5.0.2 Dec 14, 2015
datetime.py Fixed encoding issues Oct 30, 2013
decimal.py Added skeleton for 'decimal' module Oct 28, 2013
functools.py Moved reduce() skeleton from builtins to functools for Python 3 Sep 19, 2014
io.py Python 2.6-3.3 compatibility Oct 31, 2013
itertools.py Sync with PyCharm 4.5 Sep 9, 2015
logging.py Add logging.getLogger to skeletons Jul 15, 2016
math.py Added skeleton for 'math' module Oct 25, 2013
pathlib.py added exist_ok param to pathlib.Path.mkdir() Dec 8, 2015
pickle.py Skeleton for 'pickle' module (PY-13432) Jul 7, 2014
re.py Add re.fullmatch and __Regex.fullmatch to skeletons Jul 15, 2016
shutil.py Modify the return type of shutil functions. Jul 14, 2016
sqlite3.py Added skeleton for 'sqlite3' module Oct 29, 2013
struct.py Python 2.6-3.3 compatibility Oct 31, 2013
subprocess.py Python 2.6-3.3 compatibility Oct 31, 2013
sys.py Sync with PyCharm 4.5 Sep 9, 2015

README.md

Python Skeletons

This proposal is a draft.

Python skeletons are Python files that contain API definitions of existing libraries extended for static analysis tools.

Rationale

Python is a dynamic language and is less suitable for static code analysis than statically typed languages like C or Java. Although Python static analysis tools can extract some information from Python source code without executing it, this information is often very shallow and incomplete.

Dynamic features of Python are very useful for user code. But using these features in APIs of third-party libraries and the standard library is not always a good idea. Tools (and users, in fact) need clear definitions of APIs. Often library API definitions are quite static and easy to grasp (defined using class, def), but types of function parameters and return values usually are not specified. Sometimes API definitions involve metaprogramming.

As there is not enough information in API definition code of libraries, developers of static analysis tools collect extended API data themselves and store it in their own formats. For example, PyLint uses imperative AST transformations of API modules in order to extend them with hard-coded data. PyCharm extends APIs via its proprietary database of declarative type annotations. The absence of a common extended API information format makes it hard for developers and users of tools to collect and share data.

Proposal

The proposal is to create a common database of extended API definitions as a collection of Python files called skeletons. Static analysis tools already understand Python code, so it should be easy to start extracting API definitions from these Python skeleton files. Regular function and class definitions can be extended with additional docstrings and decorators, e.g. for providing types of function parameters and return values. Static analysis tools may use a subset of information contained in skeleton files needed for their operation. Using Python files instead of a custom API definition format will also make it easier for users to populate the skeletons database.
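As a rough illustration of how a tool might read such a file, the sketch below parses a skeleton with the standard `ast` module and pulls Sphinx-style `:type:` and `:rtype:` fields out of the docstrings. The `parse` skeleton, the `extract_signatures` helper and the regular expression are hypothetical and exist only for this example; real tools are free to extract the information in any other way.

```python
import ast
import re

# Hypothetical skeleton source, used only for this illustration.
SKELETON_SOURCE = '''
def parse(text, strict=False):
    """Parse text and return a list of tokens.

    :type text: string
    :type strict: bool
    :rtype: list[string]
    """
    return []
'''

# Matches Sphinx-style fields such as ":type text: string" or ":rtype: int".
FIELD_RE = re.compile(
    r"^\s*:(type|rtype)(?:\s+(\w+))?:\s*(.+?)\s*$", re.MULTILINE)


def extract_signatures(source):
    """Map each top-level function to its declared parameter and return types."""
    result = {}
    for node in ast.parse(source).body:
        if not isinstance(node, ast.FunctionDef):
            continue
        docstring = ast.get_docstring(node) or ""
        params = {}
        rtype = None
        for kind, name, type_text in FIELD_RE.findall(docstring):
            if kind == "type" and name:
                params[name] = type_text
            elif kind == "rtype":
                rtype = type_text
        result[node.name] = {"params": params, "rtype": rtype}
    return result


signatures = extract_signatures(SKELETON_SOURCE)
```

The skeleton is never imported or executed here; only its source text is parsed, which is the property that makes plain Python files usable as an exchange format between tools.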

Declarative Python API definitions for static analysis tools cannot cover all dynamic tricks used in real APIs of libraries: some of them still require library-specific code analysis. Nevertheless the skeletons database is enough for many libraries.

The proposed python-skeletons repository is hosted on GitHub.

Conventions

Skeletons should contain syntactically correct Python code, preferably compatible with Python 2.6-3.3.

Skeletons should respect the PEP-8 and PEP-257 style guides.

If you need to reference the members of the original module of a skeleton, you should import it explicitly. For example, in a skeleton for the foo module:

import foo


class C(foo.B):
    def bar(self):
        """Do bar and return Bar.

        :rtype: foo.Bar
        """
        return foo.Bar()

Modules can be referenced in docstrings without explicit imports.

The body of a function in a skeleton file should consist of a single return statement that returns a simple value of the declared return type (e.g. 0 for int, False for bool, Foo() for Foo). If the function returns something non-trivial, its body may consist of a pass statement.
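A minimal sketch of this convention follows. All names in it (`Foo`, `count_items`, `is_ready`, `make_foo`, `load_config`) are hypothetical and do not refer to any real library.

```python
class Foo(object):
    """Hypothetical library class, used only for this illustration."""


def count_items(path):
    """Return the number of items found at path.

    :type path: string
    :rtype: int
    """
    return 0  # simple value of the declared type int


def is_ready():
    """Return whether the service is ready.

    :rtype: bool
    """
    return False  # simple value of the declared type bool


def make_foo():
    """Create a new Foo.

    :rtype: Foo
    """
    return Foo()  # instance of the declared return type


def load_config(path):
    """Load a configuration mapping from path.

    :type path: string
    :rtype: dict[string, list[Foo]]
    """
    pass  # non-trivial return value, so the body is just pass
```

The returned values carry no meaning of their own; they only give tools that infer types from return statements something of the right type to look at.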

Types

There is no standard notation for specifying types in Python code. We would like such a standard to emerge; see the related work below.

The current understanding is that a standard for optional type annotations in Python could use the syntax of function annotations in Python 3 and decorators as a fallback in Python 2. The type system should be relatively simple, but it has to include parametric (generic) types for collections and probably more.

As a temporary solution, we propose a simple way of specifying types in skeletons: Sphinx docstrings with the following notation:

Foo                # Class Foo visible in the current scope
x.y.Bar            # Class Bar from x.y module
Foo | Bar          # Foo or Bar
(Foo, Bar)         # Tuple of Foo and Bar
list[Foo]          # List of Foo elements
dict[Foo, Bar]     # Dict from Foo to Bar
T                  # Generic type (T-Z are reserved for generics)
T <= Foo           # Generic type with upper bound Foo
Foo[T]             # Foo parameterized with T
(Foo, Bar) -> Baz  # Function of Foo and Bar that returns Baz

There are several shortcuts available:

unknown            # Unknown type
None               # type(None)
string             # Py2: str | unicode, Py3: str
bytestring         # Py2: str | unicode, Py3: bytes
bytes              # Py2: str, Py3: bytes
unicode            # Py2: unicode, Py3: str

The syntax is subject to change. It is almost compatible with Python syntax (except for function types), but its semantics differ from Python's (no |, no implicitly visible names, no generic types), so you cannot use these expressions in Python 3 function annotations.
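To show how the notation might look inside skeleton docstrings, here is a small sketch that combines union, generic, collection and function types. The functions `first` and `group_by` are hypothetical, and writing a one-argument function type as `(T) -> V` is an assumption extrapolated from the two-argument form listed above.

```python
def first(items, default=None):
    """Return the first element of items, or default if items is empty.

    :type items: list[T]
    :type default: T | None
    :rtype: T | None
    """
    pass


def group_by(items, key):
    """Group items by the result of calling key on each of them.

    :type items: list[T]
    :type key: (T) -> V
    :rtype: dict[V, list[T]]
    """
    return {}
```

Because the types live in docstrings, both functions remain ordinary Python that any interpreter in the supported version range can parse.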

If you want to create a parameterized class, you should define its parameters in the mock return type of a constructor:

class C(object):
    """Some collection C that can contain values of T."""

    def __init__(self, value):
        """Initialize C.

        :type value: T
        :rtype: C[T]
        """
        pass

    def get(self):
        """Return the contained value.

        :rtype: T
        """
        pass

Versioning

The recommended way of checking the version of Python is:

import sys


if sys.version_info >= (2, 7) and sys.version_info < (3,):
    def from_27_until_30():
        pass

A skeleton should document the most recently released version of a library. Use deprecation warnings for functions that have been removed from the API.
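The text above does not say how a deprecation warning should be written in a skeleton. One possible reading, shown here purely as an assumption, is to call the standard `warnings.warn` with `DeprecationWarning` before the mock return. The function `old_helper` is hypothetical.

```python
import warnings


def old_helper(value):
    """Return value unchanged.

    Hypothetical function that has been removed from a newer release
    of the library.

    :type value: T
    :rtype: T
    """
    # Assumption: a plain DeprecationWarning marks the removed function.
    warnings.warn("old_helper() has been removed from the library API",
                  DeprecationWarning, stacklevel=2)
    return value
```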

Skeletons for built-in symbols are an exception. There are two modules: __builtin__ for Python 2 and builtins for Python 3.

Related Work

The JavaScript community is also interested in formalizing API definitions and specifying types. They have come up with several JavaScript dialects that support optional types: TypeScript, Dart. There is a JavaScript initiative similar to the proposed Python skeletons called DefinitelyTyped. The idea is to use TypeScript API stubs for various JavaScript libraries.

There are many approaches to specifying types in Python, but none of them is widely adopted at the moment.

See also the notes on function annotations in PEP-8.

PyCharm / IntelliJ

PyCharm 3 and the Python plugin 3.x for IntelliJ can extract the following information from the skeletons:

  • Parameters of functions and methods
  • Return types and parameter types of functions and methods
  • Types of assignment targets
  • Extra module members
  • Extra class members
  • TODO

PyCharm 3 comes with a snapshot of the Python skeletons repository (Python plugin 3.0.1 for IntelliJ still doesn't include this repository). You should not modify it, because it will be updated with the PyCharm / Python plugin for IntelliJ installation. If you want to change the skeletons, clone the skeletons GitHub repository into your PyCharm/IntelliJ config directory:

cd <config directory>
git clone https://github.com/JetBrains/python-skeletons.git

where <config directory> is:

  • PyCharm
    • Mac OS X: ~/Library/Preferences/PyCharmXX
    • Linux: ~/.PyCharmXX/config
    • Windows: <User home>\.PyCharmXX\config
  • IntelliJ
    • Mac OS X: ~/Library/Preferences/IntelliJIdeaXX
    • Linux: ~/.IntelliJIdeaXX/config
    • Windows: <User home>\.IntelliJIdeaXX\config
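The list above can be turned into a small lookup, sketched here under a few assumptions: `config_directory` is a hypothetical helper, the `version` argument stands in for the `XX` suffix (its exact value depends on the installed release and is not specified here), and any platform that is neither Mac OS X nor Windows is treated as Linux.

```python
import os
import sys


def config_directory(product, version, platform=None, home=None):
    """Return the config directory for "PyCharm" or "IntelliJIdea".

    version replaces the "XX" placeholder from the list above.
    """
    platform = sys.platform if platform is None else platform
    home = os.path.expanduser("~") if home is None else home
    name = product + version
    if platform == "darwin":
        # Mac OS X: ~/Library/Preferences/<name>
        return "/".join([home, "Library", "Preferences", name])
    if platform.startswith("win"):
        # Windows: <User home>\.<name>\config
        return "\\".join([home, "." + name, "config"])
    # Assumption: everything else follows the Linux layout ~/.<name>/config
    return "/".join([home, "." + name, "config"])
```

Calling `config_directory("PyCharm", "30")` with no further arguments resolves the path for the current platform and user; the value `"30"` is only a placeholder.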

Please send your PyCharm/IntelliJ-related bug reports and feature requests to the PyCharm issue tracker.

Feedback

If you want to contribute, send your pull requests to the Python skeletons repository on GitHub. Please make sure that you follow the conventions above.

Use the code-quality mailing list to discuss Python skeletons.