

DagSverreSeljebotn edited this page May 27, 2012 · 17 revisions

CEP 1001 - Custom PyTypeObject extensions

NOTE

A newer version of this idea is available here:

https://github.com/numfocus/sep/blob/master/sep200.rst

Once that SEP has been approved, the below should be deleted.

Overview

Python extensions often need to communicate things about PyObjects at the ABI level. In essence, one would like more slots in PyTypeObject for a custom purpose (dictionary lookups would be too slow).

The solution today is often to rely on PyObject_TypeCheck. However, this works against standardizing things across libraries, and creates one-to-many situations (only one implementor of an API, but many consumers), rather than many-to-many where there can be multiple implementors and multiple consumers of a mutually agreed-upon standard (think "domain-specific PEP 3118s").

To overcome this problem, the usual approach is to propose a PEP. However, that process is a) slow and b) not suitable for very domain-specific tasks, and c) its result is not backwards-compatible with earlier versions of Python.

python-dev will be consulted about what our long-term goal should be. In the short term, the hack below is used to support this in currently released versions of Python.

Current implementation

We hack more type information into existing and future CPython implementations in the following way: this CEP provides a C header that, for all Python versions, defines a macro Py_TPFLAGS_HAS_EXTENSIONS for a free bit in the tp_flags field of PyTypeObject.

If this flag is set on a type, we assume that the PyTypeObject struct is followed by a pointer to extension information, as follows:

typedef struct {
    unsigned long tpe_extension_id;
    void *tpe_data;
} PyTypeObjectExtensionEntry;

typedef struct {
    Py_ssize_t tpe_count; /* length of tpe_entries array */
    PyTypeObjectExtensionEntry tpe_entries[0]; /* variable size array */
} PyTypeObjectExtensionList;

typedef struct {
    PyTypeObject tp_main;
    PyTypeObjectExtensionList *tp_extensions;
} PyExtendedTypeObject;

Consumers scan the tp_extensions list for IDs they recognize. If an entry is found, its tpe_data points to custom information about the type. The content is project-specific, although it is expected to be similar to other type pointers such as tp_as_buffer.

Extensions are not required to have any order; types are expected to know which extension is most performance critical and put that first in the list.
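The consumer-side scan described above can be sketched as follows. This is only an illustration under stated assumptions, not the CEP header itself: MockTypeObject stands in for PyTypeObject so that the snippet compiles without Python.h, ptrdiff_t stands in for Py_ssize_t, the bit chosen for MOCK_TPFLAGS_HAS_EXTENSIONS is hypothetical, and the names find_extension and ext_list_new are invented for this example.

```c
#include <stddef.h>
#include <stdlib.h>

/* Stand-in for PyTypeObject (assumption: only tp_flags matters here). */
typedef struct {
    unsigned long tp_flags;
} MockTypeObject;

/* Hypothetical bit position; the real header would have to pick a bit
   that is free in every supported Python version. */
#define MOCK_TPFLAGS_HAS_EXTENSIONS (1UL << 22)

typedef struct {
    unsigned long tpe_extension_id;
    void *tpe_data;
} ExtensionEntry;

typedef struct {
    ptrdiff_t tpe_count;           /* length of tpe_entries */
    ExtensionEntry tpe_entries[];  /* C99 flexible array member */
} ExtensionList;

typedef struct {
    MockTypeObject tp_main;        /* must be the first member */
    ExtensionList *tp_extensions;
} MockExtendedTypeObject;

/* Hypothetical helper: allocate a zeroed list with room for `count` entries. */
static ExtensionList *ext_list_new(ptrdiff_t count)
{
    ExtensionList *list = calloc(1, sizeof(ExtensionList)
                                    + (size_t)count * sizeof(ExtensionEntry));
    if (list != NULL)
        list->tpe_count = count;
    return list;
}

/* Return the first entry whose ID equals `id`, or NULL if the type does not
   carry the flag or has no such entry. The scan is linear, which is why the
   most performance-critical extension should come first. */
static const ExtensionEntry *find_extension(const MockTypeObject *tp,
                                            unsigned long id)
{
    const MockExtendedTypeObject *ext;
    ptrdiff_t i;

    if (!(tp->tp_flags & MOCK_TPFLAGS_HAS_EXTENSIONS))
        return NULL;  /* plain type: nothing follows the struct */

    /* Valid only because the flag promises the extended layout. */
    ext = (const MockExtendedTypeObject *)tp;
    if (ext->tp_extensions == NULL)
        return NULL;

    for (i = 0; i < ext->tp_extensions->tpe_count; i++) {
        const ExtensionEntry *entry = &ext->tp_extensions->tpe_entries[i];
        if (entry->tpe_extension_id == id)
            return entry;
    }
    return NULL;
}
```

Returning the entry rather than tpe_data lets a caller distinguish "no such extension" from an extension whose data pointer happens to be NULL. A consumer would typically call find_extension once per type and cache the result.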

Problems

  • The above doesn't work if you subclass a subclass of the default type (subclass a metaclass).

Partition of ID space

The unsigned long ID is expected to be at least 32 bits.

The most significant 8 bits (of the 32-bit ID) denote a "registrar". Each registrar determines the use of the remaining 24 bits; the recommendation is to use the next 8 bits to identify the extension "idea" and to leave the least significant 16 bits to the extension for version information and/or flags.

Registrar IDs:

  • 0-10: Reserved for code that is not released into the wild
  • 11: Cython
  • 12-30: Other languages/Python implementations that interoperate with CPython in some way
  • 31-40: Scientific Python
  • ...
  • 128-255: Reserved for PSF use
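
The recommended partition of the ID space can be sketched with a few helpers. The function names below are hypothetical and not part of this CEP, and the idea/version split only reflects the recommendation above; a registrar is free to lay out its 24 bits differently.

```c
/* Hypothetical helpers for the recommended 8/8/16 layout:
   bits 31-24 registrar, bits 23-16 extension "idea",
   bits 15-0 version information and/or flags. */

static unsigned long tpe_make_id(unsigned registrar, unsigned idea,
                                 unsigned version)
{
    return ((unsigned long)(registrar & 0xFFu) << 24)
         | ((unsigned long)(idea & 0xFFu) << 16)
         | (unsigned long)(version & 0xFFFFu);
}

static unsigned tpe_registrar(unsigned long id)
{
    return (unsigned)((id >> 24) & 0xFFu);
}

static unsigned tpe_idea(unsigned long id)
{
    return (unsigned)((id >> 16) & 0xFFu);
}

static unsigned tpe_version(unsigned long id)
{
    return (unsigned)(id & 0xFFFFu);
}
```

For example, tpe_make_id(11, 1, 2) yields 0x0B010002: registrar 11 (Cython), an illustrative idea number 1, and an illustrative version 2.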