Description
As some of you may know, Stackless Python is an old and venerable fork of regular C-Python. Stackless Python enhances C-Python by modifying the core of the interpreter in ways that cannot be done in an extension module. Otherwise, Stackless Python is fully API and ABI compatible with the corresponding C-Python version: for instance, you can replace C-Python 3.6.4 with Stackless Python 3.6.4 and everything should continue to work. This works well except for Cython modules, because Cython uses undocumented implementation details of C-Python. Additionally, there is Stackless bug 166, which is already fixed for Stackless 2.7 and will soon be fixed for 3.x.
Currently, I know of the following problems:
PyMethodDef ml_flags
In Cython/Utility/ModuleSetupCode.c and Cython/Utility/ObjectHandling.c, Cython uses the METH_xxx flags but ignores METH_STACKLESS. This probably affects Stackless 3.6 and 3.7 and is trivial to fix. I have created pull request #2554.
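To illustrate how an ignored flag bit can cause trouble, here is a minimal sketch of one way such a mismatch might arise. It does not reproduce the actual Cython code. All DEMO_ names are hypothetical stand-ins so that the example compiles without Python.h, and the value chosen for DEMO_METH_STACKLESS is an assumption, not taken from the Stackless headers.

```c
#include <assert.h>

/* Stand-in flag bits for illustration only; the real definitions live in
 * methodobject.h. DEMO_METH_STACKLESS is an assumed value. */
enum {
    DEMO_METH_VARARGS   = 0x0001,
    DEMO_METH_KEYWORDS  = 0x0002,
    DEMO_METH_NOARGS    = 0x0004,
    DEMO_METH_O         = 0x0008,
    DEMO_METH_CLASS     = 0x0010,
    DEMO_METH_STATIC    = 0x0020,
    DEMO_METH_COEXIST   = 0x0040,
    DEMO_METH_STACKLESS = 0x0100  /* assumption, see lead-in */
};

/* Naive check: strips the binding flags but is unaware of the extra bit. */
static int demo_is_noargs_naive(int ml_flags)
{
    int mask = DEMO_METH_CLASS | DEMO_METH_STATIC | DEMO_METH_COEXIST;
    return (ml_flags & ~mask) == DEMO_METH_NOARGS;
}

/* Tolerant check: also strips the Stackless bit before comparing. */
static int demo_is_noargs(int ml_flags)
{
    int mask = DEMO_METH_CLASS | DEMO_METH_STATIC | DEMO_METH_COEXIST
             | DEMO_METH_STACKLESS;
    return (ml_flags & ~mask) == DEMO_METH_NOARGS;
}
```

With these definitions, a method declared with DEMO_METH_NOARGS | DEMO_METH_STACKLESS is recognized by demo_is_noargs() but rejected by demo_is_noargs_naive(), even though its calling convention is unchanged.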
Incompatible PyFrameObject
Cython uses fields of PyFrameObject, which are documented to be subject to change at any time. Unfortunately, Stackless has a different PyFrameObject. Let's compare them:
C-Python:
    typedef struct _frame {
        PyObject_VAR_HEAD
        struct _frame *f_back;      /* previous frame, or NULL */
        PyCodeObject *f_code;       /* code segment */
        /* ... other fields */
        PyTryBlock f_blockstack[CO_MAXBLOCKS]; /* for try and loop blocks */
        PyObject *f_localsplus[1];  /* locals+stack, dynamically sized */
    } PyFrameObject;
Stackless Python:
    typedef struct _frame {
        PyObject_VAR_HEAD
        struct _frame *f_back;      /* previous frame, or NULL */
        PyFrame_ExecFunc *f_execute;/* support for soft stackless */
        /* ... other fields, identical to C-Python */
        PyTryBlock f_blockstack[CO_MAXBLOCKS]; /* for try and loop blocks */
        PyCodeObject *f_code;       /* code segment */
        PyObject *f_localsplus[1];  /* locals+stack, dynamically sized */
    } PyFrameObject;
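The effect of the two layouts can be reproduced with simplified stand-in structs. This is only a sketch: the demo_ types are hypothetical, the elided fields are left out, and the block stack is replaced by a small placeholder array. It shows that the extra pointer shifts f_localsplus by one pointer, and that the Stackless f_code ends up exactly where C-Python code expects f_localsplus[0].

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-in for PyObject_VAR_HEAD. */
typedef struct demo_var_head {
    long  refcnt;
    void *type;
    long  size;
} demo_var_head;

/* Field order as in the C-Python struct shown above. */
typedef struct demo_cpython_frame {
    demo_var_head head;
    struct demo_cpython_frame *f_back;
    void *f_code;
    int   f_blockstack[4];     /* placeholder for the real block stack */
    void *f_localsplus[1];
} demo_cpython_frame;

/* Field order as in the Stackless struct shown above. */
typedef struct demo_stackless_frame {
    demo_var_head head;
    struct demo_stackless_frame *f_back;
    void *f_execute;           /* additional Stackless field */
    int   f_blockstack[4];     /* placeholder for the real block stack */
    void *f_code;              /* moved behind the block stack */
    void *f_localsplus[1];
} demo_stackless_frame;

/* Distance in bytes between the two f_localsplus offsets. */
static ptrdiff_t demo_localsplus_shift(void)
{
    return (ptrdiff_t)offsetof(demo_stackless_frame, f_localsplus)
         - (ptrdiff_t)offsetof(demo_cpython_frame, f_localsplus);
}
```

In this simplified model, code compiled against the first layout and handed a frame with the second layout writes to f_code whenever it believes it is writing to f_localsplus[0].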
As you can see, Stackless has an additional field, f_execute, and the fields f_code and f_localsplus have different offsets. This makes __Pyx_PyFunction_FastCallNoKw() ABI incompatible between C-Python and Stackless, because this function uses f_localsplus. This is fatal, because it breaks compatibility with binary wheels, which are almost always compiled using C-Python.
How can we resolve this incompatibility? Currently I see three options:
- Change Cython to use only the documented API. This is not realistic for performance reasons.
- Change Cython to determine the offset of f_localsplus at runtime. This is probably the cleanest solution. It is possible to compute the offset of f_localsplus from the size of a frame (Py_SIZE(f)) and the information in the code object. Of course, the offset is constant at run time and may be cached.
- Change Stackless to detect and "repair" a corrupted PyFrameObject. This would first require a change to the layout of the Stackless PyFrameObject, because currently the Stackless f_code gets overwritten when __Pyx_PyFunction_FastCallNoKw() writes f_localsplus[0]. Once f_code is preserved, Stackless can shift the f_localsplus array to the correct offset. It could work, but it would be an ugly hack that relies on Cython implementation details. The advantage is that it restores compatibility with already existing Cython-based extension modules.
Perhaps there are better options.
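As a rough sketch of the second option, the offset could be derived once at startup and then cached. Note that this sketch takes a different route from the Py_SIZE(f) computation mentioned above: it assumes that the running interpreter's frame type reports the basic size of its own frame layout, and that f_localsplus is the last member with exactly one declared slot and no trailing padding. All demo_ names are hypothetical stand-ins, not Cython or C-Python API.

```c
#include <assert.h>
#include <stddef.h>

/* Two hypothetical frame layouts; only the tail matters for this sketch. */
typedef struct demo_frame_a {
    void *f_back;
    void *f_code;
    void *f_localsplus[1];
} demo_frame_a;

typedef struct demo_frame_b {
    void *f_back;
    void *f_execute;           /* extra field, as in Stackless */
    void *f_code;
    void *f_localsplus[1];
} demo_frame_b;

/* Derive the offset from the basic size reported by the running interpreter.
 * Assumes f_localsplus is the last member and has one declared slot. */
static size_t demo_localsplus_offset(size_t frame_basicsize)
{
    return frame_basicsize - sizeof(void *);
}

/* The offset is constant at run time, so it is computed once and cached. */
static size_t demo_cached_offset = 0;

static void demo_init_offset(size_t frame_basicsize)
{
    demo_cached_offset = demo_localsplus_offset(frame_basicsize);
}

/* Access the locals array through the cached offset instead of through
 * a field offset that was fixed at compile time. */
static void **demo_get_localsplus(void *frame)
{
    assert(demo_cached_offset != 0);
    return (void **)((char *)frame + demo_cached_offset);
}
```

A module built this way would call demo_init_offset() once during initialization, passing the basic size taken from the interpreter it is actually running under, and would then use demo_get_localsplus() wherever it previously dereferenced f_localsplus directly.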