Compatibility with Stackless Python #2534

Closed
akruis opened this issue Aug 6, 2018 · 3 comments

@akruis

akruis commented Aug 6, 2018

As some of you might know, Stackless Python is an old and venerable fork of regular C-Python. Stackless Python enhances C-Python and modifies the core of the interpreter in a way that can't be done as an extension module. Otherwise, Stackless Python is fully API and ABI compatible with the corresponding C-Python version: for instance, you can replace C-Python 3.6.4 with Stackless Python 3.6.4 and everything should continue to work. This works well, except for Cython modules, because Cython uses undocumented implementation details of C-Python. Additionally, there is Stackless bug 166, which is already fixed for Stackless 2.7 and will soon be fixed for 3.x.

Currently, I know of the following problems:

PyMethodDef ml_flags

In Cython/Utility/ModuleSetupCode.c and Cython/Utility/ObjectHandling.c, Cython uses the METH_xxx flags but ignores METH_STACKLESS. This probably affects Stackless 3.6 and 3.7 and is trivial to fix. I created pull request #2554.
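
To illustrate the kind of check that is affected, here is a minimal sketch that does not include Python.h. All DEMO_* names and both helper functions are hypothetical, and the flag values are assumptions chosen for illustration. It only shows why a flag comparison that does not mask out the Stackless bit stops matching:

```c
#include <assert.h>

/* Hypothetical stand-ins for the METH_* bits; values are assumed for
   illustration and are not taken from any real header in this sketch. */
#define DEMO_METH_NOARGS    0x0004
#define DEMO_METH_O         0x0008
#define DEMO_METH_CLASS     0x0010
#define DEMO_METH_STATIC    0x0020
#define DEMO_METH_COEXIST   0x0040
#define DEMO_METH_STACKLESS 0x0100

/* Naive check: strips the binding flags but leaves the Stackless bit in. */
static int demo_is_noargs_naive(int ml_flags)
{
    const int ignored = DEMO_METH_CLASS | DEMO_METH_STATIC | DEMO_METH_COEXIST;
    return (ml_flags & ~ignored) == DEMO_METH_NOARGS;
}

/* Tolerant check: additionally strips the Stackless bit before comparing. */
static int demo_is_noargs(int ml_flags)
{
    const int ignored = DEMO_METH_CLASS | DEMO_METH_STATIC |
                        DEMO_METH_COEXIST | DEMO_METH_STACKLESS;
    return (ml_flags & ~ignored) == DEMO_METH_NOARGS;
}
```

With these helpers, a method declared with DEMO_METH_NOARGS | DEMO_METH_STACKLESS is recognised by demo_is_noargs() but rejected by demo_is_noargs_naive().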

Incompatible PyFrameObject

Cython uses fields of PyFrameObject, which are documented to be subject to change at any time. Unfortunately, Stackless has a different PyFrameObject. Let's compare them:

C-Python:

typedef struct _frame {
    PyObject_VAR_HEAD
    struct _frame *f_back;      /* previous frame, or NULL */
    PyCodeObject *f_code;       /* code segment */
    /* ... other fields ... */
    PyTryBlock f_blockstack[CO_MAXBLOCKS]; /* for try and loop blocks */
    PyObject *f_localsplus[1];  /* locals+stack, dynamically sized */
} PyFrameObject;

Stackless Python:

typedef struct _frame {
    PyObject_VAR_HEAD
    struct _frame *f_back;      /* previous frame, or NULL */
    PyFrame_ExecFunc *f_execute;/* support for soft stackless */
    /* ... other fields, identical to C-Python ... */
    PyTryBlock f_blockstack[CO_MAXBLOCKS]; /* for try and loop blocks */
    PyCodeObject *f_code;           /* code segment */
    PyObject *f_localsplus[1];  /* locals+stack, dynamically sized */
} PyFrameObject;

As you can see, Stackless has an additional field, f_execute, and two fields have a different offset: f_code and f_localsplus. This makes __Pyx_PyFunction_FastCallNoKw() ABI incompatible between C-Python and Stackless, because this function uses f_localsplus. This is fatal, because it breaks compatibility with binary wheels, which are almost always compiled using C-Python. How can we resolve this incompatibility? Currently I see three options:

  1. Change Cython to use only documented API. Not realistic for performance reasons.

  2. Change Cython to determine the offset of f_localsplus at runtime. Probably the cleanest solution. It is possible to compute the offset of f_localsplus from the size of a frame (Py_SIZE(f)) and the information in the code object. Of course, the offset is constant at run time and may be cached.

  3. Change Stackless to detect and "repair" a corrupted PyFrameObject. This would first require a change to the layout of the Stackless PyFrameObject, because currently the Stackless f_code gets overwritten when __Pyx_PyFunction_FastCallNoKw() writes f_localsplus[0]. Once f_code is preserved, Stackless can shift the f_localsplus array to the correct offset. It could work, but it would be an ugly hack that relies on Cython implementation details. The advantage is that it restores compatibility with already existing Cython-based extension modules.

Perhaps there are better options.
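
The overwrite of f_code described above can be reproduced with a small stand-alone sketch. The Demo* structs below are hypothetical mocks that only mimic the field order of the two layouts; they are not the real headers, and demo_store_first_local() is an invented helper that plays the role of code compiled against the C-Python layout:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

#define DEMO_MAXBLOCKS 20

typedef struct { int b_type; int b_handler; int b_level; } DemoTryBlock;

/* Mock of the C-Python field order (object header reduced to three words). */
typedef struct demo_cpython_frame {
    size_t ob_refcnt; void *ob_type; ptrdiff_t ob_size;
    struct demo_cpython_frame *f_back;
    void *f_code;
    DemoTryBlock f_blockstack[DEMO_MAXBLOCKS];
    void *f_localsplus[1];
} DemoCPythonFrame;

/* Mock of the Stackless field order: f_execute added, f_code moved back. */
typedef struct demo_stackless_frame {
    size_t ob_refcnt; void *ob_type; ptrdiff_t ob_size;
    struct demo_stackless_frame *f_back;
    void *f_execute;
    DemoTryBlock f_blockstack[DEMO_MAXBLOCKS];
    void *f_code;
    void *f_localsplus[1];
} DemoStacklessFrame;

/* Stores `value` at the byte offset where the C-Python layout keeps
   f_localsplus[0], regardless of which layout `frame` really has. */
static void demo_store_first_local(void *frame, void *value)
{
    memcpy((char *)frame + offsetof(DemoCPythonFrame, f_localsplus),
           &value, sizeof value);
}
```

Calling demo_store_first_local() on a DemoStacklessFrame replaces its f_code and leaves its f_localsplus[0] untouched, because in these mocks the C-Python offset of f_localsplus coincides with the Stackless offset of f_code.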

@scoder
Contributor

scoder commented Aug 10, 2018

  1. Users can disable the usage of f_localsplus by setting the macro CYTHON_FAST_PYCALL=0 at C compile time, and Cython could also do that automatically when compiling against Stackless. (Ok, doesn't solve the problem of exchanging extensions between CPython and Stackless.)

I'd vote for 2). A static C variable in the call helper function to cache the offset should be enough.
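
A rough sketch of what such a cached offset could look like, using mock types rather than the real frame object. It simplifies the idea: instead of Py_SIZE(f) and the code object, it assumes that the runtime reports the fixed size of its own frame struct (DemoTypeInfo.basicsize, a hypothetical stand-in) and that f_localsplus is the last member, with no trailing padding. All Demo* and demo_* names are invented for illustration:

```c
#include <assert.h>
#include <stddef.h>
#include <string.h>

/* Hypothetical stand-in for size information published by the runtime. */
typedef struct { size_t basicsize; } DemoTypeInfo;

/* Layout the extension was compiled against. */
typedef struct {
    void *f_back; void *f_code;
    int f_blockstack[60];
    void *f_localsplus[1];
} DemoFrameA;

/* Layout the interpreter actually uses at run time. */
typedef struct {
    void *f_back; void *f_execute;
    int f_blockstack[60];
    void *f_code;
    void *f_localsplus[1];
} DemoFrameB;

/* Computes the offset of f_localsplus once and caches it in a static
   variable. Assumes f_localsplus is the last member of the frame struct. */
static size_t demo_localsplus_offset(const DemoTypeInfo *runtime_type)
{
    static size_t cached_offset = 0;   /* 0 means "not computed yet" */
    if (cached_offset == 0) {
        assert(runtime_type->basicsize > sizeof(void *));
        cached_offset = runtime_type->basicsize - sizeof(void *);
    }
    return cached_offset;
}

/* Returns a pointer to the first locals slot using the cached offset. */
static void **demo_get_localsplus(void *frame, const DemoTypeInfo *runtime_type)
{
    return (void **)((char *)frame + demo_localsplus_offset(runtime_type));
}
```

In this sketch, a caller compiled against DemoFrameA that goes through demo_get_localsplus() reaches the real f_localsplus of a DemoFrameB and leaves f_code alone, whereas the compile-time offsetof(DemoFrameA, f_localsplus) would point one slot too early.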

I should also note that most of the fast-call code in Cython is essentially a copy from CPython. There is PEP 580 to generalise the fast-call protocol (and some other things) in Py3.8, assuming that it finally gets some traction (endorsing it on python-dev is welcome).

@akruis
Author

akruis commented Aug 14, 2018

@scoder I'm glad you like option 2). I created pull request #2556.

@scoder scoder added this to the 0.29 milestone Aug 14, 2018
@scoder scoder closed this as completed Aug 15, 2018
@akruis
Author

akruis commented Aug 17, 2018

Thanks for resolving this issue so quickly.
Just in case you need to access other fields of PyFrameObject too, here is some brief documentation: https://github.com/stackless-dev/stackless/wiki/Portable-usage-of-PyFrameObject
