List of Python internals #4635
Comments
I have an idea. Can we create internals documentation and add it there? The Cython codebase is pretty complex, so adding details about Cython internals to the documentation could be useful and could help people contribute to this project. There is already the HackerGuide, so maybe we can extend the documentation to cover this as well. |
So the background to this is that:
Internals documentation might be useful (but it becomes unhelpful very fast if the code is changed and the documentation isn't). I'm not sure this list would be a useful part of it though. |
Thanks @da-woods for digging this up. I have a couple of comments.
|
Yeah that is what I meant. I'd missed that detail.
Yes, agreed (especially when the older Python version is Py2). I'm trying to be thorough at this stage, but once I've gone through everything I'll start trying to work out what might genuinely be a problem. |
I think I'm calling this list as complete as it's going to be at this stage. |
This one has been made public in python/cpython#114626 but that now causes my Cython build to fail:
|
Actually looks like that was fixed already in python/cpython#115561. Sorry for the noise. |
Just now noticing
|
Not really a bug report or an enhancement: I'm just trying to document all the places that Cython uses Python internals (non-public APIs) so that we have a reasonable idea of what might break with C API changes. Will also document where Cython feature flags provide a "public API only" code path.
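As a rough illustration of the kind of scan this list is based on, the sketch below applies the crude _Py regex quoted in this issue to a small C snippet. The snippet and the helper name find_internal_names are made up for the example and are not taken from Cython's sources.

```python
import re

# The pattern quoted in this issue: an identifier starting with "_Py"
# (case-insensitive) that is preceded by a non-word character.
INTERNAL_NAME = re.compile(r"""(?<=[^\w])_Py[\w]*""", flags=re.IGNORECASE)


def find_internal_names(source: str) -> list[str]:
    """Return the sorted, de-duplicated _Py* names found in C source text."""
    return sorted(set(INTERNAL_NAME.findall(source)))


# Hypothetical snippet standing in for a file under Cython/Utility/.
sample = """
#if CYTHON_COMPILING_IN_CPYTHON
    _Py_NewReference(op);
    result = _PyDict_NewPresized(n);
#else
    result = PyDict_New();
#endif
"""

print(find_internal_names(sample))
# prints ['_PyDict_NewPresized', '_Py_NewReference']
```

Because of the lookbehind, Cython's own __Pyx_ helpers are not reported: the second underscore is preceded by a word character. The pattern also cannot match a name at the very start of the text, and it cannot see internals that are reached through struct fields rather than function names.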
The list is likely incomplete (currently based on a very crude regex search for
re.compile(r"""(?<=[^\w])_Py[\w]*""", flags=re.IGNORECASE)
plus a few other bits that I know about).
Cython .pxd includes
- array.pxd (https://github.com/cython/cython/blob/master/Cython/Includes/cpython/array.pxd) - Cython provides a pxd file allowing users access to the array.array internals. This is documented as being CPython-specific internals and made available to users on that basis.
- Cython/Includes/cpython/pylifecycle.pxd provides access to _Py_InitializeEx_Private, _Py_PyAtExit, _Py_RestoreSignals, _Py_CheckPython3, _Py_gitidentifier, _Py_gitversion on a similar "own risk" basis.
Internal functions
- _Py_NewReference is used in Cython/Utility/AsyncGen.c and Coroutine.c. Can probably be replaced with __Pyx_NewRef.
- _Py_TPFLAGS_HAVE_VECTORCALL in Cython/Utility/CythonFunction.c. Does not look to be guarded.
- _Py_AS_GC is used in Cython/Utility/Coroutine.c to access the field ->gc.gc_refs. Guarded only by CYTHON_COMPILING_IN_CPYTHON.
- _Py_DEC_REFTOTAL in Coroutine.c. Guarded by CYTHON_COMPILING_IN_CPYTHON.
- _PyEval_EvalFrameDefault, _PySet_Dummy
- _PyStack_AsDict is used in Cython/Utility/FunctionArguments. Guarded by CYTHON_METH_FASTCALL.
- _PyObject_GetDictPtr in Cython/Utility/Exceptions.c, ObjectHandling.c.
- _PyTraceback_Add in Cython/Utility/Exceptions.c. Used in the "limited API" code path(!), presumably because the public API code path goes into more internals.
- _PySet_NextEntry, _PyList_Extend, and _PyDict_Pop - they are all guarded by CYTHON_COMPILING_IN_CPYTHON with alternative code paths in place.
- _PyTrash_thread_deposit_object and _PyTrash_thread_destroy_chain are used in ExtensionTypes.c for old versions of CPython. Not needed for newer versions.
- _PyErr_FormatFromCause is used in Coroutine.c with a version check.
- _PyGen_Send is used in Coroutine.c with version checks and an alternate code path available.
- _PyGen_SetStopIterationValue is used in the alternate code path for _PyGen_Send.
- _PyBytes_Join (and _PyString_Join) are used in StringTools.c but with an alternative implementation available for non-CPython.
- _PyUnicode_FastCopyCharacters is used in StringTools.c and ObjectHandling.c but with version checks and an alternative implementation.
- _PyObject_NextNotImplemented - ObjectHandling.c. Guarded by CYTHON_USE_TYPE_SLOTS so an alternative code path exists.
- _PyDict_SetItem_KnownHash and _PyDict_GetItem_KnownHash - ObjectHandling.c. Guarded by a version check so an alternative code path exists.
- _PyObject_GenericGetAttrWithDict - ObjectHandling.c. Guarded by a version check (and CYTHON_USE_TYPE_SLOTS) so an alternative code path exists.
- _PyObject_GetDictPtr in ObjectHandling.c. Used in a few places, but looks to be guarded. The guards are inconsistent between uses (CYTHON_UNPACK_METHODS && CYTHON_COMPILING_IN_CPYTHON && CYTHON_USE_PYTYPE_LOOKUP, CYTHON_USE_DICT_VERSIONS && CYTHON_USE_TYPE_SLOTS) so a little fiddly to replace if changed, but not impossible.
- _PyCFunction_FastCallDict and _PyCFunction_FastCallKeywords - used in ObjectHandling.c for older Python versions.
- _PyMethodDescr_FastCallKeywords is used in ObjectHandling.c for current Python versions. Looks like a shortcut that would be easily disabled if needed.
- _PyLong_FromByteArray is used in TypeConversion.c unguarded.
- _PyLong_AsByteArray is used in TypeConversion.c with a version guard. It looks like conversion of large number strings to Python longs fails with a runtime exception without it.
- _PyAsyncGen_MAXFREELIST is used in AsyncGen.c. There is a check that it's defined (and redefinition). The assumption is that it's a macro (which probably has to be true in C?). Potentially risky because I think there are plans to unify freelist implementations in CPython (but probably easily removed from Cython if needed).
- _PyCFunctionFast and _PyCFunctionFastWithKeywords are used in current Python versions (ModuleSetupCode.c and CythonFunction.c). Although underscore-prefixed, they are in the Python documentation.
- _PyThreadState_UncheckedGet is used in current Python versions (ModuleSetupCode.c) but alternative code paths exist if it ever goes missing.
- _PyThreadState_Current is used in ModuleSetupCode.c in very old Python versions.
- _PyDict_NewPresized is used in ModuleSetupCode.c - it's easily replaced with the less efficient PyDict_New if needed though.
- _PyDict_GetItem_KnownHash is used in ModuleSetupCode.c with version checks. Alternative code paths are available.
- _PyUnicode_Ready is used in ModuleSetupCode.c. Alternative code is in place for the expected removal of the concept of "unicode readiness" in Python 3.12.
Cython feature flags
- _PyType_Lookup is used in a few places but guarded by CYTHON_USE_PYTYPE_LOOKUP.
- _PyGC_FINALIZED in ModuleNode.py - guarded by CYTHON_USE_TP_FINALIZE. However, turning this off does disable some features of cdef classes.
- _PyErr_StackItem - guarded by CYTHON_USE_EXC_INFO_STACK.
- _PyString_Eq is used in FunctionArguments.c but only for very old Python versions.
- _PyStack_AsDict is used in the macro __Pyx_KwargsAsDict_FASTCALL in FunctionArguments.c on recent versions of Python. It's guarded by CYTHON_METH_FASTCALL but realistically this is a flag we won't want to disable.
- CYTHON_USE_UNICODE_WRITER guards use of _PyUnicodeWriter_Init and related functions. It's currently turned off on Python 3.11a since _PyFloat_FormatAdvancedWriter and _PyLong_FormatAdvancedWriter disappeared.
- CYTHON_VECTORCALL guards _PyVectorcall_Function. It looks like it has now been made public with PyVectorcall_Function though, so non-issue.
- CYTHON_PEP393_ENABLED (true for recent versions I think) guards _PyUnicode_AsDefaultEncodedString.
- CYTHON_USE_PYLONG_INTERNALS enables the use of ob_digit on long objects (with all the assumptions about how those internals are stored). Also enables _PyLong_Copy in Builtins.c.
- CYTHON_USE_PYLIST_INTERNALS uses internal fields on list objects (e.g. ->allocated). Fallback code paths exist for everything.
- CYTHON_USE_UNICODE_INTERNALS guards access to internal fields on unicode (and also bytes), including ob_shash, but also direct access into the memory buffer. Fallback code paths exist.
- CYTHON_USE_EXC_INFO_STACK accesses _PyErr_StackItem, including fields like previous_item, mainly in Coroutine.c and Exceptions.c. Replacement code paths exist, but I'm not sure if they cover all functionality. It is accessed from the PyThreadState object.
Internal field access
This section is fairly incomplete since I haven't yet worked out a good way of searching for these
- self->ob_refcnt in Coroutine.c
- --Py_TYPE(self)->tp_frees; --Py_TYPE(self)->tp_allocs; in Coroutine.c (Guarded by CYTHON_COMPILING_IN_CPYTHON)
- ob_item of tuple and list. Guarded only by CYTHON_COMPILING_IN_CPYTHON
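One possible way to search for direct field access is sketched below. This is only an assumption about what might work: the field list FIELDS is a hand-picked, non-exhaustive selection of names mentioned in this issue, and the helper find_field_accesses and the sample C snippet are hypothetical.

```python
import re

# Hypothetical, non-exhaustive list of struct fields to look for.
FIELDS = ("ob_refcnt", "ob_item", "ob_digit", "ob_shash",
          "allocated", "f_back", "tb_frame")

# Match "->field" or ".field" where the field is one of the names above.
FIELD_ACCESS = re.compile(r"(?:->|\.)\s*(" + "|".join(FIELDS) + r")\b")


def find_field_accesses(source: str) -> list[tuple[int, str]]:
    """Return (line number, field name) pairs for each direct field access."""
    hits = []
    for lineno, line in enumerate(source.splitlines(), start=1):
        for match in FIELD_ACCESS.finditer(line):
            hits.append((lineno, match.group(1)))
    return hits


# Made-up C lines, not copied from Cython's sources.
sample = """\
self->ob_refcnt = 1;
item = ((PyListObject *)o)->ob_item[i];
n = Py_SIZE(o);
"""

print(find_field_accesses(sample))
# prints [(1, 'ob_refcnt'), (2, 'ob_item')]
```

A search like this only finds fields that are already on the list, and it cannot tell which struct a field belongs to, so each hit still needs checking by hand.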
Frames/Tracebacks
- PyTracebackObject (tb_frame) and PyFrameObject (f_back mainly). The alternative code paths don't really work in PyPy so this is probably the Cython feature most dependent on internal detail.
- PyFrameObject is created using the public PyFrame_New, but its internal fields are not accessed. This is to create exception tracebacks so is used everywhere in Cython.
- f_localsplus of frame objects, only on old versions of Python I think (on new versions it's covered by vectorcall)
- f_trace, f_lineno (but via a macro that can become a no-op easily)
- c_tracefunc, c_traceobj, c_profilefunc, c_profileobj, use_tracing, tracing
- co_flags
These are only used if linetracing/profiling is enabled, so not required for the "normal" functioning of Cython.
Other
- Py_TPFLAGS_HEAPTYPE and disabling the GC to enable multiple inheritance on non-heap types. Known to cause problems on some alternative implementations (see "[BUG] Py_TPFLAGS_HEAPTYPE is set on static PyTypeObject", #4200).
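Many entries in this list record which macro or version check guards an internal. As a rough sketch of how those guards might be collected automatically, the code below tracks open #if blocks and records the enclosing conditions for each _Py name. The helper guards_for_internals and the sample C text are hypothetical, and the name pattern is a small variation on the issue's regex that also matches at the start of a line.

```python
import re

# Variation on the issue's pattern: also matches a name at the start of a line.
INTERNAL_NAME = re.compile(r"(?<!\w)_Py\w*", flags=re.IGNORECASE)
DIRECTIVE = re.compile(r"^\s*#\s*(ifdef|ifndef|if|elif|else|endif)\b(.*)$")


def guards_for_internals(source: str) -> dict[str, list[list[str]]]:
    """Map each _Py* name to the stack of open #if conditions at each use."""
    stack: list[str] = []                    # conditions of currently open blocks
    found: dict[str, list[list[str]]] = {}
    for line in source.splitlines():
        directive = DIRECTIVE.match(line)
        if directive:
            kind, cond = directive.group(1), directive.group(2).strip()
            if kind in ("if", "ifdef", "ifndef"):
                stack.append(f"#{kind} {cond}")
            elif kind == "elif" and stack:
                stack[-1] = f"#elif {cond}"
            elif kind == "else" and stack:
                stack[-1] = f"#else of {stack[-1]}"
            elif kind == "endif" and stack:
                stack.pop()
            continue  # names inside directives themselves are not recorded
        for name in INTERNAL_NAME.findall(line):
            found.setdefault(name, []).append(list(stack))
    return found


# Made-up C text; the guards shown are illustrative, not Cython's actual ones.
sample = """\
#if CYTHON_COMPILING_IN_CPYTHON
    _Py_NewReference(op);
#if PY_VERSION_HEX < 0x030C0000
    _PyUnicode_Ready(s);
#endif
#else
    Py_INCREF(op);
#endif
_PyLong_FromByteArray(buf, n, 1, 0);
"""

for name, uses in guards_for_internals(sample).items():
    print(name, uses)
```

An empty guard list marks an unguarded use. The sketch ignores comments, string literals and line continuations, so its output is a starting point for manual review and not a definitive answer.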