Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Isolate the _io extension module #101819

Closed
Tracked by #103092
erlend-aasland opened this issue Feb 11, 2023 · 10 comments
Closed
Tracked by #103092

Isolate the _io extension module #101819

erlend-aasland opened this issue Feb 11, 2023 · 10 comments
Assignees
Labels
3.12 bugs and security fixes extension-modules C modules in the Modules dir topic-subinterpreters type-feature A feature request or enhancement

Comments

@erlend-aasland
Copy link
Contributor

erlend-aasland commented Feb 11, 2023

Isolate the _io extension module by moving all global variables to module state, porting static types to heap types, and implementing multi-phase init. All global variables in the _io module are static types:

  • Modules/_io/bufferedio.c: PyBufferedIOBase_Type
  • Modules/_io/bufferedio.c: PyBufferedRWPair_Type
  • Modules/_io/bufferedio.c: PyBufferedRandom_Type
  • Modules/_io/bufferedio.c: PyBufferedReader_Type
  • Modules/_io/bufferedio.c: PyBufferedWriter_Type
  • Modules/_io/bytesio.c: PyBytesIO_Type
  • Modules/_io/bytesio.c: _PyBytesIOBuffer_Type
  • Modules/_io/fileio.c: PyFileIO_Type
  • Modules/_io/iobase.c: PyIOBase_Type
  • Modules/_io/iobase.c: PyRawIOBase_Type
  • Modules/_io/textio.c: PyIncrementalNewlineDecoder_Type
  • Modules/_io/textio.c: PyTextIOBase_Type
  • Modules/_io/textio.c: PyTextIOWrapper_Type
  • Modules/_io/winconsoleio.c: PyWindowsConsoleIO_Type

Converting the static types to heap types involves applying PEP-687 to _io.
Adapting multi-phase init involves applying PEP-489 to _io.

Linked PRs

@erlend-aasland erlend-aasland added type-feature A feature request or enhancement extension-modules C modules in the Modules dir topic-subinterpreters labels Feb 11, 2023
@erlend-aasland
Copy link
Contributor Author

cc. @kumaraditya303

@kumaraditya303 kumaraditya303 added the 3.12 bugs and security fixes label Feb 11, 2023
@erlend-aasland
Copy link
Contributor Author

erlend-aasland commented Feb 13, 2023

I see that _PyWindowsConsoleIO_Type is marked with PyAPI_DATA, but I can't find it in Misc/stable_abi.toml. @zooba or @eryksun, do you know why this is so? I did a quick code search on GitHub, but I only got hits in various clones of the CPython code base.

#ifndef Py_LIMITED_API
#ifdef MS_WINDOWS
extern PyTypeObject PyWindowsConsoleIO_Type;
PyAPI_DATA(PyObject *) _PyWindowsConsoleIO_Type;
#define PyWindowsConsoleIO_Check(op) (PyObject_TypeCheck((op), (PyTypeObject*)_PyWindowsConsoleIO_Type))
#endif /* MS_WINDOWS */
#endif /* Py_LIMITED_API */

UPDATE: I see it was added by Steve in 312cef7, probably because of the PyWindowsConsoleIO_Check checks in PC/_testconsole.c and Python/pylifecycle.c:

if (!PyWindowsConsoleIO_Check(file)) {
PyErr_SetString(PyExc_TypeError, "expected raw console object");
return NULL;
}

extern PyTypeObject PyWindowsConsoleIO_Type;
# define PyWindowsConsoleIO_Check(op) \
(PyObject_TypeCheck((op), &PyWindowsConsoleIO_Type))
#endif

cpython/Python/pylifecycle.c

Lines 2359 to 2363 in 2db2c4b

#ifdef MS_WINDOWS
/* Windows console IO is always UTF-8 encoded */
if (PyWindowsConsoleIO_Check(raw))
encoding = L"utf-8";
#endif

Perhaps exposing _PyWindowsConsoleIO_Type through a CPython internal capsulated C API could be an option?

@erlend-aasland
Copy link
Contributor Author

erlend-aasland commented Feb 13, 2023

Other issues that must be resolved before continuing: _PyBytesIOBuffer_Type is used by _testcapimodule. Since that type is not exposed in _io, we cannot simply use PyImport_Import and look it up. One possibility is to expose it through a CPython internal capsule C API.

@erlend-aasland
Copy link
Contributor Author

... or we could simply expose the types mentioned in #101819 (comment) and #101819 (comment) in the _io module.

@zooba
Copy link
Member

zooba commented Feb 13, 2023

Internal types are allowed to be exported without being stable API. On Windows, only explicitly exported names are available outside the DLL - it's different on other platforms, so please don't bring their assumptions over 😉

@encukou
Copy link
Member

encukou commented Feb 13, 2023

The type is explicitly not part of the limited API/stable ABI, see #ifndef Py_LIMITED_API.

@erlend-aasland
Copy link
Contributor Author

Internal types are allowed to be exported without being stable API. On Windows, only explicitly exported names are available outside the DLL - it's different on other platforms, so please don't bring their assumptions over 😉

IIUC, we can remove this from the DLL and instead export it via _io and look it up from the module using the C API (PyImport_Import, etc.).

@zooba
Copy link
Member

zooba commented Feb 13, 2023

Yeah, if it's got a leading underscore then we can do whatever we need to make our tests work.

erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 14, 2023
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 14, 2023
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 14, 2023
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 14, 2023
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 14, 2023
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 14, 2023
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 14, 2023
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 14, 2023
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue Feb 14, 2023
miss-islington pushed a commit that referenced this issue Feb 15, 2023
carljm added a commit to carljm/cpython that referenced this issue May 10, 2023
* main:
  pythonGH-102181: Improve specialization stats for SEND (pythonGH-102182)
  pythongh-103000: Optimise `dataclasses.asdict` for the common case (python#104364)
  pythongh-103538: Remove unused TK_AQUA code (pythonGH-103539)
  pythonGH-87695: Fix OSError from `pathlib.Path.glob()` (pythonGH-104292)
  pythongh-104263: Rely on Py_NAN and introduce Py_INFINITY (pythonGH-104202)
  pythongh-104010: Separate and improve docs for `typing.get_origin` and `typing.get_args` (python#104013)
  pythongh-101819: Adapt _io._BufferedIOBase_Type methods to Argument Clinic (python#104355)
  pythongh-103960: Dark mode: invert image brightness (python#103983)
  pythongh-104252: Immortalize Py_EMPTY_KEYS (pythongh-104253)
  pythongh-101819: Clean up _io windows console io after pythongh-104197 (python#104354)
  pythongh-101819: Harden _io init (python#104352)
  pythongh-103247: clear the module cache in a test in test_importlib/extensions/test_loader.py (pythonGH-104226)
  pythongh-103848: Adds checks to ensure that bracketed hosts found by urlsplit are of IPv6 or IPvFuture format (python#103849)
  pythongh-74895: adjust tests to work on Solaris (python#104326)
  pythongh-101819: Refactor _io in preparation for module isolation (python#104334)
  pythongh-90953: Don't use deprecated AST nodes in clinic.py (python#104322)
  pythongh-102327: Extend docs for "url" and "headers" parameters to HTTPConnection.request()
  pythongh-104328: Fix typo in ``typing.Generic`` multiple inheritance error message (python#104335)
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue May 11, 2023
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue May 11, 2023
…ument Clinic

Pass defining class, so we can easily fetch module state in the future.
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue May 11, 2023
- Add PyIOBase_Type to _io module state
- Pass defining class to _io._IOBase.fileno
carljm added a commit to carljm/cpython that referenced this issue May 11, 2023
* main: (27 commits)
  pythongh-87849: fix SEND specialization family definition (pythonGH-104268)
  pythongh-101819: Adapt _io.IOBase.seek and _io.IOBase.truncate to Argument Clinic (python#104384)
  pythongh-101819: Adapt _io._Buffered* methods to Argument Clinic (python#104367)
  pythongh-101819: Refactor `_io` futher in preparation for module isolation (python#104369)
  pythongh-101819: Adapt _io.TextIOBase methods to Argument Clinic (python#104383)
  pythongh-101117: Improve accuracy of sqlite3.Cursor.rowcount docs (python#104287)
  pythonGH-92184: Convert os.altsep to '/' in filenames when creating ZipInfo objects (python#92185)
  pythongh-104357: fix inlined comprehensions that close over iteration var (python#104368)
  pythonGH-90208: Suppress OSError exceptions from `pathlib.Path.glob()` (pythonGH-104141)
  pythonGH-102181: Improve specialization stats for SEND (pythonGH-102182)
  pythongh-103000: Optimise `dataclasses.asdict` for the common case (python#104364)
  pythongh-103538: Remove unused TK_AQUA code (pythonGH-103539)
  pythonGH-87695: Fix OSError from `pathlib.Path.glob()` (pythonGH-104292)
  pythongh-104263: Rely on Py_NAN and introduce Py_INFINITY (pythonGH-104202)
  pythongh-104010: Separate and improve docs for `typing.get_origin` and `typing.get_args` (python#104013)
  pythongh-101819: Adapt _io._BufferedIOBase_Type methods to Argument Clinic (python#104355)
  pythongh-103960: Dark mode: invert image brightness (python#103983)
  pythongh-104252: Immortalize Py_EMPTY_KEYS (pythongh-104253)
  pythongh-101819: Clean up _io windows console io after pythongh-104197 (python#104354)
  pythongh-101819: Harden _io init (python#104352)
  ...
carljm added a commit to carljm/cpython that referenced this issue May 11, 2023
* main:
  pythongh-87849: fix SEND specialization family definition (pythonGH-104268)
  pythongh-101819: Adapt _io.IOBase.seek and _io.IOBase.truncate to Argument Clinic (python#104384)
  pythongh-101819: Adapt _io._Buffered* methods to Argument Clinic (python#104367)
  pythongh-101819: Refactor `_io` futher in preparation for module isolation (python#104369)
  pythongh-101819: Adapt _io.TextIOBase methods to Argument Clinic (python#104383)
  pythongh-101117: Improve accuracy of sqlite3.Cursor.rowcount docs (python#104287)
  pythonGH-92184: Convert os.altsep to '/' in filenames when creating ZipInfo objects (python#92185)
  pythongh-104357: fix inlined comprehensions that close over iteration var (python#104368)
  pythonGH-90208: Suppress OSError exceptions from `pathlib.Path.glob()` (pythonGH-104141)
carljm added a commit to carljm/cpython that referenced this issue May 11, 2023
* main:
  pythongh-104057: Fix direct invocation of test_support (pythonGH-104069)
  pythongh-87729: improve hit rate of LOAD_SUPER_ATTR specialization (python#104270)
  pythongh-101819: Fix inverted debug preprocessor check in winconsoleio.c (python#104388)
erlend-aasland added a commit that referenced this issue May 12, 2023
- Add PyIOBase_Type to _io module state
- Pass defining class to _io._IOBase.fileno
erlend-aasland added a commit to erlend-aasland/cpython that referenced this issue May 12, 2023
When preparing the _io extension module for isolation, many methods were
adapted to Argument Clinic. Some of these used the '*args: object'
signature, which is incorrect. These are now corrected to an exact
signature, and marked unused, since they are stub methods.
carljm added a commit to carljm/cpython that referenced this issue May 12, 2023
* main:
  pythongh-91896: Fixup some docs issues following ByteString deprecation (python#104422)
  pythonGH-104371: check return value of calling `mv.release` (python#104417)
  pythongh-104415: Fix refleak tests for `typing.ByteString` deprecation (python#104416)
  pythonGH-86275: Implementation of hypothesis stubs for property-based tests, with zoneinfo tests (python#22863)
  pythonGH-103082: Filter LINE events in VM, to simplify tool implementation. (pythonGH-104387)
  pythongh-93649: Split gc- and allocation tests from _testcapimodule.c (pythonGH-104403)
  pythongh-104389: Add 'unused' keyword to Argument Clinic C converters (python#104390)
  pythongh-101819: Prepare _io._IOBase for module state (python#104386)
  pythongh-104413: Fix refleak when super attribute throws AttributeError (python#104414)
  Fix refleak in `super_descr_get` (python#104408)
  pythongh-87526: Remove dead initialization from _zoneinfo parse_abbr() (python#24700)
  pythongh-91896: Improve visibility of `ByteString` deprecation warnings (python#104294)
  pythongh-104371: Fix calls to `__release_buffer__` while an exception is active (python#104378)
  pythongh-104377: fix cell in comprehension that is free in outer scope (python#104394)
  pythongh-104392: Remove _paramspec_tvars from typing (python#104393)
  pythongh-104396: uuid.py to skip platform check for emscripten and wasi (pythongh-104397)
  pythongh-99108: Refresh HACL* from upstream (python#104401)
  pythongh-104301: Allow leading whitespace in disambiguated pdb statements (python#104342)
kumaraditya303 added a commit that referenced this issue May 15, 2023
Co-authored-by: Kumar Aditya <59607654+kumaraditya303@users.noreply.github.com>
Co-authored-by: Victor Stinner <vstinner@python.org>
erlend-aasland added a commit that referenced this issue May 15, 2023
…104418)

When preparing the _io extension module for isolation, many methods were
adapted to Argument Clinic. Some of these used the '*args: object'
signature, which is incorrect. These are now corrected to an exact
signature, and marked unused, since they are stub methods.
carljm added a commit to carljm/cpython that referenced this issue May 15, 2023
* main: (29 commits)
  pythongh-101819: Fix _io clinic input for unused base class method stubs (python#104418)
  pythongh-101819: Isolate `_io` (python#101948)
  Bump mypy from 1.2.0 to 1.3.0 in /Tools/clinic (python#104501)
  pythongh-104494: Update certain Tkinter pack/place tests for Tk 8.7 errors (python#104495)
  pythongh-104050: Run mypy on `clinic.py` in CI (python#104421)
  pythongh-104490: Consistently define phony make targets (python#104491)
  pythongh-67056: document that registering/unregistering an atexit func from within an atexit func is undefined (python#104473)
  pythongh-104487: PYTHON_FOR_REGEN must be minimum Python 3.10 (python#104488)
  pythongh-101282: move BOLT config after PGO (pythongh-104493)
  pythongh-104469 Convert _testcapi/float.c to use AC (pythongh-104470)
  pythongh-104456: Fix ref leak in _ctypes.COMError (python#104457)
  pythongh-98539: Make _SSLTransportProtocol.abort() safe to call when closed (python#104474)
  pythongh-104337: Clarify random.gammavariate doc entry  (python#104410)
  Minor improvements to typing docs (python#104465)
  pythongh-87092: avoid gcc warning on uninitialized struct field in assemble.c (python#104460)
  pythonGH-71383: IDLE - Document testing subsets of modules (python#104463)
  pythongh-104454: Fix refleak in AttributeError_reduce (python#104455)
  pythongh-75710: IDLE - add docstrings and comments to editor module (python#104446)
  pythongh-91896: Revert some very noisy DeprecationWarnings for `ByteString` (python#104424)
  Add a mention of PYTHONBREAKPOINT to breakpoint() docs (python#104430)
  ...
carljm added a commit to carljm/cpython that referenced this issue May 15, 2023
* main: (204 commits)
  pythongh-101819: Fix _io clinic input for unused base class method stubs (python#104418)
  pythongh-101819: Isolate `_io` (python#101948)
  Bump mypy from 1.2.0 to 1.3.0 in /Tools/clinic (python#104501)
  pythongh-104494: Update certain Tkinter pack/place tests for Tk 8.7 errors (python#104495)
  pythongh-104050: Run mypy on `clinic.py` in CI (python#104421)
  pythongh-104490: Consistently define phony make targets (python#104491)
  pythongh-67056: document that registering/unregistering an atexit func from within an atexit func is undefined (python#104473)
  pythongh-104487: PYTHON_FOR_REGEN must be minimum Python 3.10 (python#104488)
  pythongh-101282: move BOLT config after PGO (pythongh-104493)
  pythongh-104469 Convert _testcapi/float.c to use AC (pythongh-104470)
  pythongh-104456: Fix ref leak in _ctypes.COMError (python#104457)
  pythongh-98539: Make _SSLTransportProtocol.abort() safe to call when closed (python#104474)
  pythongh-104337: Clarify random.gammavariate doc entry  (python#104410)
  Minor improvements to typing docs (python#104465)
  pythongh-87092: avoid gcc warning on uninitialized struct field in assemble.c (python#104460)
  pythonGH-71383: IDLE - Document testing subsets of modules (python#104463)
  pythongh-104454: Fix refleak in AttributeError_reduce (python#104455)
  pythongh-75710: IDLE - add docstrings and comments to editor module (python#104446)
  pythongh-91896: Revert some very noisy DeprecationWarnings for `ByteString` (python#104424)
  Add a mention of PYTHONBREAKPOINT to breakpoint() docs (python#104430)
  ...
@vstinner
Copy link
Member

@erlend-aasland erlend-aasland closed this as completed

Great achievement!

@serhiy-storchaka
Copy link
Member

It causes a crash during garbage collection. See #111049, #111211.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
3.12 bugs and security fixes extension-modules C modules in the Modules dir topic-subinterpreters type-feature A feature request or enhancement
Projects
Status: Done
Development

No branches or pull requests

6 participants