Tracking issue for no-gil/freethreaded work #4265

Open
10 tasks
alex opened this issue Jun 20, 2024 · 9 comments

Comments

@alex
Contributor

alex commented Jun 20, 2024

We didn't have a dedicated issue for this, so now there's one.

TODO:

  • Add a cfg for no-gil, but only allowed behind an experimental feature
  • Adopt new owned-reference-friendly C APIs
    • PyDict_GetItemRef
    • PyList_GetItemRef
    • PyDict_Next
    • PyWeakref_GetRef
    • PyImport_AddModuleRef
  • Identify places that assume a Python<'_> indicates only a single thread is executing:
    • pyo3::sync::GILOnceCell
    • PyClassBorrowChecker
    • GILProtected
    • ...
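
One way to picture the last group of items: a cell that treats holding a `Python<'_>` token as proof that no other thread is running has to bring its own synchronization once that assumption no longer holds. The sketch below uses only the standard library (`std::sync::OnceLock`), not pyo3; `CELL`, `INIT_CALLS`, `expensive_init` and `race_to_initialize` are hypothetical names chosen for illustration, and this is not how pyo3 implements or plans to change `GILOnceCell`.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};
use std::sync::OnceLock;
use std::thread;

/// Counts how many times the initializer actually ran (illustration only).
static INIT_CALLS: AtomicUsize = AtomicUsize::new(0);

/// A cell that synchronizes on its own instead of relying on an outer lock.
static CELL: OnceLock<u64> = OnceLock::new();

/// Stand-in for an expensive one-time initialization.
fn expensive_init() -> u64 {
    INIT_CALLS.fetch_add(1, Ordering::SeqCst);
    42
}

/// Spawn `n` threads that all race to initialize `CELL`, and collect
/// the value each thread observed.
fn race_to_initialize(n: usize) -> Vec<u64> {
    let handles: Vec<_> = (0..n)
        .map(|_| thread::spawn(|| *CELL.get_or_init(expensive_init)))
        .collect();
    handles.into_iter().map(|h| h.join().unwrap()).collect()
}
```

Every thread sees the same value and the initializer runs once, without any thread having to hold a global lock while it reads the cell.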
@ngoldbaum
Contributor

As a tiny piece of this and to try to learn the library better, I'm working on adding wrappers for the new GetItemRef C API functions in the 3.13 stable API. These are needed to be fully safe for free-threaded python and are nice to have anyway on older versions because strong references are easier to reason about.
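
As a rough analogy for why strong references are easier to reason about, here is a sketch in plain Rust with no pyo3 or CPython calls. `SharedDict` and its methods are hypothetical names; `Arc` stands in for a Python strong reference, and `get_item_ref` mirrors the idea behind `PyDict_GetItemRef` only loosely.

```rust
use std::collections::HashMap;
use std::sync::{Arc, Mutex};

/// Hypothetical stand-in for a dict shared between threads.
#[derive(Default)]
pub struct SharedDict {
    inner: Mutex<HashMap<String, Arc<String>>>,
}

impl SharedDict {
    /// Insert or overwrite a value.
    pub fn set_item(&self, key: &str, value: &str) {
        self.inner
            .lock()
            .unwrap()
            .insert(key.to_owned(), Arc::new(value.to_owned()));
    }

    /// Return a new strong reference to the value. The caller's handle
    /// stays valid even if the entry is removed afterwards.
    pub fn get_item_ref(&self, key: &str) -> Option<Arc<String>> {
        self.inner.lock().unwrap().get(key).cloned()
    }

    /// Remove an entry, reporting whether it existed.
    pub fn del_item(&self, key: &str) -> bool {
        self.inner.lock().unwrap().remove(key).is_some()
    }
}
```

A borrowed reference would only be safe for as long as nothing else can mutate the dict, which is the guarantee the GIL used to provide. The owned handle carries its own guarantee.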

@ngoldbaum
Contributor

ngoldbaum commented Jul 30, 2024

Just to update the current state of things: pyo3 builds against the free-threaded build if you do:

UNSAFE_PYO3_BUILD_FREE_THREADED=1 cargo build

If you use pyenv, you'll also need to locally delete or modify the .python-version file.

This very quickly crashes inside of mimalloc internals, ultimately inside of Py_InitializeEx:

  * frame #0: 0x000000010135af60 libpython3.13t.dylib`chacha_block + 448
    frame #1: 0x000000010134f7e8 libpython3.13t.dylib`_mi_os_get_aligned_hint + 172
    frame #2: 0x000000010135cd68 libpython3.13t.dylib`unix_mmap_prim + 136
    frame #3: 0x0000000101355e80 libpython3.13t.dylib`_mi_prim_alloc + 220
    frame #4: 0x000000010134fb30 libpython3.13t.dylib`mi_os_prim_alloc + 68
    frame #5: 0x0000000101348560 libpython3.13t.dylib`_mi_os_alloc_aligned + 352
    frame #6: 0x0000000101349a9c libpython3.13t.dylib`mi_reserve_os_memory_ex + 80
    frame #7: 0x0000000101347ee8 libpython3.13t.dylib`_mi_arena_alloc_aligned + 392
    frame #8: 0x000000010135bd00 libpython3.13t.dylib`mi_segment_alloc + 468
    frame #9: 0x0000000101354950 libpython3.13t.dylib`mi_segments_page_alloc + 1468
    frame #10: 0x000000010135ab94 libpython3.13t.dylib`mi_page_fresh_alloc + 56
    frame #11: 0x0000000101351f5c libpython3.13t.dylib`mi_find_page + 528
    frame #12: 0x0000000101344070 libpython3.13t.dylib`_mi_malloc_generic + 208
    frame #13: 0x000000010144e3d8 libpython3.13t.dylib`gc_alloc + 284
    frame #14: 0x000000010144e268 libpython3.13t.dylib`_PyObject_GC_New + 96
    frame #15: 0x000000010132042c libpython3.13t.dylib`PyDict_New + 84
    frame #16: 0x00000001013b3d24 libpython3.13t.dylib`_PyUnicode_InitGlobalObjects + 236
    frame #17: 0x000000010147e4dc libpython3.13t.dylib`pycore_interp_init + 72
    frame #18: 0x000000010147bb88 libpython3.13t.dylib`Py_InitializeFromConfig + 1360
    frame #19: 0x000000010147bc9c libpython3.13t.dylib`Py_InitializeEx + 144
    frame #20: 0x000000010006e5f0 pyo3-1eb544a7db3e1a47`pyo3::gil::prepare_freethreaded_python::_$u7b$$u7b$closure$u7d$$u7d$::h922d8fd5db1fd90c((null)={closure_env#0} @ 0x00000001710665c7, (null)=0x0000000171066640) at gil.rs:69:13
    frame #21: 0x0000000100047a4c pyo3-1eb544a7db3e1a47`std::sync::once::Once::call_once_force::_$u7b$$u7b$closure$u7d$$u7d$::h15baf6dd1f7316ea(p=0x0000000171066640) at once.rs:208:40
    frame #22: 0x000000010031a770 pyo3-1eb544a7db3e1a47`std::sys::sync::once::queue::Once::call::heacc08786c6d7dfa at queue.rs:183:21 [opt]
    frame #23: 0x00000001000478b4 pyo3-1eb544a7db3e1a47`std::sync::once::Once::call_once_force::h7b8eb88c3a02f292(self=0x00000001004dce70, f={closure_env#0} @ 0x000000017106671f) at once.rs:208:9
    frame #24: 0x00000001001d68b4 pyo3-1eb544a7db3e1a47`pyo3::gil::prepare_freethreaded_python::h316cd04b406e24c0 at gil.rs:66:5
    frame #25: 0x00000001001d6924 pyo3-1eb544a7db3e1a47`pyo3::gil::GILGuard::acquire::h2127069d9988a593 at gil.rs:174:21
    frame #26: 0x0000000100053000 pyo3-1eb544a7db3e1a47`pyo3::marker::Python::with_gil::h38979cd5e69873c3(f={closure_env#0} @ 0x000000017106679f) at marker.rs:403:21
    frame #27: 0x00000001001c6548 pyo3-1eb544a7db3e1a47`pyo3::conversions::std::array::tests::test_extract_non_iterable_to_array::h3c220ef1fe379cdf at array.rs:226:9
    frame #28: 0x000000010004a1b4 pyo3-1eb544a7db3e1a47`pyo3::conversions::std::array::tests::test_extract_non_iterable_to_array::_$u7b$$u7b$closure$u7d$$u7d$::h7de4fa3687a88518((null)=0x00000001710667fe) at array.rs:225:44

Just to make sure all of this is reproducible and we have some feedback on CI, I think I'm going to add a free-threaded CI job marked with continue-on-error with a test run that crashes like this.

@davidhewitt
Member

That sounds great to me, thanks!

ngoldbaum added a commit to ngoldbaum/pyo3 that referenced this issue Jul 31, 2024
ngoldbaum added a commit to ngoldbaum/pyo3 that referenced this issue Aug 1, 2024
github-merge-queue bot pushed a commit that referenced this issue Aug 1, 2024
* Update dict.get_item binding to use PyDict_GetItemRef

Refs #4265

* test: add test for dict.get_item error path

* test: add test for dict.get_item error path

* test: add test for dict.get_item error path

* fix: fix logic error in dict.get_item bindings

* update: apply david's review suggestions for dict.get_item bindings

* update: create ffi::compat to store compatibility shims

* update: move PyDict_GetItemRef bindings to spot in order from dictobject.h

* build: fix build warning with --no-default-features

* doc: expand release note fragments

* fix: fix clippy warnings

* respond to review comments

* Apply suggestion from @mejrs

* refactor so cfg is applied to functions

* properly set cfgs

* fix clippy lints

* Apply @davidhewitt's suggestion

* deal with upstream deprecation of new_bound
@alex
Contributor Author

alex commented Aug 2, 2024

I added a new checkbox for "Adopt new owned-reference-friendly C APIs". If we have a list of all the ones we need, I can make those sub-checkboxes.

@ngoldbaum
Contributor

If we have a list of all the ones we need, I can make those sub-checkboxes.

I think PyDict_GetItemRef and PyList_GetItemRef are the most important ones. There's a listing of the remaining ones in the HOWTO guide for free-threading in the CPython docs: https://docs.python.org/3.13/howto/free-threading-extensions.html#borrowed-references

I also had a chat with @davidhewitt today and in addition to GILOnceCell, he pointed to GILProtected and PyCell as spots that make strong assumptions about the GIL.

Our first idea is to make GILProtected a no-op on Py_GIL_DISABLED builds, although we'll need to see if that has major fallout on user code. As a first pass, PyCell needs atomic increments and decrements to avoid data races in the free-threaded build.
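
To make the atomic increment/decrement idea concrete, here is a minimal sketch of a borrow flag built on `std::sync::atomic`. The type `AtomicBorrowFlag`, its method names and the `WRITER` sentinel are assumptions made for illustration; this is not pyo3's borrow checker, only one way such a flag could be structured.

```rust
use std::sync::atomic::{AtomicUsize, Ordering};

/// Sentinel meaning "exclusively (mutably) borrowed". Hypothetical choice.
const WRITER: usize = usize::MAX;

/// 0 = unborrowed, 1..WRITER-1 = number of shared borrows, WRITER = exclusive.
#[derive(Default)]
pub struct AtomicBorrowFlag(AtomicUsize);

impl AtomicBorrowFlag {
    /// Try to take a shared borrow; fails while an exclusive borrow is active
    /// or when the counter would collide with the sentinel.
    pub fn try_borrow(&self) -> bool {
        let mut cur = self.0.load(Ordering::Relaxed);
        loop {
            if cur == WRITER || cur == WRITER - 1 {
                return false;
            }
            match self.0.compare_exchange_weak(
                cur,
                cur + 1,
                Ordering::Acquire,
                Ordering::Relaxed,
            ) {
                Ok(_) => return true,
                // Another thread changed the flag; retry with the new value.
                Err(actual) => cur = actual,
            }
        }
    }

    /// Release one shared borrow.
    pub fn release_borrow(&self) {
        let prev = self.0.fetch_sub(1, Ordering::Release);
        debug_assert!(prev != 0 && prev != WRITER);
    }

    /// Try to take the exclusive borrow; only succeeds from the unborrowed state.
    pub fn try_borrow_mut(&self) -> bool {
        self.0
            .compare_exchange(0, WRITER, Ordering::Acquire, Ordering::Relaxed)
            .is_ok()
    }

    /// Release the exclusive borrow.
    pub fn release_borrow_mut(&self) {
        self.0.store(0, Ordering::Release);
    }
}
```

A plain `Cell<usize>` counter does the same bookkeeping with less overhead, but two threads incrementing it at once is a data race; the compare-and-swap loop is what makes the check-then-update step indivisible.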

In addition we need to use pyo3_ffi_check to update the assumptions the FFI bindings make about the free-threaded ABI. Doing this should hopefully fix some of the most egregious build issues. I am planning to work on that step next week.

I looked at adding a failing CI job, but that won't work right now because if you run the tests on a free-threaded build with --no-fail-fast the tests will eventually deadlock. At least as far as I can see there's no option in cargo to automatically kill hung tests that run longer than a configurable timeout. You can do it manually pretty easily with a macro but I'd prefer not to do that and instead hold off on adding CI until the tests are runnable without deadlocks. Hopefully that won't be too long :)
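
For reference, the manual workaround could look roughly like the helper below, written as a plain function rather than a macro. `run_with_timeout` is a hypothetical name and this is only a sketch using the standard library: it cannot kill a hung thread, it just stops waiting for it so the caller can fail the test.

```rust
use std::sync::mpsc;
use std::thread;
use std::time::Duration;

/// Run `f` on a helper thread and wait at most `timeout` for its result.
/// Returns `None` if the closure did not finish in time or panicked.
/// A hung closure keeps running in its (leaked) thread.
pub fn run_with_timeout<T, F>(timeout: Duration, f: F) -> Option<T>
where
    T: Send + 'static,
    F: FnOnce() -> T + Send + 'static,
{
    let (tx, rx) = mpsc::channel();
    thread::spawn(move || {
        // Ignore send errors: the receiver may already have given up.
        let _ = tx.send(f());
    });
    rx.recv_timeout(timeout).ok()
}
```

A test body would call the helper and panic on `None`, which turns a deadlock into an ordinary test failure once the process exits.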

@alex
Contributor Author

alex commented Aug 2, 2024

Ok, updated the tracking list.

PyCell no longer exists, should that be something else?

@ngoldbaum
Contributor

I'm still learning the library and it shows...

I think David meant Bound in our discussion and he just got mixed up with the old API after a long day. I'll let him clarify.

@alex
Contributor Author

alex commented Aug 2, 2024

My guess is it's a reference to PyClassBorrowChecker, which manages the various borrow flags. But I'll let David say for sure.

@davidhewitt
Member

My mistake, yes, we removed the PyCell name along with the GIL Refs API 👍
