Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

PyO3 is incompatible with ctypes #1683

Open
LegionMammal978 opened this issue Jun 18, 2021 · 8 comments
Open

PyO3 is incompatible with ctypes #1683

LegionMammal978 opened this issue Jun 18, 2021 · 8 comments

Comments

@LegionMammal978
Copy link

For a personal project, I'm trying to provide a Rust callback function to a Python module as a raw function pointer. However, if I attempt to acquire the GIL within the callback function, it causes the program to panic due to out-of-order GIL dropping. As a minimal example:

use pyo3::prelude::*;

extern "C" fn callback() {
    println!("begin callback");
    let _ = Python::acquire_gil();
    println!("end callback");
}

fn main() -> PyResult<()> {
    Python::with_gil(|py| {
        PyModule::import(py, "ctypes")?
            .call_method1("CFUNCTYPE", ((),))?
            .call1((callback as *const () as usize,))?
            .call0()?;
        Ok(())
    })
}

The program panics before end callback can be printed:

$ target/debug/gil_test 
begin callback
thread 'main' panicked at 'The first GILGuard acquired must be the last one dropped.', /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/gil.rs:290:17
note: run with `RUST_BACKTRACE=1` environment variable to display a backtrace
thread 'main' panicked at 'The first GILGuard acquired must be the last one dropped.', /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/gil.rs:290:17
stack backtrace:
   0:     0x5599ade9a3b0 - std::backtrace_rs::backtrace::libunwind::trace::h04d12fdcddff82aa
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/../../backtrace/src/backtrace/libunwind.rs:100:5
   1:     0x5599ade9a3b0 - std::backtrace_rs::backtrace::trace_unsynchronized::h1459b974b6fbe5e1
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/../../backtrace/src/backtrace/mod.rs:66:5
   2:     0x5599ade9a3b0 - std::sys_common::backtrace::_print_fmt::h9b8396a669123d95
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:67:5
   3:     0x5599ade9a3b0 - <std::sys_common::backtrace::_print::DisplayBacktrace as core::fmt::Display>::fmt::he009dcaaa75eed60
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:46:22
   4:     0x5599adeb383c - core::fmt::write::h77b4746b0dea1dd3
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/fmt/mod.rs:1078:17
   5:     0x5599ade98252 - std::io::Write::write_fmt::heb7e50902e98831c
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/io/mod.rs:1518:15
   6:     0x5599ade9c515 - std::sys_common::backtrace::_print::h2d880c9e69a21be9
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:49:5
   7:     0x5599ade9c515 - std::sys_common::backtrace::print::h5f02b1bb49f36879
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:36:9
   8:     0x5599ade9c515 - std::panicking::default_hook::{{closure}}::h658e288a7a809b29
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:208:50
   9:     0x5599ade9c1b8 - std::panicking::default_hook::hb52d73f0da9a4bb8
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:227:9
  10:     0x5599ade9cc51 - std::panicking::rust_panic_with_hook::hfe7e1c684e3e6462
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:593:17
  11:     0x5599ade81d4e - std::panicking::begin_panic::{{closure}}::h67f425c0631152f9
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:522:9
  12:     0x5599ade81bd9 - std::sys_common::backtrace::__rust_end_short_backtrace::h726b4232d880f43b
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:141:18
  13:     0x5599ade81c7b - std::panicking::begin_panic::h4b52fd2d813c0948
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:521:12
  14:     0x5599ade60896 - <pyo3::gil::GILGuard as core::ops::drop::Drop>::drop::{{closure}}::h8e6c3302036811a8
                               at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/gil.rs:290:17
  15:     0x5599ade62dc1 - std::thread::local::LocalKey<T>::try_with::hd7fb52379d7bc09d
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/thread/local.rs:272:16
  16:     0x5599ade60798 - <pyo3::gil::GILGuard as core::ops::drop::Drop>::drop::h5070dea984a3f707
                               at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/gil.rs:285:17
  17:     0x5599ade6c57f - core::ptr::drop_in_place::hca1b1d785a584a01
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ptr/mod.rs:175:1
  18:     0x5599ade6c2c0 - core::ptr::drop_in_place::h7d2dc00e35e21d91
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ptr/mod.rs:175:1
  19:     0x5599ade6c5ff - core::ptr::drop_in_place::hddef70a15a5993ae
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ptr/mod.rs:175:1
  20:     0x5599ade5e144 - pyo3::python::Python::with_gil::ha183e49a7426795a
                               at /home/lm978/.cargo/registry/src/github.com-1ecc6299db9ec823/pyo3-0.13.2/src/python.rs:158:5
  21:     0x5599ade5dc91 - pyo3_test::main::h0624e64f2681f1e6
                               at /home/lm978/nested/pyo3_test/src/main.rs:10:5
  22:     0x5599ade5e642 - core::ops::function::FnOnce::call_once::h7c181d0d3d6c9cdf
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ops/function.rs:227:5
  23:     0x5599ade5dbaa - std::sys_common::backtrace::__rust_begin_short_backtrace::haa621927d913d7d2
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/sys_common/backtrace.rs:125:18
  24:     0x5599ade5deb6 - std::rt::lang_start::{{closure}}::he8806ad14534304d
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/rt.rs:66:18
  25:     0x5599ade9d167 - core::ops::function::impls::<impl core::ops::function::FnOnce<A> for &F>::call_once::h57e2a071d427b24c
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/core/src/ops/function.rs:259:13
  26:     0x5599ade9d167 - std::panicking::try::do_call::h81cbbe0c3b30a28e
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:381:40
  27:     0x5599ade9d167 - std::panicking::try::hbeeb95b4e1f0a876
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panicking.rs:345:19
  28:     0x5599ade9d167 - std::panic::catch_unwind::h59c48ccb40a0bf20
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/panic.rs:396:14
  29:     0x5599ade9d167 - std::rt::lang_start_internal::ha53ab63f88fee728
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/rt.rs:51:25
  30:     0x5599ade5de87 - std::rt::lang_start::h1e6a375f16e4e70e
                               at /rustc/e1884a8e3c3e813aada8254edfa120e85bf5ffca/library/std/src/rt.rs:65:5
  31:     0x5599ade5dcca - main
  32:     0x7f930dcc2bf7 - __libc_start_main
  33:     0x5599ade5d4ca - _start
  34:                0x0 - <unknown>
thread panicked while panicking. aborting.
Illegal instruction (core dumped)

However, it can clearly be seen that the GILGuards are dropped in the correct order. Is this panic caused by a bug in PyO3, or am I doing something incorrectly?

@davidhewitt
Copy link
Member

davidhewitt commented Jun 18, 2021

Hmm. Thank you for reporting, this is 100% a bug in PyO3.

For a bit of context of what's going on: we have some internal tracking inside PyO3 which ensures that the user cannot get into unsafety with two particular thorns:

  • GILGuard must be dropped in reverse order to acquisition (as seen in the error message)
  • In old PyO3 versions, multiple GILGuard with different lifetimes used to be able to lead to use-after-free in safe code (see Easy dangling PyObject with GILPool #864). This is because we have an internal GILPool construct to deal with the "GIL-bound references".

Unfortunately in your example the mitigation we've got in place for the first point is directly clashing with the mitigation for the second point.


PyO3 manages these defences by tracking the GIL acquire and release calls internally. It also uses this internal tracking to make some optimizations, e.g. to the implementation of Clone and Drop for Py<T>. An assumption PyO3 makes as part of this is that if PyO3 has acquired the GIL and calls into Python, when Python again calls Rust code the GIL is still held. This assumption is broken by the ctypes module as you demonstrate, because as ctypes says in its docs:

The function will release the GIL during the call.

Unless a way can be found to fix the internal tracking, I guess this implies that it's not sound. Although ctypes is a proven example I wouldn't be suprised if this means that other ways could also be found to innocently break the internal tracking.

If it's not sound:

  • we can't rely on it for optimizations
  • the mitigations listed above must be pessimistic and err on the side of caution

For the foreseeable future, I think that we should document PyO3 is incompatible with ctypes.

We also need to remove a couple of optimizations based on PyO3's internal tracking:

  • optimizations within Python::with_gil
  • Clone and Drop for Py<T>

I'll open some issues to track these in the morning.

@davidhewitt
Copy link
Member

... I think if we remove Python::acquire_gil (which is the problematic API) so that this particular check is not necessary, along with the optimizations mentioned above, then your example is sound.

This will take at least a couple of deprecation-and-release cycles to allow the ecosystem to adjust.

@davidhewitt davidhewitt changed the title Avoiding panics on recursive GIL acquisitions PyO3 is incompatible with ctypes Jun 18, 2021
@LegionMammal978
Copy link
Author

Thank you for your help; in my actual use case, the callback isn't called directly by ctypes but indirectly by a C library loaded via ctypes, and I was hoping to manipulate certain Python objects which I had stored in a static thread-local variable. Also, I have a few minor questions I hope you could answer:

  • What would the alternative to Python::with_gil and Python::acquire_gil look like?
  • In my current workflow, I run acquire_gil at the start of my program and store an Rc<GILGuard> within my context object and wrapper objects. (My program is entirely single-threaded.) Would it be more correct to re-acquire the GIL at all the points I need to use it? Would it be much more computationally expensive to do it that way?
  • What would be the alternative to Clone and Drop for Py<T>? I currently depend on my wrapper objects being directly Clone, although I suppose I could write a clone function that acquires the GIL.
  • Is there currently any way I can work around this issue, apart from not applying any changes until after the callback returns?

@ChillFish8
Copy link

I think depreciating Python::with_gil and Python::acquire_gil would cause a considerable challenge where we use it as and when it's needed rather than passing the GIL through many, many functions to get there. In this case, it would essentially involve completely re-designing the system due to the lifetime constraints.

When playing around with cytpes with Rust and PyO3 I ended up having the core functionality be set up agnostic and creating separate public APIs to Ctypes and Pyo3 to avoid crossing between the two bounds. (although I can safely say unless you really, really, really want performance PyPy support Ctypes is considerably more hassle than its worth in a lot of cases.

@mejrs
Copy link
Member

mejrs commented Jun 20, 2021

I think depreciating Python::with_gil and Python::acquire_gil

I think he meant: deprecate acquire_gil and gilguard, and possibly remove some optimizations in with_gil.

@davidhewitt
Copy link
Member

Yep absolutely. Python::with_gil is both incredibly useful and safe, because it doesn't have the drop order problems of acquire_gil.

@mejrs
Copy link
Member

mejrs commented Jun 24, 2021

Do we want to get rid of acquire_gil usage in pyo3, like in doc examples, tests, internal code etc? I can do some of that next week, I think.

@davidhewitt
Copy link
Member

Do we want to get rid of acquire_gil usage in pyo3

Yes, especially removing it from examples and the guide would be a great first step towards it's deprecation. Thanks ☺️

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants