Fix async Python functors invoking from multiple C++ threads (#1587) #1595

uentity · 2018-11-06T04:27:46Z

Ensure the GIL is held until the functor's destructor has finished (not only
during functor execution, as in the previous implementation).

This fixes #1587

wjakob · 2018-11-09T09:20:50Z

I'm a bit confused by the change, can you clarify? Specifically, I'm wondering if/why it is safe to std::move() the function within the lambda function. It seems to me that for lambdas with state, this operation will only work the first time. Afterwards, the lambda closure object will contain an invalid function object, which will potentially trigger undefined behavior the next time the function is called. What am I missing?
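To illustrate the concern outside of pybind11, here is a minimal sketch in plain C++. The names `make_single_shot` and `make_reusable` are hypothetical, and a `std::shared_ptr<int>` stands in for the captured function object because a moved-from `shared_ptr` is guaranteed to be empty. This is not the code from the PR, only the pattern being questioned.

```cpp
#include <cassert>
#include <memory>
#include <utility>

// Hypothetical stand-in for a callable that owns state (not pybind11 code).
// The body moves the captured pointer out on every call, so only the first
// call finds a valid object in the closure.
inline auto make_single_shot(int value) {
    auto state = std::make_shared<int>(value);
    return [state]() mutable -> bool {
        std::shared_ptr<int> local(std::move(state));  // empties the capture
        return local != nullptr;  // true only if the capture was still valid
    };
}

// Variant that only reads the capture, so every call sees the same state.
inline auto make_reusable(int value) {
    auto state = std::make_shared<int>(value);
    return [state]() -> bool { return state != nullptr; };
}
```

Calling the closure from `make_single_shot` twice returns `true` and then `false`, whereas the closure from `make_reusable` keeps returning `true`. With a real function object in place of the pointer, the second call would operate on a moved-from object.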

wjakob · 2018-11-09T09:21:36Z

FWIW what's also missing here is a testcase that would have led to breakage in the previous implementation.

uentity · 2018-11-09T13:47:24Z

@wjakob Your first comment is absolutely right. I missed that point, so all functors became single-shot (they can be called only once and fail or crash on the second call).

Fixed that in the next commit. I had to introduce a local "function handle" struct with a custom destructor. My code seems to be working now.

Regarding the testcase: I can write a Python test with a stateful lambda. It should crash Python without this PR (I suppose we can't catch or prevent that in the testing code) and work fine with it.

uentity · 2018-11-09T20:01:05Z

@wjakob I've made a test case and squashed functional changes into a single commit.
Please, take a look.

[NOTE] The test segfaults (on my machine) if run without the latest commit (i.e. with the previous functional version).

uentity · 2018-11-14T18:30:38Z

@wjakob there seems to be a problem with gil_scoped_acquire.
If async callbacks are launched from a separate (non-main) Python thread, then the following code will lead to a deadlock:

~func_handle() {
	gil_scoped_acquire acq;
	function kill_f(std::move(f));
}

If I replace it with conventional Python API for acquiring GIL (straight from documentation):

~func_handle() {
	PyGILState_STATE gstate;
	gstate = PyGILState_Ensure();
	{
		function kill_f(std::move(f));
	}
	PyGILState_Release(gstate);
}

then everything works without issues.
Consider the latest couple of commits.

davidhewitt · 2018-12-01T12:33:52Z

Hello @wjakob @uentity, I've authored a separate PR #1556 which resolves the deadlock problem with gil_scoped_acquire. You might wish to check that out!

Below is my summary of what I understand of the problems these PRs are trying to solve.

A std::function<...> extracted from pybind11 typically has a signature like std::function<int(int)>, which has no obvious association with Python types and can easily be passed to parts of the codebase that are not aware of things like the GIL. People doing this then expect these objects to be thread safe, however:

1. If C++ code spawns a thread and copies the function handle to give to that new thread, then in that new thread the function will correctly capture the GIL during calls but not during destruction. That is what this PR solves.
2. If Python spawns a thread and calls a pybind11-defined function which takes one of these std::function types as an argument, then on the first construction of a gil_scoped_acquire object we get a deadlock. That is what #1556 (Resolve threading issue with gil_scoped_acquire) solves.
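The first point can be sketched in plain C++ without pybind11. In the sketch, `Probe` and `destroyed_off_caller_thread` are hypothetical names, and `Probe` merely stands in for a captured Python object: it records which thread ran its destructor. The point is only that state captured in a type-erased callback is destroyed wherever the last owner goes away, which may be a thread that holds no lock at that moment.

```cpp
#include <cassert>
#include <functional>
#include <memory>
#include <thread>
#include <utility>

// Hypothetical probe standing in for a captured Python object: it records
// which thread ran its destructor.
struct Probe {
    std::thread::id* destroyed_on;
    explicit Probe(std::thread::id* out) : destroyed_on(out) {}
    ~Probe() { *destroyed_on = std::this_thread::get_id(); }
};

// Hands a type-erased callback to a worker thread and reports whether the
// captured state was destroyed on a thread other than the caller's.
inline bool destroyed_off_caller_thread() {
    std::thread::id destroyed_on;  // default value means "no thread"
    {
        auto probe = std::make_shared<Probe>(&destroyed_on);
        auto callback = std::make_unique<std::function<void()>>(
            [probe]() { /* the call itself is not the interesting part */ });
        probe.reset();  // the callback now holds the only reference

        std::thread worker([cb = std::move(callback)]() mutable {
            (*cb)();
            cb.reset();  // the last owner dies here, on the worker thread
        });
        worker.join();
    }
    return destroyed_on != std::thread::id() &&
           destroyed_on != std::this_thread::get_id();
}
```

`destroyed_off_caller_thread()` returns `true`: the destructor runs on the worker, so any lock it needs has to be taken in the destructor path itself and not only around the call.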

I would love to contribute in any way possible to help resolve these!

uentity · 2018-12-01T18:57:51Z

Hi @davidhewitt!
If you look at my last commit on this topic, you can see that I was trying to remedy the consequences of the problem you described in item 2. It's great that you've fixed gil_scoped_acquire in the first place!

With your PR #1556 applied I can bring back my original implementation of the func_handle destructor in this PR, i.e.:

~func_handle() {
	gil_scoped_acquire acq;
	function kill_f(std::move(f));
}

Very soon after I thought my issue #1587 was solved, I ran into the issue from your PR #1556 and worked around it by replacing gil_scoped_acquire with Python GIL API calls (see above). But my fix isn't perfect, because it doesn't follow RAII (for example, if the destructor of functor f throws...).
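For reference, the RAII point can be sketched as a small guard class. To keep the sketch buildable without Python headers, `fake_ensure` and `fake_release` are hypothetical stand-ins for `PyGILState_Ensure` and `PyGILState_Release` that only count calls, and `scoped_state` is an illustrative name, not a pybind11 type.

```cpp
#include <cassert>
#include <stdexcept>

// Stand-ins for PyGILState_Ensure / PyGILState_Release (assumption: they
// only count calls here, so no Python headers are needed).
static int g_held = 0;
inline int fake_ensure() { ++g_held; return 1; }
inline void fake_release(int) { --g_held; }

// RAII guard: acquire in the constructor, release in the destructor.
class scoped_state {
public:
    scoped_state() : state_(fake_ensure()) {}
    ~scoped_state() { fake_release(state_); }
    scoped_state(const scoped_state&) = delete;
    scoped_state& operator=(const scoped_state&) = delete;

private:
    int state_;
};

// Simulates cleanup work that fails while the guard is alive.
inline bool released_after_throw() {
    try {
        scoped_state guard;
        throw std::runtime_error("cleanup failed");
    } catch (const std::runtime_error&) {
        // Stack unwinding has already run ~scoped_state() at this point.
    }
    return g_held == 0;
}
```

`released_after_throw()` returns `true` because the guard's destructor runs during unwinding. With a manual ensure/release pair, the release call would be skipped on the exceptional path.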

BTW, can you comment on the comparative "heaviness" of the two GIL-acquisition approaches I noted earlier? If I look at the gil_scoped_acquire implementation, I have a (deceptive?) feeling that it is "heavy" :-)

davidhewitt · 2018-12-01T20:57:30Z

I understand gil_scoped_acquire is slightly more complicated because it is designed to allow python to be migrated to a different OS thread. I think because of this it's probably advisable to use gil_scoped_acquire everywhere. Plus RAII :)

NB I see that #1211 has just been merged which was exactly the same thing that I was trying to solve. So I think that gil_scoped_acquire will work properly for you in this PR if you rebase.

davidhewitt · 2018-12-04T19:58:35Z

include/pybind11/functional.h

+            func_handle(const func_handle&) = default;
+            ~func_handle() {
+                gil_scoped_acquire acq;
+                function kill_f(std::move(f));


It might be clearer to explicitly reset the function handle and decrease the reference count: f.release().dec_ref();

Agreed. Done.

@davidhewitt this causes a strange test failure on VS with Python 3.6 on the x86 platform, so I reverted to the original version.
Logs: https://ci.appveyor.com/project/wjakob/pybind11/builds/20894165

uentity · 2018-12-10T11:14:39Z

@wjakob style check is failing because of non-existent URL:

$ curl -fsSL ftp://ftp.stack.nl/pub/users/dimitri/doxygen-1.8.12.linux.bin.tar.gz | tar xz
curl: (6) Could not resolve host: ftp.stack.nl

hammer498 · 2019-06-03T16:52:29Z

I was also affected by this and your patch has unblocked me. Thank you @uentity!

uentity · 2019-06-06T18:07:51Z

@hammer498 you're welcome =)

wjakob · 2019-06-11T20:23:00Z

This looks good to me now.

…1595) * Fix async Python functors invoking from multiple C++ threads (#1587) Ensure GIL is held during functor destruction. * Add async Python callbacks test that runs in separate Python thread

uentity mentioned this pull request Nov 6, 2018

Crash when Python lambdas (with captured state) are called from multiple C++ threads (fix inside) #1587

Closed

uentity force-pushed the fix-async-pycb branch 5 times, most recently from e619e19 to e3375b6 Compare November 9, 2018 19:51

uentity force-pushed the fix-async-pycb branch from e3375b6 to 296627b Compare November 13, 2018 07:19

uentity force-pushed the fix-async-pycb branch from 3637a3c to 9a4eb75 Compare November 14, 2018 18:35

lambdaknight mentioned this pull request Nov 29, 2018

Deadlock with std::function argument #1525

Closed

uentity force-pushed the fix-async-pycb branch from 9a4eb75 to b66623f Compare December 2, 2018 18:47

davidhewitt reviewed Dec 4, 2018

uentity force-pushed the fix-async-pycb branch from b66623f to 60decef Compare December 10, 2018 11:10

uentity force-pushed the fix-async-pycb branch from 60decef to b66623f Compare December 11, 2018 04:16

This was referenced Jan 1, 2019

dec_ref bug with asyncio #1521

Closed

SIGSEGV on exit robotpy/robotpy-cscore#52

Closed

uentity force-pushed the fix-async-pycb branch from b66623f to 2aaaa15 Compare March 29, 2019 07:06

bstaletic mentioned this pull request May 14, 2019

Unsorted pull requests YannickJadoul/pybind11-cleanup#8

Open

19 tasks

uentity added 2 commits June 6, 2019 22:58

Add async Python callbacks test that runs in separate Python thread

feede39

Fix async Python functors invoking from multiple C++ threads (pybind#…

2746e1d

…1587) Ensure GIL is held during functor destruction.

uentity force-pushed the fix-async-pycb branch from 2aaaa15 to 2746e1d Compare June 6, 2019 17:58

wjakob merged commit 8760a10 into pybind:master Jun 11, 2019

rwgk mentioned this pull request Feb 10, 2023

FWD pybind11 google/pybind11k#1595

Closed
