Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Better usability for threads #810

Open
njsmith opened this issue Dec 19, 2018 · 12 comments

Comments

@njsmith
Copy link
Member

commented Dec 19, 2018

I was ok with run_sync_in_worker_thread having a slightly-awkward name because threads are going to make your life awkward so, you know, it's fair warning. But when you're stuck using threads to hack around missing trio libraries, it can feel like a bit of extra punishment on top, which is not so nice.

Maybe we should rename it to run_sync_in_thread?

@njsmith njsmith changed the title Rename run_sync_in_worker_thread Better usability for threads Dec 27, 2018

@njsmith

This comment has been minimized.

Copy link
Member Author

commented Dec 27, 2018

Well, that seems popular.

Related idea: the whole "portal" design seems a bit more awkward than I was anticipating. I was imagining that threads were a mostly-for-experts kind of thing, and "foreign" threads would be as common as trio-spawned threads, but I feel like maybe it's not turning out that way.

Alternative idea (inspired by curio and anyio): provide a standard way to re-enter trio from a trio thread. Basically stash the token in thread-local storage, and then use it.

One possible API:

trio.run_from_thread(async_fn, *args, *, token=None)
trio.run_sync_from_thread(fn, *args, *, token=None)
trio.sync_cm_from_thread(cm, *, token=None)
trio.async_cm_from_thread(cm, *, token=None)

trio.run_sync_in_thread(fn, *args, *, ...)
trio.sync_cm_in_thread(cm, *, ...)

So the general idea is that from_thread is for going thread->trio, and in_thread is for going trio->thread. (It's taken for granted that one end is trio, because of course trio is the center of the universe. Also it's in the name of the functions.)

This pattern might be used for other projects too, like trio-asyncio – run_in_aio, run_from_aio. trio-asyncio has really struggled with making these names easy to understand – @smurfix, do you think this would do any better?

Alternatively, we could group these into a namespace, like:

trio.from_thread.run(...)
trio.from_thread.run_sync(...)
trio.from_thread.sync_cm(...)
trio.from_thread.async_cm(...)

trio.in_thread.run_sync(...)
trio.in_thread.sync_cm(...)

This would group related functions together for tab-completion, and avoid "polluting" the main trio namespace. OTOH it looks a bit weird, and flat is better than nested...

Another alternative, or complement: we could do async with trio.in_thread: .../async with trio.from_thread – see python-trio/trio-asyncio#42 for a discussion of why this is nice – in particular it automatically works for context managers.

See also: #680 for context managers across threads, #606 for propagating cancellation across trio->thread->trio transitions, and #648 for preserving contextvars across thread switches

@smurfix

This comment has been minimized.

Copy link
Contributor

commented Dec 27, 2018

trio.from_thread.async_cm(...)
trio.in_thread.run_sync(...)

The opposite of "from" is "to", so to_thread.* would work for me. in_thread is confusing because I read that as "I'm supposed to call these in a thread".

@njsmith

This comment has been minimized.

Copy link
Member Author

commented Mar 11, 2019

Another possible approach:

trio.run_sync_in_thread(...)

trio.threadsafe.check_cancelled()
trio.threadsafe.run_in_trio(...)
trio.threadsafe.run_sync_in_trio(...)
trio.threadsafe.async_cm(...)

The idea is that in a thread, you're only allowed to use threadsafe functions (which makes the safety rules very easy to document!), and otherwise it uses the same naming conventions as regular trio.

@njsmith

This comment has been minimized.

Copy link
Member Author

commented Jun 11, 2019

Another reason to reconsider the BlockingTrioPortal API: it's not currently compatible with having a global deadlock detector (#1085).

@njsmith

This comment has been minimized.

Copy link
Member Author

commented Jun 13, 2019

I guess a downside to trio.threadsafe is that it suggests that they're safe to call anywhere, but we probably don't want allow them to be called from inside the trio thread itself.

I think these are unambiguous and fairly terse:

# These are the ones that are safe to use from a thread
trio.from_thread.run(...)
trio.from_thread.run_sync(...)
trio.from_thread.cm(...)
trio.from_thread.async_cm(...)
trio.from_thread.check_cancel()

# These are what you use from Trio
trio.to_thread.run_sync(...)
trio.to_thread.cm(...)

Alternative bikeshed color no. 1:

trio.to_trio.run(...)
trio.to_trio.run_sync(...)
trio.to_trio.cm(...)
trio.to_trio.async_cm(...)
trio.to_trio.check_cancel()

trio.to_thread.run_sync(...)
trio.to_thread.cm(...)

Alternative bikeshed color no. 2:

trio.trioify.run(...)
trio.trioify.run_sync(...)
trio.trioify.cm(...)
trio.trioify.async_cm(...)
trio.trioify.check_cancel()

trio.threadify.run_sync(...)
trio.threadify.cm(...)

Alternative bikeshed no. 3:

trio.as_trio.run(...)
trio.as_trio.run_sync(...)
trio.as_trio.cm(...)
trio.as_trio.async_cm(...)
trio.as_trio.check_cancel()

trio.as_thread.run_sync(...)
trio.as_thread.cm(...)

Alternative bikeshed no. I lost count:

# These are the ones that are safe to use from a thread
trio.from_thread.run(...)
trio.from_thread.run_sync(...)
trio.from_thread.cm(...)
trio.from_thread.async_cm(...)
trio.from_thread.check_cancel()

# These are what you use from Trio
trio.use_thread.run_sync(...)
trio.use_thread.cm(...)

Hmm. I think I prefer the "from thread" style over the "to trio" style, because if you're starting in asyncio or something then you need a different way to get "to trio". These are specifically: to trio from a thread that's running synchronous, blocking code. Which "from thread" doesn't exactly say, but if you had from_asyncio and from_thread next to each other then I think it would be pretty clear what each one meant, while if you had to_trio_from_asyncio and to_trio next to each other then it would just be confusing.

OTOH, trio.to_thread is OK because like, obviously you are starting from Trio. You don't use trio.to_thread to get from asyncio to a worker thread.

There's still the use_thread vs to_thread. Doesn't seem like a huge difference, and I guess to is shorter and more parallel, so maybe we'll just go with to. I'm pleased at how trio.run_sync_in_thread and trio.to_thread.run_sync have the same number of characters, and mostly the same characters at that.

The context manager terminology is a bit confusing:

  • For run/run_sync, the default is async, but for cm/async_cm, the default is sync
  • For trio.use_thread.cm, you give it a sync context manager and it returns an async context manager

Maybe it's better to explicitly use sync_cm & async_cm? I doubt there's any perfect answer here, because we're not going to rename trio.run to trio.run_async, and the larger Python community is never going to rename context manager → sync context manager, async context manager → context manager, so some inconsistency is unavoidable.

For the unusual case where someone wants to get into Trio from a thread that wasn't started by Trio: one option would be to have an object that implements the same stuff as from_thread. But another interesting possibility would be to lean into the thread-local idea – Trio threads get some kind of "entry handle" stashed in a thread-local by default, or, if you have an "entry handle" object, you could temporarily stash it in the thread-local by doing something like:

with entry_handle.register_thread():
    trio.from_thread.run(...)

Possibly this should automatically close the handle at the end?

I guess there's an argument for using a ContextVar instead of a thread-local. Imagine two trio threads talking to each other. Within a single trio thread, two different branches of the task tree might each want their own entry handles, maybe? Seems obscure but using a ContextVar is probably just as easy and fast as a thread-local, so why not.

@njsmith

This comment has been minimized.

Copy link
Member Author

commented Jun 13, 2019

Maybe #1099 was a mistake though... if we're going to settle on trio.to_thread.run_sync as the final name, then asking everyone to make a detour through run_sync_in_thread is silly.

@pquentin

This comment has been minimized.

Copy link
Member

commented Jun 13, 2019

(Note that it's cheap to change #1099 before the next release.)

@epellis

This comment has been minimized.

Copy link
Contributor

commented Jun 19, 2019

I am interested in working on this so that #1085 can be unblocked. From what I'm gathering from this conversation, what we want to do is change how Trio calls into threads by deprecating BlockingTrioPortal and renaming it into trio.from_thread.* in a similar manner as #1098 did for trio calling into threads, by keeping track of how many trio tokens are live and might call back into the Trio thread at a given time.
Is this the right line of thinking or do we want to do something different here?

@njsmith

This comment has been minimized.

Copy link
Member Author

commented Jun 19, 2019

@epellis OK good question :-)

So there are a few different issues with BlockingTrioPortal:

  • The BlockingTrioPortal design is kinda inconvenient for users, because it's this extra object you have to instantiate and carry around. The theory is that this is handy because it lets you be explicit about which Trio thread you want to re-enter, and you can do it from any thread... but in practice, when people want to re-enter Trio, it's almost always because they've used trio.run_sync_in_thread to make a thread, and want to get back into the Trio thread they started in.
  • The BlockingTrioPortal design is inconvenient for deadlock detection, because of the issues discussed in #1085 and #1098
  • One of the major frustrations with run_sync_in_thread is that it doesn't play well with cancellation. This can't be handled automatically like for real Trio tasks, but we want to make it easy for the sync code in the thread to at least check for cancellation. And as part of that, it would be nice if going Trio → thread → back into Trio would automatically re-enable cancellation, so if the "outer" Trio code was cancelled then the "inner" Trio code would notice and automatically raise Cancelled. Making this work has a number of challenges, because everything involving cancellation is wickedly complicated (see #961 for some more details), but the most fundamental problem is that right now BlockingTrioPortal and run_sync_in_thread are two totally disconnected mechanisms, so even in principle there's no way to make the connection and realize that a cancellation at run_sync_in_thread should be delivered to code running in a BlockingTrioPortal.

So trio.from_thread.run and friends are designed to address all these issues:

  • They aren't methods; they're just global functions. The idea is, whenever you call run_sync_in_thread it will automatically inject some thread-global state into the new thread, including the TrioToken-or-equivalent. Then when you call these functions, they look up this state, and use it to find the Trio thread. That way you don't have to pass around this object everywhere – any code in the thread can re-enter Trio any time it wants.
  • If run_sync_in_thread is automatically injected this magic ticket to re-enter Trio, then it can manage the lifetime, and automatically mark it closed when the thread exits. That makes the deadlock detector happy.
  • Now that run_sync_in_thread is managing the re-entry mechanism, we can potentially hook up cancellation signalling between run_sync_in_thread and re-entered code.

So that's the big idea. But, we don't have to solve all of these problems at once – in particular, it's probably simplest to start by implementing a basic version of trio.from_thread.run and friends and deprecating BlockingTrioPortal, and then after that's working we can think about how to integrate the new run_sync_soon API in #1098, and about how to make them support cancellation properly. And in fact, to deprecate BlockingTrioPortal, we don't even need the context manager helpers like trio.from_thread.sync_cm, since BlockingTrioPortal doesn't have those either.

So I think the next step is to implement trio.from_thread.run, trio.from_thread.run_sync, and make it so you can run this program:

import trio

def thread_fn():
    start = trio.from_thread.run_sync(trio.current_time)
    print("In Trio-land, the time is now:", start)
    trio.from_thread.run(trio.sleep, 1)
    end = trio.from_thread.run_sync(trio.current_time)
    print("And now it's:", end)

async def main():
    await trio.run_sync_in_thread(thread_fn)

trio.run(main)

njsmith added a commit to njsmith/trio that referenced this issue Jul 27, 2019

Add new trio.to_thread module for running worker threads
For consistency with trio.from_thread, and to give us a place for
future extensions, like utilities for pushing context managers into
threads.

See python-triogh-810.
@njsmith

This comment has been minimized.

Copy link
Member Author

commented Jul 27, 2019

#1122 converted the old BlockingTrioPortal operations into functions trio.from_thread, and made it so they automagically work if called from inside a thread spawned by Trio. #1156 completes the renaming of trio.run_sync_in_worker_thread to trio.to_thread.run_sync. So I think we're done with refactoring the old stuff.

Not done yet:

  • trio.from_thread.sync_cm, trio.from_thread.async_cm
  • trio.to_thread.sync_cm

These are interesting/tricky because they need to use the same task/thread to call both __enter__ and __exit__.

That means they might overlap with #606, which would also involve using the same task for multiple entries to trio (#606 (comment))

@MauiJerry

This comment has been minimized.

Copy link

commented Aug 6, 2019

latest install 0.12.0, gives message...

TrioDeprecationWarning: trio.run_sync_in_worker_thread is deprecated since Trio 0.12.0; use trio.to_thread.to_thread_run_sync instead (#810)

however, in to_thread.py it says...

from ._threads import to_thread_run_sync as run_sync

thus the correct code should be

      stuff = await trio.to_thread.run_sync(input, cancellable=True)

(from the pynng example pair1_async.py

@njsmith

This comment has been minimized.

Copy link
Member Author

commented Aug 7, 2019

@MauiJerry Nice catch! That should be fixed by #1177.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
6 participants
You can’t perform that action at this time.