trio
Trio provides a set of abstract base classes that define a standard interface for unidirectional and bidirectional byte streams.
Why is this useful? Because it lets you write generic protocol implementations that can work over arbitrary transports, and easily create complex transport configurations. Here are some examples:
- `trio.SocketStream` wraps a raw socket (like a TCP connection over the network), and converts it to the standard stream interface.
- `trio.SSLStream` is a "stream adapter" that can take any object that implements the `trio.abc.Stream` interface, and convert it into an encrypted stream. In Trio the standard way to speak SSL over the network is to wrap an `SSLStream` around a `SocketStream`.
- If you spawn a subprocess, you can get a `trio.abc.SendStream` that lets you write to its stdin, and a `trio.abc.ReceiveStream` that lets you read from its stdout. If for some reason you wanted to speak SSL to a subprocess, you could use a `StapledStream` to combine its stdin/stdout into a single bidirectional `trio.abc.Stream`, and then wrap that in an `SSLStream`:

  ```python
  ssl_context = ssl.create_default_context()
  ssl_context.check_hostname = False
  s = SSLStream(StapledStream(process.stdin, process.stdout), ssl_context)
  ```
- It sometimes happens that you want to connect to an HTTPS server, but you have to go through a web proxy... and the proxy also uses HTTPS. So you end up having to do SSL-on-top-of-SSL. In Trio this is trivial – just wrap your first `SSLStream` in a second `SSLStream`:

  ```python
  # Get a raw SocketStream connection to the proxy:
  s0 = await open_tcp_stream("proxy", 443)

  # Set up SSL connection to proxy:
  s1 = SSLStream(s0, proxy_ssl_context, server_hostname="proxy")
  # Request a connection to the website
  await s1.send_all(b"CONNECT website:443 HTTP/1.0\r\n\r\n")
  await check_CONNECT_response(s1)

  # Set up SSL connection to the real website. Notice that s1 is
  # already an SSLStream object, and here we're wrapping a second
  # SSLStream object around it.
  s2 = SSLStream(s1, website_ssl_context, server_hostname="website")

  # Make our request
  await s2.send_all(b"GET /index.html HTTP/1.0\r\n\r\n")
  ...
  ```
- The `trio.testing` module provides a set of flexible in-memory stream object implementations, so if you have a protocol implementation to test then you can start two tasks, set up a virtual "socket" connecting them, and then do things like inject random-but-repeatable delays into the connection.
trio.abc
| Abstract base class | Inherits from... | Adds these abstract methods... | And these concrete methods. | Example implementations |
|---|---|---|---|---|
| `AsyncResource` | | `aclose` | `__aenter__`, `__aexit__` | async file objects |
| `SendStream` | `AsyncResource` | `send_all`, `wait_send_all_might_not_block` | | `trio.testing.MemorySendStream` |
| `ReceiveStream` | `AsyncResource` | `receive_some` | `__aiter__`, `__anext__` | `trio.testing.MemoryReceiveStream` |
| `Stream` | `SendStream`, `ReceiveStream` | | | `trio.SSLStream` |
| `HalfCloseableStream` | `Stream` | `send_eof` | | `trio.SocketStream`, `trio.StapledStream` |
| `Listener` | `AsyncResource` | `accept` | | `trio.SocketListener`, `trio.SSLListener` |
| `SendChannel` | `AsyncResource` | `send` | | `trio.MemorySendChannel` |
| `ReceiveChannel` | `AsyncResource` | `receive` | `__aiter__`, `__anext__` | `trio.MemoryReceiveChannel` |
| `Channel` | `SendChannel`, `ReceiveChannel` | | | |
trio.abc.AsyncResource
trio
aclose_forcefully
trio.abc
trio.abc.SendStream
trio.abc.ReceiveStream
trio.abc.Stream
trio.abc.HalfCloseableStream
trio.abc
trio.abc.Listener
trio.abc.SendChannel
trio.abc.ReceiveChannel
trio.abc.Channel
trio
Trio currently provides a generic helper for writing servers that listen for connections using one or more `trio.abc.Listener` instances, and a generic utility class for working with streams. And if you want to test code that's written against the streams interface, you should also check out the in-memory streams in `trio.testing`.
serve_listeners
StapledStream
The high-level network interface is built on top of our stream abstraction.
open_tcp_stream
serve_tcp
open_ssl_over_tcp_stream
serve_ssl_over_tcp
open_unix_socket
SocketStream
SocketListener
open_tcp_listeners
open_ssl_over_tcp_listeners
Trio provides SSL/TLS support based on the standard library `ssl` module. Trio's `SSLStream` and `SSLListener` take their configuration from an `ssl.SSLContext`, which you can create using `ssl.create_default_context` and customize using the other constants and functions in the `ssl` module.
Warning
Avoid instantiating ssl.SSLContext
directly. A newly constructed ~ssl.SSLContext
has less secure defaults than one returned by ssl.create_default_context
.
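For example, the recommended constructor gives you validation-enabled settings out of the box:

```python
import ssl

# create_default_context() enables certificate verification and
# hostname checking, unlike a bare ssl.SSLContext().
ssl_context = ssl.create_default_context()
print(ssl_context.check_hostname)                    # True
print(ssl_context.verify_mode == ssl.CERT_REQUIRED)  # True
```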
Instead of using `ssl.SSLContext.wrap_socket`, you create an `SSLStream`:
SSLStream
And if you're implementing a server, you can use SSLListener
:
SSLListener
Some methods on SSLStream
raise NeedHandshakeError
if you call them before the handshake completes:
NeedHandshakeError
Trio also has support for Datagram TLS (DTLS), which is like TLS but for unreliable UDP connections. This can be useful for applications where TCP's reliable in-order delivery is problematic, like teleconferencing, latency-sensitive games, and VPNs.
Currently, using DTLS with Trio requires PyOpenSSL. We hope to eventually allow the use of the stdlib ssl module as well, but unfortunately that's not yet possible.
Warning
Note that PyOpenSSL is in many ways lower-level than the `ssl` module – in particular, it currently *has no built-in mechanism to validate certificates*. We strongly recommend that you use the `service-identity` library to validate hostnames and certificates.
DTLSEndpoint
connect
serve
close
DTLSChannel
do_handshake
send
receive
close
aclose
set_ciphertext_mtu
get_cleartext_mtu
statistics
trio.socket
The trio.socket
module provides Trio's basic low-level networking API. If you're doing ordinary things with stream-oriented connections over IPv4/IPv6/Unix domain sockets, then you probably want to stick to the high-level API described above. If you want to use UDP, or exotic address families like AF_BLUETOOTH
, or otherwise get direct access to all the quirky bits of your system's networking API, then you're in the right place.
Generally, the API exposed by `trio.socket` mirrors that of the standard library `socket` module. Most constants (like `SOL_SOCKET`) and simple utilities (like `socket.inet_aton`) are simply re-exported unchanged. But there are also some differences, which are described here.
First, Trio provides analogues to all the standard library functions that return socket objects; their interface is identical, except that they're modified to return Trio socket objects instead:
socket
socketpair
fromfd
fromshare(data)
Like socket.fromshare
, but returns a Trio socket object.
In addition, there is a new function to directly convert a standard library socket into a Trio socket:
from_stdlib_socket
Unlike socket.socket
, trio.socket.socket
is a function, not a class; if you want to check whether an object is a Trio socket, use isinstance(obj, trio.socket.SocketType)
.
For name lookup, Trio provides the standard functions, but with some changes:
getaddrinfo
getnameinfo
getprotobyname
Trio intentionally does not include some obsolete, redundant, or broken features:

- `gethostbyname`, `gethostbyname_ex`, `gethostbyaddr`: obsolete; use `getaddrinfo` and `getnameinfo` instead.
- `getservbyport`: obsolete and buggy; instead, do:

  ```python
  _, service_name = await getnameinfo(("127.0.0.1", port), NI_NUMERICHOST)
  ```

- `getservbyname`: obsolete and buggy; instead, do:

  ```python
  await getaddrinfo(None, service_name)
  ```

- `getfqdn`: obsolete; use `getaddrinfo` with the `AI_CANONNAME` flag.
- `getdefaulttimeout`, `setdefaulttimeout`: instead, use Trio's standard support for cancellation.
- On Windows, `SO_REUSEADDR` is not exported, because it's a trap: the name is the same as Unix `SO_REUSEADDR`, but the semantics are different and extremely broken. In the very rare cases where you actually want `SO_REUSEADDR` on Windows, it can still be accessed from the standard library's `socket` module.
Note
trio.socket.SocketType
is an abstract class and cannot be instantiated directly; you get concrete socket objects by calling constructors like trio.socket.socket
. However, you can use it to check if an object is a Trio socket via isinstance(obj, trio.socket.SocketType)
.
Trio socket objects are overall very similar to the standard library socket objects, with a few important differences:
First, and most obviously, everything is made "Trio-style": blocking methods become async methods, and the following attributes are not supported:

- `setblocking`: Trio sockets always act like blocking sockets; if you need to read/write from multiple sockets at once, then create multiple tasks.
- `settimeout`: see cancellation instead.
- `makefile`: Python's file-like API is synchronous, so it can't be implemented on top of an async socket.
- `sendall`: could be supported, but you're better off using the higher-level `trio.SocketStream`, and specifically its `send_all` method, which also does additional error checking.
In addition, the following methods are similar to the equivalents in socket.socket
, but have some Trio-specific quirks:
connect
Connect the socket to a remote address.
Similar to socket.socket.connect
, except async.
Warning
Due to limitations of the underlying operating system APIs, it is not always possible to properly cancel a connection attempt once it has begun. If `connect` is cancelled, and is unable to abort the connection attempt, then it will:

- forcibly close the socket to prevent accidental re-use
- raise `Cancelled`

tl;dr: if `connect` is cancelled then the socket is left in an unknown state – possibly open, and possibly closed. The only reasonable thing to do is to close it.
is_readable
Check whether the socket is readable or not.
sendfile
We also keep track of an extra bit of state, because it turns out to be useful for `trio.SocketStream`:

did_shutdown_SHUT_WR

This `bool` attribute is True if you've called `sock.shutdown(SHUT_WR)` or `sock.shutdown(SHUT_RDWR)`, and False otherwise.
The following methods are identical to their equivalents in `socket.socket`, except async, and the ones that take address arguments require pre-resolved addresses:

- `accept`
- `bind`
- `recv`
- `recv_into`
- `recvfrom`
- `recvfrom_into`
- `recvmsg` (if available)
- `recvmsg_into` (if available)
- `send`
- `sendto`
- `sendmsg` (if available)
All methods and attributes not mentioned above are identical to their equivalents in `socket.socket`:

- `family`
- `type`
- `proto`
- `fileno`
- `listen`
- `getpeername`
- `getsockname`
- `close`
- `shutdown`
- `setsockopt`
- `getsockopt`
- `dup`
- `detach`
- `share`
- `set_inheritable`
- `get_inheritable`
trio
Trio provides built-in facilities for performing asynchronous filesystem operations like reading or renaming a file. Generally, we recommend that you use these instead of Python's normal synchronous file APIs. But the tradeoffs here are somewhat subtle: sometimes people switch to async I/O, and then they're surprised and confused when they find it doesn't speed up their program. The next section explains the theory behind async file I/O, to help you better understand your code's behavior. Or, if you just want to get started, you can jump down to the API overview.
Many people expect that switching from synchronous file I/O to async file I/O will always make their program faster. This is not true! If we just look at total throughput, then async file I/O might be faster, slower, or about the same, and it depends in a complicated way on things like your exact patterns of disk access, or how much RAM you have. The main motivation for async file I/O is not to improve throughput, but to reduce the frequency of latency glitches.
To understand why, you need to know two things.
First, right now no mainstream operating system offers a generic, reliable, native API for async file or filesystem operations, so we have to fake it by using threads (specifically, trio.to_thread.run_sync
). This is cheap but isn't free: on a typical PC, dispatching to a worker thread adds something like ~100 µs of overhead to each operation. ("µs" is pronounced "microseconds", and there are 1,000,000 µs in a second. Note that all the numbers here are going to be rough orders of magnitude to give you a sense of scale; if you need precise numbers for your environment, measure!)
And second, the cost of a disk operation is incredibly bimodal. Sometimes, the data you need is already cached in RAM, and then accessing it is very, very fast – calling `io.FileIO`'s `read` method on a cached file takes on the order of ~1 µs. But when the data isn't cached, then accessing it is much, much slower: the average is ~100 µs for SSDs and ~10,000 µs for spinning disks, and if you look at tail latencies then for both types of storage you'll see cases where occasionally some operation will be 10x or 100x slower than average. And that's assuming your program is the only thing trying to use that disk – if you're on some oversold cloud VM fighting for I/O with other tenants then who knows what will happen. And some operations can require multiple disk accesses.
Putting these together: if your data is in RAM then it should be clear that using a thread is a terrible idea – if you add 100 µs of overhead to a 1 µs operation, then that's a 100x slowdown! On the other hand, if your data's on a spinning disk, then using a thread is great – instead of blocking the main thread and all tasks for 10,000 µs, we only block them for 100 µs and can spend the rest of that time running other tasks to get useful work done, which can effectively be a 100x speedup.
But here's the problem: for any individual I/O operation, there's no way to know in advance whether it's going to be one of the fast ones or one of the slow ones, so you can't pick and choose. When you switch to async file I/O, it makes all the fast operations slower, and all the slow operations faster. Is that a win? In terms of overall speed, it's hard to say: it depends what kind of disks you're using and your kernel's disk cache hit rate, which in turn depends on your file access patterns, how much spare RAM you have, the load on your service, ... all kinds of things. If the answer is important to you, then there's no substitute for measuring your code's actual behavior in your actual deployment environment. But what we can say is that async disk I/O makes performance much more predictable across a wider range of runtime conditions.
If you're not sure what to do, then we recommend that you use async disk I/O by default, because it makes your code more robust when conditions are bad, especially with regards to tail latencies; this improves the chances that what your users see matches what you saw in testing. Blocking the main thread stops all tasks from running for that time. 10,000 µs is 10 ms, and it doesn't take many 10 ms glitches to start adding up to real money; async disk I/O can help prevent those. Just don't expect it to be magic, and be aware of the tradeoffs.
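The arithmetic behind those claims is simple enough to check, using the rough order-of-magnitude figures from the text:

```python
# All figures in microseconds, taken from the rough estimates above.
thread_overhead = 100   # dispatching an operation to a worker thread
cached_read = 1         # read satisfied from the OS page cache
spinning_disk = 10_000  # average uncached read on a spinning disk

# Cached case: the thread hop dominates, ~100x slowdown.
print((thread_overhead + cached_read) / cached_read)  # 101.0

# Uncached case: the main thread is blocked for only the dispatch
# overhead instead of the whole disk wait, ~100x less blocking.
print(spinning_disk / thread_overhead)  # 100.0
```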
If you want to perform general filesystem operations like creating and listing directories, renaming files, or checking file metadata – or if you just want a friendly way to work with filesystem paths – then you want trio.Path
. It's an asyncified replacement for the standard library's pathlib.Path
, and provides the same comprehensive set of operations.
For reading and writing to files and file-like objects, Trio also provides a mechanism for wrapping any synchronous file-like object into an asynchronous interface. If you have a trio.Path
object you can get one of these by calling its ~trio.Path.open
method; or if you know the file's name you can open it directly with trio.open_file
. Alternatively, if you already have an open file-like object, you can wrap it with trio.wrap_file
– one case where this is especially useful is to wrap io.BytesIO
or io.StringIO
when writing tests.
Path
open_file
wrap_file
Asynchronous file interface
Trio's asynchronous file objects have an interface that automatically adapts to the object being wrapped. Intuitively, you can mostly treat them like a regular file object, except adding an `await` in front of any of the methods that do I/O. The definition of "file object" is a little vague in Python though, so here are the details:
- Synchronous attributes/methods: if any of the following attributes or methods are present, then they're re-exported unchanged: `closed`, `encoding`, `errors`, `fileno`, `isatty`, `newlines`, `readable`, `seekable`, `writable`, `buffer`, `raw`, `line_buffering`, `closefd`, `name`, `mode`, `getvalue`, `getbuffer`.
- Async methods: if any of the following methods are present, then they're re-exported as an async method: `flush`, `read`, `read1`, `readall`, `readinto`, `readline`, `readlines`, `seek`, `tell`, `truncate`, `write`, `writelines`, `readinto1`, `peek`, `detach`.
Special notes:

- Async file objects implement Trio's `AsyncResource` interface: you close them by calling `aclose` instead of `close` (!!), and they can be used as async context managers. Like all `aclose` methods, the `aclose` method on async file objects is guaranteed to close the file before returning, even if it is cancelled or otherwise raises an error.
- Using the same async file object from multiple tasks simultaneously: because the async methods on async file objects are implemented using threads, it's only safe to call two of them at the same time from different tasks *if* the underlying synchronous file object is thread-safe. You should consult the documentation for the object you're wrapping. For objects returned from `trio.open_file` or `trio.Path.open`, it depends on whether you open the file in binary mode or text mode: binary mode files are task-safe/thread-safe, text mode files are not.
- Async file objects can be used as async iterators to iterate over the lines of the file:

  ```python
  async with await trio.open_file(...) as f:
      async for line in f:
          print(line)
  ```

- The `detach` method, if present, returns an async file object.
This should include all the attributes exposed by classes in io
. But if you're wrapping an object that has other attributes that aren't on the list above, then you can access them via the .wrapped
attribute:
wrapped
The underlying synchronous file object.
Trio provides support for spawning other programs as subprocesses, communicating with them via pipes, sending them signals, and waiting for them to exit.
Most of the time, this is done through our high-level interface, trio.run_process. It lets you either run a process to completion while optionally capturing the output, or else run it in a background task and interact with it while it's running:
trio.run_process
trio.Process
returncode
wait
poll
kill
terminate
send_signal
Note
`subprocess.Popen.communicate` is not provided as a method on `trio.Process` objects; call `trio.run_process` normally for simple capturing, or write the loop yourself if you have unusual needs. `subprocess.Popen.communicate` has quite unusual cancellation behavior in the standard library (on some platforms it spawns a background thread which continues to read from the child process even after the timeout has expired) and we wanted to provide an interface with fewer surprises.
If trio.run_process is too limiting, we also offer a low-level API, trio.lowlevel.open_process. For example, if you want to spawn a child process that will outlive the parent process and be orphaned, then ~trio.run_process can't do that, but ~trio.lowlevel.open_process can.
All of Trio's subprocess APIs accept the numerous keyword arguments used by the standard subprocess
module to control the environment in which a process starts and the mechanisms used for communicating with it. These may be passed wherever you see **options
in the documentation below. See the full list or just the frequently used ones in the subprocess
documentation. (You may need to import subprocess
in order to access constants such as PIPE
or DEVNULL
.)
Currently, Trio always uses unbuffered byte streams for communicating with a process, so it does not support the encoding
, errors
, universal_newlines
(alias text
), and bufsize
options.
The command to run and its arguments usually must be passed to Trio's subprocess APIs as a sequence of strings, where the first element in the sequence specifies the command to run and the remaining elements specify its arguments, one argument per element. This form is used because it avoids potential quoting pitfalls; for example, you can run ["cp", "-f", source_file, dest_file]
without worrying about whether source_file
or dest_file
contains spaces.
If you only run subprocesses without shell=True
and on UNIX, that's all you need to know about specifying the command. If you use shell=True
or run on Windows, you probably should read the rest of this section to be aware of potential pitfalls.
With shell=True
on UNIX, you must specify the command as a single string, which will be passed to the shell as if you'd entered it at an interactive prompt. The advantage of this option is that it lets you use shell features like pipes and redirection without writing code to handle them. For example, you can write Process("ls | grep some_string", shell=True)
. The disadvantage is that you must account for the shell's quoting rules, generally by wrapping in shlex.quote
any argument that might contain spaces, quotes, or other shell metacharacters. If you don't do that, your safe-looking f"ls | grep {some_string}"
might end in disaster when invoked with some_string = "foo; rm -rf /"
.
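Here's what that quoting looks like in practice:

```python
import shlex

some_string = "foo; rm -rf /"
# shlex.quote wraps the value in single quotes so the shell treats it
# as one literal word instead of executing `rm -rf /`.
cmd = f"ls | grep {shlex.quote(some_string)}"
print(cmd)  # ls | grep 'foo; rm -rf /'
```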
On Windows, the fundamental API for process spawning (the CreateProcess()
system call) takes a string, not a list, and it's actually up to the child process to decide how it wants to split that string into individual arguments. Since the C language specifies that main()
should take a list of arguments, most programs you encounter will follow the rules used by the Microsoft C/C++ runtime. subprocess.Popen
, and thus also Trio, uses these rules when it converts an argument sequence to a string, and they are documented alongside the subprocess
module. There is no documented Python standard library function that can directly perform that conversion, so even on Windows, you almost always want to pass an argument sequence rather than a string. But if the program you're spawning doesn't split its command line back into individual arguments in the standard way, you might need to pass a string to work around this. (Or you might just be out of luck: as far as I can tell, there's simply no way to pass an argument containing a double-quote to a Windows batch file.)
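For the curious: the standard library does ship an *undocumented* helper, `subprocess.list2cmdline`, which applies these MS C runtime quoting rules; this sketch shows the conversion it performs:

```python
import subprocess

# list2cmdline quotes any argument containing spaces according to the
# Microsoft C/C++ runtime rules when building a single command line.
args = ["cp", "-f", "my file.txt", "dest"]
print(subprocess.list2cmdline(args))  # cp -f "my file.txt" dest
```

Being undocumented, its behavior is not guaranteed across Python versions, which is one more reason to pass a sequence and let `subprocess` do the conversion.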
On Windows with shell=True
, things get even more chaotic. Now there are two separate sets of quoting rules applied, one by the Windows command shell CMD.EXE
and one by the process being spawned, and they're different. (And there's no shlex.quote
to save you: it uses UNIX-style quoting rules, even on Windows.) Most special characters interpreted by the shell &<>()^|
are not treated as special if the shell thinks they're inside double quotes, but %FOO%
environment variable substitutions still are, and the shell doesn't provide any way to write a double quote inside a double-quoted string. Outside double quotes, any character (including a double quote) can be escaped using a leading ^
. But since a pipeline is processed by running each command in the pipeline in a subshell, multiple layers of escaping can be needed:
```
echo ^^^&x | find "x" | find "x"    REM prints: &x
```
And if you combine pipelines with () grouping, you can need even more levels of escaping:
```
(echo ^^^^^^^&x | find "x") | find "x"    REM prints: &x
```
Since process creation takes a single argument string, `CMD.EXE`'s quoting does not influence word splitting, and double quotes are not removed during `CMD.EXE`'s expansion pass. Double quotes are troublesome because `CMD.EXE` handles them differently from the MSVC runtime rules; in:

```
prog.exe "foo \"bar\" baz"
```

the program will see one argument `foo "bar" baz`, but `CMD.EXE` thinks `bar\` is not quoted while `foo \` and `baz` are. All of this makes it a formidable task to reliably interpolate anything into a `shell=True` command line on Windows, and Trio falls back on the `subprocess` behavior: if you pass a sequence with `shell=True`, it's quoted in the same way as a sequence with `shell=False`, and had better not contain any shell metacharacters you weren't planning on.
Further reading:
- https://stackoverflow.com/questions/30620876/how-to-properly-escape-filenames-in-windows-cmd-exe
- https://stackoverflow.com/questions/4094699/how-does-the-windows-command-interpreter-cmd-exe-parse-scripts
trio
open_signal_receiver