Add sendfile() support #143
On 12/20/16, giampaolo ***@***.***> wrote:
On the other hand I see that curio explicitly promotes the usage of a thread
pool when it comes to dealing with files (`aopen()` etc.).
I'm unaware of Curio promoting any threaded approach for files; I've been
happily using aopen and friends in pure linear curio code. Can you provide
a reference for such promotion?
- if sendfile() fails on the first call (because fd is not a regular fd),
internally we fallback on using the plain `chunk = file.read(maxsize) ->
sock.sendall(chunk)` methodology.
This is a for/while loop with pure curio code, so no need to think about
side-stepping to a thread.
- add a `Socket.sendfile(fd, maxsize=65536)` method which by default tries
to send the file using sendfile()
- by default `Socket.sendfile` *does not* use a thread pool, but if the user
is worried about blocking he/she can optionally do so with `await
run_in_thread(s.sendfile, fd)`.
Native sendfile will block curio until it returns. Dumping an fd to a
Socket will take time, so side-stepping to a thread is needed.
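The plain fallback methodology quoted above can be sketched as a simple loop. This is a synchronous, stdlib-only sketch (the name `copy_file_to_socket` is invented here, not curio API); in curio the same loop would use `await f.read(...)` / `await sock.sendall(...)`:

```python
def copy_file_to_socket(file, sock, maxsize=65536):
    """Chunked fallback: chunk = file.read(maxsize) -> sock.sendall(chunk).

    Loops until EOF and returns the total number of bytes sent.
    """
    total = 0
    while True:
        chunk = file.read(maxsize)
        if not chunk:  # EOF
            break
        sock.sendall(chunk)
        total += len(chunk)
    return total
```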
I'm talking about: Line 48 in db2aa37
...which wraps all file methods (including open()) into run_in_thread(). The code may be linear but a thread is still involved. My proposal was to avoid that (run_in_thread) by default and leave that choice to the user, who can either do this if the disk is fast enough:
...or this if it's not:
My point is that modern disks should be fast enough so that the blocking time of a single `sendfile()` or `file.read()` call does not represent a problem for the main IO loop.
On 12/20/16, giampaolo ***@***.***> wrote:
My point is that modern disks should be fast enough so that the blocking
time of a single `sendfile()` or `file.read()` call does not represent a
problem for the main IO loop.
"Disks are fast" is a subjective declaration. I personally
would not like the idea of blocking the whole of Curio
when doing disk IO (including portions which have
nothing to do with disk IO).
And note that native sendfile will block while reading from disk
AND SENDING IT TO THE SOCKET. This will not give back
control until ALL fd is sent through the socket. That's a lot
of time. Till then you can't be doing anything.
Knowing this, indirectly you are tying this to threading:
if native sendfile is used, there aren't many choices
other than threads.
Or if you are happy with serving a single file at a time, then that's OK.
But note that if the receiving side hangs or is sluggish, your
whole server is bricked.
Just to clarify, when using sendfile() with a non-blocking socket, sendfile() does not send the whole file in one shot. It sends a chunk of it (typically 65k) even if you specify a bigger buffer size, then it returns the number of bytes sent (e.g. see pyftpdlib implementation). As to whether the read() part is "fast enough", I currently have no documentation to back this up, so that should indeed be investigated. My point was trying to provide an API flexible enough so that you can choose which approach to use (threads vs. non-threads), depending on how much you trust your disk speed.
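That chunked behaviour can be seen with a small stdlib-only experiment (POSIX-only; the name `sendfile_all` is invented here): `os.sendfile()` returns the number of bytes actually sent, and the caller is responsible for advancing the offset and retrying until the whole file has gone out.

```python
import os

def sendfile_all(sock, fd, filesize, blocksize=65536):
    """Send a regular file over a non-blocking socket with os.sendfile().

    Each call may send fewer than `blocksize` bytes; we advance `offset`
    by the number of bytes reported sent until the whole file is out.
    """
    offset = 0
    while offset < filesize:
        try:
            sent = os.sendfile(sock.fileno(), fd, offset, blocksize)
        except BlockingIOError:
            continue  # a real loop would wait for the socket to be writable
        if sent == 0:
            break  # unexpected EOF
        offset += sent
    return offset
```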
I would say that we don't know yet whether blocking file I/O is a good idea
for curio programs. Certainly in many cases it's possible to get away with
it, but it's clearly a bit dicey -- there are tons of situations where disk
access has much higher latency than network access. The only real reason we
think "disk = blocking, network = non-blocking" is because of unix APIs
inherited from decades ago when system architectures were very different.
OTOH we are stuck with those APIs in many cases, and running on systems
with overallocation and paging (-> which implies that memory access can be
silently converted into a blocking disk read), and thread pools add their
own overhead, so... it's an open question.
I guess one option would be to check whether the file passed to sendfile is
a sync or async file object? Not sure what you do with a raw fd then, or if
someone wants to implement their own non-blocking file io, but it would
handle the 95% case.
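A crude version of that check could look like this (a heuristic sketch; the name `is_async_file` and the duck-typing rule are assumptions, not curio API):

```python
import inspect

def is_async_file(obj):
    """Guess whether `obj` is an async file wrapper.

    Heuristic: async file objects expose `read()` as a coroutine
    function, while plain files do not. A raw integer fd has no `read`
    attribute at all, so it falls through to the blocking path.
    """
    read = getattr(obj, "read", None)
    return read is not None and inspect.iscoroutinefunction(read)
```

As noted above, this handles the common case but says nothing useful about raw fds or custom non-blocking file implementations.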
On 12/20/16, giampaolo ***@***.***> wrote:
> This will not give back control until ALL fd is sent through the socket.
Just to clarify, when using sendfile() with a non-blocking socket, sendfile()
does not send the whole file in one shot. It sends a chunk of it (typically
65k) even if you specify a bigger buffer size, then it returns the number of
bytes sent (e.g. see pyftpdlib implementation).
Oh. Sorry.. From the Python documentation:
os.sendfile(): Copy count bytes from file descriptor in to file descriptor out
starting at offset. RETURN THE NUMBER OF BYTES SENT.
It runs in chunks! Not as a bulk transfer. OK then..
It is a matter of how serious that blocking on a chunk is. You say it is
tolerable.. But the problem of blocking while reading the chunk and sending
it to the socket is still valid. The blocking problem is still there, just
for a smaller size.
It is, for sure; it should just be "tolerable most of the time". Or at least this is my experience with pyftpdlib running on SSD disks, where the sendfile() call shows acceptable timings (< 0.1 secs per sendfile() call) also under heavy loads. Unfortunately and astonishingly, a real solution for reliably doing kernel-supported non-blocking file IO under UNIX / Linux still does not exist (...and AIO sucks). It appears it should be possible on BSD via kqueue(). Funnily, Windows is the only one that managed to do this right.
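The "< 0.1 secs per call" kind of figure can be checked empirically with a trivial wrapper (a sketch; the name `timed_call` is invented here):

```python
import time

def timed_call(func, *args):
    """Run a potentially blocking call and report how long it blocked."""
    t0 = time.perf_counter()
    result = func(*args)
    return result, time.perf_counter() - t0
```

For example, `chunk, elapsed = timed_call(f.read, 65536)`: if `elapsed` stays well under your latency budget on the target disk, the blocking-by-default approach may be tolerable.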
Unfortunately and astonishingly a real solution for
reliably doing kernel-supported non-blocking file IO under UNIX / Linux
still does not exist (...and AIO sucks).
Linus' comment for AIO guys is something like this:
"Do something real; like reading and writing..." :))
https://lwn.net/Articles/671649/
It appears it should be possible on BSD via kqueue().
Funnily, Windows is the only one that managed to do this
right.
They designed the whole thing end to end.
We are trying to glue sync onto async here..
It should be a ground-up design, which is well beyond the
scope of Curio.
This used to bother me a lot more until I realized that from the kernel point of view, one-thread-per-I/O-request is basically isomorphic to what Windows is doing. Basically you have to store the temporary state involved in keeping track of a half-finished I/O operation somewhere. Windows does this explicitly with some sort of manually maintained data structures that it switches between manually; Linux uses the C stack. If you're a kernel, though, then the C stack itself basically is just another data structure that you switch between... so the "just spawn a kernel thread for each request" approach described in that LWN article is basically as good as anything.

How does this compare to a user-space threadpool? User-space threads are a little more expensive to context-switch between. And they take up user memory for their stack. But on a lazy-allocating system like Linux, this is only ~1 page per thread in practice, which is nothing for a reasonably sized thread pool. And they take up virtual address space, but on a 64-bit system there's plenty to spare.

For programs that are close to the metal and pushing the limits of the hardware, these kinds of small overheads do matter... but for Python programs I doubt the kernel/user-space thread distinction is important.
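For comparison, the user-space thread pool side of that trade-off is what the stdlib already provides (a sketch using `concurrent.futures`; curio's `run_in_thread` plays the analogous role in curio code):

```python
from concurrent.futures import ThreadPoolExecutor

# A small pool: per the discussion above, each idle thread costs roughly
# one lazily-allocated stack page plus some virtual address space.
pool = ThreadPoolExecutor(max_workers=4)

def read_in_thread(f, maxsize=65536):
    """Offload a blocking read; returns a Future instead of blocking."""
    return pool.submit(f.read, maxsize)
```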
I see this was closed. FWIW asyncio recently implemented sendfile support (including on Windows by using TransmitFile).
Mainly closed because of inactivity. Will look at the asyncio implementation. Open to pull requests as well.
Hello David,
would you be interested in a patch adding sendfile() support?
To read more about sendfile you can take a look at my python wrapper, which I successfully used in pyftpdlib. sendfile is also available as os.sendfile() starting from python 3.3.
Some considerations:

- sendfile() may block while reading the file from disk, pretty much like a plain `file.read()` call would. As such I'm not sure what's best to do. Having used sendfile() for years in pyftpdlib, my impression is that disk IO is usually fast enough to not represent an actual problem for the main IO loop. On the other hand I see that curio explicitly promotes the usage of a thread pool when it comes to dealing with files (`aopen()` etc.). So here's a possible proposal:
- add a `Socket.sendfile(fd, maxsize=65536)` method which by default tries to send the file using sendfile()
- if sendfile() fails on the first call (because fd is not a regular fd), internally we fall back on using the plain `chunk = file.read(maxsize) -> sock.sendall(chunk)` methodology. This mimics the semantic of socket.sendfile() which I contributed some time ago and which is currently not exposed in the `Socket` class.
- by default `Socket.sendfile` *does not* use a thread pool, but if the user is worried about blocking he/she can optionally do so with `await run_in_thread(s.sendfile, fd)`.

Thoughts?
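Putting the pieces together, the proposed behaviour might look roughly like this (a synchronous sketch under assumptions, not curio's actual implementation; the name `socket_sendfile` is invented here):

```python
import os

def socket_sendfile(sock, file, maxsize=65536):
    """Try os.sendfile() first; fall back to chunked read()+sendall().

    Falls back only if sendfile() fails before anything was sent
    (e.g. `file` has no real fd), mirroring the proposal above.
    """
    total = 0
    try:
        fd = file.fileno()
    except (AttributeError, OSError):
        fd = None  # no real fd: e.g. an in-memory file object
    if fd is not None:
        try:
            while True:
                sent = os.sendfile(sock.fileno(), fd, total, maxsize)
                if sent == 0:  # EOF
                    return total
                total += sent
        except OSError:
            if total:  # a partial transfer already happened: don't restart
                raise
    # plain fallback: chunk = file.read(maxsize) -> sock.sendall(chunk)
    while True:
        chunk = file.read(maxsize)
        if not chunk:
            return total
        sock.sendall(chunk)
        total += len(chunk)
```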