Non-blocking/Evented I/O #395

Closed
jnicholls opened this Issue Mar 24, 2015 · 127 comments

@jnicholls

Hyper would be a far more powerful client & server if it were based on traditional event-oriented I/O, single-threaded or multi-threaded. You should look into https://github.com/carllerche/mio or a wrapper around libuv or something of that sort.

Another option is to split hyper up into multiple crates, and refactor the client & server to abstract the HTTP protocol handling (reading & writing requests/responses onto a Stream) so that someone can use the client and/or server logic on top of their own sockets/streams that are polled in an event loop. Think, libcurl's multi interface + libuv's uv_poll_t.

@seanmonstar
Member

We agree. We're actively looking into it. Mio looks promising. We also need
a Windows library, and a wrapper combining the two.


@hoxnox
hoxnox commented Apr 23, 2015

Do you already have a vision of how to embed mio into hyper? I'm very interested in an async client and have enough time to contribute some code.

@seanmonstar
Member

I don't have a vision; I haven't looked that hard into how mio works. I'd love to hear suggestions.

@seanmonstar seanmonstar modified the milestone: Rust 1.0 Apr 27, 2015
@seanmonstar seanmonstar modified the milestone: Rust 1.0, 1.0 May 6, 2015
@jnicholls

mio will be adding Windows support in the near future, so depending upon it should be a safe bet.

The API surface of hyper's server will not have to change much, if at all, but the client will need an async interface, either in the form of closure callbacks, a trait handler, something like a Future or Promise return value, etc.

@dcsommer
dcsommer commented Jul 3, 2015

+1 for prioritizing a trait handler. Futures have some amount of overhead, and closure callbacks have even more overhead and can lead to callback hell. If the goal is maximum performance, an async handler interface would be a natural starting point.

@jnicholls

Yeah honestly a trait handler with monomorphization/static dispatch is the only way to go.
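
For illustration, a minimal sketch of what a statically dispatched handler interface could look like; all names here are hypothetical, not hyper's actual API:

// Hypothetical types standing in for hyper's; illustrative only.
struct Response { status: u16 }
struct Error;

// The compiler monomorphizes get() per handler type, so the callbacks are
// direct (inlinable) calls instead of boxed closures.
trait ResponseHandler {
    fn on_response(&mut self, res: Response);
    fn on_error(&mut self, err: Error);
}

fn get<H: ResponseHandler>(_url: &str, handler: &mut H) {
    // ... the event loop would drive the request and eventually call:
    handler.on_response(Response { status: 200 });
}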

@bfrog
Contributor
bfrog commented Jul 10, 2015

+1 for the async handler Trait

@talevy
talevy commented Jul 25, 2015

This is very much premature, but I figured any activity on this thread is a positive!

I have been playing around with what it would look like to write an asynchronous hyper client.

here it goes: https://github.com/talevy/tengas.

This has many things hardcoded, and is not "usable" by any means. Currently it does just enough
to get an event loop going, and allows doing basic GET requests and handling the response within a callback function.

I tried to re-use as many components of hyper as possible. Seems to work!

I had to re-implement HttpStream to use mio's TcpStream instead of the standard one.

I plan on making this more generic and slowly match the original hyper client capabilities.

Any feedback is welcome! Code is a slight mess because it is the first pass at this to make it work.

@seanmonstar
Member

I've been investigating mio support, and fitting it in was actually pretty simple (in a branch). I may continue the branch and include the support with a cargo feature flag, but I can't switch over completely until Windows support exists.

@jnicholls

A feature flag makes great sense in this case then. There are plenty of
people who would be able to take advantage of hyper + mio on *nix systems;
probably the vast majority of hyper users in fact.


@jdm
jdm commented Jul 27, 2015

Servo would be super interested in hyper + mio to reduce the thread bloat :)

@gobwas
gobwas commented Jul 27, 2015

hyper + mio looks very promising =) 👍

@bfrog
Contributor
bfrog commented Jul 27, 2015

I would assume there would be some number of threads with event loops handling http requests rather than one thread with one event loop?

@talevy
talevy commented Jul 27, 2015

@seanmonstar is this branch public somewhere?

@seanmonstar
Member

Not yet. It doesn't use an event loop yet; I simply switched out usage of
std::net with mio::tcp, which works fine for small requests that don't
block...


@bfrog
Contributor
bfrog commented Jul 27, 2015

If hyper can add that feature, I'd basically consider it usable for myself in production; otherwise it would probably cause a great deal of thread context switching for my use case (lots and lots and lots of short-lived connections).

@gobwas
gobwas commented Aug 6, 2015

By my own benchmarks with lots of HTTP connections, Rust will be the fastest option once it has async I/O: [benchmark chart]

@jnicholls

It is interesting how stable express, vanilla, and spray are in terms of
response times over time. I'm surprised nickel and iron are not equally as
stable; interestingly enough they both have the same shape, so my guess is
it's identical behavior on their primary dependency: hyper :)


@gobwas
gobwas commented Aug 6, 2015

@jnicholls fair enough 🍻

@tailhook
tailhook commented Aug 8, 2015

@seanmonstar

I don't have a vision; I haven't looked that hard into how mio works. I'd love to hear suggestions.

I have a vision. In short it boils down to splitting hyper into three logical parts:

  1. Types (Headers, Status, Version..., maybe some generic version of Request)
  2. Logic. For example, the function determining which HTTPReader is used should be decoupled from real streams. I.e. there should be an enum like HTTPReaderKind which is then turned into the current HTTPReader with a simple method like kind.with_stream(stream)
  3. And code handling real streams, with convenience Request objects implementing Read, buffering and so on.

The first item is basically ok, except maybe the types should go into a separate crate. But the logic is too coupled with streams. Decoupling it should also simplify testing, AFAIU.
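
A rough sketch of item 2, with hypothetical type names built around the kind.with_stream(stream) idea:

use std::io::Read;

// The body-framing decision is pure logic over the parsed head...
enum HttpReaderKind {
    Sized(u64), // Content-Length: n
    Chunked,    // Transfer-Encoding: chunked
    Eof,        // read until the connection closes
}

enum HttpReader<R: Read> {
    Sized(R, u64),
    Chunked(R),
    Eof(R),
}

impl HttpReaderKind {
    // ...and is only bound to a concrete stream at the edge:
    fn with_stream<R: Read>(self, stream: R) -> HttpReader<R> {
        match self {
            HttpReaderKind::Sized(n) => HttpReader::Sized(stream, n),
            HttpReaderKind::Chunked => HttpReader::Chunked(stream),
            HttpReaderKind::Eof => HttpReader::Eof(stream),
        }
    }
}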

Then we can do competing experimental asynchronous I/O implementations without rewriting too much of hyper. (I will publish my implementation soon.) The biggest question with mio right now is how to make things composable. I.e. you can't mix multiple applications in the same async loop until some better abstractions are implemented, so I'm currently experimenting with that.

How does this sound? What do I need to start contributing these changes to hyper?

@jnicholls

I agree that hyper should decouple the logic of composing and parsing HTTP
requests/responses from the actual I/O. This is what I alluded to in my
original request. Such a change would make it possible to run any kind of
I/O model (in-memory, blocking I/O, non-blocking I/O, etc.) and any
sub-variants thereof (unix readiness model, windows callback/IOCP model)
with any stack that a user would prefer to use (mio, curl multi-interface +
libuv, etc.)

That's a lot of freedom offered by simply splitting up the composition and
parsing logic from the I/O logic. I agree with Paul.


@seanmonstar
Member

That actually sounds quite feasible. I'll think more on the relationship between the 2nd and 3rd crates. But the first crate sounds simple enough: method, uri, status, version, and headers. Need a proper name, and to figure out the least annoying way to publish multiple crates at a time.

@jnicholls

If you do separate crates instead of modules, I would group #1 and #2 into
a crate, and #3 in a separate crate (http_proto & hyper, for example, where
hyper is the actual client/server I/O logic).

node.js put their http_parser into a separate project from the node.js project in a similar fashion.


@tailhook

Okay, I've just put some code for async HTTP handling online:
https://github.com/tailhook/rotor-http
It's not generally usable; I just put it here to encourage you to split up hyper. It uses Headers from hyper. And I would probably be better off helping to refactor hyper than rewriting the whole logic myself.

The "http_proto" name is probably good for crate that contains types and abstract HTTP protocol logic (like determining length of request body).

@seanmonstar
Member

I'd like to push a branch up with a mio feature. To start, I think hyper should be agnostic to what sort of IO abstraction is used, whether it's with callbacks, promises, streams, or whatever. To do that, I imagine this list is what I need to implement (still reading mio docs, so help would be appreciated):

  • Evented for server::{Request, Response, Server}
  • Evented for client::{Request, Response}
  • trait NetworkStream: Evented (+ others) {}

Hyper does do some reading and writing uncontrolled by the user, such as parsing a request head, before handing it to server::Handler. So perhaps internally hyper will need to pick a way to handle async reads/writes.
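
As a sketch, the transport bound might look something like this (Evented here is a stand-in for mio's registration trait, defined locally so the snippet is self-contained):

use std::io::{Read, Write};

// Stand-in for mio's Evented trait, just for the sketch.
trait Evented {}

// A transport hyper could drive from the event loop: it must be readable,
// writable, and registrable with the loop.
trait NetworkStream: Read + Write + Evented + Send {}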

@reem
Member
reem commented Aug 17, 2015

To be truly agnostic, we'd need to move the request head parsing logic into the public API, and have those reads only execute when the user asks for them. Otherwise, the user won't be able to use whatever event notification mechanism they want.

@reem
Member
reem commented Aug 17, 2015

Also, @seanmonstar I have some experience with mio, so if you have questions please ask.

@tailhook

I second @reem's opinion. You can't just implement Evented; it will not work. Also, it's expected that there will be an IOCP-based library for Windows that has a very different interface than mio.

@reem
Member
reem commented Aug 18, 2015

The secondary issue is ensuring that we don't do things like:

let mut buf = get_buf();
try!(req.read(&mut buf));
// process part of buf but don't save progress anywhere outside this call
try!(req.read(&mut buf)); // could yield WouldBlock, and we would lose the info from the first read
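
One way around this is to keep partially read bytes in state owned by the connection, so a WouldBlock never loses what earlier reads produced; a minimal sketch:

use std::io::{self, Read};

struct Conn<S> {
    stream: S,
    buf: Vec<u8>, // unparsed bytes survive across readable events
}

impl<S: Read> Conn<S> {
    fn on_readable(&mut self) -> io::Result<()> {
        let mut chunk = [0u8; 4096];
        loop {
            match self.stream.read(&mut chunk) {
                Ok(0) => return Ok(()), // EOF
                Ok(n) => self.buf.extend_from_slice(&chunk[..n]),
                // WouldBlock is fine: everything read so far is in self.buf
                Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => return Ok(()),
                Err(e) => return Err(e),
            }
        }
    }
}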
@seanmonstar seanmonstar self-assigned this Sep 8, 2015
@seanmonstar
Member

The adventurous can try out the mio branch. The 2 server examples work, but a ton is missing. Also, just to get things moving, I chose to use eventual to provide the asynchronous patterns.

Missing:

  • Keep alive
  • timeouts
  • the entire Client
@Ogeon
Contributor
Ogeon commented Sep 10, 2015

Cool stuff! I may feel adventurous enough to try this in a branch of Rustful. 😄 I have really been looking forward to this, so I would be happy to give it a spin.

By the way, I see that mio still doesn't seem to support Windows. Can Hyper still support Windows, without mio doing it?

@seanmonstar
Member

@Ogeon no, but alexcrichton has been working on Windows support in mio, so it's coming. https://github.com/carllerche/mio/commits/master?author=alexcrichton

@Ogeon
Contributor
Ogeon commented Sep 11, 2015

That's great! I'll probably not be able to stop myself from trying this within the coming days... I'll be in touch if I bump into any problems.

@seanmonstar
Member

@Ogeon I'm sure you will (bump into problems). :)

@tailhook

@seanmonstar, a few questions:

  1. From quick skimming, it looks like you made a mio-only version, rather than making mio support optional, right? Is that a generally accepted strategy?
  2. Quick benchmarking of hello.rs shows that it's much slower (<1k against 40k for the sync version), whereas my version and a coroutine-based version do the same order of magnitude of requests per second. Any ideas?
@seanmonstar
Member

@tailhook

  1. The strategies of using blocking io and an event loop are quite different, and so supporting both is complicated. It's a whole lot easier to implement a sync API using the event loop, by just blocking on the Future. Also, I'm currently just trying to get it working, and worrying about the rest after.
  2. How are you benchmarking? Also, so far, this version of the Server only uses 1 thread, whereas the Server in 0.6 uses threads that scale to your cores. Using more threads (and more event loops) could probably help. However, my super simple ab checking hasn't shown such a slowdown (though my test machine has 1 core, so it doesn't use multiple threads in the sync version).
@seanmonstar
Member

Update: the current mio branch is no longer using futures, and is seeing a significant performance improvement. My linux box has horrible specs, so I won't post benchmarks from it.

@bfrog
Contributor
bfrog commented Sep 22, 2015

I see an error when compiling the latest mio branch with rustc 1.3

~/s/hyper git:mio ❯❯❯ git rev-parse HEAD
c60cc831269d023b77f0013e6c919dbfefaf031d
~/s/hyper git:mio ❯❯❯ cargo build
   Compiling hyper v0.7.0-mio (file:///home/tburdick/src/hyper)
src/http/conn.rs:6:21: 6:23 error: expected `,`, found `as`
src/http/conn.rs:6 use http::h1::{self as http, Incoming, TryParse};
                                       ^~
Could not compile `hyper`.

To learn more, run the command again with --verbose.
~/s/hyper git:mio ❯❯❯
@seanmonstar
Member

Ah, whoops. I've been doing all this work on nightly, and that syntax is allowed on nightly, but not on stable yet. I'll try to push soon so that it builds on stable.

@tailhook

Update: the current mio branch is no longer using futures, and is seeing a significant performance improvement. My linux box has horrible specs, so I won't post benchmarks from it.

I can confirm that benchmarks are fine now.

I'm curious why the difference is so drastic without futures. Is it because of inlining, or because of the different structure of the code? Is it just the overhead of lambdas?

By the way, the code looks super-similar to what I've written about and am working on, and it would be nice to join the effort. So, have you seen that? Do you see any inherent flaws in what I'm doing? Or is the absence of documentation the main thing that stopped you from using my library?

@seanmonstar
Member

@tailhook

I'm curious why the difference is so drastic without futures. Is it because of inlining, or because of the different structure of the code? Is it just the overhead of lambdas?

I never profiled, so I can't say for certain. Some things I can guess about: eventual::Future has to store the callbacks as Box<Fn> internally, which means allocations (likely tiny) and dynamic dispatch (likely medium). Additionally, the core of the Future uses atomics to keep it in sync, which makes sense, but just wasn't necessary in the event loop, where it was all on a single thread.

This isn't to say Futures are bad, just that they aren't a cost-free abstraction, and the abstraction wasn't worth the cost in this case. I'm still considering exposing higher-level methods on Request and Response that use a Future, such as req.read(1024).and_then(|bytes| { ... }), which would have the Future tying into the events in the event loop.
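
To make that cost concrete, a toy sketch (not eventual's actual internals) of why a stored callback implies an allocation and dynamic dispatch:

// The future must erase the callback's type to store it, which costs a heap
// allocation and a vtable call; a monomorphized handler avoids both.
struct MiniFuture<T> {
    callback: Option<Box<dyn FnOnce(T)>>,
}

impl<T> MiniFuture<T> {
    fn then<F: FnOnce(T) + 'static>(mut self, f: F) -> Self {
        self.callback = Some(Box::new(f)); // allocation
        self
    }

    fn complete(mut self, value: T) {
        if let Some(cb) = self.callback.take() {
            cb(value); // dynamic dispatch
        }
    }
}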

By the way, the code looks super-similar to what I've written about and am working on, and it would be nice to join the effort. So, have you seen that? Do you see any inherent flaws in what I'm doing? Or is the absence of documentation the main thing that stopped you from using my library?

My main inspiration when writing the tick crate was Python's asyncio. I did see that rotor seemed to be in a similar vein, but I wrote tick for these reasons:

  • This branch has been all about fast prototypes to test ideas, and so I wanted to be able to tweak the event loop design as needed.
  • I wanted the ability to pause transports, which I see is a TODO in greedy_stream :)
  • The amount of generics I saw when trying to read rotor's source often left me confused.

I'm sure our efforts could be combined in this area. My main reason here was to be able to prototype while understanding what's going on internally, instead of needing to ask on IRC.

@tailhook

I wanted the ability to pause transports, which I see is a TODO in greedy_stream :)

Yes. It was done to keep the scope of the protocol smaller for quick experiments. Easy to fix. I need to get messaging between unrelated connections right; then I will make a non-greedy stream that is pausable and has an idle timeout.

The amount of generics I saw when trying to read rotor's source often left me confused.

Well, yes, I'm trying to build a library that allows you to combine multiple independent things to create an app (one of those things will be an HTTP library). So the problem is not an easy one, and requires some amount of generics. But it's not that much in user code. I'm also looking forward to moving some of the generics to associated types, to make the code simpler.

I'm sure our efforts could be combined in this area. My main reason here was to be able to prototype while understanding what's going on internally, instead of needing to ask on IRC.

Yes, that sounds reasonable. Let me know if I could be of any help.

@mlalic
Contributor
mlalic commented Sep 23, 2015

@seanmonstar

I've also been following how the mio branch has been unfolding and thinking about how it interacts with supporting HTTP/2, as well.

From the aspect of HTTP/2 support, the new approach in tick is definitely better than the previous future-based one. It boils down to the fact that it is now more or less explicit that a single event loop owns the connection, and that all events on a single connection (be they writes or reads) will therefore necessarily be serialized (as opposed to concurrent).


I would like to throw in just a remark on something that would need to be supported in some way by any async IO implementation that we end up going with here if it is to also back HTTP/2 connections efficiently.

Something that seems to be missing in both tick and rotor is the option to send messages to the protocol/event machine.

For example, in HTTP/2 we would like to be able to issue new requests on existing connections. This is, after all, one of the main selling points of HTTP/2! In order for a new request to be issued, we require unique access to the state of the connection. This is because issuing a new request always needs to update the state (as well as read it). Examples are deciding on the new stream's ID (and updating the next available ID), possibly modifying the flow control windows... Therefore, issuing the request must execute on the same event loop and by the Protocol/EventMachine, as that is what effectively owns the connection state.

Another example would be sending request body/data. This cannot be simply written out directly onto the socket like in the case of HTTP/1.1 for multiple reasons (flow control, framing, priority, etc.), all of which come down to the fact that writing a data chunk requires unique/mutable access to the state. Thus, each request should notify the protocol (which owns the HTTP/2 connection) that it has data to be written and then the protocol itself should decide when exactly to perform the write onto the underlying async socket...

This actually goes for writing out responses on the server side, as well, since from HTTP/2's point of view, the difference is quite negligible (both are considered outbound streams). Basically, the AsyncWriter that is there currently is insufficient to support HTTP/2.

As far as I can tell, the best way to do this would be to be able to dispatch a protocol-specific message onto the event loop, which when received and processed by the loop ends up notifying the Protocol (and passing the message onto it). The type of the message would ideally be an associated type of the Protocol to allow for different protocols having different custom-defined messages.

Of course, there might be a different way to achieve this, but for now I can't see what would be more efficient, given that there are operations in HTTP/2 which necessarily need unique/mutable access to the connection state, which would be owned by the event loop...
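
A minimal sketch of that shape (names are illustrative, not rotor's, tick's, or mio's API):

// The event loop owns the protocol state; other code communicates with it
// only by sending typed messages through the loop.
trait Protocol {
    type Message: Send; // protocol-specific, e.g. an HTTP/2 command
    fn on_readable(&mut self);
    fn on_message(&mut self, msg: Self::Message);
}

// For HTTP/2, messages that need unique access to the connection state:
enum Http2Message {
    StartRequest { path: String },
    BodyChunkReady { stream_id: u32 },
}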

I made a minimal prototype of this, just to verify that it would work, as well as to see what kind of changes would be required in solicit [1]. I don't want to get too involved in the exact specifics of the async IO implementation that we end up going with here (to avoid the whole "too many cooks..." situation), but it'd be good if these requirements could be considered already, to minimize any churn or duplication required to also support HTTP/2.

[1] It turns out that by only adding a couple of helper methods it all works out quite nicely already, given that solicit was not coupled to any concrete IO implementation. I'll put in the work to adapt the part in hyper once the async IO approach is closer to being decided on and finalized...

@tailhook

As far as I can tell, the best way to do this would be to be able to dispatch a protocol-specific message onto the event loop, which when received and processed by the loop ends up notifying the Protocol (and passing the message onto it).

Yes. All the same issues come up with websockets. This is a thing that I'm going to support in rotor, because the library is basically useless without the functionality (i.e. you can't implement a proxy). As I've said, it's my next priority.

However, flow control may be done another way. You could just have a Vec<Handler> and/or Vec<OutputStream> in the connection state. So the readiness handler can supply a data chunk to any handler, and can choose which stream to send data from when writing. It's easy to group state machines in rotor as long as they all share the same connection (and the same thread of execution).

@dcsommer

I'd like to just note that having a way for direct, 2-way communication
along the callback chain is very important for efficiency reasons. The
additional overhead of enqueueing events in the event loop rather than
executing them directly on a parent has been a performance bottleneck in
the past for async webserver code I've written in C++. Unfortunately, I
haven't yet seen a way to do this safely in Rust.


@tailhook

@dcsommer

I'd like to just note that having a way for direct, 2-way communication
along the callback chain is very important for efficiency reasons.
[ .. snip .. ]
Unfortunately, I haven't yet seen a way to do this safely in Rust.

If I understand you right, then there is a way in rotor. In the article there are two cases of communication:

  1. From parent to child, you just pass a value as an argument to the callback
  2. From child to parent, either you return a value (like in the body_finished callback), or an Option<State> like in almost every other example there (the latter is a form of communication too).

But in fact you may return a tuple if you need two things:

struct StreamSettings { pause_stream: bool }
trait RequestHandler: Sized {
    fn process_request(self) -> (StreamSettings, Option<Self>);
}

Or you might pass a mutable object:

trait RequestHandler: Sized {
    fn process_request(self, s: &mut StreamSettings) -> Option<Self>;
}

(the latter is used for the Transport object in rotor)

@dcsommer

@tailhook yeah, I read the article. It was really good, and I'm excited to see people take async IO seriously in Rust. My issue with point 2 is for the case where you aren't yet ready to perform a state transition. How can the child inform the parent of state transitions that don't originate with a call from the parent? For instance, what if your request handler has to perform some async operation to calculate the response?

@tailhook

@dcsommer, basically the parent needs to be prepared for that situation. And it's communicated either by return value (i.e. turning Some/None into NewState/Wait/Stop) or by transport.pause(). Which way to choose depends on which layer this is (or, in other words, whether the transport is passed down here or hidden in the layers below). I'll put an example in rotor soon.

Overall, I feel it's a little bit off-topic here. Feel free to open an issue on rotor itself.

@seanmonstar
Member

@tailhook actually, I think there could be some performance gains if some reading and writing directly to the stream could be overridden. (From tick's perspective).

use std::io::{self, Read, Write};

trait On<T: Read + Write> {
    fn on_readable(&mut self, socket: &mut T) -> io::Result<bool>;
    fn on_writable(&mut self, socket: &mut T) -> io::Result<()>;
}

I know in hyper, implementing this instead of Protocol::on_data would prevent a copy, since hyper still needs to parse the data as possibly chunked, and could skip the intermediary buffer that Protocol provides. Likewise when writing, since hyper may need to wrap the data in "chunks".

The cool part about all this is that I believe it can be contained in the event loop and hyper's http module, without affecting the user-facing API in Request/Response. It would just get faster.

@tailhook

@tailhook actually, I think there could be some performance gains if some reading and writing directly to the stream could be overridden

I have a few thoughts about that. It's not my top priority, but let me share a brain-dump:

  1. I think that it's possible to have InfiniBand or userspace TCP stack buffers in Transport (by changing the transport and event loop, but keeping Protocol the same). But in your example it's not. (However, this thesis should be confirmed.)
  2. It might be interesting to just move chunked encoding to a lower layer, i.e. to the transport itself, just like we usually do for encryption (which interacts with chunked encoding in some subtle ways too).
  3. The write optimization is probably useless without writev (i.e. sending multiple buffers at once to kernel space); see the sketch after this list.
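
For item 3: modern Rust exposes writev through Write::write_vectored, so the shape of the optimization can be sketched like this (write_chunk is a hypothetical helper, not rotor's or hyper's API):

use std::io::{self, IoSlice, Write};

fn write_chunk<W: Write>(sock: &mut W, body: &[u8]) -> io::Result<usize> {
    // chunk-size line, body, and trailing CRLF submitted as one writev call
    let head = format!("{:x}\r\n", body.len());
    let parts = [
        IoSlice::new(head.as_bytes()),
        IoSlice::new(body),
        IoSlice::new(b"\r\n"),
    ];
    sock.write_vectored(&parts)
}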

I'm also only getting the basics into rotor, so I'm trying to make protocol writers' lives easier for now. For protocols that need the last bits of performance, it's possible to "squash" two layers of abstraction. This is inherent to how rotor is designed.

@yazaddaruvala

Any updates? I'm really looking forward to this!

@tailhook

@yazaddaruvala, I've just published a follow-up article and a fundamental update to rotor. I'm going to build a few small applications with it and HTTP. Overall, it's not a well-solved problem in Rust, so it will take some time.

On the other hand we haven't agreed on any kind of collaboration with hyper. So I'm not sure if we will duplicate the work.

@arthurprs

@tailhook great post, did it show up on HN and /r/rust already?

@tailhook

@arthurprs, no not yet, according to Medium stats.

@arthurprs

@tailhook I'll help with the second one, then.

Also, I'm curious, what kind of performance do you get w/ Golang on the same machine?

@alubbe
alubbe commented Dec 9, 2015

Quick update: mio 0.5 is out and supports Windows.

@seanmonstar
Member

Yea! I'll be meeting with @carllerche tomorrow to discuss my WIP integration, and hope to have usable versions soon after.

@sorenhoyer

Can't wait!

@tailhook tailhook referenced this issue in canndrew/gnunet-rs Dec 14, 2015
Open

Non-blocking IO #1

@jnicholls

Looking forward to hearing more about your integration @seanmonstar.

@tailhook
tailhook commented Jan 2, 2016

Hi, I have a quick status update:

  1. The current master of rotor-http has most of the HTTP features implemented (server-side). Of course, they are largely untested, but I hope to add tests shortly.
  2. It uses only hyper::{version,status,method,header} from hyper. It would be great if those were a separate library.
  3. The rotor library itself is now super-small. It would be nice if we agreed on the interface and started building apps that can co-exist in the same main loop.
  4. Another article and more docs will be done soon. I just put this here in case anyone wants to take an early look.

P.S.: I've noticed that wrk may behave very slowly if you close a connection which should not be closed (e.g. one with no Connection: close header). I've not rechecked, but it may be the reason for the slowness of the test in the branch of hyper that was based on eventual io.

@KodrAus
KodrAus commented Jan 2, 2016

Sounds good! Are there plans for client side? I'd be interested to see what an evented model for outgoing requests will look like.

@tailhook
tailhook commented Jan 2, 2016

@KodrAus there is an example in rotor-stream. The full implementation of an HTTP client will eventually be in rotor-http too, but a DNS resolver is a prerequisite.

@alubbe
alubbe commented Jan 2, 2016
@Keats
Contributor
Keats commented Jan 2, 2016

@tailhook are you planning another article to sum up the whole thing now that you are happy with it?
I'm also interested in benchmark comparisons for the hyper and rotor master branches.

@seanmonstar
Member

@tailhook very cool! The rotor-http state machines look very similar to what I'm developing for hyper. One difference I have is giving direct read and write access to the underlying socket, since that's a requirement another use case has for eking out as much control as possible.


For hyper, I'd settled on an internal state machine, with callback-style Request and Response APIs to interact with it. You can see it in the current mio branch. The examples/hello.rs fares much better in benchmarks than master.

However, on my machine, it was still performing at around 60% of my ideal target (a super simple state machine that ignores HTTP semantics). I want hyper to be a low-level HTTP library for Rust. People shouldn't be skipping hyper because it's not fast enough. At a recent Mozilla work week, others expressed interest in building a reverse proxy that could actually compete with nginx. If hyper cannot help do that, then someone will just have to re-implement HTTP all over again, but with lower overhead.

This does mean the ergonomic API for Request and Response will have to become a little less ergonomic, but don't fret! I've prototyped similar APIs on top of this state machine approach, and it wasn't hard at all to do. Even blocking IO was quite simple to emulate, using threads and channels.
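
For instance, a blocking call can be emulated by parking the caller on a channel until the evented side completes; a toy sketch, not hyper's actual API:

use std::sync::mpsc;
use std::thread;

struct Response { status: u16 }

fn blocking_get(url: String) -> Response {
    let (tx, rx) = mpsc::channel();
    // Stand-in for handing the request off to the event-loop thread:
    thread::spawn(move || {
        let _ = url; // ... the async machinery would run here ...
        tx.send(Response { status: 200 }).unwrap();
    });
    rx.recv().unwrap() // block until the async side completes
}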

@tailhook
tailhook commented Jan 3, 2016

Okay, here is the third article, including some benchmarks vs golang, nginx, and hyper.

Hopefully, it will answer some questions here.


@seanmonstar

One difference I have is giving direct read and write access to the underlying socket, since that's a requirement another use case has for eking out as much control as possible.

The current rotor core library gives direct access to the socket. You can build on top of that. I'm not sure what the reason is, since the performance of rotor-http, which is built on top of rotor-stream (the latter does buffering and higher-level abstractions, including hiding away the sockets), is decent. The benchmarks are in the article.

Could you give some insight into the use case you are talking about? I'm looking forward to integrating sendfile() and memory-mapped files with rotor-stream. But I think this use case is quite niche (although it is probably required to compete with nginx).

However, on my machine, it was still performing at around 60% of my ideal target (a super simple state machine that ignores HTTP semantics).

Well, the linked script gives 600k requests per second on my laptop, compared to 65k for nginx. I don't think it's very useful to compete with something that pushes responses out as fast as connections are opened. Or are you saying that you can get 60% of that performance with normal HTTP request parsing?

At a recent Mozilla work week, others expressed interest in building a reverse proxy that could actually compete with nginx.

Sure, writing an HTTP implementation with performance competitive with nginx is my goal too.

Even blocking IO was quite simple to emulate, using threads and channels.

Well, I think it's possible to write a blocking client implementation that wraps the mio loop without threads and channels. And I doubt that blocking IO for servers is something useful. Although offloading work to a thread pool is simple indeed.

@seanmonstar
Member

@tailhook

The current rotor core library gives direct access to the socket.

Oh neat! I hadn't really noticed you split the code into rotor and rotor-stream. I'll have to take a look at the split. I had mostly looked through rotor-http when writing my last comment.

Could you give some insight into the use case you are talking about?

Yep. The WebPush team at Mozilla has to run servers where every single instance of Firefox must keep an open socket to the Push server. They want to reduce cost, and are thinking Rust can help do that. They want as little memory usage as possible for every connection, so that means controlling every allocation, including buffers used to read and write.

Well, the linked script gives 600k requests per second on my laptop, compared to 65k for nginx.

Ha, well then I'm sure part of it is the terrible machine I'm running these benchmarks on. I'm using a small Linode VM with 1 core and 1GB of RAM. When running the Tick example, I get around 18,000 requests per second. The hello server in hyper's mio branch gets me around 12,000. I'd love to bench on my desktop with all its cores, but I use Windows, and wrk won't run on that...

I doubt that blocking IO for servers is something useful.

I agree. Blocking servers are almost never useful. I've just seen people beg that the option still be there. And I meant also that it's not hard for a callback API, or Futures, or whatever, to be built on top.

@tailhook
tailhook commented Jan 3, 2016

Yep. The WebPush team at Mozilla has to run servers where every single instance of Firefox must keep an open socket to the Push server. They want to reduce cost, and are thinking Rust can help do that. They want as little memory usage as possible for every connection, so that means controlling every allocation, including buffers used to read and write.

Sure, keeping keep-alive connections with minimum overhead is my goal too. This is how netbuf (the buffer object that rotor-stream uses) is designed: it deallocates the buffer when there are no bytes left. This was a doubtful trade-off, but it looks okay in the benchmarks. AFAICS, in rotor-http there is nothing heap-allocated per state machine, except buffers.

The size of the state machine in hello-world is 288 bytes, which is probably not the smallest possible (you could probably get as small as 8 or 16 bytes), but is perhaps less than most current servers use. You also have the overhead of the Slab and timer slab, and probably a larger message queue, all of which are currently fixed-size in mio.

Anyway, according to a quick test, a hello world example configured for 1M connections takes about 350M RSS (379M of virtual memory, in case you suspect that something is not initialized yet). I haven't done real connections, though. I believe there is much more overhead in kernel space.

@alubbe
alubbe commented Jan 3, 2016
@jwilm
Contributor
jwilm commented Jan 14, 2016

@seanmonstar any ideas about schedule for this feature? Would love to know which pieces are still missing and if there are any opportunities to contribute.

Thanks!

@lilianmoraru

@tailhook Could you please also benchmark against Facebook's Proxygen?

@seanmonstar
Member

Here are the current hello.rs and server.rs examples in the wip branch. It doesn't look as elegant as a blocking, synchronous API, but it does give the performance. On my terrible build server, a wrk bench shows ~9% more requests per second than the branch using callbacks.

And again, it's possible to build higher level APIs on top of this. This just gives the performance to those who need it.

@KodrAus
KodrAus commented Jan 20, 2016

Awesome stuff. To me, it looks like what you'd expect working with mio at a lower level. Do you have plans for an evented client sample?

@ivanShagarov

This just gives the performance to those who need it.

Most people and organisations who want to move to Rust have only one goal - to improve performance :)

@tailhook tailhook referenced this issue in tailhook/rotor-http Jan 24, 2016
Closed

effective request uri #8

@seanmonstar
Member

I've (force) pushed to the mio branch, which has the server examples working. I've also been following rotor's development, and feel like it and my current branch are at a point where I could switch out the internal use of tick for rotor, with basically no change to the exposed API. This would just reduce duplicate effort in state machine development.

@alubbe
alubbe commented Jan 28, 2016

Thanks for the update, I was able to get the examples to work.
One thing I noticed is that the server cannot utilize more than one CPU core now. I assume this is because of mio. Are there any plans to build a master/cluster process on top, to run multiple, isolated event queues for the new hyper?

@tailhook

my current branch are at a point where I could switch out the internal use of tick for rotor, with basically no change to the exposed API. This would just reduce duplicate effort in state machine development.

Great. Should I cut a new release of rotor? I mean, do you have any outstanding questions/issues with the API, so I can release all API changes in one bulk?

One thing I noticed is that the server cannot utilize more than one CPU core now. I assume this is because of mio. Are there any plans to build a master/cluster process on top, to run multiple, isolated event queues for the new hyper?

You can easily run multiple event loops, each in its own thread. Here is an example in rotor:
https://github.com/tailhook/rotor-http/blob/7a24c516e30cdb6773584f465d4a8cccd8435fdc/examples/threaded.rs#L132

That's not hyper, but I believe you can do the same with @seanmonstar's branch.

@alubbe
alubbe commented Jan 28, 2016

That's pretty cool. So you would leave the implementation of how to distribute the load (e.g. round-robin) to the library consumer?

@tailhook

That's pretty cool. So you would leave the implementation of how to distribute the load (e.g. round-robin) to the library consumer?

I'm not sure I understand the question well. But you have several options for load distribution:

  1. In the example, the OS distributes sockets by letting each thread accept on its own; see the sketch after this list. It works well for tiny asynchronous tasks
  2. Another option is to use SO_REUSEPORT on Linux. AFAIU, Linux distributes connections equally (by a hash ring) between the sockets (which are equal to threads in the example)
  3. For more complex processing, you probably want to read the request in one of the IO threads, which are distributed by option (1) or (2), and send the job to worker threads running some MPMC queue (just an example)
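
A minimal sketch of option 1, using blocking std APIs for brevity (an evented server would run one event loop per thread instead):

use std::net::TcpListener;
use std::thread;

fn main() {
    let listener = TcpListener::bind("127.0.0.1:3000").unwrap();
    let workers: Vec<_> = (0..4)
        .map(|_| {
            let listener = listener.try_clone().unwrap();
            thread::spawn(move || {
                // The OS picks which blocked acceptor gets each connection.
                for stream in listener.incoming() {
                    let _stream = stream.unwrap();
                    // ... handle the connection on this thread ...
                }
            })
        })
        .collect();
    for w in workers {
        w.join().unwrap();
    }
}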
@seanmonstar
Member

@alubbe yes, hyper is probably going to stop guessing at how many threads to use, and instead let the user decide to run servers in as many threads as they'd like. I could add that to the hello.rs server example, I suppose.


Swapping in rotor was actually pretty quick work (besides the time I've spent reading the source of rotor to understand its concepts). As I said, it didn't change the Server API at all.

One thing that I noticed, but it could be just my terrible test machine: the swap to rotor meant my benchmark lost ~1-2% in requests per second. Maybe a proper production machine wouldn't notice this. If it did, I imagine the difference is that perhaps some function didn't get inlined, or something else minor that I'm sure we can fix. If you wish to look yourself, you can compare the current mio branch with the current mio-rotor branch.

@bfrog
Contributor
bfrog commented Feb 1, 2016

@seanmonstar the common usage of a state machine framework like rotor is a nice touch

Excited for the day I can put this to work!

๐Ÿป

@tailhook
tailhook commented Feb 2, 2016

Okay, I've done some quick tests: https://gist.github.com/tailhook/f1174e1a3e8b340d1e1f
In short:

  • hyper-mio: 63450.68/63454.14/65230.27
  • hyper-rotor: 66627.13/65829.25/67861.90
  • rotor-http: 65600.74/66425.12/64309.17/69134.65

It's on an i5-3230M CPU @ 2.60GHz / Linux 4.3.3; versions of the libraries are in the gist.

The tests were run once for each version, then the next round. Also, I've discarded some outliers with much lower RPS (all examples sometimes generated 61-62k). At the end of the day, I would say that the variability of the values is greater than the difference, and I'm not sure I've captured the fastest samples. Anyway, it doesn't look slower. Maybe I'll try to find some time to run a non-laptop test (tests on laptops are always unreliable because of power saving).

In the meantime I've published rotor 0.5.0 on crates.io.

besides the time I've spent reading the source of rotor to understand it's concepts

Any hints to start with? I know I need a lot more documentation, but maybe some quick pointers: should I rather make a tutorial, or fill in more comprehensive coverage of the API? Are standalone examples good enough?

@lilianmoraru I've seen your request to test proxygen, but I can't find time to get to it. It may help if you create a vagga.yaml, so it would be easy for me to test.

@dashed
dashed commented Feb 2, 2016

@tailhook A state diagram to visualize the rotor API might be a good start. On earlier iterations of rotor, I had to draw out state diagrams to grok the API.

@seanmonstar
Member

Excellent. Like I said, it's probably that my test Linux box is complete
junk.

Well, I couldn't find the docs hosted, so I had to resort to the source.
But, even with docs, I like reading source. I'm a weirdo who likes to pick
modules from the rust repo for some good-night reading in bed.


@lilianmoraru

@tailhook Here is a "basic" configuration: gist. It compiles proxygen and rotor, but the examples of course need to be modified, because they do not do the same thing logic-wise.

@tailhook
tailhook commented Feb 2, 2016

Thanks @lilianmoraru. So here is a test with a response having a similar number of bytes (more details: https://git.io/vgIpo):

  • rotor-http: 57k (requests per second)
  • proxygen: 28k

From quick skimming, it looks like proxygen accepts a connection on one thread and hands it off to another thread for processing (is it even async?), which explains why performance is 2x slower on this microbenchmark (i.e. 2 thread wake-ups instead of one).

Proxygen was compiled with the default options, which include -O3 as far as I can see.

@debris debris referenced this issue in ethcore/parity Feb 25, 2016
Closed

Fix rpc server dependency loop #513

@yazaddaruvala

Hey @seanmonstar, good read: http://seanmonstar.com/post/141495445652/async-hyper. Thanks for the update, and all the hard work!

Next::end() covers the situation where the server wants to end the connection. However, one thing that isn't explicitly obvious is how the server handles a connection that gets closed prematurely from the client's side.

Would this be handleable in on_request_readable during match transport.read(...)? Or would (should) Handler need an on_request_closed?

@seanmonstar
Member

@yazaddaruvala there are 2 ways that may appear. If it occurs while waiting for readable or writable, then the event loop will notice, and hyper will call Handler::on_error(Error::Io(e)). If the socket has already triggered a readable or writable event, and thus the handler is trying to read or write, then that will likely trigger an io::Error during the read or write.

@Ogeon
Contributor
Ogeon commented Mar 23, 2016

Cool! Looks like it's time to give this another try. Just one question:

Is there a reason for making the Request components private, besides preventing accidental changes? Are there any side effects from them being changed? I ask because this prevents me from repackaging them in the Rustful Context type without making references (which may prevent even more things), and it makes it impossible to write pre-routing filters that actually modify the request.

I'm somewhat prepared to lessen the power of the pre-routing filters, since I'm not even sure how useful they are, but I'm still not much of a fan of doing it. The filters would then only be able to observe and abort.

The request problem could be worked around by including the request as it is, instead of splitting it, and maybe adding accessors as shortcuts. The host name+port injection would also have to change. I may have missed it, but I don't think Hyper does this, and it's sometimes a useful header. It could, of course, live outside the Request, but that would be a bit awkward.

All I would need is a way to destructure the Request, and it doesn't matter if it's a method. That worked fine for the previous version of Response. Second best would be mutable accessors, but that would somewhat defeat the purpose of keeping everything immutable.

Anyway, nice to see that this is almost complete. 👍

@seanmonstar
Member

@Ogeon there were no side effects to them being changed; I've just seen people in IRC make changes to the Request when they meant to change the Response, and they were left confused about why nothing happened. The change to make the fields private was to prevent people from doing things that did nothing.

I'm surprised you need to inject the Host header into requests. HTTP/1.1 is pretty clear that clients must include the Host header.

Perhaps a deconstruct method would be warranted, one that returns a tuple of the internal fields. That should make it clear that modifying them won't do anything, but still allow you to do so if you really want.

@seanmonstar seanmonstar modified the milestone: Async IO, 1.0 Mar 24, 2016
@Ogeon
Contributor
Ogeon commented Mar 24, 2016

@seanmonstar Yeah, I think the decision to make them immutable was a good idea from that perspective. A deconstruct method is all I need, and it would be great if it was added. The critical part is the Headers struct, which may become expensive to clone. The URI is parsed and cloned at the same time, and the method should be cheap enough to clone in the most common situation. The best would be if I didn't have to clone anything. 😄

I have, by the way, started to port everything to the new system and it's working quite well so far. I have just managed to make the main functionality work, and the simplest examples are compiling. The only thing I really haven't figured out a good solution for, at the moment, is file uploading. I'm not yet sure how to coordinate the file reading and sending in a good way that doesn't clog everything up. I'll probably figure something out after some sleep.

I'm surprised you need to inject the Host header into requests. HTTP/1.1 is pretty clear that clients must include the Host header.

I should look into that again. I don't remember why I added that part (I rediscovered it when I wrote that comment), so it may actually be unnecessary. Does Hyper check that the Host header exists?

Oh, and another quick question: What happens with the request body if the handler decides to return Next::end() before reading it? Will Hyper take care of it automatically?

@seanmonstar
Member

@Ogeon hyper does not check for the Host header.

What happens with the request body if the handler decides to return Next::end() before reading it? Will Hyper take care of it automatically?

If the Handler ends, no more reading is done. (Reading only happens when the handler calls decoder.read() anyway.) If there are still bytes waiting (determined by whether there was a Content-Length header, or, if chunked, whether the 0\r\n\r\n sequence was already read), then the socket will not be used for keep-alive.
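
So a handler that wants the connection kept alive must drain any unread body itself. An abstracted sketch: the Next variants below are stand-ins for the branch's Next::read()/Next::write()/Next::end(), and any Read impl stands in for the Decoder.

use std::io::{self, Read};

enum Next {
    Again,   // stand-in for Next::read(): wait for more body
    Respond, // stand-in for Next::write(): body done, write the response
    End,     // stand-in for Next::end()
}

fn drain_body<R: Read>(decoder: &mut R) -> Next {
    let mut buf = [0u8; 4096];
    match decoder.read(&mut buf) {
        Ok(0) => Next::Respond, // body fully read; socket stays keep-alive
        Ok(_) => Next::Again,   // discard the bytes and keep draining
        Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => Next::Again,
        Err(_) => Next::End,
    }
}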

@Ogeon
Contributor
Ogeon commented Mar 26, 2016

@seanmonstar Alright, so I take it that I have to read the body to make sure the socket is reused for keep-alive. That's good to know 👍 Thanks!

@matt2xu
matt2xu commented Mar 26, 2016

It seems to me that this is the same behavior as the current (blocking) version; at least that's what I observed empirically: if your server handles a POST request but does not read its body, the method field of the subsequent POST request is "body_of_previous_requestPOST". Not sure if this should be considered a bug or not, so I implemented Drop for my Request so it reads the body in case it has not been read before.

@Ogeon
Contributor
Ogeon commented Mar 26, 2016

I got the impression that it was done automatically in 0.8.x and earlier, but I may be wrong. It would be nice if the responsibilities of the handler were documented, to make it clearer what's automatic and what's not. I asked because some of the examples skip the body.

@seanmonstar
Member

@Ogeon hyper has not automatically read the body. The problem with that is what to do when someone POSTs a 4GB file to your server. If hyper were to automatically read the whole body, that'd block the server for a while, and waste bandwidth. Deciding on a proper maximum size to automatically read is difficult, so hyper has so far chosen that all reading must be done by the user.

@Ogeon
Contributor
Ogeon commented Mar 26, 2016

@seanmonstar Looks like I misunderstood it, then 😄 Probably a leftover assumption from rust-http, which used to include the body in the request.

@Ogeon
Contributor
Ogeon commented Mar 26, 2016

@seanmonstar I have bumped into a problem with zero-sized response bodies. I don't know if it's intentional, but it seems like I have to call encoder.write(...) at least once for the response to be properly sent. It will otherwise result in "empty response" errors.

Do you want me to file separate issues for things like this?

@seanmonstar
Member

Have you updated recently? I hit a similar error this morning, and fixed it
(at least in my case), and pushed.


@Ogeon
Contributor
Ogeon commented Mar 26, 2016

I hadn't, but I have now. Part of the problem is still there. The general state flow is as follows:

  1. In on_request -> Next::wait,
  2. immediately wake up and go into write,
  3. in on_response -> Next::write,
  4. do nothing in on_response_writable, since body length is 0, and -> Next::end

Skipping step 4 and calling Next::end in step 3 seems to work after the update. Calling encoder.write with an empty slice in step 4 causes the headers to be sent, and the value returned from write is WouldBlock.

@seanmonstar
Member

@Ogeon I believe I've not fixed this on the branch.

@Ogeon
Contributor
Ogeon commented Mar 29, 2016

Yeah, it's still there. Another thing I've noticed is that wrk reports read errors. I haven't done any deeper investigation into those, though, but I guess you have seen them as well.

@seanmonstar
Member

@Ogeon As I should have done, I've added a test in the server tests for that exact instance, and I can now say it is fixed.

As for read errors, you mean errors printed by env_logger from hyper (that the socket closed?), or the read errors reported directly from wrk? The former, I've fixed (it was a useful error log when first working on the branch). The latter, I haven't seen in a long time.

@Ogeon
Contributor
Ogeon commented Mar 31, 2016

@Ogeon As I should have done, I've added a test in the server tests for that exact instance, and I can now say it is fixed.

Ah, nice! I can confirm that it works now.

As for read errors, you mean errors printed by env_logger from hyper (that the socket closed?), or the read errors reported directly from wrk? The former, I've fixed (it was a useful error log when first working on the branch). The latter, I haven't seen in a long time.

It's directly reported from wrk. It can look like this for my hello_world example, after a run with 456888 requests:

Socket errors: connect 0, read 456873, write 0, timeout 0

It looks like it's not one for each request, so I'm not really sure what triggers them. Could it be something I have missed when implementing it?

@seanmonstar
Member

Hm, read errors from wrk are reported each time 'read(fd)' returns -1. Does
that mean the sockets may be closing before having written the end of the
response?

I don't see this in the hello world server example in hyper. What response
are you writing? Headers and body?


@Ogeon
Contributor
Ogeon commented Mar 31, 2016

It's the hello_world example from Rustful, and it's the same for other examples there as well. It should just write "Hello, stranger!", and the headers should just be date, content length, content type, and server name. It looks fine in the browser, which is why I'm a bit confused. I tried some of the Hyper examples and they didn't generate the errors, so there must be something about how I'm doing it. I'll investigate more later and see what I can find.

@Ogeon
Contributor
Ogeon commented Apr 1, 2016

Ok, I found something. The "simplified" handler goes into wait mode if no body should be read, and waits for a signal that tells it that it's time to start writing the response. Changing this to make it immediately go into write mode (which will mess with the flow, but should work for hello_world) makes the errors disappear.

I tried to replicate the hello example from Hyper, using the more advanced handler method. It did not call Next::wait and it did not cause any errors. I changed it to call Next::wait, and made it call control.ready(Next::write()) from a thread, and that made the errors appear again.
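In code, the reproduction looks roughly like this (a sketch only; self.control is assumed to be the hyper Control handle given to the handler when it was created, and the exact return type of ready is glossed over):

impl Handler<HttpStream> for HelloWorld {
    fn on_request(&mut self, _request: Request) -> Next {
        if let Some(control) = self.control.take() {
            // Wake the handler up from another thread, as described above.
            std::thread::spawn(move || {
                let _ = control.ready(Next::write());
            });
        }
        Next::wait()
    }

    // on_response / on_response_writable as in the normal hello example
}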

What do you think?

@seanmonstar
Member

@Ogeon hm, I am able to reproduce if I wait() and spawn a thread to call ready(), as you say. When I make a tiny Rust program to open a TCP stream, write a request, and read a response, I don't get any I/O errors, or reads of 0. Can you see anything else here that would make wrk goto error: https://github.com/wg/wrk/blob/master/src/wrk.c#L426-L448

@Ogeon
Contributor
Ogeon commented Apr 2, 2016

Hard to say. The error conditions seem to be either a network error (return -1), a 0 read before finishing the response, or some failure in http_parser_execute. There shouldn't be anything to read (unless the errors appear when data is finally received), so the only logical failures would be the network error or the 0 read, given that http_parser_execute returns the number of parsed bytes. It's quite complex (https://github.com/wg/wrk/blob/master/src/http_parser.c#L627), though, so it's hard to say exactly what happens. Either way, 0 reads will still cause problems.

If it does read something, then the problem should be caused by something in http_parser_execute. It's huge and has many jumps to error (https://github.com/wg/wrk/blob/master/src/http_parser.c#L2069), but it looks like it would report those errors to the user.

I tried to pass the --timeout 10s option, but that didn't make a difference. That should show up in the timeout counter instead, so I didn't really expect anything.

@seanmonstar
Member

It looks like they forgot to check errno in sock_read to return RETRY, and so the EAGAIN error is being reported as a read error.
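In Rust terms, the missing check amounts to treating EAGAIN (io::ErrorKind::WouldBlock) as "no data yet, retry later" rather than as a failed read; a minimal illustration:

use std::io::{self, Read};

// A non-blocking read should treat EAGAIN/EWOULDBLOCK as "retry later",
// not as a read error -- the check wrk's sock_read skips.
fn read_nonblocking<R: Read>(stream: &mut R, buf: &mut [u8]) -> io::Result<Option<usize>> {
    match stream.read(buf) {
        Ok(n) => Ok(Some(n)), // n == 0 means EOF
        Err(ref e) if e.kind() == io::ErrorKind::WouldBlock => Ok(None), // retry later
        Err(e) => Err(e), // a genuine read error
    }
}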

@Ogeon
Contributor
Ogeon commented Apr 2, 2016

Hmm, yeah, looks like that's the case. It's also static, so I guess it won't be replaced by another function.

@bfrog
Contributor
bfrog commented Apr 4, 2016

I've noticed that the latency with the mio branch is much higher, and its variation larger, than what's currently on master. I ran cargo run --release --example server in each branch, then ran wrk against it.

Mio

Running 30s test @ http://127.0.0.1:1337/
  10 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    22.23ms   34.30ms   1.03s    99.67%
    Req/Sec     4.76k   693.41     8.08k    88.07%
  1421686 requests in 30.06s, 122.02MB read
Requests/sec:  47290.16
Transfer/sec:      4.06MB
wrk -c 1000 -t 10 -d 30s "http://127.0.0.1:1337/"  2.80s user 11.56s system 47% cpu 30.111 total

Master

Running 30s test @ http://127.0.0.1:1337/
  10 threads and 1000 connections
  Thread Stats   Avg      Stdev     Max   +/- Stdev
    Latency    54.80us   93.95us  22.16ms   98.91%
    Req/Sec    84.24k    15.81k   97.11k    78.33%
  2512893 requests in 30.07s, 215.68MB read
Requests/sec:  83560.27
Transfer/sec:      7.17MB
wrk -c 1000 -t 10 -d 30s "http://127.0.0.1:1337/"  5.07s user 22.54s system 91% cpu 30.122 total

Can anyone else confirm this? Is this a known issue?

@arthurprs

The mio branch's CPU usage is about half of the threaded version's. That might be related.

@seanmonstar
Member

It may help to edit the async example to spawn a server in several threads, so as to compare with the master example, which is using 1.25 threads per CPU.
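Something along these lines, with run_server standing in as a hypothetical helper for whatever starts one event-loop server on the branch:

use std::thread;

// Hypothetical stand-in for starting one single-threaded event-loop server;
// in practice the threads would need to share a listener (or use
// SO_REUSEPORT) rather than each bind the same address.
fn run_server(addr: &str) {
    let _ = addr; // build and run the server here
}

fn main() {
    let handles: Vec<_> = (0..4) // e.g. one server per core
        .map(|_| thread::spawn(|| run_server("127.0.0.1:1337")))
        .collect();
    for handle in handles {
        handle.join().unwrap();
    }
}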

@Ogeon
Contributor
Ogeon commented Apr 11, 2016

I just noticed that the way Control::ready behaves has changed. It's no longer possible to call it from on_request and then make on_request return Next::wait() (small example). It won't wake up after on_request. Is this a bug?

@seanmonstar
Member

@Ogeon It was indeed a bug. I added an AtomicBool to reduce traffic on mio's bounded queue (to reduce errors from it being full), but had an incorrect if check. Fixed.
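The idea, roughly (names hypothetical, not the branch's actual code): only enqueue a wakeup if one isn't already pending, so the bounded notify queue can't fill up with duplicates. Getting the check backwards is exactly the kind of bug described above.

use std::sync::atomic::{AtomicBool, Ordering};

struct WakeupFlag {
    queued: AtomicBool,
}

impl WakeupFlag {
    // Returns true if the caller should push a wakeup onto the queue now.
    fn try_queue(&self) -> bool {
        // swap returns the previous value: false means nothing was pending.
        !self.queued.swap(true, Ordering::AcqRel)
    }

    // Called when the event loop drains the wakeup.
    fn consumed(&self) {
        self.queued.store(false, Ordering::Release);
    }
}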

@Ogeon
Contributor
Ogeon commented Apr 11, 2016

Alright, nice! ๐Ÿ˜„ It works just as before, now.

@Ogeon
Contributor
Ogeon commented Apr 13, 2016

I've been trying to re-enable OpenSSL support in Rustful today, and it went alright, except when it came to dealing with the different Transport types. The fact that many parts of the API depend on it gives me three alternatives:

  1. Make almost everything in Rustful generic over Transport. This would work if it didn't cause a bunch of coherence problems in some important parts.
  2. Make Http/Https enums for Encoder and Decoder. This would work if the Transport type for HTTPS was public. I tried with <Openssl as Ssl>::Stream, as well, but it looks like it counts as a type parameter and caused conflicting implementations of From.
  3. Mask Encoder<T> and Decoder<T> as Read and Write. My current solution is a variant of this, but it's not so nice, and won't work well if more than simple read/write functionality is ever expected.

It would be nice if the Transport type could be made public for the OpenSSL case, allowing solution 2 (sketched below), or if both HTTP and HTTPS used the same type (I saw HttpsStream in there, but I guess that's something else), allowing simple type aliases for Encoder and Decoder. A completely different solution is, of course, also welcome.
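For alternative 2, the shape would be roughly this, assuming Encoder, HttpStream, and OpensslStream were all public (which is what the request above is about), and that Encoder implements Write:

use std::io::{self, Write};

enum AnyEncoder<'a> {
    Http(&'a mut Encoder<HttpStream>),
    Https(&'a mut Encoder<OpensslStream<HttpStream>>),
}

impl<'a> Write for AnyEncoder<'a> {
    fn write(&mut self, buf: &[u8]) -> io::Result<usize> {
        match *self {
            AnyEncoder::Http(ref mut e) => e.write(buf),
            AnyEncoder::Https(ref mut e) => e.write(buf),
        }
    }

    fn flush(&mut self) -> io::Result<()> {
        match *self {
            AnyEncoder::Http(ref mut e) => e.flush(),
            AnyEncoder::Https(ref mut e) => e.flush(),
        }
    }
}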

@seanmonstar
Member

@Ogeon ah, I hadn't noticed that the OpensslStream type wasn't being exported. Just doing that should be enough, right? Then for HTTPS (if you wish to utilize only OpenSSL), you could use HttpsStream<OpensslStream<HttpStream>>.

@Ogeon
Contributor
Ogeon commented Apr 13, 2016

That should do it, as far as I can tell. It did look like it was just OpensslStream<HttpStream>, though that could also have been some misleading error messages.

@erikjohnston

Ignore me if you've already done this, but if you're using non-blocking sockets with OpenSSL, you can't just use the io::{Read, Write} traits; you need to port the code to use ssl_read and ssl_write [1] instead. This is because OpenSSL can return special error codes indicating that the code must call read while it's currently writing, or write while it's currently reading. (Or at least that's my understanding.)

If you don't, it will work most of the time and then occasionally fail.

[1] http://sfackler.github.io/rust-openssl/doc/v0.7.9/openssl/ssl/struct.SslStream.html#method.ssl_read

@seanmonstar
Member

@erikjohnston yep, I know about it. I haven't handled it yet, as I slowed down trying to design a generic way for the Transport trait to report it, and moved on to other pieces that were missing. I won't ship the version without getting that in place, however.


Basically, my initial thoughts were something like this:

trait Transport {
    // ... 
    fn blocked(&self) -> Option<Blocked> {
        // default implementations assume nothing special
        None
    }
}

enum Blocked {
    Read,
    Write,
}

The blocked method could be checked by the event loop when deciding what events to listen for. If the result is Some, the event will be added to the event list.

So an implementation for openssl could do something like this:

struct OpensslStream {
    stream: openssl::SslStream<HttpStream>,
    blocked: Option<Blocked>,
}

impl Read for OpensslStream {
    fn read(&mut self, buf: &mut [u8]) -> io::Result<usize> {
        // Clear any direction recorded by a previous call.
        self.blocked = None;
        match self.stream.ssl_read(buf) {
            Ok(n) => Ok(n),
            Err(openssl::Error::WantWrite(io)) => {
                // TLS needs the socket to become writable before this
                // read can make progress; record that for the event loop.
                self.blocked = Some(Blocked::Write);
                Err(io)
            },
            Err(openssl::Error::WantRead(io)) => {
                self.blocked = Some(Blocked::Read);
                Err(io)
            },
            // ...
        }
    }
}

Does this seem like it would work for any other transport implementation, like schannel or security-framework? @sfackler @frewsxcv

@sfackler
Contributor

That should work, yeah, but it's unfortunate that there's that out-of-band signaling of what you're blocked on. Maybe there should be a tweaked Read trait that could return the proper information?
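One possible shape for that suggestion (entirely hypothetical, just to make it concrete): carry the blocked direction in band through the error type instead of a separate blocked() query.

use std::io;

enum TransportError {
    WantRead,      // must wait for readability before retrying
    WantWrite,     // must wait for writability before retrying
    Io(io::Error), // a real I/O error
}

trait TransportRead {
    fn read(&mut self, buf: &mut [u8]) -> Result<usize, TransportError>;
}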

@seanmonstar
Member

@sfackler well, the stream is presented to the Handler as T: Read or T: Write, so user code would be using those traits. This design was to hopefully remove the need for a Handler to notice whether the underlying stream protocol needed to read, even if the user just wants to Next::write() a bunch.

@Ogeon
Contributor
Ogeon commented Apr 13, 2016

@seanmonstar I could implement the nicer solution no. 2 now that OpensslStream is public, so all is well. 😄 It's an OK solution. Also, I had to use Encoder<OpensslStream<HttpStream>> and Decoder<OpensslStream<HttpStream>>, as I vaguely mentioned before. I'm not sure if that was what you actually meant.

@seanmonstar seanmonstar closed this in #778 May 16, 2016