Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Adds a persist flag in FileDescriptorEvent. Fixes #695 #763

Closed
wants to merge 3 commits into from

Conversation

oglu
Copy link

@oglu oglu commented Aug 4, 2014

This fixes the high cpu usage when using FileDescriptorEvent.

It adds a persist flag (true by default, so as to preserve the original behavior) that controls whether the event is persisted with libevent or reattached after each callback.

I found it useful with zmq and createFileDescriptor

@s-ludwig
Copy link
Member

What I wonder is if we actually need to keep backwards compatibility. Since wait is the only interface to do something based on the activation state, it shouldn't make a difference to the user (apart from possibly high CPU usage) unless I'm missing something.

@oglu
Copy link
Author

oglu commented Sep 18, 2014

The flag (and the difference between persisted/non persisted event) has mainly to do with edge-triggered events. What I found with zmq was that when the socket is ready for read it will keep firing the event, peaking the cpu.

The patch was written that way to keep the option open between level triggered and edge triggered events. The backwards compatibility preservation is mostly a side-effect, chosen so as not to disrupt other codebases.

@s-ludwig
Copy link
Member

Right, but would the user of a FileDescriptorEvent ever be able to observe the difference between the two modes? Apart from looking at the CPU load that is.

@oglu
Copy link
Author

oglu commented Sep 18, 2014

It depends on the "thing" that raises the events. By detaching, some events may be lost, which may not be desirable. There are cases (like mine with zmq) that we only care for the initial triggering and we know how the socket will react until the next event.

For the latter case someone could just destroy/create FileDescriptorEvents but that's kind of inefficient.

@s-ludwig
Copy link
Member

Considering that the event is never created as edge-triggered, do you have any concrete example where events would get lost? All use cases that I've had so far always behaved socket like, so that the following principle should work with and without the changes correctly:

while (!tryReadOrWrite(fd))
    fdevent.wait();

@oglu
Copy link
Author

oglu commented Sep 18, 2014

There are events which are edge-triggered (zmq). The thing with zmq (I don't know if this happens in other libraries) is that, if you want to integrate it with an existing event-loop, it gives you a file-descriptor which behaves a bit like a notification mechanism (not the actual socket and not quite socket-like) that notifies you when the socket is ready for read/write.

I do not have a concrete case for the changes + lost events and most probably with a socket-like fd there would be no problem. My (theoritical concern) was that I should preserve the original behavior in case there is some code assuming that the event is persisted and the callback is fired for every event (when in case of non-persisted event, an event could be lost).

@s-ludwig
Copy link
Member

Would you use anything else than a loop like the one above to work with such kinds of events? I see how someone might depend on the old behavior (although I think the number of users of this particular feature is pretty small, so it's hopefully not that likely), but it's a bit unfortunate that it hasn't been working more like an edge-triggered event from the beginning.

If we could reasonably rule out the chance than someone is realistically relying on the old behavior, I'd rather like to completely switch over and properly document the actual guarantees of wait - after all, the behavior was completely unspecified so far.

@etcimon
Copy link
Contributor

etcimon commented Sep 18, 2014

I'm currently working on this one and I would have assumed that these fd would be edge triggered with the option of unregistering them automatically from the event loop after the first shot. If you're monitoring a custom socket for example, you also can put a pointer in the event and you need the event type, so wait alone isn't really going to cut it.

@oglu
Copy link
Author

oglu commented Sep 18, 2014

I agree that the semantics of a non-persistent event fit better the wait(). Regarding existing usages, I'd guess that could will be some subtle breakages in code with certain assumptions, but it could be communicated if the wait behavior is specified.

@etcimon
Copy link
Contributor

etcimon commented Sep 18, 2014

I would have thought there would be a CLOSED trigger when the descriptor went away, and some way of throwing on the task if there was an error (errno for posix, or GetLastError() for windows, etc).

@s-ludwig
Copy link
Member

What about a hybrid approach - FileDescriptorEvent would keep the event active after wait finishes, and then, when the event fired without anyone waiting, sets a flag and removes it. Next time wait is called, it would check the flag and return immediately (+ reset the flag) if it is set. Otherwise it waits normally.

This flag would probably have to be per "trigger bit", so it would be a bit more complicated in practice. But otherwise?

BTW, is there a reason why the event is recreated in every call to _armEvent (and the old one not deleted)?

@etcimon: Not sure what you mean with the pointer, but the FileDescriptorEvent is generally just meant to abstract the wait part of the non-blocking I/O. Everything else is supposed to be handled using native APIs. Where exactly is it missing flexibility? Any missing error code handling should be considered a bug. That indeed seems to be missing in the current implementation.

@oglu
Copy link
Author

oglu commented Sep 19, 2014

Regarding the hybrid approach, I find it a bit complex with no great benefit. I would prefer simpler semantics. Now, the new event in _armEvent is a bug. It should reuse the event.

@s-ludwig
Copy link
Member

Well... the benefit would be to have no high CPU usage, but still have the guarantee to not lose any events.

@oglu
Copy link
Author

oglu commented Sep 19, 2014

For sure it would enable both. My comment was more like a vote/preference. I'd just prefer the simpler semantics of "the event is reattached on wait".

Thinking about the matter abstractly, the only thing that makes sense to be persisted are the timer events. For file descriptor events, either the descriptor is ready for action when you need it (read/write) or not.

@s-ludwig
Copy link
Member

Well, there are other event types like listening sockets where persistent events make sense. For normal I/O it also usually makes sense (as long as there are no sleep() calls or similar involved), but I'm not sure how costly event_add/event_remove are(?), so maybe not.

But let's focus on the issue here for now. What I'd like to have, if possible, is a single set of semantics that works for all platforms/underlying implementations and for all practical use cases. Adding optional legacy behavior is something that will just cause problems and more work down the line. The proposed hybrid approach is just one possibility to both fix the reported issue, while not causing any potential hangs in legacy applications.

At the same time, the documentation could explicitly mention that it is not guaranteed that events between calls to wait are caught, so that any future applications know what to expect, and so that future back end implementations are not bound to the EV_PERSIST legacy behavior.

@oglu
Copy link
Author

oglu commented Sep 19, 2014

You are right, my abstract thinking missed a couple of cases :) .

I was just skimming through the libev documentation and it doesn't seem to have the notion of non-persistent callbacks. If there are gonna be different backend implementations I guess the abstraction shouldn't mention something about persistency. Is that true ?

On the other hand, given how vibe.d code is written and run, FileDescriptorEvent.wait() (as an abstraction over the actual implementation) means "wait until this file descriptor can be operated on and then do something with it". Even if there are intermediate events there aren't many things that can be done with them, so the wait() can be considered a wait for a singular event. Am I missing something ?

@s-ludwig
Copy link
Member

Yes "wait for a singular event" would be the kind of guaranteed semantics that I have in mind. It might return for other reasons, so it wouldn't guarantee that the FD is ready after the wait. What I would imagine as the documentation for wait would be something along the lines of:

Starts waiting for any of the given triggers. Depending on the underlying file descriptor, a trigger that occurs between two calls to wait may get lost and wait may not return until the trigger gets activated again. Because of this it is generally necessary to always check the trigger condition directly before calling wait (without any I/O, wait/sleep operations or yield calls in-between). Also not that the trigger condition is not guaranteed to be true after wait has returned, so that multiple waits may be necessary.

Example for a socket:

ubyte[] readFromSock(int sock, FileDescriptorEvent sock_event) {
    ubyte[256] buf;
    // repeatedly try to read until successful
    while (true) {
        // first try to read from the socket
        int ret = recv(sock, buf.ptr, buf.len);
        if (ret > 0) return buf[0 .. ret].dup;
        enforce(ret < 0 && errno == EAGAIN, "Error reading from socket");
        // only if no data is available (EAGAIN), wait for the socket to become ready
        sock_event.wait(Trigger.read);
    }
}

@etcimon
Copy link
Contributor

etcimon commented Sep 19, 2014

What about a hybrid approach - FileDescriptorEvent would keep the event active after wait finishes, and then, when the event fired without anyone waiting, sets a flag and removes it.

Disabling edge triggering is preferable for applications that fail to use a buffer. From my experience, emptying the socket right after receiving the event is the most efficient way to do it rather than calling wait() and read() sequentially every time data is necessary, because otherwise it breeds a huge amount of system calls.

So, I'm not sure if allowing this behavior in favor of flexibility will end up backfiring in code quality.

@etcimon: Not sure what you mean with the pointer, but the FileDescriptorEvent is generally just meant to abstract the wait part of the non-blocking I/O.

I meant using delegates to get event notifications, it's irrelevant and was a bad idea I had because I was exhausted and frustrated by the directory watcher implementation I was working on at the time.

@s-ludwig
Copy link
Member

Disabling edge triggering is preferable for applications that fail to use a buffer
This whole issue is not really exactly about using edge or level triggers, but just how to handle events that occur in-between calls to wait (no matter if edge or level triggered). It just happens that level triggered events with only one consumer would typically behave the same with or without the change in this PR, but only "typically" and "with only one consumer"...

From my experience, emptying the socket right after receiving the event is the most efficient way to do it rather than calling wait() and read() sequentially every time data is necessary, because otherwise it breeds a huge amount of system calls.

I'm quite not sure what you mean there - do you mean the example I posted? That would typically generate a sequence of ((recv)+(wait))* (pseudo regex syntax), so it would in fact always empty the socket before calling wait. Then there is of course potentially always one recv call that just yields EAGAIN, which may mean an additional system call in case one has to wait. But the alternative would be to call the poll/select mechanism after each read, which would possibly mean a system call after every read. Which one is more efficient of course depends on the exact conditions - unless I'm missing something of course.

So, I'm not sure if allowing this behavior in favor of flexibility will end up backfiring in code quality.

Can you state exactly the behavior you mean? My idea is to keep the requirements on the implementation as slim as possible to avoid any possible future road blocks as far as possible.

@etcimon
Copy link
Contributor

etcimon commented Sep 19, 2014

Well no, I guess there's no problem with that syntax and the example. I'm just thinking freeing/re-adding the event could be avoided if edge triggered EV_ET is set, it'll stop the event from coming in after the first time until the next user activity on the descriptor registers it again, even if it's always on ie. with EV_PERSIST.

@s-ludwig
Copy link
Member

Oh okay, yeah that was my first idea, too. EV_ET just isn't supported for all backends. It would be interesting to know at which point EV_PERSIST is more efficient than re-adding the event continuously (I could imagine that for many open FDs, having all of them active all the time may have some overhead, but I didn't look at the libevent implementation).

@s-ludwig
Copy link
Member

Got implemented with a more general API as part of #1596.

@s-ludwig s-ludwig closed this Aug 11, 2017
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants