New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Event handler plugins #536

Merged
merged 49 commits into from Dec 13, 2016

Conversation

Projects
None yet
@lminiero
Copy link
Member

lminiero commented May 23, 2016

Hi all,

this is a pretty major change in Janus, as it introduces a new category of plugins: event handlers. More specifically, now Janus (and plugins) can originate events of specific types (e.g., session created, handle detached, peerconnection going up, media available, packet loss, plugin-specific data, etc.). These events are then passed to all the event handler plugins that subscribe to that type, so that they can be, indeed, "handled" somehow.

Since the Admin API is pull-based, this should be useful/helpful for several reasons. You may want, for instance, to use some events as a "trigger" to do an Admin API pull, rather than polling continuously; or you may want to save these events to a database, so that you have a view of what happened in each media session; or send them to an external component for processing; etc. I expect a typical usage will actually be implementing plugins that interface to external services (e.g., Homer, callstats.io, etc.) in order to have them analyzed/processed there.

At the moment I only implemented a very simple sample plugin, that basically forwards any event it receives to a web backend via HTTP POST. If you want to give it a quick test, enable it in janus.eventhandler.sampleevh.cfg (it's disabled by default), set a valid backend (e.g., http://localhost:8080), and wait for events to be received by your web server. A very simple way to set one up just for testing purposes is this node.js code, that will just print whatever it receives on standard output:

var http = require('http');

http.createServer(function (req, res) {
    var body = "";
    req.on('data', function (chunk) {
        body += chunk;
    });
    req.on('end', function () {
        console.log(body);
        res.writeHead(200);
        res.end();
    });
}).listen(8080);

That said, this is a very rough first attempt at defining how this new interface could be implemented. For instance, the available event types are limited to what came to my mind when I started writing the code, and so can definitely be extended. At the moment the kind of events (macrocategory, if you will) you can receive in an event handler are:

  • session related events (e.g., session created/destroyed, etc.)
  • handle related events (e.g., handle attached/detached, etc.)
  • JSEP related events (e.g., got/sent offer/answer)
  • WebRTC related events (e.g., PeerConnection up/down, ICE updates, DTLS updates, etc.)
  • media related events (e.g., media started/stopped flowing, stats on packets/bytes, etc.)
  • events originated by plugins (at the moment, all of them, no way to pick)
  • events originated by transports (at the moment, all of them, no way to pick)

Not sure if we can have more, or if they can fit in there. The idea is that the core generates events of a certain macrocategory, each with different details. Event handler plugins can then subscribe to those macrocategories (none, some, all) in order to be notified. I realize this is pretty crude (you can't filter which plugins you're interested in, for instance: it's all or none), but for the sake of simplicity (handler plugins edit a mask, the core checks it before relaying) I chose to do it the way it is now for the moment.

Events are always provided as json_t objects: I chose to do that instead of creating a new struct for the purpose as (i) a JSON object is generic enough to cover pretty much everything, and (ii) its json_incref/json_decref mechanism allowed for a much easier management of referencing these events (the same event may be shared across multiple handlers, which may handle them at any time). Besides, it makes it easier for plugins that just want to serialize the data and shoot it somewhere (as my sample plugin does, incidentally). I tried to define a common syntax to refer to, though, and specifically:

{
    "type" : <numeric event type identifier>,
    "timestamp" : <monotonic time of when the event was generated>,
    "session_id" : <unique session identifier>,
    "handle_id" : <unique handle identifier, if provided/available>,
    "event" : {
        <event body, custom depending on event type>
    }
}

The monotonic timer is there in case event handler plugins need to know how much time passed since the event was generated and when they actually managed to handle it. You can see the exact format of the events I implemented so far by looking at the messages my sample plugin handler shoots, as explained above.

As to which events are available, I only implemented only a few of them, in this first version, and specifically:

  • session created/destroyed/timeout
  • handle attached/detached
  • JSEP local/remote offer/answer
  • ICE states, DTLS states, selected ICE pair, PeerConnection up/down
  • media started/stopped flowing

Relevant messages that are missing are, for instance, statistics of some sort, and other will definitely come to mind, but I guess it's a good start for the moment. As a side note, for what concerns originating events in other plugins I only integrated the "notify event" feature in the EchoTest plugin, for the moment: specifically, any time you send the EchoTest plugin a new message, it sends the core the current internal configuration (audio/video/bitrate) as an event. Of course the plan is to use the feature for many more events that may be relevant to specific plugins, e.g., SIP calls in the SIP plugin, people joining/leaving/publishing/etc. in any of the communication plugins, and so on. This really will depend on what people will think is necessary, and what isn't.

I believe that's all, for now. I don't expect it to be very optimized right now, and I haven't even started figuring out if it has any impact on performance, and how much. At the moment each event is generated as soon as the "thing" happens, and as such is originated by whatever thread is in control of the activity. No message bus or anything of that sort is implemented: this means that, when receiving an event, handler plugins MUST just increase the reference to the object and just enqueue it, to then process it ASAP in a thread of their own (pretty much as most of the media plugins do anyway). This should help avoid slowing Janus down, as doing any I/O or event processing in the callback itself would instead.

Any comment? Feedback? Suggestion?

@kris-lab kris-lab referenced this pull request May 23, 2016

Closed

META: Upgrade to newest janus #89

0 of 1 task complete
@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented May 26, 2016

... no opinion at all on this?! 😉

@saghul

This comment has been minimized.

Copy link
Contributor

saghul commented May 26, 2016

I had a quick glance. Do you have anything other than monitoring in mind? As for sending the events, maybe it's worth it to have an auxiliary thread pulling the events out of a queue and dispatching then? This way senders can send an event from anywhere, but receivers will always get them from the same thread, serialized.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented May 26, 2016

Monitoring and statistics collection, e.g., for ex-post troubleshooting, are the main things that come to mind. The auxiliary thread you mention could be something like the "message bus" I cited, and is one of the enhancements I could make along the road. I'd still require handler plugins to handle events separately, though, as while they would not delay Janus by itself now, they'd still be able to slow or even lock (in the worst case) the auxiliary thread itself.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Jun 7, 2016

Added new events (SIP messages, some RTCP and media related stuff, and others), and disabled the feature by default. It can be enabled either by explicitly setting broadcast=yes in [events] in janus.cfg, or by passing -e on the command line.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Jun 7, 2016

Also added a thread which does the actual broadcasting: core and plugins just enqueue events, and the thread shoots them out.

@nowylie

This comment has been minimized.

Copy link
Contributor

nowylie commented Jun 8, 2016

This looks like a pretty good idea, the only change I might suggest is not hard coding event types.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Jun 8, 2016

This looks like a pretty good idea, the only change I might suggest is not hard coding event types.

Not sure what you mean by that? Without a formal definition of events, event handlers would not be able to recognize them reliably and handle them. If you mean the fact they're numeric (and more specifically a bit map), that was a choice to make them easier/lighter to subscribe to/unsubscribe from, and filter when notifying to just the interested handlers.

@nowylie

This comment has been minimized.

Copy link
Contributor

nowylie commented Jun 8, 2016

Your formal definition for events doesn't need to be in the event dispatching code though. Let the sender structure the event, the dispatcher can pass around arbitrary JSON data and a numerical event type, and the receiver can interpret the structure. The sender and receiver can agree on the format out of band through the event type.

Then if you wanted to make things really flexible, let the numeric values assigned to event types be dynamic by introducing some kind of event type registry.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Jun 8, 2016

But not all event handlers will just send stuff out exactly as it is like our demo plugin does. Other event handlers will receive these events, parse them expecting to know how the core formatted the stuff, and then use the info in order to do something (e.g., send out aggregated stuff, saving processed data somewhere, etc.). For this to work, the JSON data these handlers receive cannot be arbitrary (apart from the plugin specific info, of course), otherwise handlers won't know, for instance, how to figure out when a handle was attached, a PeerConnection went up, or media stopped flowing.

@nowylie

This comment has been minimized.

Copy link
Contributor

nowylie commented Jun 8, 2016

This is what I meant by agreeing on the format out of band. For example if the audiobridge plugin started dispatching events of the type AUDIOBRIDGE_PARTICIPANT_JOINED. It's reasonable to assume that my event handler, that registered interest in AUDIOBRIDGE_PARTICIPANT_JOINED events, will understand the format of any events it receives with that type.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Jun 8, 2016

I feel very dumb but I stil don't understand the problem 😄

Plugins already send events that are opaque to this mechanism. Type is always JANUS_EVENT_TYPE_PLUGIN there and plugins can choose what to notify and how (event handlers are supposed to be familiar with the specific plugin itself to know how to process those events). For events defined in the core, type is more constrained, and already works the way you mean: e.g., if the core sends an event of type JANUS_EVENT_TYPE_WEBRTC, the event handler knows what this is about and the format it will have.

The main reason to have these types defined in advance, is to allow event handlers to tell the core what they're interested in and what not. The core may (and will) generate many events for each call, so if your handler is not interested in all of them, no point in flooding it with info it will discard, it's a waste of resources.

Or did you mean something else?

@nowylie

This comment has been minimized.

Copy link
Contributor

nowylie commented Jun 8, 2016

I think our confusion might be in the definition of 'event type'. I'm imagining an event type to describe 'what happened', where I think you're saying that it means 'what kind of event happened'. Is this right?

I only advocate moving the formal definitions out of the event dispatching code and into the event generating code. Mainly as a means for separation of concerns.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Jun 8, 2016

What do you mean by event dispatching, specifically?

@nowylie

This comment has been minimized.

Copy link
Contributor

nowylie commented Jun 8, 2016

In events.c:
void janus_events_notify_handlers(int type, guint64 session_id, ...) could be void janus_events_notify_handlers(int type, json_t *data)

Then somewhere else you could have something like:

json_t *new_session_event(guint64 session_id, char *name) {
    json_t *data = json_object();
    json_object_set_new(data, "type", json_integer(JANUS_EVENT_TYPE_SESSION));
    json_object_set_new(data, "timestamp", json_integer(janus_get_monotonic_time()));
    json_object_set_new(data, "session_id", json_integer(session_id));
    json_object_set_new(data, "name", json_string(name));
}
...
janus_events_notify_handlers(JANUS_EVENT_TYPE_SESSION, new_session_event(session_id, name));
@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Jun 8, 2016

Ah, got it. The reason why we use variable arguments is that it makes it much easier to trigger event from within the core. We call what looks like a function, without having to worry about generating a json_t ourselves, which is something events.c can do for us.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Oct 31, 2016

Added a simple retransmission mechanism that is configurable. You can choose:

  1. how many times the last event (or bunch of events) should be retransmitted;
  2. the exponential backoff step (in milliseconds) to retry again.

By default it's 5 and 100ms, so retransmissions in case of errors are done after 100ms (1st time), 200ms (2nd time), 400ms (3rd time) and so on. Of course, in case a transmission or retransmission succeeds, this mechanism is reset. The fact that it's configurable should allow people to adapt it to their needs.

Anything more complex than that can be done in custom plugins.

@joshdickson40

This comment has been minimized.

Copy link
Contributor

joshdickson40 commented Oct 31, 2016

Haven't gotten a chance to look at this yet but thank you very much for looking at the retry mechanism. I will try to pull this into staging in the next day or two.

@joshdickson40

This comment has been minimized.

Copy link
Contributor

joshdickson40 commented Nov 13, 2016

We've been running this last build with event retransmission and everything has looked good on our end. Just wanted to report back that all looks OK! This PR has made life infinitely easier for how I like to manage these calling instances. Perhaps we can get this merged back into mainline soon :) Great work, Lorenzo.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Nov 13, 2016

@joshdickson40 tnx for the feedback! I'm abroad for the IETF and then will be away for some more days after that, but if everything's still fine I'll merge when I get back to the office, then.

@mirkobrankovic

This comment has been minimized.

Copy link
Contributor

mirkobrankovic commented Dec 1, 2016

Just started testing this branch.

Just as a quick thought, is there a need to enable plugin but also to enable it in main janus.cfg config file?
*one should be enough :)

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Dec 1, 2016

Yes there actually is. Core and plugins are different things: janus.cfg drives whether or not notifications are globally enabled, while each plugin can then be configured to send notifications or not. If you only left janus.cfg, you could end up with plugins spamming the core with info nobody needs. If you only left the configuration in plugins, there would be no way to control whether or not the core should notify, as the core has no view of plugin internals.

@mirkobrankovic

This comment has been minimized.

Copy link
Contributor

mirkobrankovic commented Dec 1, 2016

aaah yes, i see the point, since I was only comparing janus.eventhandler.cfg and janus.cfg.

I see now the sampleevh is just an http post example how to get them from core/A to B.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Dec 6, 2016

Just FYI, I plan to merge this either tomorrow or the day after that, so if there's anything that you feel should be fixed before please shout! This will also result in the current master to be tagged as v.0.2.1

@joshdickson40

This comment has been minimized.

Copy link
Contributor

joshdickson40 commented Dec 6, 2016

I have not yet pulled 876304f and 3fb6eb0 into our code base, are those low risk enough that they shouldn't affect the stability of this PR relative to previous commits? I have been running a custom build based on 55f6c13 and have nothing but positive things to say 👍

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Dec 6, 2016

They aggregate multiple pull requests and features so I can't say for sure, but it's all relatively minor changes anyway: support for RTP extensions in a couple of plugins (merged today), better support for RTSP restreaming (authentication, timeout management), "smarter" checks in configure, and a few fixes here and there. None of this should impact any existing application.

Thanks for testing this branch for so long by the way!

@joshdickson40

This comment has been minimized.

Copy link
Contributor

joshdickson40 commented Dec 6, 2016

Oh no problem, it made API design here so much easier, and again, thanks for taking a look at the retry mechanism so I could get this out into production usage. I can probably get these final commits pushed out tomorrow for production usage. If you'd like I could report back with a final sanity check that this all is working okay on my end. If you're not too worried about the minor changes since then (I've quickly browsed the PRs) feel free not to wait for that. Either way, great work on this.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Dec 6, 2016

If you're willing to give the final version a go merging can wait: more testing can only help! In a couple of days it's going to be a holiday in Italy anyway, so I can merge this (if everything's ok) when I then get back to the office on Monday.

@joshdickson40

This comment has been minimized.

Copy link
Contributor

joshdickson40 commented Dec 6, 2016

Done, will report back with info when I have it.

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Dec 12, 2016

@joshdickson40 any (good) news for us? 😄

@joshdickson40

This comment has been minimized.

Copy link
Contributor

joshdickson40 commented Dec 12, 2016

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Dec 12, 2016

Thanks! I'll work on a couple more pending fixes and possibly later today I'll merge then 👍

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Dec 13, 2016

Merging right now. For those that don't want to use this new version yet, I've tagged the latest version as v0.2.1: https://github.com/meetecho/janus-gateway/tree/v0.2.1

@lminiero lminiero merged commit 6a99815 into master Dec 13, 2016

@lminiero lminiero deleted the event-handlers branch Dec 13, 2016

@soulfly

This comment has been minimized.

Copy link
Contributor

soulfly commented Dec 13, 2016

Are there any versions changelogs?

@lminiero

This comment has been minimized.

Copy link
Member

lminiero commented Dec 13, 2016

No, sorry, the easiest way is to look at the commits or at the discussions here. Anyway, I'm writing a post on meetecho-janus that summarizes the changes and how people can use the new feature.

@lminiero

This comment has been minimized.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment