RESTful polling #52

Open
slef opened this issue May 12, 2013 · 26 comments

Comments

@slef

slef commented May 12, 2013

It would be great to have a RESTful agent capable of long polling.
This would, for example, allow me to have Huginn watch lights turn on and off in my house using the OpenRemote REST API:
http://www.openremote.org/display/docs/Controller+2.0+HTTP-REST-JSONP
I'm sure there would be many more applications.
I'm new to huginn and to Ruby but if someone can point me to parts of the code I should look at, I'm willing to give it a try.

@cantino
Member

cantino commented May 12, 2013

Interesting! Which direction would this be polling? So the Agent would open up a URL and provide a long-polling endpoint for another service? Or it would subscribe to a long-polling endpoint and wait for events?

@slef
Author

slef commented May 12, 2013

What I would like for my application is to have the huginn agent subscribe to a long polling endpoint. This way, it could instantly create events when it is notified of changes in my lights, blinds or temperature.

If I understood the procedure correctly, the agent (e.g. Huginn) sends an HTTP GET request to the server (e.g. OpenRemote), and the server responds only when a status change occurs, at which point it sends the new information in JSON or XML format. If no change occurs within a prescribed amount of time, the server times out.
In both cases, the client sends the same GET request again and waits for the next update, and so on.

There is also the option of getting the current status of a sensor with a GET request to a different URL, or of sending a POST to a switch (e.g. to turn on a light). This would allow me, for example, to turn off all the lights when I leave home.
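
The request/response cycle described above can be sketched in plain Ruby with `Net::HTTP`. This is only an illustration under stated assumptions: the method names `poll_once` and `long_poll`, the 60-second read timeout, and the `max_rounds` and `poller` parameters are made up for the example and are not part of Huginn or the OpenRemote API. Any non-success response is treated the same as a timeout.

```ruby
require "net/http"
require "uri"

# One long-poll request. Returns the response body when the server answers
# successfully, or nil when the request times out or fails.
# The 60 s read timeout is an assumed value, not taken from the OpenRemote docs.
def poll_once(uri, read_timeout: 60)
  Net::HTTP.start(uri.host, uri.port,
                  use_ssl: uri.scheme == "https",
                  read_timeout: read_timeout) do |http|
    response = http.get(uri.request_uri)
    response.is_a?(Net::HTTPSuccess) ? response.body : nil
  end
rescue Net::ReadTimeout, Net::OpenTimeout
  nil
end

# Re-issue the same request after every response or timeout and yield each
# change. `poller` is injectable so the loop can be exercised without a server;
# `max_rounds` exists only so the example can terminate (nil means forever).
def long_poll(uri, max_rounds: nil, poller: method(:poll_once))
  rounds = 0
  until max_rounds && rounds >= max_rounds
    body = poller.call(uri)
    yield body if body
    rounds += 1
  end
end
```

An agent would call `long_poll(uri) { |body| ... }` with the polling URL of its server and turn each yielded body into an event.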

@cantino
Member

cantino commented May 15, 2013

That's totally achievable. Right now the only Agent that runs a continuous
process is the Twitter Agent, but I'd like to make that more generic
eventually.


@cantino
Member

cantino commented May 19, 2013

Is this something you'd like to work on?

@slef
Author

slef commented May 20, 2013

I would, but this week will be a bit busy for me. If someone else is eager to get it done, please go ahead, otherwise I will try to do it whenever I find some time.
I started looking at the twitter agent. Could you point me to the piece of code that makes it run continuously?

@cantino
Member

cantino commented May 20, 2013

It'd be great if you want to take a crack at it.

The primary code for the Twitter Agent is run by bin/twitter_stream.rb and launched from the Procfile.

@cantino
Member

cantino commented May 20, 2013

Maybe the right model is to allow any Agent to be declared continuous! and then to have an optional hook where it can register EventMachine timers or other callbacks. Have you worked with EventMachine before?
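
One way such a declaration and hook could look is sketched below in plain Ruby. Everything here is hypothetical: the `Continuous` module, the `start_continuous` hook, and the `RecordingReactor` stand-in are invented for illustration and do not come from Huginn. In a real EventMachine setup the hook would presumably call something like `EventMachine.add_periodic_timer` instead of the stand-in's `add_periodic`.

```ruby
# Plain-Ruby sketch; EventMachine is not required to run it.
module Continuous
  def self.included(base)
    base.extend(ClassMethods)
  end

  module ClassMethods
    # Class-level declaration, written as `continuous!` in the class body.
    def continuous!
      @continuous = true
    end

    def continuous?
      @continuous == true
    end
  end

  # Optional hook; agents override it to register timers or connections.
  def start_continuous(reactor); end
end

class PollingAgent
  include Continuous
  continuous!

  def start_continuous(reactor)
    reactor.add_periodic(5) { :tick }
  end
end

class PlainAgent
  include Continuous
end

# Stand-in for the event loop: it only records what agents register.
class RecordingReactor
  attr_reader :timers

  def initialize
    @timers = []
  end

  def add_periodic(seconds, &block)
    @timers << [seconds, block]
  end
end

# Call the hook only on classes that declared themselves continuous.
def start_continuous_agents(agent_classes, reactor)
  agent_classes.select(&:continuous?).each do |klass|
    klass.new.start_continuous(reactor)
  end
end
```

Keeping the declaration at class level lets the scheduler decide which agents to start without instantiating the others.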

@slef
Author

slef commented May 21, 2013

Yes that sounds reasonable.
I don't know EM. Reading about it now.

@slef
Author

slef commented Jun 2, 2013

I just looked at the code of Huginn and EventMachine. If I understood correctly, Huginn propagates events every 5m and in between nothing happens. Is this correct?
If we start implementing continuous agents, it would probably make sense to activate them as soon as some events they consume become available (even events generated by Huginn itself).

@cantino
Member

cantino commented Jun 2, 2013

That's correct for the scheduler, but if you look at the
bin/twitter_stream.rb, that one runs continuously. The right approach
would probably be to add a new mode for the scheduler that allows Agents to
register EM hooks in a continuous fashion.


@slef
Author

slef commented Jun 7, 2013

I see how to use EM to maintain network connections, but I can't find anything on managing internal events. Do we have to create a new connection class?

@cantino
Member

cantino commented Jun 10, 2013

To clarify, are you referring to a possible internal EM handler for Huginn events? Or do you mean for consuming an external stream and creating Huginn events?

@slef
Author

slef commented Jun 11, 2013

Yes, I was referring to a possible internal EM handler for Huginn events (or both, actually).
For example, my OpenRemote long polling could create a lightOn event when I turn the light on. I would like that event to be handled instantly by some other agent, for example to send me an email.
In the first case, Huginn events are created continuously (as already happens with twitter_agent). In the second case, I need an agent to consume Huginn events continuously. As far as I can see, Huginn does not do that yet.

Looking at EM, it looks like only scheduled or I/O events are managed (the main loop runs select* on file descriptors). Is this correct or is there some way to trigger EM connections that I do not see?
Wouldn't it be more efficient to just handle Huginn events separately? Whenever events are created, check if they can be fed to some agent and awaken them...

Although if we want to do that, we should probably think of some mechanism to avoid creating loops that might hog the system.
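
A hop counter is one simple way to keep such loops from hogging the system. The sketch below is an assumption-laden illustration, not Huginn code: the `Dispatcher` class, the `Event` struct, and the limit of 5 hops are all invented for the example. Each handler returns the names of the events it wants to emit next, and an event that has already travelled `MAX_HOPS` steps is dropped instead of delivered.

```ruby
Event = Struct.new(:name, :hops)

class Dispatcher
  MAX_HOPS = 5 # assumed limit, chosen arbitrarily for the example

  attr_reader :dropped

  def initialize
    @handlers = Hash.new { |hash, key| hash[key] = [] }
    @dropped = []
  end

  # Register a handler for an event name. The handler's return value is
  # treated as the list of event names to emit next (nil means none).
  def on(name, &handler)
    @handlers[name] << handler
  end

  # Deliver an event immediately; follow-up events carry hops + 1.
  def emit(name, hops = 0)
    event = Event.new(name, hops)
    if hops >= MAX_HOPS
      @dropped << event # the chain is too long, so stop propagating
      return
    end
    @handlers[name].each do |handler|
      Array(handler.call(event)).each { |next_name| emit(next_name, hops + 1) }
    end
  end
end
```

With two handlers that trigger each other, the chain stops after five deliveries and the sixth event ends up in `dropped`.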

@slef
Author

slef commented Jun 23, 2013

Ah ok, it seems like this is doable just by using EM::Queue to queue the processing of events that have been created.

This seems to be a pretty fundamental change to the main loop of Huginn. I have no time to look at it further right now, I'll try again if I can find some time in a week or so.
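
The queueing idea can be illustrated without EventMachine by using Ruby's standard `Queue` (`Thread::Queue`) as a stand-in for `EM::Queue`. This is a sketch under that assumption: the `EventBus` class and its methods are hypothetical, and the two queues behave differently in one important way, since `EM::Queue#pop` takes a callback and does not block, whereas the standard `Queue#pop` used here blocks a worker thread.

```ruby
# Thread::Queue from the standard library stands in for EM::Queue.
class EventBus
  def initialize
    @queue = Queue.new
    @subscribers = []
    @worker = Thread.new { run }
  end

  # Register a consumer; call this before publishing events.
  def subscribe(&block)
    @subscribers << block
  end

  # Producers enqueue events as soon as they are created.
  def publish(event)
    @queue << event
  end

  # Close the queue and wait until every queued event has been delivered.
  def shutdown
    @queue.close
    @worker.join
  end

  private

  def run
    # Queue#pop blocks until an event arrives and returns nil once the
    # queue is closed and empty, which ends the loop. Events must
    # therefore not be nil or false in this sketch.
    while (event = @queue.pop)
      @subscribers.each { |subscriber| subscriber.call(event) }
    end
  end
end
```

Events are delivered in the order they were published, as soon as the worker is free, rather than on the next scheduler tick.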

@slef
Author

slef commented Feb 3, 2014

I finally found some time to get back to this (added a pull request with the resulting changes).
The new agent can poll my OpenRemote server and create events instantaneously whenever I press a light switch. As you suggested, I added the continuous! declaration and an em_start function.
Of course this would be much more useful if events could trigger agents instantaneously rather than at the current 5m interval. Maybe we could add an instantaneous! declaration for such agents and handle them in EM as well (e.g., using an EM::Queue).

@cantino
Member

cantino commented Feb 3, 2014

Cool! Interesting timing on the discussion. See #157

@0xdevalias
Member

@cantino Is this still relevant given #157 / #167 ?

@cantino
Member

cantino commented Jun 1, 2014

@slef's approach of using EventMachine will still be faster than our current system, as you could get data into Huginn from a remote source in the sub-second range. Right now Agents don't run more often than once a minute. It would be a pretty big change to how Huginn works, though, I think.

@dsander
Collaborator

dsander commented Jun 1, 2014

I think we should add a way to support "long running" agents at some point. Long polling is one use case; webhooks and chat (IRC) consumers are a few others that would be nice to have. I thought about this a little over the weekend and might add a basic "framework" to support this over the next week.

@slef
Author

slef commented Jun 2, 2014

Actually if you look at my code in #158, I added an EM to schedule.rb so you can define any agent to be continuous! using em_start and em_stop and this will activate it in EM. Using this I think it shouldn't be too hard to reimplement twitter_agent without needing the separate script, or to implement IRC and Jabber clients.
This takes care of generating events from the outside instantly, and #167 takes care of the fast propagation.

@cantino
Member

cantino commented Jun 2, 2014

Does Rails work okay when inside an EM loop? What happens if you have two workers (like Unicorn instances)? Or should the EM loop be in a separate process?

@dsander
Collaborator

dsander commented Jun 2, 2014

I do not think the EM loop will cause any problems with Rails. It is run in the same process (or thread, with the threaded workers in #318) as the scheduler, so it should not make a difference how many Unicorn instances are running.
Nevertheless, I think that it should be run as a separate process, because the code inside the event loop will block the scheduler from running, which might delay propagation of agents etc. It would also be nice to spawn a thread for every agent so that one does not need to use an EM-enabled library.
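
The thread-per-agent idea could look roughly like the sketch below, in which each agent runs ordinary blocking code in its own thread and a failure in one agent is recorded without stopping the others. The `AgentSupervisor` class and its methods are hypothetical names for illustration, not part of Huginn.

```ruby
class AgentSupervisor
  attr_reader :errors

  def initialize
    @threads = {}
    @errors = Queue.new # thread-safe collection of [agent name, message]
  end

  # Run an agent's (possibly blocking) work in its own thread.
  def start(name, &work)
    @threads[name] = Thread.new do
      begin
        work.call
      rescue StandardError => e
        # Record the failure instead of letting it kill the process.
        @errors << [name, e.message]
      end
    end
  end

  # Block until every agent thread has finished.
  def wait
    @threads.each_value(&:join)
  end
end
```

Because each agent has its own thread, it can use any blocking library, at the cost of one thread per agent.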

@slef
Author

slef commented Jun 3, 2014

As far as I understand it, in the current schedule.rb, rufus already runs in a separate thread, and EM runs in the main thread so one shouldn't block the other.
Right now all the continuous! agents run in the same EM loop but this shouldn't be a problem as long as each agent uses non-blocking code. Now if we want to protect the EM loop from badly implemented agents, we could run one EM loop per agent but I don't know how this will affect performance.

@cantino
Member

cantino commented Jun 3, 2014

Can you run multiple EM loops?

@dsander What do you think of how this is set up now?

@dsander
Collaborator

dsander commented Jun 3, 2014

@slef I also think rufus spawns a worker thread to run the scheduled events, but the problem is that the GIL (at least in MRI) will block all other threads while Ruby code is being executed. I would like to have it as a separate module/class with a small wrapper to run it (like we have for the scheduler, twitter stream, etc.) and add it to the threaded worker by default.
That way we would have the same smaller memory footprint by default, but a user who needs the performance could fall back to the old separate workers.

@cantino I think you can run multiple event loops if you run them in separate threads, but I have never done that

@cantino
Member

cantino commented Jun 4, 2014

@dsander I agree that we should make it separate, but run together by default, as you described.
