Reactive Query #1227

Closed
lvca opened this Issue Dec 24, 2012 · 14 comments

@lvca
OrientDB member

It would be cool to have a way to stream updates to clients based on a sort of "Live Views". This idea comes from Ayman and it's very powerful!

Consider this simple example. A client is interested in this query:

select from order where customer.city.name = 'Rome' and amount > 100000

It doesn't want to poll for new items, but rather to be notified asynchronously when new entries match the query. Here we can apply the classic publish/subscribe pattern.

To handle it we could keep two objects in memory:
1) the target: Order
2) the filter: customer.city.name = 'Rome' and amount > 100000

In the new "OViewHook" class I could check, for each view, whether the record's class is a sub-class of the "target" (in this case Order). If it is, I could check the filter against the record. If the filter returns true, the record could be sent to the requester.
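As a rough illustration of the check described above, here is a minimal Python sketch, not the OrientDB implementation. Every name in it (`LiveView`, `ViewHook`, `on_record_saved`, the `SUPERCLASS` table and the `OnlineOrder` class) is hypothetical, and the filter is a plain Python function rather than a parsed SQL condition.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, List, Optional

Record = Dict[str, Any]

# Hypothetical class hierarchy (child -> parent), standing in for the schema.
SUPERCLASS: Dict[str, Optional[str]] = {
    "Order": None,
    "OnlineOrder": "Order",
    "Customer": None,
}


def is_subclass_of(cls: Optional[str], target: str) -> bool:
    """Walk up the hierarchy until the target is found or the root is passed."""
    while cls is not None:
        if cls == target:
            return True
        cls = SUPERCLASS.get(cls)
    return False


@dataclass
class LiveView:
    target: str                        # 1) the target class, e.g. "Order"
    matches: Callable[[Record], bool]  # 2) the filter
    notify: Callable[[Record], None]   # how to reach the requester


class ViewHook:
    """Assumed to be called once for every record that gets saved."""

    def __init__(self) -> None:
        self.views: List[LiveView] = []

    def subscribe(self, view: LiveView) -> None:
        self.views.append(view)

    def on_record_saved(self, cls: str, record: Record) -> None:
        for view in self.views:
            # Check the class first, then the filter, then notify.
            if is_subclass_of(cls, view.target) and view.matches(record):
                view.notify(record)


def rome_big_orders(record: Record) -> bool:
    # customer.city.name = 'Rome' and amount > 100000
    return (record["customer"]["city"]["name"] == "Rome"
            and record["amount"] > 100000)


received: List[Record] = []
hook = ViewHook()
hook.subscribe(LiveView("Order", rome_big_orders, received.append))

rome = {"city": {"name": "Rome"}}
hook.on_record_saved("OnlineOrder", {"customer": rome, "amount": 250000})  # delivered
hook.on_record_saved("Order", {"customer": rome, "amount": 50})            # filtered out
hook.on_record_saved("Customer", {"customer": rome, "amount": 250000})     # wrong class
```

Only the first record reaches `received`: it is the only one whose class descends from Order and which also satisfies the filter.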

Other things to do are:

  • A good way to define the view. Probably a new asynchronous command like "STREAM select from order where customer.city.name = 'Rome' and amount > 100000"
  • On startup the hook must be installed, and it has to handle the whole publish/subscribe mechanism
  • Provide a new REST endpoint to access it, something like /stream//. It has to support asynchronous delivery
  • Update the JS driver to support it
@mattaylor

Perhaps these 'live views' could be implemented as an OrientDB plugin for Esper (http://esper.codehaus.org).

Esper is a mature, GPL-licensed Java implementation of a complex event processing engine. It can connect to several popular databases and uses a SQL-like DSL for processing events, with support for POJO event objects and sliding time windows, e.g.:
select avg(price) from StockTickEvent.win:time(30 sec)

More details available at http://esper.codehaus.org/tutorials/tutorial/tutorial.html

Elasticsearch.org also has a similar feature they call 'percolation' (http://www.elasticsearch.org/blog/2011/02/08/percolator.html)

@mattaylor

If Esper is too heavyweight for this (and the GPL license too restrictive), at least the concept of 'windows' could be added to support simple event stream processing. The basic window views supported by Esper are:

  • win:length,
  • win:length_batch,
  • win:time,
  • win:time_batch,
  • win:time_length_batch,
  • win:time_accum,
  • win:ext_timed, etc..

These could perhaps be treated as function members of an OStream class, and the time-related functions could take cron-like timers as arguments.

More info here http://esper.codehaus.org/esper-4.7.0/doc/reference/en-US/html/epl_clauses.html
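To make the 'window' idea concrete, here is a small Python sketch of a sliding time window that keeps a running average over the last N seconds, roughly in the spirit of the win:time example above. It is only an approximation written for this thread: the `TimeWindowAverage` class and its methods are hypothetical, and it does not reproduce Esper's exact semantics.

```python
from collections import deque
from typing import Deque, Optional, Tuple


class TimeWindowAverage:
    """Average of the values seen during the last `seconds` seconds."""

    def __init__(self, seconds: float) -> None:
        self.seconds = seconds
        self.events: Deque[Tuple[float, float]] = deque()  # (timestamp, value)
        self.total = 0.0

    def add(self, timestamp: float, value: float) -> None:
        # Timestamps are assumed to arrive in non-decreasing order.
        self.events.append((timestamp, value))
        self.total += value
        self._expire(timestamp)

    def _expire(self, now: float) -> None:
        # Drop events that have fallen out of the window.
        while self.events and self.events[0][0] <= now - self.seconds:
            _, old_value = self.events.popleft()
            self.total -= old_value

    def average(self, now: float) -> Optional[float]:
        self._expire(now)
        if not self.events:
            return None
        return self.total / len(self.events)


window = TimeWindowAverage(30)
window.add(0, 10.0)
window.add(10, 20.0)
window.add(35, 30.0)           # the event at t=0 is now older than 30 s
print(window.average(35))      # 25.0
print(window.average(100))     # None, every event has expired
```

A length-based window (win:length) could be sketched the same way by expiring on `len(self.events)` instead of on the timestamp.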

@lvca
OrientDB member

I've started working on how best to implement this, and I have a working prototype I can push to a separate branch. The trick was to use the Asynchronous Queries that OrientDB has supported since release 0.8. Probably the hardest part now is getting Node.js and the other drivers to support it.

@mattaylor

Some more thoughts that might help on the RESTful design side of this one.

The Atmosphere framework (https://github.com/Atmosphere/atmosphere) is a good open-source Java implementation of websockets, comet and long-polling HTTP that might be useful for extending OrientDB to support real-time event streams over HTTP.

NGINX also has a well designed streaming module that we use here
https://github.com/wandenberg/nginx-push-stream-module

To define a stream
POST /stream/{db} << {filter}

To publish to a stream
POST /stream/{db}/{id} << {message}

To subscribe to a stream
GET /stream/{db}/{id}
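The three operations above could map onto something like the in-memory broker below. This is a hypothetical Python sketch with no HTTP layer: `StreamBroker` and its methods are made-up names, the generated ids are arbitrary, and the filter is stored as opaque text without being evaluated.

```python
import itertools
from collections import defaultdict
from typing import Any, Callable, DefaultDict, Dict, List, Tuple

StreamKey = Tuple[str, str]  # (db, stream id)


class StreamBroker:
    def __init__(self) -> None:
        self._ids = itertools.count(1)
        self._filters: Dict[StreamKey, str] = {}
        self._subscribers: DefaultDict[StreamKey, List[Callable[[Any], None]]] = defaultdict(list)

    def define(self, db: str, filter_text: str) -> str:
        """POST /stream/{db} << {filter}: register a stream, return its id."""
        stream_id = str(next(self._ids))
        self._filters[(db, stream_id)] = filter_text
        return stream_id

    def subscribe(self, db: str, stream_id: str, callback: Callable[[Any], None]) -> None:
        """GET /stream/{db}/{id}: attach a listener to an existing stream."""
        if (db, stream_id) not in self._filters:
            raise KeyError(f"unknown stream {db}/{stream_id}")
        self._subscribers[(db, stream_id)].append(callback)

    def publish(self, db: str, stream_id: str, message: Any) -> int:
        """POST /stream/{db}/{id} << {message}: deliver to every listener."""
        if (db, stream_id) not in self._filters:
            raise KeyError(f"unknown stream {db}/{stream_id}")
        listeners = self._subscribers[(db, stream_id)]
        for callback in listeners:
            callback(message)
        return len(listeners)


broker = StreamBroker()
inbox: List[Any] = []
sid = broker.define("demo", "amount > 100000")
broker.subscribe("demo", sid, inbox.append)
delivered = broker.publish("demo", sid, {"amount": 250000})
```

In a real server the callback would write to an open long-polling or websocket connection instead of appending to a list.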

@mattaylor

What do you think about adding real-time pub/sub support to the HTTP API using long polling, comet, websockets, etc.? The Atmosphere framework could really help here.

@mattaylor

Did this feature make it to the 1.4 branch? Are there any docs for this?

@mattaylor

Could we get an update on why this feature was pushed back from 1.4 to 2.0 - is it just too complex?

@lvca modified the milestone: 2.1, 2.0 Mar 28, 2014
@lvca added the enhancement label Oct 4, 2014
@lvca modified the milestone: 2.2, 2.1 Oct 4, 2014
@lvca changed the title from Live Views: streaming queries to Reactive Views: streaming queries Apr 8, 2015
@lvca modified the milestone: 2.1-rc1, 2.2 Apr 8, 2015
@luigidellaquila was assigned by lvca Apr 8, 2015
@lvca changed the title from Reactive Views: streaming queries to Reactive Query Apr 8, 2015
@lvca
OrientDB member

This feature will be part of 2.1 as experimental. We plan to make it stable in 2.2.

@lvca modified the milestone: 2.1-rc1, 2.1-rc2 Apr 16, 2015
@jwarkentin

Full disclosure: I'm new to OrientDB so forgive me if something here is way off base.

I've read over this and #2652 and I have a concern. I've spent a lot of time building and maintaining realtime systems. It's important to have a feature like this in the database itself because there are two problems that can't otherwise be solved and I'm concerned that the second one isn't being addressed.

  1. The first is that when someone (like a dev) makes a change directly in the database instead of through some application layer, caches don't get invalidated and realtime updates don't happen. By having the database push updates it ensures that all changes get pushed through to the applications interested in data changes.

  2. The second, and most significant problem, is that of displaying filtered lists of items. Let's say you're building a realtime application where you are displaying filtered lists of different types of items (like lists of articles in my case). As soon as any item of a given type is added or updated, you have to invalidate ALL lists of the given type because you have no idea how the change affects any given list. Does the new or updated item belong in a given list? Where does the new item go in any given ordered list? Since it's difficult to impossible to know exactly how a given change affects each filtered list you must invalidate all of them and re-run all queries for the given type of item.

I don't know if there's documentation somewhere for the current implementation of this feature, but my question is, does it report where the new item should go in a list or where an updated item should go in a sorted list?
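To make the question concrete, this is the kind of information a consumer of an ordered list would need from each change notification. The sketch below is a hypothetical Python illustration, not a description of what OrientDB reports: `SortedLiveList` and its methods are invented names, and the list is kept sorted with the standard `bisect` module.

```python
import bisect
from typing import Dict, List, Optional, Tuple


class SortedLiveList:
    """Keeps (sort_key, item_id) pairs ordered and reports index changes."""

    def __init__(self) -> None:
        self._entries: List[Tuple[int, str]] = []  # always sorted
        self._keys: Dict[str, int] = {}            # item_id -> current sort key

    def remove(self, item_id: str) -> Optional[int]:
        """Remove an item and return the index it had, or None if absent."""
        if item_id not in self._keys:
            return None
        entry = (self._keys.pop(item_id), item_id)
        index = bisect.bisect_left(self._entries, entry)
        del self._entries[index]
        return index

    def upsert(self, item_id: str, sort_key: int) -> Tuple[Optional[int], int]:
        """Insert or move an item; return (old_index, new_index)."""
        old_index = self.remove(item_id)
        entry = (sort_key, item_id)
        new_index = bisect.bisect_left(self._entries, entry)
        self._entries.insert(new_index, entry)
        self._keys[item_id] = sort_key
        return old_index, new_index

    def ids(self) -> List[str]:
        return [item_id for _, item_id in self._entries]


articles = SortedLiveList()
articles.upsert("a", 10)
articles.upsert("b", 30)
print(articles.upsert("c", 20))   # (None, 1): new item, goes between a and b
print(articles.upsert("a", 40))   # (0, 2): updated item moves from first to last
print(articles.ids())             # ['c', 'b', 'a']
```

With an (old_index, new_index) pair in each notification, a client could patch a single displayed list instead of invalidating and re-running every query for that item type.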

@jwarkentin

Also, the guys at RethinkDB have a very good understanding of the realtime problem and have some excellent things to consider here: http://rethinkdb.com/blog/realtime-web/

@lvca modified the milestone: 2.1-rc2, 2.1 GA May 5, 2015
@lvca modified the milestone: 2.1 GA, 2.1-rc4 Jun 16, 2015
@gadisridhar

The Esper EPL LIKE query is case sensitive. How do I solve this? Example: select * from StockTick having symbol like '%YHOO%'. My events have the symbol as 'yhoo'. Any suggestions? Many thanks.
