
Proposal: r.return() and r.tunnel() #1986

Open
wojons opened this issue Feb 21, 2014 · 6 comments

wojons (Contributor) commented Feb 21, 2014

This is similar to #1653. The idea behind this is that I may have some very busy database nodes and don't want to put any more stress on them, but the machine issuing the query is totally fine. So I issue the query, run maybe a super-light filter on the results, and then return the results back to the calling node. It can do some crazy stuff from there; maybe there is a complex r.match().

The way this would normally work:

r.table().filter().return().filter(r.match())

Everything after the return() takes place locally, assuming the data was on another node and not this one.
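Neither r.return() nor this remote/local split exists in RethinkDB; as a sketch of the proposed semantics, simulated with plain JavaScript arrays (the documents, field names, and regex here are made up for illustration):

```javascript
// Hypothetical illustration of the proposed r.return() split -- NOT a real
// RethinkDB API. The coarse filter stands in for work done on the node
// holding the data; the regex filter stands in for the "complex r.match()"
// that would run on the node that issued the query.
const docs = [
  { id: 1, status: 'active',   name: 'alpha'  },
  { id: 2, status: 'inactive', name: 'beta'   },
  { id: 3, status: 'active',   name: 'beetle' },
];

// "Remote" stage: everything before return() runs where the data lives.
const remote = docs.filter(d => d.status === 'active');

// "Local" stage: everything after return() runs on the calling node.
const local = remote.filter(d => /^be/.test(d.name));

console.log(local); // [{ id: 3, status: 'active', name: 'beetle' }]
```

The point of the split is that only the already-filtered `remote` set crosses the network, while the expensive match costs CPU on the idle calling node instead of the busy data node.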

There can also be a few other good uses, like:

r.tunnel('lax.*').table().filter().return().filter(r.match())

Depending on how this should be done, it would pick any node in the lax datacenter, have it run the query, and have the results returned to that node; that node would process the data, and the results of that would automatically be sent back to the local machine.

The other option is that it would try to parallelize the query in lax, so if there are a few nodes there, they would each get different documents back and run the final filter. Yes, this is a pretty crazy idea, but also pretty powerful.
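r.tunnel() is likewise hypothetical; its node-selection step could be sketched as a pattern match over node names (the node names and the treatment of 'lax.*' as a prefix pattern are assumptions, not RethinkDB behavior):

```javascript
// Hypothetical sketch of r.tunnel('lax.*') picking a node in the lax
// datacenter -- not a real API. Node names are assumed; 'lax.*' is
// treated as a prefix glob.
const nodes = ['lax.db1', 'lax.db2', 'nyc.db1'];
const candidates = nodes.filter(n => /^lax\./.test(n));

// Pick any matching node (could be round-robin, random, or load-based);
// the parallel variant would instead fan the query out to all candidates.
const target = candidates[0];
console.log(target); // 'lax.db1'
```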

The return function should have a few options; maybe you want to return one step up, or all the way back to the original machine. It depends.

coffeemug (Contributor):

I believe that coerceTo('array') will currently accomplish what you want with return:

r.table().filter().coerceTo('array').filter(r.match())

coffeemug added this to the backlog milestone Feb 24, 2014
wojons (Contributor, Author) commented Feb 24, 2014

@coffeemug From what I understand from @danielmewes, using coerceTo('array') removes all parallelization, and the entire data set has to fit into memory. I'm not sure if it also breaks pipelining: if it's a large data set, it would need to be pulled into one array first, before processing continues on the parent node. It can get the same results I want from above, depending on the data set size and other factors, but it's not as flexible as what I am proposing.
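The memory and pipelining concern can be illustrated with plain JavaScript generators (this is an analogy for the behavior described in the discussion, not RethinkDB internals): a lazy stream hands documents to the next stage one at a time, while an eager coerceTo('array')-style step must materialize everything before the next stage can start.

```javascript
// Lazy pipeline: each stage pulls one document at a time, so the full
// set never has to fit in memory and downstream stages start immediately.
function* makeDocs(n) {
  for (let i = 0; i < n; i++) yield { id: i };
}

function* lazyFilter(iter, pred) {
  for (const doc of iter) if (pred(doc)) yield doc;
}

// The first matching document arrives without generating the rest.
const first = lazyFilter(makeDocs(1000000), d => d.id % 2 === 0).next().value;
console.log(first); // { id: 0 }

// Eager analogue of coerceTo('array'): this would build a million-element
// array before the filter could even begin.
// const all = [...makeDocs(1000000)].filter(d => d.id % 2 === 0);
```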

coffeemug (Contributor):

Ah, I see. I'll have to mull this over for a bit. We purposely made data flow in the cluster completely opaque; making some of it controllable is an intriguing idea.

wojons (Contributor, Author) commented Feb 24, 2014

@coffeemug I know every user may use RethinkDB differently from your original vision. I use it as a database, but also as an analytics engine/framework. As we all know, PHP is not the language for heavy data processing, which is why PHP was able to hold its own when paired with MySQL: you had MySQL do all that data processing for you. With all the new ways the web works, and terms like "webscale", MySQL has become a less popular way to scale your PHP. I personally install RethinkDB on all my application nodes in my cluster and have them connect locally. I even push processing that is not built into the PHP framework to the local database, sometimes going out to another server where the data lives, sometimes using it to sanitize a user's input with things like:

r.json().map()
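A plain-JavaScript analogue of that pattern (the real r.json().map() runs server-side in ReQL; the input string and the sanitization rule here are made-up examples):

```javascript
// Parse untrusted JSON input and map a sanitizer over each element,
// mirroring the r.json().map() pattern. Trimming whitespace and stripping
// angle brackets is just an illustrative sanitization rule.
const userInput = '[{"name":" Alice "},{"name":"Bob<script>"}]';
const sanitized = JSON.parse(userInput).map(d => ({
  name: d.name.trim().replace(/[<>]/g, ''),
}));
console.log(sanitized); // [{ name: 'Alice' }, { name: 'Bobscript' }]
```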

coffeemug (Contributor):

@wojons -- that's awesome; that's actually really helpful. RethinkDB can act as a general-purpose distributed computation engine, but it is missing a few control primitives for that. We'd have to think through how to properly add these, but the possibilities are pretty damn cool.

(Also, it's a matter of timing/marketing/etc., which can be surprisingly nuanced.)

wojons (Contributor, Author) commented Feb 24, 2014

@coffeemug Yeah, exactly; I understand it needs to be timed right and so on. "Distributed computation engine" is a really good way to explain it; it's the core of my application. Rewriting the application would not take long in any language, since most of the important stuff is in RethinkDB. I know it's an out-there feature and some things need to happen before then; until then, I can wait.
