Opt out of tree based RidBags at the query level #2315

phpnode · 2014-05-05T11:10:46Z

While cool in many ways, the tree based RidBag feature can be a pain to deal with for languages that enforce or encourage asynchronous APIs for IO (e.g. node.js). It's painful because if I want to offer a consistent API between embedded RidBags and tree based RidBags, I have to make the embedded API async even though I already have the data. This means that I cannot, for example, reliably JSON.stringify() a record containing a bag.
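To make the problem concrete, here is a minimal sketch of what a driver runs into. Every type and function name in it (`RidBag`, `EmbeddedBag`, `TreeBag`, `allRids`, `fetchRids`) is hypothetical and only illustrates the shape of the issue; none of it is Oriento's actual API.

```typescript
// Hypothetical types for illustration only; not Oriento's real API.
type Rid = string;

// Embedded bag: the RIDs arrived with the record and are already in memory.
interface EmbeddedBag { kind: "embedded"; rids: Rid[] }

// Tree based bag: only a pointer arrived; the RIDs need another round trip.
interface TreeBag { kind: "tree"; fetchRids: () => Promise<Rid[]> }

type RidBag = EmbeddedBag | TreeBag;

// One consistent accessor has to return a Promise, because the tree case
// cannot be answered synchronously. The embedded case pays for that too.
async function allRids(bag: RidBag): Promise<Rid[]> {
  return bag.kind === "embedded" ? bag.rids : bag.fetchRids();
}

const embedded: RidBag = { kind: "embedded", rids: ["#12:0", "#12:1"] };
const tree: RidBag = {
  kind: "tree",
  fetchRids: async () => ["#12:0", "#12:1"], // stands in for a server call
};

// JSON.stringify is synchronous and omits function-valued properties,
// so the tree based bag serializes without any of its RIDs.
const embeddedJson = JSON.stringify({ name: "a", out_Knows: embedded });
const treeJson = JSON.stringify({ name: "a", out_Knows: tree });

console.log(embeddedJson);
console.log(treeJson);
```

The same record therefore serializes differently depending on which side of the threshold its bag falls, which is the inconsistency described above.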

For the most common cases, I also think that the default bonsai tree threshold is too low. At 80 items and after base64 encoding, the tree weighs in at little more than 1 KB. This is not a lot of data, and from the client's point of view it's going to be more efficient to lazily decode that blob than to fetch it remotely from the server. This will hold true even with 10,000 records in the bag.
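As a rough sanity check of those numbers, the arithmetic can be sketched as follows. The byte layout is an assumption on my part and is not stated in this thread: 10 bytes per RID (a 2-byte cluster id plus an 8-byte cluster position) and a 5-byte header (1 config byte plus a 4-byte size). The constant and function names are made up for the example.

```typescript
// Assumed layout, not taken from the thread: 10 bytes per RID, 5-byte header.
const BYTES_PER_RID = 10;
const HEADER_BYTES = 5;

// Padded base64 emits 4 output characters for every 3 input bytes, rounded up.
function base64Length(rawBytes: number): number {
  return 4 * Math.ceil(rawBytes / 3);
}

// Approximate size of an embedded bag once it is base64 encoded.
function embeddedBagBase64Bytes(itemCount: number): number {
  return base64Length(HEADER_BYTES + itemCount * BYTES_PER_RID);
}

console.log(embeddedBagBase64Bytes(80));    // 1076
console.log(embeddedBagBase64Bytes(10000)); // 133340
```

Under these assumptions, 80 items come to a little over 1 KB, and 10,000 items to roughly 130 KB.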

It would be awesome if it were possible to selectively increase the threshold for the tree based RidBag feature, or skip it entirely for certain clients or queries. I think it's really useful only for very large data sets.


phpnode · 2014-11-26T11:08:34Z

@laa @lvca this is causing problems for basically everyone using Oriento, and it's not something we can trivially solve. I think increasing the default threshold would be a good step to at least mitigate the issue. From what I've seen, virtually everyone will hit the 80 item limit in production but not necessarily in testing/development, so it's a nasty surprise waiting to bite people.

lvca · 2014-11-26T11:17:50Z

@phpnode you're right. @laa WDYT?

StarpTech · 2015-05-06T08:02:37Z

@lvca @laa What is the status of this issue? Thanks.

phpnode · 2015-05-06T08:08:21Z

One unpleasant part of this issue is that if you specify a fetch plan that fetches a tree larger than the threshold, you'll get the fetched records back but have no way of reconstructing the structure, because the ridbag is still on the server.

StarpTech · 2015-06-13T13:41:20Z

Status?

seeden · 2015-06-15T00:15:41Z

+1

dehbmarques · 2015-06-15T13:12:58Z

+1

IgitDanny · 2015-07-24T08:57:12Z

+1

a-unite · 2015-07-25T03:41:44Z

we are dependent on this issue too

seeden · 2015-07-25T03:56:22Z

+1

whatyouhide · 2015-07-26T00:10:32Z

Gonna have to leave my 👍 here as well, this API is very painful to work with from the perspective of a binary driver.

lvca · 2015-07-26T16:08:36Z

What if we provided a C/C++ driver with such an API, so that all the drivers could use it? It would also be super fast.

whatyouhide · 2015-07-26T16:27:06Z

Mmm, I don't think adding a dependency can make things much better. Drivers would still have to work on bindings to such a C driver.

I'm sure a very big first step would be providing decent documentation (if not thorough :D) for this part, which as of now is nonexistent (LinkBag is an empty section in the schemaless serialization section of the guide). Updating the docs doesn't require any version bump and doesn't introduce any bugs, so I guess it could be done relatively quickly.

That said, I think that the problem is that the API is too low level and not customizable enough. For example, a way to tell the server the embedded threshold per query would already be a big step, but I realize the protocol changes wouldn't be trivial. I'm sorry but this is a problem for which I only offer complaints, no solutions :).

hilkeheremans · 2015-08-19T19:50:06Z

Another +1 here. This is quite a PITA.

@lvca I don't mind an extra dependency, as long as we get a driver with a reliable and consistent API.

gustavolanna · 2015-12-02T19:55:33Z

Status?

austinsmorris · 2016-06-08T23:03:40Z

Any update on this?

smolinari · 2016-06-09T04:59:19Z

👍

Scott

saeedtabrizi · 2017-02-04T18:55:11Z

@laa, @lvca, @maggiolo00 Is there any plan to close this issue? Is there any new status?

andreafalzetti · 2017-03-06T11:47:12Z

+1

laa · 2017-03-06T12:00:50Z

Hi guys,

Could you explain to me how you see this happening? Is it correct that the main idea of this change is to convert tree based ridbags into embedded ridbags on the fly, at the query level?

smolinari · 2017-03-06T12:52:55Z

@laa

I am no expert, but this is my take with some questions.

I don't think anyone was asking for an "on-the-fly" conversion between a tree-based and embedded ridbag. Could that be done and also perform well, if the ridbag is big?

When do embedded ridbags start to become a performance concern? That should be the (much higher) default threshold in ODB to switch over the ridbag to the tree type. That was the original request. Actually, the request was to selectively change the threshold. However, I can imagine the threshold would basically be based on when embedded ridbags become "too heavy".

Or asking with another possibility in mind...how bad is the overhead, when using the tree based ridbag compared to the embedded ridbag? If it is negligible for smaller data sets, then maybe the embedded ridbag should be completely dropped? This would require a migration of data from older ODB versions into 3.0, but it might also solve this problem completely for the future.

Scott

laa added the enhancement label Aug 29, 2014

lvca assigned laa Nov 26, 2014

laa added storage team and removed storage team labels Apr 12, 2016

laa removed the storage team label Sep 30, 2019

laa closed this as completed Aug 4, 2021
