Opt out of tree based RidBags at the query level #2315
Comments
@laa @lvca this is causing problems for basically everyone using Oriento, and it's not something we can trivially solve. I think increasing the default threshold would be a good step to at least mitigate the issue. From what I've seen, virtually everyone will hit the 80 item limit in production but not necessarily in testing/development, so it's a nasty surprise waiting to bite people. |
One unpleasant part of this issue is that if you specify a fetch plan that fetches a tree larger than the threshold, you'll get the fetched records back but have no way of reconstructing the structure because the ridbag is still on the server |
Status? |
+1 |
+1 |
+1 |
we are dependent on this issue too |
+1 |
Gonna have to leave my 👍 here as well, this API is very painful to work with from the perspective of a binary driver. |
What if we provided a C/C++ driver with such an API, so that all the drivers could use it? It would also be super fast. |
Mmm, I don't think adding a dependency can make things much better. Drivers would still have to work on bindings to such a C driver. I'm sure a very big first step would be providing decent documentation (if not thorough :D) for this part, which as of now is nonexistent (LinkBag is an empty section in the schemaless serialization section of the guide). Updating the docs doesn't require any version bump and doesn't introduce any bugs, so I guess it could be done relatively quickly. That said, I think the problem is that the API is too low level and not customizable enough. For example, a way to tell the server the embedded threshold per query would already be a big step, but I realize the protocol changes wouldn't be trivial. I'm sorry, but this is a problem for which I only offer complaints, no solutions :). |
Another +1 here. This is quite a PITA. @lvca I don't mind an extra dependency, as long as we get a driver with a reliable and consistent API. |
Status? |
Any update on this? |
👍 Scott |
+1 |
Hi guys, could you explain how you see this happening? Is it correct that the main idea of this change is to convert tree-based ridbags into embedded ridbags on the fly, at the query level? |
I am no expert, but this is my take, with some questions. I don't think anyone was asking for an "on-the-fly" conversion between a tree-based and an embedded ridbag. Could that be done and still perform well if the ridbag is big? When do embedded ridbags start to become a performance concern? That point should be the (much higher) default threshold in ODB for switching the ridbag over to the tree type. That was the original request. Actually, the request was to selectively change the threshold. However, I can imagine the threshold would basically be based on when embedded ridbags become "too heavy". Or, asking with another possibility in mind: how bad is the overhead of the tree-based ridbag compared to the embedded ridbag? If it is negligible for smaller data sets, then maybe the embedded ridbag should be dropped completely? This would require migrating data from older ODB versions into 3.0, but it might also solve this problem completely for the future. Scott |
While cool in many ways, the tree-based RidBag feature can be a pain to deal with for languages that enforce or encourage asynchronous APIs for IO (e.g. node.js). It's painful because if I want to offer a consistent API between embedded RidBags and tree-based RidBags, I have to make the embedded API async even though I already have the data. This means that I cannot, for example, reliably JSON.stringify() a record containing a bag.
For the most common cases, I also think that the default bonsai tree threshold is too low. At 80 items and after base64 encoding, the bag weighs in at little more than 1 KB. This is not a lot of data, and from the client's point of view it's going to be more efficient to lazily decode that blob than to fetch it remotely from the server. This will hold true even with 10,000 records in the bag.
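The JSON.stringify() problem mentioned above can be illustrated with a minimal sketch. The record shapes and the out_Knows field are hypothetical and are not Oriento's actual classes; the only assumption is that a uniform async API would hand back every bag as a Promise, even when the RIDs are already in memory.

```javascript
// Hypothetical record with an embedded bag: the RIDs are plain data.
const embeddedRecord = {
  '@rid': '#12:0',
  out_Knows: ['#12:1', '#12:2'],
};

// Hypothetical record under a uniform async API: the bag is a Promise,
// even though the data is already available locally.
const asyncRecord = {
  '@rid': '#12:0',
  out_Knows: Promise.resolve(['#12:1', '#12:2']),
};

const embeddedJson = JSON.stringify(embeddedRecord);
const asyncJson = JSON.stringify(asyncRecord);

console.log(embeddedJson); // {"@rid":"#12:0","out_Knows":["#12:1","#12:2"]}
console.log(asyncJson);    // {"@rid":"#12:0","out_Knows":{}}
```

A Promise has no enumerable own properties, so it serializes as an empty object and the bag contents are silently lost.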
It would be awesome if it was possible to selectively increase the threshold for the tree based RidBag feature or skip it entirely for certain clients or queries. I think it's really useful only for very huge data sets.
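For a rough sense of the sizes involved, here is a back-of-the-envelope sketch. It assumes each RID is serialized as 10 bytes (a 2-byte cluster id plus an 8-byte position) and ignores any bag header, so the figures are estimates rather than exact wire sizes. The function names are made up for illustration.

```javascript
// Assumption: 2-byte cluster id + 8-byte cluster position per RID.
const BYTES_PER_RID = 10;

// Length of the padded base64 encoding of `byteCount` raw bytes.
function base64Length(byteCount) {
  return Math.ceil(byteCount / 3) * 4;
}

// Estimated base64 size of an embedded bag, ignoring any header bytes.
function estimateEmbeddedBagSize(itemCount) {
  return base64Length(itemCount * BYTES_PER_RID);
}

console.log(estimateEmbeddedBagSize(80));    // 1068
console.log(estimateEmbeddedBagSize(10000)); // 133336
```

Under these assumptions, 80 items come to a little over 1 KB, and 10,000 items to roughly 130 KB.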