Materialized view proposal #1997
Comments
@d12frosted sounds good! Thanks a lot for the benchmarks too; it's useful to know how much queries will speed up without the joins, and thankfully you have already implemented the feature elsewhere. My desired solution would have minimal additional handwritten writes to any tables. I've been thinking of doing materialized views directly via SQL. I'm pretty sure it's possible to have SQLite handle everything, since we obtain our nodes table via a SQL query anyway.
This question was already raised on Discourse. See my response here. What do you think?
@d12frosted you're right, I tried googling around for "sqlite3 materialized views" and it seems that sqlite3 doesn't support them. I'm all for optimizing node reads if writes don't get too expensive. Hope to see a contained solution!
I implemented a materialized view using SQLite triggers to offload the computation to SQLite: https://org-roam.discourse.group/t/materialized-view-for-org-roam-db-using-sqlite-triggers/3483/25
Please have a look at the post for more detail. Implementing it in elisp should be trivial from here; I'm leaving it here for further consideration. Thanks and Best,
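For readers who have not used triggers for this, here is a minimal sketch in Python with the standard sqlite3 module. It is not the implementation from the linked post: the `node_view` table, the trigger names, the comma-joined `tags` column and the two-column source tables are all assumptions made for illustration, and the real org-roam schema has more tables and columns.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins for the org-roam tables (assumed for illustration).
    CREATE TABLE nodes (id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE tags (node_id TEXT, tag TEXT);

    -- Hypothetical flattened table: one row per node.
    CREATE TABLE node_view (id TEXT PRIMARY KEY, title TEXT, tags TEXT);

    -- Create the view row whenever a node is inserted.
    CREATE TRIGGER nodes_ai AFTER INSERT ON nodes BEGIN
        INSERT INTO node_view (id, title, tags) VALUES (NEW.id, NEW.title, '');
    END;

    -- Drop the view row whenever a node is deleted.
    CREATE TRIGGER nodes_ad AFTER DELETE ON nodes BEGIN
        DELETE FROM node_view WHERE id = OLD.id;
    END;

    -- Recompute the flattened tag list whenever a tag is added.
    CREATE TRIGGER tags_ai AFTER INSERT ON tags BEGIN
        UPDATE node_view
        SET tags = (SELECT group_concat(tag, ',') FROM tags
                    WHERE node_id = NEW.node_id)
        WHERE id = NEW.node_id;
    END;
""")

# The caller only writes to the normalized tables; SQLite maintains node_view.
conn.execute("INSERT INTO nodes VALUES ('n1', 'Wine')")
conn.execute("INSERT INTO tags VALUES ('n1', 'drink')")
conn.execute("INSERT INTO tags VALUES ('n1', 'red')")
trigger_row = conn.execute(
    "SELECT title, tags FROM node_view WHERE id = 'n1'"
).fetchone()

conn.execute("DELETE FROM nodes WHERE id = 'n1'")
remaining = conn.execute("SELECT COUNT(*) FROM node_view").fetchone()[0]
```

A complete version would also need triggers for updates and for tag deletion, and similar ones for aliases, refs, citations and links.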
Thanks for the bump and the proposal! I still think that's a good idea, and it could and should be implemented! So here's my +1.
Brief Abstract
A materialized view is a table where each row contains all information about a node, including information from the following tables: nodes, aliases, citations, refs, tags and links.
Benefits
This would improve the performance of many query operations that rely on multiplying (joining) multiple tables. See vulpea#116 for benchmarks of a possible implementation. Those benchmarks use some vulpea functions, but in short they compare the multiplication approach with the view table on a db of 9554 notes, across four benchmarks: filter-on-tags-1, filter-on-tags-2, filter-on-links-1 and filter-on-links-2.
When all notes need to be traversed, the view table provides a 4.595x performance improvement.
Who would benefit?
The following group of users would benefit from this feature:
Long Description
Right now, when we need information from multiple tables, we use table multiplication. But the more tables we multiply, the slower the query becomes.
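To make the cost of read-side multiplication concrete, here is a minimal sketch in Python with the standard sqlite3 module. The two-column tables are a simplified stand-in for the real org-roam schema (an assumption for illustration), and the sample rows are made up.

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.executescript("""
    -- Simplified stand-ins for the org-roam tables (assumed for illustration).
    CREATE TABLE nodes (id TEXT PRIMARY KEY, title TEXT);
    CREATE TABLE tags (node_id TEXT, tag TEXT);
    CREATE TABLE aliases (node_id TEXT, alias TEXT);

    INSERT INTO nodes VALUES ('n1', 'Wine'), ('n2', 'Cheese');
    INSERT INTO tags VALUES ('n1', 'drink'), ('n1', 'red'), ('n2', 'food');
    INSERT INTO aliases VALUES ('n1', 'Vino'), ('n1', 'Vin');
""")

# Joining nodes with both tags and aliases multiplies the rows per node:
# every tag of a node is paired with every alias of that node.
joined_rows = conn.execute("""
    SELECT n.id, t.tag, a.alias
    FROM nodes n
    LEFT JOIN tags t ON t.node_id = n.id
    LEFT JOIN aliases a ON a.node_id = n.id
""").fetchall()

node_count = conn.execute("SELECT COUNT(*) FROM nodes").fetchone()[0]
print(f"{node_count} nodes produced {len(joined_rows)} joined rows")
```

The caller then has to group these rows back into one record per node, and each additional joined table multiplies the row count further.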
So instead of doing this 'multiplication' on the read side, we could maintain a separate table that contains all this information in one place. The schema would look like:
Proposed Implementation (if any)
See vulpea#116 for an example implementation.
Implementation would consist of 2 parts (which can and should be released separately):
org-roam code base where query happens (e.g. reading).
Writing
Whenever a note is synced, we also add all relevant information to this view table. I suspect that the sync routine needs to be modified a little bit so that we can avoid double parsing and non-atomic inserts.
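A minimal sketch of what an atomic write could look like, in Python with the standard sqlite3 module. The `node_view` table, its JSON-encoded `tags` column and the `sync_node` helper are hypothetical names introduced for illustration; this is not the org-roam sync routine or the vulpea implementation.

```python
import json
import sqlite3


def create_schema(conn):
    # Simplified stand-ins for the org-roam tables plus a hypothetical view table.
    conn.executescript("""
        CREATE TABLE nodes (id TEXT PRIMARY KEY, title TEXT);
        CREATE TABLE tags (node_id TEXT, tag TEXT);
        CREATE TABLE node_view (id TEXT PRIMARY KEY, title TEXT, tags TEXT);
    """)


def sync_node(conn, node_id, title, tags):
    """Write one already-parsed node to the normalized tables and the view table."""
    # The connection context manager commits on success and rolls back if any
    # statement raises, so the view row never gets out of step with the rest.
    with conn:
        conn.execute("DELETE FROM tags WHERE node_id = ?", (node_id,))
        conn.execute("INSERT OR REPLACE INTO nodes VALUES (?, ?)", (node_id, title))
        conn.executemany(
            "INSERT INTO tags VALUES (?, ?)", [(node_id, tag) for tag in tags]
        )
        # The parsed data is reused here, so the buffer is not parsed a second time.
        conn.execute(
            "INSERT OR REPLACE INTO node_view VALUES (?, ?, ?)",
            (node_id, title, json.dumps(tags)),
        )


conn = sqlite3.connect(":memory:")
create_schema(conn)
sync_node(conn, "n1", "Wine", ["drink", "red"])
sync_node(conn, "n1", "Wine", ["drink"])  # re-sync replaces the stale rows

# Reading needs no joins: one row per node already holds everything.
view_row = conn.execute(
    "SELECT title, tags FROM node_view WHERE id = 'n1'"
).fetchone()
tag_count = conn.execute("SELECT COUNT(*) FROM tags").fetchone()[0]
```

Storing the tags as JSON is only one possible encoding; the point of the sketch is that both writes share one parse and one transaction.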
Reading
Instead of doing horrific SQL multiplication, we will use org-roam-query with this new materialized view table.
FAQ
What about write performance?
You might have noticed that in vulpea#116 db sync performance degrades quite a bit. This is because I simply duplicated buffer parsing, so it takes almost twice the time. If the view table is implemented in org-roam, its footprint should be minimal, i.e. hardly noticeable.
Any volunteers?
Me 😸 If you think that it is worth including such a table in org-roam, I would gladly work on that, especially since I already have a working implementation that I use on a daily basis.
Please check the following:
@jethrokuan Please let me know what you think. If something is not clear, just let me know 😄 I would gladly work on this feature for org-roam. In case you think that the read performance isn't worth the data duplication, I have another proposal: to ease adding extra tables to the org-roam db (and btw I believe it would be nice to have on its own regardless of this proposal; I will send it later). With that, this materialized view can come as an extension.
cc @publicimageltd as you were interested in this happening