I’m looking at creating a new Memory Store that is configurable so that it can be optimised for any scenario.
Here is what I’m thinking of doing:
B+ tree indexes (because they support range queries)
Thread based indexing (async)
Deletes in diff index (merged when index.size > some max)
Hashmap index for exists(triple1) queries
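To make the list above concrete, here is a minimal sketch of how the pieces could fit together. It is only an illustration under stated assumptions: a `TreeSet` stands in for the B+ tree, terms are plain strings, and the names `Triple`, `SketchIndex` and `maxDeleted` are hypothetical. Async indexing is left out.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.HashSet;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical triple type; terms are plain strings for brevity.
record Triple(String s, String p, String o) {}

class SketchIndex {
    private static final Comparator<Triple> SPO = Comparator
            .comparing(Triple::s).thenComparing(Triple::p).thenComparing(Triple::o);

    // Ordered index; a TreeSet is used here as a stand-in for a B+ tree.
    private final TreeSet<Triple> spo = new TreeSet<>(SPO);
    // Hash index for exists(triple) lookups.
    private final Set<Triple> exists = new HashSet<>();
    // Diff index of pending deletes, merged once it grows past maxDeleted.
    private final Set<Triple> deleted = new HashSet<>();
    private final int maxDeleted;

    SketchIndex(int maxDeleted) {
        this.maxDeleted = maxDeleted;
    }

    void add(Triple t) {
        deleted.remove(t); // re-adding cancels a pending delete
        exists.add(t);
        spo.add(t);
    }

    void remove(Triple t) {
        if (exists.remove(t)) {
            deleted.add(t); // the ordered index is not touched yet
            if (deleted.size() > maxDeleted) {
                merge();
            }
        }
    }

    boolean exists(Triple t) {
        return exists.contains(t);
    }

    // Range query: all live triples with the given subject.
    List<Triple> bySubject(String s) {
        List<Triple> out = new ArrayList<>();
        // "" sorts before every other string, so this is the lower bound for s.
        for (Triple t : spo.tailSet(new Triple(s, "", ""), true)) {
            if (!t.s().equals(s)) {
                break; // left the range for this subject
            }
            if (!deleted.contains(t)) {
                out.add(t);
            }
        }
        return out;
    }

    int pendingDeletes() {
        return deleted.size();
    }

    // Apply the pending deletes to the ordered index in one pass.
    private void merge() {
        spo.removeAll(deleted);
        deleted.clear();
    }
}
```

The trade-off in this sketch is that reads pay for a `deleted.contains` check per result until the next merge, in exchange for cheap deletes.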
I don’t want to have transactional support. I use transactions with disk-based databases, but I usually only use the memory store for embedded purposes that are single-threaded.
In general, are there any recommendations or requirements that others might have before I get too committed?
I also know that a lot of triple stores convert IRIs and literals to a hash/integer so that the hash/integer is stored in the indexes and the IRIs/literals are stored in a lookup table. I’m considering doing this too, but it might not be of much benefit unless I migrate to manual memory management using the unsafe library. Any thoughts on using sun.misc.Unsafe?
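For reference, the lookup-table idea could look roughly like this on the plain Java heap, without `sun.misc.Unsafe`. This is a sketch, not how any particular triple store does it; `TermDictionary`, `encode` and `decode` are hypothetical names, and IDs are simply assigned in insertion order.

```java
import java.util.ArrayList;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical dictionary mapping IRIs/literals (as strings) to dense int IDs.
class TermDictionary {
    private final Map<String, Integer> idByTerm = new HashMap<>();
    private final List<String> termById = new ArrayList<>();

    // Returns the existing ID for a term, or assigns the next free one.
    int encode(String term) {
        Integer id = idByTerm.get(term);
        if (id == null) {
            id = termById.size();
            idByTerm.put(term, id);
            termById.add(term);
        }
        return id;
    }

    // Looks up the original term for an ID handed out by encode().
    String decode(int id) {
        return termById.get(id);
    }

    int size() {
        return termById.size();
    }
}
```

With something like this, the indexes would only need to store and compare `int` values, and each distinct term is kept once in the table.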
A possibly irrelevant suggestion: you should have a look at MapDB (and similar libraries), which could be helpful for implementing such things. A nice feature is that MapDB can be used with both memory-only and disk-based storage.
My work from this resulted in the ExtensibleStore.
What I managed to implement:
A B+tree in-memory store - the issue here was that the comparator code we had was slow (it's now fast, but my B+tree code is stale as hell)
An Adaptive Radix Tree (ART), which was also very performant, but I couldn't get it quite as performant as I wanted
In general I was able to get better performance on loading data for the ART-based store, and for both stores I got better performance when selecting for more than one statement component at a time, e.g. something like ex:a foaf:knows ?b.
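To illustrate why an ordered structure helps with a pattern like ex:a foaf:knows ?b, here is a small sketch of a prefix scan over an SPO-ordered index. It is not the ExtensibleStore code; a `TreeSet` stands in for the B+tree or ART, and `Stmt` and `SpoIndex` are hypothetical names.

```java
import java.util.ArrayList;
import java.util.Comparator;
import java.util.List;
import java.util.TreeSet;

// Hypothetical statement type; terms are plain strings for brevity.
record Stmt(String s, String p, String o) {}

class SpoIndex {
    // Statements sorted by subject, then predicate, then object.
    private final TreeSet<Stmt> spo = new TreeSet<>(Comparator
            .comparing(Stmt::s).thenComparing(Stmt::p).thenComparing(Stmt::o));

    void add(String s, String p, String o) {
        spo.add(new Stmt(s, p, o));
    }

    // All objects for a fixed subject and predicate, e.g. ex:a foaf:knows ?b.
    List<String> objects(String s, String p) {
        List<String> out = new ArrayList<>();
        // Seek to the first statement with prefix (s, p); "" is the smallest string.
        for (Stmt t : spo.tailSet(new Stmt(s, p, ""), true)) {
            if (!t.s().equals(s) || !t.p().equals(p)) {
                break; // past the end of the (s, p) prefix
            }
            out.add(t.o());
        }
        return out;
    }
}
```

Because both bound components are part of the sort key, the scan touches only the matching statements instead of filtering everything for the subject.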
My biggest hurdle was the SPARQL engine which would need a lot of work to take advantage of ordered data structures.
Much of the configurable aspect of this task is now a lot easier to implement as new custom stores on top of the ExtensibleStore, e.g. if we wanted an in-memory store based on a TreeSet or one that was optimised for read-heavy workloads.
I'm closing this specific issue though since I'm not likely to keep looking into this anymore.
Scenarios I want to support:
Timeline for this task is 3-6 months.