Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

shorten batch to 5 indices #168

Open
wants to merge 3 commits into
base: master
Choose a base branch
from

Conversation

adriano-di-giovanni
Copy link

@mcollina, this is my pull request about #167.

As you can see, I removed last item from the defs array. Doing so, I shortened the batch to 5. I updated tests accordingly and everything seems to work.

I'd like to revisit the optimization with you: if I understand your code, you cycle the defs array to find just ONE index to query. If I'm correct, when input pattern is { subject: 'subject', object: 'object' } there's no chance for OSP index to be elected (the last item, the one I removed); you always elect SOP. That's why I think that 5 indices suffice.

Generally speaking, the input pattern can be one of S, P, O, SP, SO, PO, SPO (order doesn't matter because you use an object). If you elect an index to query for each and every pattern, you will end up with 5 indices instead of 6. Please, try yourself to confirm.

A last word on the undocumented feature preferiteIndex

  1. it will be affected by the optimization because you can't specify OSP anymore;
  2. I think it should be called preferIndex or preferredIndex.

Thanks for levelgraph and please, let me know.

@mcollina
Copy link
Collaborator

I'd like @BigBlueHat and @jmatsushita to review this. If you folks are happy, we can merge and release.

I think this is a semver-major change.

@jmatsushita
Copy link
Collaborator

Hi there,

The reasoning seems spot on and I've played with changing the index resolution too to introduce a sequence number and when I did I was fairly impressed by the tests.

I agree about the preferiteIndex name change proposal which also made me wonder for a bit.

Also agree this should be a major semver change.

Thanks for contributing!

Jun

lib/utilities.js Outdated
@@ -32,7 +32,7 @@ var CallbackStream = require('callback-stream')
pos: ['predicate', 'object', 'subject'],
pso: ['predicate', 'subject', 'object'],
ops: ['object', 'predicate', 'subject'],
osp: ['object', 'subject', 'predicate']
// osp: ['object', 'subject', 'predicate']
Copy link
Collaborator

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

just delete this :)

@mcollina
Copy link
Collaborator

If you would like to do the renaming and document the feature, go ahead!

Would you like to help maintaining this module? this is Open Open Source!

@mcollina
Copy link
Collaborator

I'll leave this open for a couple of days to leave everyone the chance to say something, then I'll merge and release.

Is there anything else we want to push out as semver-major?

@adriano-di-giovanni
Copy link
Author

@mcollina removed the line and changed preferiteIndex to preferIndex.

I'd love to maintain it but I think it'd very hard for me to, at the moment.
I mean, I stumbled upon levelgraph while searching for the hexastore but I don't plan to use it because I'm implementing a redis-backed hexastore. Thus, I won't have the opportunity to learn from production use and contribute with fixes/improvements during my working time. Nonetheless, I read at your code to learn from it and I'll be very happy to contribute again to levelgraph while developing the redis version.

@mcollina I'd like to have a chat with you about your talk for jsDay. Do you think it's possible?

@mcollina
Copy link
Collaborator

@adriano-di-giovanni ping me via email.

@adriano-di-giovanni
Copy link
Author

I've played with changing the index resolution too to introduce a sequence number and when I did I was fairly impressed by the tests.

@jmatsushita what do you mean by changing the index resolution and introduce a sequence number, please?

@adriano-di-giovanni
Copy link
Author

adriano-di-giovanni commented Mar 17, 2017

Let's wait to merge. Maybe we can just use 4 3 indices. I'm asking for confirmation to the authors of the paper. Let's keep in touch.

In the meanwhile, let me recap to have you onboard :)

You can search the hexastore by S, P, O, SP, SO, PO, SPO. Order of the elements doesn't matter. Otherwise you can't use an POJO to specify the search criteria.

@giggioz and me ended up using only three indices to perform all of the above searches. The indices are SPO, POS, OSP.

Search by Using
S SPO
P POS
O OSP
SP SPO
SO OSP
PO POS
SPO any of the indices

You can try yourself. How's it possible?

@jmatsushita
Copy link
Collaborator

@jmatsushita what do you mean by changing the index resolution

I meant changing the defs array and the preferIndex function.

and introduce a sequence number, please?

I was experimenting with the idea to use levelgraph as an event store. This would require generating a unique index (sequential is nice and simple) for each submission. With these semantics submitting the same exact triple twice would create 2 different entries in levelgraph and it would be possible to read the store "from the beginning" to reconstruct some state. This makes even more sense with a quad where the metadata/graph attached to the triple might be different. Basically something like what datomic does.

@mcollina
Copy link
Collaborator

@adriano-di-giovanni there is a catch. The query-planner can optimize some queries by using the additional indexes and leveraging the fact that the indexes are ordered. I'm not sure if this change affects that.

@adriano-di-giovanni
Copy link
Author

@mcollina one of the authors replied. They use 6 indices because of an implementation detail (they use hashes to store them). Otherwise, 3 indices suffice (SPO, POS, OSP). Please, let me know if you want me to update the pull request.

@mcollina
Copy link
Collaborator

@adriano-di-giovanni please do. I fear breakages, but let's try :)

@adriano-di-giovanni
Copy link
Author

I tried. There's a failing test. I need time to understand why. In the meanwhile you can try yourself changing defs in order to have SPO, POS and OSP indices only.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
Development

Successfully merging this pull request may close these issues.

None yet

3 participants