Add support for vertex-centric indexes in graphdb. #1895

laa opened this Issue Dec 18, 2013 · 14 comments

4 participants

OrientDB member
  • 1. Implement IndexMap and integrate it with bonsai collection manager (2 days)
  • 2. Implement atomic updates for a single db component (2 days)
  • 3. [EXCLUDED] graphdb integration (4 days)
  • 4. Create MT test for IndexMap (2 days)
  • 5. Implement JSON serialization (2 days)
  • 6. Implement means to work and serialize IndexMap through remote connection (3 days)
  • 7. Partition SBTree data structure by clusters (4 h)
  • 8. [EXCLUDED] Implement embedded version of IndexMap (2 days)
  • 9. If document is removed all IndexMap entries should be removed too (2 days). - DONE
  • 10. Create MT test for the whole Graph DB which is based on IndexMap (2-3 days). - DONE
  • 11. [EXCLUDED] Add embedded IndexMap lazy serialization/deserialization (3 days) - DONE
  • 12. [EXCLUDED] Distributed storage support
  • 13. Make iterator based on cursor (port from sbtree) (3 days)
  • 14. Support cursor based iteration in remote storage (2 days)
@enisher enisher was assigned Dec 18, 2013
OrientDB member

Once we have the RIDBag, this issue is "only" about putting it in BP, right? Or do we need some other structure to have vertex-centric indexes?

OrientDB member

We do not need other structures, only integration code for BP.



I'm going through the issues set for 2.0. Can you please explain this a bit? I'm hoping it addresses an issue we have :).


OrientDB member

Thanks to the RIDBag we already have a vertex-centric index structure, but we should support it in the Blueprints interface, in the GraphQuery impl.

OrientDB member

@enisher What's the ETA for this?

@enisher enisher added the ETA: 15days label May 15, 2014
OrientDB member

I have a very simple idea to address this feature with low effort. Please let me know what you think. We could support VCI (Vertex Centric Indexes) by creating 2 composite NOT-UNIQUE indexes per class where we index:

  • out + any other edge's property
  • in + any other edge's property

In this way, if I want to traverse all my favorite friends, I could:

select outE('Friend')[favorite = true].inV() from #13:33

Or, by using indexes:

select expand( rid ) from index:Profile_out_friend_favorite where key = [ #13:33, 'Friend', true ]

We could do the same by indexing the target vertex's properties. AFAIK this is something Titan doesn't support. For example, if I want to traverse all my friends that live in Rome I could:

select out('Friend')[city = 'Rome'] from #13:33

If we could support the dot notation "in-city" to create an index, like:

CREATE INDEX index:Profile_out_friend_city ON Profile (@class, not unique

At this point we could use such an index:

select expand( rid ) from index:Profile_out_friend_city where key = [ #13:33, 'Friend', 'Rome' ]

All this could be managed automatically by using conventions with index names. WDYT?
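
To make the proposal above concrete, here is a minimal in-memory sketch of a composite NOT-UNIQUE index keyed by source vertex, edge class and edge property. It is only an illustration in Python, not OrientDB code: the class `CompositeEdgeIndex` and its methods are hypothetical names, and RIDs are modelled as plain strings.

```python
from collections import defaultdict


class CompositeEdgeIndex:
    """Toy non-unique index keyed by (source RID, edge class, property value).

    Hypothetical illustration only; this is not an OrientDB API.
    """

    def __init__(self):
        # One key may map to many target RIDs, hence "not unique".
        self._entries = defaultdict(list)

    def put(self, out_rid, edge_class, value, target_rid):
        self._entries[(out_rid, edge_class, value)].append(target_rid)

    def get(self, out_rid, edge_class, value):
        # Return a copy so callers cannot mutate the index by accident.
        return list(self._entries.get((out_rid, edge_class, value), []))


index = CompositeEdgeIndex()
index.put("#13:33", "Friend", True, "#13:40")
index.put("#13:33", "Friend", False, "#13:41")
index.put("#13:33", "Friend", True, "#13:42")

# Rough analogue of: where key = [ #13:33, 'Friend', true ]
favorites = index.get("#13:33", "Friend", True)
```

A single lookup by the full composite key replaces scanning every `Friend` edge of `#13:33` and filtering on `favorite`.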


@lvca not as efficient as vertex-centric indexes, but I like it.

However, I'm not sure it is easy in the general case, for example for more complex queries like:

select in("...").out("...").out('Friend')[city = 'Rome'].out("...") from #13:33

Or for cases when this construction is used in a WHERE predicate or LET clause.

The other issue is that, by current design, the out function doesn't know about the following [filter clause], so we would have to create a layer that is able to optimize expression execution.

So my point is: we can easily implement it for a specific case, but implementing it in general would not be a low-effort task with the current SQL engine design.

OrientDB member

@enisher I agree with you. We should probably "compile" the expression, or just create a new method such as OSQLFilterItemField.compile() that optimizes its own chain of fields in a better way.

Using indexes allows us to create multiple indexes against the same collection of links. With classic VCI this is not possible.
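
The "compile" step discussed above could look roughly like the following Python sketch. It is an assumption about the general shape of such an optimization, not the actual OSQLFilterItemField implementation: `Step` and `compile_chain` are hypothetical names, and the pass only fuses a traversal step that is immediately followed by an equality filter when a matching index is declared.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class Step:
    kind: str                     # "out", "in", "filter" or "index_lookup"
    label: Optional[str] = None   # edge class for traversal steps
    field: Optional[str] = None   # property name for filter steps
    value: object = None          # value compared in the filter


def compile_chain(steps, indexed):
    """Fuse out(label) + filter(field = value) into one index_lookup step.

    `indexed` is a set of (edge class, field) pairs for which an index
    is assumed to exist. Other steps are copied unchanged.
    """
    compiled = []
    i = 0
    while i < len(steps):
        cur = steps[i]
        nxt = steps[i + 1] if i + 1 < len(steps) else None
        if (cur.kind == "out" and nxt is not None and nxt.kind == "filter"
                and (cur.label, nxt.field) in indexed):
            compiled.append(Step("index_lookup", cur.label, nxt.field, nxt.value))
            i += 2  # the filter step has been absorbed
        else:
            compiled.append(cur)
            i += 1
    return compiled


chain = [
    Step("in", "Knows"),
    Step("out", "Friend"),
    Step("filter", field="city", value="Rome"),
    Step("out", "Likes"),
]
plan = compile_chain(chain, {("Friend", "city")})
unoptimized = compile_chain(chain, set())
```

The point of the sketch is that the optimizer has to look at a step together with the filter that follows it, which the `out` function alone cannot do.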

OrientDB member

Guys, it seems we are missing a really significant part of the design: tracking of related document changes.
That is a really tricky part, and vertex-centric indexes are not possible without it.
@laa, @lvca let's add some extra time to create a design for it.

OrientDB member

@enisher what do you mean by tracking? We have the other issue about not incrementing the version of a vertex when we add an edge, to boost insertion performance and reduce conflicts.


@lvca I mean keeping the vertex-centric index up to date when related document fields change.
It is not an easy issue because there could be thousands of vertices related to the changed one, so how do we find the indexes that should be updated?
I think we need to allocate time for this design issue.
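
The maintenance problem described above can be illustrated with a small Python sketch. It assumes one possible approach, a reverse map from each vertex to its incoming edges, which is not something the thread settles on; `TargetPropertyIndex` and its methods are hypothetical names. It shows that a single property change touches one index entry per incoming edge.

```python
from collections import defaultdict


class TargetPropertyIndex:
    """Toy index keyed by (source RID, edge class, target vertex's city)."""

    def __init__(self):
        self.entries = defaultdict(set)    # (source, edge class, city) -> targets
        self.incoming = defaultdict(set)   # target -> {(source, edge class)}
        self.city = {}                     # vertex -> current city

    def add_edge(self, source, edge_class, target):
        self.incoming[target].add((source, edge_class))
        self.entries[(source, edge_class, self.city.get(target))].add(target)

    def set_city(self, vertex, city):
        """Change a vertex property and re-key every entry that points at it.

        Returns the number of index entries touched, which equals the
        number of incoming edges of the changed vertex.
        """
        old = self.city.get(vertex)
        self.city[vertex] = city
        touched = 0
        for source, edge_class in self.incoming[vertex]:
            self.entries[(source, edge_class, old)].discard(vertex)
            self.entries[(source, edge_class, city)].add(vertex)
            touched += 1
        return touched

    def lookup(self, source, edge_class, city):
        return set(self.entries.get((source, edge_class, city), set()))


idx = TargetPropertyIndex()
idx.set_city("#13:40", "Rome")
idx.add_edge("#13:33", "Friend", "#13:40")
idx.add_edge("#13:34", "Friend", "#13:40")

# Moving one vertex to another city rewrites one entry per incoming edge.
touched = idx.set_city("#13:40", "Milan")
```

With thousands of incoming edges, the same update would rewrite thousands of entries, which is the cost the comment above is concerned about.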

@enisher enisher added the 2 - Sprint label Jul 14, 2014
@lvca lvca removed the 0 - Backlog label Aug 1, 2014
@lvca lvca modified the milestone: 2.0rc1, 2.1 Aug 28, 2014
@lvca lvca assigned laa and unassigned enisher Aug 28, 2014
@lvca lvca modified the milestone: 3.1, 2.1 Jan 31, 2015
@lvca lvca removed the ETA: >5days label Mar 1, 2015
OrientDB member

@luigidellaquila will follow up with a new issue about this.

@laa laa modified the milestone: 2.2, 3.1 Mar 19, 2015
@lvca lvca modified the milestone: 3.1, 2.2 Jul 10, 2015
OrientDB member

We don't need this anymore; instead, the MATCH executor will use indexes on edges, if any.

@lvca lvca closed this Sep 15, 2015
@lvca lvca added the wontfix label Sep 15, 2015