Use of Neo4j indices for HasLabel/Has traversals #46

Open
edkan2016 opened this issue Jan 5, 2017 · 10 comments

Comments

@edkan2016

edkan2016 commented Jan 5, 2017

I see that traversals with HasLabel/Has steps are not making use of the existing indices on the nodes. Comparing the traversal plans of this plugin and the Gremlin plugin by Thinkaurelius (https://github.com/thinkaurelius/neo4j-gremlin-plugin), it seems that it is not using a strategy similar to org.apache.tinkerpop.gremlin.neo4j.process.traversal.strategy.optimization.Neo4jGraphStepStrategy to fold the HasLabel/Has conditions into the V() step.
Unfortunately, the Thinkaurelius plugin only works for Neo4j 2.x. Is there anything that can be done to incorporate the use of indices into the traversal plan? Thanks!!
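
To make the request concrete, here is a minimal sketch of what folding the HasLabel/Has conditions into the V() step could produce. It is not code from this plugin or from TinkerPop: the class `FoldedHasQuery`, its `of` method, and the `p0`, `p1` parameter names are all hypothetical, and the `$name` parameter syntax is an assumption about the Neo4j version in use. The point is only that a label plus equality predicates can become a single `MATCH ... WHERE` statement, which the Neo4j planner can answer from an index when one exists on that label and property.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import java.util.StringJoiner;

/**
 * Hypothetical helper (not part of neo4j-gremlin-bolt or TinkerPop).
 * Shows the kind of statement a "fold has() into V()" strategy could emit.
 */
final class FoldedHasQuery {

    final String text;                    // parameterized Cypher statement
    final Map<String, Object> parameters; // values bound to $p0, $p1, ...

    private FoldedHasQuery(String text, Map<String, Object> parameters) {
        this.text = text;
        this.parameters = parameters;
    }

    /** Folds one label and a set of equality predicates into a single MATCH. */
    static FoldedHasQuery of(String label, Map<String, Object> equalities) {
        Map<String, Object> parameters = new LinkedHashMap<>();
        StringJoiner where = new StringJoiner(" AND ");
        for (Map.Entry<String, Object> entry : equalities.entrySet()) {
            // Values travel as parameters; they are never concatenated into the text.
            String name = "p" + parameters.size();
            parameters.put(name, entry.getValue());
            where.add("n." + quote(entry.getKey()) + " = $" + name);
        }
        String text = "MATCH (n:" + quote(label) + ")"
            + (parameters.isEmpty() ? "" : " WHERE " + where)
            + " RETURN n";
        return new FoldedHasQuery(text, parameters);
    }

    /** Backtick-quotes an identifier, doubling any embedded backticks. */
    private static String quote(String identifier) {
        return "`" + identifier.replace("`", "``") + "`";
    }

    public static void main(String[] args) {
        Map<String, Object> predicates = new LinkedHashMap<>();
        predicates.put("id", "100300");
        // Roughly corresponds to g.V().hasLabel("counterparty").has("id", "100300")
        FoldedHasQuery query = FoldedHasQuery.of("counterparty", predicates);
        System.out.println(query.text);
        System.out.println(query.parameters);
    }
}
```

A real strategy would also have to handle predicates other than equality and leave any steps it cannot fold in the traversal; the sketch covers only the simplest case.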

@edkan2016 edkan2016 changed the title EK Use of Neo4j indices for HasLabel/Has traversals Jan 5, 2017
@rjbaucells
Contributor

Yes, gremlin traversals do not use indices on the neo4j database. The reason for this behavior is that this project is an OLTP implementation of the gremlin API. To be able to use the full power of neo4j from gremlin we need to implement the OLAP API.

https://github.com/SteelBridgeLabs/neo4j-gremlin-bolt/blob/master/src/main/java/com/steelbridgelabs/oss/neo4j/structure/Neo4JGraphFeatures.java#L154

Help is welcome in case you are interested...

@MohitMehta1986

MohitMehta1986 commented May 28, 2018

I am facing the same issue. The following query takes about 1600 ms to return a result:
g.V().has("id", "100300").out("state").values("name")
If I run the Cypher query below instead, it takes about 1 ms:
graph.cypher('Match(cp:counterparty)-[s:state]->(cps:counterpartystate) where cp.id="100300" return cps.name')

Please let me know whether the performance of the Gremlin query can be improved.

I am using Gremlin Console version 3.3.2.

@MohitMehta1986

MohitMehta1986 commented May 28, 2018

I tried the steps below:

  1. Using a SubgraphStrategy like this:
    g=graph.traversal().withStrategies(SubgraphStrategy.build().vertices(or(hasLabel('counterparty'),hasLabel('counterpartystate'))).edges(hasLabel('state')).create())

  2. With that strategy in place, the query below returned a result in 1 ms:
    g.V().dedup().by(id).and(hasLabel('counterparty'),has("id","0100300")).coalesce(out("state")).values("name")

I think I found the reason: with hasLabel and the "and" operator, the traversal started using indices.

@rjbaucells
Contributor

The reason for this behavior is that this project is an OLTP implementation of the gremlin API. To be able to use the full power of neo4j from gremlin we need to implement the OLAP API.

None of the gremlin queries you are issuing will use indexes, all they are doing is loading the graph in memory and doing a client side filter. Look at the documentation on how to enable profiling in this library and you will see this behavior.
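
As a rough illustration of why that matters (a toy model, not this library's code), the sketch below contrasts a client-side filter over every loaded node with a lookup through a prebuilt index. The class `ScanVersusIndex`, the `Node` type, and all method names are hypothetical, and a plain `HashMap` stands in for a database index.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

/** Toy model only: a HashMap stands in for a database index. */
final class ScanVersusIndex {

    /** Minimal stand-in for a vertex: a label and a single "id" property. */
    static final class Node {
        final String label;
        final String id;

        Node(String label, String id) {
            this.label = label;
            this.id = id;
        }
    }

    /** Client-side filter: every node is visited, so the cost grows with graph size. */
    static List<Node> scanAndFilter(List<Node> allNodes, String label, String id) {
        List<Node> matches = new ArrayList<>();
        for (Node node : allNodes) {
            if (node.label.equals(label) && node.id.equals(id)) {
                matches.add(node);
            }
        }
        return matches;
    }

    /** Index build: one pass that groups nodes by their (label, id) pair. */
    static Map<List<String>, List<Node>> buildIndex(List<Node> allNodes) {
        Map<List<String>, List<Node>> index = new HashMap<>();
        for (Node node : allNodes) {
            index.computeIfAbsent(Arrays.asList(node.label, node.id), k -> new ArrayList<>())
                 .add(node);
        }
        return index;
    }

    /** Index seek: a single hash lookup that does not touch unrelated nodes. */
    static List<Node> seek(Map<List<String>, List<Node>> index, String label, String id) {
        return index.getOrDefault(Arrays.asList(label, id), Collections.emptyList());
    }

    public static void main(String[] args) {
        List<Node> nodes = new ArrayList<>();
        for (int i = 0; i < 100_000; i++) {
            nodes.add(new Node(i % 2 == 0 ? "counterparty" : "counterpartystate", String.valueOf(i)));
        }
        Map<List<String>, List<Node>> index = buildIndex(nodes);
        // Both calls return the same node; only the amount of work differs.
        System.out.println(scanAndFilter(nodes, "counterparty", "100").size());
        System.out.println(seek(index, "counterparty", "100").size());
    }
}
```

The first method models a traversal whose V() step becomes `MATCH (n) RETURN n` followed by filtering in the client; the second models what the database can do when the label and property conditions reach it as part of the statement.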

@MohitMehta1986

When I ran the plain traversals, they were very slow in the Gremlin Console. As soon as I used the SubgraphStrategy together with hasLabel and the "and" operator, I got results very quickly.

So do strategies contribute to how the traversal is executed?

Is there any OLAP API implementation in .NET that I can reuse?

@itzdinsa

itzdinsa commented Oct 3, 2018

Hi,
I am using this query: 'graph.traversal().V().hasLabel("USER").has("ID", "123123")'
It takes a very long time, while the same query in Cypher gives an instant result. I have created all the necessary indexes.

With the profiler enabled, I get this output:

2018-10-03 16:04:05.598 INFO 7311 --- [nio-8080-exec-1] c.s.o.n.s.summary.ResultSummaryLogger : Profile for CYPHER statement: Statement{text='PROFILE MATCH (n) RETURN n', parameters={}}

+-----------------+----------------+------+---------+-----------+-------+
| Operator        | Estimated Rows | Rows | DB Hits | Variables | Other |
+-----------------+----------------+------+---------+-----------+-------+
| +ProduceResults |              8 |    8 |       0 | n         |       |
| |               +----------------+------+---------+-----------+-------+
| +AllNodesScan   |              8 |    8 |       9 | n         |       |
+-----------------+----------------+------+---------+-----------+-------+

Why is Gremlin not using the indexes? Please help.

If I am missing something, please show me the correct way.

@rjbaucells
Contributor

Same response as before in this thread. This library only implements the Gremlin structure interfaces. To get an indexed lookup you need to use Cypher directly, for example:

Iterator<Vertex> vertices = graph.vertices("MATCH (n:User) WHERE ID(n)={id} RETURN n", Collections.singletonMap("id", 123123));

@tanroopdhillon

Yes, gremlin traversals do not use indices on the neo4j database. The reason for this behavior is that this project is an OLTP implementation of the gremlin API. To be able to use the full power of neo4j from gremlin we need to implement the OLAP API.

https://github.com/SteelBridgeLabs/neo4j-gremlin-bolt/blob/master/src/main/java/com/steelbridgelabs/oss/neo4j/structure/Neo4JGraphFeatures.java#L154

Help is welcome in case you are interested...

You can count me in, in case you are looking for a contributor.

@rjbaucells
Contributor

Sure, feel free to send a PR with the implementation. We can discuss it and collaborate on the PR.

@tanroopdhillon

tanroopdhillon commented Feb 4, 2019

Yes, gremlin traversals do not use indices on the neo4j database. The reason for this behavior is that this project is an OLTP implementation of the gremlin API. To be able to use the full power of neo4j from gremlin we need to implement the OLAP API.

https://github.com/SteelBridgeLabs/neo4j-gremlin-bolt/blob/master/src/main/java/com/steelbridgelabs/oss/neo4j/structure/Neo4JGraphFeatures.java#L154

Help is welcome in case you are interested...

I think making use of indices for the hasLabel / has steps should be part of the OLTP implementation itself. It should not be left to an OLAP implementation.
