Skip to content
New issue

Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.

By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.

Already on GitHub? Sign in to your account

Read time out while fetching all edges of a SuperNode #1467

Open
Krittam opened this issue Mar 4, 2019 · 8 comments
Open

Read time out while fetching all edges of a SuperNode #1467

Krittam opened this issue Mar 4, 2019 · 8 comments

Comments

@Krittam
Copy link

Krittam commented Mar 4, 2019

I have a graph in which a few nodes have many incoming edges(Supernode). All the edges are of same type/label. There is a query in which i need to report the total no of incoming edges.
I'm using cassandarathrift as storage backend
g.V().has('vid','qwerty').inE().count().next()
This fails with

14108889 [gremlin-server-session-1] WARN org.apache.tinkerpop.gremlin.server.op.AbstractEvalOpProcessor - Exception processing a script on request [RequestMessage{, requestId=c3d902b7-0fdd-491d-8639-546963212474, op='eval', processor='session', args={gremlin=g.V().has('vid','qwerty').inE().count().next(), session=2831d264-4566-4d15-99c5-d9bbb202b1f8, bindings={}, manageTransaction=false, batchSize=64}}].
TimedOutException()
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14696)
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14633)
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:14559)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:741)
at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:725)
at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:143)
at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:100)
at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:82)
at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:129)
at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:288)
at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:285)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
at org.janusgraph.diskstorage.BackendTransaction.edgeStoreMultiQuery(BackendTransaction.java:285)
at org.janusgraph.graphdb.database.StandardJanusGraph.edgeMultiQuery(StandardJanusGraph.java:441)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.lambda$executeMultiQuery$3(StandardJanusGraphTx.java:1054)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.executeMultiQuery(StandardJanusGraphTx.java:1054)
at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.execute(MultiVertexCentricQueryBuilder.java:113)
at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.edges(MultiVertexCentricQueryBuilder.java:133)
at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.initialize(JanusGraphVertexStep.java:95)
at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:101)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:113)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:200)
at java_util_Iterator$next.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117)
at Script13.run(Script13.groovy:1)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:843)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:548)
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233)
at org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:120)
at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:290)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

However g.V().has('vid','qwerty').inE().limit(10000).count().next() gives
==>10000

Now if i wanted to filter all the edges based on some condition i would have used vertex centric indexes but I simply want all the incoming edges.
The said vertex is expected to have millions of such edges.
Please help

@porunov
Copy link
Member

porunov commented Mar 4, 2019

@Krittam , what happens when you use next query?:
g.V().has('vid','qwerty').inE().limit(Long.MAX_VALUE).count().next()

@Krittam
Copy link
Author

Krittam commented Mar 4, 2019

@porunov the said query produced this exception.

TimedOutException()
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14696)
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result$multiget_slice_resultStandardScheme.read(Cassandra.java:14633)
at org.apache.cassandra.thrift.Cassandra$multiget_slice_result.read(Cassandra.java:14559)
at org.apache.thrift.TServiceClient.receiveBase(TServiceClient.java:78)
at org.apache.cassandra.thrift.Cassandra$Client.recv_multiget_slice(Cassandra.java:741)
at org.apache.cassandra.thrift.Cassandra$Client.multiget_slice(Cassandra.java:725)
at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getNamesSlice(CassandraThriftKeyColumnValueStore.java:143)
at org.janusgraph.diskstorage.cassandra.thrift.CassandraThriftKeyColumnValueStore.getSlice(CassandraThriftKeyColumnValueStore.java:100)
at org.janusgraph.diskstorage.keycolumnvalue.KCVSProxy.getSlice(KCVSProxy.java:82)
at org.janusgraph.diskstorage.keycolumnvalue.cache.ExpirationKCVSCache.getSlice(ExpirationKCVSCache.java:129)
at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:288)
at org.janusgraph.diskstorage.BackendTransaction$2.call(BackendTransaction.java:285)
at org.janusgraph.diskstorage.util.BackendOperation.executeDirect(BackendOperation.java:69)
at org.janusgraph.diskstorage.util.BackendOperation.execute(BackendOperation.java:55)
at org.janusgraph.diskstorage.BackendTransaction.executeRead(BackendTransaction.java:470)
at org.janusgraph.diskstorage.BackendTransaction.edgeStoreMultiQuery(BackendTransaction.java:285)
at org.janusgraph.graphdb.database.StandardJanusGraph.edgeMultiQuery(StandardJanusGraph.java:441)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.lambda$executeMultiQuery$3(StandardJanusGraphTx.java:1054)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:98)
at org.janusgraph.graphdb.query.profile.QueryProfiler.profile(QueryProfiler.java:90)
at org.janusgraph.graphdb.transaction.StandardJanusGraphTx.executeMultiQuery(StandardJanusGraphTx.java:1054)
at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.execute(MultiVertexCentricQueryBuilder.java:113)
at org.janusgraph.graphdb.query.vertex.MultiVertexCentricQueryBuilder.edges(MultiVertexCentricQueryBuilder.java:133)
at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.initialize(JanusGraphVertexStep.java:95)
at org.janusgraph.graphdb.tinkerpop.optimize.JanusGraphVertexStep.processNextStart(JanusGraphVertexStep.java:101)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.next(ExpandableStepIterator.java:50)
at org.apache.tinkerpop.gremlin.process.traversal.step.filter.FilterStep.processNextStart(FilterStep.java:37)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.hasNext(AbstractStep.java:143)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ExpandableStepIterator.hasNext(ExpandableStepIterator.java:42)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processAllStarts(ReducingBarrierStep.java:83)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.ReducingBarrierStep.processNextStart(ReducingBarrierStep.java:113)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:128)
at org.apache.tinkerpop.gremlin.process.traversal.step.util.AbstractStep.next(AbstractStep.java:38)
at org.apache.tinkerpop.gremlin.process.traversal.util.DefaultTraversal.next(DefaultTraversal.java:200)
at java_util_Iterator$next.call(Unknown Source)
at org.codehaus.groovy.runtime.callsite.CallSiteArray.defaultCall(CallSiteArray.java:48)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:113)
at org.codehaus.groovy.runtime.callsite.AbstractCallSite.call(AbstractCallSite.java:117)
at Script5.run(Script5.groovy:1)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:843)
at org.apache.tinkerpop.gremlin.groovy.jsr223.GremlinGroovyScriptEngine.eval(GremlinGroovyScriptEngine.java:548)
at javax.script.AbstractScriptEngine.eval(AbstractScriptEngine.java:233)
at org.apache.tinkerpop.gremlin.groovy.engine.ScriptEngines.eval(ScriptEngines.java:120)
at org.apache.tinkerpop.gremlin.groovy.engine.GremlinExecutor.lambda$eval$0(GremlinExecutor.java:290)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.Executors$RunnableAdapter.call(Executors.java:511)
at java.util.concurrent.FutureTask.run(FutureTask.java:266)
at java.util.concurrent.ThreadPoolExecutor.runWorker(ThreadPoolExecutor.java:1149)
at java.util.concurrent.ThreadPoolExecutor$Worker.run(ThreadPoolExecutor.java:624)
at java.lang.Thread.run(Thread.java:748)

@Krittam
Copy link
Author

Krittam commented Mar 13, 2019

I am using janusgraph 0.2.0. I also tried using spark to run OLAP query but i am using yarn to submit spark jobs on my hadoop cluster and there is not enough documentation available for that.
Also in 0.2.0 there are many missing jars which make the problem even worse.

@porunov
Copy link
Member

porunov commented Mar 13, 2019

Can you reproduce this issue with JanusGraph 0.3.1?

@Krittam
Copy link
Author

Krittam commented Mar 13, 2019

on JanusGraph 0.3.1 query fails with this exception:

Frame size (124696054) larger than max length (15728640)!

On increasing the storage.cassandra.frame-size-mb property to 512 the query starts execution but doesn't complete (I waited for around 45 mins before finally giving up on it!)
Also please note that while i tried this with JanusGraph 0.3.1 release, the underlying cassandra db was unchanged (Same as when it was created by the original 0.2.0 release)

@porunov
Copy link
Member

porunov commented Mar 25, 2019

Confirming performance issue in 0.3.1 also.
I've created 1 vertex with 1 million incoming vertices. Count of 1 million vertices took 6 seconds.
Then I've created 1 vertex with 16 million incoming vertices. The count couldn't be executed in 5 minutes.

@porunov
Copy link
Member

porunov commented Mar 25, 2019

Currently I didn't find a good solution to count edges. Vertex centric indexes don't help in count operation also.

@kverma12
Copy link

kverma12 commented Mar 2, 2020

Changing the storage backend to cql and other properties relevant to cql solved the issue for me.
Here are the properties which i used:

storage.backend=cql
storage.cql.keyspace=t_graph
storage.cql.read-consistency-level=ONE

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

No branches or pull requests

4 participants